Yes, but I’ve been focused on RM, not modelling, and things are different when it comes to these two.
When it comes to your question, it’s a difficult one. Some of the points Tom made in response to you deserve a dedicated response of their own, but I’m not sure I can find the time for that, so I’ll try to respond (mainly) to you.
IMHO the criteria for choosing between inheritance and composition change between the software development and data modelling contexts for openEHR. Even though the terms mean mostly the same thing, their mechanics differ between object-oriented programming languages and openEHR models.
I think I can only share some thoughts I consider to be decision criteria, and I’d expect someone to compose those (no pun intended) when making modelling decisions. These are the thoughts of a programmer, not a clinician, but I’d expect my arguments to make sense to clinicians as well.
openEHR archetypes are meant to be maximal datasets. They tend to grow in content over time, and that growth is meant to ensure that clinical data created years later stays compatible with earlier data. Breaking changes in models introduce the very “this lump of data is not compatible with that lump of data” problem that we’re trying to reduce as much as possible with openEHR. That said, the problems introduced by breaking changes in archetypes are a lot more isolated and controllable than chasing a retired nurse to ask for the source code of the program they wrote in Delphi, which has been running for 18 years in a department…
So one criterion for a modeller is: how important is it for them to ensure data compatibility between versions?
This criterion interacts with another one: how much freedom would the modeller like to give other modellers for reusing their model? If we keep adding more data points to an archetype, then inheritance turns that into a convenient set of data points for any archetype specialising it (inheriting from it). This is where the second level of openEHR’s design has an advantage over its first level and over mainstream OO languages: specialisation can throw away the data points that are not useful or relevant via templating, but there’s no such option in implementations of the RM or in mainstream OO programming languages, where you have to live with what you inherit.
So archetype modelling is more robust than data modelling in programming languages, because it can deal with the antipattern known as the god class thanks to the templating mechanism.
There are gotchas though. If templating and specialisation were all that we needed, openEHR would have one archetype, with an ever growing number of data points, and we’d have templates eliminating everything they did not need, just to keep some data points.
Leaving aside the difficulty of navigating such a semantic beast, there’s one other criterion that stops this from happening, a modelling criterion which also interacts with the others: is this a mandatory data point?
See, mandatory data points bring back the problem from OO languages, because you can’t get rid of them by templating, so if you inherit the archetype, you have to live with that data point. Our modeller now has a choice. If they put the mandatory data point into an archetype, reuse via inheritance comes with a price. Not only that, but data based on future versions of that archetype must also populate this field, so there’s a responsibility to bear towards other modellers even if they never inherit from it. (To conclude my point above: if openEHR had a god archetype, it’d have so many mandatory data points that it’d be impossible for downstream users to use templating to produce anything sensible.)
Most of the benefits of composition over inheritance in OO programming language land come from avoiding the problems of having to deal with stuff you inherited and cannot omit. My humble opinion is that if you don’t have a strong conviction about a data point being mandatory, the combination of specialisation and templating is a nice way of offering reuse for your models.
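To make the last few points concrete, here is a toy Python sketch. It is only an illustration of the argument, not how any openEHR tooling works: the `Archetype` class, the `make_template` function and all the names in it are made up. Templating drops whatever optional data points you don't ask for, but a mandatory one always survives, in the parent and in every specialisation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataPoint:
    name: str
    mandatory: bool = False


@dataclass
class Archetype:
    # Toy stand-in for an archetype; not a real openEHR API.
    name: str
    data_points: dict = field(default_factory=dict)

    def add(self, name, mandatory=False):
        self.data_points[name] = DataPoint(name, mandatory)
        return self

    def specialise(self, name):
        # A specialisation starts with everything its parent defines.
        return Archetype(name, dict(self.data_points))


def make_template(archetype, keep):
    """Keep the requested optional data points; mandatory ones always survive."""
    return sorted(
        dp.name
        for dp in archetype.data_points.values()
        if dp.mandatory or dp.name in keep
    )


# Hypothetical archetype names and data points, for illustration only.
parent = Archetype("parent")
parent.add("optional_a").add("optional_b").add("must_have", mandatory=True)
child = parent.specialise("child").add("optional_c")

template_points = make_template(child, keep={"optional_a"})
print(template_points)  # ['must_have', 'optional_a']
```

The template only asked for `optional_a`, yet `must_have` comes along anyway. That is the price of putting a mandatory data point high up in a hierarchy.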
Composition can still come to your help, though. One situation is when you’re making various optional data points available to future specialisations, but there is some semantic cohesion between a subset of your data points, i.e. they’re meant to be inherited together to be useful, they relate to each other, or there are some invariants that must hold when a combination of optional data points is used together.
You cannot express these without explicitly identifying that semantically coherent group of data points. If the group has no dependence on (cohesion with) the rest of its siblings, then you may want to prefer composition over inheritance by introducing a new archetype with these data points, which would let you make the implicit relationships explicit.
The same applies when a data point being mandatory is conditional upon the use or values of other data points. In that case, there’s no need to leave a mandatory data point high up in the archetype inheritance/specialisation hierarchy. You can pack the data points relevant to the mandatory one into an archetype and use composition (slots) to reuse it (optionally), while still keeping the mandatory data point mandatory within that archetype. That way you’ve isolated that strong condition to a smaller model rather than forcing it as a contract on all specialisations. I think I saw a comment from @siljelb hinting at this direction, though not entirely:
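A rough Python sketch of that idea, with the same caveat as before: `ClusterArchetype`, `HostArchetype` and the ids are hypothetical, and real slot handling is richer than this. The mandatory data point lives inside a small cluster, and the host includes that cluster through an optional slot, so the contract only applies when the cluster is actually used.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterArchetype:
    # Toy stand-in for a cluster archetype; not a real openEHR API.
    archetype_id: str
    mandatory: set
    optional: set = field(default_factory=set)

    def validate(self, data):
        """Return the mandatory data points missing from one cluster instance."""
        return sorted(self.mandatory - set(data))


@dataclass
class HostArchetype:
    archetype_id: str
    optional: set
    slot: ClusterArchetype  # optional slot: the cluster may be absent

    def validate(self, data):
        cluster_data = data.get(self.slot.archetype_id)
        if cluster_data is None:
            return []  # slot left empty, so the cluster's contract does not apply
        return self.slot.validate(cluster_data)


# Hypothetical ids and data points, for illustration only.
detail = ClusterArchetype("cluster.detail", mandatory={"must_have"}, optional={"extra"})
host = HostArchetype("host.entry", optional={"note"}, slot=detail)

without_cluster = host.validate({"note": "free text"})
incomplete_cluster = host.validate({"cluster.detail": {"extra": 1}})
complete_cluster = host.validate({"cluster.detail": {"must_have": 2}})
print(without_cluster, incomplete_cluster, complete_cluster)  # [] ['must_have'] []
```

Anything that specialises the host never inherits `must_have` as an obligation; only data that fills the slot has to honour it.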
I think if the element mentioned in the quote above had some relevant data points, there could have been another way to make it available to AQL queries. If the data point and its kin had enough significance to become an archetype, then using composition (slots) to include it in models would make it possible to do
SELECT cls/..data_point_we_want/.../value FROM ... CONTAINS CLUSTER cls[that_extracted_cluster_id]
because the cluster archetype would provide a semantic root from which we can access the data point, no matter where that semantic root is in any other model including it. Happy to be corrected on this one.
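Here is the intuition in Python, purely as a sketch. This is not how an AQL engine evaluates CONTAINS, and the nested dict layout, the `find_clusters` helper and the ids are all invented for the example. The point is only that once the cluster has its own archetype id, you can reach its data point by that id regardless of how deep it sits in the including model.

```python
def find_clusters(node, archetype_id):
    """Yield every node in a nested structure whose archetype id matches."""
    if isinstance(node, dict):
        if node.get("archetype_id") == archetype_id:
            yield node
        for child in node.values():
            yield from find_clusters(child, archetype_id)
    elif isinstance(node, list):
        for child in node:
            yield from find_clusters(child, archetype_id)


# Two hypothetical models that include the same cluster at different depths.
shallow = {
    "archetype_id": "host.one",
    "items": [{"archetype_id": "cluster.detail", "data_point": 10}],
}
deep = {
    "archetype_id": "host.two",
    "content": [
        {
            "archetype_id": "section.any",
            "items": [
                {
                    "archetype_id": "entry.any",
                    "items": [{"archetype_id": "cluster.detail", "data_point": 20}],
                }
            ],
        }
    ],
}

values = [
    cluster["data_point"]
    for model in (shallow, deep)
    for cluster in find_clusters(model, "cluster.detail")
]
print(values)  # [10, 20]
```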
So if I were a clinical modeller, these would be the things I’d keep an eye on when making decisions. They may be entirely rubbish of course, in which case I’d more than welcome some education.
ps: even though I said I’ll write a dedicated post, I have to say, @thomas.beale: to be specific, subtype polymorphism is undefined in AQL as things stand, as far as I know. If any vendors are implementing it, it’d be interesting to know.
ps2: lots of grammar errors and typos, but I’m really busy, sorry