Modelling pattern - imaging examination - what do you think?

Not sure about this. AFAIK the query engine doesn’t need access to any archetype library. That is needed at design time, for templates and other artefacts used for input, and by the query developer for tooling and guidance. The query engine can work without any such archetype knowledge. It must somehow know about the RM, but that’s another story.

I would be happy to know if other implementers do this. This will certainly break most of our applications. Navigation through a hierarchy of archetyped data, either with AQL or by traversing paths within a composition, tends to use the archetype root id as is. Adding some regexp around this adds a new level of complexity.

What are the experiences from other vendors building clinical applications on this?


We’ve found that Archetype Designer doesn’t change an element’s occurrence in the child archetype if it has been changed in the parent. There is also an error when there have been changes to the ‘Included’ clusters of a SLOT element in the parent. Better has been informed. Otherwise it seems to work.

One challenge we might experience in the review of specialised archetypes is to make the reviewers understand that the inherited elements are “not for review”, only the specialised (or added) elements. To clinicians who don’t know how specialisation works, that can be a mystery, I guess. We have to deal with that in the invitation text, and explain.


Well that requires good visualisation. You can see in the screen shots I posted above in this thread how the ADL Workbench does it - new nodes are in blue, and new or changed constraints in red; nodes inherited unchanged are in grey. So this view shows you the totality of the archetype, but also the subset that you could review. Any visualisation that does the same kind of thing should help users understand specialisation I would think.


Yes, and luckily the CKM has a visual marking of which elements are inherited and which are new or altered. We’ll see if the reviewers will catch it! :slight_smile:

This makes me nervous. I don’t have the technical insight or experience from implementation to evaluate what is right and wrong, or the implications. To my knowledge, there are very few specialised archetypes published - I only know of TNM Pathological classification Clinical Knowledge Manager (openehr.org), which is specialised from TNM Clinical classification. It would surprise me if these aren’t already in use somewhere. How have vendors solved this? Calling for @ian.mcnicoll @borut.fabjan @joostholslag @Seref . Any others?


I forgot to mention something important - in ADL 1.4-based systems (most vendors today), the query engine can work out if there is data from specialised archetypes just by searching on the top archetype id with any extended form of the concept part of the id, i.e. if the parent is

openEHR-EHR-CLUSTER.exam.v1

in ADL 1.4, children are named like

openEHR-EHR-CLUSTER.exam-palpation.v1

and

openEHR-EHR-CLUSTER.exam-palpation-cervix.v1

So the query engine just has to know to search for openEHR-EHR-CLUSTER.exam%.v1 or similar (maybe there is something smarter you can do - need to check with SQL experts) - no need to have access to archetype repository.
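As a rough illustration of the idea above, here is a minimal sketch using Python's built-in sqlite3 module. The `entry` table, its single `archetype_id` column, the helper names and the `examination` id are all made up for the example; a real CDR will store and index archetype ids differently. The sketch also assumes the pattern should require the `-` after the parent concept, which is one way of narrowing the "or similar" part so that unrelated concepts sharing a prefix are not picked up.

```python
import sqlite3


def like_escape(text):
    # Escape LIKE metacharacters so ids containing "_" (e.g. blood_pressure) match literally.
    return text.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def find_with_specialisations(conn, parent_id):
    # Split "openEHR-EHR-CLUSTER.exam.v1" into the stem and the version part.
    stem, version = parent_id.rsplit(".", 1)
    # Children in ADL 1.4 extend the concept with "-<name>", so require the hyphen.
    child_pattern = like_escape(stem) + "-%." + like_escape(version)
    rows = conn.execute(
        "SELECT archetype_id FROM entry "
        "WHERE archetype_id = ? OR archetype_id LIKE ? ESCAPE '\\' "
        "ORDER BY archetype_id",
        (parent_id, child_pattern),
    )
    return [row[0] for row in rows]


def demo_connection():
    # Hypothetical table standing in for a CDR's index of stored entries.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entry (archetype_id TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO entry VALUES (?)",
        [
            ("openEHR-EHR-CLUSTER.exam.v1",),
            ("openEHR-EHR-CLUSTER.exam-palpation.v1",),
            ("openEHR-EHR-CLUSTER.exam-palpation-cervix.v1",),
            ("openEHR-EHR-CLUSTER.examination.v1",),  # made-up id, must not match
        ],
    )
    return conn
```

Calling `find_with_specialisations(demo_connection(), "openEHR-EHR-CLUSTER.exam.v1")` returns the parent and both specialisations but not the made-up `examination` id. Note that SQLite's LIKE is case-insensitive for ASCII by default, so a production query would probably need a stricter comparison.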

Yes, but I’ve been focused on RM, not modelling, and things are different when it comes to these two :slight_smile:

When it comes to your question, it’s a difficult one. Some of the points Tom made in response to you deserve a dedicated response, but I’m not sure if I can find the time for that, so I’ll try to respond to you (mainly).

IMHO the criteria for choosing between inheritance and composition change between the software development and data modelling contexts for openEHR. Even though the terms have mainly the same meanings, their mechanics are different when we’re talking about object oriented programming languages and openEHR models.

I think I can only share some thoughts I consider to be decision criteria, and I’d expect someone to compose those (no pun intended) when making modelling decisions. These are the thoughts of a programmer, not a clinician, but I’d expect my arguments to make sense to clinicians as well.

openEHR archetypes are meant to be maximal datasets. They’re inclined to grow in content in time, and that growth is meant to ensure clinical data created years later to be compatible with earlier data. Breaking changes in models introduce the essence of “this lump of data is not compatible with that lump of data” problem which we’re trying to decrease as much as possible with openEHR, though the problems introduced by breaking changes in archetypes are a lot more isolated and controllable than the case of chasing a retired nurse to ask for the source code of the program they wrote in Delphi, which has been running for 18 years in a department…

So one criterion for a modeller is: how important is it for them to ensure data compatibility between versions?
This criterion interacts with another one: how much freedom would the modeller like to give other modellers for reusing their model? If we were to keep adding more data points to an archetype, then inheritance turns that into a convenient set of data points for any archetype specialising it (inheriting from it). This is where the second level of openEHR’s design has an advantage over its first level and mainstream OO languages: specialisation can throw away the data points that are not useful/relevant via templating, but there’s no such option in implementations of RM or in mainstream OO programming languages: you have to live with what you inherit.

So archetype modelling is more robust than data modelling in programming languages, because it can deal with the antipattern known as the god class thanks to the templating mechanism.

There are gotchas though. If templating and specialisation were all that we needed, openEHR would have one archetype, with an ever growing number of data points, and we’d have templates eliminating everything they did not need, just to keep some data points.

Leaving aside the difficulty of navigating such a semantic beast, there’s one other criterion that stops this from happening, a modelling criterion which also interacts with the others: is this a mandatory data point?
See, mandatory data points bring back the problem from OO languages: you can’t get rid of them by templating, so if you inherit the archetype, you have to live with that data point. Our modeller now has a choice. If they put the mandatory data point into an archetype, the reuse via inheritance comes with a price. Not only that, but data based on future versions of that archetype must also populate this field, so there’s a responsibility to bear for other modellers even if they never inherit from it. (To conclude my point above: if openEHR had a god archetype, it’d have so many mandatory data points that it’d be impossible for downstream users to use templating to produce anything sensible.)

Most of the benefits of composition over inheritance in OO programming language land come from avoiding the problems of having to deal with stuff you inherited and cannot omit. My humble opinion is, if you don’t have a strong conviction about a data point being mandatory, the combination of specialisation and templating is a nice way of offering reuse for your models.

Composition can still come to your help though :slight_smile: One situation is, when you’re making various optional data points available to future specialisations, but there is some semantic cohesion between a subset of your data points. I.e. they’re meant to be inherited together to be useful, they relate to each other, or there are some invariants that apply that must hold when a combination of optional data points are used together.

You cannot express these without explicitly identifying that semantically coherent group of data points, and if they don’t have any dependence (cohesion) with the rest of their siblings, then you may want to switch to composition over inheritance by introducing a new archetype with these data points, which’d let you make the implicit points explicit.

The same applies when a data point being mandatory is conditional upon use or values of other data points. In that case, there’s no need to leave a mandatory data point high up in the archetype inheritance/specialisation hierarchy. You can pack data points relevant-to-the-mandatory-one into an archetype, use composition (slots) to reuse it (optionally), but still keep the mandatory data point mandatory within that archetype, but now you’ve isolated that strong condition to a smaller model rather than forcing it as a contract on all specialisations. I think I saw a comment from @siljelb hinting at this direction, though not entirely:

I think if the element mentioned in the quote above had some relevant data points, there could have been another way to make it available to AQL queries. If the data point and its kin had enough significance to become an archetype, then using composition (slots) to include it in models would make it possible to do
SELECT cls/..data_point_we_want/.../value FROM ... CONTAINS CLUSTER cls[that_extracted_cluster_id]
because the cluster archetype would provide a semantic root from which we can access the data point, no matter where that semantic root is in any other model including it. Happy to be corrected on this one.

So if I was a clinical modeller, these would be the things I’d keep an eye on when making decisions. They may be entirely rubbish of course, in which case I more than welcome some education :slight_smile:

ps: even though I said I’ll write a dedicated post, I have to say @thomas.beale : to be specific, subtype polymorphism is undefined in AQL as things stand, as far as I know. If any vendors are implementing it, it’d be interesting to know :slightly_smiling_face:
ps2: lots of grammar errors and typos, but I’m really busy, sorry


The naming of the specialised archetypes all follows (AFAIK) the same pattern as you’ve found and shown above. If the modellers keep on being as clever as they have been until today and stick to that naming convention, your suggestion will work.

Hi @bna,

EHRbase currently does not allow wildcards in the archetype ID and therefore would not catch entries based on specialisations of a parent archetype. I think we could add this without breaking things, but it’s not on the roadmap yet.

To be clear: my intended meaning is not that the query author has to think of putting in the wildcard; the AQL engine should always do it automatically.


What would happen in the case where someone only wants to find data within the specialised archetype?

In my experience, databases rarely work with inheritance. For example, when creating a SQL query to find employees, I might not want to get other entries from a person table, only the ones that are relevant for the set of employees. From a clinical safety point of view, I think being explicit about the information need would be desirable.


That would be a breaking change in our CDR. Such a requirement (if we agree on it) must be stated clearly in the specifications.


I would say it needs a switch to change the processing. We have to remember: if the query processor fails to pick up data of specialised versions of any archetype, it’s a real error, and it may have real-world consequences.

Not sure if this is true. Have a look at the PostgreSQL documentation. You can see that you have to do something special to get instances of the parent kind but not the child kinds (here, cities is the parent table and capitals is the specialisation):

```
SELECT name, elevation
    FROM ONLY cities
    WHERE elevation > 500;
```

Ah, sorry, you are right and I misunderstood the case:

I was referring to the case where someone queries explicitly for the specialization. I think we have a common understanding that this should only retrieve data from instances of the specialized archetype.

For the query on the “parent”, it should retrieve data from the specialized instances, too.

Fully agree, it should be done this way.


Yep, that is certainly true.

I agree, wild-carding should be supported but not as a default.

Example from the Better docs:

```
SELECT
    bp/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value as systolic,
    bp/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value as diastolic
FROM
    EHR e[ehr_id/value='e119f88b-36b7-4537-9914-22bb9396e101']
CONTAINS COMPOSITION c
CONTAINS OBSERVATION bp[openEHR-EHR-OBSERVATION.blood_pressure.v*]
LIMIT 5
```

and

[openEHR-EHR-OBSERVATION.blood_pressure*]
[openEHR-EHR-OBSERVATION.*pressure.v1]
[openEHR-EHR-OBSERVATION.blood*]

so it can be used for ADL 1.4 syntactic specialisations.

[openEHR-EHR-OBSERVATION.blood_pressure*]

Depends on where the wildcarding is. In the ‘v*’ bit you are right - hidden wildcards would be an error. Also if the engine were to do *pressure - that also has to be a user choice.

But for any concept of the form xxxxx, not automatically retrieving data for the archetype xxxxx-* is an error. Note - the ‘-’ has to be there.
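To make the rule above concrete, here is a small sketch of how an engine might expand a queried archetype id automatically. The function names are hypothetical and the regular expression is only one possible reading of the rule: the concept may be extended by `-something`, the hyphen is required, and the version part is never widened behind the user's back.

```python
import re


def specialisation_pattern(query_id):
    # "openEHR-EHR-CLUSTER.exam.v1" -> stem "openEHR-EHR-CLUSTER.exam", version "v1"
    stem, version = query_id.rsplit(".", 1)
    # Optional "-<specialised concept>" suffix; "[^.]+" keeps the match inside the concept part.
    return re.compile(rf"{re.escape(stem)}(-[^.]+)?\.{re.escape(version)}")


def matches_query(query_id, stored_id):
    # True when stored_id is the queried archetype or an ADL 1.4 style specialisation of it.
    return specialisation_pattern(query_id).fullmatch(stored_id) is not None
```

With this reading, a query on `openEHR-EHR-CLUSTER.exam.v1` picks up `exam-palpation.v1` and `exam-palpation-cervix.v1`, but not an id such as `examination.v1` or `exam.v2`, and a query on `exam-palpation.v1` does not pick up data recorded against the parent `exam.v1`. Keeping the version fixed between parent and child follows the `exam%.v1` pattern mentioned earlier in the thread and is an assumption rather than an agreed rule.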

For ADL2 systems, archetype lookup is needed, since concept names don’t follow the xxxx-yyyy-zzzz pattern.
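A minimal sketch of what such a lookup could look like, assuming the engine has a `parent_of` mapping from each archetype id to the id it specialises. The mapping and the `palpation_exam` and `cervix_palpation` ids are invented for illustration; in practice this information would have to come from the archetype repository or from the archetypes loaded into the CDR.

```python
def expand_with_descendants(query_id, parent_of):
    # Collect query_id plus every archetype that specialises it, directly or indirectly.
    result = {query_id}
    changed = True
    while changed:
        changed = False
        for child, parent in parent_of.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


# Hypothetical ADL2-style ids whose names no longer reveal the specialisation chain.
PARENT_OF = {
    "openEHR-EHR-CLUSTER.palpation_exam.v1": "openEHR-EHR-CLUSTER.exam.v1",
    "openEHR-EHR-CLUSTER.cervix_palpation.v1": "openEHR-EHR-CLUSTER.palpation_exam.v1",
    "openEHR-EHR-CLUSTER.device.v1": None,
}
```

The engine would then match stored data against the whole set returned by `expand_with_descendants("openEHR-EHR-CLUSTER.exam.v1", PARENT_OF)` instead of a name pattern.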


From a safety perspective, I think I disagree. I would want to explicitly include specialisations, not have them automatically included by default.

From a safety perspective, it is the other way around - it can easily be the case that some template is recording exactly the data in a more basic template, using a specialised version of the archetype, and the important data is in the inherited items - which could be the case for CLUSTER.exam-* archetypes. For the query processor not to pick up this data in response to a query using the parent archetype id is unequivocally an error - it’s the same data. Otherwise the users are just not seeing some of the data defined by the parent archetype, only because it happens to be inherited into a specialised child.

We need to talk about this, it’s a serious issue! I’ve put it on the SEC agenda for next meeting.


Safety in this matter is IMHO about being consistent. A given query on the same data should give the same vendor-neutral result, and also be consistent across versions of the CDRs.

There are some important rules to agree upon here. I agree.
