ADL 1.5 - relaxing a conformance rule - feedback sought

I am in the middle of ADL/AOM 1.5 testing. There is a validity rule I defined in the current draft specification, which reads as follows:

VSONIR: specialised archetype object node redefinition: if it exists, the node identifier of an object node in a specialised archetype must be redefined into its specialised form if either reference model type or occurrences of the immediate object constraint is redefined.

Translation: change of occurrences or change of RM type (e.g. redefine into descendant type) requires a specialised at-code, e.g. at0002 → at0002.1 or similar.
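
The rule can be sketched as a small check. This is only an illustration, not the reference compiler's logic: the ObjectNode class and the helper names are hypothetical, and "specialised form" is assumed to mean the parent code with one or more further ".N" segments appended.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ObjectNode:
    """Hypothetical, simplified stand-in for an archetype object constraint."""
    node_id: Optional[str]                          # e.g. "at0002"; None if absent
    rm_type: str                                    # reference model type name
    occurrences: Optional[Tuple[int, Optional[int]]]  # (lower, upper); upper None = *


def is_specialised_form(parent_id: str, child_id: str) -> bool:
    """Assumption: a specialised code is the parent code plus further '.N' segments."""
    return child_id.startswith(parent_id + ".")


def violates_vsonir(parent: ObjectNode, child: ObjectNode) -> bool:
    """True if the child redefines RM type or occurrences but keeps the parent's code."""
    if parent.node_id is None or child.node_id is None:
        return False  # the rule only applies "if it exists"
    redefined = (child.rm_type != parent.rm_type
                 or child.occurrences != parent.occurrences)
    return redefined and not is_specialised_form(parent.node_id, child.node_id)
```

Under the relaxation discussed below, the `redefined` test would simply stop looking at occurrences (or at both properties).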

In processing real archetypes and creating new templates, I am inclined to remove this rule, and say that the at-code only has to be specialised if the archetype author wishes to do so for semantic reasons OR if the parent node is redefined into multiple children (e.g. a node at0013 meaning ‘panel item’ gets specialised into at0013.1 (serum sodium), at0013.2, (serum potassium), at0013.3, etc).

I will experiment with removing this rule for the moment, and see if anything bad happens, but as far as I can see, nothing will. If we throw it away, it means that at-code specialisation really is only for semantic reasons, which would be nice and clean.

I am interested in any opinions on this.

By way of news: I am very close to a working implementation of AOM/ADL 1.5, and will release a new version of the ADL Workbench soon.

  • thomas beale

Hi Thomas,

A few concerns that come to my mind - I am not so sure that removing/changing this rule is a good thing:

  • It puts an additional burden on tools to support both ways of creating a specialised node, either with or without a specialised at-code.

  • It puts an additional burden on users that need to decide whether a specialised node should be created because the semantics have changed - I don’t think this decision is always that clear cut.

  • What if the non-semantically specialised node is LATER redefined into multiple children? Then a new version of the archetype is needed instead of just a revision?

  • With the stricter rule it is a lot easier to recognize what has changed from parent to child archetype (this may not apply to 1.5 source ADL, but to the flat files anyway).

  • In general, I believe that these validity rules should be simple and straightforward, rather than complex “if this and that and then this, you need a specialised code”-statements.

Sebastian

Thomas Beale wrote:

I think I am mostly with Sebastian here.

We know that one of the lessons learnt from the early Template .oet definition is that it can be difficult to separate renaming of nodes for semantic reasons from redefinitions that are simple local synonyms (‘displayname’ in CDA speak).

I agree, too, that maintaining a specialised node at the end of the chain, allows for further specialisation, without a major version change.

A relaxation that might be worth considering is for ‘occurrences’ . I am not sure whether we would lose much if these did not need specialisation if redefined? They are likely to be the most common sort of redefinition, in Templates, constrained down to 0..0. In comparison name and datatype redefinitions will be comparatively rare.

So, I would prefer to keep the original rule for name and datatype redefinitions but relax it for occurrences.

Ian

Dr Ian McNicoll
office / fax +44(0)141 560 4657
mobile +44 (0)775 209 7859
skype ianmcnicoll
ian.mcnicoll@oceaninformatics.com
ian@mcmi.co.uk

Clinical Analyst Ocean Informatics openEHR Archetype Editorial Group
Member BCS Primary Health Care SG Group www.phcsg.org / BCS Health Scotland

Sebastian,

I messed around with the compiler to allow this change, and came to the conclusion that it is probably not such a good idea, for some of the reasons you indicate below. I did think about the later redefinition of a single child into multiples, and this particularly would be very annoying to handle.

So the question is whether to go back to the previous rule, which was:

  • any change to RM type or occurrences means node_id must be specialised; in addition node_id may be specialised on its own for semantic reasons.

Do we relax this to allow changes to occurrences, as Ian suggested (and this was the original reason I considered this change), or do we stay 100% strict? If we stay strict, it means that if you create a specialised archetype and only change (i.e. reduce) the occurrences of a node, you (the tool) are forced to specialise the at-code.

Now, a purist might argue that you have indeed changed the meaning in some way. For example, if you reduce the number of allowed events from 0..* to 1, then you have subtly redefined the event meaning from ‘any event’ to ‘single measurement event’ or so. Maybe the tools should just redefine the at-code automatically and create a meaning like ‘xxxx (redefined occurrences)’. But then this has to be managed in all the translations as well. It would not be a major problem, and probably we should just see it as a normal part of redefinition of archetypes. Note that in ADL 1.5, this reasoning applies to templates as well.

I will experiment with the setting where occurrences can change without forcing at-code change for the moment.

all thoughts welcome.

  • thomas

Hi Thomas,

I like the idea to generate specialized at-code when occurrences is changed. This is especially useful when an optional-multiple node [0..*] from a parent archetype is specialized into several single required nodes in a child archetype or a template. Without this rule, the only way to reach these specialized nodes is by using a path with a combination of the original at-code and a name predicate. Something like the following:

“…/items[at0004 and name/value=‘specialized node one’]”
“…/items[at0004 and name/value=‘specialized node two’]”

Note that this path is language-dependent thus not very useful if we want to build queries that could be used independent of language translations.

With this rule of enforcing specialized at-code in the child archetype, the same paths could look like these:

“…/items[at0004.1]”
“…/items[at0004.2]”

and they are not bound to any specific language, which means queries built on these paths will survive in different language translations.
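
The difference between the two kinds of path can be sketched as follows. This is a toy model, not an openEHR path engine: the `select` and `conforms` helpers are hypothetical, the German names are made up for illustration, and it is assumed that a specialised code such as at0004.1 still matches a query on its parent code at0004.

```python
def conforms(node_id: str, code: str) -> bool:
    """True if node_id is `code` itself or a specialisation of it (e.g. at0004.1)."""
    return node_id == code or node_id.startswith(code + ".")


def select(items, code, name=None):
    """Mimic items[code] or items[code and name/value='name'] over a flat list."""
    return [
        item for item in items
        if conforms(item["node_id"], code)
        and (name is None or item["name"] == name)
    ]


# The same two data items, once recorded with English names and once with
# (made-up) German names.
english = [
    {"node_id": "at0004.1", "name": "specialized node one"},
    {"node_id": "at0004.2", "name": "specialized node two"},
]
german = [
    {"node_id": "at0004.1", "name": "spezialisierter Knoten eins"},
    {"node_id": "at0004.2", "name": "spezialisierter Knoten zwei"},
]

# Name predicate: only matches data recorded with the English name.
by_name_en = select(english, "at0004", "specialized node one")
by_name_de = select(german, "at0004", "specialized node one")

# Specialised at-code: matches regardless of the language of the name.
by_code_en = select(english, "at0004.1")
by_code_de = select(german, "at0004.1")
```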

Besides, based on our implementation experience, having specialized at-codes in child archetypes and templates can facilitate archetype/template comparison (as Sebastian rightly pointed out) and archetype/template-based data validation.

In fact, I would like to go further on simplification of the rules around at-codes - that is to separate the structural and semantical roles of at-codes. Currently these node identifiers have two very different tasks: 1) to differentiate nodes by unique at-codes so archetype paths can be built unambiguously; 2) to serve as a handle for semantic definition. Because of this mixture of two types of concerns, the discussions on whether or not we need an at-code become unclear and sometimes overloaded.

If we could separate these two roles of at-codes, the considerations could be simpler. Effectively what I am proposing here are 1) at-code (plain or specialized) is required whenever a node needs to be uniquely identified through a path without involving a language-dependent label; 2) only if the node needs to be semantically defined or re-defined, the same at-code could appear as part of the term_definition in the archetype ontology.

If the proposed rules are followed, it becomes easier to decide on the use case of specialized nodes. The same goes for whether or not to introduce a new at-code for multiple choices of different data types defining a single RM attribute. In both cases, we do need extra at-codes because we want to reach them by a path without involving language-dependent text.

Because of the proposed rule 2, the burden of defining each and every at-code in the ontology is gone. The concern over adding “superfluous codes in the archetype ontology” is no longer relevant. A nice side-effect of this rule is that we can now remove some of these meaningless codes about “tree/list/table” in the archetype ontology and their translations. =)

/Rong


Hi Thomas,

I like the idea to generate specialized at-code when occurrences is changed. This is especially useful when an optional-multiple node [0..*] from a parent archetype is specialized into several single required nodes in a child archetype or a template. Without this rule, the only way to reach these specialized nodes is by using a path with a combination of the original at-code and a name predicate. Something like the following:

as soon as a parent node is specialised into multiple children in a specialised archetype, all of the children have to carry specialised at-codes anyway, to satisfy the basic rule that all sibling nodes of the same reference class under a particular attribute must always be distinguishable on node id.
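
A rough sketch of that basic sibling rule, assuming each child under an attribute is reduced to a hypothetical (RM type, node id) pair; the function name is made up for illustration.

```python
from collections import Counter


def indistinguishable_siblings(children):
    """Return the (rm_type, node_id) pairs that occur more than once under one attribute."""
    counts = Counter(children)
    return sorted(pair for pair, n in counts.items() if n > 1)
```

For example, two ELEMENT children both carrying at0013 would be reported, while at0013.1 and at0013.2 would not.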

“…/items[at0004 and name/value=‘specialized node one’]”
“…/items[at0004 and name/value=‘specialized node two’]”

Note that this path is language-dependent thus not very useful if we want to build queries that could be used independent of language translations.

the above kind of thing happens (to my knowledge at least) only when multiple instances of a given node, defined in a system of archetypes and templates, are created at runtime. These are then only distinguishable by name or some other attribute. The current rule in openEHR is that the name field has to be made unique across children of an attribute; this needs to be relaxed so that name only needs to be unique across those siblings having the same at-code. Now, one of the big changes in ADL 1.5 is that anything you change in a template (+/- the discussion we are now having about occurrences etc) forces a new at-code specialisation, just like in an archetype, whereas in the current de facto template standard, it doesn’t - a lot of overriding of the name field is done in current templates. So a large proportion of this problem will go away with ADL 1.5.

With this rule of enforcing specialized at-code in the child archetype, the same paths could look like these:

“…/items[at0004.1]”
“…/items[at0004.2]”

and they are not bound to any specific language, which means queries built on these paths will survive in different language translations.

Besides, based on our implementation experience, having specialized at-codes in child archetypes and templates can facilitate archetype/template comparison (as Sebastian rightly pointed out) and archetype/template-based data validation.

yes, this was one of the motivations to change the templates design to force at-code specialisation just as for archetypes. In ADL 1.5, a template really is a kind of archetype.

In fact, I would like to go further on simplification of the rules around at-codes - that is to separate the structural and semantical roles of at-codes. Currently these node identifiers have two very different tasks: 1) to differentiate nodes by unique at-codes so archetype paths can be built unambiguously; 2) to serve as a handle for semantic definition. Because of this mixture of two types of concerns, the discussions on whether or not we need an at-code become unclear and sometimes overloaded.

If we could separate these two roles of at-codes, the considerations could be simpler. Effectively what I am proposing here are 1) at-code (plain or specialized) is required whenever a node needs to be uniquely identified through a path without involving a language-dependent label; 2) only if the node needs to be semantically defined or re-defined, the same at-code could appear as part of the term_definition in the archetype ontology.

I started having similar radical thoughts over the last few weeks as well. I am not yet clear in my own mind how to define the criteria for when an at-code needs a definition in the ontology, so that tools could police it. But if we could come up with something here, it could be very useful indeed; under the current rules, all at-codes have to appear in the ontology. Actually, I am inclined to stick to this rule in the literal sense, since weakening it will break everyone’s tools and certainly the reference compiler; instead maybe we could have a standard way of defining the ‘text’ and ‘description’ fields for ‘non-semantic’ at-codes (e.g. you could imagine having no ‘description’ at all for non-semantic at-codes). Now, just to take the devil’s advocate position, an ontological analysis would probably find that ‘everything is semantic’ and has to be defined/redefined accordingly. Anyway, I am interested in more ideas on this issue, if anyone has them.

If the proposed rules are followed, it becomes easier to decide on the use case of specialized nodes. The same goes for whether or not to introduce a new at-code for multiple choices of different data types defining a single RM attribute. In both cases, we do need extra at-codes because we want to reach them by a path without involving language-dependent text.

yes, I agree - we need to minimise this. It will still happen a bit, but the other way to look at it is that with a single path (not containing any name/value=‘x’) you might just get back >1 data node, which is not wrong either.

Because of the proposed rule 2, the burden of defining each and every at-code in the ontology is gone. The concern over adding “superfluous codes in the archetype ontology” is no longer relevant. A nice side-effect of this rule is that we can now remove some of these meaningless codes about “tree/list/table” in the archetype ontology and their translations. =)

the latter are already removed automatically in the ADL 1.5 compiler - by removing the at-codes from the nodes completely, since they are not required, and they make paths longer and harder to understand.

With respect to having some nodes with an at-code but nothing in the ontology, I would need to see some rules that can actually work for this.

- thomas

Thomas,

With respect to having some nodes with an at-code but nothing in the ontology, I would need to see some rules that can actually work for this.

[HKF: ]

To me the rules when a node ID is not required in the ontology are pretty simple:

  • a node is not of type LOCATABLE

  • a node is the root ITEM of an ITEM_STRUCTURE (e.g. items CLUSTER or item ELEMENT)

The problem with not having an ontology item for the root ITEM of the ITEM_STRUCTURE is that we have nothing to use as the default name of that ITEM. This demonstrates how this level in the RM is semantically redundant.

We also need to be careful about your removal of node IDs on single attributes such as description. In a template, a description may be filled with an ITEM_STRUCTURE archetype. Therefore in an operational template, this description node must retain the archetype ID of the filler ITEM_STRUCTURE.

Another case for optionally maintaining a node ID of single attributes is where it is desired to name this node. It is common in templates to rename the description to something like Medication description which results in an XML element with this name in a template data schema.

Regards

Heath

Thomas,

the above kind of thing happens (to my knowledge at least) only when multiple instances of a given node, defined in a system of archetypes and templates, are created at runtime. These are then only distinguishable by name or some other attribute. The current rule in openEHR is that the name field has to be made unique across children of an attribute; this needs to be relaxed so that name only needs to be unique across those siblings having the same at-code. Now, one of the big changes in ADL 1.5 is that anything you change in a template (+/- the discussion we are now having about occurrences etc) forces a new at-code specialisation, just like in an archetype, whereas in the current de facto template standard, it doesn’t - a lot of overriding of the name field is done in current templates. So a large proportion of this problem will go away with ADL 1.5.

[HKF: ]

I would suggest that this unique name constraint is relaxed altogether. It is a relatively undocumented constraint that is not formally expressed in the RM; it is alluded to in one place in the COMMON Directory package.

This constraint is an artificial constraint enforced by the RM, which no other information model has, purely to ensure unique data paths. Although this reasoning is valid, I do not believe that enforcing this in the reference model is reasonable; it is overly restrictive. It requires either the system or the user to provide a unique name on every multiple occurrence of a node. In the case of the user this is onerous and makes user interfaces unfriendly. For systems, generating a unique name simply results in names such as Blood Pressure #1, or in complex algorithms that concatenate a series of data items to produce a name that may not conflict with another node. Such a name is long, replicates data specified elsewhere, and still doesn’t guarantee uniqueness, e.g. Blood Pressure 2010-05-35 08:55:30.

My biggest concern is the overloading of the responsibilities of the name attribute. Originally it was seen as the local name of the archetyped concept, allowing data to be rendered with captions using these local names. Auto-generated naming algorithms make these names impossible to use for display purposes.

I think we have two choices:

  • if an application context requires uniqueness then it is the responsibility of the modellers and implementers to ensure there is some combination of attributes that can be used to identify data nodes uniquely using the attributes that make sense in that context. This may be name, uid, event time, an archetyped node.

  • provide a real identifier attribute designed for the purpose that can optionally be used when uniqueness is required.

The benefit of allowing non-unique paths is that we have a solution for the multi-value problem without any need to change the existing reference model (except for removing this informal unique name rule), allowing us to use a non-unique path to retrieve all answers for a multi-value question.
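
As a minimal sketch of what a non-unique path would mean for a consumer: the lookup returns a list rather than a single node. The `values_at` helper, the at0004 code and the sample answers are all hypothetical and only illustrate the idea.

```python
def values_at(items, node_id):
    """Return the values of ALL sibling items carrying node_id (possibly several)."""
    return [item["value"] for item in items if item["node_id"] == node_id]


# Three runtime instances of the same archetyped node, all with the same
# at-code and the same (non-unique) name.
answers = [
    {"node_id": "at0004", "name": "Answer", "value": "first answer"},
    {"node_id": "at0004", "name": "Answer", "value": "second answer"},
    {"node_id": "at0004", "name": "Answer", "value": "third answer"},
]

all_answers = values_at(answers, "at0004")
```

Where a single node is required, the caller would have to add whatever predicate makes sense in its context (name, uid, event time), as suggested in the first choice above.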

Regards

Heath

This makes a lot of sense from a clinical perspective, as there will be an unlimited number of replicated archetyped nodes/clusters that are just different instances of the same thing i.e. blood pressure, pregnancies, medications, diagnoses, adverse reactions, etc etc etc.

Thomas,

With respect to having some nodes with an at-code but nothing in the ontology, I would need to see some rules that can actually work for this.

[HKF: ]

To me the rules when a node ID is not required in the ontology are pretty simple:

  • a node is not of type LOCATABLE

at the moment, this is always true anyway. In the future, we will most likely need to make PARTICIPATION archetyped, even though it is not LOCATABLE. So I will keep this in mind in the specification and reference tool.

  • a node is the root ITEM of an ITEM_STRUCTURE (e.g. items CLUSTER or item ELEMENT)

The problem with not having an ontology item for the root ITEM of the ITEM_STRUCTURE is that we have nothing to use as the default name of that ITEM. This demonstrates how this level in the RM is semantically redundant.

Heath, I am not sure of what the issue is here - can you clarify?

We also need to be careful about your removal of node IDs on single attributes such as description. In a template, a description may be filled with an ITEM_STRUCTURE archetype. Therefore in an operational template, this description node must retain the archetype ID of the filler ITEM_STRUCTURE.

that is true. Note that I have not ‘removed’ anything; the ADL workbench just treats these single-attribute nodes with only a single constraint in the same manner as XML - there is nothing to distinguish, so it doesn’t waste an at-code, ontology entry, or space in paths unnecessarily. But as soon as there are either multiple alternatives, or it is a multiply-valued attribute, it does require an at-code. But of course you are right about the archetype id needing to be retained in all circumstances, so I will also specifically indicate this in the specification and take care of it in the tool.

Another case for optionally maintaining a node ID of single attributes is where it is desired to name this node. It is common in templates to rename the description to something like Medication description which results in an XML element with this name in a template data schema.

In an ADL 1.5 template, this would cause a new at-code to come into existence at the specialisation depth of the template in question, e.g. it will be a code like at0.0.1. So this would effectively be a redefinition of a non-coded node into a coded one, which is a slightly special circumstance which I also need to take care of in the specification and tool.
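
To make the at0.0.1 example concrete, here is a small sketch of how such codes can be read. It rests on an assumed reading of the code structure: each ".N" segment corresponds to one specialisation level, and leading zero segments mean the code did not exist at those levels. The function names are hypothetical.

```python
def segments(code: str):
    """Split an at-code such as 'at0002.1' into integer segments, e.g. [2, 1]."""
    if not code.startswith("at"):
        raise ValueError(f"not an at-code: {code!r}")
    return [int(part) for part in code[2:].split(".")]


def specialisation_depth(code: str) -> int:
    """Depth at which the code is used: the number of '.N' segments."""
    return len(segments(code)) - 1


def introduced_at_depth(code: str) -> int:
    """Depth at which the code first came into existence (first non-zero segment)."""
    for depth, segment in enumerate(segments(code)):
        if segment != 0:
            return depth
    raise ValueError(f"at-code has no non-zero segment: {code!r}")
```

On this reading, at0002.1 redefines a code introduced at depth 0, whereas at0.0.1 is a new code introduced at depth 2, which is the case described above for a previously non-coded node.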

These are all good points, and just go to show how important implementation is in telling us how to move forward. Would it be unkind of me to say that paper standards developed by committees don’t generally manage too well on this score?!

  • thomas beale

Heath,

I suspect that the way forward is to do as you say, and remove name-uniqueness, and simply accept and document the fact that there could be multiple values at any given path. I am not sure if this deals with all aspects of the multi-value problem, I need to re-analyse that one…

  • thomas