I spent a bit of time trawling back through that last discussion on C_OBJECT.node_id (the property that carries at-codes) and whether it should be mandatory or optional, and also whether empty is valid.
Currently it is mandatory, and can’t be empty.
We need to make the spec work for 2 distinct styles of modelling:
- 13606 style - requires an at-code on every node, but only some have to be defined in the ontology section
  - in this case, the node codes really are like identifiers only, some of which happen to have a ‘meaning’ defined for them
  - the result of this is the ability to do more mechanical processing of structures, since absolutely every node has an id.
- openEHR / CIMI style - at-codes are required according to the rules needed to ensure uniqueness; if present, they are always defined in the ontology section; they are optional on any other node
  - in this case, the node codes (where they exist) really are codes, and are always defined as terms
  - the result of this, especially in larger archetypes, is far fewer node ids (since very few are at leaves), and shorter paths.
In theory, in both approaches the ontology section comes out about the same, since the LinkEHR-style ontology only contains at-codes with some ‘interesting’ or meaningful definition.
I would think our goal would be that archetypes written by anyone should work in anyone else’s tool. At the moment this probably isn’t the case:
- the AE and ADL workbench would both complain about archetypes with at-codes with no definition in the ontology section
- LinkEHR would complain about archetypes that had some nodes with no at-codes
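To make the incompatibility concrete, here is a minimal Python sketch of the two checks. It is not how any of these tools is implemented: `CObject` is a hypothetical, much-simplified stand-in for C_OBJECT, and the at-codes, the two function names, and the sample archetypes are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CObject:
    """Hypothetical, simplified stand-in for C_OBJECT."""
    rm_type: str
    node_id: Optional[str] = None  # at-code, or None if the node has no id
    attributes: Dict[str, List["CObject"]] = field(default_factory=dict)


def walk(node):
    """Yield the node and all object nodes below it, depth first."""
    yield node
    for children in node.attributes.values():
        for child in children:
            yield from walk(child)


def undefined_codes(root, ontology):
    """openEHR-style check: every at-code that is used must be defined."""
    used = {n.node_id for n in walk(root) if n.node_id is not None}
    return sorted(used - set(ontology))


def nodes_without_id(root):
    """13606/LinkEHR-style check: every object node must carry an at-code."""
    return [n.rm_type for n in walk(root) if n.node_id is None]


# Invented sample data: the same structure written in each style.
ontology = {"at0000": "Blood match", "at0001": "ABO"}

linkehr_style = CObject("CLUSTER", "at0000", {"items": [
    CObject("ELEMENT", "at0001", {"value": [CObject("DV_TEXT", "at0002")]}),
]})
openehr_style = CObject("CLUSTER", "at0000", {"items": [
    CObject("ELEMENT", "at0001", {"value": [CObject("DV_TEXT")]}),
]})

print(undefined_codes(linkehr_style, ontology))  # code used but not defined
print(nodes_without_id(openehr_style))           # node with no at-code
```

Each sample passes its own style's check and fails the other's, which is the situation described in the two bullets above.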
As Pablo Pazos said in that previous discussion, we should make this work.
Can we arrive at a single set of rules that can accommodate everyone?
I think the greatest point of difficulty isn’t node_id optionality (since a tool built on the assumption that node ids are optional will also accommodate archetypes with at-codes everywhere), but the question of whether there can be codes with no definitions in the ontology.
The original idea of archetypes was to ‘overload’ the basic types available in a generic information model to have domain level meanings. To my mind that is still the basis of the approach, as per this example from the blood_match archetype:

The driver for node codes isn’t primarily uniqueness; it is ‘meaning’. So above we have 3 ELEMENTs representing ‘ABO’, ‘Rhesus’ and ‘Anti-bodies detected’, and two others representing ‘Antibody’ and ‘Details’ - that’s the idea of domain ‘overloading’ of a reference model.
So it seems logical that:
- any ‘overloaded’ node should have a definition of its id, otherwise, why overload?
- Additionally, it seems obvious that where there are multiple sibling object nodes under an attribute, they all should have distinct codes and meanings, since otherwise … you don’t know what they mean, and the archetype isn’t doing its job.
- It can also be the case that for some nodes, the RM default meaning (e.g. something like ‘OBSERVATION.protocol’) needs to be overloaded by a more specific meaning. So an at-code is added there as well.
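The sibling rule in particular is easy to check mechanically. The following is a rough Python sketch under stated assumptions: nodes are plain dicts with hypothetical keys `type`, `id` and `attrs`, the at-codes are invented, and `DV_TEXT` is only a placeholder for whatever value type the real nodes have.

```python
def sibling_violations(node, path=""):
    """Report attributes holding several object nodes that lack distinct at-codes."""
    problems = []
    for attr_name, children in node.get("attrs", {}).items():
        attr_path = f"{path}/{attr_name}"
        if len(children) > 1:
            ids = [child.get("id") for child in children]
            if any(i is None for i in ids):
                problems.append((attr_path, "sibling without at-code"))
            present = [i for i in ids if i is not None]
            if len(present) != len(set(present)):
                problems.append((attr_path, "duplicate at-code among siblings"))
        for child in children:
            # Extend the path with the child's id only when it has one.
            child_path = f"{attr_path}[{child['id']}]" if child.get("id") else attr_path
            problems.extend(sibling_violations(child, child_path))
    return problems


def element(node_id):
    """Build an ELEMENT with a single value child that has no id (illustrative)."""
    return {"type": "ELEMENT", "id": node_id,
            "attrs": {"value": [{"type": "DV_TEXT", "id": None}]}}


def cluster(*items):
    return {"type": "CLUSTER", "id": "at0000", "attrs": {"items": list(items)}}


# Third sibling has no code, so nothing says what it means.
broken = cluster(element("at0001"), element("at0002"), element(None))
fixed = cluster(element("at0001"), element("at0002"), element("at0003"))

print(sibling_violations(broken))
print(sibling_violations(fixed))
```

Note that the single `value` child under each ELEMENT is not flagged: it is the only object under its attribute, so under this rule it needs no code of its own.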
So far so good. I think both camps agree on this - which nodes are ‘semantically overloaded’ should always come out the same in a proper modelling discussion.
Now, we also agree (as far as I know) that it’s a good idea to be able to identify any node in an archetype by a unique path, so that archetype nodes can reliably be associated and re-associated with data instance nodes (and also processed in all kinds of ways inside design time tools). Due to the above requirements, this is more or less guaranteed to be satisfied anyway.
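As a small illustration of the path point, the Python sketch below builds a path for every object node and checks that no two coincide. The path syntax is a simplification of real archetype paths, the dict layout (`id`, `attrs`) is hypothetical, and the at-codes are invented.

```python
def all_paths(node, prefix=""):
    """Return a path for every object node below `node`, in document order."""
    out = []
    for attr_name, children in node.get("attrs", {}).items():
        for child in children:
            segment = f"{prefix}/{attr_name}"
            if child.get("id"):
                segment += f"[{child['id']}]"
            out.append(segment)
            out.extend(all_paths(child, segment))
    return out


def make_archetype(leaf_ids):
    """Two ELEMENT siblings; leaf_ids gives the ids of their value nodes."""
    return {"id": "at0000", "attrs": {"items": [
        {"id": "at0001", "attrs": {"value": [{"id": leaf_ids[0]}]}},
        {"id": "at0002", "attrs": {"value": [{"id": leaf_ids[1]}]}},
    ]}}


sparse_paths = all_paths(make_archetype([None, None]))        # ids only where needed
dense_paths = all_paths(make_archetype(["at0003", "at0004"]))  # ids on every node

for p in sparse_paths:
    print(p)
print(len(set(sparse_paths)) == len(sparse_paths))  # are all paths distinct?
```

With codes only on the sibling ELEMENTs, every path is already distinct; adding codes at the leaves changes nothing about uniqueness and only lengthens the paths, e.g. `/items[at0001]/value[at0003]` instead of `/items[at0001]/value`.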
However, LinkEHR has an additional requirement: an id must exist on every single object node, including nodes that have no special ‘meaning’ and need no uniqueness marker, and so it adds one. It doesn’t require such nodes to have ontology section definitions, so the ontology isn’t affected, but it does now mean that there are node ids with no definition in the ontology. This is the point of breakage between tools.
I know the UPV guys have probably explained this before (maybe some years ago) but I am struggling to remember why this is needed, and what it adds. Can someone provide a summary of the explanation here (and any comments on the above)? I think that would help to know where we should go next with this.
- thomas