Greetings,
According to the ADL 1.5 document on the openEHR web site (issued 25 Jan 2012), Section 5.3.6.3, runtime paths for single-valued attributes can omit the node identifier.
The example given in the document uses miles per hour and km per hour alternatives. The thing is, if the runtime path is what is going to be persisted (and I can’t see any other practical cases), the persisted data will have no information to mark the semantics of the selection of an option among alternatives.
In the case of a query such as "get me all Xs where the value is expressed as km per hour", the system cannot know which option was used, km/h or mph, because there is no node identifier.
In decision support / machine learning use cases, mixing different units of measurement would lead to mayhem.
So there is a potential loss of information here, in exchange for flexibility, but what is the use of that flexibility? Should not the node identifier be mandatory in runtime paths when there is a decision among alternatives?
To comment on my own query:
Section 5.3.12 of the same document says that node identifiers are mandatory for the case I’ve referred to in 5.3.6.3, but there is no explicit mention of runtime paths. So does this rule cover runtime paths too? I think it should, not due to ambiguity, but due to loss of semantics.
Actually, this text is a bit misleading. If we have the archetype
ELEMENT[at0004] matches { -- speed limit
    value matches {
        QUANTITY[at0022] matches { -- miles per hour
            magnitude matches {|0..55|}
            property matches {"velocity"}
            units matches {"mph"}
        }
        QUANTITY[at0023] matches { -- km per hour
            magnitude matches {|0..100|}
            property matches {"velocity"}
            units matches {"km/h"}
        }
    }
}
then the data instance created from the at0022 form of the QUANTITY will be (in dADL):
items = <
    ["1"] = < -- ELEMENT
        archetype_node_id = <"at0004">
        value = < -- QUANTITY
            archetype_node_id = <"at0022">
            magnitude = <25>
        >
    >
    ["2"] = <…>
    etc
>
so the path will choose the quantity, although would do just as well. (Remember, the Xpath equivalents are etc - the [at0022] is just a shorthand selection predicate.)
The paths are not ‘persisted’ as such - just the data. The paths are always derivates of the data.
in this case, use the path .
- thomas
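The point that a node id in a path is only a selection predicate over the data can be illustrated with a minimal sketch. It assumes the instance above is held as plain Python dicts and lists, and it uses the path quoted later in this thread, `items[at0004]/value[at0022]`, alongside a predicate-free form `items[at0004]/value` that is an assumption made for illustration. The `resolve` function and the `SEGMENT` pattern are hypothetical helpers, not part of any openEHR tool.

```python
import re

# Thomas's example instance, modelled as plain dicts and lists (a modelling assumption).
root = {
    "items": [
        {
            "archetype_node_id": "at0004",            # ELEMENT
            "value": {
                "archetype_node_id": "at0022",        # QUANTITY
                "magnitude": 25,
            },
        },
    ]
}

# One path segment: an attribute name plus an optional [node id] predicate.
SEGMENT = re.compile(r"^(\w+)(?:\[(\w+)\])?$")


def resolve(node, path):
    """Return every object reached by `path`, starting from `node`."""
    current = [node]
    for segment in path.split("/"):
        match = SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"unsupported path segment: {segment!r}")
        attribute, node_id = match.groups()
        selected = []
        for parent in current:
            child = parent.get(attribute) if isinstance(parent, dict) else None
            if child is None:
                continue
            candidates = child if isinstance(child, list) else [child]
            for candidate in candidates:
                # The predicate only filters; without it every child is kept.
                if node_id is None or (
                    isinstance(candidate, dict)
                    and candidate.get("archetype_node_id") == node_id
                ):
                    selected.append(candidate)
        current = selected
    return current


with_predicate = resolve(root, "items[at0004]/value[at0022]")
without_predicate = resolve(root, "items[at0004]/value")
```

Because `value` is single-valued, both calls reach the same QUANTITY object; the predicate only matters when someone wants to ask whether the at0022 alternative was the one used.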
Since this will be handled by tools, I don’t see the point of worrying about whether the node has an id or not: the tool just puts a node ID on each node, and we as developers use that fact to query and process data. It seems so much simpler to have only one criterion, and we don’t lose flexibility or expressiveness.
Hi Thomas, thanks for the answer. (Now I see the problem of doubling the number of node ids.)
I understood Seref’s problem as a case that could not be decided automatically by a system, i.e. when to use and when not to use the node ID to query and to get the desired node.
Re-reading your response, I believe this should be part of the specs as a rule: “…in this case, use the path items[at0004]/value[at0022]…”, i.e. when you have alternatives but, in the data, one of them was chosen.
Hi Pablo,
My problem was not what you’ve described. I thought there was a gap in the spec that let data exist without a node Id when there is need for one. That was not the case.
As long as the archetypeNodeId is in the data, there is no need for a rule that enforces node id based path usage, because the node id is there to look up if necessary.
(this is not the easiest thing to communicate over e-mail)
Node IDs at0022 and at0023 have no semantic significance: each is just the value of a speed limit element, whether it is in km/h or mph. They are alternative constraints on the value, reflecting the different units allowed and the range that applies when using each unit. When you query you would want to get all speed limit values, and if you needed to compare them you would convert based on the units.
The instance should actually look like the following:
items = <
    ["1"] = < -- ELEMENT
        archetype_node_id = <"at0004">
        value = < -- QUANTITY
            units = <"mph">
            magnitude = <25>
        >
    >
    ["2"] = <…>
    etc
>
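Querying and comparing on the units recorded in the data, as described above, could look roughly like the sketch below. It assumes each QUANTITY is available as a dict with `units` and `magnitude` keys; `expressed_in`, `to_kmh` and the second sample value are hypothetical names and data chosen for illustration. The only fixed fact used is that one mile is exactly 1.609344 km.

```python
# Conversion factors to km/h. One mile is exactly 1.609344 km.
KMH_PER_UNIT = {"km/h": 1.0, "mph": 1.609344}


def expressed_in(quantities, units):
    """Select the quantities that were recorded in the given units."""
    return [q for q in quantities if q["units"] == units]


def to_kmh(quantity):
    """Convert a speed quantity to km/h so values can be compared."""
    try:
        factor = KMH_PER_UNIT[quantity["units"]]
    except KeyError:
        raise ValueError(f"no conversion known for units {quantity['units']!r}")
    return quantity["magnitude"] * factor


# Sample data: the first entry is the instance above, the second is made up.
speed_limits = [
    {"units": "mph", "magnitude": 25},
    {"units": "km/h", "magnitude": 40},
]

recorded_in_kmh = expressed_in(speed_limits, "km/h")
highest = max(speed_limits, key=to_kmh)
```

Neither function needs a node id: the `units` value in the instance is enough to tell the alternatives apart and to normalise them before comparison.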
However, one area that is problematic is the validator: how do we know which constraint to use when the constraints are ambiguous? The example provided previously does not have this issue, but consider an element with alternative values of type DV_TEXT, allowing free text, and DV_CODED_TEXT in some specified terminology:
ELEMENT[at0004] matches {
    value matches {
        DV_TEXT matches { * }
        DV_CODED_TEXT matches { -- km per hour
            defining_code matches { |SNOMED-CT: }
        }
    }
}
This case doesn’t require at-codes because the alternatives are different types, but they are still ambiguous: because DV_CODED_TEXT inherits from DV_TEXT, any coded term can be used, not just the specified term.
Here it would be nice to have an at-code in the instance to differentiate which alternative is being used.
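The ambiguity can be made concrete with a small sketch of a validator that tries each alternative in turn. Everything here is an assumption for illustration: the two classes only mimic the inheritance between DV_TEXT and DV_CODED_TEXT, `matching_alternatives` is a hypothetical helper, and the codes `code-A` and `code-B` are placeholders, since the constraint above does not show the actual term.

```python
class DvText:
    """Stand-in for DV_TEXT: free text."""

    def __init__(self, value):
        self.value = value


class DvCodedText(DvText):
    """Stand-in for DV_CODED_TEXT, which inherits from DV_TEXT."""

    def __init__(self, value, terminology_id, code_string):
        super().__init__(value)
        self.terminology_id = terminology_id
        self.code_string = code_string


# Placeholder for the terms the archetype allows (hypothetical value).
ALLOWED_CODES = {"code-A"}

# Each alternative: (label, RM type, extra check on the instance).
alternatives = [
    ("DV_TEXT", DvText, lambda v: True),  # matches { * }
    (
        "DV_CODED_TEXT",
        DvCodedText,
        lambda v: v.terminology_id == "SNOMED-CT" and v.code_string in ALLOWED_CODES,
    ),
]


def matching_alternatives(instance, alternatives):
    """Return the labels of every alternative the instance conforms to."""
    return [
        label
        for label, rm_type, check in alternatives
        if isinstance(instance, rm_type) and check(instance)
    ]


allowed = matching_alternatives(
    DvCodedText("some term", "SNOMED-CT", "code-A"), alternatives
)
other = matching_alternatives(
    DvCodedText("other term", "SNOMED-CT", "code-B"), alternatives
)
free = matching_alternatives(DvText("free text"), alternatives)
```

In this sketch the allowed coded term conforms to both alternatives, and a coded term outside the allowed set still conforms through the DV_TEXT alternative, so the validator cannot tell from the instance alone which alternative was intended.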
Hi Heath,
Maybe semantics is not the right word for it, but it is what would help me/my code easily express that the interest is in a particular element, given a bunch of options. The lack of node identifier is thus at least lack of information, if not semantics.
Not to suggest that you’re wrong in this specific example, but the following quote from page 47 of the ADL 1.5 spec makes one inclined to assign the responsibility of expressing semantics to node identifiers, or at least that is my impression:
“The node identifier also performs a semantic function, that of giving a design-time meaning to the node, by equating the node identifier to some description”
Maybe I’ve incorrectly generalized this statement.
Hi Heath, you are no doubt assuming that the ‘QUANTITY’ type in the RM doesn’t carry archetype node id information, which is indeed the case for DV_QUANTITY as used in openEHR, but I was not assuming that for the example. In that case you are right of course: querying would have to take account of it in another way. One thing we potentially should do is include the ‘property’ attribute in DV_QUANTITY, for just this purpose. I will add something to the documentation to indicate these subtleties.
As it happens, due to another discussion I have already changed this rule to force at-codes even if the RM types are different under a single-valued attribute. This will be annoying for some current archetypes, but that’s life.
- thomas
The quote you have referenced seems to be part of a larger explanation of the purpose of the node ID. Semantics is one purpose, and acting as a distinguisher is another. For me, the speed limit example is the second, and because DV_QUANTITY has a units property that meets the querying requirement you are looking for, there is no need for the node ID to be provided in the instance, which as Thomas points out doesn’t exist. The free text/coded text example, which Ian constantly battles with, is one case where the distinguisher does add value to a validator, but again we have nowhere to put it.
I actually didn’t even consider the fact that DV_QUANTITY didn’t have a node ID property, but good point. My point was that the units property provides the information required to determine if a speed limit value was mph or km/h.
Not sure I like the idea of adding the property attribute to DV_QUANTITY. This is metadata that can be derived from the archetype if required, and I just don’t see the use case. I think we have more than enough metadata in the instance already.