Named element - and occurrences

bna · 4 April 2022 10:32

A customer of ours is using Archetype Designer (tools.openehr.org) to create templates. It is possible to give the element a specific name and still leave the element’s occurrences unlimited.

In the generated OPT the definition looks like this:

<item xsi:type="C_STRING">
       <list>NEW_NAME_OF_ELEMENT</list>
</item>

As we understand it, the name of the element is constrained to allow only the name NEW_NAME_OF_ELEMENT, which in turn makes it impossible to repeat the element and still have it uniquely addressable with a path.

There is a ticket in progress to add LOCATABLE.sequence_id ([SPECRM-63] - openEHR JIRA). Until this is resolved and implemented in the specifications and software, it will not be possible to change the name of an element and still have multiple occurrences.

Any other views on this?

Seref · 4 April 2022 11:15

Just to make sure I understand the issue/question:

You mean: it should be possible to constrain the name of an element to a specific value, but have unlimited instances of that element in the data, each valid according to the constraint on the value of name.
Did I get it right?

Interesting one. I think this depends on the interpretation of a constraint on the name of an element. If we’re interpreting the specificity of the value of the name as uniqueness, i.e. there should be only one element at this RM path with name having this value, then the tool’s behaviour sounds correct.

If the specificity of the value of the name means elements at this path can have only the following name, regardless of how many elements may exist, then the tool’s behaviour is not correct.

I’d expect the latter to be the semantics of the constraint on the value of name. A constraint assigned to a string value of a property having side effects on cardinality does not sound right to me. I.e. specificity implying uniqueness.
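To make the two readings concrete, here is a minimal Python sketch. It is only an illustration: `validate_names` and its `name_implies_unique` flag are hypothetical names, not part of any openEHR tool or library. The flag switches between "name constraint only restricts the allowed value" and "name constraint also implies uniqueness among siblings".

```python
from collections import Counter


def validate_names(element_names, allowed_names, name_implies_unique):
    """Return error strings for the names of sibling elements at one path.

    Hypothetical helper: `name_implies_unique` selects which of the two
    interpretations of the name constraint is applied.
    """
    errors = []
    # Both interpretations agree that every name must be an allowed value.
    for name in element_names:
        if name not in allowed_names:
            errors.append(f"name not allowed: {name}")
    # Only the "specificity implies uniqueness" reading rejects repeats.
    if name_implies_unique:
        for name, count in Counter(element_names).items():
            if count > 1:
                errors.append(f"name repeated {count} times: {name}")
    return errors


siblings = ["NEW_NAME_OF_ELEMENT", "NEW_NAME_OF_ELEMENT"]
allowed = ["NEW_NAME_OF_ELEMENT"]

print(validate_names(siblings, allowed, name_implies_unique=False))
print(validate_names(siblings, allowed, name_implies_unique=True))
```

Under the second reading the same data is valid; under the first it is rejected purely because the name repeats.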

I’m not sure if this is the kind of view you were expecting, but for what it’s worth this is mine

bna · 4 April 2022 11:50

I agree with this. This is the intention of the clinical modeller. Still, the RM implementation then has a problem creating unique paths to the LOCATABLE elements. In our implementation we use the # annotation to give each repeating locatable a unique name, like this: NAME#<item_number>
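As a rough illustration of the NAME#<item_number> convention, the Python sketch below numbers repeated names per base name. The function names and the choice to start counting at 1 are assumptions made for the example, not a description of any particular implementation.

```python
def number_repeated_names(names):
    """Append '#<item_number>' to each name, counting per base name from 1.

    Assumption: every occurrence gets a suffix, including the first one.
    """
    seen = {}
    result = []
    for name in names:
        seen[name] = seen.get(name, 0) + 1
        result.append(f"{name}#{seen[name]}")
    return result


def base_name(runtime_name):
    """Strip the trailing '#<item_number>' to recover the constrained name."""
    return runtime_name.rsplit("#", 1)[0]


print(number_repeated_names(["Sample", "Sample", "Note"]))
print(base_name("Sample#2"))
```

A validator that wants to honour the modeller’s fixed name would compare `base_name(...)` with the constrained value rather than the full runtime name.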

How do other implementers handle this?

yampeku · 4 April 2022 13:38

I think this is the case for temporal series of repeated measures (for example)

bna · 4 April 2022 14:56

Yes. This is one of the use cases for repeated data like elements, clusters or entry archetypes.

The problem here is how to interpret the constraint on the name. Should it be interpreted as given and not to be changed? There are several use cases where the name is defined as fixed by the modeller.

Or should it be interpreted as the prefix for the name? AFAIK many implementations use a postfix with #id for repeatable structures. So do we. But for this kind of template we assume the modeller has set a fixed name.

yampeku · 4 April 2022 15:22

If it’s in an archetype then I assume it’s fixed to that value. If it’s in a template you are probably redefining the semantics of the node, so an atXXXX.1, atXXXX.2, atXXXX.n would be needed, where you can fix the value of the name to all the alternatives you need.

Seref · 4 April 2022 19:00

Before I forget, this is my take from today’s discussion (which took place elsewhere), kind of a set of meeting minutes:

There’s nothing in the openEHR specs that indicates that constraining the name of an element to a specific value should imply uniqueness of that element in terms of cardinality, if the element is under a type that allows anything beyond 0..1.
Modelling tools, however, interpret a constraint on a name as a constraint on its cardinality as well. Ocean Template Designer and Better’s Archetype Designer both modify the cardinality of an element in this case.
Users are reportedly manually modifying the generated models to relax the cardinality constraint, therefore reverting the interpretation of the tools.
Bjorn’s users do not care about cardinality, they just need the element to have only one name.
(I think) this is to ensure queries can fetch this data element

There are two ways the spec can clarify this ambiguity: at the RM level, or at the ADL (constraint) level. This is a discussion to be had in the future in the SEC.

The most pragmatic thing for @bna may be to follow other users and modify the tooling output, and also suggest to Better that their tool display a warning that constraining the name will modify the cardinality as well (@erik.sundvall’s suggestion, as far as I can remember).

@heath.frankel I suspect you have some valuable historical background for this, as usual, so feel free to educate me here.
(as the various edits prove, I’m tired, so I’ll go have some tea now)

yampeku · 4 April 2022 20:06

I was a bit lost during the meeting, but this has lost me completely:
The name attribute on ELEMENT has a cardinality of 1..1.
The value of a DV_TEXT inside a name also has 1..1 cardinality.
When we talk about “lists” in this context (and fixing to a specific value), are we talking about list constraints on the string inside the DV_TEXT? Alternatives of DV_TEXT? Is fixing to a specific text selecting a valid string from an actual list of strings coming from the parent archetype, or is it selecting a DV_TEXT alternative?

heath.frankel · 4 April 2022 22:24

Quick answer for now. There is much history here, and from memory you are correct: there was always an undocumented rule that the name of a node needed to be unique within a container. This applied at the RM level.
In order for templates to be implementable against the RM, not just logical, when a constraint was applied to the name attribute of a node, the maxOccurs for that name-constrained (cloned) node was explicitly set to 1.
Although this ensured templates would be implementable against this undocumented RM rule, it made logical modelling difficult where the name was constrained purely for labelling the node and not intended to constrain the occurrences.
The cause of this problem was that in ADL1.4 we did not have any means to reference a constrained node other than using a name constraint. I understand this has been resolved in ADL2.
I believe there was a Jira card raised for this issue to make it explicit in the RM that names do NOT need to be unique within a container, and Ocean proceeded to change their EHR libraries to allow this.
However, the TemplateDesigner was not changed, partly to ensure this was done in a coordinated manner across the openEHR tooling chain including CKM and Archetype Designer, etc.
Hope this helps.

joostholslag · 5 April 2022 06:36

Am I correct that this is no longer an issue in ADL2? I think I’ve done this multiple times without issue. (The only issue was redefinition of a mandatory node; that’s been fixed recently.)

Seref · 5 April 2022 07:52

You have every right to feel lost here, it took us a few rounds of communication to land on the same page

When we talk about lists, we’re talking about containers, as Heath correctly said in his message.

So we’re talking about things that can contain more than one ELEMENT, i.e. subtypes of ITEM_STRUCTURE.

The situation at a high level is: when a constraint is introduced for the name of an ELEMENT, the ITEM_STRUCTURE subtype containing the ELEMENT with that constraint cannot contain more than 1 instance of it. I don’t know what the tools do exactly for ITEM_TREE, ITEM_LIST etc., but that’s the general behaviour we’re talking about. Bjorn can clarify where the ELEMENT in his archetype was sitting when this problem emerged.

yampeku · 5 April 2022 12:01

I don’t think we enforce this behaviour, BTW.
That is probably related to the fact that we usually fill the name attribute in the mapping phase to allow multilinguality.

thomas.beale · 5 April 2022 13:15

We have not quite resolved it - to do so would mean allowing the formation of paths using Xpath-like predicates, e.g. items[3]/events[1] and so on, if we wanted to access the Nth item / event.

The original idea of requiring LOCATABLE.name to be unique among siblings was that runtime names should have clinical meaning. A runtime path of the form items[at0004, 'name1'] (meaning items[at0004 and name/value = 'name1']) will access the at0004 node whose name attribute = name1. It has been historically assumed that these names had to be unique in order to construct such paths.

If we forget the unique name requirement, which we gave up on some years ago, it means a path like the above could select two or more nodes. Thus, some other kind of predicate is needed to uniquely select nodes in runtime data. The most recent idea (5y ago) was that LOCATABLE.sequence_id would allow this to work, or we could just use plain list order, as above, i.e. items[3] etc.
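The difference between the two kinds of predicate can be sketched in a few lines of Python. This is a toy model only: `Node`, `select_by_name` and `select_by_position` are hypothetical names, and the 1-based indexing is an assumption borrowed from XPath rather than something defined by the openEHR specifications.

```python
from dataclasses import dataclass


@dataclass
class Node:
    archetype_node_id: str
    name: str


def select_by_name(items, node_id, name):
    """Mimic items[at0004, 'name1']: may match zero, one or many nodes."""
    return [n for n in items
            if n.archetype_node_id == node_id and n.name == name]


def select_by_position(items, position):
    """Mimic items[3]: always one node (assumed 1-based, as in XPath)."""
    return items[position - 1]


items = [
    Node("at0004", "name1"),
    Node("at0004", "name1"),  # same name as its sibling: allowed once uniqueness is dropped
    Node("at0005", "other"),
]

print(len(select_by_name(items, "at0004", "name1")))  # two nodes match
print(select_by_position(items, 2))                   # exactly one node
```

The positional form is unambiguous only as long as the order of the container is stable between reads.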

yampeku · 5 April 2022 13:32

CDA uses that kind of path in implementation guides, by the way. If you can ensure that objects are ordered inside (which you can mandate in ADL) then paths using position() are just fine.

Seref · 5 December 2022 15:58

Hello @thomas.beale @bna @yampeku
I’m pinging again about this because I just came across some data today that was committed with …name_1, …name_2 values, and while searching the forum I found this topic, which I’d completely forgotten. Tom: I have a question. Your comment above is interesting:

What’s the problem here? There are many cases in which a path can return N results, COMPOSITION\content being an obvious one. What’s the problem with assuming a path may return N results? It’s all over the RM in the form of fields with cardinality greater than 1 and we’re already living with it in AQL implementations.

Can we progress with this in the SEC? I think it’s a pretty fundamental thing and I’m of the mind that we should remove the unofficial rule Heath mentioned above, but I’m happy to have a conversation going as a starting point.

thomas.beale · 5 December 2022 20:47

Yep - that’s not the problem (and it’s why there are functions returning multiple items in PATHABLE). The question is: is there a way to build a path that refers to a specific leaf deep in a hierarchy? I.e. a so-called runtime path.

The original problem posed by @bna was whether you can have the name field constrained to X and still have multiple occurrences of that archetype node. Answer: you can (that’s the rule we changed), but it means you now can’t count on the name field as a guaranteed node identifier to form a run-time unique path. Which means such paths need to use some other field, or else ordinal position (i.e. items[3] or similar).

We have a long-term idea currently known as sequence_id (see this CR) that would provide another field guaranteed to have a unique id (without killing the system with UIDs or whatever), to help the formation of such paths. Evidently, solving this issue (not too hard technically) hasn’t been a high priority, or we would have done it.

yampeku · 5 December 2022 21:00

But the description says it’s mandatory. I would at least make it optional and use it only where needed.

thomas.beale · 5 December 2022 21:22

Yep, the idea is that sequence_id (sometimes known as an ‘accession number’) would always be set on every item added under a container attribute, so then it’s reliable. We didn’t implement it to date, because it gets tricky over versioning with node deletions, and keeping track of the highest code. Needs a bit more design thinking.
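A minimal Python sketch of the bookkeeping described here, under heavy assumptions: `Container` and its methods are invented for illustration, sequence_id is still only a proposal, and nothing below handles versioning. It shows only why the highest issued id has to be remembered, so that an id freed by a deletion is never handed out again.

```python
class Container:
    """Toy container that assigns a sequence id to every added item."""

    def __init__(self):
        self._items = {}      # sequence_id -> item
        self._highest_id = 0  # highest id ever issued, survives deletions

    def add(self, item):
        # Always allocate past the highest id, never reuse a freed one.
        self._highest_id += 1
        self._items[self._highest_id] = item
        return self._highest_id

    def remove(self, sequence_id):
        del self._items[sequence_id]

    def get(self, sequence_id):
        return self._items[sequence_id]


container = Container()
first = container.add("sample A")
second = container.add("sample B")
container.remove(second)
third = container.add("sample C")
print(first, second, third)
```

Because the id removed above is not reused, a stored reference to it can never silently point at a different item later.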

Seref · 6 December 2022 09:27

Ok, this was the bit that was not clear to me. I didn’t realise this change took place. Is it in the spec/documented now?

that is now a relevant, but separate requirement IMHO. Personally, I apologise for being pedantic, but I disagree with this:

the path is always unique since there is only one name property in the RM. It’s the value at the path that may not be singular, and that is not a problem because we know we’re pointing at the name of something that is sitting in a container thanks to RM.
So I’ll be even more annoying and redefine your comment as no longer being able to piggyback the artificial cardinality constraint as index access [0]. I am not sure how much real world value we could get from sequential access to containers: why would [i] be of interest to someone compared to [i-1] or [i+1] ?
So if the change you mentioned above is made, then this issue is actually solved. Indexed access and its benefits is another discussion IMHO.

thomas.beale · 6 December 2022 10:47

https://openehr.atlassian.net/browse/SPECRM-27

I used to, but was convinced otherwise. I blame @heath.frankel. But more seriously, various people close to implementation said similar things about that rule being too restrictive.

Not sure what you mean by ‘the path is always unique’ - to me that means: a path with fixed ‘name’ values always picks out exactly one item in a tree. Which it is not guaranteed to, due to the relaxation of the unique name rule.

The only interest is that once you have such a path and know which node it corresponds to, that correspondence is guaranteed unique and correct forever. Otherwise, I agree - that’s why we have not actioned that CR, and maybe we never will. It’s really just a placeholder for the problem of unique paths that are not difficult to construct.

Personally, I would still prefer if the name field were unique across sibling objects in a container - that was its intent, and it makes for easy unique runtime path construction.

The problem of managing uniqueness within archetypes while still being able to constrain the name to something could have been achieved by allowing name field constraint to be a regex or pattern with wildcards e.g. "sample*". It would be up to the runtime system to ensure unique name values, e.g. like "sample#1", "sample#2", etc. Another alternative was to allow the fixing of the name field, and also occurrences = 1, but then to allow more clones of the same object, with no name field constraint, or a pattern-based one. All possible in ADL2. But the current solution is ok as well.
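The pattern-based alternative could look roughly like the Python sketch below. It assumes a shell-style pattern with a single trailing wildcard such as "sample*" and uses the standard library’s `fnmatch.fnmatchcase` for matching; `name_is_valid` and `next_unique_name` are hypothetical helpers, not functions of any openEHR implementation.

```python
from fnmatch import fnmatchcase


def name_is_valid(name, pattern, sibling_names):
    """A name is valid if it matches the pattern and no sibling already uses it."""
    return fnmatchcase(name, pattern) and name not in sibling_names


def next_unique_name(pattern, sibling_names):
    """Generate the next free 'prefix#n' name.

    Assumption: the pattern is a literal prefix followed by '*'.
    """
    prefix = pattern.rstrip("*")
    n = 1
    while f"{prefix}#{n}" in sibling_names:
        n += 1
    return f"{prefix}#{n}"


existing = ["sample#1", "sample#2"]
print(name_is_valid("sample#3", "sample*", existing))
print(name_is_valid("sample#1", "sample*", existing))
print(next_unique_name("sample*", existing))
```

Here the archetype still constrains the name, while uniqueness among siblings is left to the runtime system, as the post suggests.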