Ordering of items within a CLUSTER

richard.kavanagh · 19 October 2023 12:30

I might be taking this too literally, but just in case…

In the RM documentation, a CLUSTER is defined as follows

Yet in most (all?) real-world examples you see this

So, is the list ordered or unordered?

richard.kavanagh · 19 October 2023 12:32

Actually, I see it can be either

richard.kavanagh · 19 October 2023 12:56

Hmm, the further I dig the more confused I become.

I see that the items list has cardinality and also can be structured/unstructured
And then each item within the list also has cardinality.

Where are these properties defined within the model?

Seref · 19 October 2023 13:38

These bits always do my head in, so you’re not alone @richard.kavanagh. It’s been eons since I looked at this stuff, so I’ll trust others to correct me if I’m wrong.

To partially (and hopefully somewhat correctly) answer your question: you’re looking at a bunch of constraint definitions when you’re looking at ADL. Even though the syntax and semantics are pretty consistent, not every constraint declared in an ADL file has a corresponding RM attribute that the constraint can apply to. Some constraints have semantics that must be interpreted by the software processing ADL. If I understood your question correctly:

In the case of cardinality and occurrences, these are conditions that must hold for actual data having the RM type of the field items, which is ITEM. I.e. these are not properties in the RM that must have various values, but conditions that must be asserted for actual RM-based data to be valid according to the given ADL.
So these properties are not defined within the model, assuming model means the RM (Reference Model). Their semantics can be found under the container attributes section of ADL, at least for 1.4. (I didn’t check 2.0.)

As it says in that section, ordered/unordered aims to assert the kind of container the data should have. In this case it is a logical bag: the concept of order of items in the container does not apply. In the context of this ADL snippet, it means the ordering of the actual data under this container does not have to match the order of declarations in this ADL (which follows from having no notion of ordering in a set or bag anyway).
(I’m not sure about this next one: if it was ordered, then actual data would have to have these potential elements in the container in the order they were declared, due to the definition of ordered in the specs; see below.)

Now, with that in mind, the cardinality constraint on the container ends up saying that this container should have at least one element (1..*), but note that this cannot express a constraint such as: if there is an ELEMENT[AT0001] under this container, there should be only one of that. That constraint is expressed by setting the occurrences to {0..1}, meaning when this exists, it should only exist once. Occurrences is not an RM attribute either; just like cardinality, it is something to be asserted by the tool/code processing the ADL.
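
To make the distinction concrete, here is a minimal Python sketch of how a tool might assert the two kinds of constraint. The names (Interval, check_container) and the node id are hypothetical and not taken from any openEHR library. The sketch only illustrates that cardinality counts all items in the container, while occurrences counts the items matching one archetype node.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interval:
    """A multiplicity interval such as 1..* (upper=None means unbounded)."""
    lower: int
    upper: Optional[int] = None

    def contains(self, n: int) -> bool:
        return n >= self.lower and (self.upper is None or n <= self.upper)


def check_container(node_ids, cardinality, occurrences):
    """Return a list of violation messages for one container of data items.

    node_ids    -- archetype node id of each data item, in data order
    cardinality -- Interval constraining the total number of items
    occurrences -- dict mapping node id -> Interval for that node
    """
    errors = []
    # Cardinality looks at the container as a whole.
    if not cardinality.contains(len(node_ids)):
        errors.append(f"cardinality: {len(node_ids)} items")
    # Occurrences looks at each constrained node separately.
    counts = Counter(node_ids)
    for node_id, interval in occurrences.items():
        if not interval.contains(counts[node_id]):
            errors.append(f"occurrences of {node_id}: {counts[node_id]}")
    return errors


cardinality = Interval(1)                 # container: 1..*
occurrences = {"at0001": Interval(0, 1)}  # ELEMENT[at0001]: 0..1

print(check_container(["at0001"], cardinality, occurrences))
print(check_container(["at0001", "at0001"], cardinality, occurrences))
print(check_container([], cardinality, occurrences))
```

The first call reports no violations, the second violates only the occurrences constraint, and the third violates only the cardinality constraint. Neither check reads an RM attribute; both are assertions made by the code processing the archetype.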

Finally, the definition of ordered in the RM is a bit vague for my taste. In the case of containers (LIST, SET etc.), it reads like ordered is meant to imply indexable, i.e. "element at index (position) x" semantics apply to the container. That is the case for a LIST, which can have an nth element, but not for a SET.

I hope this helps (I also hope it is correct. As I said, eons since I visited this part).

thomas.beale · 19 October 2023 23:05

Everything @Seref said is correct… just to return to the question of ordering…

The default of ‘ordered’ corresponds to the data structure List<T>, although many people would say, well we always used types like ArrayedList<T> or whatever your favourite language provides as the default container, but we don’t necessarily assume ordering.

So (remembering back aeons myself) the items under a Cluster are assumed to be (by default) ordered, that is the order is significant. That’s the reason the constraint modifiers before and after can be used.

If there is an unordered constraint like the one you showed, it just means that the order is not considered significant. The main import of this is that:

app developers could display that info in whatever order they like on the screen
definers of specialised children of that archetype would not bother with the before and after constraints.

We probably should improve the documentation to that effect.

sebastian.garde · 20 October 2023 06:21

Adding this as an explanation, and stating “ordered by default” would be very helpful, yes.

If I follow your explanation, I assume that neither a change over time from unordered to ordered, nor vice versa would be considered an incompatible change to the archetype?

Seref · 20 October 2023 06:46

That may get a bit tricky. If you go from ordered to unordered, existing data would still be valid. The other direction would make existing data invalid because you laid them out in the container to your heart’s content and now they must have strict order.
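
A minimal sketch of that asymmetry, assuming (as suggested earlier in this thread, and not confirmed against the specs) that an ordered constraint means data items must follow the order in which their constraints are declared. The function names and node ids are hypothetical, chosen only for illustration.

```python
def follows_declared_order(declared, data):
    """True if the data items appear in the same relative order as their declarations.

    declared -- node ids in the order they are declared in the archetype
    data     -- node ids of the actual data items, in data order
    """
    position = {node_id: i for i, node_id in enumerate(declared)}
    indexes = [position[node_id] for node_id in data]
    # Repeated items map to the same index, so the sequence must be non-decreasing.
    return all(a <= b for a, b in zip(indexes, indexes[1:]))


def is_valid(declared, data, ordered):
    """Unordered containers accept any arrangement; ordered ones check the order."""
    return follows_declared_order(declared, data) if ordered else True


declared = ["at0001", "at0002", "at0003"]
in_order = ["at0001", "at0002", "at0002", "at0003"]
shuffled = ["at0003", "at0001", "at0002"]

# ordered -> unordered: data written under the ordered rule stays valid
print(is_valid(declared, in_order, ordered=True), is_valid(declared, in_order, ordered=False))
# unordered -> ordered: data that was fine before may now fail
print(is_valid(declared, shuffled, ordered=False), is_valid(declared, shuffled, ordered=True))
```

Under this assumption, every instance accepted by the ordered rule is also accepted by the unordered rule, but not the other way round, which is why only the unordered-to-ordered change can invalidate existing data.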

Also, these ordered/unordered semantics always gave us a hard time in ITS, especially on the XML side, where we use XSD. It is not easy to express these semantics in XSD (a combination of all/choice/sequence at the XSD level). If you have existing data in XML format and your ser/deser is using schemas, well, the schema has just changed(!) I can’t spare the glucose to write down the problem scenario in detail, but I have a gut feeling telling me there’s an edge case there for someone using XML that will ruin their day.

siljelb · 20 October 2023 06:50

I’m not 100% sure I understand the implications of this discussion, but if I do:
In most cases it’s important that there’s no specific order imposed on how the elements within a CLUSTER (or other archetype class) are persisted as a composition, but we usually do want a specific order within the archetype in its ADL form.

sebastian.garde · 20 October 2023 06:58

Yes, this is kind of what I am getting at - if the data might no longer be valid, we need to consider this as an incompatible change… but then having a more restrictive default (ordered) is a bit tricky too.

damoca · 20 October 2023 07:08

That is my main doubt. In LinkEHR (@yampeku correct me if I’m wrong) we assume unordered by default. I can’t remember what led us to that implementation decision. But at least, in UML, unordered is the default, and at that time UML was our main reference:

If multiplicity element is multivalued and specified as ordered, then the collection of values in an instantiation of this element is sequentially ordered. By default, collections are not ordered.

Seref · 20 October 2023 07:14

This is why I said the definition of ordered is a bit problematic IMHO. The details in the ADL document I linked associate this property with a LIST vs a SET and, as I said, that’s more along the lines of having an index-based container or not, at least to me. What we’re talking about here is a stricter constraint than that: not only should a container have a notion of an element having a position (a location index) in the container, which means it can come before or after another element, it should also support a particular sorted state so that elements of various attributes always have smaller or greater positions relative to others.

The issue as I see it is that ordered alone is not clear enough to express that second constraint, and this constraint is actually implicitly defined by the syntactic declaration of the constraint in ADL: the order (oh boy) of individual element constraints(!) under the container constraint.

The thing is, from a computational point of view, the constraint for having a sorted-within-container semantics (potentially) defeats the purpose of having an orderable container because of … cardinality constraints of container elements. If you have X {0..1} followed by Y {0..*} and Z {1..1} then you know that this is an indexable container but based on the cardinality constraints on the elements, that information is no use to you, because you cannot assume you’ll find X/Y/Z at a particular index. So why do you have an ordered (as in indexable) container? What’s the use?
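
A small Python sketch of that point, under the assumption that ordered means the data follows declaration order. It enumerates containers that satisfy X {0..1}, Y {0..*}, Z {1..1} and records where Z ends up. The cap on Y (max_y) is an arbitrary choice made only so the enumeration terminates.

```python
from itertools import product


def instances(max_y=2):
    """Enumerate ordered containers for X {0..1}, Y {0..*}, Z {1..1}.

    Y is unbounded in the constraint; max_y caps it for the enumeration only.
    """
    for x_count, y_count in product(range(2), range(max_y + 1)):
        yield ["X"] * x_count + ["Y"] * y_count + ["Z"]


# Index at which the single mandatory Z is found in each valid instance.
z_positions = sorted({items.index("Z") for items in instances()})
# Element found at index 0 across the valid instances.
first_items = sorted({items[0] for items in instances()})

print(z_positions)
print(first_items)
```

Even the mandatory Z has no fixed index, and index 0 may hold X, Y or Z, so code still has to inspect each item to know what it is rather than rely on its position.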

The only case that I can see is that it is for the humans, not the machines(!) I can understand clinicians wanting to see things in a certain order on the screen, but that’s not an aspect we put into models, just like persistence, or so I suggested/thought.

To conclude: I’d suggest that we may need to consider exactly what is meant by ordered and if it is needed at all given the above. Any need for order in a container would probably be served better by a mechanism such as annotations that are more suitable to associate GUI-ish requirements.

That’s my 2 pennies. Another day in openEHR, another can of worms

damoca · 20 October 2023 07:31

I think there should not be problems for XML instances with either semantics, at least for the RM validation.

Yes, in the XML Schema the contents of CLUSTER.items are defined as a sequence. But it’s a sequence of the parent ITEM class. So items can contain any combination of the child classes of ITEM (CLUSTER or ELEMENT), in any order.

	<xs:complexType name="CLUSTER">
		<xs:complexContent>
			<xs:extension base="ITEM">
				<xs:sequence>
					<xs:element name="items" type="ITEM" maxOccurs="unbounded"/>
				</xs:sequence>
			</xs:extension>
		</xs:complexContent>
	</xs:complexType>

The same happens to all the other containers (FOLDER containing FOLDER or COMPOSITION, and SECTION containing SECTION or ENTRY).

So the problem could only come when validating the additional archetype constraints, depending on the default ordered/unordered semantics we choose.

damoca · 20 October 2023 07:44

Now I looked at the BMM specs, which should now be the current source of truth, and as @thomas.beale said, ordered is the default:

And nothing else is constrained at the BMM definition:

	["CLUSTER"] = <
		name = <"CLUSTER">
		ancestors = <"ITEM">
		properties = <
			["items"] = (P_BMM_CONTAINER_PROPERTY) <
				name = <"items">
				type_def = <
					container_type = <"List">
					type = <"ITEM">
				>
				cardinality = <|>=1|>
				is_mandatory = <True>
			>
		>
	>

richard.kavanagh · 20 October 2023 07:59

Thanks for the comprehensive explanation @Seref, it has greatly improved my understanding and allowed me to get past the blocker I had on my project.

yampeku · 20 October 2023 09:02

Yes, unordered is the default, as ordered is a constraint. You cannot specialize an ordered attribute into an unordered one, so if you specify it as ordered you cannot change it later on. If ordered is the default, the property would be redundant. We actually use ordering when we compare data instances: when data is ordered, the types must appear in the same order.
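
As a rough illustration of that last comparison rule (a sketch only, not LinkEHR's actual code; same_content and the sample lists are hypothetical), ordered containers can be compared as sequences and unordered ones as multisets:

```python
from collections import Counter


def same_content(a, b, ordered):
    """Compare two containers of item type names.

    ordered=True  -> positions matter, so compare as sequences
    ordered=False -> only the multiset of items matters
    """
    if ordered:
        return list(a) == list(b)
    return Counter(a) == Counter(b)


left = ["ELEMENT", "CLUSTER", "ELEMENT"]
right = ["CLUSTER", "ELEMENT", "ELEMENT"]

print(same_content(left, right, ordered=False))
print(same_content(left, right, ordered=True))
```

The two instances hold the same items, so they match when the container is unordered and differ when it is ordered.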