AOM/TOM semantics for cardinality constraints and occurrences in children

Hi all, I’m generating some templates for testing data validation. While checking my code I realized I might be generating invalid OPTs, since my script is updating the cardinality constraints of each multiple attribute in the OPT (content, events, items, etc.), but was not checking the occurrences in the children objects defined for the multiple attribute.

After discussing a little bit with @pieterbos and @yampeku I ended up with a question:

If there is only ONE child object in the multiple attribute in the OPT, and the attr.cardinality = 1…5, is attr.children[0].occurrences = 0…1 | 0…3 | 0…5 | 0…10 | 0…* valid?

My interpretation of the spec is: as long as both, cardinality and occurrences, don’t contradict, any combination is allowed, since the goal is runtime validation of the data.

A contradiction would be cardinality = 0…5 and occurrences 10…*

But if we know in the OPT that the cardinality is 1…5, wouldn’t it be easier to also constrain the child occurrences accordingly? Like allowing only 1…1, 1…3, 1…5, 3…5, 5…5 (note that the cardinality contains the occurrences in each of these cases).

Pieter referred me to the validity rules in AOM2, which I couldn’t find in AOM1.4 and which narrow the space for interpretation of the semantics of cardinality, occurrences and their relationship, especially VACMCU, VACMCO and WACMCL.

For VACMCU I’m questioning if this part doesn’t make implementation more complex: “where a cardinality with a finite upper bound is stated on an attribute, for all immediate child objects for which an occurrences constraint is stated, the occurrences must either have an open upper bound (i.e. n…*) which is interpreted as the maximum value allowed within the cardinality…”

If we know occurrences 1…* will be interpreted as 1…5 when the cardinality of the parent is 0…5, why not use 1…5 directly?
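To make that reading concrete, here is a minimal Python sketch of how an open upper bound could be resolved against the parent cardinality. The function name effective_upper and the use of None for * are assumptions made for illustration, not something defined in the AOM, and sibling objects are ignored.

```python
from typing import Optional

# None stands for an open upper bound ('*'); this encoding is an assumption.
Bound = Optional[int]


def effective_upper(occ_upper: Bound, card_upper: Bound) -> Bound:
    """Largest number of instances of one child that can actually occur."""
    if occ_upper is None:
        return card_upper          # n..* is capped by the cardinality
    if card_upper is None:
        return occ_upper           # open cardinality: occurrences decide
    return min(occ_upper, card_upper)


# occurrences 1..* under cardinality 0..5 behaves like 1..5
print(effective_upper(None, 5))  # prints 5
```

Under this reading, 1…* and 1…5 are equivalent at runtime for a parent cardinality of 0…5; the difference only shows up when the cardinality is edited later.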

One last item, in AOM1.4 the only definitions for occurrences are:

  1. Members_valid: members /= Void and then members.for_all(co: C_OBJECT | co.occurrences.upper <= 1)

  2. Occurrences of this object node in the data, under the owning attribute. Upper limit can only be greater than 1 if owning attribute has a cardinality of more than 1).

I think 1. is wrong, it should be “>= 1” instead of “<= 1”.

In 2., related to my initial question, can’t the occurrences.upper be whatever for multiple attributes because at runtime that is constrained by the container cardinality?

Also I’m not sure about the meaning of “owning attribute has a cardinality of more than 1”, does it mean the cardinality.lower >= 1? Because cardinality is an interval not an integer. Or it is trying to reference the attribute as a “multiple attribute”?

Besides checking 1. & 2. for correctness in AOM1.4, we need the validity rules from AOM2 applied to AOM1.4 too. Without those rules, IMO there is a lot of room for interpretation, which could affect how data validation is implemented, so the same data could end up being validated in different ways in systems from different vendors.

This is just to make it easy to maintain archetypes (e.g. with 10 child objects and you change the owning attribute’s cardinality from 1…4 to 1…5 or whatever), and to also indicate that occurrences is always subordinate to cardinality.

Yes, I think that is wrong. Generally you should only trust the AOM2/ADL2 specs for proper semantics - I did not backport everything into ADL1.4. The runtime approach these days should always be to parse ADL1.4 archetypes into AOM2, like Archetype Designer and ADL WB do, and then all proper rules will be applied.

Gotcha! I didn’t think about that aspect.

Maybe doing a 1.4.1 release with the validity rules and the invariant fix would be useful. It could also include a clarification of what “owning attribute has a cardinality of more than 1” means (see the definition of the occurrences field in Archetype Object Model 1.4 (AOM1.4)).

BTW in AOM2, for the VACMCO validity rule:

it must be possible for at least one instance of one optional child object (i.e. an object for which the occurrences lower bound is 0) and one instance of every mandatory child object (i.e. object constraints for which the occurrences lower bound is >= 1) to be included within the cardinality range.

I think the part “one instance of every mandatory child object” is not totally correct, since objects with occurrences.lower = 3 would need 3 instances, not one. So the cardinality of the parent container should be able to contain at least a number of instances equal to the sum of the occurrences.lower of the mandatory objects, plus one if there are optional objects.

Is that correct?

Good catch. It should say something like: ‘… at least N instances (where N is the lower occurrences bound) of every …’.
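A minimal Python sketch of that corrected wording, assuming intervals are encoded as (lower, upper) pairs with None for *. The names slots_needed and fits_cardinality are hypothetical and only illustrate the counting; they are not taken from the spec.

```python
from typing import List, Optional, Tuple

# (lower, upper) pairs; upper None means '*'. Encoding is an assumption.
Interval = Tuple[int, Optional[int]]


def slots_needed(children: List[Interval]) -> int:
    """N instances per mandatory child, plus one slot if any child is optional."""
    mandatory = sum(lower for lower, _ in children)
    has_optional = any(lower == 0 for lower, _ in children)
    return mandatory + (1 if has_optional else 0)


def fits_cardinality(children: List[Interval], cardinality: Interval) -> bool:
    """True if the cardinality upper bound leaves room for slots_needed()."""
    _, card_upper = cardinality
    return card_upper is None or slots_needed(children) <= card_upper


children = [(3, 5), (0, 1)]                 # one mandatory child (3 instances), one optional
print(slots_needed(children))               # 4
print(fits_cardinality(children, (0, 3)))   # False
print(fits_cardinality(children, (0, 4)))   # True
```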

Can you raise a PR for that?

Done

https://openehr.atlassian.net/browse/SPECPR-371

https://openehr.atlassian.net/browse/SPECPR-372

@thomas.beale @pieterbos I have found that the ADL 1.0.2 spec has the validity rules for AOM 1.4. It is strange that the syntax spec defines the rules for the model, but anyway…

https://specifications.openehr.org/releases/1.0.2/architecture/am/adl.pdf see section 5.3.4.2, page 56.

VCOC: cardinality/occurrences validity: the interval represented by: (the sum of all occurrences minimum values) … (the sum of all occurrences maximum values) must be inside the interval of the cardinality.

The rule makes sense, though it prevents some cases that I think are valid at runtime, for instance:

occurrences of children:
1..*
1..1
0..1
0..*

cardinality of parent:
3..*  // this violates VCOC because the sum of all occurrences.lower is 2 and 2 is not in this interval, though at runtime it is possible to enforce the container to have 3 items even if only 2 are mandatory in the children.
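A small Python sketch of VCOC as quoted above, applied to this example. The (lower, upper) tuple encoding with None for * and the function name vcoc_holds are assumptions for illustration; an open sum of uppers is treated as fitting only under an open cardinality.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, Optional[int]]  # upper None means '*' (assumed encoding)


def vcoc_holds(children: List[Interval], cardinality: Interval) -> bool:
    """[sum of lowers .. sum of uppers] must lie inside the cardinality."""
    card_lower, card_upper = cardinality
    sum_lower = sum(lower for lower, _ in children)
    uppers = [upper for _, upper in children]
    sum_upper = None if None in uppers else sum(uppers)

    lower_inside = sum_lower >= card_lower
    if card_upper is None:
        upper_inside = True                      # anything fits under '*'
    else:
        upper_inside = sum_upper is not None and sum_upper <= card_upper
    return lower_inside and upper_inside


children = [(1, None), (1, 1), (0, 1), (0, None)]
print(vcoc_holds(children, (3, None)))   # False: sum of lowers is 2, below 3
print(vcoc_holds(children, (2, None)))   # True
```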

What cardinality should allow is to create as many child objects as their occurrences can hold and that should be in the cardinality interval. For instance this example is not valid in any case:

occurrences of children:
1..1
0..1

cardinality of parent:
3..*

A) To avoid this case, the sum of occurrences.upper should be >= cardinality.lower

And this shows the opposite case:

occurrences of children:
5..10
3..5

cardinality of parent:
3..6

B) To avoid this case, the sum of occurrences.lower should be <= cardinality.upper

I think combining A and B, with the rule to allow at least one optional object (C), makes all the cases work.
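Rules A and B can be sketched in Python as two independent checks. The tuple encoding (upper None meaning *) and the function names rule_a and rule_b are assumptions for illustration only.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, Optional[int]]  # upper None means '*' (assumed encoding)


def rule_a(children: List[Interval], cardinality: Interval) -> bool:
    """A) sum(occurrences.upper) >= cardinality.lower."""
    uppers = [upper for _, upper in children]
    if None in uppers:
        return True                    # an open child can always fill the container
    return sum(uppers) >= cardinality[0]


def rule_b(children: List[Interval], cardinality: Interval) -> bool:
    """B) sum(occurrences.lower) <= cardinality.upper."""
    card_upper = cardinality[1]
    return card_upper is None or sum(lower for lower, _ in children) <= card_upper


print(rule_a([(1, 1), (0, 1)], (3, None)))   # False: at most 2 items, 3 required
print(rule_b([(5, 10), (3, 5)], (3, 6)))     # False: at least 8 items, 6 allowed
print(rule_a([(1, None), (1, 1), (0, 1), (0, None)], (3, None)))  # True
```

The last call is the case that VCOC rejected earlier; rule A accepts it because an open child can supply the third item.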

C) If there are optional children in a container, the sum of all children’s occurrences.lower + 1 should be = min(cardinality.lower). For instance:

  • Note the min(cardinality.lower) means the minimal valid value for…
occurrences of children:
5..*
3..5
0..3

cardinality of parent:
9..*

This determines that the cardinality.lower should be at least 9, but could be 10, 11, …because of A). Since the first occurrences is infinite, any finite cardinality.lower is allowed.

If we have all finite intervals in the occurrences:

occurrences of children:
5..6
3..5
0..3

cardinality of parent:
9..*

Something similar happens here, cardinality.lower could be 9, 10, 11… etc, but applying A), the max(cardinality.lower) is 14 (sum(occurrences.upper)).

If the finite interval is the cardinality, and one of the occurrences is open-ended, it is valid too:

occurrences of children:
5..*
3..5
0..3

cardinality of parent:
9..15

cardinality.lower could be 9, …, 14 (there is a rule avoiding cardinality.lower = cardinality.upper in BASE 1.2.0 on Proper_Interval which applies to cardinality for AOM2 which I think should apply to AOM1.4 too, check Foundation Types).

Finally with all finite intervals:

occurrences of children:
5..6
3..5
0..3

cardinality of parent: (some valid values at runtime)
9..10 // allows less than the sum upper occurrences
9..20 // allows more than the sum upper occurrences
14..15 // lower complies with A)

A) 14 >= cardinality.lower (=> 14 = max(cardinality.lower))
B) 8 <= cardinality.upper (=> 8 = min(cardinality.upper))
C) 9 = min(cardinality.lower)
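The three numbers above can be derived mechanically from the children. The following Python sketch does that, assuming (lower, upper) tuples with None for *; the function name cardinality_limits and the dictionary keys are made up for this illustration.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, Optional[int]]  # upper None means '*' (assumed encoding)


def cardinality_limits(children: List[Interval]) -> Dict[str, Optional[int]]:
    """Limits that rules A), B) and C) place on the parent cardinality."""
    sum_lower = sum(lower for lower, _ in children)
    uppers = [upper for _, upper in children]
    sum_upper = None if None in uppers else sum(uppers)
    has_optional = any(lower == 0 for lower, _ in children)
    return {
        "max_lower_A": sum_upper,                      # None: no limit
        "min_upper_B": sum_lower,
        "min_lower_C": sum_lower + (1 if has_optional else 0),
    }


print(cardinality_limits([(5, 6), (3, 5), (0, 3)]))
# {'max_lower_A': 14, 'min_upper_B': 8, 'min_lower_C': 9}
print(cardinality_limits([(5, None), (3, 5), (0, 3)]))
# {'max_lower_A': None, 'min_upper_B': 8, 'min_lower_C': 9}
```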

There is an implicit 4th rule that applies to all intervals: lower < upper.

Would those rules be enough for all possible cases?

This is a very old version of the spec, and those validity rules were mostly revised and also moved to the AOM2. To work out valid rules for ADL1.4 today, we should use the modern validity rules in the current AOM2, in this case, VACMCU and VACMCO as you quoted earlier. The VCOC rule you quoted is actually still in the current ADL1.4 spec, but we should remove it and use the AOM2 rules.

Currently VACMCO has the definition: cardinality/occurrences orphans: it must be possible for at least one instance of one optional child object (i.e. an object for which the occurrences lower bound is 0) and one instance of every mandatory child object (i.e. object constraints for which the occurrences lower bound is >= 1) to be included within the cardinality range.

However, thinking about this now, and taking into account the ‘one instance’ error you already found, I would say it is still too restrictive. There is no real reason why every single optional child object should have to be simultaneously present in a real data instance - often such objects are alternatives, e.g. the case of systolic + diastolic pressure | mean arterial pressure | pulse pressure. Only one of these is ever used at runtime.

If we also agree with VACMCU, i.e. that occurrences of N..* can exist within a cardinality whose upper limit is some finite number (5 or whatever), with the meaning that the cardinality limit is the one that takes precedence at run time, then I think a better definition of VACMCO would be:

the cardinality range of an attribute and the occurrences of all of its object children must be such that both the former and the latter can be satisfied in some way at runtime. Formally, this means that there is at least one combination of occurrences ranges (individually or summed) that allows the cardinality range to be satisfied, and secondly that it is possible that at least one instance of each object, taken individually, can occur in the data (i.e. that the occurrences and cardinality constraints do not prevent any child objects ever appearing in data).

An example that would fail this check is:

  • Attribute cardinality matches {10…20}
    • Object[id10] occurrences matches {0…2}
    • Object[id11] occurrences matches {0…2}
    • Object[id12] occurrences matches {0…2}

Here, even if the max of each child object were created in the data, we would only get 6 items, which falls short of 10. No one ever creates archetypes like this of course (at least not in health), but nevertheless, we need a formal definition so this rule can be implemented.

A relatively simple algorithm needs to be developed, and added to the spec as the implementation of VACMCO.
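One possible shape for such an algorithm, as a Python sketch only. It is not the spec's implementation: the function names, the (lower, upper) encoding with None for *, and the assumption that every child has occurrences.upper >= 1 are all choices made for this illustration. It relies on the fact that every total between the sum of lowers and the sum of uppers can be reached by some combination of children.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, Optional[int]]  # upper None means '*' (assumed encoding)


def satisfiable(children: List[Interval], cardinality: Interval) -> bool:
    """Some combination of instance counts meets both kinds of constraint."""
    card_lower, card_upper = cardinality
    sum_lower = sum(lower for lower, _ in children)
    uppers = [upper for _, upper in children]
    sum_upper = None if None in uppers else sum(uppers)
    reaches_lower = sum_upper is None or sum_upper >= card_lower
    stays_under_upper = card_upper is None or sum_lower <= card_upper
    return reaches_lower and stays_under_upper


def no_orphans(children: List[Interval], cardinality: Interval) -> bool:
    """Every child can appear at least once (assumes each upper is >= 1)."""
    _, card_upper = cardinality
    sum_lower = sum(lower for lower, _ in children)
    for lower, _ in children:
        # smallest container that still includes one instance of this child
        smallest = sum_lower - lower + max(lower, 1)
        if card_upper is not None and smallest > card_upper:
            return False
    return True


def revised_vacmco(children: List[Interval], cardinality: Interval) -> bool:
    return satisfiable(children, cardinality) and no_orphans(children, cardinality)


print(revised_vacmco([(0, 2), (0, 2), (0, 2)], (10, 20)))  # False: at most 6 items
print(revised_vacmco([(2, 2), (0, 1)], (1, 2)))            # False: 2nd child can never appear
print(revised_vacmco([(2, 2), (0, 1)], (1, 3)))            # True
```

The first call is the failing example listed above. The second shows the orphan case: the two mandatory instances already fill the container, so the optional object could never be added.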

In terms of some of your examples, I would say that the general principle is as follows:

  • if the occurrences.upper of some or all of the child objects is finite, then the cardinality would normally be left open, i.e. {0..*} since defining finite limits doesn’t achieve anything, but can create maintenance problems over time.
  • if the overall cardinality.upper of the attribute is finite, then the occurrences.upper of the child objects need only be limited when it is actually required for specific objects, since otherwise, the cardinality upper limit will correctly control the total number of runtime data objects.
  • using finite limits on both cardinality and occurrences is likely to be for very special cases, and should be used sparingly.

In all cases, the VACMCU and revised VACMCO (above) should define the formal validity.

Agreed. My point is these rules shouldn’t be in the ADL spec but in the AOM spec, as they are in the AOM2. If there is any fix to the AOM2 rules, I would vote to release an AOM 1.4.1 with these rules integrated and an ADL 1.0.3? without those rules.

Though the examples provided apply for both AOM1.4 and AOM2.

Did you check rules A), B) and C) from my previous message?

Yes, the missing part is if mandatory objects have occurrences.lower > 1, that is not “one instance”. Though I think “at least one instance of any optional object” is correct.

This is rule C) in my previous message. Which could be written as: min(attr.cardinality.lower) >= sum(attr.children.occurrences.lower) + (attr.has_optional_child() ? 1 : 0)

I’ll check the rest ASAP.

I don’t think precedence should be considered here. Something that Pieter mentioned and that makes sense is: both cardinality and occurrences constraints should be verified separately, and if the data violates either, return an error.

That is analogous to my second invalid example:

occurrences of children:
1..1
0..1

cardinality of parent:
3..*

where cardinality.lower should be <= sum(children.occurrences.upper) (Rule A)

Which also allows this: (added a child with occurrences 0…* and fixed cardinality.lower to comply)

occurrences of children:
1..1
0..1
0..*

cardinality of parent:
2..*

That is exactly my concern: not what modelers do but what the formal rules allow, in order to have a consistent representation of models in archetypes and templates, which leads to consistent implementation of those rules in modeling tools.

I think all rules should be represented by an invariant formula rather than free text, since a formula is closer to mathematical notation and avoids interpretation issues.

I think that might be too restrictive. In some scores/scales, a modeler might want an exact number of objects in the multiple attribute, having children with occurrences.upper all finite.

I’m concerned about defining something as a general rule that might impose unwanted constraints on modelers. I prefer to define just the minimal set of rules, generic enough, that leads to a consistent model.

I interpret your second point as not specifying a rule.

Current VACMCU has a restrictive second statement:

VACMCU: cardinality/occurrences upper bound validity: where a cardinality with a finite upper bound is stated on an attribute, for all immediate child objects for which an occurrences constraint is stated, the occurrences must either have an open upper bound (i.e. n…*) which is interpreted as the maximum value allowed within the cardinality, or else a finite upper bound which is <= the cardinality upper bound.

IMO we can allow that, like in:

occurrences of children:
2..12
0..5

cardinality of parent:
2..10

The 12 is like having an “*”: it will always be limited by the 10 (note it’s not precedence, but just applying both rules at the same time, the cardinality one is violated e.g. if there are more than 10 instances of the 1st object).
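A sketch of what "applying both rules at the same time" could look like when validating data, in Python. The function names and the tuple encoding (upper None meaning *) are assumptions for illustration; counts holds the number of instances found in the data for each child, in the same order as children.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, Optional[int]]  # upper None means '*' (assumed encoding)


def in_interval(value: int, interval: Interval) -> bool:
    lower, upper = interval
    return value >= lower and (upper is None or value <= upper)


def validate_container(counts: List[int], children: List[Interval],
                       cardinality: Interval) -> List[str]:
    """Check occurrences and cardinality independently; collect every error."""
    errors = []
    for index, (count, occurrences) in enumerate(zip(counts, children)):
        if not in_interval(count, occurrences):
            errors.append(f"child {index}: {count} instances violate occurrences")
    total = sum(counts)
    if not in_interval(total, cardinality):
        errors.append(f"container: {total} items violate cardinality")
    return errors


children = [(2, 12), (0, 5)]
cardinality = (2, 10)
print(validate_container([11, 0], children, cardinality))
# ['container: 11 items violate cardinality']
print(validate_container([8, 2], children, cardinality))
# []
```

With 11 instances of the first object, the occurrences check passes (11 <= 12) and only the cardinality check reports an error, so no precedence between the two is needed.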

Another case would be to have finite lower occurrences:

occurrences of children:
2..12
2..5

cardinality of parent:
2..10

In this case you can never reach 10 instances of the first object in the container because at least 2 should be of the second object, so the children[0].occurrences.upper is really constrained by cardinality.upper - children[0].occurrences.lower - children[1].occurrences.lower, and that is when things get tricky when defining general rules.

Something like this would be mathematically correct when cardinality.upper is set to 10, and decreasing children[0].occurrences.upper to the max allowed value:

occurrences of children:
2..6
2..5

cardinality of parent:
2..10

One thing related to my rule C) and what VACMCO says about the optional objects. I was thinking about the cardinality.lower in that case but is incorrect:

C) If there are optional children in a container, the sum of all children’s occurrences.lower + 1 should be = min(cardinality.lower).

That should really be:

C) min(cardinality.lower) <= sum(children.occurrences.lower) and min(cardinality.upper) >= sum(children.occurrences.lower) + 1

The +1 allows the interval to be a proper interval, and accommodate any optional objects, if there are any. I think that applies in any case. Note min(cardinality.upper) when upper is infinite, the whole expression is infinite. That is kind of a corrected VACMCO.

Argh, I’m still testing cases and fixing rules:

IMO rule C is not needed. For example, the following could happen, which I think is valid because there are combinations at runtime that comply with both occurrences and cardinality constraints:

occurrences of children:
0..*
0..*
0..*

cardinality of parent:
4..10

Which passes rules A and B:

A) sum(occurs.upper) = * >= cardinality.lower = 4
B) sum(occurs.lower) = 0 < cardinality.upper = 10

Note that in B I changed <= to < because that generates a proper interval for the cardinality (lower < upper), since sum(occurs.upper) could be equal to sum(occurs.lower) (when all occurrences are 1…1).

I think that rules out all the cases that make it impossible to generate a valid runtime combination of objects in the container that complies with both occurrences and cardinality constraints, which IMO is the central problem to solve: avoid specifying something in AOM that can’t hold at runtime, like a logical “bottom” or contradiction.
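For small cases, that central question can be cross-checked by brute force: enumerate per-child instance counts and look for one that satisfies everything. The Python sketch below does this; it is not a proposed validity rule, and the function name and the tuple encoding (upper None meaning *) are assumptions for illustration. The search is exponential in the number of children, so it is only useful for testing closed-form rules such as A and B.

```python
from itertools import product
from typing import List, Optional, Tuple

Interval = Tuple[int, Optional[int]]  # upper None means '*' (assumed encoding)


def find_valid_combination(children: List[Interval],
                           cardinality: Interval) -> Optional[Tuple[int, ...]]:
    """Return one tuple of per-child instance counts that satisfies both
    occurrences and cardinality, or None if no such tuple exists."""
    card_lower, card_upper = cardinality
    ranges = []
    for lower, upper in children:
        # No count above cardinality.upper can be valid; with an open
        # cardinality, counts above max(lower, cardinality.lower) add nothing new.
        cap = card_upper if card_upper is not None else max(lower, card_lower)
        top = cap if upper is None else min(upper, cap)
        ranges.append(range(lower, top + 1))
    for counts in product(*ranges):
        total = sum(counts)
        if total >= card_lower and (card_upper is None or total <= card_upper):
            return counts
    return None


print(find_valid_combination([(0, None)] * 3, (4, 10)))      # (0, 0, 4)
print(find_valid_combination([(1, 1), (0, 1)], (3, None)))   # None
print(find_valid_combination([(5, 10), (3, 5)], (3, 6)))     # None
```

The first call is the 0..* example under cardinality 4..10; the other two are the earlier examples that rules A and B reject.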