Option to record free text as choice against DV_CODED_TEXT - do we need DV_PLAIN_TEXT

thomas.beale · 1 April 2021 17:56

Following recent discussions in the real world relating to solving this problem in tools (mainly Better’s AD tool), I have re-read this topic, and have the following synthesis to offer.

Please review and comment w.r.t. your various tools: @pieterbos , @borut.fabjan , @ian.mcnicoll , @sebastian.garde , @yampeku … anyone else interested.

Assumptions

The following are assumptions we should keep in mind in existing openEHR tools & systems.

A rule that is only mentioned briefly in the ADL2 spec is that runtime validation checks each data instance against the constraint for the most specific RM class that matches it. The example in the spec is a PARTY hierarchy.

With respect to our pattern of interest, i.e.

    value ∈ {
       DV_CODED_TEXT[id4] ∈ {[ac1]}
       DV_TEXT[id5] ∈ { ... }
    }

If a DV_CODED_TEXT data instance is created at runtime, it must conform to the DV_CODED_TEXT constraint. If this were not the case, the more specific RM type constraints would carry no weight.

We sometimes forget this (even I do), but it is almost a second ‘golden rule’ of archetyping.
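
To make the rule concrete, here is a minimal Python sketch of how a validator might choose which alternative to check an instance against. The `RM_PARENT` table, the function names and the (node id, RM type) pairs are assumptions made up for illustration; they are not taken from the ADL2 spec or from any existing tool.

```python
# Toy RM inheritance table: class -> parent class (hypothetical subset of the RM).
RM_PARENT = {"DV_CODED_TEXT": "DV_TEXT", "DV_TEXT": None}


def ancestors(rm_type):
    """Return rm_type followed by its RM ancestors, most specific first."""
    chain = []
    while rm_type is not None:
        chain.append(rm_type)
        rm_type = RM_PARENT[rm_type]
    return chain


def applicable_constraints(instance_type, alternatives):
    """Pick the alternatives whose RM type is the most specific match.

    `alternatives` is a list of (node_id, rm_type) pairs under one attribute.
    """
    for rm_type in ancestors(instance_type):
        matches = [node_id for node_id, t in alternatives if t == rm_type]
        if matches:
            return matches  # stop at the first (most specific) level that matches
    return []


alternatives = [("id4", "DV_CODED_TEXT"), ("id5", "DV_TEXT")]
print(applicable_constraints("DV_CODED_TEXT", alternatives))  # ['id4']
print(applicable_constraints("DV_TEXT", alternatives))        # ['id5']
```

With both alternatives present, a DV_CODED_TEXT instance is only ever checked against id4. It would fall back to id5 only if no DV_CODED_TEXT constraint existed at all.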

In general, multiple constraints for the same RM type are allowed (and are normal). Example:

    --
    -- pre-tuple style of handling different units
    --
    value ∈ {
        DV_QUANTITY [id14] ∈ {
            property ∈ {[openehr::151|temperature|]}
            units ∈ {"deg F"}
            magnitude ∈ {|32.0..212.0|}
        }
        DV_QUANTITY [id15] ∈ {
            property ∈ {[openehr::151|temperature|]}
            units ∈ {"deg C"}
            magnitude ∈ {|0.0..100.0|}
        }
    }

More generally, the constraints within a container attribute are often a set of ‘alternative’ constraints on the same RM type, e.g.

    HISTORY[id2] occurrences ∈ {1} ∈ {
        periodic ∈ {False}
        events cardinality ∈ {*} ∈ {
            EVENT[id3] occurrences ∈ {0..1} ∈ {    }           -- 1 min sample
            EVENT[id4] occurrences ∈ {0..1} ∈ {    }           -- 2 min sample
            EVENT[id5] occurrences ∈ {0..1} ∈ {    }           -- 3 min sample
        }
    }

The only difference is that several of the alternatives can be used at once within a container attribute, whereas in a single-valued attribute you have to pick one.
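
For the single-valued case, the two DV_QUANTITY alternatives in the temperature example can be read as "the instance must satisfy at least one of these". A rough Python sketch of that reading follows; the tuple layout and function names are invented for illustration, and only the units and ranges come from the example.

```python
# Each alternative: (node_id, units, (min, max)), mirroring id14 and id15 above.
ALTERNATIVES = [
    ("id14", "deg F", (32.0, 212.0)),
    ("id15", "deg C", (0.0, 100.0)),
]


def matching_alternatives(units, magnitude):
    """Return the node ids of every alternative the quantity satisfies."""
    return [
        node_id
        for node_id, alt_units, (low, high) in ALTERNATIVES
        if units == alt_units and low <= magnitude <= high
    ]


def valid_single_valued(units, magnitude):
    """A single-valued attribute holds one instance, which must match some alternative."""
    return bool(matching_alternatives(units, magnitude))


print(matching_alternatives("deg C", 37.0))   # ['id15']
print(matching_alternatives("deg F", 98.6))   # ['id14']
print(valid_single_valued("deg C", 150.0))    # False
```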

Problem (re)statement

Allow a pattern like DV_TEXT + DV_CODED_TEXT to be defined in the child of a parent that has just DV_TEXT, such that any DV_CODED_TEXT instance provided at runtime must conform to only the DV_CODED_TEXT constraint, and is not allowed to ‘sneak through’ as an instance of the DV_TEXT constraint. This was @borut.fabjan 's original worry.

Problem Interpretation

According to assumption #1 above (validation uses the most specific RM class for the instance), the intended effect will be achieved anyway, as long as alternative DV_CODED_TEXT constraints are not added in further children in order to escape the DV_CODED_TEXT defined in the first child. So the thing we really need is to be able to define a constraint on an RM type that is a descendant of another type, such that it is the only constraint for that RM type - no alternatives allowed.

Candidate solutions

From the original long discussion, we have 3 possible kinds of solution:

A: fix the RM, using DV_PLAIN_TEXT as a sibling of DV_CODED_TEXT, and make DV_TEXT an abstract type. This fixes the problem by getting rid of the situation of two concrete classes in a parent-child inheritance relation in the RM, but it is impractical from a backwards compatibility point of view.
B: my initial solution - ‘closed’ attribute + ‘RM type final’. This might work, but is clunky and not too intuitive.
C: @pieterbos 's solution (here) - a smart rule like this: if a DV_CODED_TEXT constraint is present and its constraint_status is required, it is not allowed to create any other DV_CODED_TEXT in a child archetype or in corresponding RM data that does not conform to that DV_CODED_TEXT, unless that new DV_CODED_TEXT is a specialisation of the parent DV_CODED_TEXT. (The rule would obviously be generalised to any RM type.)

Of the above, @pieterbos 's solution is the nicest - it’s minimal and reasonably intuitive - except that it cannot be applied generally, since it breaks assumption #2 mentioned at the top (multiple constraints for the same RM type are allowed and normal). It’s close to what my ‘rm_final’ flag was trying to do. So a rule like this needs a keyword to indicate that it is operating, possibly something like this:

    value ∈ {
       DV_CODED_TEXT[id4] unique ∈ { ... }
       DV_TEXT[id5] ∈ { ... }
    }

That ‘unique’ keyword means: this is the sole constraint specified for any DV_CODED_TEXT instance (including any instance of child RM types, if such existed). A child archetype could redefine this, but can’t add another independent DV_CODED_TEXT constraint - i.e. it is saying to apply Pieter’s rule ‘here’. EDIT: if people think ‘frozen’ or ‘final’ makes more sense than ‘unique’, fine with me. We can vote on it.
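
As a rough illustration of what a tool would have to check for such a keyword, here is a Python sketch. Everything in it is an assumption for discussion purposes: the `RM_PARENT` table, the tuple layout, the function names, and the simplification that a child node redefines a parent node when its id is identical or extends it with a dotted suffix (e.g. id4.1 for id4), while a newly added node looks like id0.1.

```python
# Toy RM inheritance table: class -> parent class (hypothetical subset of the RM).
RM_PARENT = {"DV_CODED_TEXT": "DV_TEXT", "DV_TEXT": None}


def conforms_to(rm_type, ancestor):
    """True if rm_type is `ancestor` or one of its RM descendants."""
    while rm_type is not None:
        if rm_type == ancestor:
            return True
        rm_type = RM_PARENT[rm_type]
    return False


def redefines(child_id, parent_id):
    """Simplified specialisation test on node ids: same id, or a dotted extension."""
    return child_id == parent_id or child_id.startswith(parent_id + ".")


def unique_violations(parent_alts, child_alts):
    """List child node ids that add an independent constraint on a 'unique' RM type.

    parent_alts: (node_id, rm_type, is_unique) triples; child_alts: (node_id, rm_type).
    """
    violations = []
    for p_id, p_type, p_unique in parent_alts:
        if not p_unique:
            continue
        for c_id, c_type in child_alts:
            if conforms_to(c_type, p_type) and not redefines(c_id, p_id):
                violations.append(c_id)
    return violations


parent = [("id4", "DV_CODED_TEXT", True), ("id5", "DV_TEXT", False)]
child_ok = [("id4.1", "DV_CODED_TEXT"), ("id5", "DV_TEXT")]
child_bad = child_ok + [("id0.1", "DV_CODED_TEXT")]

print(unique_violations(parent, child_ok))   # []
print(unique_violations(parent, child_bad))  # ['id0.1']
```

The first child only redefines id4, so it passes. The second adds an independent DV_CODED_TEXT constraint (id0.1) next to it, which is exactly what the keyword is meant to forbid.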

I believe that this is the most minimal thing we can do, consistent with current semantics, and that won’t break any existing archetypes or RM implementations. Over to the group.