Option to record free text as choice against DV_CODED_TEXT - do we need DV_PLAIN_TEXT

thomas.beale · 20 August 2020 17:39

Yes, I was not clear, that is indeed what I meant: a new VXXX rule, with a validator check to implement it (phase 2 in my system). The rule by the way can be made to apply regardless of RM type - it’s something like this:

VURMST: unique RM sub-typing - a node of RM type T may not be instantiated as a subtype T1 if there are also constraints for type T1, and the instance does not conform to one of those constraints.
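
As a rough illustration only (not taken from any openEHR specification or tool), a validator could check the rule along these lines in Python. The `Constraint` class, its `accepts` predicate and the `at1`/`at2` codes are assumptions invented for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Constraint:
    rm_type: str                     # RM type named by the archetype node
    accepts: Callable[[Dict], bool]  # hypothetical conformance check


def vurmst_ok(instance_type: str, instance: Dict,
              constraints: List[Constraint]) -> bool:
    """Return False only when constraints exist for the instance's own
    RM type and the instance conforms to none of them."""
    same_type = [c for c in constraints if c.rm_type == instance_type]
    if not same_type:
        return True  # no constraint on this exact type: rule does not apply
    return any(c.accepts(instance) for c in same_type)


# Hypothetical alternatives under a single-valued attribute.
constraints = [
    Constraint("DV_CODED_TEXT", lambda v: v.get("code") in {"at1", "at2"}),
    Constraint("DV_TEXT", lambda v: True),
]
```

With these alternatives, a DV_CODED_TEXT carrying a code outside the set is rejected instead of being accepted through the DV_TEXT node.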

pieterbos · 21 August 2020 07:15

That could work. It may need some further restrictions, because obviously this does not apply to items in a cluster. Is this only for single-valued attributes? Or only when marked as ‘this cannot be subclassed’ in the archetype? Or only for specific parts of the RM?

pieterbos · 7 September 2020 13:42

So https://openehr.atlassian.net/browse/SPECAM-68 adds a constraint_status, which corresponds to the FHIR value set binding strength.

I wonder if that can be used in a new validation rule, something like: if a DV_CODED_TEXT constraint is present and its constraint_status is required, it is not allowed to create any other DV_CODED_TEXT in a child archetype or in corresponding RM data that does not conform to that DV_CODED_TEXT, unless that new DV_CODED_TEXT is a specialisation of the parent DV_CODED_TEXT.

That would mean you could combine DV_CODED_TEXT constraints with different constraint_status values and DV_TEXT to solve the different use cases.
Note that currently all DV_CODED_TEXT constraints have a constraint_status of required by default, because it is the only binding strength currently supported, so I am not sure what that would mean for current archetypes and specialisations.

If this does not solve it, or is perhaps a bit of an ugly solution, I would prefer the final keyword here.

ian.mcnicoll · 7 September 2020 14:20

That’s not a bad suggestion, Pieter, but I might suggest a slight change. Currently, FHIR extensible means you can extend the value set either with new codes or with free text.

So perhaps we have what we need just with a combination of CODED_TEXT = extensible/required and TEXT.

a) TEXT

=> anything including CODED_TEXT

b) TEXT
CODED_TEXT = extensible

=> Any code from the CODED_TEXT valueset, any other code or free text

c) TEXT
CODED_TEXT = required

=> Any code from the CODED_TEXT valueset or free text

This makes the requirement clear, although technically the TEXT could still be sub-classed at run-time, which in practice makes this ‘extensible’. I can live with that, as at least the requirement is clear even if it cannot be technically enforced.
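
The three cases can be written down as a small decision function. This is only a sketch of the intended reading, in Python, and not of what any existing validator does; the function name, its parameters and the convention that `code is None` means free text are all assumptions made for illustration.

```python
from typing import Optional, Set


def accepted(code: Optional[str],
             valueset: Optional[Set[str]],
             strength: Optional[str]) -> bool:
    """code is None for free text.
    valueset and strength are None when only TEXT is constrained (case a)."""
    if code is None:
        return True   # the TEXT alternative is present in all three cases
    if valueset is None:
        return True   # case a: anything, including any CODED_TEXT
    if strength == "extensible":
        return True   # case b: value set code or any other code
    if strength == "required":
        return code in valueset  # case c: only codes from the value set
    raise ValueError(f"unknown strength: {strength!r}")
```

The only combination that returns False is a code outside the value set when the strength is required, which is exactly the part that cannot be enforced if the TEXT alternative is allowed to match sub-classed instances.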

sebastian.garde · 7 September 2020 14:37

Your (Ian’s) examples b) and c) sound like direct applications of Pieter’s validation rule to me?
I think the validation rule makes a lot of sense (and is not that ugly either).

pieterbos · 7 September 2020 20:55

One thing: the ADL 2/AOM 2 specification is entirely independent of the RM specification, without any domain classes or concepts, so a rule that specific cannot be in the AOM validation rules, even though that would be necessary for this solution. In fact, the implementation would mean code specific to the openEHR RM, outside of the BMM model. Or would there be another place where this can be specified and implemented?

The final keyword might still be easier?

ian.mcnicoll · 8 September 2020 08:20

I think the thing here is that the binding strength is actually a property of the Valueset not the DV_CODED_TEXT as such

a) TEXT

=> anything including CODED_TEXT

b) TEXT
VALUESET = extensible

=> Any code from the valueset, any other code or free text

c) TEXT
VALUESET = required

=> Any code from the valueset or free text

Does that help validation by being domain neutral?

Thomas’s suggested rule would also work, but I appreciate there are broader issues around where it should apply.

pieterbos · 8 September 2020 08:31

It’s on the c_terminology_code, not the value set, so the same thing as in FHIR (which is good for easier mapping!)

But that doesn’t help, same problem. But perhaps there is an easy way to do this.

ian.mcnicoll · 8 September 2020 12:48

I’m starting to think that we can just leave this as guidance and forget about validation, if that is problematic. I think we now have a clear way of communicating the designers’ wishes in terms of extensibility / use of free text, which was a bit vague before.

thomas.beale · 8 September 2020 14:34

OK - a reasonable version of the changes for required | preferred etc constraint strengths:

ADL2
AOM2

thomas.beale · 1 April 2021 17:56

Following recent discussions in the real world relating to solving this problem in tools (mainly Better’s AD tool), I have re-read this topic, and have the following synthesis to offer.

Please review and comment w.r.t your various tools: @pieterbos , @borut.fabjan , @ian.mcnicoll , @sebastian.garde , @yampeku … anyone else interested.

Assumptions

The following are assumptions we should keep in mind in existing openEHR tools & systems.

1. A rule that is only mentioned briefly in the ADL2 spec is that runtime-validation of data instances uses the most specific RM class it can, for the given instance. The example in the spec is a PARTY hierarchy.

With respect to our pattern of interest, i.e.

    value ∈ {
       DV_CODED_TEXT[id4] ∈ {[ac1]}
       DV_TEXT[id5] ∈ { ... }
    }

If a DV_CODED_TEXT data instance is created at runtime, it must conform to the DV_CODED_TEXT constraint. If this were not the case, the more specific RM type constraints have no weight.

We sometimes forget this (even I do), but it is almost a second ‘golden rule’ of archetyping.
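
To make the selection step concrete, here is a minimal Python sketch of ‘use the most specific RM class available’. The `RM_ANCESTORS` table is a hand-written stand-in for the real RM class hierarchy, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple

# Each RM type mapped to itself followed by its ancestors, most specific
# first. Hand-written for the sketch; a real tool would read this from the RM.
RM_ANCESTORS: Dict[str, List[str]] = {
    "DV_CODED_TEXT": ["DV_CODED_TEXT", "DV_TEXT"],
    "DV_TEXT": ["DV_TEXT"],
}


def applicable_node_ids(instance_type: str,
                        constraints: List[Tuple[str, str]]) -> List[str]:
    """constraints is a list of (node_id, rm_type) pairs.
    Return the node ids for the most specific RM type that is constrained."""
    for rm_type in RM_ANCESTORS[instance_type]:
        ids = [node_id for node_id, t in constraints if t == rm_type]
        if ids:
            return ids
    return []
```

For the pattern above, a DV_CODED_TEXT instance is validated against id4 only; it falls back to id5 only when the archetype has no DV_CODED_TEXT constraint at all.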

2. In general, multiple constraints for the same RM type are allowed (and are normal). Example:

    --
    -- pre-tuple style of handling different units
    --
    value ∈ {
        DV_QUANTITY [id14] ∈ {
            property ∈ {[openehr::151|temperature|]}
            units ∈ {"deg F"}
            magnitude ∈ {|32.0..212.0|}
        }
        DV_QUANTITY [id15] ∈ {
            property ∈ {[openehr::151|temperature|]}
            units ∈ {"deg C"}
            magnitude ∈ {|0.0..100.0|}
        }
    }
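
At runtime, an instance under such a single-valued attribute conforms if it matches any one of the alternatives. A small Python sketch of that check for the two DV_QUANTITY nodes above follows; the dictionary layout and the function name are assumptions made for illustration.

```python
from typing import Optional

# The two alternatives from the ADL example, transcribed by hand.
ALTERNATIVES = [
    {"node_id": "id14", "units": "deg F", "low": 32.0, "high": 212.0},
    {"node_id": "id15", "units": "deg C", "low": 0.0, "high": 100.0},
]


def matching_node(units: str, magnitude: float) -> Optional[str]:
    """Return the node id of the first alternative the quantity
    conforms to, or None if it conforms to none of them."""
    for alt in ALTERNATIVES:
        if alt["units"] == units and alt["low"] <= magnitude <= alt["high"]:
            return alt["node_id"]
    return None
```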

More generally, any typical constraint within a container attribute is also often a bunch of ‘alternative’ constraints on the same RM type, e.g.

    HISTORY[id2] occurrences ∈ {1} ∈ {
        periodic ∈ {False}
        events cardinality ∈ {*} ∈ {
            EVENT[id3] occurrences ∈ {0..1} ∈ {    }           -- 1 min sample
            EVENT[id4] occurrences ∈ {0..1} ∈ {    }           -- 2 min sample
            EVENT[id5] occurrences ∈ {0..1} ∈ {    }           -- 3 min sample
        }
    }

It’s just that more than one alternative could be used within a container attribute, whereas in a single-valued attribute, you have to pick one.

Problem (re)statement

Allow a pattern like DV_TEXT + DV_CODED_TEXT to be defined in the child of a parent that has just DV_TEXT, such that any DV_CODED_TEXT instance provided at runtime must conform to only the DV_CODED_TEXT constraint, and is not allowed to ‘sneak through’ as an instance of the DV_TEXT constraint. This was @borut.fabjan 's original worry.

Problem Interpretation

According to assumption #1 above, the intended effect will be achieved anyway, as long as alternative DV_CODED_TEXT constraints are not added in further children in order to escape the DV_CODED_TEXT defined in the first child. So the thing we really need is to be able to define a constraint on an RM type that is a descendant of another type, such that it is the only constraint for that RM type - no alternatives allowed.

Candidate solutions

From the original long discussion, we have 3 possible kinds of solution:

A: fix the RM, using DV_PLAIN_TEXT as a sibling of DV_CODED_TEXT, and make DV_TEXT an abstract type. This fixes the problem by getting rid of the situation of two concrete classes in a parent-child inheritance relation in the RM; impractical from a backwards compatibility point of view;
B: my initial solution - ‘closed’ attribute + ‘RM type final’. This might work, but is clunky and not too intuitive.
C: @pieterbos 's solution (here) - a smart rule like this: if a DV_CODED_TEXT constraint is present and its constraint_status is required, it is not allowed to create any other DV_CODED_TEXT in a child archetype or in corresponding RM data that does not conform to that DV_CODED_TEXT, unless that new DV_CODED_TEXT is a specialisation of the parent DV_CODED_TEXT (generalised to any RM types obviously).

Of the above, @pieterbos 's solution is the nicest - it’s minimal and reasonably intuitive - except that it cannot be applied generally, since it breaks assumption #2 mentioned at the top. It’s close to what my ‘rm_final’ flag was trying to do. So to have a rule like this means it needs a keyword to indicate it is operating, possibly something like this:

    value ∈ {
       DV_CODED_TEXT[id4] unique ∈ { ... }
       DV_TEXT[id5] ∈ { ... }
    }

That ‘unique’ keyword means: this is the sole constraint specified for any DV_CODED_TEXT instance (including any instance of child RM types, if such existed). A child archetype could redefine this, but can’t add another independent DV_CODED_TEXT constraint - i.e. it is saying to apply Pieter’s rule ‘here’. EDIT: if people think ‘frozen’ or ‘final’ makes more sense than ‘unique’, fine with me. We can vote on it.
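
A tool could check a child archetype against this along the following lines. This is a hedged Python sketch: the function and parameter names are hypothetical, it assumes that a redefinition of node id4 carries a dotted id such as id4.1, and for brevity it ignores child RM types of the marked type.

```python
from typing import Dict, List, Tuple


def independent_additions(parent_unique: Dict[str, str],
                          child_nodes: List[Tuple[str, str]]) -> List[str]:
    """parent_unique maps an RM type to the node id marked 'unique' in the
    parent. child_nodes lists (node_id, rm_type) pairs from the child.
    Return the child node ids that add an independent constraint on a
    type the parent has marked 'unique'."""
    offending = []
    for node_id, rm_type in child_nodes:
        unique_id = parent_unique.get(rm_type)
        if unique_id is None:
            continue  # the parent did not mark this RM type as unique
        if node_id == unique_id or node_id.startswith(unique_id + "."):
            continue  # same node, or a redefinition of it: allowed
        offending.append(node_id)
    return offending
```

For example, with id4 marked unique for DV_CODED_TEXT, a child node id4.1 passes, while a new DV_CODED_TEXT node with an unrelated id is reported.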

I believe that this is the most minimal thing we can do, consistent with current semantics, and that won’t break any existing archetypes or RM implementations. Over to the group.

yampeku · 1 April 2021 19:59

One problem I can see is that constraint_status value is assumed to be ‘required’ if not explicitly stated otherwise, which in the end:
A) could make every existing archetype non-specializable as they are right now, requiring the modification of every archetype
B) could make decisions difficult, because we cannot tell whether a constraint_status of required was merely made explicit from the previously implicit default (see A) or was deliberately set by the editor of the archetype

Also, I don’t know if alternatives of C_TERMINOLOGY_CODE are really allowed, what the outcome of the proposed constraint_status rule would be in that case, or whether the constraint_status can differ between the alternatives.
I think alternatives of C_TERMINOLOGY_CODE should be allowed, with different constraint_status values permitted. That way we could put examples and the actual constraints in the same parent archetype.

wouterzanen · 5 January 2023 15:14

Hi,

I have been reading this topic because the same question has popped up in the Netherlands. Basically, we have a more conceptual definition of care building blocks (Zibs), which are defined only in UML. It allows for CODED_TEXT, which can be required or extensible. The problem is that if it is extensible, you can extend it with any other code system and/or text. But this is not what is needed in practice, because you would like to adhere strictly to a code list and only use free text when no code in the list is suitable.

So a similar challenge. I can see this in the AOM2 document:

        ELEMENT[id22] occurrences matches {0..1} matches {
            value matches {
                DV_CODED_TEXT[id58] matches {
                    defining_code matches {extensible [ac2]} -- use ac2 value-set if there is a match
                }                                            -- or another code from same terminology
                DV_TEXT[id59]                                -- or plain text
            }
        }

So it seems to be solved in this way. Now the value set that is used is a constraint list from |SNOMED CT: <260787004|Physical object||.

So now the questions:

Does the extensibility mean that all values from SNOMED CT are now possible as an extension, but not values from any other code system? (This still seems to leave it quite open which codes can be used.)
And maybe this is not a topic for here, but did anybody try to constrain this in FHIR as well, or does anyone have a pattern to do this? Or do we leave it up to the system storing the data (openEHR or legacy) to solve this?
It seems that the ADL2 description is somewhat in conflict with the AOM2:

extensible

Data item value should satisfy constraint, i.e. a term in constraint is to be used if it covers the data item meaning (including more generally); if not, another code may be used, including from **another** terminology.

Enumeration value = 1.

Whereas the AOM2 definition says: or another code from same terminology
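
To make the difference between the two readings explicit, here is a small Python sketch. It is only an illustration: the function name, the `same_terminology_only` flag and the (terminology, code) tuples are assumptions, and the ‘use a value set term if it covers the meaning’ part of the definition is a judgement for the person recording the data that no code check can capture.

```python
from typing import Set, Tuple

Code = Tuple[str, str]  # (terminology id, code string), hypothetical layout


def extensible_ok(code: Code, valueset: Set[Code],
                  same_terminology_only: bool) -> bool:
    """Check a code against an 'extensible' value set under either reading.
    same_terminology_only=True : another code must come from a terminology
                                 already used in the value set.
    same_terminology_only=False: any other code is acceptable, including
                                 one from another terminology."""
    if code in valueset:
        return True
    if same_terminology_only:
        terminologies = {terminology for terminology, _ in valueset}
        return code[0] in terminologies
    return True
```

Under the first reading a code from a different terminology is rejected, and under the second it is accepted, so the two texts lead to different validation results for the same data.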