Option to record free text as choice against DV_CODED_TEXT - do we need DV_PLAIN_TEXT

ian.mcnicoll · 30 April 2020 10:07

There is an established use of the DV_TEXT/DV_CODED_TEXT choice pattern to allow:

- extension/replacement of internal code lists (better solved by the proposed changes to allow for ‘extensible’ etc.);
- a free text to be used in place of a coded text, where no code is available.

As an example, the Adverse reaction archetype is intended to be used in the UK with a very strict SNOMED valueset, so that it can power decision support across systems. So in this case the valueset is ‘required’ and not ‘extensible’. However, we do recognise that very occasionally allergies need to be recorded against drugs that currently have no coded terms, e.g. new medications, trial drugs, foreign products.

In FHIR the ‘extensible’ valueset binding allows for a free text extension (against their CodeableConcept datatype), but that would not work for us technically and is also, for me, not quite right semantically. In the UK use-case I want to strictly control the coded valueset, so it should be ‘required’ and not ‘extensible’.

The good news is that with the changes to valueset binding strength, we can make the DV_TEXT/DV_CODED_TEXT option work for this.

The bad news is that it leaves open a problem that Fabio highlighted in Braunschweig: if we leave DV_TEXT as an alternate through to run-time (or template level), it can always potentially be sub-classed to a DV_CODED_TEXT, and therefore subvert any existing constraints on the existing DV_CODED_TEXT valueset bindings, i.e. I could then use any coded valueset I wished.

This is one of the issues that is blocking full adoption of ADL2 in AD.

Possible solutions…

1. Add an attribute to DV_TEXT that signifies it as ‘final’, i.e. not able to be sub-classed to DV_CODED_TEXT.
2. Add a new sub-class to DV_TEXT alongside DV_CODED_TEXT as DV_PLAIN_TEXT.

In practical terms, when we as modellers want to use the Choice option, we would use DV_PLAIN_TEXT/DV_CODED_TEXT as the normal choice.

I think this solves Fabio’s issue.
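To make the difference between the two choice patterns concrete, here is a minimal Python sketch rather than anything taken from an openEHR library. The class names mirror the RM types, `DvPlainText` stands for the proposed (currently non-existent) class, and `REQUIRED_CODES`, `current_choice` and `proposed_choice` are hypothetical names with placeholder codes. It assumes a validator that checks the DV_TEXT alternative polymorphically.

```python
from dataclasses import dataclass


@dataclass
class DvText:
    value: str


@dataclass
class DvCodedText(DvText):
    # In the RM, DV_CODED_TEXT is a subtype of DV_TEXT.
    code: str = ""


@dataclass
class DvPlainText(DvText):
    """Proposed sibling of DV_CODED_TEXT; hypothetical, with no subtypes."""


# Placeholder codes standing in for a 'required' valueset (not real SNOMED CT codes).
REQUIRED_CODES = {"code-a", "code-b"}


def in_required_valueset(data: DvText) -> bool:
    return isinstance(data, DvCodedText) and data.code in REQUIRED_CODES


def current_choice(data: DvText) -> bool:
    """DV_CODED_TEXT (required valueset) OR DV_TEXT, checked polymorphically."""
    return in_required_valueset(data) or isinstance(data, DvText)


def proposed_choice(data: DvText) -> bool:
    """DV_CODED_TEXT (required valueset) OR DV_PLAIN_TEXT."""
    return in_required_valueset(data) or isinstance(data, DvPlainText)


rogue = DvCodedText(value="Some drug", code="local-999")
print(current_choice(rogue))   # True: slips through the DV_TEXT alternative
print(proposed_choice(rogue))  # False: neither alternative matches
print(proposed_choice(DvPlainText(value="Trial drug, no code yet")))  # True
```

The point of the sketch is only that a coded text outside the valueset has nowhere to go once the free-text alternative is a type that DV_CODED_TEXT does not inherit from.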

Seref · 30 April 2020 10:27

I think option 2 of having a DV_PLAIN_TEXT is the better one. openEHR by design prefers types over codes to express semantics (or that’s my incorrect view of it).
A type fits better into overall methodology, tooling and implementations for this particular requirement.

No runtime surprises for DV_PLAIN_TEXT since it has no descendants, communicates a modeller’s explicit request/wish to use free text. Sounds good to me.

ian.mcnicoll · 30 April 2020 10:51

There are some possible use cases for DV_PLAIN_TEXT where we do explicitly want to prevent the use of coded_text e.g. a narrative comment.

thomas.beale · 30 April 2020 13:37

i.e. sub-classed to create an ‘alternate’ DV_CODED_TEXT constraint, alongside the other one that you already created? (Just trying to remember the details of the problem here).

If I understand the need correctly, neither of the solutions helps, because they are in RM-land. Instead we would need something extra in the AOM2 model that relates to RM type removal or similar, as explained here in the ADL2 spec.

In theory, if you just put something like the following it should work:

    value ∈ {
       DV_CODED_TEXT[id4] ∈ { ... }
       DV_TEXT[id5] occurrences ∈ {0}
    }

I don’t have an example of this in the spec, and it’s unusual (that’s because DV_TEXT is a concrete, i.e. instantiable type, which is not that common in good OO models); I also have never tried it in tools. Would be interesting to see what Archie does - maybe @pieterbos could make a little test? I’ll try in ADL WB when I get the chance.

BTW, we did have DV_PLAIN_TEXT a long time ago, and every so often I think of putting it back in, if we want to represent a ‘text’ type that can never be instantiated by a coded text object at runtime. But I can’t see how it helps the problem described here.

pieterbos · 30 April 2020 14:30

Yes you can do this in ADL 2. I just tried with Archie and our editor:

parent archetype:

value ∈ {
    DV_TEXT[id5]
}

Now if you create the following child archetype:

value ∈ {
    DV_CODED_TEXT[id5.1] ∈ {.....}
}

DV_TEXT[id5] will be gone and replaced by DV_CODED_TEXT[id5.1] in the flat form. This is because of the ADL 2 spec, section 9.2.5, object node constraints, case REFINE an existing node. Basically anything with effective_occurrences() returning {1} in the parent will be replaced in case of specialisation, and that’s the case for a single valued attribute. So occurrences matches {0} is not needed here; it is done automatically.

If you want to keep both the DV_CODED_TEXT and the DV_TEXT option, all of the following will work in Archie in ADL 2:

1. Add a DV_TEXT[id5.2] to the child archetype, after or before the DV_CODED_TEXT.
2. Add a new DV_TEXT[id0.1] node to the child archetype, after or before the DV_CODED_TEXT.
3. Specialize the ELEMENT instead of the value, so you get two elements (perhaps not desirable).

I would prefer the first option, although the semantics of this case are perhaps not entirely described in the spec. If I remember correctly, Archie handles these kinds of specialisation of nodes exactly the same as the ADL workbench. If you could check with this example @thomas.beale?

I would say this does not need fixing in ADL 2, but might need a bit of clarification specifically for these cases.

If anyone wants to play around with these things in ADL 2, send me a message on Slack and I’ll give you access to our ADL 2 editor. It allows you to edit by GUI and by text editor, it validates archetypes, displays differential, flat and operational template forms, plus JSON data examples of the resulting RM data. Bit of an expert tool, but I would say this is the right audience for that. And Archie usually follows the spec meticulously with these things.

pieterbos · 30 April 2020 14:33

Oh and if your parent is:

value matches {
    DV_TEXT[id4]
    DV_CODED_TEXT[id5]
}

Then in the child archetype or template you can do:

value matches {
    DV_TEXT[id4] occurrences matches {0}
    DV_CODED_TEXT[id5.1] ∈ { ... }
}

This gets rid of the DV_TEXT while specializing the coded text.
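The flattening behaviour described in these two posts can be approximated with a small Python sketch. This is a deliberate simplification and not the algorithm used by Archie or the ADL Workbench: it handles the alternatives of one single-valued attribute across one specialisation level, and `CObject`, `refines` and `flatten` are hypothetical names. The `excluded` flag stands in for `occurrences matches {0}`.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CObject:
    rm_type: str
    node_id: str
    excluded: bool = False  # stands in for 'occurrences matches {0}'


def refines(child_id: str, parent_id: str) -> bool:
    """True if child_id is parent_id itself or a specialisation such as id5 -> id5.1."""
    return child_id == parent_id or child_id.startswith(parent_id + ".")


def flatten(parent: List[CObject], child: List[CObject]) -> List[CObject]:
    """Flatten the alternatives of one single-valued attribute."""
    flat: List[CObject] = []
    for p in parent:
        redefinitions = [c for c in child if refines(c.node_id, p.node_id)]
        # An untouched parent node is inherited; a redefined one is replaced.
        flat.extend(redefinitions if redefinitions else [p])
    # Child nodes that refine nothing in the parent (e.g. id0.1) are new alternatives.
    flat.extend(
        c for c in child if not any(refines(c.node_id, p.node_id) for p in parent)
    )
    # Excluded nodes do not survive into the flat form.
    return [n for n in flat if not n.excluded]


# First post: DV_TEXT[id5] is replaced by DV_CODED_TEXT[id5.1].
replaced = flatten(
    [CObject("DV_TEXT", "id5")],
    [CObject("DV_CODED_TEXT", "id5.1")],
)

# Second post: DV_TEXT[id4] is excluded while DV_CODED_TEXT[id5] is specialised.
narrowed = flatten(
    [CObject("DV_TEXT", "id4"), CObject("DV_CODED_TEXT", "id5")],
    [CObject("DV_TEXT", "id4", excluded=True), CObject("DV_CODED_TEXT", "id5.1")],
)

print([n.node_id for n in replaced])  # ['id5.1']
print([n.node_id for n in narrowed])  # ['id5.1']
```

Under the same assumptions, adding a `DV_TEXT` with id `id5.2` or `id0.1` to the child keeps a free-text alternative next to the coded one.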

ian.mcnicoll · 30 April 2020 16:48

That’s not the issue though, I can already remove the DV_TEXT constraint if I wish (not sure what happens under the hood).

I want to keep the original DV_CODED_TEXT constraint AND the original DV_TEXT but do not want it to be allowed to be sub-classed to a DV_CODED_TEXT constraint.

I need to keep the option open of having the original DV_CODED_TEXT constraint OR ‘plain text’.

thomas.beale · 30 April 2020 17:02

In that case, apologies, I have misunderstood your original post…

So if you want to be able to end up with a ‘coded-only’ constraint and a ‘text-only’ constraint, but not one that can be turned into a coded-text instance, then we would need an RM class of which that is an instance, i.e. DV_PLAIN_TEXT. I’m still not sure I get why the remaining DV_TEXT is a problem in the Archetype Designer though…

After all, even if it still says DV_TEXT in the archetype, at runtime you can always instantiate a DV_CODED_TEXT, since it’s a technical sub-type. Is there a reason to block that?

It seems to me what you are really after is something that says ‘this here DV_CODED_TEXT constraint is the only one allowed, no alternatives (of the same RM type)’?

ian.mcnicoll · 30 April 2020 17:10

Yes … exactly.

Because I have already defined a ‘required’ codeset but also need to keep a plain text option open but do not want that DV_TEXT to be redefinable downstream.

Ultimately I do really want to say

This DV_CODED_TEXT constraint OR any DV_PLAIN_TEXT (semantically speaking)

pieterbos · 30 April 2020 18:00

But it is always possible to add a node in a child archetype, so you can always add a new DV_CODED_TEXT and remove the DV_TEXT, even with a new ‘final’ text data value type. To do what you want you would probably need to add a way to block redefining the constraint and block adding new object constraints to the attribute.

Also I am not sure why you do not want the DV_TEXT to be redefinable downstream. Why wouldn’t you, and what goes wrong if someone does make it a DV_CODED_TEXT?

pablo · 1 May 2020 02:02

In programming languages that is solved with the “final” keyword, which makes a type non-extendable.

I like solution 1.

In openEHR, the “final” keyword could be applied in archetypes or templates, in the C_OBJECT type. So a constraint for a DV_TEXT could be “final” without having a new type. Also I think by adding that keyword, we extend the constraint model with a very common object-oriented constraint we currently don’t support.

Seref · 1 May 2020 08:04

Please note that the final keyword (and its synonyms) applies to classes in most OO languages. Using it here would mean the semantics would have to hold during runtime, most likely for a field of a class, and would result in breaking the Liskov substitution principle. In other words, implementation of this for validation and runtime representation in OO languages looks problematic to me. Happy to be corrected if I’m missing something here.

ian.mcnicoll · 1 May 2020 08:12

Hi Pieter,

What goes wrong is that I want to lockdown the system to

a) A choice of coded texts from a required external valueset, i.e. only SNOMED CT codes which are either allergy codes or medication codes

OR

b) a piece of free text.

I do not want to allow someone to be able to enter any other coded_text.

If I leave DV_TEXT in the final constraint, alongside the DV_CODED_TEXT external valueset, which I currently need to do to satisfy (b), then that can be sub-classed at runtime and allow the system to add any code they like.

I’m not talking about adding new nodes in a child archetype, this is very largely about templating.

pieterbos · 1 May 2020 08:43

In our implementation it’s not so hard to build - just change the validation to do exact type checking when final, instead of exact or subclass in these cases, and change the archetype/template validations to not allow subclasses here. If you use archetypes to generate OO models/classes, that could be more difficult.
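As a rough illustration of the validation change described above (and not Archie’s actual code), the difference between the two checks fits in a few lines of Python. `type_conforms` and its `final` parameter are hypothetical names.

```python
class DvText:
    pass


class DvCodedText(DvText):
    pass


def type_conforms(data: object, rm_type: type, final: bool = False) -> bool:
    """Check a data value against the RM type named in a constraint."""
    if final:
        # Exact type checking: subclasses are rejected.
        return type(data) is rm_type
    # Default polymorphic checking: the type itself or any subclass is accepted.
    return isinstance(data, rm_type)


print(type_conforms(DvCodedText(), DvText))              # True
print(type_conforms(DvCodedText(), DvText, final=True))  # False
print(type_conforms(DvText(), DvText, final=True))       # True
```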

But something like this is not enough:

value matches {
    final DV_TEXT[id4]
    DV_CODED_TEXT[id5] matches {....}
}

Since you can still add DV_CODED_TEXT[id0.1] in a specialized archetype. You actually need to do something like:

value no_new_c_objects_in_specialisations matches {
    final  DV_TEXT[id4]
    DV_CODED_TEXT[id5] matches {....}
}

Since you probably do want to be able to specialize the DV_CODED_TEXT to limit the options available in a template.
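A tool-side check for those two pseudo-keywords might look roughly like the following Python sketch. Neither `final` nor `no_new_c_objects_in_specialisations` exists in ADL 2 or AOM2; they are only the ideas floated in this thread. `CObject`, `specialisation_errors` and the `closed` parameter are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CObject:
    rm_type: str
    node_id: str
    final: bool = False  # hypothetical flag for the 'final' idea


def specialisation_errors(
    parent: List[CObject], child: List[CObject], closed: bool
) -> List[str]:
    """List violations in a child's alternatives for one attribute.

    'closed' stands in for no_new_c_objects_in_specialisations.
    """
    errors: List[str] = []
    for c in child:
        # Find the parent node this child node redefines, if any (id5 -> id5.1).
        base = next(
            (
                p
                for p in parent
                if c.node_id == p.node_id or c.node_id.startswith(p.node_id + ".")
            ),
            None,
        )
        if base is None:
            if closed:
                errors.append(f"{c.node_id}: new alternative not allowed")
        elif base.final and c.rm_type != base.rm_type:
            errors.append(
                f"{c.node_id}: final {base.rm_type} cannot become {c.rm_type}"
            )
    return errors


parent = [CObject("DV_TEXT", "id4", final=True), CObject("DV_CODED_TEXT", "id5")]

narrow_codes = [CObject("DV_CODED_TEXT", "id5.1")]   # still allowed
subvert_text = [CObject("DV_CODED_TEXT", "id4.1")]   # blocked by 'final'
add_new_code = [CObject("DV_CODED_TEXT", "id0.1")]   # blocked only when closed

print(specialisation_errors(parent, narrow_codes, closed=True))
print(specialisation_errors(parent, subvert_text, closed=True))
print(specialisation_errors(parent, add_new_code, closed=True))
```

With `closed=False` the third case produces no errors, which is the gap a `final` flag on its own would leave open.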

I really wonder if this does not cause more trouble than it solves. Modelling and templating already requires people to know what they are doing. Isn’t this just part of modeling guidelines or advice, under the heading ‘please don’t’?

Seref · 1 May 2020 08:48

That’s my point. The programming artefact, i.e. some facade to RM instances or the RM implementation itself is allowing assignments of subtype values to a field of type X, but then it fails during runtime due to validation. We could always throw more tooling at it, but the essence of my objection is it is all because we’d be going against the nature of the underlying programming language. Why fight that uphill battle and deal with many complexities when adding a new type can solve the problem with less conflict?

It is a design choice at the end, we may have different views of pros/cons, but your response helps me clarify mine, so thanks for it.

pieterbos · 1 May 2020 09:03

I don’t think a new type will solve this issue. You can then still do:

value matches {
    DV_CODED_TEXT[id4]
    DV_PLAIN_TEXT[id5]
}

And specialise it with:

value matches {
    DV_PLAIN_TEXT[id5] occurrences matches {0}
    DV_CODED_TEXT[id4] occurrences matches {0}
    DV_CODED_TEXT[id0.2]
}

In ADL 2 you can do this even in a template - no problem.

thomas.beale · 1 May 2020 09:05

Right, technically that’s true. But I agree with @pieterbos - you can program badly in every language under the sun, or do it well. Part of writing good models surely has to be good practice?

Anyway, at the start you mentioned a technical problem with this structure in Archetype Designer that Fabio mentioned in Braunschweig. If we knew what is breaking in the tool, it might make it clearer what is ‘wrong’ with this structure?

thomas.beale · 1 May 2020 09:11

Right - going against basic OO polymorphic binding is not likely to be a good idea. (The reason ‘frozen’ works in some OOPLs is that it relates to code, not data items - I don’t think doing this will help us here) So I would agree that having a DV_PLAIN_TEXT type is a more obvious option to achieve a ‘pure text’ DV type. But as @pieterbos pointed out, you can go on specialising forever…

I think we need to look at changes that:

don’t have unexpected side effects in normal models;
are not over-complex for tooling;
are intuitive for modellers.

ian.mcnicoll · 1 May 2020 09:17

Sure, but it will not conform to or validate correctly against the original template, which is what matters for a shared definition across a community.

I want to be able to clearly state the rules around coding/ free text without resorting to implementation guidance and for that to be testable, as far as possible.

We need to be able to hide the complexity of our space behind as simple a layer as we possibly can, without requiring ‘insider knowledge’, particularly as the scope of a CDR consumer goes beyond a single app developer, and into increasingly ‘naive devs’.

ian.mcnicoll · 1 May 2020 09:35

don’t have unexpected side effects in normal models;
are not over-complex for tooling;
are intuitive for modellers.

As this would be a new class it should not have any backwards compat issues. Any existing DV_TEXT could be (and should be) left alone.

I think this is much more intuitive for modellers. What we do now is really quite difficult to explain, especially as we already subvert the DV_TEXT choice to be sub-classed to DV_CODED_TEXT to get around the extensible valueset issue. We are solving that one of course.

Use DV_TEXT if you cannot be sure if you will want or need codes.
Use DV_CODED_TEXT if you will only ever want coded items (but can be optionally extended)
Use DV_PLAIN_TEXT if you want to enforce free text.

Use DV_CODED_TEXT/DV_PLAIN_TEXT if you want to enforce coded valuesets (esp. ‘required’ valuesets) but allow a free-text-only get-out.

That is way easier to understand and explain than the current situation - I will ask on the modelling channels for thoughts.

In terms of TD and AD tooling, I think it would be pretty easy - DV_PLAIN_TEXT would just be another alternate datatype to be offered. The UI could be made more user-friendly but a minor tweak to what is there at the moment would work ok.