DV_CODED_TEXT with open/extensible set of codes (value set)

Just to illustrate with a timely example from the Covid-19 modelling.

We created a specialization of the health risk evaluation archetype to model the Covid-19 risk factors. Using an archetype for this worked well, both for the translations into different languages and for documenting the information model.

The table below lists the first risk factors.

| Term | Description |
| --- | --- |
| Contact with confirmed Covid-19 case | Contact with confirmed Covid-19 case within 14 days before symptom onset. |
| Contact with suspected case/ pneumonia case | Contact with suspected case/ pneumonia case within 14 days before symptom onset. |
| Contact with birds in China | Contact with birds in China in 10 days before symptom onset. |
| Contact with confirmed human case of Avian flu in China | Contact with confirmed human case of Avian flu in China in 10 days before symptom onset. |
| Contact with severe, unexplained respiratory disease | Contact with severe, unexplained respiratory disease in 10 days before symptom onset. |
| Potential contact exposure based on location | Potential contact exposure based on location. |

Later in the epidemic new risk factors were added. Some might even be national or regional. We wanted a way to add more terms to the archetype.

And with the rapid development of the disease we also wanted to be able to add free text in some situations.


Yeah, I was referring to creating new DV_TEXT/DV_CODED_TEXT nodes, which are actually objects. Domain types kind of hide it a little, but it’s just an RM-specific way of defining alternatives.

I would disagree that you would end up with garbage: you will have an actual archetype/template that governs your data. It wouldn’t strictly be a specialization, but neither are the current ones under the ‘open unless specifically prohibited’ approach. I would argue that this is less critical in this case because the semantics on the element node can be very clear (sometimes you could even infer the expression that defines the subset).

Arguably, if the same ‘open unless specifically prohibited’ approach is also allowed here, then we could also simulate the strength attribute in ADL (i.e. we can probably find equivalent prohibitions, or fix things to certain values and prohibit others, to mimic that behavior), so strength probably becomes useful syntactic sugar.

So I would assume this corresponds to the need to mark the term constraint as either preferred or maybe extensible as per our discussion yesterday?


Well actually, strictly speaking, so-called ‘added’ nodes like ELEMENT and so on are proper specialisations, since they still fit the data space defined by the RM (e.g. CLUSTER.items: List<ITEM> can always have another ELEMENT of whatever form) - unless ‘closed’ or prohibited in some way. Currently the spec assumes ‘open unless closed’, but the opposite assumption could be made, or it could even be settable on a template or at OPT generation. Which way it is doesn’t matter that much, as long as it is clear what the rules actually are in some particular data processing situation.

Creating a new DV_CODED_TEXT with extra codes in it, for the same data item that you already have a DV_CODED_TEXT[id4] for in the parent archetype might work in a kind of messy way, since that new DV_CODED_TEXT[id0.2] would not be understood as a specialisation of the existing one in the parent, but rather an alternative (in the ADL understanding of that word) to the one in the parent. So the runtime processor would allow you to put in any of the terms in either the [id4] or [id0.2] nodes which will create the effect you want in the data, but now there is no longer a single constraint containing the logical intended value set - it’s spread across two constraints. It would be like creating 2 separate Integer range alternatives like {|0..3|} and {|4..10|} to obtain the effect of {|0..10|}. So if you were visualising the archetype, or doing any tool processing, it won’t be obvious that some of the alternative constraints should logically be joined together.
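To make the "spread across two constraints" point concrete, here is a minimal Python sketch. It is not openEHR tooling: the class `CodedTextConstraint`, the helper functions and the at-codes are all hypothetical, chosen only to mirror the [id4] / [id0.2] example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedTextConstraint:
    """Hypothetical stand-in for one DV_CODED_TEXT alternative in an archetype."""
    node_id: str
    codes: frozenset


def accepts(alternatives, code):
    """A runtime check passes if ANY alternative allows the code."""
    return any(code in alt.codes for alt in alternatives)


def logical_value_set(alternatives):
    """The intended value set is only recoverable by joining the alternatives."""
    joined = set()
    for alt in alternatives:
        joined |= alt.codes
    return frozenset(joined)


# Parent node [id4] and the extra alternative [id0.2] added in the child
# (the at-codes are made up for illustration).
parent = CodedTextConstraint("id4", frozenset({"at1", "at2", "at3"}))
added = CodedTextConstraint("id0.2", frozenset({"at0.1", "at0.2"}))
alternatives = [parent, added]

print(accepts(alternatives, "at2"))      # code from the parent node
print(accepts(alternatives, "at0.1"))    # code from the added node
print(sorted(logical_value_set(alternatives)))
```

The data-level effect is the one you want (codes from either node validate), but no single node holds the full value set, so a tool has to know that these particular alternatives should be joined.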

Anyway, do you agree that the required + strength change achieves what we want (assume that the required is really just a function that checks to see if strength = required) or do we need something more? I’m not sure if I understood the last bit, you might need to expand on that a bit!

The last bit was related to your second paragraph: explicit id4 specializations could be created, and id4 could even be prohibited. This would also give some actual meaning to the occurrences in DV (i.e. if it is defined as alternatives of 1..1 objects it wouldn’t be possible to remove “terms” from the subset, but if they are defined as 0..1 then you probably can).

In any case I think the proposed change would be enough :slight_smile:

Sounds good to me.
It has the additional advantage (over my initial proposal with code_list_open in 1.4) that 1.4 and 2.0 are more aligned. (My intention to suggest code_list_open was to keep the 1.4 change minimal and aligned with C_STRING’s list_open)

OET’s “limit-to-list=false” in the described case would then probably best match “required = false / strength=extensible” (unless someone thinks strength=“preferred” is better?)

I think the default if not specified for existing archetypes should be required=true / strength=required.
That’s my understanding of what it means at the moment and I wouldn’t change that. No migration needed, no change in meaning.
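A rough Python sketch of how this could behave, assuming the three strength values mentioned in this thread and assuming that anything other than `required` lets codes outside the list through. The names `Strength`, `TerminologyConstraint`, `allows` and `from_limit_to_list` are hypothetical and not taken from AOM/ADL; the final representation in the specs may well differ.

```python
from dataclasses import dataclass
from enum import Enum


class Strength(Enum):
    REQUIRED = "required"
    EXTENSIBLE = "extensible"
    PREFERRED = "preferred"


@dataclass(frozen=True)
class TerminologyConstraint:
    codes: frozenset
    # Default when nothing is stated, so existing archetypes keep their meaning.
    strength: Strength = Strength.REQUIRED

    @property
    def required(self) -> bool:
        """'required' is just a function of strength, as discussed above."""
        return self.strength is Strength.REQUIRED

    def allows(self, code: str) -> bool:
        """Listed codes always pass; other codes pass only if not required."""
        return code in self.codes or not self.required


def from_limit_to_list(codes, limit_to_list: bool) -> TerminologyConstraint:
    """Assumed mapping from OET's limit-to-list flag (false -> extensible)."""
    strength = Strength.REQUIRED if limit_to_list else Strength.EXTENSIBLE
    return TerminologyConstraint(frozenset(codes), strength)


legacy = TerminologyConstraint(frozenset({"at1", "at2"}))  # no strength stated
open_list = from_limit_to_list({"at1", "at2"}, limit_to_list=False)

print(legacy.required, legacy.allows("at9"))
print(open_list.required, open_list.allows("at9"))
```

With the default in place, an archetype that says nothing about strength behaves exactly as it does today, which is the "no migration needed" property.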


Well that’s certainly what the formal meaning is. But is that compatible with @ian.mcnicoll’s requirements that internal code-sets can be ‘expanded’?


Yes, I’m OK with that - it reflects the current position in ADL1.4. It can be the default for existing archetypes where not expressly stated, but we can always change existing archetypes (I assume going from required = false to true would be a breaking change) or decide what is needed for new archetypes.


I agree on this.

As an alternative, could all elements that combine a DV_CODED_TEXT with a value set and a DV_TEXT have required=false as default?

I can see why you want this, but technically it would require interpreting a DV_CODED_TEXT with a certain definition differently depending on whether there was a DV_TEXT next to it. But I don’t think we need to do it quite like that - consider that the presence of the DV_TEXT enables you to use the DV_TEXT anyway, regardless of what the DV_CODED_TEXT says. So if the DV_CODED_TEXT says ‘required’ but there is a DV_TEXT as well, then I think the correct interpretation is:

  • if you code this, it must be from these codes
  • or you can use text
  • (but you can’t use some other code)
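The three rules above can be written down as a small check. This is only a sketch in Python under simplifying assumptions: a data value is modelled as a hypothetical `Value` with an optional code, and the archetype node is reduced to a set of allowed codes plus a flag saying whether a DV_TEXT alternative exists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Value:
    """Hypothetical data value: code is None for plain text (DV_TEXT-like)."""
    text: str
    code: Optional[str] = None


def is_valid(value: Value, allowed_codes, text_alternative_present: bool) -> bool:
    """'required' coded constraint, optionally next to a DV_TEXT alternative."""
    if value.code is not None:
        # If you code it, the code must come from the listed codes.
        return value.code in allowed_codes
    # Otherwise it is plain text, which is only valid if DV_TEXT is allowed.
    return text_alternative_present


allowed = {"at1", "at2"}

print(is_valid(Value("Contact with confirmed case", "at1"), allowed, True))
print(is_valid(Value("Visited a live animal market"), allowed, True))
print(is_valid(Value("Some other coded term", "x99"), allowed, True))
```

Note that the DV_CODED_TEXT constraint is interpreted the same way whether or not the DV_TEXT is there; the DV_TEXT only adds the text route.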

Does this achieve what you want?

But… the entire point of making this change was to get away from the DV_CODED_TEXT + DV_TEXT pattern?

Then I am confused: I read your previous question to be referring to archetypes that do contain a constraint of the form DV_CODED_TEXT + DV_TEXT (i.e. as currently exists)?

Technically (i.e. from a software processing point of view), if there is no DV_TEXT at all, then it means that you can’t create a DV_TEXT instance for that data item, and satisfy the archetype. I don’t see any way out of that…

But maybe I misunderstood the requirement.

The DV_CODED_TEXT + DV_TEXT pattern is generally (only?) used when we identify that an element needs to be coded and there is a set of terms that are generally used, but we can’t be sure that the codes we include in the DV_CODED_TEXT are the only ones for every use case. The idea is that the DV_TEXT would be made into a DV_CODED_TEXT, and new terms/codes added, in the template. We’ve been told that’s not a good pattern to use, for Technical Reasons.

Aha I see. In those cases, you would modify the archetype to:

  • remove the DV_TEXT
  • change the DV_CODED_TEXT terminology constraint to have strength = extensible or strength = preferred

But any cases where the DV_TEXT is meant to stand for an actual text, it needs to be retained.

Note that it will take me a while to make the change to AOM1.4 and AOM2, and then come up with a way to represent it in ADL1.4, and ADL2, but I’ll work on it ASAP.


Agree! :+1::smile:

Hi Silje, yes, but it is also commonly needed where there is a need to fall back to free text because a coded entry is not available - one example being the causative agent in an allergy where the drug that caused the allergy is not yet on the drug database (trial drug, foreign product, drug database not yet updated).

You are right that we can currently sub-class a DV_TEXT to DV_CODED_TEXT to make the list extensible but this does get messy and we should find different approaches if we can.

So agree with Thomas’s last statement.

Where we only require DV_CODED_TEXT then ‘strength’ will clarify that purpose. More importantly the same rules should apply in templates. We have a current block in AD where if a DV_TEXT is sub-classed to DV_CODED_TEXT, the DV_TEXT ‘choice’ is removed, so essentially we lose the ‘trick’ that allows us to extend the coded_text list or add free text (where appropriate).

True, but those kinds of elements (almost?) never have internal codes in the archetype. Causative agent of an adverse reaction is a good example of this. This is a pure DV_TEXT element where we would like to code it, but it’s perfectly open for free text.

The cases I’m talking about above are the ones where we always want the elements to be coded, but we’d like templaters to be able to choose between the codes in the archetypes, or their own codes.

Sure, but we also need to understand how these rules flow through to specialisations and templates, and to cases where external codes/valuesets have been defined in the archetype.

I’m not sure I understand the points here, but looking at the initial issue https://openehr.atlassian.net/browse/SPECPR-302 what I see is a modeling problem, not an RM issue. If the codes initially set in the archetype are not for general use, then the node should have an ACNNNN constraint to use an external terminology/subset. If the name of the subset is generic, implementers could use any subset they want just by configuring the right API for accessing the codes. At design time, modelers could set the ACNNNN code for the coded text and provide a sample subset, instead of a local list of codes inside the archetype itself. I think the CKM allows creating subsets/termsets, so this could be a good use case for that feature.

Of course I might not understand all the requirements in detail, so I might be missing something here.