Why is value not optional for DvCodedText

Anything important is in the specs. If you go to Section 5.1.1.2 of the Data Types spec, and read a bit, you’ll see the reasoning referred to in this discussion, along with much else we considered at the time. We were very clinically driven from early on!


I believe that at the OO modelling level, the idea behind making value mandatory was to ensure the data contained the full context of the record. Basically, when a doctor records a code, they don’t see the code, they see the text, since they are clinicians, not coders. The clinical context for them is that text, which internally could be coded. The text gives the clinical semantics, while the code_string + terminology_id gives the ability to process that data.

IMO this differs from the goal of FHIR, which is data exchange for different use cases. Some of those might be purely about processing data, in which case the full context might not be needed, though for clinical contexts, i.e. showing the data to a clinician on a screen, the text should be required. If, in such a case, the text is different from the original one, this could cause misinterpretation of the information and could be a patient safety issue.

These are the considerations we need to take into account when we deal with clinical info; it is not only about moving data from A to B.


Imho:
The technical Reference Model (including DataTypes) must be as unrestricted as possible.

The implementable Template is just the opposite: as much as possible should be restricted, because templates are implemented at a point in time and in a certain context.

Archetypes are intermediate. More is restricted than at the technical level, because they are clinical models usable at any point in time and in a restricted set of contexts.


Gerard, I agree in general. However, there is also a need for coherence of models. If we create a type Quantity with a value field (DV_QUANTITY.magnitude in openEHR), that field needs to be mandatory - a Quantity with no value is meaningless garbage. So it’s not 100% optionality…


I just think DvCodedText shouldn’t derive from Text; I like inheritance, but here I think the downsides outweigh the benefits.
The whole thing should be its own class.
The name value is misleading, since I would expect the code to be entered there.
This results in a somewhat weird structure; I would expect the value, meaning and designator/terminology as part of one nesting, as it is in e.g. DICOM and FHIR.
Furthermore, I interpret the term mandatory a little differently, since I don’t think a free-text description for a coding is something that should be mandatory; feel free to add it, but for operating …
Anyhow, there are also good reasons to make it mandatory, as you stated above.

It could have gone the other way. When we modelled this in the past, the clinical people were pretty adamant that they wanted any text item to be seamlessly replaceable by a coded item. So we did it that way. More recent discussions (e.g. this one, with 72 replies :wink:) came to the conclusion that if we had our time again, we would model it like this:

  • DV_TEXT (abstract)
    • DV_PLAIN_TEXT
    • DV_CODED_TEXT

So if you wanted a value in an archetype with a constraint meaning ‘text item, coded or plain’ (aka coded with exceptions, or CWE, in HL7 v2 and v3), you would just constrain the value to DV_TEXT. Otherwise you constrain it to only coded or only plain text.
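A rough sketch of what that restructuring could look like (Kotlin, purely illustrative; the class and field names echo the RM types, but this hierarchy is hypothetical, not the published model):

```kotlin
// Hypothetical restructuring sketch - not the current openEHR RM.
data class CodePhrase(val terminologyId: String, val codeString: String)

sealed class DvText {                        // abstract: "some text item"
    abstract val value: String               // human-readable text, present in both subtypes
}

data class DvPlainText(
    override val value: String               // plain, uncoded text
) : DvText()

data class DvCodedText(
    override val value: String,              // the text the clinician actually saw
    val definingCode: CodePhrase             // terminology_id + code_string
) : DvText()

// An archetype constraint of DV_TEXT then admits either subtype ("coded with
// exceptions"); constraining to DV_CODED_TEXT alone forces a code.
fun render(t: DvText): String = when (t) {
    is DvPlainText -> t.value
    is DvCodedText -> "${t.value} [${t.definingCode.terminologyId}::${t.definingCode.codeString}]"
}
```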

We might still do this…


@thomas.beale
I agree that, in order to be sensible, a Quantity type and its value field are there to be filled, at some point in time. In the RM it specifies a field for which, in the Archetype, Template or at runtime, a value can be entered and documented. It can be constrained and made optional or required in the Archetype or Template.

Sounds good to me. I don’t think this has super high priority, obviously, but the change would be welcome (at least from my side).

If we are going to make significant changes, there is a case for using something more like the FHIR CodeableConcept, which flattens the hierarchy and also makes the handling of defining_code vs. mappings somewhat easier to manage.


That will break a lot more software. The DV_PLAIN_TEXT addition wouldn’t break much. Flattening the hierarchy too much causes problems in archetype tools, and indeed, when I implemented FHIR as a reference model to see what these types would look like, I had to add a workaround to ADL to deal with it. It’s actually more of a problem having everything flattened in EHR type systems, because the types are not proper ‘models’ of concepts, but rather munged data objects containing elements that properly belong in different classes in a clean model.

There’s nothing wrong with having those flattened forms as well for certain purposes, and we probably should contemplate it for openEHR, as a more efficient retrieve format - but even then we have to think what the requester is going to do with the data - if they want to compute with canonical structures, then they are better requesting canonical structures. If they want to do reporting or some other BI kind of activity, the flatter forms probably help. But for commit, it would create a lot of problems (as people who try to create true EHRs from FHIR are going to discover…)

I’d be interested in other tech views on this. My experience is that traversing these hierarchies is quite painful for developers and that, at least in this case, we should consider a CodeableConcept facade.

A facade is a different thing - it’s a mini API, but it allows you to maintain the formal coherence of the interior structure without having to manually traverse the structures. I would be quite happy to put more of them into the RM.
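As a sketch of what such a facade might look like (Kotlin, hypothetical; CodeableConceptView and its members are invented for illustration and reuse the DvCodedText/CodePhrase types from the sketch earlier in the thread, not an existing openEHR API):

```kotlin
// Hypothetical facade: a flat, CodeableConcept-like view over a coded text and
// its term mappings, leaving the underlying RM structure untouched.
data class TermMapping(val match: Char, val target: CodePhrase)

class CodeableConceptView(
    private val coded: DvCodedText,
    private val mappings: List<TermMapping> = emptyList()
) {
    /** The text the user actually saw. */
    val text: String get() = coded.value

    /** Primary (defining) code plus any mapped codes, flattened into one list. */
    val codings: List<CodePhrase>
        get() = listOf(coded.definingCode) + mappings.map { it.target }

    /** Convenience lookup, e.g. "give me the SNOMED CT code if there is one". */
    fun codeIn(terminologyId: String): String? =
        codings.firstOrNull { it.terminologyId == terminologyId }?.codeString
}
```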

There really are concrete differences between flattened ‘classes’ of the FHIR variety and ‘designed’ classes. In the latter there are all kinds of cardinalities and invariants that define the semantics, and it is not uncommon to see things like ‘field_a /= Void xor field_b /= Void’, not to mention ‘for_all a: items | a.field_a = value_x’ and so on. These are nearly impossible to get right (or even to write) in flattened classes, and the result is that developers easily create data that is semantically invalid, but the ‘model’ doesn’t detect it, so we get garbage in real systems.
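A contrived Kotlin illustration of the difference (not taken from any spec): in a ‘designed’ class the invariant travels with the type, while a flattened record will happily hold invalid combinations:

```kotlin
// Designed class: the xor invariant lives on the type, so invalid instances
// cannot be constructed. (Contrived example.)
class NarrativeOrCode(val narrative: String?, val code: String?) {
    init {
        // field_a /= Void xor field_b /= Void
        require((narrative != null) xor (code != null)) {
            "exactly one of narrative or code must be set"
        }
    }
}

// Flattened record: every combination type-checks, including both-set and
// both-null; validity becomes every caller's problem.
data class FlatNarrativeOrCode(val narrative: String?, val code: String?)
```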

I’ll say in advance that I’m not keen to have a software engineering discussion in the context of this particular issue. I just want to make the following point though: if by ‘flattening’ we’re talking about squashing a number of IS_A assertions about an ontology of things into a single type, then I’ll take your word regarding the difficulty of expressing invariants in that case.

However, there is potentially another meaning of the word flattening here, which is creating a number of types without implementation inheritance, as in DV_CODED_TEXT including a DV_TEXT field but not inheriting from it. The lost DV_CODED_TEXT IS_A DV_TEXT semantics then goes to a constraint language that operates at the data composition level, which for us is Archetypes. Well, not directly; instead it’d be expressed at the point of use, by allowing a data field to have a closed set of types consisting of DV_CODED_TEXT and DV_TEXT, which is something we already have, I think. This is another way of expressing the same model of coded and non-coded text data, and you could still write your invariants in the non-inheritance scenario, on a class-by-class basis. So I finally arrive at the point I wanted to make: there’s more than one potential meaning of flattening here.
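A minimal sketch of that second sense of flattening (composition rather than inheritance; hypothetical, not the current RM):

```kotlin
// Composition instead of inheritance: the coded type HAS a text rather than IS a text.
data class PlainText(val value: String)

data class CodedText(
    val text: PlainText,            // contained, not inherited
    val terminologyId: String,
    val codeString: String
)

// The lost "CodedText IS_A PlainText" relationship is then re-expressed at the
// point of use, e.g. an archetype constraint allowing either PlainText or
// CodedText at a given data field.
```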

I’m not saying this to suggest what we have now is wrong. I have nothing but huge respect for the design you put in place for the RM, and you did it when the software industry was just getting to use Object Oriented design as a mainstream means of production.

I do think, though, that adoption of openEHR as an Object Oriented model of computing health data is an opportunity that time has left behind. I don’t think that’ll happen. Ever. The data projection of that OO model, however, has a bright future, around REST, AQL and other things that focus on data. So, in my opinion, going forward, arguments that make sense in the OO-centric view of openEHR risk hurting adoption.

I’d therefore suggest, as a long-term strategy, gradually extending the set of RM types with new ones that use less inheritance, and shifting the means of validation to the second level of openEHR. I won’t go as far as proposing an openEHR 2.0 based on this approach, because I don’t think I’ll live that long.

If you want an extreme example, see the HL7v3 Act and ActRelationship classes - 22 and 18 attributes respectively. They could never adequately define the rules for what values were allowed in each field, depending on what values were in other fields, or when optionality applied. The interdependencies make it impossible. It’s only possible to do it when the class represents a coherent type all of whose attributes relate to that type and not either semantic parents, or composition sub-parts.

Industry activity suggests otherwise :wink: But I know what you mean. However, openEHR RM isn’t trying to be anything other than a coherent canonical model of data in patient records. The main thing I regret was not using more generic types, because I didn’t realise that Java etc would eventually get them kind of right (even if they do evaporate in byte code). Then we could have had things like Measured<Quantity>, which would have been very nice. Inheritance is already pretty minimal, and about the same as in other typical industry models.
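Roughly what that could have looked like (Kotlin sketch; Measured and its fields are hypothetical, not part of the published RM):

```kotlin
// Hypothetical generic wrapper: measurement context (accuracy etc.) factored
// out of the individual quantity classes instead of being mixed into each one.
data class Quantity(val magnitude: Double, val units: String)
data class Count(val magnitude: Long)

data class Measured<T>(
    val value: T,
    val accuracy: Double? = null,          // e.g. +/- 2.0
    val accuracyIsPercent: Boolean = false
)

val systolic = Measured(Quantity(120.0, "mm[Hg]"), accuracy = 2.0)
val pulse = Measured(Count(72L))
```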

Agreed on various data projections, but to have projections, including models optimised for data retrieval, we have to have the original model to generate them from. Starting with projections is very risky.

I’m not aware of anywhere where the amount of inheritance is a problem. Whether we got every inheritance relationship right (the DV_TEXT / DV_CODED_TEXT example) is a question. But the inheritance doesn’t seem to cause any problems generally - maybe the DV_ORDERED part of the model could be easier. But we could achieve that with things like Measured<>, which would move all the accuracy and related fields to the right place. Indeed, the current Quantity classes are in fact a good example of the flattening/mixin problem, created by us (mainly me).

I think the problem is less the amount and more the way some classes are inherited, at least in this case. Of course, I don’t know the rest of the RM as well as you guys do.

Having the current inheritance helps to choose between DV_TEXT and DV_CODED_TEXT on each field declared as DV_TEXT, which is a useful feature.

Which could be modelled as Thomas mentioned below, but it wasn’t:

The possibility of having a text and coding it, at runtime or by constraints, should still be there.


Thanks, I already read that; I just don’t 100% agree with that design choice :wink:
That’s exactly why it has the “issues” described above with naming and cardinality, at least in my opinion. But there are also valid points in doing it like that, as you and Thomas already stated.


Realising that I am a bit late to the party, I won’t go into a philosophical discussion, but I want to quickly voice my opinion here:

  • Inheritance in openEHR feels about right to me, generally speaking. (If ever redone, some streamlining may be useful).
  • The generic types approach Thomas mentions would have been nice, but too cumbersome for tooling at the time
  • DV_CODED_TEXT is a troublemaker, as can be seen from this renewed discussion
  • I used to agree with the idea of having the DV_TEXT text value mandatory and believe it was the right design choice at the time. However, nowadays you can consider e.g. SNOMED CT and LOINC almost as a commodity, and this changes the balance somewhat.
  • I can see why Thomas disagrees, but - from my (in this case) maybe rather pragmatic perspective - the FHIR guys got CodeableConcept pretty right. Leaving the difficult question of legacy data, potential migration, multiple ways of expressing the same thing, etc. aside, I think a CodeableConcept-ish datatype would be useful.
  • Not sure how exactly Ian sees the CodeableConcept facade working in practice, but maybe this is something to be explored further.

This is what CodeableConcept looks like:
[CodeableConcept structure diagram]

And its sub-part type Coding…
[Coding structure diagram]
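For readers without the diagrams to hand, the shape of these two FHIR datatypes, roughly transcribed as Kotlin (simplified; the FHIR specification is normative):

```kotlin
// Rough transcription of the FHIR datatypes; fields simplified, all optional
// as in FHIR itself.
data class Coding(
    val system: String? = null,         // identity of the terminology system
    val version: String? = null,
    val code: String? = null,
    val display: String? = null,        // representation defined by the system
    val userSelected: Boolean? = null   // was this coding chosen directly by the user?
)

data class CodeableConcept(
    val coding: List<Coding> = emptyList(),   // codes defined by terminologies
    val text: String? = null                  // plain-text representation of the concept
)
```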

In this model, we can see a field ‘userSelected’. This demonstrates the factoring of FHIR types as being designed for clinical data retrieval - contextual attributes such as this are mixed in with definitional attributes. In a canonical model, you would never do this. The correct name for the Coding type (based on its attributes) would in fact be something like CodingUsedInData. It is easy to imagine applications that want to use this type but could never populate this field, for the simple reason that there are numerous places in models and applications where you want to represent codes that are not captured via user entry.

Indeed, FHIR-based app developers are doing this all the time, and have to skip past this field, and make sure no software reading the Coding object does anything wrong…

This is a model in the Foundation Types that does this job (it has been around for a few years now):
[TerminologyTerm class diagram]
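Roughly (a Kotlin paraphrase from memory; the BASE Foundation Types specification is authoritative and the attribute names may differ slightly):

```kotlin
// Approximate shape of the Foundation Types terminology classes (paraphrased;
// check the BASE specification for the normative attribute list).
data class TerminologyCode(
    val terminologyId: String,
    val terminologyVersion: String? = null,
    val codeString: String
)

data class TerminologyTerm(
    val concept: TerminologyCode,
    val text: String
)
```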

We could quite easily use these types outside of the EHR part of the RM, e.g. in Task Planning, GDL and so on.

I remain very comfortable with having value as mandatory - we do come across occasional instances of ‘technical codes’ where there is no formal text value, though the code itself may be human-readable. In that case I simply advise replicating the code as text.
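For example (illustrative values, reusing the DvCodedText/CodePhrase sketch from earlier in the thread):

```kotlin
// A 'technical code' whose code string is itself readable: replicate it as the
// text so that value stays populated.
val language = DvCodedText(
    value = "en",
    definingCode = CodePhrase(terminologyId = "ISO_639-1", codeString = "en")
)
```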

re CodeableConcept - I agree that ‘display’ is badly termed - it should be (and I suspect is often taken to be) what we would call the defining_code. Nevertheless, having all of the assigned codes in a simple list, with the ‘source of truth’ flagged, is way easier to understand than explicitly separating out the mappings as we do.