Why is value not optional for DvCodedText

SevKohler · 18 October 2021 12:48

In FHIR the display for a coding is actually optional, which I think is not a bad idea. I’m currently mapping those and asked myself why the value for DvCodedText is mandatory.
In the end the value only adds optional information to the DvCodedText; what is important and mandatory are the system and the code.
Surely the value is especially important to a DvCodedText when defining e.g. internal codings. Still, important is not the same as crucial.
If you think about e.g. SNOMED and LOINC codes, the value is totally redundant in the end.
Surely it makes the DV_CODED_TEXT more readable, but are there any other reasons for it?
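
To illustrate the mapping problem, here is a minimal Python sketch. It assumes a FHIR Coding arrives as a plain dict with the usual `system`, `code` and `display` keys, and it uses a simplified, flat `DvCodedText` rather than the real RM class. The function name `coding_to_dv_coded_text` and the decision to return `None` when there is no display are assumptions for illustration only, not an agreed mapping rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DvCodedText:
    """Simplified stand-in for DV_CODED_TEXT (flat, for illustration only)."""
    value: str           # mandatory on the openEHR side
    terminology_id: str
    code_string: str


def coding_to_dv_coded_text(coding: dict) -> Optional[DvCodedText]:
    """Hypothetical mapper: gives up when the optional FHIR display is absent."""
    display = coding.get("display")
    if not display:
        # Nothing to put into the mandatory value field.
        return None
    # system and code are assumed to be present in the incoming Coding.
    return DvCodedText(
        value=display,
        terminology_id=coding["system"],
        code_string=coding["code"],
    )


# Made-up example data, not real terminology content.
with_display = {"system": "http://example.org/codes", "code": "abc", "display": "Example finding"}
without_display = {"system": "http://example.org/codes", "code": "abc"}
```

With these inputs the first Coding maps cleanly, while the second has no obvious target unless the display is looked up somewhere else.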

yampeku · 18 October 2021 14:53

Actually it is a little more complicated than that: value is inherited from DV_TEXT, where it’s mandatory. And having DV_TEXTs with no value would be… interesting to say the least.
In any case, if I remember correctly, we detected the use case and proposed the addition of another data type more or less equivalent to Coding, but I would assume it is not yet approved or fully supported in tooling.
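
A rough Python sketch of the inheritance point, not the official RM: the attribute names `value`, `defining_code`, `terminology_id` and `code_string` follow the data types spec, but `terminology_id` is simplified to a plain string, and the check in `__post_init__` is just one assumed way of enforcing "mandatory". The `local` / `at0001` code is a made-up example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodePhrase:
    """Simplified CODE_PHRASE; terminology_id is a plain string here (assumption)."""
    terminology_id: str
    code_string: str


@dataclass(frozen=True)
class DvText:
    """Simplified DV_TEXT: value is mandatory and must not be empty."""
    value: str

    def __post_init__(self) -> None:
        if not self.value:
            raise ValueError("DV_TEXT.value is mandatory")


@dataclass(frozen=True)
class DvCodedText(DvText):
    """Simplified DV_CODED_TEXT: the mandatory value comes from DvText."""
    defining_code: CodePhrase


# A coded text is also a text, so it cannot exist without a value.
coded = DvCodedText(
    value="Ross River fever",
    defining_code=CodePhrase(terminology_id="local", code_string="at0001"),
)
```

Making value optional for the coded subtype only would mean weakening a rule that the parent type guarantees, which is why it is not a simple change.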

SevKohler · 18 October 2021 14:55

Yeah, I saw that it inherits from DV_TEXT.
Thanks for the answer

thomas.beale · 18 October 2021 15:10

We had long discussions on exactly this attribute right at the start of openEHR, in about 2003 or so. The principle determined by clinical people was that data that is re-used, including sent as some kind of extract across a system boundary, has to be ‘self-standing’ and clinically comprehensible. It was understood then that the terminologies from which the codes come might not be installed / available / licensed etc at the receiver location, but at least the value field tells a human the clinical meaning.

A good example was ‘Ross River Fever’, which is a reasonably serious viral disease found in parts of Australia (I think PNG etc as well); a US clinic receiving this data containing a code for a terminology they don’t have can still manage as long as the value field is filled - even though they probably have to physically look it up in Google…

In more modern times, we might expect lightweight NLP could do something with the value field in the absence of a usable/resolvable code.

I believe the working rule today is: if the value and the code (and optionally preferred term) are available at source, put them in the data - make it as useful as possible for when it is taken out of its operational context.

SevKohler · 18 October 2021 15:30

Yeah, in the end one has to distinguish between what is mandatory on a technical level and what is mandatory in the eyes of e.g. a clinician.
Thanks for the exact explanation, Thomas.

joostholslag · 18 October 2021 16:49

Are there any notes or other recordings of the discussions back then? I’m really curious to learn more about the history of the specifications.

yampeku · 18 October 2021 17:08

Old mailing lists are here
Discourse is definitely better in every aspect.

thomas.beale · 18 October 2021 17:34

Anything important is in the specs. If you go to Section 5.1.1.2 of the Data Types spec, and read a bit, you’ll see the reasoning referred to in this discussion, along with much else we considered at the time. We were very clinically driven from early on!

pablo · 18 October 2021 17:44

I believe at the OO modeling level the idea to make value mandatory was to ensure the data contained the full context of the record. Basically when a doctor records a code, they don’t see the code, they see the text, since they are clinicians not coders. The clinical context for them is that text, which internally could be coded. The text gives the clinical semantics while the code_string + terminology_id gives the ability to process that data.

IMO this differs from the goal of FHIR, which is data exchange for different use cases. Some might only be for processing data, in which case the full context might not be needed, but for clinical contexts, i.e. showing the data to a clinician on a screen, the text should be required. If in such a case the text is different from the original one, this could cause misinterpretation of the information and could be a patient safety issue.

These are the considerations we need to take into account when we deal with clinical info; it is not only about moving data from A to B.

GerardFreriks · 19 October 2021 11:38

Imho
The technical Reference Model (including DataTypes) must be as unrestricted as possible.

The implementable Template is just the opposite. As much as possible should be restricted, because Templates are implemented at a point in time and in a certain context.

Archetypes are intermediate. More is restricted than at the technical level because they are clinical models usable at any point in time and in a restricted set of contexts.

thomas.beale · 19 October 2021 12:29

Gerard, agree in general. However, there is also a need for coherence of models. If we create a type Quantity with a value field (DV_QUANTITY.magnitude in openEHR), that field needs to be mandatory - a Quantity with no value is meaningless garbage. So it’s not 100% optionality…

SevKohler · 19 October 2021 12:44

I just think DvCodedText shouldn’t inherit from DV_TEXT. I like inheritance, but here I think it has more downsides.
The whole thing should be its own class.
The name value is misleading, since I would expect the code to be entered there.
This results in a somewhat weird structure; I would expect the value, meaning and designator/terminology to be part of one nesting, as in e.g. DICOM and FHIR.
Furthermore, I interpret the term mandatory a little differently, since I don’t think a free-text description for a coding should be mandatory. Yeah, feel free to add it, but for operating …
Anyhow, there are also good reasons to make it mandatory, as you stated above.

thomas.beale · 19 October 2021 12:54

It could have gone the other way. When we modelled this in the past, the clinical people were pretty adamant that they wanted any text item to be seamlessly replaceable by a coded item. So we did it that way. More recent discussions (e.g. this one with 72 replies) came to the conclusion that if we had our time again, we would model it like this:

DV_TEXT (abstract)
- DV_PLAIN_TEXT
- DV_CODED_TEXT

So if you wanted to have a value in an archetype with a constraint meaning ‘text item, coded, or plain’ (aka coded with exceptions or CWE in HL7v3, v2), then you would just constrain the value to DV_TEXT. Otherwise you constrain it to only coded or only plain text.
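
As a rough sketch of how that hierarchy and the three possible constraints could look in code (Python, purely illustrative): `DvPlainText` follows the proposed DV_PLAIN_TEXT name, while `CodePhrase` is simplified and the `accepts` helper is a hypothetical stand-in for an archetype type constraint, not anything from ADL or existing tooling.

```python
from abc import ABC
from dataclasses import dataclass


@dataclass(frozen=True)
class CodePhrase:
    terminology_id: str
    code_string: str


@dataclass(frozen=True)
class DvText(ABC):
    """Abstract parent: anything that carries a human-readable value."""
    value: str

    def __post_init__(self) -> None:
        if type(self) is DvText:
            raise TypeError("DvText is abstract; use DvPlainText or DvCodedText")
        if not self.value:
            raise ValueError("value is mandatory")


@dataclass(frozen=True)
class DvPlainText(DvText):
    """Plain, uncoded text."""


@dataclass(frozen=True)
class DvCodedText(DvText):
    """Coded text: a value plus its defining code."""
    defining_code: CodePhrase


def accepts(constraint: type, item: DvText) -> bool:
    """Hypothetical constraint check: is the item allowed where `constraint` is required?"""
    return isinstance(item, constraint)


plain = DvPlainText("patient reports joint pain")
coded = DvCodedText("Ross River fever", CodePhrase("local", "at0001"))  # made-up code
```

Constraining to `DvText` admits both kinds, which is the ‘coded with exceptions’ case; constraining to `DvCodedText` or `DvPlainText` admits only one.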

We might still do this…

GerardFreriks · 19 October 2021 12:56

@thomas.beale
I agree that, in order to be sensible, a Quantity type’s value field is there to be filled at some point. The RM specifies a field where a value can be entered and documented in the Archetype, the Template or at runtime. It can be constrained and made optional or required in the Archetype or Template.

SevKohler · 19 October 2021 13:00

Sounds good to me. I don’t think this has super high priority, obviously, but the change would be welcome (at least from my side).

ian.mcnicoll · 19 October 2021 13:12

If we are going to make significant changes, there is a case for using something more like the FHIR CodeableConcept, which flattens the hierarchy and also makes handling of defining_code vs. mappings somewhat easier to manage.

thomas.beale · 19 October 2021 13:21

That will break a lot more software. The DV_PLAIN_TEXT addition wouldn’t break much. Flattening the hierarchy too much causes problems in archetype tools, and indeed, when I implemented FHIR as a reference model to see what these types would look like, I had to add a workaround to ADL to deal with it. It’s actually more of a problem having everything flattened in EHR type systems, because the types are not proper ‘models’ of concepts, but rather munged data objects containing elements that properly belong in different classes in a clean model.

There’s nothing wrong with having those flattened forms as well for certain purposes, and we probably should contemplate it for openEHR, as a more efficient retrieve format - but even then we have to think what the requester is going to do with the data - if they want to compute with canonical structures, then they are better requesting canonical structures. If they want to do reporting or some other BI kind of activity, the flatter forms probably help. But for commit, it would create a lot of problems (as people who try to create true EHRs from FHIR are going to discover…)

ian.mcnicoll · 19 October 2021 13:26

I’d be interested in other tech views on this. My experience is that traversing these hierarchies is quite painful for developers and that, at least in this case, we should consider a CodeableConcept facade.

thomas.beale · 19 October 2021 13:34

A facade is a different thing - it’s a mini API, but it allows you to maintain the formal coherence of the interior structure without having to manually traverse the structures. I would be quite happy to put more of them into the RM.
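
A minimal sketch of what such a facade could look like, in Python and under stated assumptions: the nested classes are simplified versions of DV_CODED_TEXT and CODE_PHRASE, and `CodedTextFacade` with its `text`, `system` and `code` accessors is a hypothetical name and shape, not something that exists in the RM today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodePhrase:
    terminology_id: str
    code_string: str


@dataclass(frozen=True)
class DvCodedText:
    value: str
    defining_code: CodePhrase


class CodedTextFacade:
    """Hypothetical read-only facade: flat accessors over the nested structure."""

    def __init__(self, inner: DvCodedText) -> None:
        self._inner = inner  # the canonical structure stays untouched

    @property
    def text(self) -> str:
        return self._inner.value

    @property
    def system(self) -> str:
        return self._inner.defining_code.terminology_id

    @property
    def code(self) -> str:
        return self._inner.defining_code.code_string


canonical = DvCodedText("Ross River fever", CodePhrase("local", "at0001"))  # made-up code
flat_view = CodedTextFacade(canonical)
```

The developer reads `flat_view.code` instead of walking `defining_code.code_string`, while the data that is stored and validated keeps its original shape.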

There really are concrete differences between flattened ‘classes’ of the FHIR variety and ‘designed’ classes. In the latter there are all kinds of cardinalities and invariants that define the semantics, and it is not uncommon to see things like ‘field_a /= Void xor field_b /= Void’, not to mention ‘for_all a: items | a.field_a = value_x’ and so on. These are nearly impossible to get right (or even to write) in flattened classes, and the result is that developers easily create data that is semantically invalid, but the ‘model’ doesn’t detect it, so we get garbage in real systems.
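
To show what those two invariant forms amount to, here is a small Python sketch. The names `field_a`, `field_b`, `items` and `value_x` are the placeholders from the paragraph above; the class names `Item` and `Designed` and the choice to check at construction time are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    field_a: str


@dataclass
class Designed:
    """A 'designed' class whose invariants are checked when it is created."""
    field_a: Optional[str] = None
    field_b: Optional[str] = None
    items: List[Item] = field(default_factory=list)
    value_x: str = "x"

    def __post_init__(self) -> None:
        # Invariant: field_a /= Void xor field_b /= Void
        if (self.field_a is not None) == (self.field_b is not None):
            raise ValueError("exactly one of field_a and field_b must be set")
        # Invariant: for_all a: items | a.field_a = value_x
        if not all(a.field_a == self.value_x for a in self.items):
            raise ValueError("every item must have field_a equal to value_x")


valid = Designed(field_a="something", items=[Item("x"), Item("x")])
```

Without such checks, an object with both fields set, or neither, would be accepted silently, which is the ‘garbage in real systems’ case.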

Seref · 19 October 2021 15:09

I’ll say in advance that I’m not keen to have a software engineering discussion in the context of this particular issue. I just want to make the following point, though: if by ‘flattening’ we’re talking about squashing a number of IS_A assertions about an ontology of things into a single type, then I’ll take your word regarding the difficulty of expressing invariants in that case.

However, there is potentially another meaning of the word flattening here, which is creating a number of types without implementation inheritance, as in DV_CODED_TEXT including a DV_TEXT field but not inheriting from it. The lost DV_CODED_TEXT IS_A DV_TEXT semantics then goes to a constraint language that operates at the data composition level, which is Archetypes for us. Well, not directly, instead it’d be expressed at the point of use by allowing a data field to have a closed set of types consisting of DV_CODED_TEXT and DV_TEXT, which is something we already have I think. This is another way of expressing the same model of coded and not-coded text data, and you could still write your invariants in the non-inheritance scenario, on a class by class basis. So I finally arrived at the point I wanted to make: there’s more than one potential meaning of flattening here.
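
A rough Python sketch of this second meaning, under stated assumptions: `DvCodedText` holds a `DvText` instead of inheriting from it, the invariant lives in the class itself, and the `TextOrCoded` alias and `display` function are hypothetical stand-ins for a closed set of types allowed at the point of use.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class DvText:
    value: str

    def __post_init__(self) -> None:
        if not self.value:
            raise ValueError("value is mandatory")


@dataclass(frozen=True)
class DvCodedText:
    """Composition: contains a DvText, but is not a DvText."""
    text: DvText
    terminology_id: str
    code_string: str

    def __post_init__(self) -> None:
        # Class-level invariant, written without any inheritance.
        if not self.terminology_id or not self.code_string:
            raise ValueError("terminology_id and code_string are mandatory")


# The closed set of types a data field may hold at the point of use.
TextOrCoded = Union[DvText, DvCodedText]


def display(item: TextOrCoded) -> str:
    """Return the human-readable text for either member of the closed set."""
    if isinstance(item, DvCodedText):
        return item.text.value
    if isinstance(item, DvText):
        return item.value
    raise TypeError("expected DvText or DvCodedText")


plain = DvText("joint pain")
coded = DvCodedText(DvText("Ross River fever"), "local", "at0001")  # made-up code
```

The ‘coded or plain’ choice is no longer expressed by subtyping but by listing both types where the field is constrained.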

I’m not saying this to suggest what we have now is wrong. I have nothing but huge respect for the design you put in place for the RM, and you did it when the software industry was just getting to use Object Oriented design as a mainstream means of production.

I do think though, adoption of openEHR as an Object Oriented model of computing health data is an opportunity that time left behind. I don’t think that’ll happen. Ever. The data projection of that OO model however, has a bright future, around REST, AQL and other things to focus on data. So in my opinion, going forward, arguments that make sense in the OO centric view of openEHR have the risk of hurting adoption.

I’d therefore suggest gradually extending the set of RM types with new ones that use less inheritance, and shifting the means of validation to the second level of openEHR, as a long-term strategy. I won’t go as far as building an openEHR 2.0 based on this approach because I don’t think I’ll live that long.