Uncertain, unknown and no information

Clinical data is hard to get right. One problem is how to handle the semantics of unknown and/or uncertain. I assume you have all met the challenge of modelling a value list such as:

  • yes
  • no
  • uncertain
  • unknown
  • not applicable

Norway is currently implementing a screening program for colorectal cancer. There will be #fhir messages with the data and extensive use of #snomed-ct. DIPS is developing #openehr models (templates and archetypes) to be used within our systems. We need to map these value sets into openEHR somehow.

The straightforward solution is a DV_BOOLEAN element with some use of null flavours.

OpenEHR has unknown as a null flavour. Some argue that unknown and uncertain are different statements.
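
To make the gap concrete, here is a minimal sketch of that mapping. It assumes a simplified, hypothetical `BooleanElement` class standing in for an ELEMENT that holds a DV_BOOLEAN; the helper `map_answer` is also hypothetical and not part of any openEHR library. The null flavour codes are the ones listed in the specification text.

```python
from dataclasses import dataclass
from typing import Optional

# Null flavour codes from the openEHR null flavours vocabulary.
NULL_FLAVOURS = {
    "unknown": 253,
    "no information": 271,
    "masked": 272,
    "not applicable": 273,
}


@dataclass
class BooleanElement:
    """Hypothetical, simplified stand-in for an ELEMENT holding a DV_BOOLEAN."""
    value: Optional[bool] = None
    null_flavour: Optional[int] = None


def map_answer(answer: str) -> BooleanElement:
    """Map one entry of the value list to a boolean or a null flavour."""
    if answer == "yes":
        return BooleanElement(value=True)
    if answer == "no":
        return BooleanElement(value=False)
    if answer in NULL_FLAVOURS:
        return BooleanElement(null_flavour=NULL_FLAVOURS[answer])
    # 'uncertain' ends up here: there is no null flavour code for it.
    raise ValueError(f"no null flavour defined for {answer!r}")
```

With this sketch, `yes`, `no`, `unknown` and `not applicable` all map cleanly, while `uncertain` raises an error, which is exactly the case the question is about.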

Thomas Beale wrote this wiki page back in 2007: https://openehr.atlassian.net/wiki/spaces/spec/pages/4915211/Null+Flavours+and+Boolean+data+in+openEHR

The question is:

What do you think about adding more terms to the null flavours, perhaps with uncertain as a first candidate?


I believe null flavour codes can be extended by the implementation and then harmonized with the openEHR terminology, because we first need examples from the use cases before we start defining new codes.

Current codes are few …

… and semantics are not well defined in the specs AFAIK:


Data values are connected to spatial structures via the value attribute of the ELEMENT class of the representation cluster. This class also carries the attribute null_flavour, whose value indicates how to read the contents of the value attribute. Values from the openEHR null flavours vocabulary, including 253|unknown|, 271|no information|, 272|masked|, and 273|not applicable| are used to populate it. Only a small number of generic codes are defined, in order to avoid complex processing for most data instances, for which this simple classification of null is sufficient.

In some circumstances however, additional detail is required in addition to the null flavour code. Examples include reporting and where specific reasons for lack of data have medico-legal ramifications, e.g. ‘patient was unconscious’, ‘patient refused to tell me’, ‘no reason provided’. For these situations, the optional null_reason field may be used to record a specific reason.
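
As a rough illustration of how value, null_flavour and null_reason relate, here is a minimal sketch. The `Element` class below is a hypothetical simplification, not the RM class itself, and the two validation rules are my own assumptions about sensible usage; the element name is invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical, simplified stand-in for the RM ELEMENT class."""
    name: str
    value: Optional[str] = None
    null_flavour: Optional[int] = None
    null_reason: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumed rule: exactly one of value / null_flavour is set.
        if (self.value is None) == (self.null_flavour is None):
            raise ValueError("set either value or null_flavour, not both or neither")
        # Assumed rule: a null_reason only makes sense when the value is null.
        if self.null_reason is not None and self.value is not None:
            raise ValueError("null_reason requires a null value")

    def is_null(self) -> bool:
        return self.value is None


# 271 is |no information|; the free-text reason carries the specific detail.
refused = Element(
    name="Smoking status",  # invented element name
    null_flavour=271,
    null_reason="patient refused to tell me",
)
```

The generic code stays coarse for processing, while the reason text keeps the medico-legal detail.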


@bna What does ‘uncertain’ mean? In what context - data presence/absence or clinical certainty? Serious question. We probably all use it differently unless it’s attached to a value set.

In the archetype modelling we only use Boolean data types when we are sure there is only a Yes/No or True/false clinical answer. Then the null flavours are related to whether data is available or not.

In clinical modelling reality we hardly ever use Booleans, because there are hardly any clinical situations where there is a clear black and white answer. Too often there are subtle shades of grey, which I suspect might be one of the reasons driving your request for ‘uncertain’. In those ‘grey’ situations in archetypes, we tend to use the pattern of CODED_TEXT with a codable value set ‘Present/Absent/Indeterminate’. In this context ‘indeterminate’ means ‘we looked but we couldn’t tell’ - and that could be because of inexperience or simply that it could not be discerned even by the most experienced clinician, but it is still a very important and valid clinical finding that needs to be recorded in its own right, not as a RM ‘flavour’.
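
A minimal sketch of that pattern, for anyone who wants to see it spelled out. The at-codes and the `CodedText` class are hypothetical placeholders; a real archetype defines its own codes. The point is only that ‘Indeterminate’ is recorded as a value and no null flavour is involved.

```python
from dataclasses import dataclass

# Hypothetical local codes; a real archetype defines its own at-codes.
FINDING_VALUE_SET = {
    "at0001": "Present",
    "at0002": "Absent",
    "at0003": "Indeterminate",
}


@dataclass(frozen=True)
class CodedText:
    """Hypothetical, simplified stand-in for a coded text value."""
    code: str
    value: str


def coded_finding(code: str) -> CodedText:
    """Build a coded value, rejecting codes outside the value set."""
    if code not in FINDING_VALUE_SET:
        raise ValueError(f"{code!r} is not in the value set")
    return CodedText(code=code, value=FINDING_VALUE_SET[code])


# 'We looked but we couldn't tell' is an ordinary recorded finding.
finding = coded_finding("at0003")
```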

Just throwing this into the mix and stirring the soup :confounded:



Having been through this journey repeatedly, and going through it again with another ‘registry-type’ dataset, I’m definitely with Heather on this.

I avoid DV_BOOLEAN almost everywhere and always prefer to use a DV_CODED_TEXT, even if there are only Present/Absent type answers. This is partly for similar reasons to Heather: booleans have a nasty habit of not staying that way (‘grey Booleans’). Also, at least in the UK, the present/absent values are often carried as SNOMED terms. It is also very hard to come up with standard terms for the unknown/indeterminate/equivocal variants. Right now null_flavours are tricky to use (if mixed with non-null options), and I think they should mostly be reserved for essentially technical gotchas, not for normal clinical recording.

Having said all that, I wonder if there is an opportunity to see if we can work out some standard patterns and usage, and make use of SNOMED terms as either the primary terms or mappings. That would require a conversation with SCT International about licensing but I sense there is a change in their approach with the Global Patient Set, so perhaps there is a conversation to be had there.


First: Sorry for leading the discussion into DV_BOOLEAN :slight_smile: We prefer DV_CODED_TEXT or DV_ORDINAL for most such use-cases. The boolean track is a rabbit hole we want to avoid…

The issue I want to discuss is whether we should do some work to find patterns to be used for these use-cases. I think it is of interest to have a shared way to express such statements. It would benefit secondary use of the data.

I don’t have a clear idea of how to do this. Some options:

  1. DV_BOOLEAN with an extended NULL_FLAVOUR
  2. DV_CODED_TEXT with value set defined in archetype
  3. DV_TEXT specialized with external terminologies to reuse the value set between elements
  4. Archetype some ELEMENTs to be reused
  5. Other options?

The most common pattern we follow is DV_TEXT in the archetype, which we specialize into DV_CODED_TEXT with terminologies. It works reasonably well.
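
Roughly, the idea looks like the sketch below. Everything in it is an assumption for illustration: the value set, the placeholder codes and the two validator functions are hypothetical and do not come from any real terminology or openEHR tooling.

```python
from typing import Optional

# Hypothetical shared value set; the codes are placeholders, not real
# terminology codes. The same set can be bound to several elements.
SHARED_ANSWERS = {
    "EX-YES": "Yes",
    "EX-NO": "No",
    "EX-UNCERTAIN": "Uncertain",
    "EX-UNKNOWN": "Unknown",
}


def validate_parent(text: str, code: Optional[str] = None) -> bool:
    """Parent archetype: plain text, so any non-empty text is accepted."""
    return bool(text.strip())


def validate_specialised(text: str, code: Optional[str] = None) -> bool:
    """Specialisation: only coded values from the shared value set pass."""
    return code is not None and SHARED_ANSWERS.get(code) == text
```

Anything valid under the specialisation is still valid under the parent, so data recorded against the constrained element can still be queried through the generic archetype.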

The reason I am revisiting this topic now is the work on the colonoscopy report, where the national program is developing FHIR resources. Here they use a combination of HL7 FHIR null flavour and/or absent terminologies combined with local value sets for the specific resource or quality registry. This is why I had the idea of making such value sets part of the reference model.

I am not sure if it is a good idea. I can see pros and cons.


Pros:

  • Make such statements semantically defined
  • Suited for international models

Cons:

  • Many of the statements we find are not semantically well defined, so it makes no sense to force them into a small box of terminology codes

Other thoughts?