Regarding the role of ITEM_STRUCTURE

Hello everyone

I am sending this email to clarify the role of ITEM_STRUCTURE in
relation to other structures (such as HISTORY and EVENT) both from the
point of view of EHR semantics as well as the computational view.

My "problem" in one line is that I can't understand whether
ITEM_STRUCTUREs are there to ensure that their content is interpreted
properly semantically or whether they really "do what they mean" (i.e.
they represent tables, lists, trees that have to be populated as such).

A related question is also this one:

Is it possible that a C_MULTIPLE_ATTRIBUTE constraint pointing to an
ITEM_LIST will have upper_unbounded=True?

If yes, then ITEM_LIST would have to be dynamically expanding (and this
complicates things a little bit but...that's life)...If no, then the
contents of ITEM_LIST can be considered static (yay!) and you only need
to know that this is an ITEM_LIST for interpretation purposes (which of
course is KEY when you come across an ITEM_TREE)
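To make the question concrete, here is a minimal Python sketch of what an upper-unbounded cardinality on a multiple-valued attribute would imply for the items under an ITEM_LIST. The names Cardinality and satisfies are hypothetical and heavily simplified; they are not the AOM classes, only an illustration of the dynamic versus static distinction being asked about.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cardinality:
    """Simplified stand-in for the interval carried by a multiple-attribute constraint."""
    lower: int = 0
    upper: Optional[int] = None  # ignored when upper_unbounded is True
    upper_unbounded: bool = True


def satisfies(items: Sequence[object], card: Cardinality) -> bool:
    """Return True if the number of items falls within the cardinality."""
    n = len(items)
    if n < card.lower:
        return False
    if card.upper_unbounded:
        return True
    return card.upper is not None and n <= card.upper


# "Dynamic" list: at least one item, no upper limit.
open_ended = Cardinality(lower=1, upper_unbounded=True)
# "Static" list: exactly three items.
fixed_size = Cardinality(lower=3, upper=3, upper_unbounded=False)
```

With the open-ended constraint any number of items from one upwards is acceptable at run time, so a GUI widget bound to it would need to grow; with the fixed-size constraint it would not.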

This will help me in two points:
a) Clarify the role of ITEM_STRUCTURE and use it properly in archetypes
and templates
b) Be able to assign widgets properly (and later read data off them)
when constructing a GUI.

A few more notes are available at the end of this message

All the best
Athanasios Anastasiou

What I understand but would like to verify is that ITEM_STRUCTUREs do
what they say they do, i.e. an ITEM_LIST represents a dynamic list
(constrained by some C_MULTIPLE_ATTRIBUTE) and an ITEM_SINGLE
represents just one entry (constrained by some C_SINGLE_ATTRIBUTE).

But I am a little bit confused for two reasons:

a) by what is meant by HISTORY<T>.

HISTORY already implies "A list of events" with the type of the list
being (point or interval)EVENT<T> which could imply "A [single item or
table] of ITEM_STRUCTURE" OR "A [list,tree] of ITEM_STRUCTURE"....and
this is where it gets confusing.

Does that mean that all of the following are valid? (with respect to
HISTORY<T>)

*) a dynamic list of events containing dynamic lists of item structure
(the history.events can expand and so can the item structures held by
events)

*N1) a dynamic list of events containing static lists of item structure
(events can expand, but each event is supposed to simply contain a list
of items that is actually fixed in size).

(dynamic and static expanded for the following lines as well)
*) A list of trees
*) a list of tables
*) a list of single values (this is the most intuitive thing....For
example, a time series represented as a history of point events of
single_value...)

b) by the explanation given in the specs
With reference to (*N1) above:
The example that is given in the spec for ITEM_LIST is that of "parts of
an address". This is what leads me to believe that ITEM_LIST is not
supposed to be a dynamic list but just something THAT IS TO BE
INTERPRETED AS A LIST but from a computational point of view is just a
list. I really do hope this makes sense. (I have gone through section 6
in "data_structures.pdf" and that again points to ITEM_STRUCTURE being
used for interpretation rather than "definition" :-/)

Hi Athanasios

I think time has shown that this is probably an area of over engineering in openEHR. All archetypes are now ITEM_TREE and could be clusters.

If we think of these as providing constraint on an underlying cluster - ITEM_LIST is a cluster of ELEMENTs and ITEM_TABLE sets up a set of clusters to provide the Information structure of an addressable table.

There is a place for ITEM_LIST and ITEM_TABLE but the other issue is these constraints might be brought to bear at any point in an information hierarchy.

I have proposed in version 2 of the RM that we make these specialisations of CLUSTER as a constraint statement. That would ensure backward compatibility.

Cheers, Sam

Hi Athanasios,

Just to back up what Sam has said, experience has shown that it is
best practice to model every top-level archetype structure as
ITEM_TREE to allow for maximum flexibility to develop the model in the
future without breaking backward compatibility, and as Sam has said
there appears to be good consensus that it would be better to replace
ITEM_STRUCTURE with CLUSTER in the next generation of the RM. There
will definitely still be aspects of the archetyped structures inside
that CLUSTER (via a child CLUSTER ) that are best represented as
simple lists or as Tables, but this is best done by applying some sort
of design constraint on the cluster, rather than using a separate
Table class.

e.g. for a possible Visual Acuity archetype

data
  Cluster "Laterality" (as Table pattern, with row names carried as
          run-time name constraints: Left eye, Right eye, Both eyes)
       Element "Low level acuity"
       Element "Snellen"
       Element "ETDRS"
       Element "Measurement distance"
  Element "Mode"
  Element "Correction"

This lets us express a tabular structure at a much more granular level
than with ITEM_STRUCTURE sub-classes, which in the example above would
have to apply to the whole data structure.
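As a rough illustration of the idea of a table as a design constraint on a cluster, the following Python sketch builds the Visual Acuity outline above and checks that the repeated "Laterality" clusters share the same columns. The Element and Cluster classes and the is_table_pattern function are hypothetical simplifications, not RM classes or an agreed constraint mechanism, and placing "Mode" and "Correction" as siblings of the cluster is an assumption read from the outline.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    name: str


@dataclass
class Cluster:
    name: str
    items: List[Union["Cluster", Element]] = field(default_factory=list)


def is_table_pattern(rows: List[Cluster]) -> bool:
    """Hypothetical check: every row is a cluster of elements with identical column names."""
    if not rows:
        return False
    columns = [item.name for item in rows[0].items]
    return all(
        all(isinstance(item, Element) for item in row.items)
        and [item.name for item in row.items] == columns
        for row in rows
    )


COLUMNS = ["Low level acuity", "Snellen", "ETDRS", "Measurement distance"]


def laterality_row(row_name: str) -> Cluster:
    # The run-time name ("Left eye", ...) plays the role of the row key.
    return Cluster(row_name, [Element(c) for c in COLUMNS])


rows = [laterality_row(n) for n in ("Left eye", "Right eye", "Both eyes")]
# "Mode" and "Correction" sit beside the table, outside the table pattern.
data = Cluster("data", rows + [Element("Mode"), Element("Correction")])
```

The table pattern applies only to the three row clusters, while the rest of the tree stays free-form, which is the granularity that a whole-structure ITEM_TABLE cannot express.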

In summary, you can safely assume that we will only be modelling with
ITEM_TREE for the foreseeable future.

Ian

See openEHR 2.x RM candidates A-3 and A4 here.

Hello everyone

Thank you very much for your responses Sam & Ian, they were helpful. In any case the changes you describe do not seem to be reducing the flexibility of the model. This definitely also helped me in clarifying the use of ITEM_TABLE.

There is another part to that question though that is still not clear and, if you don't mind, I'd like to ask a bit more specifically about:

a) Is an ITEM_STRUCTURE static?
(I think that it's static but I would like to verify that because, as described in the previous message, I can't decide just from the specification)
In other words, when you specify an ITEM_STRUCTURE in the archetype editor, you simply denote that you would like your set of defined ITEMs (elements, quantities, counts, etc) to be _interpreted_ as a LIST, TABLE or TREE, but that data structure itself will not necessarily have to expand (at run time) into a list, table or tree (Is that correct?).

b) HISTORY holds a list of EVENT<ITEM_STRUCTURE> (parametrising the data attribute here). This particular list COULD BE dynamic (Is that correct?). In the visual acuity example provided by Ian earlier: That would be a HISTORY of 1 POINT_EVENT<CLUSTER> to describe the visual acuity of the subject that was obtained during a regular check-up session at some point in their lifetime. But, if you wanted to measure some drug uptake (or other transient parameter) you would have a repeating POINT_EVENT that could expand indefinitely (because a substance uptake is different for every person). (Correct?)
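The two cases in (b) can be sketched in Python as follows. This is only an illustration of the reading described above, using hypothetical, simplified History and PointEvent classes rather than the RM definitions; the timestamps and values are made up.

```python
from dataclasses import dataclass, field
from typing import Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class PointEvent(Generic[T]):
    time: str  # ISO 8601 timestamp kept as a string for brevity
    data: T


@dataclass
class History(Generic[T]):
    origin: str
    events: List[PointEvent[T]] = field(default_factory=list)

    def add(self, time: str, data: T) -> None:
        self.events.append(PointEvent(time, data))


# One-off check-up: a history holding a single event.
acuity = History[dict](origin="2012-03-01T09:00:00")
acuity.add("2012-03-01T09:00:00", {"Snellen": "6/6"})

# Repeated measurement: same data shape per event, open-ended number of events.
uptake = History[float](origin="2012-03-01T09:00:00")
for minute, value in enumerate([0.0, 1.2, 2.9, 3.1]):
    uptake.add(f"2012-03-01T09:{minute:02d}:00", value)
```

In both cases the shape of the data inside each event is fixed in advance; only the number of events in the history varies.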

(Thomas, thank you very much for that link; it arrived as I was writing this message and I will have a close look at it)

All the best
Athanasios Anastasiou

Hi Athanasios,

a) The very first thing you have to do is to decide on the structure
and this is static. The archetyped definition sits within a sub-class
of item structure. So if I am creating a new Observation archetype and
want to add elements to 'data' I first have to decide which structure
to use. In practice we always now use ITEM_TREE.

b) EVENT actually holds 2 ITEM_STRUCTUREs, one for 'data' and one
for 'state', each of which is statically defined by the archetype.
Basically any time you see ITEM_STRUCTURE, you can assume that part of
the archetype is going to define the content within this structure. It
is a lot easier to look at EVALUATION as an example as the containing
classes are simpler. This has two ITEM_STRUCTUREs, 'data' and
'protocol'. If you look at the Adverse Reaction archetype you will
see that these are separately defined in the archetype.

Ian

So you have to select the ITEM_STRUCTURE class but you don't have to
select the EVENT class? (most CKM archetypes now have EVENT and not
INTERVAL_EVENT or POINT_EVENT)
I think it should be allowed or forbidden according to a single criterion.

In general, there is no harm in not choosing a subtype if you are only constraining properties of the supertype. In the case of ITEM_STRUCTURE this doesn’t make sense because it is an abstract type with no structure defined; anything you want to constrain will be in a particular subtype.

In the case of EVENT however, you can sensibly constrain just EVENT properties, and if you don’t force the subtype, you are saying - I don’t care if this event happens to be a point event or an interval event, which is entirely reasonable. Although, I must admit I suspect that at least some of those CKM archetypes probably did really intend only a POINT_EVENT, so in some cases, the type constraint should be made.
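The effect of constraining only the supertype can be sketched as a simple conformance check. The Python classes and the conforms function below are hypothetical stand-ins, not the RM or AOM definitions; the sketch only shows that a constraint naming EVENT accepts either subtype, while a constraint naming POINT_EVENT does not accept an interval event.

```python
class Event:
    """Stand-in for the EVENT supertype."""


class PointEvent(Event):
    """Stand-in for POINT_EVENT."""


class IntervalEvent(Event):
    """Stand-in for INTERVAL_EVENT."""


# Map from the type name used in a constraint to the stand-in class.
RM_TYPES = {
    "EVENT": Event,
    "POINT_EVENT": PointEvent,
    "INTERVAL_EVENT": IntervalEvent,
}


def conforms(instance: Event, rm_type_name: str) -> bool:
    """True if the instance is of the constrained type or any of its subtypes."""
    return isinstance(instance, RM_TYPES[rm_type_name])
```

Tightening the constraint from "EVENT" to "POINT_EVENT" later, for example in a template, only narrows what is accepted.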

- thomas

Hi Thomas / Diego,

As far as the published archetypes are concerned we will have thought
fairly carefully about if and when to constrain EVENT to Point or
Interval, and this is definitely something that should be applied at
template level in most cases but, as ever, we have to be careful not
to over-apply constraints when the downstream use cases are not wholly
clear.

Ian

Hello Ian and everyone

Thank you for your response. (a) and (b) are clear but they do not address what the question is really about.

I wouldn't like this to drag on too much so maybe it's better if I implement it the way I understand it and correct any errors afterwards.

All the best
Athanasios Anastasiou

Hi Thomas & Ian,

I see what you mean, and I agree that in its current form it makes no
sense to use ITEM_STRUCTURE without restricting it to a subtype. Maybe
there are other cases where this is still valid (restricting to the
ENTRY class in its current form makes no sense either, I would say, but
maybe CARE_ENTRY could be used in the same way). Maybe it is even useful
for those archetypes for which it is difficult to tell whether they are
an Observation or an Evaluation :-)

that could make sense in the context of a legacy system extract, where you really don’t know what is in there, but nevertheless, you have algorithms that can make an Observation from some types of data (labs, vitals etc) and Evaluations / GenericEntries from everything else …

-thomas

although, you can just leave COMPOSITION.content completely unconstrained, then any ENTRY can go there…

- thomas

Hi Diego

I think David Ingram has made a valuable contribution; these are empirical solutions to real problems in real systems. The reality of OBSERVATION is that it deals with points in time, intervals (max, min etc) and analogue readings. These need to be handled consistently or we end up with combinatorial explosion - lab glucose in LOINC is over 200 codes.

Sartre said that "belief is confusing things with their names". We need to look at the classes and the utility provided. When we have a small number of archetypes there is no doubt we can manage these things with slots etc. But this requires massive alignment very early in the piece.

Cheers, Sam

Sam,

According to me:

  • Observations have in reality points in time or ranges attached to them
  • So do Evaluations: being about processes in the patient system, they have in reality times attached to them. Inferences are made at a point in time, but relate to inferred processes that come and go, or are believed to be present, or not, during a period of time.
  • As do Instructions
  • As do Actions

Time is never a discriminating factor that sets Observations apart from the other Entry types.

Gerard Freriks

Let me try and clarify the time aspect of the ontology… the question is not that time doesn’t relate to all Entry types. The question is how. In the Clinical Investigator Ontology Sam and I constructed, there are 3 temporal upper level categories

  • history - i.e. information relating to reality the way it was in the past
  • opinion - thoughts formulated at the current point in time based on what has gone before, and/or what is known so far
  • instructions - concrete statements about what should happen in the future

Obviously these categories were not primarily named based on the temporal aspect, but nevertheless they are based on temporal considerations. They correspond relatively well to the epistemic categories described in Sowa’s upper level ontology as follows:

  • history = Sowa’s ‘history’ category, i.e. a proposition about an occurrent, including anything that can be observed about the state of the subject at a point in time
  • opinion = Sowa’s ‘description’ category, i.e. propositions about a continuant, normally the subject of care. A diagnosis of ‘Diabetes’ for example is saying that the patient John Smith (a continuant) organism includes the diabetes process as part of it
  • instructions = Sowa’s ‘script’ category, which is itself an occurrent representing time-based sequences of events, including conditional decision points

The first and last of these categories are considered by Sowa to be occurrents, i.e. time-related while the second is not. This seems pretty clear.

Now consider the diagnosis archetype (an instance of the ‘opinion’ aka ‘description’ type)… it contains the main ‘proposition’ - i.e. the identified index condition, diabetes or whatever - and a bunch of times / dates / durations / other descriptive details. So how is this not time-based? Well we need to consider that what this information is really doing is drawing a picture of temporal disease course, in terms of these dates and times and durations. These are a way of qualifying the main statement of Diabetes - it is recent, severe, intermittent (symptoms) etc.

So why does this temporal aspect matter, practically speaking? For the very simple reason that ‘history’, i.e. past time, is linear, i.e. representable as a series of events / states; instructions are future time, and therefore unavoidably have a branching, conditional nature e.g. ICU insulin management protocol containing items like: if blood sugar drops below 130 mg/dl, drop dosage. So the data structures for historical data - Observations and Actions - are unavoidably different from those for Instructions, which are more like a set of statements, conditions, etc (the structure can be quite complex). And Evaluations (we could have named this one better) has neither historical nor future-instructional nature, but instead the structure of a description or set of statements (thoughts, ideas) about something else - in other words, it could be any structure, so we just use a tree. If we subdivided (as indeed we did in the paper), we could posit a number of more specialised data structures.
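The contrast between linear historical data and branching instructions can be sketched in Python as below. This is only a toy illustration: the Rule class, the next_action function, the fallback "continue current dosage" rule and all readings are assumptions made up for the example, not part of any openEHR model or clinical protocol. Only the 130 mg/dl condition comes from the text above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Historical data: a linear, time-ordered series of (time, mg/dl) readings.
glucose_history: List[Tuple[str, float]] = [
    ("08:00", 145.0),
    ("09:00", 138.0),
    ("10:00", 126.0),
]


@dataclass
class Rule:
    """One conditional step of an instruction: if the condition holds, take the action."""
    condition: Callable[[float], bool]
    action: str


# Instruction: an ordered set of conditional statements about the future.
protocol: List[Rule] = [
    Rule(lambda mg_dl: mg_dl < 130.0, "drop dosage"),
    Rule(lambda mg_dl: True, "continue current dosage"),  # assumed fallback
]


def next_action(mg_dl: float) -> str:
    """Return the action of the first rule whose condition matches."""
    for rule in protocol:
        if rule.condition(mg_dl):
            return rule.action
    raise ValueError("no rule matched")
```

The history can be stored and traversed as a plain sequence, whereas the instruction needs conditions and branches, which is why the two call for different data structures.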

See here for Smith/Ceusters/Scheuermann on an ontology for disease course and diagnosis, in which they identify the same categories of information more or less - clinical picture / diagnosis / plan.

So to summarise, it came down to finding categories on which health information data structures - i.e. information models - could be based that would work reliably for most if not all of medicine. Now, having used these categories for some years, it turns out that they are remarkably stable. I know that the grey-zone debate on Observation/Evaluation will continue for ever, but if one steps back for a second, what you see is that for most types of clinical data, it is obvious which category to use. Most of the archetypes for these types are not really in question. Further, no one has identified (to my knowledge) a strong contender for any new kind of category (excepting for sub-categories of AdminEntry, which will probably appear one day).

I think the main thing we got wrong was naming ‘Evaluation’ too narrowly, when what we needed was a name that means ‘what the clinician thought’ (if we had used Sowa’s ‘description’ category I am sure we would now be having arguments about why someone was using a ‘description’ to record a ‘diagnosis’!). We would know we had gotten things radically wrong if now, 5 years later, it were clear that say another 5 or 10 data structure types were needed.

If indeed we managed to get this sort of right (for now), it’s only because we had 3 previous attempts (GEHR 1992-95; Australian GeHR (1997-2000), first draft of openEHR (pre-2005)) where we got it wrong. This is a hard problem to solve. In HL7v3, it was attempted with the ‘mood’ code, which is certainly a reasonable starting point philosophically, but doesn’t in the end help you get the right data structures. This is well known in HL7v3 as a difficulty (and I am not criticising for that, as I say our own little effort was 10 years in the making).

The really amazing thing is that traditional epistemological categories are of such little help. Divisions of a priori / a posteriori / how-to are only vaguely useful (we used them and gave up on Aus GeHR), and yet to a clinician, the differences between the observation of blood glucose over 9mmol/l, Dx of diabetes mellitus and insulin care plan for glucose management are crystal clear.

I don’t doubt that something better is possible in the future, but I think for now some finer adjustments on the current ontology and data structures will be of most practical help.

- thomas beale

Let me try and clarify the time aspect of the ontology… the question is not that time doesn’t relate to all Entry types. The question is how. In the Clinical Investigator Ontology Sam and I constructed, there are 3 temporal upper level categories

  • history - i.e. information relating to reality the way it was in the past
  • opinion - thoughts formulated at the current point in time based on what has gone before, and/or what is known so far
  • instructions - concrete statements about what should happen in the future

It is clear:

  • things happened in the past before the time of recording
  • things happening at the time of recording
  • things that will happen after the time of recording.

So far the things can be:

  • an Observation
  • an Evaluation
  • an Instruction
  • an Action

And depending on the context the time can be a point in time or a period.

Obviously these categories were not primarily named based on the temporal aspect, but nevertheless they are based on temporal considerations. They correspond relatively well to the epistemic categories described in Sowa’s upper level ontology as follows:

  • history = Sowa’s ‘history’ category, i.e. a proposition about an occurrent, including anything that can be observed about the state of the subject at a point in time
  • opinion = Sowa’s ‘description’ category, i.e. propositions about a continuant, normally the subject of care. A diagnosis of ‘Diabetes’ for example is saying that the patient John Smith (a continuant) organism includes the diabetes process as part of it
  • instructions = Sowa’s ‘script’ category, which is itself an occurrent representing time-based sequences of events, including conditional decision points

Yes they correspond.
History= An Observation about an occurrent (state)
Opinion= Evaluation propositions (inference) about a continuant (process)
Instruction= Scripts to execute protocols (as occurrents) describing sequences of events including conditionalities

The first and last of these categories are considered by Sowa to be occurrents, i.e. time-related while the second is not. This seems pretty clear.

Alas, I disagree.
The first and the last are occurrents and have timing related with them.
But Evaluations have them as well. Processes (continuants) start and can end in time.
They are recorded at a point in time but the beginnings and ends of these processes extend over time in the past, the present and the future.
The evaluation Diagnosis=Diabetes can have a semantic annotation that this process (condition, continuant) started at age 65 and is ongoing.
To me this is clearly time related.

Now consider the diagnosis archetype (an instance of the ‘opinion’ aka ‘description’ type)… it contains the main ‘proposition’ - i.e. the identified index condition, diabetes or whatever - and a bunch of times / dates / durations / other descriptive details. So how is this not time-based? Well we need to consider that what this information is really doing is drawing a picture of temporal disease course, in terms of these dates and times and durations. These are a way of qualifying the main statement of Diabetes - it is recent, severe, intermittent (symptoms) etc.

I fear that the inference Diabetes as process in the patient system as you describe IS time based since all continuants are processes with begin and possibly end-times.
This information is doing exactly what is intended.
Other additional semantic annotations about the severity, frequency, behavior over time are other things than the notion that there was an inference of diabetes in the patient system.

So why does this temporal aspect matter, practically speaking? For the very simple reason that ‘history’, i.e. past time, is linear, i.e. representable as a series of events / states; instructions are future time, and therefore unavoidably have a branching, conditional nature e.g. ICU insulin management protocol containing items like: if blood sugar drops below 130 mg/dl, drop dosage. So the data structures for historical data - Observations and Actions - are unavoidably different from those for Instructions, which are more like a set of statements, conditions, etc (the structure can be quite complex).

Yes.
Each of the 4 specialisations of Entry has specific patterns attached to it. And time is just one of the possible semantic annotations of each Entry.

And Evaluations (we could have named this one better) has neither historical nor future-instructional nature, but instead the structure of a description or set of statements (thoughts, ideas) about something else - in other words, it could be any structure, so we just use a tree. If we subdivided (as indeed we did in the paper), we could posit a number of more specialised data structures.

Really?

I can record as an inference (Evaluation) something in the past (as history),
something inferred now, or something inferred to happen in the future.
All Evaluations have a history, present and future, as do Observations, Instructions and Actions.

See here for Smith/Ceusters/Scheuermann on an ontology for disease course and diagnosis, in which they identify the same categories of information more or less - clinical picture / diagnosis / plan.

They use (to me) a funny definition of ‘symptom’.
But as realist ontologists they know that diseases, whether diagnosed or not, inferred or not, are occurrents (processes) with times attached to them.

So to summarise, it came down to finding categories on which health information data structures - i.e. information models - could be based that would work reliably for most if not all of medicine. Now, having used these categories for some years, it turns out that they are remarkably stable. I know that the grey-zone debate on Observation/Evaluation will continue for ever, but if one steps back for a second, what you see is that for most types of clinical data, it is obvious which category to use. Most of the archetypes for these types are not really in question. Further, no one has identified (to my knowledge) a strong contender for any new kind of category (excepting for sub-categories of AdminEntry, which will probably appear one day).

On the contrary, in spite of your claimed years of experience, I have my own claim to experience both in the openEHR world and that of the EN13606 association, and this experience shows that the way users use the definitions in openEHR gives rise to problems for the users. I agree that nobody will annotate a Blood Sugar Measurement result as an Evaluation or Instruction or Action.
In the literature I have seen many wrong annotations because of problematic and unclear definitions in the realm of openEHR.
Why are we having this discussion if there were no problems whatsoever?

I think the main thing we got wrong was naming ‘Evaluation’ too narrowly, when what we needed was a name that means ‘what the clinician thought’ (if we had used Sowa’s ‘description’ category I am sure we would now be having arguments about why someone was using a ‘description’ to record a ‘diagnosis’!). We would know we had gotten things radically wrong if now, 5 years later, it were clear that say another 5 or 10 data structure types were needed.

I disagree.
When a clinician is listening to heart murmurs as part of the physical examination I can assure you he is listening and thinking a lot, because what he is listening to is very subtle.
The same goes for palpation, smelling, etc. And there are more examples.

It is not whether he is thinking as a mental process but what matters is what is the relationship with the patient system.
Is he thinking about an observable?
Or is he thinking, inferring, about (disease) processes on the basis of observables.

For an Evaluation it is clear that it is about a process (continuant) in the patient system and that by definition it is an inference.

If indeed we managed to get this sort of right (for now), it’s only because we had 3 previous attempts (GEHR 1992-95; Australian GeHR (1997-2000), first draft of openEHR (pre-2005)) where we got it wrong.

The word Evaluation is OK.
But we need correct, unambiguous definitions with good discriminators to make judgements easy.

This is a hard problem to solve. In HL7v3, it was attempted with the ‘mood’ code, which is certainly a reasonable starting point philosophically, but doesn’t in the end help you get the right data structures. This is well known in HL7v3 as a difficulty (and I am not criticising for that, as I say our own little effort was 10 years in the making).

The really amazing thing is that traditional epistemological categories are of such little help. Divisions of a priori / a posteriori / how-to are only vaguely useful (we used them and gave up on Aus GeHR), and yet to a clinician, the differences between the observation of blood glucose over 9mmol/l, Dx of diabetes mellitus and insulin care plan for glucose management are crystal clear.

I don’t doubt that something better is possible in the future, but I think for now some finer adjustments on the current ontology and data structures will be of most practical help.

I can only point at our efforts dealing with semantical annotations in the EN13606 community.

one thing I forgot to mention is that obviously the primary recordings of the events summarised in the diagnosis will normally have occurred previously in time, and be scattered about in the health record, although this need not be the case of course - someone can come in one day and report headaches for the last 6 months. The point is that the date/time information in the diagnosis is a summary of things considered salient by the clinician to build up a picture on which the diagnosis is based.

- thomas

Not my personal experience, I mean the experience of deployed openEHR systems around the world. In a sense there is no difference between openEHR and 13606 - the same problem has to be solved: what data structures are used to represent various clinical information? It’s just that 13606, being designed solely for haphazard and unknowable (in advance) legacy data doesn’t provide any specific Entry data structures. OpenEHR provides the GenericEntry for that case, but also specific types of Entry for data (legacy or newly created) whose structure is known.

There are various advantages of these data structures:

  • a) you can actually build software that can process the data properly. When everything is a tree, you have to write your own private model for that - everyone re-invents a basic wheel, and those software components are not generally re-usable with different data.

  • b) You can reliably query, because you know where things are. When you aggregate huge amounts of clinical data from numerous sources into patient-centric health records, being able to find Event origin times, Action descriptions, protocol data and so on is worth a lot.

  • c) you have a standard representation for some key patterns, which means archetypes based on those patterns will be the same in those key features, which provides a level of built-in standardisation for archetypes, for those features.

It is not that there are no problems; the problems are to do with, in some cases, knowing what Entry type to use (since you appear to agree with the use of the specialised Entry types, this problem is the same in 13606). The answer has been confused somewhat by us being too purist and trying to completely equate the data structures with the intended ontological categories - as Ian McNicoll has said many times. I have always said, what’s the problem? But as soon as a clinical modeller comes up with an example like some kind of summary, there is a long debate. That’s all that is happening here.

There is nothing basically wrong with the existing Entry types, or at least no evidence of it that I have seen (and a lot of evidence to the contrary). What is potentially not quite right is our understanding of how to use them in all cases. We need to develop better descriptions of which types to use for the more recently apprehended examples like ‘clinical summary’ and various types of lab result which have ‘interpretations’ included.

Interestingly, in the openEHR system implementation environments I know of, these debates are short and sweet. The choice is made, the archetypes, template and software are built and deployed.

- thomas

I think the background of our discussions is about CLASSifying.