Hi Andrew,
This description of ‘semantic’ attributes is quite useful. People should indeed realise that the named attributes in various health reference models are often semantic concepts that just happen to be defined in the information model rather than in a terminology. The same occurs at the class level. The point of doing this is that such models commit to some level of meaning-of-structure rather than being completely agnostic. In terms of your explanation, a few points:
- all models (openEHR, ISO13606, CDA usage of RIM) provide a structural container concept, which can be understood as a ‘document’ or ‘recording’. Classes for this purpose include COMPOSITION (= CDA Document) and SECTION, and also many classes and attributes for defining audit & context attributes. openEHR & 13606 have a similar model for this.
- the openEHR reference model uses ‘semantic classes’, e.g. OBSERVATION, EVALUATION, INSTRUCTION, ‘semantic attributes’ e.g. ‘CARE_ENTRY.protocol’, as well as generic compositional attributes like CLUSTER.items. Some of these classes, like HISTORY and EVENT are for extremely generic concepts in science/mathematics.
- the HL7 RIM uses ‘semantic classes’, e.g. Observation, ‘semantic attributes’, e.g. ‘activityTime’, special attributes like ‘contextControlCode’, ‘contextConductionInd’, generic compositional attributes like ActRelationship.source and target. There are some data structure classes within the HL7v3 data types, like HXIT (a generic history concept) as well as hardwired classes like Person name, address etc.
- ISO 13606 uses no ‘semantic’ classes at the domain level - everything is just an Entry, but it does include a couple of ‘semantic attributes’ in the wrong place (a specific attribute should never be on a generic class) - ITEM.obs_time, ENTRY.act_status.
All three approaches use some special ‘programmable attributes’ which are given values at archetype design time, or at runtime to provide the actual meaning of the instance:
- openEHR: there is just one such attribute, LOCATABLE.archetype_node_id, which marks any object with the concept code from an archetype (thereby for example making an ELEMENT instance into a systolic blood pressure measurement)
- HL7 attributes like Act.code, moodCode, classCode, levelCode etc
- ISO13606 has a few attributes for this purpose: RECORD_COMPONENT.archetype_id, ITEM.item_category, CLUSTER.structure_type.
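To make the ‘programmable attribute’ idea concrete, here is a minimal Python sketch. It is only an illustration: the Element class is a much simplified stand-in for the openEHR ELEMENT class, and the node codes and their labels are assumptions chosen for the example, not taken from any particular archetype.

```python
from dataclasses import dataclass

# Hypothetical archetype fragment mapping node codes to meanings.
# Codes and labels are assumptions made up for this sketch.
ARCHETYPE_TERMS = {
    "at0004": "systolic blood pressure",
    "at0005": "diastolic blood pressure",
}


@dataclass
class Element:
    """Simplified stand-in for a generic ELEMENT: a value plus a node code."""
    archetype_node_id: str
    magnitude: float
    units: str

    def meaning(self) -> str:
        # The class is generic; the meaning is looked up from the archetype.
        return ARCHETYPE_TERMS.get(self.archetype_node_id, "unknown")


systolic = Element("at0004", 120.0, "mm[Hg]")
diastolic = Element("at0005", 80.0, "mm[Hg]")
print(systolic.meaning())  # prints "systolic blood pressure"
```

The two instances are of exactly the same class; only the value of archetype_node_id distinguishes a systolic from a diastolic measurement.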
Another key difference that contributes to the difficulty of mapping & harmonisation: HL7v3 uses a pervasive ‘restriction’ approach in all its modelling, which means that all RIM classes contain numerous attributes (in theory sufficient to cover every possible situation in the universe of healthcare), and these have to be constrained out to come up with classes that mean something for any given concrete situation - the high level classes like Act and ActRelationship don’t have any meaning as such. On the other hand, openEHR and 13606 take a standard object approach to the information model, and have only very general attributes defined on general classes; more specific descendants then add relevant attributes. To give an analogy: in the HL7 approach, if you had a model of ‘animals’, the top class Animal would contain every kind of defining characteristic of particular animals, such as ‘wing’, ‘claw’, ‘fin’, ‘spine’, ‘horn’ etc, and the class Bird would be created by the removal of ‘fin’, ‘horn’ etc.
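The animal analogy can be written down as a small Python sketch contrasting the two paradigms. This is an illustration of the analogy only, not of how HL7 tooling or openEHR is actually implemented; the restrict function and the choice of which attributes belong to Bird are assumptions made for the example.

```python
# Restriction style: one top-level model carries every attribute, and a
# specific model is produced by constraining attributes out.
ANIMAL_ATTRIBUTES = {"wing", "claw", "fin", "spine", "horn"}


def restrict(base: set, removed: set) -> set:
    """Derive a specific model by removing attributes from a general one."""
    unknown = removed - base
    if unknown:
        raise ValueError(f"cannot remove attributes not in base: {sorted(unknown)}")
    return base - removed


bird_by_restriction = restrict(ANIMAL_ATTRIBUTES, {"fin", "horn"})


# Extension style: the general class has only general attributes, and
# descendants add what is specific to them.
class Animal:
    attributes = frozenset({"spine"})


class Bird(Animal):
    attributes = Animal.attributes | {"wing", "claw"}


print(sorted(bird_by_restriction))
print(sorted(Bird.attributes))
```

Both routes arrive at the same Bird here, but in the restriction style the top class must anticipate every attribute in advance, and nothing means anything until it has been constrained.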
These differences not just in modelling but in modelling paradigm are what make harmonisation difficult.
My question is: why do we want harmonisation? What we need is a clean, clear information model for the health information domain, carrying sufficient semantics to be useful, while being ‘programmable’ in some way. Trying to ‘harmonise’ 3 different models & theories of modelling won’t achieve that; it will simply create a Frankenstein.
Do we need convertibility? If we have a world in which all 3 of these models exist, then there is no avoiding it. The only question is whether it can be achieved in a generalised manner - e.g. a converter that can take any CDA and generate the equivalent 13606 Extract - or in a custom ad hoc manner - i.e. every CDA => 13606 conversion requires its own converter - or something in between.
We should also be interested in the qualities of the models themselves. Many government programmes are paralysed trying to work out which one to use. In a different world, the problem could have been solved by an information model so generic as to be a ‘node/arc’ construct; this would be like having to build a car from individual atoms of iron rather than nuts, bolts and sheet metal. ISO 13606 has stayed more generic, based on the idea that it is a lowest common denominator model of interchange that can’t impose a view of semantics on originating systems. HL7 (both message and CDA forms) is intended to be a model of message & document interchange, and is quite generic in the sense that you have to make everything you want out of Act/ActRelationship nodes (essentially a node/arc idea). openEHR on the other hand models some of the basic structures found in science and computing, to help the definition of models.
Hence in openEHR there is quite a good model of time-based data, in the HISTORY & EVENT classes. These classes allow the easy definition of structures via archetypes for:
- any time series of instantaneous measurements, like vital signs
- inclusion of ‘interval’ events, representing e.g. 4h average, max, etc
Another inclusion in the model is OBSERVATION.state (also in EVENT), where the state of the subject of recording can be recorded. This enables a very clean representation of the history of events in an oral glucose tolerance test, where not only is there timing involved, but also the state of the patient at each point is key information (post fast, 1 minute post 75g glucose challenge, 1hr post 75g glucose challenge etc).
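A rough Python sketch of the ‘history of events with state’ structure is shown below. The classes are simplified stand-ins for HISTORY, point events and interval events, not the actual openEHR class definitions; the field names, the glucose values and the units are assumptions invented for the illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class Event:
    """Simplified point event: time offset, data value and subject state."""
    offset: timedelta
    data: float  # e.g. blood glucose; values and units are made up
    state: str   # state of the subject when the data were recorded


@dataclass
class IntervalEvent(Event):
    """Simplified interval event summarising a period, e.g. a 4h average."""
    width: timedelta = timedelta(0)
    math_function: str = "mean"


@dataclass
class History:
    """Simplified history: an origin time plus a series of events."""
    origin: datetime
    events: List[Event] = field(default_factory=list)

    def timeline(self) -> List[Tuple[datetime, str, float]]:
        # Order events by offset and convert offsets to absolute times.
        ordered = sorted(self.events, key=lambda e: e.offset)
        return [(self.origin + e.offset, e.state, e.data) for e in ordered]


# Oral glucose tolerance test, using the patient states named above.
ogtt = History(
    origin=datetime(2010, 1, 1, 8, 0),
    events=[
        Event(timedelta(hours=1), 9.1, "1hr post 75g glucose challenge"),
        Event(timedelta(0), 5.2, "post fast"),
    ],
)

for when, state, value in ogtt.timeline():
    print(when.isoformat(), state, value)
```

Because timing and state live in the reference model, an archetype for a glucose tolerance test only has to say which states and which data item apply; it does not have to reinvent the time series itself.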
As mentioned above, there are various kinds of specific attribute in HL7, including in Act descendant classes.
The key question here is: how easy is it to create technology-independent models of content, based on these models? In other words, can I create a model of the many medical concepts found in clinical information in some framework based on an information model - and can I reuse that one definition for a) messages, b) screen display, c) screen capture, d) reporting, etc.?
This is what archetypes offer: a method of single-source semantic modelling. Now the choice of information model becomes critical. If the information model is just node/arc, we get no help, and we have to soft-model different ontological categories like Observation and Instruction over and over again, and build every simple common structure like ‘history of events with state’ from scratch every time. If it provides some of this help, then the clinical models of content don’t have to re-invent such things, and life is easier. If it doesn’t, modellers will model all of these things differently every time, and greatly reduce the chances of the information being interoperable. One of the other key values of archetypes is that they support a coherent and portable query methodology, called Archetype Query Language, based on archetype paths. This provides a lot of power over using and reusing health data.
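The idea of querying by archetype path can be sketched as follows. This is not Archetype Query Language itself, only a toy Python resolver for paths of the form attribute[node_code]; the Node class, the resolve function and the tree of node codes are all assumptions made up for the example.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Node:
    """Toy data node: a node code, an optional value and named child lists."""
    archetype_node_id: str
    value: Any = None
    children: Dict[str, List["Node"]] = field(default_factory=dict)


# One path segment looks like: attribute_name[node_code]
_SEGMENT = re.compile(r"^(\w+)\[(\w+)\]$")


def resolve(node: Node, path: str) -> Optional[Node]:
    """Follow a path of attribute[code] segments; return None if no match."""
    current = node
    for segment in path.strip("/").split("/"):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"unsupported path segment: {segment!r}")
        attribute, code = match.groups()
        candidates = current.children.get(attribute, [])
        current = next(
            (c for c in candidates if c.archetype_node_id == code), None
        )
        if current is None:
            return None
    return current


# Hypothetical blood pressure observation; all codes are illustrative.
observation = Node("at0000", children={
    "data": [Node("at0001", children={
        "events": [Node("at0006", children={
            "items": [Node("at0004", value=120.0), Node("at0005", value=80.0)],
        })],
    })],
})

systolic = resolve(observation, "data[at0001]/events[at0006]/items[at0004]")
print(systolic.value)  # prints 120.0
```

The point is that the same path works on any data built from the same archetype, whatever system stored it, which is what makes such queries portable.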
In summary, openEHR has opted to provide some ontological and structure concepts in its information model that are common across healthcare, rather than force its archetypes to have to re-invent everything each time - in other words to standardise common structure concepts that otherwise get reinvented in incompatible ways. This is the reason that there are some hundreds of archetypes in the CKM right now, and the NHS was able to fairly quickly build over 1000 archetypes in 2007. It has proven much harder to do the equivalent thing with HL7 v3. There are RMIMs, but these contain a lot of messaging attributes and it is hard to see how to reuse them in non-message situations. One can create CDA templates, but as they rely on HL7 RIM constructs at level 3, you still get hit by the difficult RIM semantics, and lack of common structure classes.
So, here is an existential question: what is the so-called Detailed Clinical Models activity in fact trying to do? Use a node/arc model and create its own domain models for everything (including having to replicate History-of-events every time)? It could do this easily enough: just choose a model with some document/section container semantics, and a simple Cluster/Element internal structure, and then use archetyping. In the history of openEHR we were doing this in about 2002. What we found was that being too simple came with a huge cost - limited clarity, and unnecessary replication of typical structures. Over a period of some years, we improved the openEHR RM to include things that domain modellers should not have to keep re-inventing. The openEHR RM is the only model I know of that has been actively re-engineered over time to directly absorb requirements from domain modellers using it. I would suggest that DCM thinks very carefully about what its goal is, and how many years it wants to take to get there. openEHR already offers a very solid basis for doing much of what DCM could be aiming for, and could greatly reduce the time taken to make progress.