openEHR-technical Digest, Vol 18, Issue 38

There is plenty of health informatics science showing that combining data from various systems is only possible when each data element is uniquely coded.

That single use case alone - reusing data from multiple systems - justifies the SHALL linkage between data element/node and a terminology such as SNOMED CT.

I agree with Sam that the meaning should be derived from the DCM and the collection of data elements in it.

Here is where the science of data modelling and the science of clinical terminologies meet and team up.

Kind regards,

Dr. William Goossen

Director, Results 4 Care BV
+31654614458

My two cents,

A nodeId only has to be unique inside an archetype, because an archetype with a specific archetypeId is considered unique.
The path to a data-item contains the nodeId and the archetypeId, and together they form a unique combination. (In most cases the path contains more than one nodeId and, where slots are used, more than one archetypeId.)

A data-item is not identified by the nodeId, but by the archetype-path to that data-item.
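
A minimal sketch of that idea in Python, assuming a data-item reference is simply the pair (archetypeId, path). The class name, the archetype ids and the paths below are hypothetical and only serve as illustration; they are not taken from any published archetype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItemRef:
    """Identifies a data-item by archetype id plus path, not by nodeId alone."""
    archetype_id: str  # hypothetical ids below, for illustration only
    path: str          # built from nodeIds, which are unique only inside one archetype


# Two different archetypes that happen to reuse the same nodeIds.
bp_item = DataItemRef("example-OBSERVATION.blood_pressure.v1",
                      "/data[at0001]/items[at0004]")
weight_item = DataItemRef("example-OBSERVATION.body_weight.v1",
                          "/data[at0001]/items[at0004]")

# The paths are identical, but the references are still distinct,
# so both can be used as keys in the same store.
store = {bp_item: 120, weight_item: 80}
```

The nodeId `at0004` appears in both references, yet the store keeps two separate entries because the archetypeId is part of the key.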

Bert

hi William

There is plenty of health informatics science showing that
combining data from various systems is only possible when each data
element is uniquely coded.

That single use case alone - reusing data from multiple systems -
justifies the SHALL linkage between data element/node and a terminology
such as SNOMED CT.

I generally find that only 5% of my data elements have an appropriate
code in SNOMED CT. Sometimes I can find a general code that can be used
as the code for the field, but it's actually the root code for the set
of values that is the answer. That's not the same as the code for the
field. And I think that, given the sparseness of SNOMED CT for
elements, the granularity of the available codes is too coarse to
even detect subtle mismatches between different systems.

Though I do very much *wish* that what you say was possible

Grahame

exactly correct.

- thomas

Hi Everyone,

There is plenty of health informatics science showing that combining data from various systems is only possible when each data element is uniquely coded.

I sincerely believe that if it wasn't for the positive publication bias
there would be even more science that tells that combining data from
various systems (using various standards) is incredibly hard.

That single use case alone - reusing data from multiple systems - justifies the SHALL linkage between data element/node and a terminology such as SNOMED CT.

In the space of representational artefacts DL ontologies such as SNOMED
CT and many of the OBO ontologies make a balance between
representational power and computational complexity so that the domain
may be represented to some extent while keeping complexity (i.e.
reasoning time) on a practical level. So, SNOMED CT can be classified in
seconds at the cost of a limited representational power. Archetypes on
the other hand have no similar constraints and thus, there will always
be things in archetypes which cannot be faithfully represented in any
computationally manageable ontology. An option would be to add primitive
content to the ontology but the cost of maintaining such content is
considerable as is the risk of introducing quality issues.

How shall the SHALL linkage between data element/node and terminology
be interpreted if there is nothing on the terminology side which
matches? Should the existence of codes rule over record keeping
requirements?

Kind regards,
Daniel

Re: Ontology & archetype codes

Aren't we, here, in the realm of descriptive vs. prescriptive grammar?

http://grammar.about.com/od/basicsentencegrammar/f/descpresgrammar.htm

*Descriptive* obliges you to change whenever the language seems to.

*Prescriptive* obliges you to try to hold the language static.

The hard bit is gauging the utility of responding to any given change.

Gavin Brelstaff
CRS4 in Sardinia

yes, I agree.

And it is the same as communication in a ‘closed world’ or ‘open world’ situation.

Gerard Freriks
+31 620347088
gfrer@luna.nl

Gerard, Everyone,

could you please *NOT* reuse existing terms like "open world" and
"closed world" with an already agreed specific meaning in a well-defined
context for your own purposes!

On the topic of descriptive vs. prescriptive I believe that that is an
additional dimension in this discussion. I still want to have an answer
to the question of what to do with archetype nodes for which there is
no existing terminology correspondence. Should we ban those archetype
nodes, or should we (over)inflate terminologies with imprecise content, or
should we just accept that archetypes and terminology are different
artefact beasts with different properties and that we have to tread
carefully, balancing terminology binding possibilities and specific use
case requirements?

/Daniel

Daniel,

Closed and Open world assumptions are used in the world of:

  • Formal logic
  • Knowledge representation

This notion of Open and Closed world assumptions occurred to me.
Let me explain.
I happen to see a parallel/overlap between systems that serve a well-defined (Closed) community with implicit and explicit agreements, and systems that potentially deal with not-yet-defined things in an undefined (Open) community.
In a system according to the Closed World Assumption all data fields are explicitly and implicitly agreed upon. Nothing that is not defined can be processed, just as with relational databases and messages.
In a system according to the Open World Assumption the semantics of a data field are fully defined by archetypes and reference terminologies. There is (almost) no implicit meta-data. Ontological reasoners can fully exploit the data. These are the systems we want but do not have on the market.

Do you have any suggestion for alternative terms?

Gerard

Gerard Freriks
+31 620347088
gfrer@luna.nl

I have questions:
What is the purpose of a Reference Terminology when it is missing essential and relevant lemmas?
Perhaps we need several Reference terminologies?
Then the next question is how do we delineate more than one Reference Terminology?

One thing I know:
We need an agreed list of words we use, reflecting concepts we need, in the context of health data inside systems and between systems.
We need a Reference Terminology as a kind of dictionary.
How many dictionaries do we need?
One per domain such as: anatomy, demographics, medicinal product, health and care services (interventions, lab-tests, etc.), structure of documents, units of measurement, family relations, kinds of media formats, etc., etc.

Gerard’s description of what he calls the Open World is precisely the problem of archetype nodes with no terminological bindings. It is impossible to reason with them, in principle even for humans.

When I receive data with a node identifier and I can look up in the archetype that the label attached to that node in the archetype definition is systolic, I still don’t know whether or not it is a systolic blood pressure, even when the archetype is about blood pressure. Only with a validated terminological link do we know the semantics of the node. The designers of the archetype could equally well have labelled the node goofey. With the proper terminological binding we know that that goofey node is the systolic blood pressure.

The only way out of this is to collect all those nodes that do not have a terminological binding and provide in a freely accessible document what the meaning of each node label is.
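
A small sketch of this argument in Python, under the assumption that an archetype can be reduced to two lookup tables: local node labels and optional term bindings. The node ids, the labels and the placeholder terminology code are all hypothetical; no real SNOMED CT code is implied.

```python
from typing import Dict, List, Optional

# Hypothetical archetype fragment: the local label says nothing reliable,
# the term binding (where present) carries the meaning.
NODE_LABELS: Dict[str, str] = {
    "at0004": "goofey",
    "at0005": "diastolic",
}
TERM_BINDINGS: Dict[str, str] = {
    # placeholder code, not a real terminology identifier
    "at0004": "EXAMPLE-TERMINOLOGY::systolic-blood-pressure",
}


def bound_meaning(node_id: str) -> Optional[str]:
    """Return the terminology code bound to a node, or None if it is unbound."""
    return TERM_BINDINGS.get(node_id)


def unbound_nodes() -> List[str]:
    """Collect the nodes whose meaning would have to be documented separately."""
    return sorted(n for n in NODE_LABELS if n not in TERM_BINDINGS)
```

Here the node labelled goofey still resolves to a defined meaning through its binding, while `unbound_nodes()` produces the list of nodes that would need a freely accessible description.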

Jan Talmon

I wonder if the purpose of an archetype is getting unclear in this discussion? Aren't we talking about templates?

I think the purpose of an archetype is to give context to the data-nodes. They are not meant to be read by machines without human intervention.
In the case where machines deal with isolated data-items, every node should have a stand-alone, unambiguous meaning.
I think the purpose of archetypes is to make software run, and to use them in-house, for a well-known software purpose. The purpose is to serve humans, not machines.

If you want to use data for data exchange, like messaging or the semantic web, you cannot use archetypes in the way they work now.
You have to define messages, like Nictiz did, in some XML format, or you have to define a specific data format to send to the semantic web, for example for epidemiological detection or other big-data purposes.

You can use templates to create those data-constructs/formats.

It seems to me very inefficient to search for a data-notation which can serve every purpose.

Bert

Bert,

Archetypes were conceived to support SEMANTIC INTEROPERABILITY. 13606 is a communication standard, but of course you can also use it to build systems. openEHR had 13606 as its root (Thomas Beale was also involved in 13606 as (co-)author of the archetype and ADL parts of the standard). As far as I know, an EHR extract can be wrapped in an HL7 message body (a blob) and transmitted to another system. In principle archetypes should ease the communication, since you do not have to define in detail all the data elements in a message, but can make the message self-explanatory.

So it is not a new use case that should be treated differently.

Jan

Hi Daniel,

there is a third answer - the one actually in use. There is little/no economic benefit in coding most archetype nodes. So not being able to code them doesn't matter that much, at least for many years to come while there are almost no environments that could make use of the codes.

The problem today might be what someone considers high value nodes that should be coded, but that can't be - either because of a definitive lack of a code, or more commonly due to the inability of analysts to figure out which of the many approximately matching ones is the correct match (if any). I have no idea how big or small this problem is today.

There is certainly no point banning archetype nodes that can't be coded, because that would prevent any archetype development (and going by Grahame's comments, any FHIR development).

What I think is needed is a theory that identifies which archetype nodes really need to be coded, and when. Today the best we can come up with is that names of terminal nodes that might appear in queries, plus values of those nodes that have codable values, should be coded.

- thomas

You have a good point in this.
But does that mean that every data-item in isolation should have an unambiguous meaning?
Doesn't it mean that archetypes must be seen as complex data-items whose items can become quite meaningless in isolation?

This is how I understood it always.

An address is just a street with a number, but it becomes meaningful if the rest of the archetype tells you that there is a hospital at that address with some services.

Don't pin me on the example, it is just to explain.

But how can you know that the rest of an archetype describes a hospital, and that the address becomes meaningful?
You will have to study the archetype and interpret it.
And is the address the emergency-entry, or the fire-exit-door?
Oh yeah, maybe there is a code for that, somewhere.
It depends very much on your specific situation which one you want to find.

There is no machine which can figure this out. Archetypes are thus, in my opinion, made for humans; they are fit for semantic interoperability in the sense that, taken as a whole, they are self-explanatory (for humans).

I believe however, that the promise of flexible automatically/machine interpretable interoperability is not true. An archetyped data set remains something that should be judged by humans.

As you know, I am not an academic researcher trying to find the philosopher's stone. This is only my opinion, from my experience and daily practice. I am building systems based on openEHR and also EN13606, yes indeed.

I deal with practical wishes: I want my software to run, and I don't think that what you think openEHR is will ever be possible. And why should it be? Any well-defined message can do the trick.

What attracts me to openEHR is its flexibility: two-level modelling, another very important goal.
That goal is well reached, though the archetype editors could be better.
openEHR, and also a well-built EN13606 kernel, can adapt flexibly to the ever-changing requirements of health care.

I think it also has some connection with the idea of one worldwide archetype repository. But we found out in discussion that this will never happen. So now, in the new ADL standard, 1.5, there will be room for namespaces. Archetypes will not be centrally maintained, but every company will have its own set.

This also means something for the view on semantic interoperability.

My opinion in software building is: don't let rigid datastructures keep you from innovating healthcare. Software should be able to follow the practice, at low costs, and also quick. Archetyped systems make this possible.

But maybe, for this reason, I should not interfere in this academic discussion. Sorry if that is the case.

Bert

Bert,

Each data-item in isolation should indeed have an unambiguous meaning. That doesn't mean that knowing what the meaning is, is helpful. Merely knowing that one of the data-items in a blood pressure data set is the systolic blood pressure supports semantic interoperability, but does not necessarily allow a proper clinical assessment, because the context is missing (was it at rest or after exercise, how many minutes of rest, position of the patient, cuff size). But still we know it is a systolic blood pressure, and when you receive it and store it in your system you have to address the unknowns.

An address archetype is just an address archetype, but there may be different variants. One with a street + number as one item, the other with street, number, city, postal code, province, country. Still you need to define which item represents what, in such a way that it is interpretable in a consistent way.

A hospital may be described by its address as well as some other items that are relevant, like the services that are provided. Still each slot needs to be defined in such a way that when you receive a set of data structured according to a specified archetype, you know the semantics of that archetype. You can say that is for humans to interpret, but it would be nice if your system were able to receive an archetype-based set of data and populate itself with the received data.

Items that are not recognized may need human intervention to be handled (or may just be stored as a blob, together with (a reference to) the archetypes).
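
A rough sketch of such a receiving step in Python, assuming the receiver keeps a table of the (archetype id, path) pairs it understands. The table, the field names, the archetype id and the paths are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical table of items this receiving system knows how to store.
KNOWN_ITEMS: Dict[Tuple[str, str], str] = {
    ("example-OBSERVATION.blood_pressure.v1", "/data[at0001]/items[at0004]"): "systolic_bp",
}


def ingest(archetype_id: str,
           items: Dict[str, Any]) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Populate local fields for recognized items; set the rest aside.

    Unrecognized items are kept together with a reference to their
    archetype, so that a human can handle them later.
    """
    record: Dict[str, Any] = {}
    unrecognized: List[Dict[str, Any]] = []
    for path, value in items.items():
        field = KNOWN_ITEMS.get((archetype_id, path))
        if field is None:
            unrecognized.append(
                {"archetype_id": archetype_id, "path": path, "value": value})
        else:
            record[field] = value
    return record, unrecognized
```

For example, calling `ingest` with one known and one unknown path fills `systolic_bp` and returns the unknown item in the second list, still tagged with its archetype id.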

Jan

OK Jan, that is your opinion, and with good reason, especially if you look at the history of EN13606 and also, less emphasized, openEHR.

openEHR also aims to be a two-level modelling system which, as I wrote, really is a big convenience.

Now there is a movement in EN13606 towards being a two-level modelling data store too.

The kernel I wrote takes EN13606 and openEHR data sets, even simultaneously, and they are queryable, even in one query across different reference models, CDA included. As long as it is expressible in ADL, I can store it and query it.

This flexibility is very important.

I never understood why Nictiz messages aren’t good enough. They are written for use by a large variety of non-HL7 systems, lots of them with very old and rigid data models: systems at hospitals, nursing homes, GPs, dentists, pharmacies, etc. The whole shebang of Dutch healthcare must be able to use them. They were designed to save lives, 1800 lives every year, by avoiding medication errors.
Nictiz worked 10 years on them; they spent 500 million Euro.

Is the message that they are useless? Is it a failure?

Because, if they are useful, and many systems can use them, why wouldn’t they be good enough for an openEHR system? Why then emphasize that openEHR should be interoperable on data-items inside complex data sets?

Isn’t this a legacy requirement? Especially, as I wrote, since the central maintenance of archetypes is in fact given up since the introduction of namespaces in archetypeIds.

I would like to hear how interoperability can be achieved under these circumstances, and if that is still a goal in the way you expect it.

Bert

Companies could make their own set, and sometimes they will make their own specific archetypes, but in the majority, I think they will re-use what is already available. Consider: creating from scratch 20 or so key archetypes (perhaps 400 data points) has taken 100s of hours of expert clinician time and quality assurance - very few companies could attempt that. Also, companies that routinely make products with archetypes that no one else uses and/or companies that don't share truly new archetypes... won't have many interoperability partners.

- thomas

In the environment where I worked for the last few years, we created maybe 50 archetypes, a few more or less. We did not use any from CKM, but some were inspired by CKM.

My experience is that customers, users of our OpenEHR services, wanted their own archetypes.
Interoperability was achieved by adopting a message-format which serves the purpose.