Trying to understand the openEHR Information Model

well, yes, there'd be nothing lost, and everything would be in the database. But if the users can only see the last update, then prior stuff is lost anyway. If, on the other hand, users can see the older updates, then they'd simply have no idea what information was current.

I think of that last as the worst possible outcome.

Grahame

well, yes, there'd be nothing lost, and everything would be in the database. But if the users can only see the last update, then prior stuff is lost anyway. If, on the other hand, users can see the older updates, then they'd simply have no idea what information was current.

In a new version, only what the user (e.g. a GP) decides to remove from the composition will be removed.
So a new version of a composition is what the GP wants it to be.

It is his responsibility if he removes information, just as it is in a conventional EHR system.
The visible information is all of the current and historic information.

Every composition that is ever written remains visible, unless it is replaced by a new version (mostly this is a correction) or it is logically deleted by the GP.
Just as it is in a conventional EHR system.

(The difference is that openEHR keeps the old version, but that is for archiving, and maybe for legal or other research, not for normal EHR use.)

So I don't see the problem.

Bert

These scenarios were one of the reasons we were very careful to properly model commit time (system time) separately from the times of the visit, observations, actions etc (world time). The commit of the info may come days late, but it is always easy to determine a) what other clinicians could see on the system at time T and b) in what order things happened in clinical reality. The caveat is that the system won't tell you the full story until everyone has committed their data.
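
The system-time / world-time split described above can be sketched in a few lines (a hypothetical illustration; the field and class names are not the actual openEHR attribute names):

```python
# Sketch of the commit-time / world-time split (hypothetical field names,
# not the actual openEHR attribute names).
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class CommittedEntry:
    content: str
    clinical_time: datetime   # when it happened in clinical reality (world time)
    commit_time: datetime     # when it reached the EHR system (system time)

log = [
    CommittedEntry("BP reading taken on home visit",
                   clinical_time=datetime(2014, 3, 3, 9, 0),
                   commit_time=datetime(2014, 3, 5, 17, 30)),  # committed days late
    CommittedEntry("Medication review",
                   clinical_time=datetime(2014, 3, 4, 11, 0),
                   commit_time=datetime(2014, 3, 4, 11, 5)),
]

def visible_at(entries, t):
    """(a) what other clinicians could see on the system at time t."""
    return [e for e in entries if e.commit_time <= t]

def clinical_order(entries):
    """(b) the order in which things happened in clinical reality."""
    return sorted(entries, key=lambda e: e.clinical_time)

# At noon on 4 March only the medication review is visible, although the
# BP reading happened first in clinical reality.
assert len(visible_at(log, datetime(2014, 3, 4, 12, 0))) == 1
assert clinical_order(log)[0].content.startswith("BP")
```

The caveat in the text shows up directly: `visible_at` only tells the full story once everyone has committed.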

This doesn't mean there are no tricky concurrent write situations, but via the above, and the versioning semantics (which include system-based branching), there are reasonably obvious strategies for correctly resolving the confusion.

- thomas


Hi Gavin and others!

I thought about this a few years ago and came to the conclusion that the GUI/client would need quite a bit of savvy HCI. The person working on the data needs to be kept informed of how and when the system may be changing under them.

Google Docs has now come along and does something like that. You’re busy editing one section of an article when a networked colleague begins to edit the same thing. GDocs tells you who it is and how to communicate with them by a secondary channel (the EHR would be the primary channel). You can both keep editing, but at least you know you are going to have to double-check the result afterwards. Conflict resolution is best avoided by timely human intervention rather than automated attempts afterwards.

And GDocs does well even when clients go offline for a short time.
[…]
Gavin Brelstaff - CRS4

Some of the “magic” behind multi-user/multi-device editing in Google Docs comes from “operational transformation” algorithms.

Have a look at for example:
http://www.codecommit.com/blog/java/understanding-and-applying-operational-transformation
or http://en.wikipedia.org/wiki/Operational_transformation

Very interesting stuff when you look closer at it. Some years ago one of our student projects used some of that power, as provided in Google Wave, to experiment with a partial implementation of a multi-user archetype editor. In that case the operational transformation operated on XML fragments. The simplest case is operations on plain text - that is the case usually described in explanations. Open source implementations of operational transformation working with, for example, pieces of JSON are also available (http://sharejs.org/).
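
The "simplest case" of plain-text operations can be sketched as a toy transform for two concurrent inserts (purely illustrative; real OT engines also handle deletes, operation composition, and richer data types such as the XML and JSON cases mentioned above):

```python
# Toy operational transformation for concurrent inserts into plain text.
# Each operation is (position, text); transform rewrites one operation so it
# can be applied after the other has already changed the document.

def transform_insert(op_a, op_b, a_has_priority):
    """Rewrite op_a to apply after op_b; the flag breaks position ties."""
    pos_a, text_a = op_a
    pos_b, text_b = op_b
    if pos_a > pos_b or (pos_a == pos_b and not a_has_priority):
        pos_a += len(text_b)  # op_b shifted op_a's target to the right
    return (pos_a, text_a)

def apply_insert(doc, op):
    pos, text = op
    return doc[:pos] + text + doc[pos:]

doc = "medication list"
a = (0, "Current ")       # user A prepends, position in the original doc
b = (15, " (reviewed)")   # user B appends, position in the original doc

# Each site applies its own op first, then the transformed remote op;
# both converge on the same document.
site_a = apply_insert(apply_insert(doc, a), transform_insert(b, a, False))
site_b = apply_insert(apply_insert(doc, b), transform_insert(a, b, True))
assert site_a == site_b == "Current medication list (reviewed)"
```

Convergence regardless of application order is exactly the property that lets both Google Docs users keep typing.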

In the upcoming (BMC accepted) paper “Applying representational state transfer (REST) architecture to archetype-based electronic health record systems” (and briefly in my thesis) I mention the idea of using operational transformation in the EHR editing stage that takes place before doing “real” openEHR contribution commits. This could be an interesting replacement for, or upgrade of, the “Contribution Builder” component described in the paper. It would allow simultaneous shared multi-user and multi-device data entry for many (but not all) use cases. It won’t scale to thousands of users simultaneously preparing the same contribution for the same patient, but it should scale well for a handful of simultaneous users per patient if they are somewhat aware of each other’s duties and responsibilities. The possibility of flagging openEHR content as “incomplete” would allow snapshots of the shared contribution build to be persisted in the proper EHR at regular intervals and/or when actively triggered by the user needing to shift attention to other things. Later, when considered complete, another version could be marked as complete, signed and committed.
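
The snapshot workflow described above could look roughly like this (names and structures are hypothetical illustrations, not taken from the paper):

```python
# Illustrative snapshot workflow: periodic "incomplete" persists from a shared
# contribution build, then a final version marked "complete" before signing.
# Names and structures are hypothetical, not taken from the paper.

snapshots = []

def persist_snapshot(text, complete=False):
    snapshots.append({"content": text,
                      "lifecycle_state": "complete" if complete else "incomplete"})

persist_snapshot("draft consult note, 10:05")          # regular-interval snapshot
persist_snapshot("draft consult note, 10:20")          # user shifts attention
persist_snapshot("final consult note", complete=True)  # ready to sign and commit

assert [s["lifecycle_state"] for s in snapshots] == \
    ["incomplete", "incomplete", "complete"]
```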

If anybody currently has time or resources (e.g. master thesis students) to pursue an operational transformation openEHR data entry approach in an open source project, then don’t hesitate to contact me for more detailed discussions and potential cooperation. A bit wiser from my work with the repeatedly delayed REST implementation and publication approach, I’d prefer to do such experimentation in an incremental, multi-site, open, public way instead of only having a big publication/delivery in the end.

Best regards,
Erik Sundvall
erik.sundvall@liu.se http://www.imt.liu.se/~erisu/

P.S. Quote from the upcoming paper “Applying representational state transfer (REST) architecture to archetype-based electronic health record systems”:

Hi Thomas,

Again, you’ve advanced my grasp of openEHR.

the change set in openEHR is actually not a single Composition, it’s a set of Composition Versions, which we call a ‘Contribution’. Each such Version can be: a logically new Composition (i.e. a Version 1), a changed Composition (version /= 1) or a logical deletion (managed by the Version lifecycle state marker). So a visit to the GP could result in the following Contribution to the EHR system:

  • a new Composition containing the note from the consultation
  • a modified Medications persistent Composition
  • a modified Care Plan persistent Composition.

Your comment here is in the context of persistent Compositions, and I think what you’re saying is that these are a special case: persistent Compositions, unlike event Compositions, contain only one kind of persistent information, and no event information, thus allowing clean substitutions when that persistent information is later updated. This would avert the horrible scenario I suggested, involving updating heterogeneous persistent Compositions. If I’m grasping you, this makes perfect sense.

Systems do have to be careful to create references that point to particular versions.

Does that mean that tracing a web of connections with current relevance requires systems to present invalidated Compositions to users? Or are the links themselves revised to point to the replacement Compositions? If the latter, how does one avoid having to recommit whole sets of revised compositions involved in the affected thread of links? It would seem that you can’t just swap out one item in a tangled web, at least not without some very sophisticated compensatory activities. Or maybe links are somehow named in such a way as always to point to the latest version of something, which you seemed to suggest is possible (version-proof links?).

OpenEHR is a remarkable piece of technology. An EHR record is externally a collection of independent and separate documents called Compositions that can be invalidated and versioned and swapped out at any time. Yet, logically and internally, it is magically a vast graph of nodes and edges, with connections not just within archetypes but also between archetypes. Logically, the nodes (typically archetypes) are not deleted (usually) nor do they lose their initial identity when their contents change or when links between them are altered. One wonders, then, why not just use a graph DB instead of a collection of documents to house the information? Wouldn’t that be a shorter path to the same end and reduce some of the versioning complexity (you’d say that would increase versioning complexity)? Perhaps there are some openEHR implementations that are doing just that. No? Could an openEHR system use a graph DB and still be considered openEHR?

Do you have a picture or map, somewhere, of your metadata graph, or must I examine individual archetypes to see all the links between them?

there is an emerging set of ‘second order’ object definitions that use the URI-based referencing approach in very sophisticated ways to represent things like care plans, medication histories and so on. I can’t point to a spec right now, but they will start to appear.

What is the motivation for that? To increase the granularity of externally-referenceable objects? What current problem would this solve?

We need something to keep us off the streets…

Not a worry for you, sir. I’ll embarrass you by letting on here how impressed I’ve been with the raw intellect everywhere evident in what I take to be chiefly your creation and the literary talent you have exercised in making it all clear. Great work!

Randy

to be 100% clear: the change set versioning model works for all Composition types - a single change set (what we call a Contribution) can contain versions of persistent and event Compositions. Semantically, your understanding above is correct: persistent Compositions are always dedicated to a single kind of information, usually some kind of ‘managed list’ like ‘current medications’, ‘vaccinations’ etc.

Normally when a Composition is committed (within a Contribution) and it contains a LINK or DV_EHR_URI, that link points to the logical ‘latest available’ target, so the link is always valid. Such a link might point to e.g. a lab result Event Composition. The assumption is that the only changes to a lab result are corrections or - in the case of microbiology and some other long-period tests - updates, but essentially the latest available version = the result. On the other hand, a link to a care plan might easily point to the care plan (usually a persistent Composition) as it was at the moment of committal. If the referencing Composition were retrieved, and that link dereferenced, an older version of the care plan would be retrieved.

Absolutely. Using path-based blobbing probably isn’t a million miles from such DBs. Personally I used a wonderful object database called Matisse (still around today), which essentially operates as a graph db with write-once semantics, and I would love to have a side-project to build an openEHR system on that. Nevertheless, there are a couple of container levels that have significance in models like openEHR, 13606, CDA and so on: the Composition (can be seen as a Document) and the Entry (the clinical statement level). So it’s not completely mad to do blobbing at these levels, or build in other assumptions around them.

For example: provide a fast retrieval ‘map’ of all medications, including all actions, for some care plan, e.g. chemotherapy.

Ah - don’t blame me for it.
I added some engineering understanding and integration along the way, but this work started with a bunch of very smart clinical people who gathered the best set of requirements for the ‘EHR’ concept, during the Good European Health Record project. One of them, Dr Dipak Kalra (now head of department of CHIME at UCL), wrote his PhD thesis on EHR requirements, and one outcome of that was the ISO 18308 standard, on the same topic. Sam Heard and other physicians were key in developing these requirements and the understanding they have given to the domain have greatly affected the quality of the development. This plus numerous technical people, debates, conferences etc have led to the specifications you see today. Have a look at the revision histories, particularly on the EHR IM and Data types - you’ll see a lot of names. - thomas

Using path-based blobbing probably isn’t a million miles from such DBs. Personally I used a wonderful object database called Matisse (still around today), which essentially operates as a graph db with write-once semantics, and I would love to have a side-project to build an openEHR system on that.

I’ll talk to you off line sometime about that. I can tell you from my own experience that it might not be as forbidding as one would think. The more I’ve examined the archetypes and seen how they are linked and the linkage rules are defined, the more excited I’ve become. There’s definitely a way to do this.

Nevertheless, there are a couple of container levels that have significance in models like openEHR, 13606, CDA and so on: the Composition (can be seen as a Document) and the Entry (the clinical statement level). So it’s not completely mad to do blobbing at these levels, or build in other assumptions around them.

I agree it’s not “completely mad” to use your approach partly because you really do need the information in chunks that smell and feel like parchments. But what if you could have it both ways?

ah - don’t blame me for it. I added some engineering understanding and integration along the way, but this work started with a bunch of very smart clinical people who gathered the best set of requirements for the ‘EHR’ concept, during the Good European Health Record project. One of them, Dr Dipak Kalra (now head of department of CHIME at UCL), wrote his PhD thesis on EHR requirements, and one outcome of that was the ISO 18308 standard, on the same topic. Sam Heard and other physicians were key in developing these requirements and the understanding they have given to the domain have greatly affected the quality of the development. This plus numerous technical people, debates, conferences etc have led to the specifications you see today. Have a look at the revision histories, particularly on the EHR IM and Data types - you’ll see a lot of names.

Of course you’d say that. I’ve looked at the names. And using a bit of networking logic, it’s not hard to deduce who has been at the center of it all–and doing much of the writing. But yes, there were also others, and you know better than I how it all actually balances out.

Thanks for the considerable time you’ve spent answering my questions.

Randy


Hint: think about how you're going to get data out before thinking how you're supposed to keep it. There are lots of possibilities, but you need to anchor those with a single method of access. I suggest a brief look at Archetype Query Language (AQL).

Hi Seref,

Hint: think about how you’re going to get data out before thinking how you’re supposed to keep it. There are lots of possibilities, but you need to anchor those with a single method of access. I suggest a brief look at Archetype Query Language (AQL)

That’s the whole point, Seref - “how you’re going to get the data out.” And certainly AQL is one way to do that. My concern has to do with querying performance (deserialization as a prerequisite to record inspection, etc.) and the infrastructure resources necessary to support such queries. Thomas hints at possibly some big changes when he said, “There is an emerging set of ‘second order’ object definitions that use the URI-based referencing approach in very sophisticated ways to represent things like care plans, medication histories and so on. I can’t point to a spec right now, but they will start to appear.” I don’t know how radical that will prove to be. I’d assume they’d still occur within the AQL paradigm. But it does appear that openEHR itself is evolving on this point, and perhaps for good reason.

Please don’t interpret my remarks as any sort of disrespect for openEHR; I hope it has been apparent that my respect for the entire system has grown as I have learned more about it. Some really brilliant people, perhaps including you, put this whole thing together. And you all do the whole world a favor by making it all open and by making yourselves available for the sort of questions I have raised.

Randy

I should probably point out that there are some dozens of openEHR operational deployments <http://www.openehr.org/who_is_using_openehr/healthcare_providers_and_authorities>, all heavily using AQL for screen population, reporting and so on. The performance is perfectly adequate in all of these systems for the kinds of queries used in point of care (e.g. typically sub 1-second), and in some cases where ETL is implemented, the performance is also acceptable. It's also true that quite a lot of effort and thinking has gone into optimising AQL queries. There is always a query that can be written that will take a long time to answer, but so far there is no overlap between those types of queries and point of care latency requirements i.e. such queries are always report-oriented, research queries or some other kind of population query, where a (let's say) 5 second response is perfectly acceptable.

There are now about 3 years of experience with such systems (and more like 6 years of experience with commercially deployed AQL), showing that the performance challenges of this kind of framework are satisfiable, and no longer a research question (they were once, obviously!).

The second order types of structures I mentioned below rely less on AQL, and more on smart commit type rules / triggers logic, which effectively enables pre-built query results to be maintained in a live system.

We're somewhere on a road where we are already riding in motorised transport, but we don't really know if what we have today is a Fiat Punto or a Maserati. Hopefully it's the Fiat, because that leaves us a lot of fun and room to get to the Maserati (at which point we start looking at air travel;-).

- thomas

The performance is perfectly adequate in all of these systems for the kinds of queries used in point of care (e.g. typically sub 1-second), and in some cases where ETL is implemented, the performance is also acceptable. It’s also true that quite a lot of effort and thinking has gone into optimising AQL queries. There is always a query that can be written that will take a long time to answer, but so far there is no overlap between those type of queries and point of care latency requirements i.e. such queries are always report-oriented, research queries or some other kind of population query, where a (let’s say) 5 second response is perfectly acceptable.

That’s excellent! Can you give any idea how long it takes to retrieve an entire EHR record of “typical” size and complexity into live memory and onto the screen of a user’s computer? Or does that not generally happen, with records instead being fetched in smaller pieces?

Randy

Right - you wouldn’t ever pull an entire EHR to the screen. I have seen openEHR applications pulling the main managed lists (say 6-8 Compositions), latest lab results, plus a chronological list of consultations / events for the last year or so, plus key demographic data, all sub 0.5 sec. Then the user starts clicking on things, and more comes back. More interesting screens contain a mixture of text and e.g. vital signs real-time graphs, which AQL copes with nicely - you can bring back just a 2-D array of numbers and timestamps for the graph, using AQL. - thomas
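
An AQL query of the kind Thomas mentions might look like the following (the archetype id and at-code paths follow the widely published blood pressure example but are illustrative here, not verified against any specific system), together with a sketch of reshaping the result into a 2-D array for graphing:

```python
# The AQL text follows the widely published blood pressure example; the
# at-codes and archetype id are illustrative, not verified against a system.
AQL = """
SELECT o/data[at0001]/events[at0006]/time/value,
       o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude
FROM EHR e
CONTAINS COMPOSITION c
CONTAINS OBSERVATION o [openEHR-EHR-OBSERVATION.blood_pressure.v1]
"""

def to_graph_rows(result_rows):
    """Keep only (timestamp, magnitude) pairs with a numeric magnitude."""
    return [(t, m) for t, m in result_rows if isinstance(m, (int, float))]

# Mock result set, as a server might return it:
rows = [("2014-03-01T09:00", 121.0),
        ("2014-03-02T09:00", None),   # event without a systolic value
        ("2014-03-03T09:00", 134.0)]
assert to_graph_rows(rows) == [("2014-03-01T09:00", 121.0),
                               ("2014-03-03T09:00", 134.0)]
```

Note how the SELECT projects just two leaf values per event, which is what makes the "2-D array of numbers and timestamps" retrieval cheap.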

Thomas, somehow I’m not finding the AQL specification. It’s probably right under my nose on your specification/release page. Also, do you have any references describing the AQL processor? Did you write that from scratch?? It would seem that the AQL processor would indeed function as a formidable DBMS in its own right, at least with regard to reads, capable of managing AND/OR logic trees and serving up flat “tables” of joined data structures like any RDBMS.

Randy

AQL is not part of the official specification yet.

Regards
Seref

Smaller pieces, although scrolling would be cool.
Cheers Sam

Dr Sam Heard
FRACGP, MRCGP, DRCOG, FACHI
Chairman, Ocean Informatics
Chairman, openEHR Foundation
Chairman, NTGPE
+61417838808

Seref, I was simply trying to take your hint. :).

I’m glad you’ve considered doing that. In my humble opinion, AQL is the most neglected, yet, probably one of the most important components of an openEHR implementation. It is not part of the specification, but it has been implemented by at least two vendors that I know of, with a third having something quite similar to it. Personally, I’m happy that it is being cooked in real life before becoming a part of the specification. If you’re interested in implementing openEHR, I suggest that you keep AQL as a high priority, rather than saying: “let me use sql/whatever_other_query_language_that_my_persistence_layer_uses for now and I’ll implement AQL later”

- thomas

Hi Seref,

In my humble opinion, AQL is the most neglected, yet, probably one of the most important components of an openEHR implementation. It is not part of the specification, but it has been implemented by at least two vendors that I know of, with a third having something quite similar to it.

Neglected? Two vendors? A third with something similar, implying a branching of some sort? Then how has everyone else been accessing their data? This implies the existence of an alternate and older query engine. Is there a link describing this older one? Judging from Thomas’s wiki link, AQL appears part of a full-fledged query engine on the order of any SQL query engine, very sophisticated. Is this query engine open source? Did one of the two vendors develop it? How could such a thing be neglected or even optional? Are there licensing issues? Cost?

Randy

Seref, to add to my questions:

AQL is the most neglected, yet, probably one of the most important components of an openEHR implementation.

Does this imply that each implementation of openEHR is sufficiently different from others as not to allow for easy sharing of such things as search or storage technologies?

Randy