[All: this is a long post on an important topic, so I’ve made it a wiki post, i.e. directly editable by others. Feel free to make inline additions, but please try to retain the general integrity. I suggest adding your initials to any additions. Most likely we should create extra topics on each major question described below.]
We have had a long-running need to better solve cross-referencing in the openEHR EHR for managed lists such as the Problem List, Allergies list and so on. We’ve had many discussions in the past, including this recent one on Linking in openEHR. I have previously created UML for some initial ideas (‘view Entries’) if you want to look at something, but this is far from complete and could even be wrong.
There are various needs that simple LINKs and use of DV_EHR_URI don’t solve particularly nicely, many of them analysed by @ian.mcnicoll and other clinical modellers (@siljelb, @heather.leslie, @varntzen, @vanessap etc, feel free to chime in) in trying to build models for Problem List and the like. I’m going to try to articulate a few at a time, in the hope we can expose the needs and therefore the solution here.
(In the below, you can mentally trade other reference lists like Medications List, Allergies, Family History, etc for Problem List, with the same general semantics.)
So the first thing to think about is the idea of one or more Problem Lists (at least one ‘master’ Problem List with the main Dxs) for which I propose the following semantic requirements statement (to be debated). Managed Lists:
- are curated, i.e. manually managed (i.e. not query results)
- have content consisting of ‘focal’ and ‘related’ data - ‘focal’ meaning the thematically central data i.e. problems, allergies, medications etc; ‘related’ meaning anything else;
- are not the primary structure in which the thematically focal data (Dxs and the like) are originally recorded
- have their own documentary structure, i.e. something like Section/heading structure
- have focal content consisting of citations of previously recorded diagnoses and/or other ‘problems’
- may have citations of other related previous content, e.g. important observations, past procedures etc
- ?could have internal de novo content, i.e. not just their own Sections, but Entries (probably Evaluations?) created within the List to represent notes? summaries? thoughts about care planning?
- are managed over time by the usual means, with each modification creating a new version.
One key thing we have to determine is: what can be cited? Is it:
- A: only Entries within previous Compositions? I.e. individual clinical statements?
- B: Sections containing multiple content items within previous Compositions?
  - runs the danger of pointing to too much content if you don’t check properly;
- C: sub-Entry level items, e.g. Clusters and Elements, e.g. a single lab analyte inside a lab result OBSERVATION?
  - runs the danger of mixing up e.g. a target value (e.g. target BP) with an actual value, or anything else taken out of context;
- D: any structure anywhere in a previous COMPOSITION (let’s limit it to LOCATABLEs, which is nearly everything);
  - seems dangerous in general.
I am personally strongly in favour of a type A kind of citation - having a single Entry as the target. It always seems attractive to want to refer to anything, but I think that is of limited utility, and carries dangers. It is of course technically possible to model different kinds of citation object, that can point to different kinds of target structure.
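To make option A concrete, here is a minimal Python sketch of a citation constructor that accepts only Entries. The classes are simplified stand-ins named after the RM types discussed above, and `Citation` and `cite` are hypothetical names for illustration, not part of any openEHR specification.

```python
from dataclasses import dataclass


# Simplified stand-ins for openEHR RM classes (assumption: the real
# classes carry far more structure than a uid and a name).
@dataclass
class Locatable:
    uid: str
    name: str


class Section(Locatable):
    """Heading structure; may contain many content items."""


class Entry(Locatable):
    """A single clinical statement."""


class Evaluation(Entry):
    """E.g. a recorded diagnosis."""


class Element(Locatable):
    """Sub-Entry level item, e.g. a single lab analyte."""


@dataclass(frozen=True)
class Citation:
    """Hypothetical citation object: a reference to exactly one Entry."""
    target_uid: str
    target_type: str


def cite(target: Locatable) -> Citation:
    """Create a type A citation, rejecting anything that is not an Entry."""
    if not isinstance(target, Entry):
        raise TypeError(
            f"only Entries can be cited, got {type(target).__name__}"
        )
    return Citation(target_uid=target.uid, target_type=type(target).__name__)
```

With this shape, `cite(Evaluation("e1", "Diabetes mellitus type 2"))` succeeds, while passing a `Section` or an `Element` raises `TypeError`. Supporting options B to D would mean relaxing the `isinstance` check or adding further citation types.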
Technical Requirements - Representation
To these we need to add some technical requirements, e.g.:
- does a retrieve of the Problem List:
  - get all its cited contents in one go? I.e. what the clinician considers to be the content? OR
  - get only the heading structure and the citation objects (some kind of direct references), with further dereferencing needed to resolve all the citations in order to build the List for display and update?
It seems fairly obvious that the first option is what we want - the whole point of the managed List after all is that you can easily get hold of it as a single logical object.
So here’s the main technical problem. Returning the full List, including all cited contents, through the API on request requires a way of either persisting or computing the full contents of what the citations point to. The options include (with some obvious dangers listed):
Persisted Copies: citations are resolved at create time, i.e. they cause copying into the persistent List structure, i.e. the EVALUATION recorded 3 years ago containing my diabetes type 2 Dx is just copied into the Problem List when it is added in the curation process.
- the obvious danger here is that copies of Entries are likely to cause duplicates in querying - we are breaking the golden rule of IT here after all;
- however, making some sort of safe, encapsulated copy is undoubtedly possible;
Generated Copies: citation references are resolved at retrieve time on the server when a retrieve request is made, such that the full Problem List is instantiated prior to sending through the API.
- this requires a model that includes data items that are not persisted, but generated post retrieve - more complicated;
- the query service has to do a different sort of retrieval, so that these duplicate content structures are not created prior to executing the query - again more complexity;
Persisted Serialisations: citations are resolved at create time, but don’t create structure copies (e.g. a 2nd EVALUATION etc); instead they are instantiated in serialised form, e.g. XML or JSON, which just needs to be rendered to the screen (this kind of approach is documented in the Confluence page on Report representation).
- this approach will prevent duplication in querying and any other process that aggregates persisted EHR data;
- but it loses the native openEHR structures that might be useful on the client side.
Some other (new) native technical representation: some new converted form of the current native structures, e.g. a flattened readonly Entry or similar (see below).
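The duplication concern can be illustrated with a small Python sketch that contrasts Persisted Copies with Persisted Serialisations. Everything here is a toy assumption: `Evaluation` is a two-field stand-in, `ProblemList` is a hypothetical container, and `count_diagnoses` is a deliberately naive query that counts every native Evaluation it can see.

```python
import copy
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Evaluation:
    """Simplified stand-in for an openEHR EVALUATION."""
    uid: str
    diagnosis: str


@dataclass
class ProblemList:
    """Hypothetical managed list holding either kind of resolved citation."""
    copied_entries: list = field(default_factory=list)      # native copies
    serialised_entries: list = field(default_factory=list)  # JSON strings


def add_as_copy(problem_list, entry):
    """Persisted Copies: a second native Evaluation now exists in the EHR."""
    problem_list.copied_entries.append(copy.deepcopy(entry))


def add_as_serialisation(problem_list, entry):
    """Persisted Serialisations: only an opaque JSON rendering is stored."""
    problem_list.serialised_entries.append(json.dumps(asdict(entry)))


def count_diagnoses(ehr_entries, problem_list, diagnosis):
    """Naive query: counts every native Evaluation, wherever it lives."""
    visible = list(ehr_entries) + problem_list.copied_entries
    return sum(1 for e in visible if e.diagnosis == diagnosis)


dx = Evaluation("e1", "Diabetes mellitus type 2")
list_with_copy = ProblemList()
add_as_copy(list_with_copy, dx)
list_with_json = ProblemList()
add_as_serialisation(list_with_json, dx)

print(count_diagnoses([dx], list_with_copy, dx.diagnosis))  # 2: duplicate
print(count_diagnoses([dx], list_with_json, dx.diagnosis))  # 1: no duplicate
```

The copy is counted twice unless the query engine is taught to skip it, whereas the serialised form is invisible to the query but also no longer a native structure on the client side.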
As per that Confluence page, I think there are very good arguments for using the serialised approach for report-like objects, e.g. discharge summaries, referrals, etc, because they are indeed a kind of recorded statement at a point in time that is treated as a medico-legal document. Whether that same logic holds for managed lists is a question.
There is another potential requirement as well, which is that the client may want not just the cited Entries in the Problem List, but:
- their context info i.e. from their containing COMPOSITIONs, indicating ‘when and where did you get your Dx of type 2 diabetes’ AND/OR
- the version information, i.e. from the containing ORIGINAL_VERSION object, indicating ‘when did this information become visible in the EHR’.
So we might not just want ‘straight’ Entries, but ‘wrapped Entries’ or ‘flattened Entries’ containing that other data for each cited Entry. Note - this need is not specific to managed lists, but could be desirable within query results in general (today we solve it by stating the bits and pieces we want in the SELECT part of a query).
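As a rough illustration of a ‘wrapped Entry’, the Python sketch below bundles an Entry with context from its containing Composition and commit time from its containing version. The classes are heavily flattened stand-ins (the real RM nests these fields more deeply), and `WrappedEntry` and `wrap_entry` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    uid: str
    text: str


@dataclass(frozen=True)
class Composition:
    """Stand-in: context fields are flattened onto the Composition."""
    uid: str
    start_time: str   # when the clinical encounter took place
    facility: str     # where it took place
    entries: tuple


@dataclass(frozen=True)
class OriginalVersion:
    """Stand-in: commit audit is flattened to a single timestamp."""
    time_committed: str   # when the data became visible in the EHR
    composition: Composition


@dataclass(frozen=True)
class WrappedEntry:
    """Hypothetical flattened form: the Entry plus its surrounding context."""
    entry: Entry
    recorded_at: str
    facility: str
    committed_at: str


def wrap_entry(version: OriginalVersion, entry_uid: str) -> WrappedEntry:
    """Find an Entry inside a version and attach context and version info."""
    comp = version.composition
    for entry in comp.entries:
        if entry.uid == entry_uid:
            return WrappedEntry(
                entry=entry,
                recorded_at=comp.start_time,
                facility=comp.facility,
                committed_at=version.time_committed,
            )
    raise KeyError(entry_uid)
```

A client receiving `WrappedEntry` objects can answer both ‘when and where was this Dx made’ and ‘when did it become visible’ without a second round trip.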
Technical Requirements - Update
When a managed list is being updated, i.e. ‘curated’ as we often call it, you can’t modify the cited contents (well, you might be able to do that, if you see errors etc, but it’s not a routine part of List update).
Therefore if the ‘resolved’ (client side) representation includes native objects representing the citation targets, those latter objects have to be considered readonly. If the representation is in a serialised (or some other) form, it might be easier to do this.
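One way to picture the readonly requirement is the Python sketch below, in which resolved citation targets are immutable while the list itself is updated by producing a new version. `ResolvedEntry`, `ManagedList`, `add_citation` and `remove_citation` are hypothetical names, and the integer `version` is a simplification of real openEHR versioning.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ResolvedEntry:
    """Hypothetical read-only view of a cited Entry."""
    uid: str
    text: str


@dataclass(frozen=True)
class ManagedList:
    """Hypothetical managed list; every update yields a new version."""
    version: int
    citations: tuple   # uids of the cited Entries


def add_citation(lst: ManagedList, uid: str) -> ManagedList:
    """Return a new version of the list with one more citation."""
    return replace(lst, version=lst.version + 1,
                   citations=lst.citations + (uid,))


def remove_citation(lst: ManagedList, uid: str) -> ManagedList:
    """Return a new version of the list without the given citation."""
    remaining = tuple(c for c in lst.citations if c != uid)
    return replace(lst, version=lst.version + 1, citations=remaining)


entry = ResolvedEntry("e1", "Diabetes mellitus type 2")
try:
    entry.text = "edited during curation"   # not allowed
except FrozenInstanceError:
    print("cited content is read-only")
```

Curation then only ever touches the list of citations, never the cited content itself.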
Other than this, updating a managed list should allow any reasonable change - removal of references, addition of new ones etc.
Technical Requirements - Interoperability
There are some other technical questions to think about as well. For example, what happens when copying the Problem List(s) and Medication List to another EHR system, e.g. GP → hospital? This can be via an EHR Extract or some other means. How would the receiving (openEHR) system persist the data? That depends on how it is represented, according to those options above - as native openEHR structures, or as a serialised form. Would such copying require that all the cited Entries and their containing Compositions be copied over as well? For native openEHR → native openEHR, a full copy should be made (like a Git repo sync operation with branches being pushed to a target repo) but for other environments, we might want to make fewer assumptions.
We might therefore consider that there is a form of managed List that has references that no longer have targets in the system where it is persisted, due to being a copy.
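A receiving system would need some way to tolerate such target-less references. The Python sketch below shows one possible shape, under the assumption (not stated above) that each citation carries a short human-readable `label` so the list stays displayable. All names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    target_uid: str
    label: str   # assumption: a summary carried along with the reference


@dataclass(frozen=True)
class ResolvedCitation:
    citation: Citation
    target: Optional[dict]   # None when the target is not held locally

    @property
    def is_dangling(self) -> bool:
        return self.target is None


def resolve_all(citations, local_store):
    """Resolve each citation against the local store, keeping dangling ones."""
    return [
        ResolvedCitation(c, local_store.get(c.target_uid)) for c in citations
    ]


# The hospital system received the list but only one of the cited Entries.
hospital_store = {"e1": {"diagnosis": "Diabetes mellitus type 2"}}
copied_list = [
    Citation("e1", "Diabetes mellitus type 2"),
    Citation("e2", "Penicillin allergy"),
]
for item in resolve_all(copied_list, hospital_store):
    status = "unresolved" if item.is_dangling else "resolved"
    print(item.citation.label, "-", status)
```

Whether a dangling citation should be shown, flagged, or trigger a fetch from the source system is left open here.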
Towards a Solution
My thinking on this issue over the years has moved toward the following kind of solution:
- within an openEHR EHR system, we represent managed lists such that citations contain direct references, which are resolved (each time) on retrieval in the server, so that the structure that goes through the API is the ‘full’ structure.
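A minimal Python sketch of that retrieval step might look as follows. It assumes the persisted list is a tree of Sections whose leaves are citations holding direct references, and that the server can look up Entries by uid in some store. `Citation`, `Section`, `retrieve_resolved` and the dict-based store are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    target_uid: str   # direct reference to a previously recorded Entry


@dataclass
class Section:
    heading: str
    items: list = field(default_factory=list)   # Citations or nested Sections


def retrieve_resolved(node, entry_store):
    """Build the 'full' structure, dereferencing every Citation.

    The persisted list is left untouched; the resolved form exists only
    in the response sent through the API.
    """
    if isinstance(node, Citation):
        return {"cited_entry": entry_store[node.target_uid]}
    return {
        "heading": node.heading,
        "items": [retrieve_resolved(child, entry_store) for child in node.items],
    }


entry_store = {"e1": {"diagnosis": "Diabetes mellitus type 2"}}
persisted = Section("Problem List", [Section("Active", [Citation("e1")])])
full = retrieve_resolved(persisted, entry_store)
print(full["items"][0]["items"][0]["cited_entry"]["diagnosis"])
```

Because resolution happens on every retrieve, the persisted form never contains duplicate Entries, at the cost of extra work on the server per request.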