How accurately do we model "copied" data?

openEHR templates will frequently feature data from other repositories. In a clinical referral letter, for example, results may be added to the narrative text, and each of these is itself a copy of the source data from the pathology system. We have to be aware that this copied data exists and that it should not be used in queries, i.e. we should query only the source systems.

We have to decide how accurately we model this data, i.e. as observations, SNOMED CT codes etc. with accurate values. Or do we store these “copies” only as text strings to prevent reuse (misuse)?

Is there best practice for this scenario?

We’ve added a separate context/category code for report documents. The purpose of these is to reuse data from the EHR for outbound data.

Content in such compositions will not be included in AQL results by default. They are queryable, but whoever writes the query must add explicit conditions to the AQL to get them.
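
As a rough illustration of the behaviour described above (not DIPS's actual implementation), the sketch below filters in-memory compositions so that report compositions are left out unless the caller explicitly asks for them. The `Composition` class, the `EVENT` and `REPORT` category values and the `select_compositions` helper are all hypothetical names made up for this example.

```python
from dataclasses import dataclass

# Hypothetical category values; the real codes used by any given CDR may differ.
EVENT = "event"
REPORT = "report"


@dataclass(frozen=True)
class Composition:
    uid: str
    category: str
    content: str


def select_compositions(compositions, include_reports=False):
    """Return compositions, skipping report compositions unless asked for."""
    return [
        c for c in compositions
        if include_reports or c.category != REPORT
    ]


ehr = [
    Composition("c1", EVENT, "Hb 5.1 (lab result)"),
    Composition("c2", REPORT, "Referral letter quoting Hb 5.1"),
]

# Default querying sees only primary data.
default_hits = select_compositions(ehr)
# The report copy is returned only when explicitly requested.
explicit_hits = select_compositions(ehr, include_reports=True)

print([c.uid for c in default_hits])
print([c.uid for c in explicit_hits])
```

The point of the design is that the safe behaviour is the default: a query that says nothing about reports never picks up the copies.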


Important point John!
Some thoughts: there’s a generic entry class, and one of its goals is an integration scenario like the one you described.
The same problem occurs if, e.g., lab data is summarised or quoted in a discharge summary and both are stored in the openEHR CDR. There was some discussion of references, with proposals from @thomas.beale, some months ago on this forum. Let me know if you can’t find it.
@Bna do you mean this is a local solution DIPS built? Or is it in the spec? Could you share a bit more? I’m really curious.

Hi John

I wonder if you can be more specific. By “other repositories”, do you mean another in-house system, or one outside the organisation where the EHR is situated? As for the results that could be added to a clinical referral: is this a referral from an external party to the health provider, or an outbound one back to the external party (e.g. a GP)?

A concrete use case would help to understand your question.

The way Bjørn describes how DIPS is dealing with reports “out of the domain of their system” is a nice solution. It re-uses the same archetypes as in production, but in another context (“just for reporting”) and doesn’t “pollute” the true production data from the hospital/health service provider.

Regards, Vebjørn

Might be this.

This is something DIPS does locally.

This wiki page may address what John is talking about here. Also related: Subject Proxy Service, for getting subject (patient) data from other sources. e.g. patient demographics from Oracle MPI or vital signs from wearable device.

Here’s a wiki page on Reports (much of it based on @bna’s presentation from some years ago).


This is very much in line with some of our thinking. Is this implemented as a standard operating procedure when dealing with AQL, i.e. this code is not included by default in AQL calls? An example would probably help.

Sorry for not being clear!

We have a pathology results service and for the time being, we would not want to replicate the data there as this component is designed to support the requests and results etc. In the medium term this will be wrapped or converted to FHIR native.

My concern is breaking the clinical narrative by not modelling “properly”. We hit this yesterday with @ian.mcnicoll, and it occurred to me afterwards that I didn’t like the notion of either dumbing down or even excluding the observation in question. Part of this is to do with our need to present a copy of the composition in our document repo. We would then be faced with a form that needs to get data from 2 places, and with replicating that data to create a PDF binary as a document.

If we can avoid this issue with a “no reporting” flag as @Bna mentions, that seems a more elegant solution.

I like the idea of the ‘reporting / secondary record’ flag, but I’ve always felt that it needs to be at Entry level, not for the whole Composition, since in this case, and even in some discharge reports, there is a mix of primary and secondary records.

In our use case there is also an element of potentially needing to capture the exact lab results which underpinned the decision-making, as this may not be clear when the lab results are pulled dynamically, esp. from an eternal source.


How wonderful, problem with data for life solved! :smiley: (sorry, couldn’t resist)


Sometimes we have to deal with the world as it is, not as we might like it!! Ideally the lab tests would be in the same consolidated CDR, of course. You still need to flag a ‘copied result’ but at least everything is under your control.


A few cents by a relative outsider.

Data in any patient record has context.
1- it is from a third party like: lab results, a referral letter, …
Third party data has to be admitted to the record and occasionally annotated by the author that admits it formally to the record
2- it is entered by the author

Third party data will be separated from the data the author has entered, either in a separate database with incoming or sent documents.
Documents need to have a state (received, read, discarded).
Any document is modelled using a Composition since the third party data can be a complete report with many pages and various clinical facts.
In the patient record it must be possible to link to data in a document that has been read and that is stored in the document repository; the links are inserted by the author. This implies that AQL must be able to deal with referred data.

Or the documents are entered in the record next to data entered by the author. In this case all Compositions need to be able to indicate that they are from a third party and have a state (received, read, discarded).
The author of the patient record is able to link to data in Compositions entered as third party documents.
AQL needs to take into account the flag indicating what it is: authored by the author or third party.

The first solution has my preference.


This is what a Citation (Entry) is…

This is something we don’t deal with properly: the acknowledged/read etc. status of received documents, at least not in the RM.


I’ve never used Citation. It feels clunky, involves referencing and precludes using AQL directly when appropriate. I’d much rather have an attribute on Entry (similar to the DIPS ‘report’) that indicates that a given Entry is not a primary data source and should not be picked up in ‘standard querying’: same intent as ‘report’ but much more flexible and granular.

Sam wrote that wiki page in 2008 and built this CITATION archetype in 2010. There is no EVALUATION.citation that I’m aware of, as he suggests in the CLUSTER archetype ‘Use’.

Oops… completely wrong link. This is the page I intended to link.

The citation I am talking about is something like the View model that we have discussed much more recently. This covers representing:

  • referenced static data, e.g. an existing diagnosis or lab result
  • capturing an AQL result
  • capturing the result of an external API call, e.g. some CDS system result.

The problem with putting a flag on Entry is that we are just making copies of things, and providing a way to mark the copy as a copy rather than an original. That would help a bit. But we need more than that: we need to be able to capture AQL results, API results etc.

And I think it’s likely that some systems / apps will not set the flag the right way, so there will be querying errors anyway. Better to have a dedicated type(s) that support different kinds of views, and also support the representation of reports and summaries as shareable documents, which a flag on Entry won’t do either.
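
To make the contrast with a flag concrete, here is a toy sketch of what "dedicated types" could look like. It is not the View model discussed on the forum and not anything from the openEHR RM; every class and function name below (`StaticReference`, `QuerySnapshot`, `ApiSnapshot`, `Observation`, `primary_entries`) is invented for illustration, and the AQL string and endpoint are just placeholder data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Observation:
    """Primary data, e.g. an original lab result."""
    uid: str
    value: str


@dataclass(frozen=True)
class StaticReference:
    """Points at existing data, e.g. an existing diagnosis or lab result."""
    target_uid: str


@dataclass(frozen=True)
class QuerySnapshot:
    """Captures the result of an AQL query at a point in time."""
    aql: str
    rows: Tuple[str, ...]


@dataclass(frozen=True)
class ApiSnapshot:
    """Captures the result of an external API call, e.g. a CDS result."""
    endpoint: str
    payload: str


def primary_entries(items):
    """Keep only primary data; views are excluded by type, not by a flag."""
    return [item for item in items if isinstance(item, Observation)]


record = [
    Observation("lab-1", "Hb 5.1"),
    StaticReference(target_uid="lab-1"),
    QuerySnapshot(aql="SELECT o FROM EHR e CONTAINS OBSERVATION o", rows=("Hb 5.1",)),
    ApiSnapshot(endpoint="cds/anaemia-risk", payload="high"),
]

print([item.uid for item in primary_entries(record)])
```

Because a view is a different type from the thing it views, an application cannot forget to set a flag: there is no flag to set.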


We have the notion of this “clean” architecture where pathology results will eventually be able to be referenced as a URI to a FHIR-based repository. We would even be feeding some of these results where they exist primarily in openEHR, and our principal apps would interact with national services for observations or results. We have the latter now but it’s behind some clunky SOAP-based endpoints or direct SQL querying. Not at all acceptable in this day and age…!

The flag might well be a suitable get-out-of-jail card though.


I agree with John and Ian - this question also applies to data held outside of the openEHR system, e.g. in another repository, and to medication / problem lists. (e.g. GP has a problem list. Patient is admitted to hospital. Doctor copies GP problem list into hospital system, and then into electronic discharge letter. Patient is discharged to GP. GP accepts hospital discharges into the problem list. GP system now has a duplicate version of the problem list. This needs to be prevented, e.g. by linking the returned entries to the originals so that the roundtrip does not create duplicates).

Could we think of the ‘copied’ Entry as effectively being a cached reference to the underlying source of truth? If the network was working perfectly we could forget about keeping a copy and just refer to the original source (and choose to view the version that existed at the time the composition was committed).

When running queries the important thing would be deduplication if both the original and the ‘copy’ are retrieved. Maybe a solution could be that the Entry contains a flag to say that it is a copy, and it also contains a reference / URI to the source of truth?
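
A minimal sketch of that idea, assuming each Entry carries a copy flag plus a reference to its source. The `Entry` class, the `is_copy` and `source_uri` fields and the `deduplicate` function are hypothetical; for simplicity the "URI" here is just the uid of the original entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    uid: str
    value: str
    is_copy: bool = False
    source_uri: Optional[str] = None  # hypothetical pointer to the original


def deduplicate(entries):
    """Drop a copy when its original, or an earlier copy of it, is present."""
    original_uids = {e.uid for e in entries if not e.is_copy}
    seen_sources = set()
    kept = []
    for e in entries:
        if e.is_copy and e.source_uri is not None:
            if e.source_uri in original_uids or e.source_uri in seen_sources:
                continue  # the source of truth is already represented
            seen_sources.add(e.source_uri)
        kept.append(e)
    return kept


results = [
    Entry("lab-1", "Hb 5.1"),
    Entry("letter-7", "Hb 5.1", is_copy=True, source_uri="lab-1"),
    Entry("letter-9", "Na 140", is_copy=True, source_uri="lab-2"),
]

print([e.uid for e in deduplicate(results)])
```

Note that a copy whose original is not in the result set (here `letter-9`) is kept, which matches the "cached reference" reading: the copy stands in for a source that could not be retrieved.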


This is the intent of this kind of modelling (solution #2 and #3).


One extra use case we need to keep in mind is summarising.

For instance, the highlights from lab results are regularly put into a SOAP report, e.g.
S/ no more bleeding
O/ old blood crustae in the nose, no fresh bleeding
Lab/ Hb 5.1 (4.9 yesterday)
E/ slowing of anemia after nose bleeding
P/ hb in two days, transfusion in case of hb < 4

Do we want to put the data under O/ Lab/ in an obs.lab? As pointed out, unless it’s flagged somehow this leads to duplication. (Which I don’t think is as big a problem as assumed here, since it’s obviously a duplicate, but from a different perspective: a doctor’s note instead of an automated lab feed, which indicates increased validity and adds information to the lab result itself.)