I have a question about updating templates and their associated compositions.
Imagine the case in which you have your CDR with compositions and templates, and at some point it becomes necessary to update one of the templates because you realise the use case can be better defined by adding a new field.
How would this affect the old compositions?
Would it be necessary to update all of them or are they left as they are and only the new ones are updated?
Templates are constructed of elements of archetypes.
If your archetype and/or template changes are non-breaking, the version numbers will only change in the minor or patch positions (see here for a diagram). (Version ids are always of the form 1.2.3, i.e. 3 parts.)
no older data will be invalid with respect to the new template(s)
new data will be compatible with previous data - older applications just won’t see new fields.
If the changes are breaking, the major version number will change i.e. v2.x.y will go to v3.0.0. With archetypes and templates, the major version number (the 2 or 3) is part of the identifier. That means that a breaking change actually creates a new archetype (or template).
The template id is always known in every version of a Versioned Composition, so the version ids can be compared and a breaking version change can be detected, e.g. by an AQL engine and other back-end components. This doesn’t guarantee that this is happening properly in any given product of course!
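The version-comparison logic described above can be sketched in a few lines. This is a minimal illustration, not any official openEHR API: it assumes only that version ids follow the 1.2.3 form stated earlier, and that a change is breaking exactly when the major number differs.

```python
# Minimal sketch (not an official openEHR API): deciding whether a
# template/archetype version bump is breaking by comparing semver-style ids.

def parse_version(vid: str) -> tuple:
    """Parse a '1.2.3'-style version id into (major, minor, patch)."""
    major, minor, patch = (int(p) for p in vid.split("."))
    return major, minor, patch

def is_breaking_change(old_vid: str, new_vid: str) -> bool:
    """A change is breaking iff the major version number differs."""
    return parse_version(old_vid)[0] != parse_version(new_vid)[0]

print(is_breaking_change("2.3.1", "2.4.0"))  # False: minor bump, old data stays valid
print(is_breaking_change("2.4.0", "3.0.0"))  # True: new major = effectively a new artefact
```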
How much does this matter in reality?
it is breaking changes in archetypes that matter, because that changes paths to data in querying
breaking version changes in archetypes are not common - but they do happen
breaking changes in templates alone don’t generally matter too much
Now, to avoid bad things happening, implementations may want to migrate older data when there is a breaking change to an archetype. This is usually a small change to data structure or similar, but it might require a more complex algorithm. If this isn’t done, and older versions are likely to be accessed by querying, then such algorithms need to execute on the fly.
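The migration (or on-the-fly mapping) just described often amounts to rewriting paths from the old archetype version to the new one. Here is a hedged sketch of that idea; the paths and the flat path-keyed dictionary representation are invented for illustration and do not reflect any particular CDR's storage format.

```python
# Hedged sketch: a path-mapping migration for a breaking archetype change.
# The paths and flat-dict representation below are hypothetical illustrations.

V1_TO_V2_PATH_MAP = {
    # v2 renamed/moved a node; paths not listed here pass through unchanged
    "/data/events/item/systolic": "/data/events/item/systolic_pressure",
}

def migrate_v1_to_v2(flat_composition: dict) -> dict:
    """Rewrite a path-keyed composition from archetype v1 paths to v2 paths."""
    return {V1_TO_V2_PATH_MAP.get(path, path): value
            for path, value in flat_composition.items()}

old = {"/data/events/item/systolic": 120, "/data/events/item/diastolic": 80}
print(migrate_v1_to_v2(old))
```

The same function could run at migration time (bulk conversion) or at query time (on the fly), which is exactly the trade-off the paragraph above describes.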
This is an important feature of dual modeling that is sometimes forgotten.
If the historic versions of archetypes and templates are maintained and accessible, you can always use them to interpret their respective clinical data instances (for visualization purposes, for example). Moreover, it would probably be possible to build a query that combines results obtained from querying different versions of the same archetype/template. It will depend on the kind of changes introduced, and it makes the query more complex, but it is feasible.
I guess it is only in some cases that you want to run through the entire database and update existing data due to a template/archetype change. (DIPS did this for Blood Pressure v1->v2 though, since it is so frequently used.)
As @damoca hints at, if you don’t update existing data, you may need to send more than one query to target/fetch all (historical) data stored using more than one major version of a template/archetype. (This should be OK, since the alternative of always requiring backwards compatibility would be worse from a development point of view.)
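The "more than one query" approach above can be sketched as generating one AQL statement per deployed major version and merging the results client-side. The archetype id, the AQL fragment, and the idea of templating the version into the id are assumptions for illustration only.

```python
# Sketch under assumptions: archetype ids carry the major version
# ('...blood_pressure.v1' vs '...v2'), so one query per major version
# is issued and results are merged by the caller.

AQL_TEMPLATE = (
    "SELECT o/data/events/data/items/value "
    "FROM EHR e CONTAINS OBSERVATION o"
    "[openEHR-EHR-OBSERVATION.blood_pressure.{version}]"
)

def queries_for_all_major_versions(versions):
    """Build one AQL query per deployed major version of the archetype."""
    return [AQL_TEMPLATE.format(version=v) for v in versions]

for q in queries_for_all_major_versions(["v1", "v2"]):
    print(q)
```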
I have a (possibly wrong) assumption that I would like to check though:
If we have a long-lived composition with many versions, for example an allergy list that is updated over the life of a patient: is it then OK to store new versions of that specific composition with data based on newer templates/archetypes than the previously (or originally) stored data, which was based on an older template/archetype?
OK, I agree that an allergy list may be a thing one would actually like to run a system-wide update on, so that the lists of all patients get updated to the new version at once to avoid confusion.
But let’s pretend it is something of less importance than an allergy list; then I am still curious about the technical aspect of my assumption/guess: may different versions of a specific composition contain data based on different versions of a template?
It is to be expected that from time to time a Vm->Vn bulk conversion will need to be applied to a mature CDR, which is why we employ highly skilled software engineers to help perform such a task with minimum grumbling.
We have no rule against it. My view is that it is fine, because the idea is that the ‘versions’ relate to changes to the clinical data; how the data are structured at an atomic level is a technical detail. The versions of archetypes and templates for such persistent Compositions (this could also apply to Event Compositions, since they can have versions, but that is much less likely) are all recorded in, or findable from, the template and archetype ids within the data. A well-written query processor should therefore have no problem dealing with the different versions, i.e. running the appropriate older queries against the data whose versions match the versions used in those queries, and so on.
This all sounds terribly complicated, but it is worth remembering one key fact: almost all querying (and access in general) in a clinical system is on the latest version. So from this we can conclude:
when new versions of archetypes and templates are deployed, what should happen in any Query repository (i.e. stored queries in authoring / design environment and also in the Definitions service) is that current queries are checked against changed archetype / template ids, and queries can be flagged ‘out of date’. We are not yet at the point of having algorithms that would rewrite queries into new forms based on version changes, but maybe one day that will be possible.
at runtime, any query that wants to look at previous versions can fairly safely be assumed (not always) to be a non-operational query, probably a forensic check (when medical errors occur). So lower performance can be lived with. That means a certain amount of computing complexity in handling older versions (assuming not migrated) will be tolerable.
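The first conclusion above, flagging stored queries when archetype ids change, could be sketched roughly as below. The registry layout, the query strings, and the plain substring match are all illustrative assumptions; a real implementation would parse the AQL properly.

```python
# Illustrative sketch only: scan a stored-query registry for references to
# archetype ids superseded by a new major version, and flag those queries
# 'out of date'. Query strings and registry shape are hypothetical.

stored_queries = {
    "bp_latest": "SELECT ... CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.blood_pressure.v1]",
    "weight": "SELECT ... CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.body_weight.v1]",
}

# archetype id whose major version was just bumped in a new deployment
changed = {"openEHR-EHR-OBSERVATION.blood_pressure.v1"}

def flag_out_of_date(queries: dict, changed_ids: set) -> set:
    """Return names of stored queries referencing a superseded archetype id."""
    return {name for name, aql in queries.items()
            if any(aid in aql for aid in changed_ids)}

print(flag_out_of_date(stored_queries, changed))  # {'bp_latest'}
```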
It should be noted that none of these challenges are openEHR-specific - any DB system has schema (and query) evolution issues over time. And most DBs don’t have any proper versioning of anything at the schema / query level. It is just the details that are specific.
We didn’t do it that way. We analyzed how v1 was used and found that only nodes/paths that were equal among the versions were used. Then we could change the stored queries to accept both v1 and v2. This made e.g. a graph continuous across versions of the archetype.
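The analysis described above, checking whether the paths queries actually use are shared between the two archetype versions, can be expressed as a simple set comparison. The path sets here are invented illustrations, not real Blood Pressure archetype paths.

```python
# Sketch of the analysis described: find the node paths two archetype
# versions share, to decide whether stored queries can safely match both.
# The path sets are invented illustrations.

v1_paths = {"/data/events/item/systolic", "/data/events/item/diastolic"}
v2_paths = {"/data/events/item/systolic", "/data/events/item/diastolic",
            "/data/events/item/mean_arterial"}

common = v1_paths & v2_paths        # paths valid in both versions
queries_use = {"/data/events/item/systolic"}

# If every path the stored queries use is common to both versions,
# the queries can be widened to accept v1 and v2 alike.
print(queries_use <= common)
```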
We expect to face a situation where the changes in information models (archetypes/templates) require an update of existing data. This will require lots of analysis how such a change will impact on data in the EHR and the surrounding applications using it.
We expect persistent data to be the first candidate for such migrations.
Sorry if I was not clear enough Sebastian. I didn’t mean to imply there’s anything wrong with what Erik is asking about. I have a gut feeling there can be some interesting corner cases and I find that an intriguing thought exercise, that’s all.
Thanks @thomas.beale, this was my assumption too, and what I have been telling colleagues when describing the versioning system. So we conclude that the answer is: YES, different versions of a specific composition can contain data based on different versions of a template.
Think of it this way: if you add a v2 archetype to a template already containing the v1, then this is not a breaking change to the template, any more than adding a new archetype covering the same content would be.
This does commit you to managing the two versions of the archetype downstream in AQL, UI logic or other façades.
The alternative is to fill @Seref with deep joy and ask our fine engineers to bulk convert v1 to v2. Which option works best does require careful analysis.