Requirements for republishing published archetypes

siljelb · 7 December 2021 09:11

Already published archetypes sometimes need to be republished, to incorporate a change. We have clear rules for how different levels of changes affect the versioning of the archetype:

A patch change, such as correcting a typo or adding a translation, leads to a 0.0.1 increase in version numbering.
A minor change, such as the addition of a data element or changing the text of an element without significantly changing its semantics, leads to a 0.1.0 increase in version numbering.
A major or breaking change, such as removing a data element, significantly changing the semantics of a data element, or moving a data element to a different part of the archetype tree, leads to a 1.0.0 increase in version numbering.
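As a rough illustration of the three rules above, here is a minimal Python sketch of how each level of change moves the version number. It assumes the usual semver.org convention that lower-order parts reset to zero on a minor or major bump; the names `Change` and `bump` are made up for this example and are not part of CKM or any openEHR tooling.

```python
from enum import Enum


class Change(Enum):
    PATCH = "patch"   # typo fix, added translation
    MINOR = "minor"   # added data element, non-semantic text change
    MAJOR = "major"   # removed/moved element, changed semantics


def bump(version: str, change: Change) -> str:
    """Return the next version string for the given level of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change is Change.MAJOR:
        return f"{major + 1}.0.0"
    if change is Change.MINOR:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(bump("1.2.3", Change.PATCH))
print(bump("1.2.3", Change.MINOR))
print(bump("1.2.3", Change.MAJOR))
```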

What we’re missing is a similar set of rules for the process of committing changes, especially the major changes.

Patch changes can be applied and republished at any time; translation-only changes are often republished by the translation editors themselves.
Minor changes are usually also applied and republished without any ceremony, but it may be prudent to consider any outstanding change requests or discussions.
Major changes are more difficult. What are the requirements for republishing an archetype with a new major version, e.g. going from v1 to v2? In the past this has been done by consensus among clinical knowledge administrators (CKAs), but it’s always been uncomfortable and it would be a good idea to have clearer guidance which is anchored in the community.

So just to throw some ideas out there, these are some alternatives for the process of republishing an archetype as a new major version.

1. The editor makes sure to clear all pending change requests before republishing, but otherwise no significant process.
2. The editor polls the community about any additional change requests or considerations.
3. There’s a discussion and decision among editors about each archetype which is a candidate for a major version republication.
4. There’s a formal CKM community review before an archetype can be republished as a new major version.

These ideas each have their own pros and cons, ranging from lack of oversight for #1, to extensive overhead and the risk of reopening cans of worms for #4.

There may also be an argument for creating a special status for some archetypes which have been extensively explored and tested in multiple implementations, including adjacent use cases. These would be the archetypes where #4 was a necessary requirement before republication as a new major version.

I’d like for members of the community to respond with their thoughts on the matter.

varntzen · 7 December 2021 14:21

Thank you for opening this discussion, @siljelb !

The ideal would be that v1 was so perfect it would never need any change at all, but we all know that’s not going to happen. We promote the archetypes as “future-proof”, but at the same time we realise that there will be a need to change them due to errors, new requirements, new medical knowledge or changes in the patterns established in archetypes.

I suggest the question should be part of the planned work on how archetype governance in the international CKM should work in the future. It’s good to get input here to tease out the alternatives though. If there were a vote today, I would certainly not go for either #1 or #4. Maybe #3 is the most viable, but it would possibly need #2 in some cases. #3 demands a steady group of skilled editors who are committed to participating and have some kind of formal role within the governance. This leads to the question of what the requirements are for becoming an editor.

Or, maybe we just can ask the one that rules the universe…

siljelb · 7 December 2021 14:39

The acceptance that archetypes sometimes have to be “leveled up” is part of the future proofing, in my opinion. Over time, nobody would use models which stagnate because of overly strict governance requirements.

I agree, but this is both a short term and a long term issue. We currently have several archetypes which are waiting to be republished. How do we handle those, in the next few weeks?

varntzen · 7 December 2021 14:44

I might be partial, but #3, and add in the original author if possible.

siljelb · 7 December 2021 14:47

I’d be happy with that, but as you point out in your first post, we’ll have to define which editors should participate in the discussion.

joostholslag · 7 December 2021 16:52

In the end it should be an editorial decision.
Even if we hold public reviews, the editor decides when there is enough consensus to publish. Votes are not binding, right?
I think there are two sides to this story: clinical and implementers. Clinically I only care whether a change is ‘controversial’, and not about the semantic versioning related to archetype publishing. Even a patch change on which there is any reason to suspect disagreement in the clinical modelling community warrants a review (“does this significantly change the meaning of an element?” is a nice example). It is the editor’s skill to determine whether a change is controversial. And if there’s regular disagreement from the community with editors (I know of zero cases), we may need a procedure to let the community overrule an editor to trigger a review. But I have high trust in all current editors, so I see no need for this currently, and I believe the current (non-existent) process for republishing is fine.
But since breaking changes may mean work for implementers, they are probably the most important stakeholder here. And they already ‘vote with their feet’: they are free to upgrade, but if they don’t, interoperability suffers. Which in turn is a huge problem for the clinical modelling group. So this is the main decision an editor has to make: does this improvement warrant the risk of losing interoperability.

siljelb · 8 December 2021 13:53

I agree, and I’d really like to get implementers’ views here. As you said, they get the extra work when breaking changes are introduced, but on the other hand they can also get stuck between a static model and new requirements.

joostholslag · 8 December 2021 18:17

Btw I hope it’s editorial policy never to do this, but deprecate/remove the element (node ID) and create a new one (a). Because it’s really easy to miss changes like this and just keep recording data in the same element with the old meaning.

siljelb · 9 December 2021 06:59

It is. It’s sometimes done before initial publication, but I can’t recall that we’ve ever done it afterwards.

thomas.beale · 9 December 2021 16:05

I have not thought about this enough to respond properly, but reading quickly - one thing you might take into account is a ‘release’ concept. I know there is something already in CKM, but just to state how we would think about this for software deliverables, we would mostly worry about having coherent releases, where a ‘release’ is a particular ‘configuration’ of deliverables - essentially, a list of particular deliverables (archetypes, other things…) and their specific versions.

Commonly, we do this in ‘components’ - hence the ‘reference model’ is a separate component from the ‘archetype model’, and both are separate from the CDS, Process (Task Planning) and other components. Each component is separately releasable.

Could it make sense to have clinical ‘components’ in CKM, e.g. ‘general medicine’, ‘oncology’, or whatever basis makes sense? Then, the oncology component (let’s pretend there is one) is just a list of some archetypes and other items, and specific versions, specific to oncology. Most components are going to have a dependency on more basic / general components, e.g. ‘core medicine’ or ‘emergency medicine’ or whatever (i.e. all the vital signs, plus some other basics), ‘lab investigations’ or whatever.

Then your original question could be looked at as: ok, for each defined component, what do we need to do to make sure the whole component is coherent / good quality (according to whatever the quality criteria are) for its next release? This could provide some more concrete basis for working out which changes need to be considered together and which can be ignored for now.

The next concept in this approach is that you assign in advance specific changes being proposed to specific artefacts to a particular release of one of the ‘components’. Then ‘clearing pending change requests’ is just performing the CRs for some particular release of some particular component. Once again, since there will be dependencies, if you make a change to e.g. blood gases or whatever, it will have a ripple effect on other components.
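To make the ‘component’ and ripple-effect idea a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the `Component` class, the component names, the archetype identifiers and all version numbers are invented for the example, and nothing here reflects how CKM actually stores release sets.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A separately releasable list of archetypes pinned to specific versions."""
    name: str
    release: str
    archetypes: dict[str, str]                       # archetype id -> pinned version
    depends_on: list[str] = field(default_factory=list)


def affected_by(changed: str, components: dict[str, Component]) -> set[str]:
    """Names of components that depend, directly or transitively, on `changed`."""
    affected: set[str] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for comp in components.values():
            if current in comp.depends_on and comp.name not in affected:
                affected.add(comp.name)
                frontier.append(comp.name)
    return affected


# Hypothetical components and versions, for illustration only.
components = {
    "core": Component("core", "1.0.0", {"OBSERVATION.body_weight": "2.0.0"}),
    "lab": Component("lab", "1.1.0", {"OBSERVATION.blood_gases": "1.0.0"}, ["core"]),
    "oncology": Component("oncology", "0.3.0", {"EVALUATION.tumour": "1.0.0"}, ["core", "lab"]),
}

print(sorted(affected_by("core", components)))
```

A major change in a ‘core’ archetype would then flag every dependent component for impact assessment before its next release.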

There is a higher level of release as well, which can be thought of as ‘product’. This consists of stating which particular component releases work together / can be used together to achieve some business purposes.

In all this, patch and minor releases are routine things; a major release of anything (single artefact, component, product) is a non-backwards compatible version of the previous form, i.e. a new artefact / component / product. This is also ‘routine’ but of course when breaking changes are included, the impact assessment is different.

Apologies if I’m just stating known things / current practice in CKM-land - if so, just ignore!

siljelb · 10 December 2021 08:17

You’re correct we haven’t been using the release set functionality of the CKM.

I’m not 100 % sure I understand how this would help with this problem, but could it help make it clearer to implementers which groups of related archetypes are in which revisions, and which revisions are intended to work together?

An example I’m looking at right now is Medication. We currently have a set of archetypes which are used for medication-related information. They’re created to be used together, and changing one may require changing the others as well. Implementers may want to upgrade to a new version when it’s released, but they may also prefer to stay with the old set. Release sets would make it easier to make decisions about upgrading or not for groups of related archetypes, instead of having to do it for each individual archetype. At the same time, for CKAs this would make it less controversial to have several major versions of release sets available in parallel while archetypes of earlier release set versions have been deprecated.

Is that what you’re proposing?

Jelte · 10 December 2021 11:14

As an implementer I find it difficult to form a general opinion about this.

Updating a major version can become problematic when you actively exchange openEHR data. But we’re currently not exchanging openEHR data directly, so I don’t have experience with that. In other situations you can just stick to the old version as long as needed.

I think it also depends on whether you’re currently using the archetype. If you are using the archetype and don’t need the changes, major updates can be annoying. But if you want to use the archetype and/or need the changes in the major version update, a quicker release process would be nice.

I think bundling changes to related archetypes could be helpful, but it could also cause delays in releasing the major versions.

thomas.beale · 10 December 2021 15:00

Pretty much - discussions would need to be had to really get a good picture of what makes a ‘component’ for clinical models, but the essential idea is that it is a separately releasable entity. Now (as is obvious) certain components will depend on more basic or ‘core’ components, so one question to solve is potentially: what constitutes ‘the core’ - or maybe there is more than one.

This is just a general truth of IT, nothing to do with openEHR. The general rule is: a new major version is essentially a new thing, and systems will usually have to do one of the following with data created according to the previous major version:

always convert to the new version on the fly at data retrieval time
perform a one-off data migration (can be done incrementally over time) of older data to the newer form.

To do either requires a defined and published algorithm that performs the vN → vN+1 conversion, e.g. it knows how to move data subtrees, or populate a newly added mandatory field with some standard value for the older data.
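A minimal Python sketch of what such a vN → vN+1 conversion could look like is given here. It is an assumption-laden illustration, not an openEHR algorithm: the data is a plain nested dict rather than a real composition, and the paths in `MOVES` and `DEFAULTS` as well as the function names are hypothetical.

```python
from copy import deepcopy

# Hypothetical rules for one archetype's v1 -> v2 conversion.
MOVES = {("data", "comment"): ("protocol", "comment")}   # old path -> new path
DEFAULTS = {("data", "status"): "unknown"}               # new mandatory field -> filler value


def _pop(tree, path):
    """Remove and return the value at `path`; the second result says whether it existed."""
    *parents, leaf = path
    node = tree
    for key in parents:
        node = node.get(key)
        if not isinstance(node, dict):
            return None, False
    if leaf in node:
        return node.pop(leaf), True
    return None, False


def _set(tree, path, value, overwrite=True):
    """Set the value at `path`, creating intermediate dicts as needed."""
    *parents, leaf = path
    node = tree
    for key in parents:
        node = node.setdefault(key, {})
    if overwrite or leaf not in node:
        node[leaf] = value


def migrate_v1_to_v2(composition):
    """Return a converted copy; the stored v1 data is left untouched."""
    out = deepcopy(composition)
    for old, new in MOVES.items():          # move data subtrees
        value, found = _pop(out, old)
        if found:
            _set(out, new, value)
    for path, default in DEFAULTS.items():  # fill newly added mandatory fields
        _set(out, path, default, overwrite=False)
    return out


v1 = {"data": {"comment": "fasting", "value": 1}}
print(migrate_v1_to_v2(v1))
```

The same function could serve either strategy: called at retrieval time for on-the-fly conversion, or run in batches for a one-off migration.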

Again, this is standard IT stuff; it’s only a question of details as to how we solve it in openEHR.

borut.jures · 10 December 2021 16:04

Sorry if this is a beginner implementer question.

Does “major version” here mean the archetype/OPT version or the AM/RM release?

Are CDR implementations for openEHR accepting only a single AM/RM release?
Or can they receive OPTs for any AM/RM release?
I expect answers: No, Yes

There is AM/RM release information in the OPTs so a CDR knows which releases are needed for e.g. validation or conversion to the new “major version” and can do it automatically?

  "adl_version" : "2.0.6",
  "rm_release" : "1.0.4",

So to perform a data migration,

OPT v1 will have AM 2.0.6/RM 1.0.4
OPT v2 will have AM 2.2.0/RM 1.1.0

…which are all supported by a CDR and then only the algorithm is missing?

siljelb · 10 December 2021 18:53

This topic was intended to be about archetype versions

thomas.beale · 10 December 2021 18:57

Just to be clear - @siljelb was also referring to ‘versions’ in the semver.org sense.

borut.jures · 10 December 2021 19:30

Not every post goes as planned

Especially if you invite implementers:

@Jelte and @thomas.beale Were you also referring to the archetype versions?

I still wonder what happens when an archetype starts using a newer AM/RM release. The archetype will probably get a new major version but the OPT with such an archetype will also get a newer AM/RM release.

Do CDRs handle that and store each with its own set of AM/RM releases?

heather.leslie · 11 December 2021 02:51

The original intent of Releases functionality in CKM was focused on bundling all resources (archetypes, specs and any other relevant artifacts etc) together and publishing against a specific version of a dataset specification - for example, a national version or revision of the IPS. So you could say to a vendor, go implement this… and a URL to the relevant fileset.

siljelb · 11 December 2021 08:25

I suggest starting a new topic about this

joostholslag · 11 December 2021 09:30

Maybe it’s interesting to add that for measurement scales we built a tool that lets you render data from different versions in a single graph/table by mapping paths across major versions of a single composition. This helps a lot when updating major versions of archetypes/templates for data that is mainly visualised by the apps that process this mapping. For other apps that access data over APIs or have hard-coded interfaces it’s still a problem.
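The path-mapping idea can be sketched roughly as follows in Python. This is not the tool mentioned above, only an illustration of the general approach; the paths, the series name `total_score`, the record layout and the function name are all assumptions made up for the example.

```python
# Hypothetical mapping: (major version, path in that version) -> common series name
PATH_MAP = {
    ("v1", "/data/score"): "total_score",
    ("v2", "/data/summary/total"): "total_score",
}


def to_series(records):
    """Merge records from different major versions into one time series per name.

    Each record is a dict with 'version', 'time' and 'values' (path -> value).
    Paths without a mapping are skipped rather than guessed.
    """
    series = {}
    for rec in records:
        for path, value in rec["values"].items():
            name = PATH_MAP.get((rec["version"], path))
            if name is not None:
                series.setdefault(name, []).append((rec["time"], value))
    for points in series.values():
        points.sort()  # chronological order for plotting
    return series


records = [
    {"version": "v2", "time": "2021-06-01", "values": {"/data/summary/total": 5}},
    {"version": "v1", "time": "2021-01-01", "values": {"/data/score": 3}},
]
print(to_series(records))
```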