Versioning of archetypes: Minor or major changes?

Another crosspost between the Clinical and the Implementers lists.

In versioning archetypes, we’ve defaulted to SemVer’s three version levels MAJOR.MINOR.PATCH. When discussing with DIPS what should be considered MINOR or MAJOR changes, we’ve come to the preliminary conclusion that many more changes than we previously thought may require a MAJOR version change. This is exemplified below mostly with exchange of information between systems, but may also be relevant within a system when adding new functionality with a newer version of the same archetype.

These are as follows:

Hi Silje,

In CKM, we have tried to implement the technically necessary version increment, which can always be increased by the clinician modeller.

In that sense the technically suggested version increment is a minimum increment, but you can always opt to go higher.
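A minimal sketch of this "minimum increment" idea, assuming a hypothetical `Increment` enum and `final_increment` helper (neither is CKM's actual API): the tool's suggestion acts as a floor, and the modeller's choice can only raise it.

```python
from enum import IntEnum


class Increment(IntEnum):
    """SemVer levels, ordered so that a larger value means a bigger bump."""
    PATCH = 1
    MINOR = 2
    MAJOR = 3


def final_increment(technical_minimum: Increment, modeller_choice: Increment) -> Increment:
    """Hypothetical helper: the modeller may go higher than the tool's suggestion, never lower."""
    return max(technical_minimum, modeller_choice)


def bump(version: str, increment: Increment) -> str:
    """Apply a SemVer increment to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if increment is Increment.MAJOR:
        return f"{major + 1}.0.0"
    if increment is Increment.MINOR:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

So a change that is technically only MINOR but clinically critical would be published with `bump(version, final_increment(Increment.MINOR, Increment.MAJOR))`.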

Now, we seem to have a different idea about what technically necessary means and indeed it is very complex (and I don’t think one is necessarily better, just different, as it depends on what you want to use the major version for). A while ago, Thomas, Ian and I tried to fill in a spreadsheet with some cases to decide what is a major, minor or patch change… let’s say the differences, at least initially, were large.

My main criterion for the technical level would be whether it is technically guaranteed that data recorded using the old archetype can be read and, in principle, understood when using the new archetype.

But not vice versa…!

If you need to exchange data from new to old, there will be problems like you describe below.
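To make the two directions concrete, here is a rough sketch that reduces an archetype change to a single occurrences constraint. The `Occurrences` class and both function names are made up for illustration; a real comparison would have to look at far more than occurrences.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Occurrences:
    """Occurrence constraint lower..upper; upper=None means unbounded (*)."""
    lower: int
    upper: Optional[int]

    def contains(self, other: "Occurrences") -> bool:
        """True if every count allowed by `other` is also allowed by `self`."""
        if other.lower < self.lower:
            return False
        if self.upper is None:
            return True
        return other.upper is not None and other.upper <= self.upper


def old_data_readable_by_new(old: Occurrences, new: Occurrences) -> bool:
    """The criterion above: data valid under the old constraint stays valid under the new one."""
    return new.contains(old)


def new_data_readable_by_old(old: Occurrences, new: Occurrences) -> bool:
    """The reverse direction (new to old), which the criterion above does not guarantee."""
    return old.contains(new)


# Example: a mandatory element (1..1) is made optional (0..1).
old, new = Occurrences(1, 1), Occurrences(0, 1)
print(old_data_readable_by_new(old, new))  # True
print(new_data_readable_by_old(old, new))  # False
```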

If you require this kind of compatibility, and treat its absence as a major change, you’ll end up having many, many changes creating a new major version, essentially ignoring that this is, more or less, what the minor version can be used for.

The one is intra-system, the other is inter-systems, in any direction, if you like.

On the clinical level, you could, in an extreme case, even go so far that an (alphabetically) minuscule change to the text or description of an at-code can be a major version change if it (sufficiently) changes the semantics of the element. There will be edge cases for this, I suspect.

If any of the changes, in a concrete case, is critical from a clinical point of view, I would then argue this needs to be a major change by the modeller, even if technically it is just a patch or minor version change.

Using your examples below, if the new element (1.) is absolutely critical, it may warrant a new major version, even if it is not mandatory (for reasons discussed a few days ago on the list).

If a value is added to the internal value set (2.), I absolutely agree that this may change the meaning of the other values, especially in the “other” case.

I have argued the same way as you before, but I now think that it is a clinical (or semantic/logical) decision, not a purely technical one, to make this a major version change.

Adding a runtime name constraint (3.) is a major change from a technical point of view already, in my opinion.

(More interesting, in my view, is whether deleting a runtime name constraint is a major change as well. I think so (and CKM agrees, surprisingly :wink:), essentially for the reasons you argue below.)

Widening cardinality/occurrence constraints (4.) would give you the problem you describe.

I still think this is the prototype of what constitutes a minor version change.

We could of course consider requiring a technical change to be backward compatible AND forward compatible, so that this works 100% under all circumstances BETWEEN systems, in all directions. Then the order of old and new doesn’t make a difference for the comparison. Most of the minor versions would then not exist; they would be major version changes instead.
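The difference between the two policies can be sketched as a small decision function. The name `required_level` and the `strict` flag are assumptions for illustration only, and patch-level changes are deliberately not modelled.

```python
def required_level(backward_ok: bool, forward_ok: bool, strict: bool) -> str:
    """Hypothetical classifier for a change that alters constraints.

    backward_ok: data recorded with the old archetype is valid against the new one
    forward_ok:  data recorded with the new archetype is valid against the old one
    strict:      if True, demand compatibility in both directions (inter-system rule)
    """
    if not backward_ok:
        # Old data can no longer be read: major under either policy.
        return "major"
    if strict and not forward_ok:
        # Only the bidirectional rule escalates this case.
        return "major"
    return "minor"


# Widening an occurrences constraint: old data still valid, new data may not be.
print(required_level(backward_ok=True, forward_ok=False, strict=False))  # minor
print(required_level(backward_ok=True, forward_ok=False, strict=True))   # major
```

Under the strict rule, only changes that are compatible in both directions would stay below major.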

Cheers,

Sebastian

All,

For reference, the rules on what level of change is required to the version id are here, in the Identification specification. I suggest that these are taken as the starting point for any discussion about changes to versioning rules. To my knowledge, these rules are already implemented in CKM.

Implementing organisations need to think very carefully about whether they want to be able to support all previous versions of their software at all times simultaneously, because that is likely to be semantically problematic, not just for archetypes, but for any standards (e.g. FHIR profiles etc), terminology static value sets (i.e. not intensional value sets) and any other semantics buried in business rules, application logic etc.

From Silje’s list, I have the following comments:

  1. adding non-mandatory elements: if a receiver system using an earlier version of the same archetype (why?) rejects data from the new archetype, this is an error in the receiver integration logic, since there can always be more elements, archetyped or not, except in the edge case where the container attribute cardinality upper limit is constrained to a fixed number.
  2. correct; but modelling internal value sets, or any value sets, as X, Y, other is bad practice, as documented for 15 years by the likes of Cimino, Rector, IHTSDO etc. No one should be doing that. The question is how to do it properly in openEHR archetypes. Some possibilities:
     • use the DV_CODED_TEXT and DV_TEXT; this has the effect that ‘other’ items can be textually represented when there is no code, and avoids the fake ‘other’ code polluting code sets - this can be done today.
     • introduce an ‘open’ / ‘closed’ marker on DV_CODED_TEXT (i.e. on C_TERMINOLOGY_CODE). There is a discussion about this ‘exclusion’ logic with respect to higher level structures in the AOM2 spec, which may be useful. We would still have to do something new to implement this ability properly for term sets.
  3. agree; but runtime name constraints should never be added in archetypes, only templates. The rule is that they default to the rubric of the at-code of the node if not constrained, which should never be ‘unexpected’.
  4. agree; but the problem here really is: if you now think that data element X is optional clinically speaking, presumably you think that to be the case across the whole health system, or jurisdiction or whatever - so my question is: what is the sense in keeping old software that is wrongly mandating a data item that will no longer be clinically collected (items like religion, ethnicity, sexual preference, even social gender all seem to be common examples of this, but of course clinical examples exist too, as science and business process evolve)?
- thomas
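Point 1 above (a receiver on an older version should not reject data just because it carries extra optional elements) could be sketched roughly as follows. The flat `payload` dictionary, the at-code keys and the `read_known_elements` function are all hypothetical simplifications, not an openEHR serialisation or API.

```python
from typing import Any, Dict, Set, Tuple


def read_known_elements(
    payload: Dict[str, Any], known_node_ids: Set[str]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split incoming data into elements this receiver understands and the rest.

    Unknown elements are kept aside rather than treated as a validation error,
    so data from a newer minor version is still accepted.
    """
    known = {k: v for k, v in payload.items() if k in known_node_ids}
    unknown = {k: v for k, v in payload.items() if k not in known_node_ids}
    return known, unknown


# The receiver was built against a version that only defines at0004 and at0005;
# the sender uses a newer version that added the optional element at0010.
known, unknown = read_known_elements(
    {"at0004": 120, "at0005": 80, "at0010": "left arm"},
    known_node_ids={"at0004", "at0005"},
)
print(known)    # {'at0004': 120, 'at0005': 80}
print(unknown)  # {'at0010': 'left arm'}
```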