Is adding a protocol to an archetype really a breaking change?

siljelb · 16 December 2022 11:54

I’ve recently received an unexpected response from CKM when trying to upload a new revision of an archetype, where the main change was adding a ‘protocol’ section.

The CKM claims this is a breaking change. @sebastian.garde has since explained to me that adding a ‘protocol’ section is technically a breaking change because it’s constraining it to being an ‘ITEM_TREE’ even though nobody is actually using any other structures than ITEM_TREEs for building archetypes anymore.

Should we really consider this kind of constraining to be a breaking change for the purposes of semantic versioning of archetypes? It doesn’t feel right to have to take this archetype to v2 just because of this.

thomas.beale · 16 December 2022 12:04

I don’t understand that as a breaking change - it’s an addition, so it’s a minor version. Sebastian do you have some deeper logic on this question that we are missing?

sebastian.garde · 16 December 2022 12:53

My understanding so far was that protocol “exists” even if it is completely unconstrained.
I.e. in data you can put in any ITEM_STRUCTURE into the protocol - whether protocol is constrained in the archetype or not. This is because protocol is an explicit optional ITEM_STRUCTURE in any CARE_ENTRY.

Then if an archetype in Revision A does not constrain the protocol at all, any data created against that archetype can thus have any kind of ITEM_STRUCTURE in its protocol data, not only ITEM_TREEs.

If the next revision of that archetype (Revision B) then constrains the protocol to only allow - say - ITEM_TREEs instead of all types of ITEM_STRUCTUREs, then Revision B is not guaranteed to be able to “read” data created using Revision A (because someone might have used an ITEM_TABLE for example).
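This argument can be read as a set comparison: a revision is backward compatible only if every protocol type the old revision accepted is still accepted by the new one. The following is a minimal sketch of that idea, not CKM's actual logic. It assumes the four concrete ITEM_STRUCTURE subtypes from the openEHR RM; the helper names `allowed_protocol_types` and `is_backward_compatible` are hypothetical, and only the RM type is compared, not the structure underneath.

```python
# Concrete subtypes of the abstract ITEM_STRUCTURE in the openEHR RM.
ITEM_STRUCTURE_SUBTYPES = frozenset(
    {"ITEM_SINGLE", "ITEM_LIST", "ITEM_TABLE", "ITEM_TREE"}
)


def allowed_protocol_types(constraint):
    """Return the RM types a revision accepts for protocol.

    `constraint` is None when the archetype does not constrain protocol,
    otherwise a set of concrete RM type names (simplified, hypothetical model).
    """
    if constraint is None:
        # Unconstrained: the RM alone decides, so any ITEM_STRUCTURE is valid.
        return ITEM_STRUCTURE_SUBTYPES
    return frozenset(constraint)


def is_backward_compatible(old_constraint, new_constraint):
    """True if everything valid under the old revision is valid under the new."""
    return allowed_protocol_types(old_constraint) <= allowed_protocol_types(new_constraint)


revision_a = None            # protocol not constrained at all
revision_b = {"ITEM_TREE"}   # protocol constrained to ITEM_TREE only

print(is_backward_compatible(revision_a, revision_b))  # False
```

Under these assumptions the check fails because ITEM_TABLE, ITEM_LIST and ITEM_SINGLE were acceptable for Revision A but are not for Revision B.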

Am I wrong in my understanding that this is the nature of a breaking change?

Now practically speaking it may well be that this is not relevant because it can be ascertained that this theoretically breaking change simply never has occurred in practice. This seems likely in this case and thus it may be useful as editors to be able to override the technical recommendation…but let’s first agree on whether this is a breaking change or not.

For reference:
protocol of Revision B:

	protocol matches {
		ITEM_TREE[at0035] matches {    -- Item tree
			items cardinality matches {0..*; unordered} matches {
				allow_archetype CLUSTER[at0036] occurrences matches {0..*} matches {    -- Extension
					include
						archetype_id/value matches {/.*/}
				}
			}
		}
	}

thomas.beale · 16 December 2022 13:31

yes that’s correct logic alright. Does the new revision of this archetype mandate just that protocol, or is it a choice of ‘anything’ or the added protocol? If it’s now limited to just the defined protocol, Sebastian is right (as usual!) - the new constraint breaks one of the openEHR Golden Rules: that a new revision of an archetype doesn’t invalidate data created by previous revisions.

A couple of paths open in order to not create a v2:

1. Determine (for sure) that no users of the older version used protocol, or, if they did, that they created a protocol structure that actually does conform to the newly added protocol constraint (seems likely if I understand correctly).
2. Define the new protocol structure as an alternative, i.e. not the only possible protocol constraint.

ian.mcnicoll · 16 December 2022 14:16

I think your logic is technically correct @sebastian.garde but agree with @siljelb that in practice this is not a breaking change, partly because we only ever use ITEM_TREE but also because we have never been able to leave a protocol ‘open’ to be constrained in a template. The only place I can imagine this ever being a theoretical problem might be if someone specialised an archetype with no protocol section, then added e.g. an ITEM_TABLE, then at a later stage added ITEM_TREE to protocol in the parent archetype.

Agree that the advice should be relaxed for adding protocol or indeed state.

sebastian.garde · 16 December 2022 14:38

See the protocol extract above for the concrete example, I have taken that from the archetype directly - no choice defined.

Defining the protocol structure as an alternative is an interesting option.
I haven’t seen that on that level before as far as I remember, but I assume you mean something like this:

	protocol matches {
		ITEM_TREE[at0035] occurrences matches {0..1} matches {    -- Item tree
			items cardinality matches {0..*; unordered} matches {
				allow_archetype CLUSTER[at0036] occurrences matches {0..*} matches {    -- Extension
					include
						archetype_id/value matches {/.*/}
				}
			}
		}
		ITEM_STRUCTURE[at0037] occurrences matches {0..1} matches { * }
	}

I suspect that tooling won’t necessarily expect this on protocol level at the moment.
I just tried in CKM and the upload works fine in principle, although the validator needs some fine-tuning to recognize the abstract ITEM_STRUCTURE correctly here, and there seems to be a minor problem in the comparer, which needs to be a bit smarter.
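To illustrate why the extra ITEM_STRUCTURE alternative keeps older data valid, here is a rough type-level sketch. It is an assumption-laden simplification: the `conforms` helper and the list-of-type-names representation are hypothetical, and the check ignores everything below the RM type (items, slots, occurrences).

```python
# Abstract RM type mapped to its concrete subtypes (openEHR RM data structures).
SUBTYPES = {
    "ITEM_STRUCTURE": {"ITEM_SINGLE", "ITEM_LIST", "ITEM_TABLE", "ITEM_TREE"},
}


def conforms(data_type, alternatives):
    """True if a protocol of `data_type` matches at least one alternative.

    An alternative matches when it names the type itself or an abstract
    supertype of it (hypothetical, type-level check only).
    """
    for alternative in alternatives:
        if data_type == alternative:
            return True
        if data_type in SUBTYPES.get(alternative, set()):
            return True
    return False


only_tree = ["ITEM_TREE"]                         # protocol as in Revision B
tree_or_anything = ["ITEM_TREE", "ITEM_STRUCTURE"]  # alternatives pattern

print(conforms("ITEM_TABLE", only_tree))         # False
print(conforms("ITEM_TABLE", tree_or_anything))  # True
```

With the second alternative in place, an ITEM_TABLE created against the unconstrained revision would still find a matching constraint, which is what makes the change additive rather than narrowing.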

Technically, I suspect this approach can be made to work, but it seems a bit awkward to explain to anybody on the other hand.

So, if we all agree that this is practically speaking a non-issue (I certainly do, @ian.mcnicoll), editors can just override the recommendation by CKM, probably best accompanied by a comment in the log message.
It would certainly make any modeller’s life easier if this is possible.

thomas.beale · 16 December 2022 17:33

yes. In ADL2 you wouldn’t bother with the occurrences on the second one, it would just be assumed from the RM.

Whatever is decided here, I think it needs to be as general a solution as possible and needs to be fully automatable, i.e. doesn’t require special manual override of otherwise explainable rules etc.

Seref · 19 December 2022 09:44

Sorry but I don’t see how this is not a breaking change. @sebastian.garde's point sounds solid to me. If there’s any data that’s created with the first version of an archetype, there are no constraints on the protocol that could be included in the actual data.
If the second version of the archetype adds any constraints to protocol, then it is no longer guaranteed that data from the first version is valid according to the second, and as Sebastian says that should be a breaking change. I cannot see how what you’ve been able to do in the modelling space matters here.

I can understand how @thomas.beale's suggestion can help deal with the potential problem, but I’m a bit lost when it comes to both of you referring to this as a practical non-issue. I’d love to get educated on this one! Maybe it’s too early in the week for my brain to kick in.

ian.mcnicoll · 19 December 2022 10:10

You are, as ever, theoretically correct.

However, there are 2 factors that make this practically very unlikely indeed.

1. I’m not aware of any archetype editors (or at least those used in the community) that allow you to leave a protocol with a choice of ITEM_STRUCTURE that can then be resolved e.g. in templates.
2. For many years we have not used anything other than ITEM_TREE; indeed, Archetype Designer will always assume ITEM_TREE.

I think we can agree that according to ADL rules this is a breaking change but that in practice, the CKM suggestion is actually quite unhelpful.

My suggestion would be that CKM flags up the issue as a warning but in terms of its overall recommendation, ignores it. This is adding extra human knowledge to the algorithm which ultimately is only a recommendation not strict rules.

Seref · 19 December 2022 10:15

I’d prefer technically correct but I’ll be a good boy for the moment.

Ok, I see where you’re coming from now. A warning would indeed be more pragmatic than an error, given the circumstances and history. Thanks, I appreciate the help.

thomas.beale · 19 December 2022 10:27

It is formally speaking a breaking change. Sebastian is 100% correct! There is a practical question of whether anyone has created protocol structures (normally via templates) that won’t validate against the newly defined protocol.

If they have, and existing templates are updated to the new version of this archetype, older data built with the previous version of the template will no longer validate, and that will be a serious problem.

So we do need to be careful on this kind of change - routinely allowing technically breaking changes through might create havoc downstream in systems - or not - depending on what we know about the previous use of such archetypes.

sebastian.garde · 19 December 2022 10:41

From what I understand there are three choices here and none of them is ideal.

1. Major version change to v2. Clean and correct but always painful to bump the version number for a trivial change like this one.
2. Ensure it is a technically backwards compatible change using the alternatives pattern suggested by Thomas. Works in theory and tooling can catch up, but the pattern is complicated and you will probably need special support for this in editing tools as well.
3. Make the practical assessment that the archetype has not been used in a way that could have led to incompatible data. The wider it is used, the less feasible this is, but in this particular case due to the reasons Ian just outlined, probably less of an issue.
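For the practical assessment, one conceivable approach is to scan existing data for protocol structures that the new constraint would reject. The sketch below is purely illustrative: the entry dictionaries are a simplified stand-in loosely modelled on JSON with a `_type` discriminator, and the `id` key and the function `find_incompatible_protocols` are hypothetical, not part of any openEHR API.

```python
def find_incompatible_protocols(entries, allowed_type="ITEM_TREE"):
    """Return (entry id, protocol type) pairs the new constraint would reject.

    `entries` is an iterable of dicts; `protocol` is optional in each,
    mirroring the optional protocol attribute of CARE_ENTRY.
    """
    offenders = []
    for entry in entries:
        protocol = entry.get("protocol")
        if protocol is None:
            # No protocol recorded: nothing the new constraint could reject.
            continue
        protocol_type = protocol.get("_type")
        if protocol_type != allowed_type:
            offenders.append((entry["id"], protocol_type))
    return offenders


# Made-up sample data for illustration only.
sample_entries = [
    {"id": "e1", "protocol": {"_type": "ITEM_TREE", "items": []}},
    {"id": "e2"},
    {"id": "e3", "protocol": {"_type": "ITEM_TABLE", "rows": []}},
]

print(find_incompatible_protocols(sample_entries))  # [('e3', 'ITEM_TABLE')]
```

An empty result over all known data would support the assessment, although it cannot cover systems whose data is not available for inspection, which is why this gets less feasible the more widely the archetype is used.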

Agree in principle, but I wonder what this human knowledge would look like formalised though, so that CKM can safely ignore this as part of its overall recommendation (only for protocol?, only if completely unconstrained before?, only if the new slot underneath allows all archetypes?).
Maybe we can set up a webservice to your brain.

siljelb · 19 December 2022 11:20

I’m going to use this option for this particular case then. Thanks everyone for your responses!

sebastian.garde · 23 February 2023 16:46

With the next CKM Release, CKM’s Archetype Comparer will no longer report this as a breaking change requiring a major revision.

Instead, a change between two archetype revisions where the protocol (or any other top-level ITEM_STRUCTURE) is not constrained in Revision N, but is constrained to something - most likely an ITEM_TREE - in Revision N+1, will be reported as a minor revision [presumed], accompanied by a special tooltip explanation.
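The rule described here could be sketched roughly as follows. This is not CKM's implementation: the function `classify_protocol_change` and the set-of-type-names representation are hypothetical, and only the first special case (unconstrained in Revision N, constrained in Revision N+1) comes from the description above. The remaining branches are assumptions that follow ordinary widening/narrowing logic.

```python
def classify_protocol_change(old_types, new_types):
    """Classify a change to a top-level ITEM_STRUCTURE constraint.

    `old_types` / `new_types` are None when the attribute is unconstrained,
    otherwise a set of permitted RM type names (simplified model).
    """
    if old_types == new_types:
        return "no change"
    if old_types is None:
        # Unconstrained before, constrained now: formally narrowing, but
        # presumed harmless in practice, so reported as a presumed minor.
        return "minor [presumed]"
    if new_types is None or old_types <= new_types:
        # Assumption: widening (or removing) a constraint is a minor change.
        return "minor"
    # Assumption: any other narrowing is treated as breaking.
    return "major"


print(classify_protocol_change(None, {"ITEM_TREE"}))                      # minor [presumed]
print(classify_protocol_change({"ITEM_TREE", "ITEM_LIST"}, {"ITEM_TREE"}))  # major
```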

This seems to be in line with the discussions above, but if anybody is not happy with this or has ideas for improvements, please let me know.