# openEHR-clinical Digest, Vol 11, Issue 3

**Category:** [Clinical (archive)](https://discourse.openehr.org/c/clinical-archive/153)
**Created:** 2013-01-07 06:19 UTC
**Views:** 2
**Replies:** 36
**URL:** https://discourse.openehr.org/t/openehr-clinical-digest-vol-11-issue-3/15224

---

## Post #1 by @WILLIAM_R4C

Hi Heather,

Thanks for your reply.

In general ISO 13972 describes what good models should look like, not how my models should look. Sorry to see that misunderstanding repeated. And it specifies characteristics of the models. If CKM is the means of implementation, why bother. If you version the models you new the requirements of ISO for instance. OpenEHR is even based on these foundations that you refer to other parts for particular characteristics. So I could not disagree more about such matters. And that is why the governance is in and separated from the model. E.g. If it says that any DCM should have an ID, in normal situations it must be in the file or in the file and. But there are other means to guarantee the uniqueness.

I am afraid you are judging the DCM work too much from a current CKM implementation side. Please go back a step to the principles: why do we need these characteristics? Because there are alternatives to archetypes. That is why this TS 13972 exists, to facilitate the 99% of EHR systems that cannot manage archetypes. You already have the 13606 series you can refer to.

Vriendelijke groet,

William Goossen

---

## Post #2 by @heather.leslie

Hi William,

Perhaps you can clarify better if I use a concrete example such as 5.8.12.2.2 which is a SHALL (or mandatory) statement.

There is a subcomponent within this normative statement labelled: “Publication Status of the detailed clinical model”, with a description: “This is the status of the detailed clinical model in relation with publication in the registry or repository: Not For Use (i.e.
teaching); Approved for testing; Approved for Production Use; Withdrawn; Superseded; Rejected(en); Obsolete.”

I interpret this as saying it is mandatory for a clinical model specification to contain a statement about the model’s publication status. openEHR archetypes currently don’t contain the publication status within the archetype specification itself. So I conclude that, at present, openEHR archetypes cannot be compliant with 13972 with regard to this normative statement.

I know that some other detailed clinical models contain a version history that incorporates a publication status as well, but that is not the case with the archetype as a logical model.

Regards

Heather

---

## Post #3 by @Talmon_CRISP

Dear Heather

Apart from implementation issues and current compliance, the question to answer for this particular example is whether or not it is useful to know what the status of an artifact is. When someone sends me an archetype in ADL, it is relevant for me to know what the status of that archetype is. It will allow me to make an informed decision how to use (or not to use) that archetype. Although it may be in the email with which the archetype was sent, later on, that email may get lost. So it is better to have it as metadata of the archetype. (At least from my perspective).

More in general, 13972 should specify the useful components of detailed clinical models. We all want interoperability and safe use of clinical models. I think the most constructive approach is to argue why certain SHALL statements may not be necessary (or too mandatory) to support these goals. The argument that "My current implementation is not compliant, hence it has to change" is less compelling for others. Any standardization work implies for some (if not all) approaches that work needs to be done to become compliant.

BTW, I have noticed that the "Dutch DCM" approach has evolved over the years to accommodate what has been in successive drafts of 13972, not that the drafts followed what was put in the "Dutch DCM" approach.

Regards

Jan

---

## Post #4 by @heather.leslie

William, Jan,

My intent with previous emails was simply to encourage a broader group within the modelling community to participate in feedback for DTS 13972, which has been sorely lacking in previous ballots.

You're both making reasonable points. We are all seeking interoperability. That's why we've all been working on this for so many years now.

I don't want to have detailed arguments in this forum. I don't necessarily think openEHR has it right in this area. BUT whether statements should be SHALL or SHOULD is a reasonable debate. Might it be acceptable to provide strong guidance with SHOULD and allow it to be tightened up further to a SHALL statement if and when there is clear community consensus?

Some SHALL statements in the current 13972 will mean openEHR archetypes appear to be non-compliant. If they remain as such, and they could well do so, then the questions for us become... does it matter? Will we choose to do anything about it?

Heather

---

## Post #5 by @system

I think the way we think about archetypes needs to be revised. That the archetypes cannot be compliant with 13972 is one reason, but I have another.

A few weeks ago I had an interesting discussion about archetypes on this mailing list, and by chance, also with Dipak Kalra about the same subject. This discussion is recorded on YouTube. After some confusion and help from other participants, we came to the point I wanted to make.

I think this is related to the situation around archetypes, semantic interoperability and therefore to this discussion.

Bert

---

## Post #6 by @thomas.beale

This kind of thing in 13972 is a worry. Obviously some publication / lifecycle status is useful.
Where it is recorded is another matter, and also the list of possible values is a further question. I can't imagine how there could be an ISO standard mandating either of these things before industry is even close to working out best practices. That would need to be changed for ISO 13972 to be of any use in a standards sense.

Personally I can't see the utility of trying to standardise this area for some years yet; industry will provide things that work, eventually, and that entails a period of experimentation and evolution. It's hard to imagine any other practical approach.

\- thomas

---

## Post #7 by @thomas.beale

Jan,

> Dear Heather
>
> Apart from implementation issues and current compliance, the question to answer for this particular example is whether or not it is useful to know what the status of an artifact is. When someone sends me an archetype in ADL, it is relevant for me to know what the status of that archetype is. It will allow me to make an informed decision how to use (or not to use) that archetype. Although it may be in the email with which the archetype was sent, later on, that email may get lost. So it is better to have it as metadata of the archetype. (At least from my perspective).

I don't disagree that it makes sense. Archetypes have a lifecycle field; it just isn't always set properly due to tools. Currently only CKM sets the field properly. It is only very recently that a workable set of lifecycle states has been worked out in archetype-land / CKM (i.e. there is no claim that the lifecycle used here is the right one for DCMs or other content model types).

> More in general, 13972 should specify the useful components of detailed clinical models. We all want interoperability and safe use of clinical models. I think the most constructive approach is to argue why certain SHALL statements may not be necessary (or too mandatory) to support these goals.
> The argument that "My current implementation is not compliant, hence it has to change" is less compelling for others. Any standardization work implies for some (if not all) approaches that work needs to be done to become compliant.

ok, but 'standardisation work' is not where the actual work on these things gets done. You don't solve difficult problems like this by confabulations in the discussion rooms at ISO meetings. They are solved in the real world - in competent academic and industry locations where real implementation work is going on. As I have said many times before, the only utility of 'standards' is if they are defined by choosing / adapting working solutions that emerge from these real world locations. Aspirational standards are worse than useless; they waste people's time and inevitably have to be debugged by the real world anyway.

So until there are compelling, working implementations of a given concept, a standard for such a concept is just nonsense.

\- thomas

---

## Post #8 by @yampeku

When you talk about lifecycle not being set properly, do you mean that the set of values is not correct or that the lifecycle management process is not correct?

---

## Post #9 by @pablo

Hi Jan, William, Heather and others,

Just a thought about having governance metadata inside or outside the knowledge artifact (e.g. archetype / DCM).

Having that metadata outside the artifact creates the necessity of a centralized governance system, and having it inside the artifact doesn't. We work in distributed environments where the centralized view doesn't fit. The CKM is proof of that: there is a global instance, an AUS instance and a SWE instance; they share some archetypes and others are developed locally.

The 3rd scenario is having the metadata outside the knowledge artifacts while supporting a distributed governance environment.
That is just an open door for chaos, because it's impossible to keep track of all artifacts, all modifications/versions and statuses, because each governance system will have different metadata about the conceptually equal artifacts (they represent the same concept).

What do you think?

---

## Post #10 by @thomas.beale

Hi Pablo,

that's only a problem if the artefacts are not tracked properly, i.e. with unique ids and potentially checksums / MD5s. You can't put everything in the artefact, because you create dependencies of the artefacts on things which may be of local or specific significance, e.g. how they are classified in an ontology, what the relationship between parts of the artefact and the national data dictionary are and so on. If we put everything in the artefact, we will end up with a BP measurement archetype containing 50k of references to NHS data dictionary items, Nehta ISO 11179 elements, ontology markers for different types of CKMs etc.

The lifecycle will be tracked properly in archetypes, it is just waiting for tools to make more progress, and also to finalise what the lifecycle actually is. But not everything can be. So there is no way out of having proper governance tools - whether centralised or distributed...

\- thomas

---

## Post #11 by @ian.mcnicoll

Hi Diego,

Both!!

The set of values is not correct - it was originally drawn, I think, from the world of paper standards and does not fit the archetype development cycle particularly well. The values are also not applied very consistently (although we now do this within CKM).

I am currently finishing a fairly long document which tries to bring together a number of these issues - archetype identifiers, namespacing, lifecycle versions etc, which I will aim to get up into the public space within the next couple of weeks.

Regards,

Ian

---

## Post #12 by @pablo

Hi Thomas,

I mentioned only "governance metadata", like identification attributes (conceptual: id, name, code, definition, purpose, ...; physical: hashes, checksums, ...), state attributes (current development state, publish state, revision state, version, ...), authorship attributes, ...

Since this is a discussion about a standard and the problems it is solving, I prefer to center on a more usable and difficult scenario: distributed governance. I'm not saying that a centralized scenario is bad in all cases, just that it is a far simpler scenario to resolve.

---

## Post #13 by @yampeku

I see your point, but I would say that already happens with translations (translations with different degrees of 'maturity' may exist on the same archetype)

---

## Post #14 by @Tim_Cook

Since I am an other,

There should not be any modifications or versions. Once a clinical model is distributed, it is concrete. There may be instances created against it. They are permanent and should never be changed.

--Tim

---

## Post #15 by @ian.mcnicoll

Hi Diego,

This is a good point. I think we will probably have to label translations with metadata on authoring state and the revision/build against which that state was set. It will be important for people to know if a translation has been performed/reviewed against the current revision (or not).

Ian

---

## Post #16 by @thomas.beale

ok - I wasn't clear, since there is all kinds of other 'governance metadata' which people have proposed in the past, and which we do use, e.g. for searching, which are not in the artefacts themselves. But in a more strict use of the word 'governance', I agree.

I think distributed governance is the only realistic possibility.

\- thomas

---

## Post #17 by @system

Hi All

I am wading in here because I am very aware that the full technical requirements for safe use of clinical models depend on a great deal of checking.
At present we have a couple of checks in CKM: an MD5 on the full archetype (this will change if you change anything) and an MD5 on the structure which allows wordsmithing, translations etc. For systems using archetypes which are evolving rapidly, these two checks are useful.

Clearly an archetype, once released for use, is a high risk artefact, regardless of the metadata. We saw in the UK that people changed the archetype and did not resubmit it. The MD5 checks do allow repositories to assess the type of change.

The publication status is a complex thing as well. Standalone publication of an archetype is one thing, but most important is being part of a release set for a particular jurisdiction. Clearly all archetypes in a release set will be published. But the same archetype can be published over and over with more translations, additional features etc, all the while being backwardly compatible. So there is a lot to manage and I am not sure that the proposed metadata will help a great deal.

It is, actually, possible to put the status in the archetype. These have not been standardised as yet as it is still not clear which are needed. The governance pathway in CKM is now well tried and tested, but will have some new issues to deal with now we have subdomains and incubators.

Cheers,

Sam

---

## Post #18 by @system

Because I partially initiated this discussion, I want to summarize my view on it again, but I did not think it over very well. I am too busy with other things to follow these discussions. I am sorry for that. So regard it as my two cents while you make a good and wise decision about how it will be.

My idea is to furnish archetypes with more MD5 keys, one for the definition, and one for each language-section in the ontology. Since a locatable stores the ArchetypeDetails to indicate which archetype is used, ArchetypeDetails should then store these keys. It is of course the responsibility of a system-owner to save the archetypes belonging to specific keys.
Backwards-compatibility cannot be defined in keys; if that needs to be defined, there must be a metadata-section for that.

Regarding the archetypeID, the term "ID" suggests that it can be used to identify something. If one wants to use the term ID, it must contain information to identify the archetype; in that case, it should contain a key, MD5 for example, which is

The information (RM-model, domain, concept, etc) that is now hidden in the archetype-ID must be provided in metadata-sections, so that the ID will no longer need to contain any necessary information. I think the ID will be superfluous. It should not be part of the ArchetypeDetails. The archetypeID will not be needed anymore in the information-structure; users are free to assign any name to an archetype.

And we also need a humanly expressible name for an archetype. For that purpose, a name-design can be provided, as long as it is clear that the name does not verifiably and uniquely identify an archetype.

---

## Post #19 by @Stefan_Sauermann

> I think distributed governance is the only realistic possibility.

Fully agree.

Greetings from Vienna,

Stefan Sauermann

Program Director
Biomedical Engineering Sciences (Master)

University of Applied Sciences Technikum Wien
Hoechstaedtplatz 5, 1200 Vienna, Austria
P: +43 1 333 40 77 - 988
M: +43 664 6192555
E: stefan.sauermann@technikum-wien.at

I: www.technikum-wien.at/mbe
I: www.technikum-wien.at/ibmt
I: www.healthy-interoperability.at

---

## Post #20 by @ian.mcnicoll

Hi Bert,

Whilst I understand and agree with the need for very precise identification of a particular archetype via MD5 or some other 'meaningless identifier/locator', it is also necessary to clearly identify the version, revision and build of an archetype for authoring and governance purposes, as well as in early stages of development and testing.
During the authoring phase, where archetypes and templates can be very fluid, and have multiple dependencies, our experience has been that MD5 hashing is very difficult to manage since it forces very strict dependencies, where in practice we perhaps only want to ensure that the archetype Version is correct, and temporarily ignore Revision changes or minor 'Build' changes, until publication/deployment.

In terms of regular software development, imagine the consequences if all of your source code files were MD5 hashed, and each dependency MD5 tagged, resulting in a compile failure if there is a mismatch between the dependent files.

I am not convinced that abandoning the existing archetypeID mechanism is the right approach but even if it were I think we would still need to support it for legacy purposes. Let's also leave aside the OID vs ReverseURI issue for now (there is a possible case for supporting both).

As I said in a previous post, I am working on some proposals, based on earlier discussions, Thomas's wiki material, the suggestions for Semantic Versioning at [semver.org](http://semver.org) and experience from both CKM and non-CKM repository governance.

The suggestion will be to introduce an extendedArchetypeID, composed of ...

OriginalDomain: The originating Domain namespace where the archetype was first governed e.g. org.openehr but could be an OID. This never changes even when the archetype is transferred to a different repository.

ArchetypeID: The current archetypeID (including Version) e.g. openEHR-EHR-OBSERVATION.blood_pressure.v1

RevisionNumber: The revision number e.g. '2', starting from base 0 with each new Version.

NonOperationalModifier: A state suffix e.g. (initial, draft, rc, inactive).

BuildID: This is repository-defined. In the case of CKM it might be the 'citeable-ID' e.g. 1031.1.1296_2 but it could equally be an MD5 Hash for a non-CKM repository.
The only stipulation is that any archetype made visible by the repository must increment the BuildID, whenever ANY change is made to the archetype.

So the full ID for a published archetype might look something like

org.openEHR::openEHR-EHR-OBSERVATION.blood_pressure.v1.0+build1031.1.1296_2

or for the same archetype taken back into review

org.openEHR::openEHR-EHR-OBSERVATION.blood_pressure.v1.1-draft+build1031.1.1296_7

The nomenclature is closely based on [semver.org](http://semver.org) with some modifications.

The metadata should also carry:

AuthoringLifeCycle: (Initial, Draft, Team Review, Suspended Review, Published, Rejected, Superseded, Moved). This is effectively a superset of the operational states to account for authoring and governance requirements which do not affect the operational status of the archetype, i.e. an archetype which moves from Draft to Team Review and then Review Suspended remains a 'beta' archetype from an operational perspective. There is a direct mapping between the operational states and one or more Author lifecycle states.

CurrentDomain: identifier for the current controlling domain, to allow for situations where control/governance of the archetype is transferred.

We can probably carry all of the necessary extra metadata within the current ADL1.4 specification under other_details.

In general I would expect archetypes and templates to refer to dependent archetypes or templates using the full extended identifier, but associated tools could choose to ignore e.g. build or even revision incompatibilities where appropriate, and update the references automatically to point to the currently used artifacts, if modelling is still at a stage where loose coupling is necessary.
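
To make the proposed identifier layout concrete, here is a minimal Python sketch that splits the two example IDs above into their parts and compares references at a chosen strictness. It is only an illustration under stated assumptions: the regular expression, the `ExtendedArchetypeId` class, and the `parse_extended_id` and `references_match` functions are hypothetical names invented for this example, and the grammar is inferred from the two sample strings rather than taken from any openEHR specification.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed grammar, inferred from the two sample IDs in the post:
#   <OriginalDomain>::<ArchetypeID>.<Revision>[-<Modifier>]+build<BuildID>
_PATTERN = re.compile(
    r"^(?P<domain>[^:]+)::"
    r"(?P<archetype_id>.+\.v\d+)"
    r"\.(?P<revision>\d+)"
    r"(?:-(?P<modifier>[A-Za-z]+))?"
    r"\+build(?P<build>.+)$"
)


@dataclass(frozen=True)
class ExtendedArchetypeId:
    original_domain: str
    archetype_id: str        # includes the Version, e.g. "...blood_pressure.v1"
    revision: int
    modifier: Optional[str]  # e.g. "draft"; None for a published archetype
    build_id: str            # repository-defined, treated as opaque text


def parse_extended_id(text: str) -> ExtendedArchetypeId:
    """Split an extended ID string into its components (hypothetical helper)."""
    m = _PATTERN.match(text)
    if m is None:
        raise ValueError(f"not a recognised extended archetype ID: {text!r}")
    return ExtendedArchetypeId(
        original_domain=m.group("domain"),
        archetype_id=m.group("archetype_id"),
        revision=int(m.group("revision")),
        modifier=m.group("modifier"),
        build_id=m.group("build"),
    )


def references_match(a: ExtendedArchetypeId, b: ExtendedArchetypeId,
                     level: str = "build") -> bool:
    """Compare two IDs, ignoring differences below the requested level."""
    if level not in ("version", "revision", "build"):
        raise ValueError(f"unknown level: {level}")
    same_version = (a.original_domain, a.archetype_id) == (
        b.original_domain, b.archetype_id)
    if level == "version":
        return same_version
    same_revision = same_version and a.revision == b.revision
    if level == "revision":
        return same_revision
    return same_revision and a.modifier == b.modifier and a.build_id == b.build_id
```

A tool in the loose-coupling phase might call `references_match(ref, available, level="version")`, and switch to `level="build"` once the models are published.
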

In terms of the original discussion, I think my main concern is that the state of understanding of how DCMs/archetypes should be managed, governed and versioned is still very much in the early phases, particularly in the very distributed governance environment that I think we all agree will be required.

openEHR archetypes (? also FHIR resources) are a bit of a special case since we expect these to form the basis of persisted, queryable data, not just messaging definitions. This makes the challenge of versioning / governance rules much harder but I think we are pretty close to having an implementable solution.

More soon ... and I would be interested in others' thoughts about the BuildID, particularly in the context of Git or Subversion-based repositories.

Ian

---

## Post #21 by @yampeku

For the moment I can think of possible problems when using that kind of id if someone tries to put the id as the file name in some OS. ':' is not a valid character in a file name.

---

## Post #22 by @ian.mcnicoll

Hi Diego,

That's an issue which has been raised in previous discussions, and was resolved by allowing other characters to be used in filenames e.g. ∼ - see

[http://www.openehr.org/svn/specification/TRUNK/publishing/architecture/am/knowledge_id_system.pdf](http://www.openehr.org/svn/specification/TRUNK/publishing/architecture/am/knowledge_id_system.pdf) - right at the end of the document

There is perhaps an argument for using '∼' for both the internal identifier and the filename but I don't think we should insist that the filename must replicate the form of the extendedID precisely. In some cases local systems might want to omit the revision number or even the namespace, depending on circumstances. The source of truth for any tooling or application must be the archetype metadata, not the filename.

Ian

---

## Post #23 by @yampeku

Indeed, but I'm sure that it will cause more than a headache to some newcomers :)

---

## Post #24 by @pablo

Diego, we commented about archetype naming and archetype id in the lists not so long ago. One thing was clear: adl file names and archetype ids are independent, but it is useful to have the id in the file name when developing a system. This is not so useful for production systems (the ones I believe will use the full ID mentioned by Ian).

---

## Post #25 by @system

Diego, it is a Windows limitation. My OS has no problem with ':' in a filename.

But besides that, I don't think we should connect the place where or the way how the archetype is stored with the way to identify the archetype.

For you as an archetype-editor-designer, it can be convenient to have the ID as filename; I think you will find a solution for that.

But in my opinion, we should not let that be a part of this discussion.

Bert

---

## Post #26 by @system

Hi Ian,

For me the big advantage of internally verifiable archetypes is that you don't need any organization to be sure that the archetype you are looking at is the same archetype with which you defined a stored dataset.

I think it is important to make it possible that archetypes are used with no connection to the outer world, and that they are still verifiable.

You don't want a hospital to remain without reliable data because the Internet is down? Because that is the case if an archetype cannot prove itself to be reliable. That is why I think having MD5 keys in the metadata is a good idea.

Also, imagine archetypes being used in very special medical organizations which have nothing to do with the outer world. For example, archetypes needed in industry or specialized organizations. The advantage, in my view, of multi-level modeling is that this is possible without having to write expensive systems. Just write an archetype and you are in business.

We should not lose focus on this flexibility.
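
As a rough illustration of what "an archetype that can prove itself" without a network connection might look like, the following Python sketch checks locally held archetype text against MD5 keys recorded earlier. It assumes the scheme suggested earlier in the thread (one key for the definition, one per ontology language); the function names and the `ontology:<language>` key labels are hypothetical, and the sections are passed in as plain strings rather than parsed from ADL.

```python
import hashlib
from typing import Dict, List


def md5_hex(text: str) -> str:
    """MD5 digest of a text section, as a lowercase hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def section_keys(definition: str,
                 ontology_by_language: Dict[str, str]) -> Dict[str, str]:
    """One key for the definition plus one key per ontology language."""
    keys = {"definition": md5_hex(definition)}
    for language, text in ontology_by_language.items():
        # "ontology:<language>" is a label invented for this sketch.
        keys[f"ontology:{language}"] = md5_hex(text)
    return keys


def changed_sections(recorded: Dict[str, str],
                     definition: str,
                     ontology_by_language: Dict[str, str]) -> List[str]:
    """Names of sections whose current key differs from the recorded key.

    An empty list means the local copy matches what was recorded, with no
    lookup against any external repository.
    """
    current = section_keys(definition, ontology_by_language)
    names = recorded.keys() | current.keys()
    return sorted(n for n in names if recorded.get(n) != current.get(n))
```

A system could store the result of `section_keys(...)` alongside the data when it is committed, and later call `changed_sections(...)` to see whether only a translation changed or the definition itself did.
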

---

## Post #27 by @system

> But I also see your problem of development, testing, data becoming useless because MD5 keys are changed because of archetypes revisions.

is fixed, it may never change again. There is something to say for that. Maybe it is just right that you cannot use a revised version to retrieve/interpret older data. You should use the original archetype, identified by the MD5 key.

I think this is a strong point.

Bert

(and if you cannot find it anymore, and it is impossible to write one with the same key, there should be an escape-API, like there should also be for development and testing situations)

---

## Post #28 by @thomas.beale

I think the clinical modelling community on CKM have worked out a good life cycle now. That has taken some years to evolve. I doubt if they would claim it is the final answer for the whole world, nor even all uses of archetypes. So I would say a standard for this is still premature, but that if anyone wanted to create a draft of such a standard, it would be hard to imagine one of the candidate life cycles not being the one developed in CKM, with its hundreds of users.

\- thomas

---

## Post #29 by @thomas.beale

that is more or less what we do in our own implementations, and it has taken some time to work out even how to make this work properly. For example, doing an MD5 of the whole archetype actually doesn't work. You have to first generate a canonical version containing only the 'stuff that matters'. Then MD5s can be useful. But we also need a proper 3-part version identification system. MD5s don't replace this, they just tell you that the copy you have of X is really X (assuming X's MD5 is published in a place you trust), or else a guaranteed equivalent of X (e.g. in Dutch). I.e. the usual integrity check. And/or non-repudiation, if you include signing.

But getting the definition of the canonical form is not that easy.
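
To show why hashing a canonical form behaves differently from hashing the raw file, here is a deliberately naive Python sketch. It assumes that ADL comments start with `--` and that whitespace and line endings are not significant; a real canonical form would need a proper parser (this version would also cut a `--` that appears inside a quoted string), so `canonical_form` and `canonical_md5` are hypothetical helpers and not the canonicalisation used by CKM or the ADL Workbench.

```python
import hashlib
import re


def canonical_form(adl_text: str) -> str:
    """Naive canonicalisation: drop comments, blank lines and extra whitespace."""
    lines = []
    for raw in adl_text.splitlines():
        line = raw.split("--", 1)[0]              # assumed comment syntax: "--" to end of line
        line = re.sub(r"\s+", " ", line).strip()  # collapse runs of whitespace
        if line:                                  # skip lines that are now empty
            lines.append(line)
    return "\n".join(lines)


def canonical_md5(adl_text: str) -> str:
    """MD5 of the canonical form rather than of the raw text."""
    return hashlib.md5(canonical_form(adl_text).encode("utf-8")).hexdigest()
```

With this, two copies that differ only in comments, indentation or line endings give the same `canonical_md5`, while a change to a node id or constraint gives a different one. A plain MD5 of the raw text would differ in both cases.
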

Anyway, there are two useful docs that I and Ian McNicoll will get posted ASAP:

---

## Post #30 by @yampeku

I know, we changed that constraint some time ago, I'm just saying it's something that could happen if people are not aware.

---

## Post #31 by @ian.mcnicoll

Hi Bert,

I can understand Tim's point from an implementation perspective, and it really all depends on what you mean by 'an archetype' - is this a Version of an archetype, a revision of an archetype or a specific build of an archetype?

I have seen enough real implementations of openEHR to be confident that the version/revision/build rules work very well, in terms of clearly defining breaking and non-breaking change. One of the joys of openEHR development is slipstreaming a revised archetype into a running system and watching it carry on running safely but able to support the expanded dataset afforded by the new revised archetype, without having to change any database schema or legacy queries.

On the other hand, I can also understand that others might want to take a different view in certain mission critical areas of specifying a specific revision or even build of a particular archetype for data collection and querying. Any proposals have to support both perspectives.

The problem with the strict MD5 approach is that you are going to have to update all of your software and query references every time you change your schema and effectively use a 'new archetype'. In many respects this is even worse than the RDBMS approach, which at least generally allows new columns without breaking queries etc.

So, I think I probably disagree with Tim, except in the sense that I think we can have the best of both worlds if we adopt a sufficiently flexible id policy that lets implementers specify the exact degree of control they want to apply.

Regards,

Ian

---

## Post #32 by @system

Thomas, it is also my idea that an MD5 on a whole archetype does not work. I wrote it a few times before on this list.
You must have missed it, no problem, I also don't read everything.

Most important is an MD5 over the definition; it must be done after removing comments and trailing spaces, line-ends, etc. Other MD5s could be taken on the ontology, one for every language.

Then the second part of your message: how does an ID need to be defined? I do not have a strong opinion on that. Where should information be, in the ID or in the metadata?

I think if an ID contains obvious information, the chance that it will be unique will be very small. I think you better call it "name", because it describes an archetype, it does not identify an archetype.

I think that this kind of semantics is important, but it is not the most important part of the discussion. The most important part should be which information to include and use and for which purpose, and then, as detail, where that information should be, and how to call it.

Bert

---

## Post #33 by @thomas.beale

sorry, I missed a few things in the avalanche.

One theory I have is to convert the archetype to a standard dADL serialisation (the ADL Workbench already does this), and throw out descriptive elements, but I think the ontology has to be included. I'll get the id proposal up and let's discuss it then...

\- thomas

---

## Post #34 by @system

Ian,

Joy is a good thing, but once an archetype is used in production, one must be able to find that archetype back, and prove that this is the same. Not a revision, not a version, but that specific archetype.

I guess every informatics specialist will agree on this. But this is an easy requirement, check the MD5.

Having a revision does not need to mean that the pre-revision does not exist anymore. It is something that should coexist with the new revision, which can be recognized as a new revision and have the precedence on new development, but the old version should remain available.

Things will really get complicated if we have to check what the difference is between the archetype of a stored dataset, and the X-th revision of that archetype. I don't think we should want these complications.

I do agree that changing archetypes while using them in an application can cause problems. But also, using different versions of an archetype inside one application is not that complicated. Just regard them as different archetypes, and you are half way. The strict MD5 will help you doing so. Mistakes are not possible. That is good news. Strict MD5 checking will help the application-builder.

Best regards

Bert

---

## Post #35 by @system

OK :)

Bert

---

## Post #36 by @ian.mcnicoll

Hi Bert,

Comments inline.

> Joy is a good thing, but once an archetype is used in production, one must be able to find that archetype back, and prove that this is the same. Not a revision, not a version, but that specific archetype.
>
> I guess every informatics specialist will agree on this. But this is an easy requirement, check the MD5.

IAN: Agreed, and that is exactly how we use MD5 hashing in Ocean applications

> Having a revision does not need to mean that the pre-revision does not exist anymore. It is something that should coexist with the new revision, which can be recognized as a new revision and have the precedence on new development, but the old version should remain available.

IAN: Also agreed

> Things will really get complicated if we have to check what the difference is between the archetype of a stored dataset, and the X-th revision of that archetype. I don't think we should want these complications.

IAN: The whole point of the Version/Revision rules is that this check takes place at design-time, and is not needed at run-time.
Currently within CKM and other authoring tools, we do exactly this kind of comparison check every time we upload a new archetype build, and this is what allows us to determine whether a new Version is required (effectively a new archetype) or whether the changes introduced will be compatible with data stored with the previous revision. Of course, you have to rely on the archetype authors adhering to these rules correctly and applying the correct semantic versioning identifiers. We, and others, have any number of systems and applications which work quite successfully and safely on this basis.

> I do agree that changing archetypes while using them in an application can
> cause problems. But using different versions of an archetype inside
> one application is not that complicated either. Just regard them as different
> archetypes, and you are halfway there. The strict MD5 will help you do so.
> Mistakes are not possible. That is good news. Strict MD5 checking will help
> the application builder.

IAN: If you take the approach of regarding every non-breaking revision as a new archetype, you are effectively negating one of the huge advantages of openEHR, in that the formalism and versioning rules allow for considerable modifications to be made without breaking existing code and queries.

So I can extend an archetype over several revisions, continually adding new content, but this query for all of a patient's diastolic blood pressures will remain completely valid (pseudo-AQL, for clarity):

SELECT diastolic FROM EHR CONTAINS openehr.org::OPENEHR-EHR-OBSERVATION.blood_pressure.v1

If we are going to query on the basis of identifying the archetype via a strict MD5 hash, the query (or any other reference to this concept) will have to be updated to cater for every new archetype, and worse still for every single minor typo change or language addition that is of no interest, e.g.
SELECT diastolic FROM EHR CONTAINS "7CEDD2FF334E6DE44B16A369F14AC800" OR "CEDD2FF334E6DE44B16A369F1EE123" OR "ABCD2FF334E6DE44B16A369F1EE143"

This is exactly the kind of overhead that traditional RDBMS approaches introduce, and avoiding it is what makes openEHR much more agile in responding to changing clinical requirements without requiring changes to existing software.

We know this works, but it does require careful control of operational repositories, and we all agree that the current metadata is inadequate and needs to be improved to prevent the kind of naming and versioning collisions you are rightly concerned about.

Part of the delay in getting this specified is that any mechanism has to cope with the requirements of both the archetype and template development cycle and, of course, implementation, which has taken some time to understand clearly.

I expect MD5 hashing to be part of the solution, but it cannot be the only solution. I cannot see any reason why the kind of close control you want cannot be achieved, whilst allowing the greater flexibility which is needed for the development cycle and to support agile implementation.

Regards,
Ian

---

## Post #37 by @system

Hi Ian,

> Comments inline.

Very good. What I propose is to facilitate this process by adding an MD5 to the archetype, so that the archetype itself can prove it is the one you want. It is a safety measure.

>

I am, like you, of the opinion that archetypes should be revised if necessary. I am just saying that you should not use a revised archetype to interpret data which were stored prior to that revision. There is no need to do that; it is only a risk.

>

No, I do not agree with that. You should always use the best archetype to store data. And if the revised one is better (which is mostly the case), you should use that. I do not want to stand in the way of progress.

>

Yes, that is true, repairing typos will be more difficult.
But if you do not check the MD5, then you have to check and judge what the reason for the revision is. There is no solution without disadvantages. And again, there is no reason not to use the old archetype for interpreting data which were stored with that old archetype. Maybe, if the typo is so disturbing that you don't want to use the old archetype anymore, you can also store the data again with the new archetype.

Besides that, typos are in the ontology section, not in the definition (which is syntactically checked). That is why I suggested separate MD5 checks: one for the ontology section for each language, and one for the definition. So, if a typo is found and the MD5 of the ontology for a specific language does not match, but the archetype indicates by other metadata that it is a revision, then it can safely be used, also for data stored with the unrevised version.

But again, there is no need to do that, because using the original archetype is always safe. We must accept that history contains things which are later considered wrong. The data were stored in the context of the archetype as it was at that time.

A good idea would be to mention the revision change in the metadata, in a special section.

Bert

---

**Canonical:** https://discourse.openehr.org/t/openehr-clinical-digest-vol-11-issue-3/15224
**Original content:** https://discourse.openehr.org/t/openehr-clinical-digest-vol-11-issue-3/15224