New archetype process

So, just testing the water …

We have a potential client that is looking to invest in an existing medical device already in use across several countries. I’m suggesting that they create a data model to represent the maximal data collected by their device; clearly that could present itself as an openEHR archetype/template.

I appreciate it’s quite vague at this point, but typically how long does it take for an archetype to travel through the archetype development process? I am assuming quite a while.

As standards take time to mature/progress, is it common that organisations create their own “local” archetypes in the interim? If so, can anyone point me to any published material in this area?

Many thanks.


Hi, the water is just fine :slight_smile:

Interesting questions, but nearly impossible to answer. I’ll give it a try, though.

Simple answer: It depends! A straightforward, already well-known and well-documented scale or score will take only a few weeks, given the resources to make the archetype and run the review, and enough reviewers. Hirsutism scales (Clinical Knowledge Manager) were uploaded to the CKM on Oct 5th and published on Dec 22nd last year, and that turned out to be a not very simple score because there were variants in use. Karnofsky Performance Status (Clinical Knowledge Manager) was made in January 2021, but stayed in Draft until May 13th 2022, when it was put on review. Judging from the review comments so far, it is likely that it can be published in only 4-5 weeks.

More complex concepts, such as the family of Imaging exam archetypes, have taken years - not in continuous work, but on and off. The main thing is the resources: people with deep knowledge of building archetypes who also have insight into similar archetypes in order to reuse their patterns, access to domain experts to define (clinical) content, insight into available terminologies, and of course community input during the review process.

Medical devices are notorious for not being standardised. For example, ventilator manufacturers seem to deliberately name their ventilator settings differently, causing hard times for clinicians who have to know the differences (or likenesses) between ventilators from manufacturer X and Y.

So again, it depends on how complex the device is, and whether there is any universally accepted nomenclature or terminology for the settings/readings.

A way out is to make generalised archetypes, with common names for the elements representing settings or readings (if such names exist), and map whatever data flows from the device to these general nodes. Preferably, link them to a terminology either in the archetype, at template level or at run time. Examples: SNOMED CT, LOINC, DICOM, etc.
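To make the mapping idea a bit more concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: the vendor names, the vendor labels, the generic node names and the units are invented for illustration, and the terminology code is deliberately left empty because real codes have to be looked up in the terminology itself. It is not openEHR tooling, just the shape of the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GenericNode:
    """One element of a hypothetical generalised device archetype."""
    name: str
    unit: str
    terminology: Optional[str] = None  # e.g. "SNOMED CT" or "LOINC"
    code: Optional[str] = None         # deliberately empty: look up real codes


# Hypothetical generic nodes; names and units are illustrative only.
NODES = {
    "end_expiratory_pressure": GenericNode("end_expiratory_pressure", "cm[H2O]"),
    "respiratory_rate_setting": GenericNode("respiratory_rate_setting", "/min"),
}

# Hypothetical vendor labels: two manufacturers naming the same setting differently.
VENDOR_LABELS = {
    ("VendorX", "PEEP"): "end_expiratory_pressure",
    ("VendorY", "Exp. pressure"): "end_expiratory_pressure",
    ("VendorX", "RR set"): "respiratory_rate_setting",
    ("VendorY", "Freq"): "respiratory_rate_setting",
}


def map_reading(vendor: str, label: str, value: float) -> dict:
    """Translate one vendor-specific reading into the generic node it belongs to."""
    key = (vendor, label)
    if key not in VENDOR_LABELS:
        # Unmapped labels are surfaced rather than silently dropped.
        raise KeyError(f"No mapping for {vendor!r} label {label!r}")
    node = NODES[VENDOR_LABELS[key]]
    return {"node": node.name, "value": value, "unit": node.unit, "code": node.code}


if __name__ == "__main__":
    print(map_reading("VendorX", "PEEP", 5.0))
    print(map_reading("VendorY", "Exp. pressure", 5.0))
```

Both calls end up on the same generic node, which is the point: the differences between manufacturers live in the mapping table, not in the archetype.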

There have been attempts at making archetypes for “breathing”, including disposable devices, such as tubes, and ventilator readings. Again, on and off. I think “off” is the status for the work right now. It might be restarted if and when there is a concrete use case and a customer/vendor willing to spend hours in taking them further.

Surely work has been done locally, and quite certainly taken into production as local/national or vendor-specific archetypes. Not what we strive for, but a necessity due to time and/or resource limitations.

I’m not trying to discourage you and your potential client from starting to make archetypes for the medical device in question (quite the contrary), but you should know that you will need to find the resources to help you out with investigating the content, doing the archetyping and running the standardisation process.

To the others in the openEHR community: Please fill in, and shoot me down if I’m wrong!

Kind regards,
Vebjørn (in the Norwegian archetype team).


I see two kinds of processes: creating archetypes from scratch, or reusing and improving what’s available in the international CKM. If you know the kind of information that will be recorded from the devices, in minutes you can find out if there are already archetypes representing that information available in the international CKM or not. If you can’t find anything that matches, the process should start in an “unmanaged environment”, which is the same as saying “manage your archetypes locally” to cover all your needs first, then if you want you can share the archetypes in the international CKM. Though today you can ask for a space in the CKM to hold your “project”, so you kind of start your process in a “managed environment”.

Then when your archetypes reach a certain level of maturity/stability, you can share them, by moving them from the unmanaged environment to the managed environment, and ask the community to review and improve them.

In any case, you need first to comply with your own business requirements and then think about the rest.

The advantages I see of using archetypes are:

  1. the archetypes are your requirement documentation in terms of information recording/exchange, so there is no need of fat documents, just a repo with processable items;
  2. your client could take advantage of current openEHR tooling to speed things up, which is basically leveraging the openEHR information model and the dual modeling process (all standardized stuff) and the tools that implement the standard. This simplifies creating and managing models, storing and retrieving data (CDR, analytics, etc.) and sharing data (APIs).
  3. it could help with integrating other standards like FHIR, since you can map archetypes and templates to FHIR profiles (it’s not trivial but doable; last year I worked on exactly that for HiGHmed)

So for me it’s a good option that opens the doors to other things; even if those other things are not in the current scope, it is still a valid way of documenting data structures and defining constraints + bindings to terminologies.


Hi Richard,

Assuming one archetype:

  1. Design - varies enormously.
    An archetype can be built in minutes to directly represent data points where the knowledge is understood and clearly documented. However, it needs more time if you want to clearly document the metadata for the archetype so that others can understand it too. Note that we don’t try to replicate the academic textbooks re the clinical use, purpose, misuse of the archetype concept, but the metadata is focused on communicating the design principles within the archetype, i.e. how to optimise implementation of the archetype.
    Concepts (and multi-concept domains) where the concepts are not so well understood can be much more complex to design. @varntzen has identified some already. The recent imaging archetype publications are a good example - DICOM is well understood, but the imaging report is not so standardised, especially at the detail level. And the pregnancy/obstetric domain has been an evolving work in progress for some years now - the latest Xmind iteration of the pregnancy ecosystem plan is in CKM and we are building the new archetypes and making updates to the existing archetypes to align with this plan now.

  2. Reviews - varies, but is reasonably predictable with a skilled editor minimising unnecessary changes & extra reviews.
    — Review time - is set by the Editors. For our community of international volunteers we typically run reviews over 2 weeks. But this can be lengthened over holiday periods or shortened if there are specific deadlines or (theoretically for a funded CKM) there is a focused, incentivised (even paid) community ready and willing to respond.
    — Review numbers - As the content of the scores and scales is not contentious, a skilled archetype editor can present their archetype for review and the content published with only one review round.
    More complex ones can take 2-4 review rounds. It is very unusual to require more review rounds than that. More than 2 review rounds can occur when inexperienced editors do not present the content in an optimal format or when the concept is ‘fuzzy’. Examples of fuzzy concepts are the current family of screening questionnaires - it has taken a while for us to design a pattern that works and there are 5 currently undergoing simultaneous reviews for what appears to be the final time (see Clinical Knowledge Manager).

Publication occurs when the Editors are content that there is effective consensus by reviewers - that it is fit for use, even if not yet the notional maximal data set.

So the simplest new archetype could be built, reviewed and published within days if it was operating in a well resourced and controlled environment.
More complex ones in the same environment could be built, reviewed and published within weeks.

And remember that many archetypes can be reviewed simultaneously - the rate-limiting step is reviewer burnout. So, in this ideal, well resourced CKM environment tens of archetypes could potentially be reviewed within months.
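As a rough back-of-envelope for the figures above, the sketch below turns "rounds times weeks per round" into calendar time. It assumes review rounds run back to back with no editing time in between, and it ignores design time and reviewer burnout entirely, so treat the numbers as a lower bound rather than a forecast. The function names are made up for this example.

```python
from typing import Iterable


def review_weeks(rounds: int, weeks_per_round: float = 2.0) -> float:
    """Calendar weeks for one archetype, assuming back-to-back review rounds."""
    if rounds < 1:
        raise ValueError("an archetype needs at least one review round")
    return rounds * weeks_per_round


def batch_weeks(rounds_per_archetype: Iterable[int],
                weeks_per_round: float = 2.0) -> float:
    """Archetypes reviewed simultaneously: calendar time is set by the slowest."""
    return max(review_weeks(r, weeks_per_round) for r in rounds_per_archetype)


if __name__ == "__main__":
    print(review_weeks(1))          # a score or scale with one round: 2.0
    print(review_weeks(4))          # a complex concept with four rounds: 8.0
    print(batch_weeks([1, 2, 4]))   # three archetypes in parallel: 8.0
```

Shortening `weeks_per_round` models the funded, incentivised community mentioned above; the parallel case shows why many archetypes can share the same months.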

The reality, to date, has generally been not well resourced. And that has given us the luxury of time, so that the experienced international modellers now have a repertoire of modelling patterns that are tried and tested, and we have a growing library of published archetypes that are the living evidence of those patterns.

I suspect that with device-related data, the data set would already be quite well documented. Depending on the device, maybe only one archetype required, or maybe more than one including reuse in a template. We can do simple template reviews as well.

Hope that helps




Thanks Pablo - very helpful. In this particular case, the client is not too interested in persistence or openEHR APIs (they want to use FHIR). The value statement is around the well-defined knowledge model of the data that the device captures (which is quite simplistic, thankfully!).
I’m not sure the timelines will in fact fit with our potential engagement with the client so that itself might rule this out. That said, it might be interesting to progress this as a learning exercise myself.


Thanks, Heather - plenty to digest there. I’m not convinced that the client (they are relatively small) would immediately see the value of the effort required to build a high quality archetype that would satisfy the quality criteria of the international community.
They themselves are not too interested in the EHR (even though their customers probably are).

Food for thought… I’ll see what the appetite is

Is it a multifunction device, or does it just have a single channel? Device data and integration is covered pretty well in ISO/IEEE 11073 and the HL7 Devices group have published a detailed FHIR-IG for point-of-care devices and are working on one for personal devices. When thinking about the ‘maximal data’, it helps to consider the device metrics (see FHIR device resource) separate from the data that the device transmits. More recent challenges being addressed are how device alerts are handled. Saying all that, I think there is a gap that needs filling in openEHR for persisting physiological data from devices, beyond simple BP, heart rate, pulse and SpO2. We’re hoping to build a model of a medical device using a Raspberry Pi that we can use to simulate the transmission and persistence of the physiological data in a clinical record. It’s at an early stage though.
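To illustrate keeping the description of the device apart from the data it transmits, here is a heavily simplified sketch using plain Python dictionaries shaped like FHIR resources. The device id, manufacturer and values are invented, the helper names are hypothetical, and the output has not been validated against the point-of-care device IG or any profile; only the LOINC code for heart rate and the UCUM unit are real.

```python
import json


def device_description(device_id: str, manufacturer: str) -> dict:
    """Who or what produced the data (simplified; real Device resources carry more)."""
    return {"resourceType": "Device", "id": device_id, "manufacturer": manufacturer}


def heart_rate_observation(device_id: str, beats_per_minute: float, when: str) -> dict:
    """One transmitted reading, pointing back at the device that produced it."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [
                {"system": "http://loinc.org", "code": "8867-4", "display": "Heart rate"}
            ]
        },
        "effectiveDateTime": when,
        "valueQuantity": {
            "value": beats_per_minute,
            "unit": "beats/minute",
            "system": "http://unitsofmeasure.org",
            "code": "/min",
        },
        # The reading references the device; it does not embed the device details.
        "device": {"reference": f"Device/{device_id}"},
    }


if __name__ == "__main__":
    device = device_description("example-device-1", "Example Manufacturer")
    reading = heart_rate_observation(device["id"], 72, "2022-05-20T10:15:00Z")
    print(json.dumps([device, reading], indent=2))
```

The same split applies when modelling on the openEHR side: what the device is and what the device measured are separate concerns.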


In openEHR a maximal data set is the main definition of an archetype and how the model behind it should be thought of. Basically it all starts with a custom/narrow data requirement (a minimal data set). The clinical modeler should then understand the concept behind that requirement and consider all possible scenarios for that concept, then create a model based on all those scenarios, with one specific concept per archetype. This means that if more than one concept comes up in the analysis, another archetype should be created. Each individual archetype should be designed with all usage scenarios in mind, so any archetype ends up being a maximal data set.

Note “all possible scenarios” in general include clinical and secondary uses of data (analytics, education, research, public health, epidemiology, etc.) and might include some technical requirements too (ways or patterns of data which simplify storage, querying or sharing the data).
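A small sketch of how a maximal data set relates to a narrow requirement: the archetype lists every element that any scenario might need, and a template for one use case picks a subset of it. This is plain Python for illustration only, not ADL and not any openEHR tool; the element names are hypothetical.

```python
# Hypothetical maximal data set for a single concept (one archetype).
ARCHETYPE_ELEMENTS = frozenset({
    "reading_value",
    "reading_time",
    "device_serial",
    "battery_level",
    "alert_state",
    "operator_comment",
})


def make_template(selected_elements) -> frozenset:
    """Constrain the archetype to one use case; a template may only remove, never add."""
    selected = frozenset(selected_elements)
    unknown = selected - ARCHETYPE_ELEMENTS
    if unknown:
        # Data outside the archetype signals a gap in the maximal data set
        # (or a different concept that deserves its own archetype).
        raise ValueError(f"not in the archetype: {sorted(unknown)}")
    return selected


if __name__ == "__main__":
    # The narrow requirement that started the modelling work.
    minimal = make_template({"reading_value", "reading_time"})
    print(sorted(minimal))
    # A secondary use (e.g. device fleet analytics) reuses the same archetype.
    analytics = make_template({"reading_time", "device_serial", "battery_level"})
    print(sorted(analytics))
```

Both use cases are served by one archetype, which is what designing for all scenarios buys you.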

Historically the HL7 community has had a different approach to information modeling than the openEHR clinical modeling approach; I’m not saying it’s better or worse, it’s just different. That is why I clarify what a maximal data set is (in this context and in my experience).