Another meta-data requirement from CIMI

In CIMI, there are now some thousands of archetypes, 90% of them converted from Intermountain CEMs. We can start converting openEHR archetypes to CIMI form as well, for contribution to CIMI. To provide traceability, we are probably going to need a new meta-data item where some information about model conversion/import can be represented. In the current CIMI generated archetypes (not the reference ones), it could be information like:

"converted with IHCModelConverter v3.134.0.78, on 12-10-2014 at Intermountain Healthcare, UT"

or

" converted with ADL Workbench v2.0.5.2345, on 03-11-2014T12:05:00 for openEHR Foundation"

or similar, so that conversion errors can be traced to tools, and also to simply indicate that the archetype was machine converted. Note that there is already an 'is_generated' flag in an archetype. When an archetype is imported this is True; if CIMI authors later start manually modifying it, it is set to False. That way you can tell whether the archetype is still purely the product of an import or not. So the meta-data for this purpose might be something like:

     import_details = <
         original_source = <"Intermountain model xyz available at <URL>">
         time = <2014-10-12T12:44:00>
         method = <"IHCModelConverter v3.134.0.78">
         other_details = < > -- tagged values
     >
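
To make the shape of this concrete, here is a minimal Python sketch of the same structure. The class name `ImportDetails` and the `to_text` method are hypothetical, and the rendering only mimics the layout above; it is not validated against any ADL/ODIN parser.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict


@dataclass
class ImportDetails:
    """Hypothetical in-memory form of the proposed import_details section."""
    original_source: str
    time: datetime
    method: str
    other_details: Dict[str, str] = field(default_factory=dict)  # tagged values

    def to_text(self) -> str:
        """Render roughly in the layout used in the post (not validated syntax)."""
        lines = [
            "import_details = <",
            f'    original_source = <"{self.original_source}">',
            f"    time = <{self.time.isoformat(timespec='seconds')}>",
            f'    method = <"{self.method}">',
        ]
        if self.other_details:
            lines.append("    other_details = <")
            for key, value in sorted(self.other_details.items()):
                lines.append(f'        ["{key}"] = <"{value}">')
            lines.append("    >")
        lines.append(">")
        return "\n".join(lines)


details = ImportDetails(
    original_source="Intermountain model xyz",
    time=datetime(2014, 10, 12, 12, 44, 0),
    method="IHCModelConverter v3.134.0.78",
)
print(details.to_text())
```

Fixed property names for source, time and method, with only the leftovers going into the free-form tagged values, is the point of having a dedicated section.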

thoughts on this?

- thomas

As you say, this information should be somehow related to the “is_generated” flag. But if we consider that the flag is set to false once a human user reviews the archetype, then I don’t think it is needed at all. What we would need instead is a good practice stating that an automatically built archetype must be in draft state, pending further validation, like any other draft.

In any case, adding the information that you propose about the original source or the import method seems very useful. I am not sure whether a new metadata section is needed, or whether this could be treated as reference information or just as other details. A new section makes it easier to trace automatically, as you say.

David

Ah - but consider the situation in which the generation step is done multiple times, over a period of time. I was in this situation with my internal 1.4 => 1.5 (now => 2.0) generator, where it took some time to get the converter right. And Patrick Langford is iteratively getting the Intermountain converter right over a period of some months. The ADL WB always looks at that flag to know what to do. If you right click on an archetype in the left side explorer, and do ‘Edit’, the GUI editor (alpha for the moment, but functionally the same concept as the LinkEHR editor) starts. If the user actually makes any changes and commits them, the AWB removes the is_generated flag. Then a later round of import generation can look at it, and not overwrite this particular archetype, and instead generate a warning (or it could try to do a merge, or..). So I think it’s needed.

Well, that depends on whether we think an import step always creates a ‘draft’ archetype. It might be that the source model, say an Intermountain CEM, is actually in an ‘approved’ state, and Intermountain (and maybe even CIMI) don’t intend to do anything further with it after import. Although this is not likely to be the most common case, I don’t think we can presuppose that the lifecycle is rigidly restarted because of an import step.

I’m not too worried either way, but more inclined to add new meta-data sections if we think we know what they look like, since it reduces interop errors - there’s no choice about the property names etc, whereas it’s always a risk to hope that everyone will get the other_details section tags identical. - thomas
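
The repeated-import behaviour described above (a committed manual edit clears the flag, and a later import round skips that archetype with a warning) could be sketched roughly as follows. This is only an illustration in Python, not how the ADL Workbench implements it; the `Archetype` class and the function names `commit_manual_edit` and `reimport` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Archetype:
    archetype_id: str
    is_generated: bool  # True while the archetype is purely tool output
    body: str


def commit_manual_edit(archetype: Archetype, new_body: str) -> None:
    """A committed human change makes this copy the master: clear the flag."""
    if new_body != archetype.body:
        archetype.body = new_body
        archetype.is_generated = False


def reimport(existing: Dict[str, Archetype], generated: List[Archetype]) -> List[str]:
    """Overwrite only archetypes still flagged as generated; warn otherwise."""
    warnings = []
    for new in generated:
        old = existing.get(new.archetype_id)
        if old is not None and not old.is_generated:
            warnings.append(f"{new.archetype_id}: manually edited, not overwritten")
            continue
        existing[new.archetype_id] = new
    return warnings


# Two generated archetypes; "b" is then edited by hand before the next import round.
repo = {
    "a": Archetype("a", True, "v1"),
    "b": Archetype("b", True, "v1"),
}
commit_manual_edit(repo["b"], "v1 + hand edit")
warnings = reimport(repo, [Archetype("a", True, "v2"), Archetype("b", True, "v2")])
```

Here the second import round replaces "a" but leaves "b" alone and reports it; a merge could be attempted at the same point instead of skipping.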

Could "autogenerated" be a valid lifecycle state?

I don’t think so, but maybe we could use the release_candidate state instead of the draft one that I mentioned.

well I think either could be correct, depending on the circumstances. E.g. the latest openEHR/FHIR joint Adverse reaction archetype might go in at ‘release_candidate’, but many other openEHR ones wouldn’t. Even archetypes that are ‘published’ according to us could easily go in as drafts, since it may be that CIMI is the first environment where a really wide group of reviewers looks at it (and inevitably changes it). So.. I don’t see any clear rules just yet :wink: - thomas

Yes, I understand the process. What I tried to propose was that, if we add
that import information section, the generated flag could be part of it,
instead of having it as a standalone reserved word in the header (just an idea
to explore). And that's why I also support adding the import information as
a proper, standalone section.

We can’t assume that an approved/published Intermountain model (for example) automatically becomes a published archetype either. So we have a problem here: what should be the default lifecycle state of an auto-generated archetype?

I don’t think there is any relationship. I think the lifecycle state for the generated archetype has to be set by some human-input parameter that’s agreed beforehand, or it could be reset later in the tool. I don’t think this value is guessable by machines. - thomas

Well, the problem there is that ‘is_generated’ can be set for other reasons, e.g. by the ADL 1.4 => 2.x converter. This isn’t an import, it’s a format conversion - but the basic need is the same - to know when the generated one becomes the master, due to someone editing it. So at the moment I see ‘is_generated’ as a distinct thing from importing… - thomas

Hi Tom,

I'd like to see whether that conversion went 100% right or if the original and imported ones are no longer "semantically equivalent" after the conversion (e.g. due to certain elements not being imported, or perhaps RM level incompatibilities, or even IP issues etc.). I'm not sure what the best way to represent this is, but from an implementer's point of view I'd like to know whether, by using the imported model, my data are 100% compatible with systems holding data conforming to the original model.

Cheers,

-koray

Hi Koray,

yes it's something people would want to know, but I don't think we know enough yet about this kind of model conversion to tell us how to represent that. In any case, for now, I have implemented a free key/value String map structure in the ADL WB for this information, so at the very least, it can be represented using some tag like 'transcription_fidelity' or whatever. I don't know what the values of that should be. We'll have to feel our way. Maybe the solution is that the conversion tool generates some kind of string like '29/35 elements converted'...
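
As a rough illustration of that last idea, a converter could write the count into the free key/value map and a consumer could read it back. This Python sketch assumes the tag name 'transcription_fidelity' and the 'N/M elements converted' string format, neither of which is agreed; the function names are hypothetical.

```python
from typing import Dict, Optional


def fidelity_tag(converted: int, total: int) -> Dict[str, str]:
    """Build the key/value entry a conversion tool might emit (assumed format)."""
    if total <= 0 or not 0 <= converted <= total:
        raise ValueError("expected 0 <= converted <= total and total > 0")
    return {"transcription_fidelity": f"{converted}/{total} elements converted"}


def is_lossless(other_details: Dict[str, str]) -> Optional[bool]:
    """True if every element was converted, None if the tag is absent."""
    value = other_details.get("transcription_fidelity")
    if value is None:
        return None
    counts = value.split(" ", 1)[0]  # e.g. "29/35"
    converted, total = (int(part) for part in counts.split("/"))
    return converted == total


tags = fidelity_tag(29, 35)
print(tags["transcription_fidelity"])  # 29/35 elements converted
print(is_lossless(tags))               # False
```

A bare element count says nothing about which elements were dropped or why, so it would only be a first approximation of semantic equivalence.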

- thomas