# The good, the bad and the "Wat?" of current simplified FLAT/SimSDT openEHR exchange format

**Category:** [ITS](https://discourse.openehr.org/c/its/41)
**Created:** 2023-04-09 06:42 UTC
**Views:** 1682
**Replies:** 45
**URL:** https://discourse.openehr.org/t/the-good-the-bad-and-the-wat-of-current-simplified-flat-simsdt-openehr-exchange-format/3819

---

## Post #1 by @erik.sundvall

Hi!

At Karolinska we are a bit confused about how to format/use some things in different forms of the FLAT/SimSDT format. We are testing at least three CDR products that are supposed to support it and at least two (non-openEHR-based) systems that try to use it correctly for exporting/converting form data.

I'm starting this thread to collect some questions (and we hope also answers) regarding confusing things, including some that [remind me of Gary Bernhardt's "Wat?" short speech at CodeMash 2012](https://www.destroyallsoftware.com/talks/wat).

The related discussion threads and documentation we have looked at so far when trying to reduce confusion are:

* EHRbase's extensive explanations of the FLAT format in chapters 2.5.2 &
  * https://ehrbase.readthedocs.io/en/latest/02_getting_started/05_load_data/index.html?highlight=flat#flat-format
  * https://ehrbase.readthedocs.io/en/latest/09_flat/index.html
* The unfinished specification documents with maturity status "DEVELOPMENT"
  * https://specifications-test.openehr.org/releases/ITS-REST/latest/simplified_data_template.html
  * https://specifications.openehr.org/releases/SM/latest/simplified_im_b.html
  * https://specifications.openehr.org/releases/SM/latest/serial_data_formats.html
* Discussion threads:
  * https://discourse.openehr.org/t/understanding-flat-composition-json/1720
  * ...

There are many really **good** things with the key+value approach of the simSDT/FLAT (and the less used simNC) approaches.
Combined with the JSON based "web template" (as explained in 2.5 of EHRbase docs linked above) the approach has in a very short time made it possible to use openEHR templates as a base for configurable forms in existing non-openEHR systems (like Sectra's IDS7 and Omniq's Alltid Öppet). Some years ago we also investigated possible use in major non-openEHR Swedish EHR systems, see http://dx.doi.org/10.3233/SHTI190645

Feel free to add posts below for each kind of issue/confusion, I'll try to start with some regarding "context", category and identifiers shortly. Also feel free to edit this post to add more references etc. (it's a "wiki" post).

Updates from 13 April 2023 and onwards below:

**FINDINGS** (described in detail further down in this thread)

* The archetype authoring tool Archetype Designer had an error that added pointless extra "at-codes" at nodes that usually never have at-codes. That error caused seriously confusing results and errors in downstream tools and pipelines, including FLAT/SimSDT.
* The errors went undetected for quite a while and caused the CKM-published archetype [self_reported_data.v1](https://ckm.openehr.org/ckm/archetypes/1013.1.6343/adl) to contain the errors. (Some draft archetypes also contain the errors.)
* [self_reported_data.v1](https://ckm.openehr.org/ckm/archetypes/1013.1.6343/adl) was the root archetype of the template that confused at least us at Karolinska when using the FLAT format with solutions from Cambio, EHRbase, Better and Omniq.
* The error has been recognised by Better and will be fixed in an upcoming version of Archetype Designer. An updated 'fixed' version of the Composition self_reported_data has been submitted as a [change request](https://ckm.openehr.org/ckm/archetypes/1013.1.6343/changerequests) to the International CKM. There appear to be 4 other composition archetypes affected - all unpublished.
* There are many other somewhat confusing things (depending on perspective) mixed with the good things of the FLAT/simSDT format and "simplifications" - these things are also sometimes discussed in the thread below.

---

## Post #2 by @erik.sundvall

**Confusing subject #1: CONTEXT, CTX etc**

Regarding `context`, `ctx` and `event_context` it gets a bit confusing, especially when you come from (a fairly natural start of) having looked only at the RM spec (for example fig 14 below) and have not yet discovered the https://specifications.openehr.org/releases/SM/latest/simplified_im_b.htm reverse-engineered object model spec of the simplified formats ("FLAT" and "STRUCTURED") - that e.g. figures 6 and 7 below are copied from.

**.../ctx/....**

If we understand the explanations in [simplified_im_b](https://specifications.openehr.org/releases/SM/latest/simplified_im_b.htm) and [a related discussion](https://discourse.openehr.org/t/understanding-flat-composition-json/1720) correctly, then the "ctx" object seems to provide a possibility to set a mixed bag of defaults that will be (re)used in *several parts* of a COMPOSITION if not provided in other input. Note that "ctx" is a flat map of variables, NOT a mapping to any particular RM context-related object.

**.../context/...**

Examples in the spec having a ".../context/..." are pointing to partially different things depending on if we're talking about the (no longer actively developed?) "simNC" example...

```
"/context/health_care_facility|name":"Northumbria Community NHS",
"/context/health_care_facility|identifier":"999999-345",
```

...that points to the actual EVENT_CONTEXT object of the canonical RM model (as shown in the UML diagram in figure 14 below) - or if we are talking about the "simSDT" (now actively used by both Better and EHRbase) example...
```
"laboratory_order/context/_health_care_facility|id": "999999-345",
"laboratory_order/context/_health_care_facility|id_scheme": "2.16.840.1.113883.2.1.4.3",
"laboratory_order/context/_health_care_facility|id_namespace": "NHS-UK",
"laboratory_order/context/_health_care_facility|name": "Northumbria Community NHS",
```

...that actually points to the simplified S_EVENT_CONTEXT of the [simplified_im_b](https://specifications.openehr.org/releases/SM/latest/simplified_im_b.htm) model (see figure 6 below).

This really confused us, since in the real canonical RM the ways of setting an identifier (or several identifiers) for health_care_facility using PARTY_IDENTIFIED...

![image|633x405, 75%](upload://yeXSQvk7F9aYRdeBbAScZBOS64L.png)

...should, according to https://specifications.openehr.org/releases/RM/latest/common.html#_party_identified_class allow/force us to add a list of several identifiers ( **identifiers** : `List`), not just a single identifier that looks like it is attached directly to the (S_)PARTY_IDENTIFIED object. We do have use cases where it would be convenient to add more than one identifier to (S_)PARTY_IDENTIFIED objects.

So here come the actual questions:

**Is it not possible (or just undocumented how) to add more than one identifier (e.g. a list of identifiers) to e.g. health_care_facility (and other S_PARTY_IDENTIFIED objects) using simSDT/FLAT format when submitting a COMPOSITION? Is it also impossible to use the external_ref attribute (that points to a PARTY_REF)?**

**.../event_context/...**

Adding to the confusion is that when you use the "OUTPUT" variant of FLAT/simSDT the .../context/...
object is nowhere to be seen anymore, instead you get a structure like

```
"chemoform-mba.v5/event_context/start_time": "2023-04-04T00:35:42.71+02:00",
"chemoform-mba.v5/event_context/setting|code": "238",
"chemoform-mba.v5/event_context/setting|value": "other care",
"chemoform-mba.v5/event_context/setting|terminology": "openehr",
```

(The example comes from our experiments available in https://github.com/regionstockholm/CKM-mirror-via-modellbibliotek/releases/tag/ChemoForm-MBA.v5.rc8 )

And this would probably be understandable if you knew that in "OUTPUT" mode we always get something looking more like the RM for EVENT_CONTEXT, but for "INPUT" mode you should use the "ctx" or "context" way...

...BUT even when generating examples in "INPUT" mode we get a mix of .../ctx/... and .../event_context/...

```
"ctx/health_care_facility|name": "Hospital",
"ctx/health_care_facility|id": "9091",
"chemoform-mba.v5/event_context/vårdenhet/namn": "Namn 74",
"chemoform-mba.v5/event_context/vårdenhet/identifierare:0": "79f7d19f-cc7c-4f95-9d9e-6ad4499a4d58",
"chemoform-mba.v5/event_context/vårdenhet/identifierare:0|issuer": "Issuer",
"chemoform-mba.v5/event_context/vårdenhet/identifierare:0|assigner": "Assigner",
"chemoform-mba.v5/event_context/vårdenhet/identifierare:0|type": "Prescription"
```

**[Wat?](https://www.destroyallsoftware.com/talks/wat)**

Ah, maybe it's a way of enabling use of the `other_context` attribute of (S_)EVENT_CONTEXT - but why not make the path less confusing: .../event_context/other_context/...

After this we thought - "maybe we can use the path .../event_context/health_care_facility/... also for input purposes" to e.g. add a list of identifiers to health_care_facility, but then we get errors from the CDRs when trying to commit a COMPOSITION. (Thus the question in bold above...) Or maybe it would work better if not exporting with Swedish as the primary template language?

Phew...
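One way to spot the kind of mixing shown above before committing is to bucket the keys of a FLAT payload by the context-related segment they use. This is only a minimal Python sketch, assuming the payload is already loaded as a plain dict of path to value; `classify_key` and `group_by_context_style` are hypothetical helpers, not part of any CDR or SDK.

```python
from collections import defaultdict

CONTEXT_SEGMENTS = ("context", "event_context")


def classify_key(key):
    """Return "ctx", "context", "event_context" or "other" for a FLAT key."""
    segments = key.split("/")
    if segments[0] == "ctx":
        return "ctx"
    # Skip the template id (or the empty segment before a leading slash).
    for segment in segments[1:]:
        # Drop any attribute suffix (|name) and index (:0) before comparing.
        name = segment.split("|")[0].split(":")[0]
        if name in CONTEXT_SEGMENTS:
            return name
    return "other"


def group_by_context_style(flat):
    """Group FLAT keys by the context style they use."""
    groups = defaultdict(list)
    for key in flat:
        groups[classify_key(key)].append(key)
    return dict(groups)


# Sample keys taken from the examples in this post.
sample = {
    "ctx/health_care_facility|name": "Hospital",
    "ctx/health_care_facility|id": "9091",
    "chemoform-mba.v5/event_context/start_time": "2023-04-04T00:35:42.71+02:00",
    "chemoform-mba.v5/event_context/setting|code": "238",
}
groups = group_by_context_style(sample)
# More than one context style in the same payload is worth a closer look.
mixed = len({"ctx", "context", "event_context"} & set(groups)) > 1
print(sorted(groups), "mixed:", mixed)
```

A check like this does not say which style a given CDR accepts; it only makes the mix visible so it can be compared against the web template.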
**IMAGES referenced above**

![image|690x454](upload://b5XDpYmeYYNI3nXKPZ7Vds4xPek.png)
Source: https://specifications.openehr.org/releases/RM/latest/ehr.html#_overview_3 figure 14

![image|600x500](upload://hlhFxjlSVxrYo4erG6iZuG49y3Z.png)
Source: https://specifications.openehr.org/releases/SM/latest/simplified_im_b.html#_application_context_model figure 6

![image|690x426](upload://iSscedCq1BlnoUzTgJVbTDcl9Fo.png)
Source: https://specifications.openehr.org/releases/SM/latest/simplified_im_b.html#_application_context_model figure 7

---

## Post #3 by @pablo

Hi Erik, just out of curiosity, do you see any advantages in using FLAT and similar formats for openEHR data exchange vs. using the official canonical JSON format?

---

## Post #4 by @erik.sundvall

[quote="pablo, post:3, topic:3819"]
do you see any advantages on using FLAT and similar formats for openEHR data exchange vs. using the official canonical JSON format
[/quote]

No, not between two openEHR-compliant systems. Template-specific formats can become a maintenance nightmare outside narrow use cases.

---

## Post #5 by @sebastian.iancu

[quote="erik.sundvall, post:2, topic:3819"]
…should, according to [Common Information Model](https://specifications.openehr.org/releases/RM/latest/common.html#_party_identified_class) allow/force us to add a list of several identifiers ( **identifiers** : `List`), not just a single identifier that looks like it is attached directly to the (S_)PARTY_IDENTIFIED object. We do have use cases where it would be convenient to add more than one identifier to (S_)PARTY_IDENTIFIED objects.
[/quote]

Yep ... these are shortcomings of the flat format. As the name suggests, these formats flatten the "structure", so the more complex the tree structure is, the more complex its serialization gets. At some point you may need to evaluate whether the canonical format is a better option for your purpose.
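To make the flattening point concrete, here is a minimal Python sketch that turns a nested structure into path-style keys. It is an illustration only, not how any CDR implements FLAT/simSDT: the `flatten` helper, the `/` separator and the `name:index` list convention are assumptions, and real paths are derived from the web template (including `|` attribute suffixes, which this sketch does not model).

```python
def flatten(node, prefix=""):
    """Flatten nested dicts/lists into a {path: value} dict (illustrative only)."""
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}/{key}" if prefix else key
            flat.update(flatten(value, path))
    elif isinstance(node, list):
        # Each list element gets an index suffix, e.g. "identifiers:0".
        for index, item in enumerate(node):
            flat.update(flatten(item, f"{prefix}:{index}"))
    else:
        flat[prefix] = node
    return flat


# Hand-made sample data; every extra nesting level or list adds path segments.
facility = {
    "context": {
        "health_care_facility": {
            "name": "Facility name",
            "identifiers": [
                {"id": "999999-345", "type": "type-0"},
                {"id": "123-345", "type": "type-1"},
            ],
        }
    }
}
flat = flatten(facility)
for path, value in flat.items():
    print(path, "=", value)
```

The deeper and more list-heavy the tree, the longer and more numerous the keys become, which is the trade-off being described here.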
[quote="erik.sundvall, post:2, topic:3819"]
Is it not possible (or just undocumented how) to add more than one identifier (e.g. a list of identifiers) to e.g. health_care_facility (and other S_PARTY_IDENTIFIED objects) using simSDT/FLAT format when submitting a COMPOSITION? Is it also impossible to use the external_ref attribute (that points to a PARTY_REF)?
[/quote]

I have unfortunately no experience with implementations, so I cannot answer this directly - but it seems to me very important to tackle this in the SEC.

Just a guess, have you tried something like:

```
"/context/health_care_facility|name":"Facility name",
"/context/health_care_facility|identifiers:0":"999999-345",
"/context/health_care_facility|identifiers:0|issuer":"issuer-0",
"/context/health_care_facility|identifiers:0|type":"type-0",
"/context/health_care_facility|identifiers:1":"123-345",
"/context/health_care_facility|identifiers:1|issuer":"issuer-1",
"/context/health_care_facility|identifiers:1|type":"type-1",
```

which would be the equivalent of:

```
{
  "context": {
    "health_care_facility": {
      "_type": "PARTY_IDENTIFIED",
      "name": "Facility name",
      "identifiers": [
        { "id": "999999-345", "issuer": "issuer-0", "type": "type-0" },
        { "id": "123-345", "issuer": "issuer-1", "type": "type-1" }
      ]
    }
  }
}
```

---

## Post #6 by @joshua.grisham

Hi! I work with @erik.sundvall and am also taking a look at the same template in the same environments. I just wanted to add an observation which might be helpful (and/or maybe is driving the differences and thus the point of some confusion?)
In a previous (and much simpler) version of the template, it worked for us quite similarly to how you mentioned, except that we had to prefix `health_care_facility` with an underscore as it is an optional node (minOccurs=0), and then use both the optional prefix (underscore) and the "singular" version of the object name ("_identifier:n" instead of "_identifiers:n"), sort of like this:

```
"{webtemplate-top-level-id}/context/_health_care_facility|name":"Facility name",
"{webtemplate-top-level-id}/context/_health_care_facility/_identifier:0|id":"999999-345",
"{webtemplate-top-level-id}/context/_health_care_facility/_identifier:0|type":"type-0",
```

When I look at the exported Web Template (JSON) file from this version of the template, I can find the `rmType=EVENT_CONTEXT` node listed only once, with `id=context`, with min/max occurs both 1, and it includes as a "child" an extra item we have added under `/context/other_context[at0001]`.

But now in a newer (and more complicated) version of this template, when I look at the exported Web Template file, there are 2 different nodes with `rmType=EVENT_CONTEXT`:

* The first instance this time has `id=event_context` and `aqlPath=/context[at0002]`, min/max of 1, and has our custom other_context cluster plus `start_time` and `setting` for some reason? (with both min/max as 1)
* The second instance has `id=context` and `aqlPath=/context`, but this one has a minOccurs of 0, and only includes `start_time` and `setting` but not our additional item under other_context.

I'm not sure exactly why we seem to be getting some kind of "specialized" node of context/"Event Context" (as at0002) -- Erik modeled this and I am about as far as you can get from a modeling expert :smiley:

But it seems like in order to send attributes as part of EVENT_CONTEXT we need to use this "custom" instance with `id=event_context` instead of what we assumed was the "default" one with `id=context`.

I guess from my perspective, the two questions that I have are:

1. 
Is there a good reason that 2 different instances of EVENT_CONTEXT should exist in the Web Template with different IDs (or is this maybe some kind of "bug")?
2. And/or if this is expected, how can we know which path should be used in order to actually populate the various values under EVENT_CONTEXT? (such as `location`, `participations`, `health_care_facility`, etc) (and ideally in a way that our tooling can detect this and we do not have to test and set this manually template-by-template)

---

## Post #7 by @joshua.grisham

Also, not sure if it is worth a quick mention, but we also see the same difference in the newer version of this template with `/category`:

* Before there was only `id=category` at `aqlPath=/category`
* Now there is both `id=category` at `aqlPath=/category` and a new `id=coded_text` at `aqlPath=/category[at0001]`, plus in some of our implementations it no longer works to create a composition using `{id}/category|code` etc; instead we now have to specify against the "new" customized version (at0001) of category using `{id}/coded_text|code`

Not sure if this is the same "issue"/interpretation but the smell-check tells me that it could at least be in the same family :)

---

## Post #8 by @sebastian.iancu

Is this the way both the EHRbase and Better platforms behave?

---

## Post #9 by @joshua.grisham

First day back from a short vacation for me, I will try to take some time this afternoon and see if I can test a bit more to be sure.

One thing I have now noticed is that it seems from Better's Archetype Designer we are now receiving a node_id (this at0002 and at0001) on these when we export (both as Web Template as well as in OPT) where we did not seem to before. So I wonder if this issue is "new" based on a change in Archetype Designer (which I think we were also involved in one recently :smile: )?

Before, for example in OPT, the export looked like this:

```
... context ... EVENT_CONTEXT ... ...
```

With special attention that `node_id` is empty.
And now the same template exports like this:

```
... context ... EVENT_CONTEXT ... at0002 ...
```

... noting that EVENT_CONTEXT now has this `node_id` of `at0002`.

Interestingly, if I remove this value so that `node_id` is empty again when I import the OPT template into EHRbase, it seems to "work" as before (when reading out the Web Template from EHRbase, we get only `/context` with `id=context` and not a 2nd one, and writing a composition works our "old way" again using `/context`).

---

## Post #10 by @pablo

[quote="erik.sundvall, post:4, topic:3819, full:true"]
[quote="pablo, post:3, topic:3819"]
do you see any advantages on using FLAT and similar formats for openEHR data exchange vs. using the official canonical JSON format
[/quote]
No, not between two openEHR-compliant systems. Template-specific formats can become a maintenance nightmare outside narrow use cases.
[/quote]

I'm of the same opinion, but hadn't thought about the maintainability aspect.

I think flat was proposed to simplify something for developers, but I think it's as difficult to implement flat as it is to implement the official canonical formats. They still need to understand what they are doing, what those paths mean, learn how to transform those paths back and forth to an RM instance, etc. If they don't know what they are doing and are just putting values in a place somebody told them to put the values, it is the same to put the values in certain parts of a flat or a canonical format.

Until now nobody has shown a real argument why we need these extra formats as part of the specs, but I guess implementers can do whatever they think they need, and that's OK, they can even coordinate between vendors to share the same formats, though having too many official options is a little messy.

---

## Post #11 by @birger.haarbrandt

Hi, without going into all the details, just my 2 cents from the EHRbase perspective (which the developers can further elaborate on).
Regarding the use of WebTemplate and FLAT:

- I very much disagree with the statement that using FLAT is as difficult as using the RM directly. Our experience is that it lowers the learning curve and improves readability enormously.
- Any openEHR procurement I came across in the last 2 years asked for the WebTemplate and FLAT formats (maybe even also the STRUCTURED format)
- If we want systems to be interoperable and interchangeable, such formats need to be part of the specifications and this was the view of the SEC already some time ago.

On the topic itself:

- The implementation in EHRbase is basically reverse engineered from Better's implementation
- However, this thread is a good case for further work on the conformance framework to ensure that ALL the tiny bits and pieces are aligned
- There was/is awareness that the first version of the format is Better's approach which we implemented (and as far as I know Solit Clouds did the same). When discussing with Better and doing the implementation, we found some things that might be straightened out, but there was also a deliberate decision in the SEC to not let perfect be the enemy of good.
- For the use-case stated by @erik.sundvall: you can also try mixing FLAT with RAW (which actually should become canonical JSON when simSDT is used through the official REST API...) as shown in this example: https://github.com/ehrbase/openEHR_SDK/blob/develop/test-data/src/main/resources/composition/flat/simSDT/corona_with_raw.json
- This should allow you to use multiple identifiers without any doubts

---

## Post #12 by @ian.mcnicoll

Thanks Joshua,

There is/was something odd around the way that Web templates are handling context, which coincided, I think, with Archetype Designer adding these (I think redundant) atCodes on category and context.

@erik.sundvall - I'll look at your examples but I don't think this is related to or affected by the use of the ctx blocks.
Personally I always use/recommend using the OUTPUT format, as I found that the read and write formats being asymmetrical causes confusion.

@joshua.grisham I think you have documented correctly the support for PARTY_IDENTIFIED.identifiers

@erik.sundvall - The Simplified Info model in Figure 6 is incorrect in that both PARTY_PROXY.external_ref and PARTY_IDENTIFIED.identifier are supported.

```
".../_health_care_facility|name":"Facility name",
".../_health_care_facility/_identifier:0|id":"999999-345",
".../_health_care_facility/_identifier:0|type":"type-0",
```

which I get to commit in both EHRbase and Better, versus the PARTY_PROXY.external_ref

```
".../_health_care_facility|id": "999999-345",
".../_health_care_facility|id_scheme": "2.16.840.1.113883.2.1.4.3",
".../_health_care_facility|id_namespace": "NHS-UK",
".../_health_care_facility|name": "Northumbria Community NHS",
```

also working correctly in both EHRbase and Better.

So I think the main confusion remains around the use of 'context' and 'event_context'. There definitely was an issue but I'm now getting consistent use of .../context/... and /category when asking for example compositions, i.e. I don't see event_context appearing in paths.

What version of the Better CDR server are you using? My guess is that at some point adding the atCodes into the opt caused a snafu in Web templates that has since been corrected?

---

## Post #13 by @stefanspiska

Yes, this is due to https://discourse.openehr.org/t/composition-archetypes-are-generated-with-atcode-in-eventcontext/3202

So you need to remove this wrong atcode.

---

## Post #14 by @ian.mcnicoll

[quote="stefanspiska, post:13, topic:3819"]
Yes this is due to [Composition Archetypes are generated with atcode in EventContext ](https://discourse.openehr.org/t/composition-archetypes-are-generated-with-atcode-in-eventcontext/3202)
[/quote]

Thanks Stefan. That does exactly explain what has been happening.
Those redundant codes were only being added to **new** compositions created in AD, which then appeared in the .opt and were messing up the web template/example generation. This only affects fairly new compositions created in AD, from about 12 months ago.

I agree that the fix is to remove the atCodes from the composition and regenerate the .opts - I have reported this directly on the AD JIRA, as it should be fixed properly.

---

## Post #15 by @ian.mcnicoll

[quote="pablo, post:10, topic:3819"]
They still need to understand what they are doing, what those paths mean, learn how to transform those paths back and forth to a RM instance, etc.
[/quote]

That's not really true Pablo - the CDR handles transforming the short paths to and from the RM instance, nor do they have to understand what the paths mean, technically.

Like Birger, our experience has been that using the FLAT formats has been transformative in our ability to upskill new, non-specialist devs into our world. Not perfect by any means but a heck of a lot easier to work with, for newbies, than canonical. FLAT is particularly helpful when documenting integrations against source xpath.

I'm obviously very happy if people want to use canonical but so far all of our clients have successfully used FLAT for ? all of our projects.

But maybe we should argue the for/against case elsewhere and keep this thread for raising/resolving issues with FLAT.

@erik.sundvall - I wonder if it is worth documenting our 'findings' and solutions in a separate thread/channel/wiki. Already I think we have nailed down the issues you raised with Identifiers and context paths - it would be good to have these clearly identifiable to point others towards.

---

## Post #16 by @erik.sundvall

[quote="ian.mcnicoll, post:15, topic:3819"]
I wonder if it is worth documenting our ‘findings’ and solutions in a separate thread/channel/wiki.
[/quote]

The top post in this thread is a "wiki-post" so many active users (but not recently registered ones) should be able to edit it. I'll add a "findings" heading where we all can help briefly summing up findings. (I am not by a computer until Monday so I hope someone else can start to summarize.)

---

## Post #17 by @pablo

[quote="ian.mcnicoll, post:15, topic:3819"]
[quote="pablo, post:10, topic:3819"]
They still need to understand what they are doing, what those paths mean, learn how to transform those paths back and forth to a RM instance, etc.
[/quote]
That’s not really true Pablo - the CDR handles transforming the short paths to and from the RM instance, nor do they have to understand what the paths mean, technically.
[/quote]

That's the issue: if that is the case we are providing something to developers that don't know what they are doing! So putting a value in a path someone told them to use has the same technical complexity as putting a value in a specific place of a canonical JSON. In fact we have a tool that generates canonical instances with known tags in the places where values are, and developers can just replace the tags with values.

On the other hand, if developers use in-memory RM instances and put the values there, then it is a tool that generates the flat format, so the developer doesn't even know about it, which again has the same technical complexity as putting the values in an in-memory instance and then serializing to canonical JSON. If they do this, they actually need to understand the RM because they work with it in-memory.

The third option is that they generate the JSON from scratch, even generating the paths. In this case they surely need to know what they are doing since the paths in the flat format are based on the opt definitions. And again, if they can do this, they can do canonical JSON.

That is why I don't see any advantage in using flat other than making the messages a little lighter. Is that really worth it?
I mean, is implementing a specific architecture to generate, validate and process another format really worth it just to save a few bytes? Consider I'm talking as a developer!

[quote="ian.mcnicoll, post:15, topic:3819"]
Like Birger, our experience has been that using the FLAT formats has been transformative in our ability to upskill new, non-specialist devs into our world, Not perfect by any means but a heck of a lot easier to work with, for newbies, than canonical. FLAT is particularly helpful when documenting integrations against source xpath.
[/quote]

I think if that is needed it is because there is no formal training in place, which is key for any project. If we just throw developers at it, it won't work.

When I work with openEHR JSON I need to have the RM on one screen, the JSON schema on another screen and my code on the third screen. This is not magic, I don't know all the models from memory, and I'm not really smart, but with the right specifications as references when developing, there is no way this approach is complicated. It just takes time to do it right.

---

## Post #18 by @thomas.beale

Just catching up with this. I read through the thread (ok, not every single word ;) and have the following thoughts:

* **Attribute naming**: we should have a clear model basis for any attribute names - the context/event_context/ctx problem. Ideally we'd just use the ones from the RM, but it's no problem to define another derived model that contains shorter names. Such a model has to have a formal transform to the canonical one though.
* **Attribute single/plural names**: some people like using singular names for any use of an attribute, whether for a container or a single-valued attribute. The openEHR RM doesn't do this (it uses the old school naming approach). I'm unclear on the formal reasons to use singular naming only. There might be some but I don't see them yet.
* **Optional 'underscore' naming**: I didn't know that prepending an underscore to an attribute name indicated it was optional in the model. What is the purpose of doing this? It seems to be an attempt to convey model information in the data - is it to make validation easier somehow?

I am (was ;) the editor of the SDT, 'Simplified Model B' and SDF draft specs. These specs contain information from the various threads of development to do with path-based 'flat templates'. I'd like to help get these rationalised but we would need to get some agreement. I see these as follows:

* **Serial Data Formats ([SDF](https://specifications.openehr.org/releases/SM/latest/serial_data_formats.html))**: this was the most recent attempt to document serialised forms of data types and higher-level structures. This spec was purely reverse-engineered from reality. It contains 'EhrScape variants' in some places. I would have thought that progressing this to completion would be helpful and ?easy. Note that it is in the Service Model (SM) component, meaning that the serialisations would be usable for any kind of API, not just REST (might be wrong - it's easy to move).
* **Simplified Information Model B ([SIM-B](https://specifications.openehr.org/releases/SM/latest/simplified_im_b.html))**: this was an attempt to define a reduced information model that a) has a formal transform to the RM and b) can act as a direct model basis for shorter names and paths and so on. This model can be repurposed in any way that makes sense to achieve this goal, and I would have thought it was still useful for that purpose. But it can be removed if of no use as well.
* **Simplified Data Template ([SDT](https://specifications.openehr.org/releases/ITS-REST/latest/simplified_data_template.html))**: my original attempt to document what products were doing, with a design basis for formalising flat templates. I still think the design thinking is at least potentially relevant.
If not, we remove the spec; if it is, I suggest we merge its useful content with the SIM B spec, assuming that will be used in some form. This spec is currently in the ITS group.

I have certainly missed some details from current conversations on all this, so feel free to correct any of the above.

---

## Post #19 by @ian.mcnicoll

[quote="birger.haarbrandt, post:11, topic:3819"]
@borut.fabjan has just told me that this issue with new compositions is an error and will be fixed in AD.

---

## Post #20 by @ian.mcnicoll

> the context/event_context/ctx

This is a mixture of confusion on how the ctx directives work, and a bug/misinterpretation in how AD creates new compositions (being fixed).

[quote="thomas.beale, post:18, topic:3819"]
* **Optional ‘underscore’ naming**: I didn’t know that prepending an underscore to an attribute name indicated it was optional in the model. What is the purpose of doing this? It seems to be an attempt to convey model information in the data - is it to make validation easier somehow?
[/quote]

TBH I was unaware of this, I've certainly never noticed that in the documentation. I had assumed that the underscores were only there where there might be a risk of conflict with an identically named archetyped node.

---

## Post #21 by @ian.mcnicoll

[quote="ian.mcnicoll, post:20, topic:3819"]
TBH I was unaware of this, I’ve certainly never noticed that in the documentation. I had assumed that the underscores were only there where there might be a risk of conflict with an identically named archetyped nodes.
[/quote]

Digging through the docs, I can now see that the underscore = 'optional RM attribute' might well be the case.

---

## Post #22 by @thomas.beale

[quote="ian.mcnicoll, post:21, topic:3819"]
Digging through the docs, I can now see that the underscore = ‘optional RM attribute’ might well be the case
[/quote]

It doesn't sound like a good idea, for exactly the reason you mentioned just above - the usual interpretation of such names is that they are some special / meta / pseudo attribute...

---

## Post #23 by @joshua.grisham

We have looked into this a bit more today and had a few more observations (especially when we hit yet another problem, now with /context/other_context !)

Our theory is that at some level (generating a Web Template? or maybe in other places as well?) there is an interpretation happening where, at a minimum,

* "/category" is expected to have a DV_CODED_TEXT child without node_id specified
* "/context" is expected to have a child EVENT_CONTEXT without node_id specified
* "/context/other_context" is expected to have a child ITEM_TREE with node_id = at0001

It seems like this is generally the case for many of the different specialized Composition archetypes but not for all of them, and especially not for some of the "newer" ones that we are looking at.
For example, [openEHR-EHR-COMPOSITION.report-result.v1](https://ckm.openehr.org/ckm/archetypes/1013.1.1324/adl) seems to follow this pattern, but not [openEHR-EHR-COMPOSITION.self_reported_data.v1](https://ckm.openehr.org/ckm/archetypes/1013.1.6343/adl), which instead looks like this: * "/category" has a DV_CODED_TEXT child with node_id at0001 (instead of blank) * "/context" has a child EVENT_CONTEXT with a node_id at0002 (instead of blank) * "/context/other_context" has a child ITEM_TREE with node_id = at0003 (instead of at0001) And then when these kinds of differences happen, we don't seem to get the "normal" paths for these nodes in the Web Template, and then have this problem/confusion when we try to write compositions and build more standardized tooling. @thomas.beale Does it seem right/ok to have different node_ids for these different specialized Compositions (e.g. sometimes other_context can be at0002, at0003, etc) or does it seem more that all types of compositions should follow the same pattern when it comes to node_id for these nodes? And as a follow-up question, does it seem reasonable that we should be able to expect for example to "always" write to /context/ and /category (etc) in FLAT format, or that we should instead expect to get different paths for these in every new Web Template (depending on the template itself and its own node_ids)? --- ## Post #24 by @ian.mcnicoll Hi Joshua, I'm pretty sure (and confirmed with @borut.fabjan ) that this is simply down to a change made to AD when creating new compositions. The atCodes on /context and /category are not required, but are legal in ADL, but the main problem was that when the related terms were generated, they were added as 'Event Context' not 'context' , and as 'DV coded text' not as 'category'. In any case, I would argue that the web template generator should ignore any non-LOCATABLE name overrides, since these never find their way into the CDR data. 
> * “/context/other_context” has a child ITEM_TREE with node_id = at0003 (instead of at0001) This is correct and simply reflects that the archetype has had a slot constraint added. The ITEM_TREE atcodes will be inconsistent across archetypes and this is normal and expected. I don't think the specialisation is causing any issues here, and I'm pretty sure the confusion is wholly down to the combination of the redundant atCodes being added to 'context' and 'category', the incorrect terms that were being created, and finally the web template generation picking up these terms, when they should be ignored. --- ## Post #25 by @erik.sundvall [quote="joshua.grisham, post:23, topic:3819"] ...but not [openEHR-EHR-COMPOSITION.self_reported_data.v1](https://ckm.openehr.org/ckm/archetypes/1013.1.6343/adl), which instead... [/quote] Can somebody with enough CKM privileges (@sebastian.garde, @siljelb or somebody else?) then please patch and re-release that [self_reported_data.v1](https://ckm.openehr.org/ckm/archetypes/1013.1.6343/adl) archetype to remove those two extra at codes for /category and /context and perhaps somebody could also later scan through other COMPOSITION archetypes edited ~last year to see if the error has slipped into any other archetype. --- ## Post #26 by @ian.mcnicoll I have just uploaded a 'fixed' self_reported_data composition as a change request on CKM. That is the only published archetype on CKM that is affected as far as I can tell. There are 3 other non-published archetypes with the same issue. I think there is an open question on whether this is, or at least should be regarded as, a breaking change. On the one hand, strictly speaking, this is not a breaking change since the extra atCodes have no impact on run-time / persisted data, this is really an issue only for design-time tooling. OTOH, given the confusion potentially being caused in folks who use web templates and FLAT formats, there is an argument for bumping it up to v2. @sebastian.garde ?? 
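The scan across other COMPOSITION archetypes suggested in post #25 could be approximated with a plain text search. The following is only a minimal sketch, assuming ADL 1.4 source text in which a node id is written as `TYPE[atNNNN]`; the function name `find_redundant_at_codes` is hypothetical, and the check deliberately leaves `ITEM_TREE` alone because, as noted above, its atCode legitimately varies between archetypes.

```python
import re

# Assumption: ADL 1.4 text, where node ids appear as TYPE[atNNNN].
# DV_CODED_TEXT (under /category) and EVENT_CONTEXT (under /context) are
# not LOCATABLE, so an at-code on them is the redundant kind discussed here.
NODE_ID_ON_NON_LOCATABLE = re.compile(
    r"\b(DV_CODED_TEXT|EVENT_CONTEXT)\[(at[0-9.]+)\]"
)


def find_redundant_at_codes(adl_text: str) -> list[tuple[int, str, str]]:
    """Return (line number, RM type, at-code) for every suspicious node id."""
    findings = []
    for line_no, line in enumerate(adl_text.splitlines(), start=1):
        for match in NODE_ID_ON_NON_LOCATABLE.finditer(line):
            findings.append((line_no, match.group(1), match.group(2)))
    return findings
```

A hit only marks an archetype as worth opening and checking by hand; a regex cannot tell whether the at-code was added deliberately.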
--- ## Post #27 by @ian.mcnicoll The other compositions (all unpublished) are Obstetric history, Draft archetype [Internet]. openEHR Foundation, openEHR Clinical Knowledge Manager [cited: 2023-04-17]. Available from: https://ckm.openehr.org/ckm/archetypes/1013.1.1630 Advance care, Draft archetype [Internet]. openEHR Foundation, openEHR Clinical Knowledge Manager [cited: 2023-04-17]. Available from: https://ckm.openehr.org/ckm/archetypes/1013.1.5349 Certificate, Draft archetype [Internet]. openEHR Foundation, openEHR Clinical Knowledge Manager [cited: 2023-04-17]. Available from: https://ckm.openehr.org/ckm/archetypes/1013.1.5797 Disease surveillance, Draft archetype [Internet]. openEHR Foundation, openEHR Clinical Knowledge Manager [cited: 2023-04-17]. Available from: https://ckm.openehr.org/ckm/archetypes/1013.1.5723 Note that I had to edit the ADL outside Archetype Designer to make it 'stick'. --- ## Post #28 by @pablo [quote="joshua.grisham, post:23, topic:3819"] “/context/other_context” is expected to have a child ITEM_TREE with node_id = at0001 [/quote] That can't be a general rule, it simply doesn't happen in all archetypes. Options are: instead of ITEM_TREE, to have ITEM_TABLE, ITEM_LIST or ITEM_SINGLE. Instead of at0001 you can have whatever at code, like a specialization could have at0001.1, just to show one example. So that couldn't be what is expected for any data structure, that should be expected only for the structures that comply with that, based on the referenced template/archetype. --- ## Post #29 by @ian.mcnicoll That is true Pablo but we agreed (us modellers) many years ago to only use ITEM_TREE and the current AD does not give us the choice (by agreement). There are no CKM archetypes that have other structures now. As you say, the actual atCode is not significant. --- ## Post #30 by @pablo I know Ian, though 1. 
not everyone is using AD, in fact other modeling tools allow other structure types, and I can build my own modeling tool that is compliant with the specs and implement all the types; 2. even if the international CKM doesn't have other structures than trees, that doesn't prevent other modelers from using other structures, the only way this holds from the spec compliance point of view is that the other structure types are removed from the RM, which won't happen any time soon 3. archetypes and templates created on other modeling tools can still have other types as structures, then data will go with those types to the CDR, then if the CDR doesn't allow them, it's a compliance thing of the CDR, it doesn't support the types, which should be stated in the Conformance Statement (defined in the conformance framework https://www.cabolabs.com/blog/article/openehr_conformance_framework-61ef4f513f7c5.html). Not publishing that info would mean users of that CDR might want to use something that is not supported. It will also mess the conformance verification because it will lower the conformance score for that tool by failing the tests associated with different structure types. The real issue is agreeing on a usage thing while the specs are not updated and older ones deprecated. Note that as an interim solution this is OK but it shouldn't be a long term thing. In fact we should be already working on updating the RM DATA_STRUCTURE model. --- ## Post #31 by @erik.sundvall I believe @ian.mcnicoll's comment in the CKM gives good context and explains some things in more detail than some of the posts in this thread, so here is a screenshot: ![image|690x340](upload://u3OYmpiUnEJW71WsxMfYqpYypJo.png) --- ## Post #32 by @sebastian.garde [quote="ian.mcnicoll, post:26, topic:3819"] On the one hand, strictly speaking, this is not a breaking change since the extra atCodes have no impact on run-time / persisted data, this is really an issue only for design-time tooling. 
OTOH, given the confusion potentially being caused in folks who use web templates and FLAT formats, there is an argument for bumping it up to v2. @sebastian.garde ?? [/quote] Well, if this has no chance of having ended in data, I am torn...it is a matter of definition (of what it has to break to be considered a breaking change). But let me clarify, is there really no chance that the now deleted at0002 code / a path containing it could have ended up somewhere in data? If there is a chance, I would expect at0002 to remain present even if it is not particularly meaningful. --- ## Post #33 by @erik.sundvall The archetype [self_reported_data.v1](https://ckm.openehr.org/ckm/archetypes/1013.1.6343/adl) is pretty new, so an early breaking change now is better than keeping any of the errors in the archetype. (Especially since the errors cause really bad errors downstream in many tools and platforms as described in this thread) Also a breaking change can better signal that there was something seriously wrong with the archetype, and that you really need to think through differences and the suitability of any possible continued use of the (then) old v1. At Karolinska we will soon start using it for real patient data, so an internationally agreed complete fix of the broken archetype would be very nice to have in the CKM (instead of us fixing it locally and risking doing something different than what will finally end up in the official CKM archetype). I think it would be good to fix the draft archetypes that Ian listed too fairly soon, but since they are not yet published the exact choice of what version-position to bump up is of less importance. --- ## Post #34 by @ian.mcnicoll I have tested both EhrBase and Better, and in neither case do the redundant atCodes appear in the canonical data, or cause any other apparent issues. 
``` "category": { "_type": "DV_CODED_TEXT", "value": "event", "defining_code": { "_type": "CODE_PHRASE", "terminology_id": { "_type": "TERMINOLOGY_ID", "value": "openehr" }, "code_string": "433" } }, "composer": { "_type": "PARTY_IDENTIFIED", "name": "ISO_3166-1" }, "context": { "_type": "EVENT_CONTEXT", "start_time": { "_type": "DV_DATE_TIME", "value": "2023-04-19T09:21:38.415Z" }, "setting": { "_type": "DV_CODED_TEXT", "value": "other care", "defining_code": { "_type": "CODE_PHRASE", "terminology_id": { "_type": "TERMINOLOGY_ID", "value": "openehr" }, "code_string": "238" } } }, ``` So I think the argument for the breaking change is really as an alert, rather than that it is technically necessary. --- ## Post #35 by @joshua.grisham From our perspective what seems to "break" is that we have a template based on `openEHR-EHR-COMPOSITION.self_reported_data.v1` where we have added some stuff under `/context`, and if we make a local copy of this new 1.0.2-alpha revision (but still using the same Archetype ID, `openEHR-EHR-COMPOSITION.self_reported_data.v1`), then when we try to open said template in Archetype Designer, there are two warnings which pop up: > * Template depends on archetype openEHR-EHR-COMPOSITION.self_reported_data.v1 which may have changed since this template was last saved. Template expects integrity hash: 782C49493E523506E8C34B7C5CF7B761, but actual: 8ce513a50cfa52112d9d79313e52cbf7 > * Constraint node at [openEHR-EHR-COMPOSITION.t_self_reported_data.v1/context[at0002]] specializes a constraint missing in parent archetype [openEHR-EHR-COMPOSITION.self_reported_data.v1] and was removed during flattening The first warning is fine/expected, but the second one is where we have the issue -- everything we had added in the template under `/context` has actually disappeared and I assume we will need to add it all back manually? 
Is this considered a "break" that could be served better in some way by rolling to `.v2` (which I am assuming means we actually need to rebuild the entire template over anyway?) or is there a better way to handle this issue? (which we don't necessarily need to take in this thread, more a question of if this is important to consider when deciding if an archetype change is breaking vs non-breaking) --- ## Post #36 by @ian.mcnicoll Thanks Joshua, This is a good point. I had not considered the potential impact on a template which had constraints or slotted archetypes inside context/other_details. As you say, the impact of going to v2 is even more disruptive. There may be a way of editing the native json template files to avoid having to recreate the lost items manually. --- ## Post #37 by @joshua.grisham It gets a bit more interesting actually: I just found that we were not even able to add it back manually with Archetype Designer in this case; I actually needed to manually patch the JSON file outside of Archetype Designer (https://github.com/regionstockholm/CKM-mirror-via-modellbibliotek/commit/c2be87e110be8cea7b323cab076dbd1d4173cdb2) to get our extension back after the node was deleted from the template in AD. The "plus" button did not do anything and the list on the right side to search and add Archetypes would not appear under /context so it felt a bit "stuck" unfortunately :) But now it seems "ok"! So we will do some exports and start to try this out and see how it goes with all of our various components. --- ## Post #38 by @ian.mcnicoll Yes I just managed the same. The process is 1. Export the affected template using the native json option and open this in a text editor, search for EVENT_CONTEXT then remove the adjacent `"nodeId": "at0002"` line. The 'category' attribute is not included in the native json template so there are no other edits required, as far as I can tell. 2. 
Import the corrected version of the Composition archetype, which has had the erroneous atCodes and terms removed. 3. Import the corrected native json template, making sure you allow overwrites. --- ## Post #39 by @erik.sundvall @joshua.grisham made a release of our current approach today (based on @ian.mcnicoll's patched archetype) for those that are curious: https://github.com/regionstockholm/CKM-mirror-via-modellbibliotek/releases/tag/ChemoForm-MBA.v5.rc14 --- ## Post #40 by @ian.mcnicoll I believe the original problem when creating new compositions is now fixed with AD 1.2.47 --- ## Post #41 by @OscarKarlsson Hi! I recently created a template based on the **COMPOSITION.review.v0** archetype. However, we ran into some issues when trying to save a composition in Better Studio. After spending some time troubleshooting, @erik.sundvall suspected that it might be a similar issue to the one described in this thread. I’d really appreciate any help in understanding whether this is the same kind of problem, and if so, how it can be resolved. Ping @ian.mcnicoll :D ![image|690x409](upload://hgsjmTFdL5T9BmsFy3qV0D0jMPa.png) --- ## Post #42 by @ian.mcnicoll This does look like the same problem. Scroll up through this topic for the fix https://discourse.openehr.org/t/the-good-the-bad-and-the-wat-of-current-simplified-flat-simsdt-openehr-exchange-format/3819/38?u=ian.mcnicoll Can you post the fixed archetype to ckm as a change request to prevent future issues for others. --- ## Post #43 by @OscarKarlsson I have now submitted a change request and attached a corrected archetype, which I believe has been fixed in the same way as **COMPOSITION.self_reported_data.v1**. --- ## Post #44 by @siljelb I've imported and committed the proposed new revision, with some minor fixes. 
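For anyone who prefers to script step 1 of the fix from post #38 rather than edit by hand, here is a rough Python sketch. It assumes that "adjacent" means the `nodeId` key sits in the same JSON object that names `EVENT_CONTEXT`; the rest of the native json template layout is not documented in this thread, so the walker ignores key names other than `nodeId`, and the function name `strip_event_context_node_id` is made up for this example.

```python
import json


def strip_event_context_node_id(node, node_id="at0002"):
    """Remove `"nodeId": node_id` from every JSON object that also holds the
    string value "EVENT_CONTEXT". Returns the number of removals.

    Assumption: the redundant node id lives in the same object as the
    EVENT_CONTEXT type marker; only the "nodeId" key name comes from the thread.
    """
    removed = 0
    if isinstance(node, dict):
        if node.get("nodeId") == node_id and "EVENT_CONTEXT" in node.values():
            del node["nodeId"]
            removed += 1
        for child in node.values():
            removed += strip_event_context_node_id(child, node_id)
    elif isinstance(node, list):
        for child in node:
            removed += strip_event_context_node_id(child, node_id)
    return removed


def patch_template_file(path, node_id="at0002"):
    """Load a native json template export, patch it in place, return the count."""
    with open(path, encoding="utf-8") as handle:
        template = json.load(handle)
    removed = strip_event_context_node_id(template, node_id)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(template, handle, ensure_ascii=False, indent=2)
    return removed
```

Work on a copy of the export and diff the result before importing it: the at-code to remove may differ in other archetypes, and steps 2 and 3 (importing the corrected archetype, then the template with overwrites allowed) still have to be done in the tool.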
--- ## Post #45 by @erik.sundvall [quote="ian.mcnicoll, post:12, topic:3819"] @erik.sundvall - The Simplified Info model in Figure 6 is incorrect in that both PARTY_PROXY.external_ref and PARTY_IDENTIFIED.identifier are supported. ``` ".../_health_care_facility|name":"Facility name", ".../_health_care_facility/_identifier:0|id":"999999-345", ".../_health_care_facility/_identifier:0|type":"type-0", ``` which I get to commit in both Ehrbase and Better versus the PARTY_PROXY.external_ref ``` ".../_health_care_facility|id": "999999-345", ".../_health_care_facility|id_scheme": "2.16.840.1.113883.2.1.4.3", ".../_health_care_facility|id_namespace": "NHS-UK", ".../_health_care_facility|name": "Northumbria Community NHS", ``` also working correctly in both EhrBase and Better [/quote] @ian.mcnicoll's post cited above saved the day again today when some colleagues and I in an implementation tried to understand the "simplifications" done in the "simplified" format. There is a lot of non-obvious black magic happening under the hood in the simplified format implementations that make it difficult to understand for somebody looking at the "normal" openEHR UML-diagrams. 
Now, thanks to the heroic work of @sebastian.iancu and inventors from Better and implementers from Vitasystems and others there is finally an explanation of the black magic published at https://specifications.openehr.org/releases/ITS-REST/development/simplified_formats.html Somebody just following an example-document for a specific simplified format will likely have an easier time than somebody looking at the UML unaware of the black magic, so below is an example regarding the Swedish use of both 2x nested Organisation archetypes (in an ITEM_STRUCTURE under EVENT_CONTEXT.other_context) and the RM-built-in EVENT_CONTEXT.health_care_facility to represent care units ("vårdenhet") and healthcare provider organisations ("vårdgivare") at different levels as described in https://openehr.atlassian.net/wiki/spaces/SWE/pages/1893105737/PDL+i+openEHR. ![image|690x268](upload://c2vtUpwg0ykqTXrUvw4kQ2LWyWp.png) In the Swedish case we do _not_ want to use the external_ref inherited by PARTY_IDENTIFIED but rather use identifiers inside the PARTY_IDENTIFIED object itself: ![image|487x258](upload://xnAx5M8ZrdWykY4METWlA4IwGyA.png) When to use underscores and plural-"s" on the key `identifier` in this mix is far from obvious for beginners... 
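To make the two ways of identifying the facility easier to compare, here is a small Python sketch that only assembles the FLAT keys following the patterns in @ian.mcnicoll's quoted post (`|name`, `/_identifier:N|id`, `/_identifier:N|type` versus `|id`, `|id_scheme`, `|id_namespace`). The two helper names are hypothetical, and whether a given CDR accepts exactly these keys is something to verify against that product.

```python
def facility_with_identifiers(prefix, name, identifiers):
    """FLAT keys for the variant where ids live inside the PARTY_IDENTIFIED
    object itself. `identifiers` is a list of (id, type) pairs."""
    flat = {f"{prefix}|name": name}
    for index, (id_value, id_type) in enumerate(identifiers):
        flat[f"{prefix}/_identifier:{index}|id"] = id_value
        flat[f"{prefix}/_identifier:{index}|type"] = id_type
    return flat


def facility_with_external_ref(prefix, name, id_value, id_scheme, id_namespace):
    """FLAT keys for the variant that uses the external_ref inherited from
    PARTY_PROXY: a single id plus its scheme and namespace."""
    return {
        f"{prefix}|id": id_value,
        f"{prefix}|id_scheme": id_scheme,
        f"{prefix}|id_namespace": id_namespace,
        f"{prefix}|name": name,
    }


# Usage with the prefix and values that appear elsewhere in this thread.
PREFIX = "granskning/context/_health_care_facility"
swedish_style = facility_with_identifiers(
    PREFIX, "S Bäckencancer MBA QA",
    [("SE2321000016-C4DV", "urn:oid:1.2.752.29.4.19")],
)
```

Note the singular `_identifier` with an index suffix in the key, even though the RM attribute on PARTY_IDENTIFIED is the plural `identifiers`; that mismatch is part of what makes the format hard to guess from the UML.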
Excerpt in simplified FLAT format: ``` "granskning/context/_health_care_facility|name": "S Bäckencancer MBA QA", "granskning/context/_health_care_facility/_identifier:0": "SE2321000016-C4DV", "granskning/context/_health_care_facility/_identifier:0|type": "urn:oid:1.2.752.29.4.19", "granskning/context/vårdenhet/namn": "OO Cancer TC QA", "granskning/context/vårdenhet/identifierare:0": "SE2321000016-ADNF", "granskning/context/vårdenhet/identifierare:0|type": "urn:oid:1.2.752.29.4.19", "granskning/context/vårdenhet/roll:0|code": "43741000", "granskning/context/vårdenhet/roll:0|value": "vårdenhet", "granskning/context/vårdenhet/roll:0|terminology": "http://snomed.info/sct/900000000000207008", "granskning/context/vårdenhet/vårdgivare/namn": "PDL-vårdgivare SLL IdP QA", "granskning/context/vårdenhet/vårdgivare/identifierare:0": "SE2321000016-I1MN", "granskning/context/vårdenhet/vårdgivare/identifierare:0|type": "urn:oid:1.2.752.29.4.19", "granskning/context/vårdenhet/vårdgivare/organisationsnummer:0": "2321000016", "granskning/context/vårdenhet/vårdgivare/organisationsnummer:0|type": "urn:oid:2.5.4.97", "granskning/context/vårdenhet/vårdgivare/roll:0|code": "143591000052106", "granskning/context/vårdenhet/vårdgivare/roll:0|value": "vårdgivare", "granskning/context/vårdenhet/vårdgivare/roll:0|terminology": "http://snomed.info/sct/45991000052106" ``` Excerpt in simplified STRUCTURED format: ``` { "_health_care_facility": [ { "|name": "S Bäckencancer MBA QA", "_identifier": [ { "|id": "SE2321000016-C4DV", "|type": "urn:oid:1.2.752.29.4.19" } ] } ], "vårdenhet": [ { "namn": [ "OO Cancer TC QA" ], "identifierare": [ { "|id": "SE2321000016-ADNF", "|type": "urn:oid:1.2.752.29.4.19" } ], "roll": [ { "|code": "43741000", "|value": "vårdenhet", "|terminology": "http://snomed.info/sct/900000000000207008" } ], "vårdgivare": [ { "namn": [ "PDL-vårdgivare SLL IdP QA" ], "identifierare": [ { "|id": "SE2321000016-I1MN", "|type": "urn:oid:1.2.752.29.4.19" } ], "organisationsnummer": [ { 
"|id": "2321000016", "|type": "urn:oid:2.5.4.97" } ], "roll": [ { "|code": "143591000052106", "|value": "vårdgivare", "|terminology": "http://snomed.info/sct/45991000052106" } ] } ] } ], ``` --- ## Post #46 by @pablo I understand you are talking about the current situation, though if I were designing this from scratch, I would look for unambiguous serialization formats for complex structures that might be a better fit than JSON; for instance, PHP has a great serialization for complex objects/types. https://en.wikipedia.org/wiki/PHP_serialization_format --- **Canonical:** https://discourse.openehr.org/t/the-good-the-bad-and-the-wat-of-current-simplified-flat-simsdt-openehr-exchange-format/3819 **Original content:** https://discourse.openehr.org/t/the-good-the-bad-and-the-wat-of-current-simplified-flat-simsdt-openehr-exchange-format/3819