# Usage of Composition.Category

**Category:** [Technical (archive)](https://discourse.openehr.org/c/technical-archive/156)
**Created:** 2016-02-17 20:59 UTC
**Views:** 3
**Replies:** 19
**URL:** https://discourse.openehr.org/t/usage-of-compositoin-category/15422

---

## Post #1 by @system

We are discussing the use of Composition.category, which is a DV_CODED_TEXT. There is a terminology for it. Is it required to use only these categories, or could an application set any DV_CODED_TEXT? I think it would be OK to allow any category here.

To be concrete: the use case is discharge summaries. These are Compositions which only ("mostly") contain links to existing entries. We will be using links, but since the Composition is to be transferred to another health provider, it must be serialized and validated against a template. Technically, these Compositions contain a lot of entries which are "link to self".

The idea we are considering is to introduce a category for these Compositions. The content will not be part of AQL results for normal use cases. But if you ask explicitly for this category, you will be able to query for discharge summaries which contain a body weight above 120 kg.

If we only add the references as links, it will not be possible to add them to forms, nor to use a template to validate the content. This is the reason we are "thinking outside the box".

Any comments on this?

Best regards
Bjørn Næss
Product Owner – Arena EHR
DIPS ASA
Mobile +47 93 43 29 10

---

## Post #2 by @system

I just added a "composition category" to my fork of the terminology project.

[https://github.com/bjornna/terminology/commit/600dec3058cd85f9db3e5859d6bffa7f01a45edf](https://github.com/bjornna/terminology/commit/600dec3058cd85f9db3e5859d6bffa7f01a45edf)

Any comments?

---

## Post #3 by @Heath_Frankel3

Hi Bjorn,

How did you come up with the concept id of 434?
We need to be careful about assigning our own concept ids. We really need openEHR to assign these, and I suggest doing so through the SEC process, initiated by a Jira card.

At present we have two terminology files. As you know, we have agreed to use the Java implementation's terminology XML file as the interim standard representation, but there are already concept ids allocated in the Archetype Editor terminology file, which existed before the terminology specification and the Java implementation. In this case it looks like 434 is safe to use, as it is not assigned to an openEHR concept in the Archetype Editor, but 435 is allocated to an openEHR concept in the setting group, which appears to be missing from the terminology specification and the Java implementation XML.

Let's start using the SEC process for managing openEHR terminology concepts.

Regards
Heath

---

## Post #4 by @system

Re: process, yes, it needs to be managed separately. Ian is the terminology component owner. But I assumed Bjørn was talking about the semantics of the new term, 'Report'. Bjørn, can you elaborate on which Compositions would merit the 'report' Composition category?

- thomas

---

## Post #5 by @system

Actually, I sent that concrete e-mail and made the addition to get some feedback on my previous e-mail :)

And as you say, Thomas: the most important thing now is to get some input on the concept, or semantics, of this new term. We are currently prototyping this functionality to see how it would work for creating, saving and querying content based on these attributes.

The concrete number, 434, was picked because it was not in use and was in the same series as the other category numbers.

---

## Post #6 by @system

Hi Björn,

I finally got around to thinking a bit about this. It is an interesting problem and I think I can see the need to specify different data handling requirements, but I am not sure that overloading composition.category is the best approach here.
I suspect this will take a bit of teasing out (along with other commit/query strategy metadata). Might it be better to put this in a cluster archetype in the Composition extension slot? That would let us play around with the requirements before pushing something definitive to the RM.

So far I have 3 axes:

1. Normal commit strategy: persistence vs. event, i.e. do we normally overwrite an existing composition?
2. Source of truth, i.e. should this document be regarded as the primary source of truth for certain kinds of data or not? E.g. does a system look into event compositions or to a Problem list for 'current problems'?
3. Is this a primary document or a secondary document? E.g. a Discharge letter is a secondary document derived from other primary records.

Just starting the discussion :)

Ian

Dr Ian McNicoll
mobile +44 (0)775 209 7859
office +44 (0)1536 414994
skype: ianmcnicoll
email: ian@freshehr.com
twitter: @ianmcnicoll

Co-Chair, openEHR Foundation ian.mcnicoll@openehr.org
Director, freshEHR Clinical Informatics Ltd.
Director, HANDIHealth CIC
Hon. Senior Research Associate, CHIME, UCL

---

## Post #7 by @system

Hi Ian

Great response. The most important thing for me is to precisely define the semantic meaning of the content in a composition. In this specific use case the content of the composition is always a copy of the primary source. This means that the Discharge letter only brings one new thing into the EHR: the fact that there is an approved discharge letter. The entries in the composition are links to, and copies of, entries in other primary sources.

The requirements on the system are quite small:

* Content of "report" documents MUST NOT be in the result set when doing normal AQL queries.
* It MUST be possible to query for "report" compositions with specific content.

The solution to this problem is simple, and I can give an example with an AQL query. Below is a standard query for body weight.
Look at the WHERE condition. Here I am looking for all body weights which are NOT part of a report composition. This WHERE condition will be the default filter on all queries. If the client would like to query for all body weights in report documents, then just change from NOT EQUALS 434 to EQUALS 434.

    SELECT o/data[at0002]/events[at0003]/data[at0001]/items[at0004]/value/magnitude
    FROM COMPOSITION c
    CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.body_weight.v1]
    WHERE c/category/defining_code/terminology_id/value = 'openehr'
      AND c/category/defining_code/code_string != '434'

Given that we agree that there is a class of compositions which belongs to the "report" group, we should add such semantics to the RM to make it precise and consistent.

Best regards
Bjørn Næss

---

## Post #8 by @Ivar_Yrke

Hi

An interesting discussion that touches the very concept of structured information, in my opinion. I wonder whether the suggested solution looks at the problem from the best angle. So here is my angle:

As a person with some SQL experience, I would expect an AQL query to return ONLY primary content unless told otherwise. Any content that lives in a Composition as a link I would not expect to see in that Composition as an entry. Resolving links is a task for the level "above" (rendering on a screen etc.). I can see that there possibly are needs for an AQL that resolves links, but I would rather see this as the special case, much as joining on foreign keys in SQL is an explicit decision (the SQL analogy has some obvious flaws!).

Why is this important? Because showing linked information in compositions where it was not originally recorded creates doubt about the origin of the information (source of truth). The duplication that Bjørn wants to solve is a symptom of an unhealthy structure that undermines an essential aspect of structured information.
If a summary composition, like a discharge letter, only links information from other compositions, there should be no duplication. So there should not be any need for later special handling. There should be no problem to solve (well, there would be the need for the optional resolving, but this would be a feature rather than a problem).

AQL should relate only to the data and how they are recorded, not to how they are used.

With regards,
Ivar Yrke
Senior system developer
DIPS ASA
Telephone +47 75 59 24 06
Mobile +47 90 78 89 33

---

## Post #9 by @system

Currently I think we filter on 'report' COMPOSITIONs via something like:

    FROM COMPOSITION c[openEHR-EHR-COMPOSITION.report.v1]
    CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.body_weight.v1]

So that would not need any change to COMPOSITION.category. I'm not saying there are no reasons to do it, just that the normal way today seems to satisfy the requirement.

Secondly, just a mechanical AQL thing: it should normally be possible to do:

    WHERE c/category/defining_code matches {[openehr::434]}

- thomas

---

## Post #10 by @system

Ivar,

yes, this is a reasonable way of looking at things, and it is the way AQL currently works. There is talk of adding a new operator to follow links, but in that case we need to invent a way to mark the returned data as being targets of references rather than primary data. The reason to allow this kind of reference following is to enable assembling the logical contents of e.g. a discharge summary into a standalone package, e.g. to send in an Extract or process in some other fashion.

So I would modify your final statement and say that AQL should normally be about data as they are recorded, with reference following having special handling.
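The default category filter from post #7 can be illustrated with a small query builder. This is only a minimal Python sketch under stated assumptions: the code 434 is the proposed, not officially assigned, category code from this thread, and `body_weight_query` and its `include_reports` parameter are hypothetical names, not part of any openEHR library. The function only builds the AQL text; it does not execute it.

```python
# Hypothetical helper for illustration; not part of any openEHR library.
REPORT_CATEGORY = "434"  # proposed code from this thread, not officially assigned


def body_weight_query(include_reports: bool = False) -> str:
    """Build the body-weight AQL from post #7.

    By default 'report' compositions are filtered out; passing
    include_reports=True selects only 'report' compositions instead.
    """
    operator = "=" if include_reports else "!="
    return (
        "SELECT o/data[at0002]/events[at0003]/data[at0001]"
        "/items[at0004]/value/magnitude\n"
        "FROM COMPOSITION c\n"
        "CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.body_weight.v1]\n"
        "WHERE c/category/defining_code/terminology_id/value = 'openehr'\n"
        f"  AND c/category/defining_code/code_string {operator} '{REPORT_CATEGORY}'"
    )


if __name__ == "__main__":
    print(body_weight_query())                      # normal query, reports excluded
    print(body_weight_query(include_reports=True))  # explicit query for reports
```

Keeping the filter in one place means a client has to opt in explicitly to see report content, which is the behaviour the proposal asks for.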
- thomas

---

## Post #11 by @system

Hi Ivar,

I'm not sure the situation is quite as clear-cut, in that I don't think there is necessarily a simple distinction between primary data, which should normally be query-accessible and in-line, and secondary data, which is normally query-inaccessible and referenced.

A few scenarios:

1. Vital signs event: easy!! Primary, in-line and accessible.
2. Diagnosis event. Primary, in-line, but depending on whether a secondary problem list is maintained, you may not want to use these original diagnosis events for decision support.
3. Problem summary. Secondary and definitely needs to be query-accessible, but may be in-line or referenced depending on implementation.
4. Discharge summary. Mostly secondary but may introduce new primary content and, again, whether the content is in-line or referenced is to some extent an implementation decision. DIPS have decided to use referencing, others do not.
5. End of Life Summary. The critical aspects of this document are primary, e.g. resuscitation wishes, but other aspects are secondary (though they may be in-line or referenced).

I think the points raised are valid but we may need to tease out several axes here.

Ian

---

## Post #12 by @Ivar_Yrke

I said: "I can see that there possibly are needs for an AQL that resolves links, but I would rather see this as the special case", which is basically exactly what Thomas says in his rephrasing of my last statement. My key point is that link handling in AQL must be explicit and predictable. Your scenarios illustrate why this is important.
Best regards
Ivar Yrke

---

## Post #13 by @system

Hi Ivar,

I have no issue with the result of link handling in AQL being explicit and predictable, but I don't think this solves the problem of deciding which is the 'preferred queryable' data. Where an active problem list is maintained, the preferred queryable data would, in many implementations (but not all), live at the end of a link/reference rather than being in-line.

From a clinical perspective, it really does not matter whether the problem list has been implemented as an in-line persistent-style composition with entries 'cloned' from their original event compositions, or whether those original entries are simply referenced from the event compositions. I would agree that the latter approach is probably more elegant, but from a clinical perspective it is the Problem List composition that I choose to use as the preferred query route to retrieve problems; how it gets them is really up to the implementer. I would like to be able to express AQL statements agnostic to that underlying implementation choice.

Ian

---

## Post #14 by @Ivar_Yrke

"Preferred" data is a key issue here. I think "preferable" is an aspect of the scenario, not of the data itself. Therefore we must be able to be explicit in AQL queries, so that each scenario can express its preference.

Best regards
Ivar Yrke

---

## Post #15 by @system

The problem is not to filter in data. The most important feature to support is to filter out data.
The proposed solution is to add a new category code, creating a new group of Compositions which are filtered out by default. This could be done with archetypes, but that creates the need for implementations to add new filters as new archetypes are developed. If we agree that there is a large group of "report" compositions with only referred data from their primary sources, then we should make this a first-class citizen of the openEHR domain model.

On the discussion about links: yes, we could use links. But where should we add the links? From my point of view the only place to add these links would be on the composition root. But then you miss the opportunity to express relationships between links.

And there is a large group of archetyped compositions that are added to the EHR but are transported to another system using some other information model (HL7, EDIFACT, PDF, paper, etc.). The simple idea behind this proposal is to define a generic system for creating openEHR compositions that are the primary source for all these kinds of messages. The only thing needed is either TDD or some other transformation of a "compiled" composition. As far as I can see now, this is the simplest and most efficient way to handle this. The content can then be archetyped, and the transformation can work directly on the RM model to create the expected outcome.

Best regards
Bjørn Næss

---

## Post #16 by @system

We have thought in the past that we should do something directly in the RM to mark 'reports' and other 'derivative' content in some way, but never properly agreed on what to do. Probably your experience reports of trying to convert what I imagine are well understood DIPS data structures and querying capabilities will help us focus on some concrete changes we can make to the RM. Maybe it is as simple as another Composition category.
Another alternative is to define a marker in EHR_STATUS, e.g. is_derived: Boolean, that would enable report-like content to be detected and ignored in normal querying. We already have some other markers there that affect querying, e.g. is_queryable.

- thomas

---

## Post #17 by @system

These examples from Ian illustrate exactly why we never figured this out before: because of the possible mixture of original content, referenced content, and sometimes copied content, e.g. in a discharge summary. The problem is that 'derived' content can occur at a lower granularity than Composition.

Unless... we impose a discipline that says otherwise. For example, if there is *only* original content and references (links, DvEhrUris) then everything is easy. As far as I can see, the problem is if content is *copied inline* from e.g. a lab result or Dx list into the discharge summary. For a physician, this is pretty natural, and it's easy to see how a nice UI can enable it.

But if we want to avoid double results in querying, we need some sort of 'is_derived' or 'is_copy' marker (and a link to original content) on the copy. At least that's where I got to the last time I thought about it.

- thomas

---

## Post #18 by @system

> But if we want to avoid double results in querying, we need some sort of 'is_derived' or 'is_copy' marker (and a link to original content) on the copy. At least that's where I got to the last time I thought about it.

Yes – I think we need some kind of marker. We have been thinking about adding this to the link. Some kind of “link to self”. Then to avoid duplicates you MUST include that in every AQL query. This is why we ended up with the proposal of a Composition category which consists solely of re-used (copied) data.

The same pattern could be applied on several levels (ENTRY, CLUSTER). I.e. a Blood Pressure with an attribute ‘is_copy’ should by default be excluded from AQL queries.

Something like that?
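The 'is_copy' idea can be sketched as follows. This is a minimal Python sketch under stated assumptions: `Entry`, `is_copy`, `copy_of` and `query` are hypothetical names used for illustration, not openEHR RM classes or attributes, and the in-memory list merely stands in for an AQL engine.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Entry:
    """Simplified stand-in for an openEHR ENTRY; not the real RM class."""
    composition: str               # name of the containing composition
    name: str                      # e.g. "body_weight"
    magnitude: float
    is_copy: bool = False          # hypothetical flag discussed in this thread
    copy_of: Optional[str] = None  # hypothetical pointer to the original content


def query(entries: Iterable[Entry], include_copies: bool = False) -> List[Entry]:
    """Return entries, leaving out copies unless the caller asks for them."""
    return [e for e in entries if include_copies or not e.is_copy]


# One original measurement and its copy inside a discharge summary.
entries = [
    Entry("encounter", "body_weight", 125.0),
    Entry("discharge_summary", "body_weight", 125.0,
          is_copy=True, copy_of="encounter/body_weight"),
]

if __name__ == "__main__":
    print(len(query(entries)))                       # copies excluded by default
    print(len(query(entries, include_copies=True)))  # copies explicitly included
```

With the flag on the copy, the default query sees each measurement once, while a client that explicitly includes copies can still find the discharge summary content.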
---

## Post #19 by @system

I'm not sure if this has the flexibility you (we) really want though, does it? It means that the entire Composition has to be treated as duplicate info to be excluded from query evaluation.

Right. If we did it in the most general fashion possible, LOCATABLE could have a Boolean flag 'is_copy'. But then you have a False Boolean on 99% of all data, which is not great data design. If we say that any copy has to have a LINK attached pointing to what it is a copy of, then your suggestion above is better (if I understand correctly): put an 'is_copy' flag there. It could be argued that 'meaning' should encode whether something is a copy or not, but I think a separate flag would be better, and in any case 'meaning' might still carry different reasons for making copies.

- thomas

---

## Post #20 by @system

> But if we want to avoid double results in querying, we need some sort of 'is_derived' or 'is_copy' marker (and a link to original content) on the copy. At least that's where I got to the last time I thought about it.

> Yes – I think we need some kind of marker. We have been thinking about adding this to the link. Some kind of “link to self”. Then to avoid duplicates you MUST include that in every AQL query. This is why we ended up with the proposal of a Composition category which consists solely of re-used (copied) data.

> I'm not sure if this has the flexibility you (we) really want though, does it? It means that the entire Composition has to be treated as duplicate info to be excluded from query evaluation.

BNA: No, you are right. It doesn’t give the right level of flexibility when it comes to generating a Composition. So it is a trade-off against the problem of duplicates when querying. We are working on a pattern for the user interface to cope with this problem. The end user would like to “feel like” he is editing as before, but what he is actually doing is creating entries which will be referenced as copies.
And as you said: all the content of a given Composition MUST be treated like a copy. Only the metadata, mostly author and context.start_time, should be new. This is not an ideal requirement/design, but it is the best of many worse choices so far.

> The same pattern could be applied on several levels (ENTRY, CLUSTER). I.e. a Blood Pressure with an attribute ‘is_copy’ should by default be excluded from AQL queries.
>
> Something like that?

> Right. If we did it in the most general fashion possible, [LOCATABLE](http://www.openehr.org/releases/RM/latest/docs/common/common.html#_archetyped_package) could have a Boolean flag 'is_copy'. But then you have a False Boolean on 99% of all data, which is not great data design. If we say that any copy has to have a LINK attached pointing to what it is a copy of, then your suggestion above is better (if I understand correctly): put an 'is_copy' flag there. It could be argued that 'meaning' should encode whether something is a copy or not, but I think a separate flag would be better, and in any case 'meaning' might still carry different reasons for making copies.

BNA: Yes – I think we should keep 'meaning' to carry an optional reason for making a copy. And then we need the specific flag to tell whether it is a copy.

- thomas

---

**Canonical:** https://discourse.openehr.org/t/usage-of-compositoin-category/15422
**Original content:** https://discourse.openehr.org/t/usage-of-compositoin-category/15422