I do appreciate these difficulties, but if the definition of the binding changes, the binding itself may be obsolete.
If the binding changes so much in the terminology that it may be obsolete in the archetype, that would probably lead to a new term being created in the terminology while the old one is kept… but that doesn’t matter in this discussion - I just wanted to point it out.
This is technically true, and to avoid it we would have to rely on some kind of publish-subscribe update mechanism to cause affected archetypes to be flagged.
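The flagging idea could be sketched as a minimal registry; everything below (class names, the code, the archetype id pairing) is invented for illustration and not any actual openEHR tooling:

```python
class BindingRegistry:
    """Toy publish-subscribe registry for terminology-change notifications."""

    def __init__(self):
        # term code -> set of archetype ids whose term bindings reference it
        self.subscribers = {}
        self.flagged = set()

    def subscribe(self, term_code, archetype_id):
        self.subscribers.setdefault(term_code, set()).add(archetype_id)

    def publish_change(self, term_code):
        # flag every archetype bound to the changed (or retired) term for review
        for archetype_id in self.subscribers.get(term_code, ()):
            self.flagged.add(archetype_id)

registry = BindingRegistry()
registry.subscribe("snomed::386725007", "openEHR-EHR-OBSERVATION.body_temperature.v1")
registry.publish_change("snomed::386725007")
print(registry.flagged)
```

A real mechanism would of course need versioned change notices and a way for disconnected sites to catch up, but the shape of the dependency index is the same.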
I agree the comment idea is less than satisfactory; it would be better if the term binding contained the rubric as well as the term code, for exactly the same reasons that the rubric must always accompany the term code in DV_CODED_TEXT.
I don’t know why DV_CODED_TEXT is designed this way. I have thought about it, and the only reasons are probably faster access to the rubric and, when there is no translation of the term in a specific language, having a ‘default’ rubric available at all times.
It is to guarantee a readable record / extract for sites not having access to SNOMED CT, or a running copy in their language, etc.; this has been a long-standing requirement, and I believe a still valid one (just think outside the rich countries).
Maybe you’re right, the definitions could be added as comments, but for a proprietary terminology like SNOMED CT this would mean that such archetypes can only be distributed to people who have paid for the license.
Yes, the position on this needs to be clear. I would have thought free use of terms with no relationships should be safe…
In a message dated 18-12-2006 18:00:54 West European (Standard Time),
mattias.forss@gmail.com writes:
Maybe you're right, the definitions could be added as comments, but
for proprietary terminology like SNOMED CT this will mean that these
kind of archetypes can only be distributed to people that have paid
the license.
Sorry, but SNOMED CT cannot be considered a proprietary terminology
given its formal international SDO status from January onward.
Although it is not at all clear that SNOMED CT will be made freely
available to everyone on Earth after the establishment of the SDO. My
understanding is that individual states (countries) will need to
sublicense it from the SDO, in return for a fee, and can then choose to
sublicense it (for free, or for a fee) to its citizens and residents.
The main role of the SDO is to diversify the ownership of the SNOMED CT
intellectual property from just the College of American Pathologists
to an international non-profit governance body (the SDO). But
"non-profit" does not necessarily mean "available at no cost to all
god's children".
Thus, it may not be legal to transfer archetype definitions containing
parts of SNOMED CT between jurisdictions, even if the source and
the target jurisdiction both provide SNOMED CT for free to their
residents, because that free and universal national-level provision will
be under *different* sublicences, which almost certainly forbid transfer
of SNOMED CT outside the jurisdiction in question (I am pretty sure the
Australian sublicense for SNOMED CT imposes such restrictions).
Advice from lawyers who are a) familiar with SNOMED CT licensing and b)
with open source and related concepts, is essential here. Much will
depend on national copyright laws, I suspect, as some countries have
specific "fair use" or "fair dealing" clauses in their copyright laws
which may allow small parts of SNOMED CT to be incorporated into
archetype definitions regardless of licensing arrangements, but many
countries do not have such provisions. Expert advice is required before
proceeding too far with embedding parts of SNOMED CT in archetype
definitions, so that the implications and consequent limitations are
well understood. My view is that embedding aspects and/or parts of
SNOMED CT in archetypes is unavoidable, but the implications for data
interchange between countries need to be understood. None of these
issues are specific to openEHR - they relate to any mechanism for
embedding or encoding information with SNOMED CT.
It may well be that changes to national SNOMED CT sublicenses are needed
to overcome difficulties. That should not be regarded as impossible -
the SNOMED CT national and international governance bodies are keen for
SNOMED CT to be used, not for it to sit on a shelf due to licensing
problems. And no-one is making huge profits from SNOMED CT, thus greed
does not complicate matters.
That would be nice if it were true, but the conditions of use that seem to have been formulated in Australia appear to be very restrictive indeed. I think we need to check on the conditions of use offered by the relevant national agencies…
Just to say thank you to Tim C and others for the clear and helpful recent
posts concerning SNOMED-CT and openEHR. I am in frequent contact with Prof
Martin Severs, Chairman of the NHS Clinical Data Standards Board, here in
England. Martin is very busy with the international negotiations towards the
new SNOMED SDO and so I have forwarded this and other recent posts about
SNOMED-CT, from the openEHR lists, to him.
Semantic Issues in Integrating Data from Different Models to Achieve
Data Interoperability
Rahil Qamar, Alan Rector
Abstract
Matching clinical data to codes in controlled terminologies is
the first step towards achieving standardisation of data for
safe and accurate data interoperability. The MoST automated
system was used to generate a list of candidate SNOMED CT
code mappings. The paper discusses the semantic issues
which arose when generating lexical and semantic matches of
terms from the archetype model to relevant SNOMED codes.
It also discusses some of the solutions that were developed to
address the issues. The aim of the paper is to highlight the
need to be flexible when integrating data from two separate
models. However, the paper also stresses that the context and
semantics of the data in either model should be taken into
consideration at all times to increase the chances of true
positives and reduce the occurrences of false negatives.
An interesting paper. I'm not sure Rahil or Alan are on this list??
Perhaps they should be cc'ed in on any discussion. Many of
the points about the difficulties of doing archetype binding
to snomed are excellent. I was just wondering about this
quote..
Binding of items in an archetype to coding systems happens in at least two places:
names of labels, and data fields.
For both, internal and external coding systems (including SNOMED) can be used.
There are at least two different kinds of Archetypes we must consider:
the General Archetype: one that lists all items that can be recorded and exchanged about a specific topic
the Specialised Archetype: the Template. The Template consists of a labelled structure filled with archetypes, designed for a specific context, and therefore has almost no optionality left. This is what a community has decided to use in their specific context.
The General Archetypes, in their data fields, will have almost no fine-grained binding to internal and external code sets. They will have strict bindings to internal or external coding systems.
The Template will have strict bindings to internal and external code sets and coding systems.
We can expect that these coding systems used in Templates are (internal or external) local ones, perhaps national ones, perhaps international ones.
A strict binding to a local service is understandable.
I foresee that an EHR system without access to a local or remote Terminology Server will be too inflexible.
At the EHR-system level there will be increasing separation of concerns:
Persistence layer
Document layer: archetype, template layer
Terminology Layer: coding system layer
and some more.
Gerard
– –
Gerard Freriks, MD
Huigsloterdijk 378
2158 LR Buitenkaag
The Netherlands
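The layering Gerard lists could be sketched roughly as follows; the interfaces are hypothetical, not any actual openEHR or EHR-system API, and the data is invented:

```python
class PersistenceLayer:
    """Toy key-value persistence layer."""
    def __init__(self):
        self.store = {}
    def save(self, key, value):
        self.store[key] = value
    def load(self, key):
        return self.store.get(key)

class TerminologyLayer:
    """Toy coding-system layer: resolves codes to rubrics."""
    def __init__(self, codes):
        self.codes = codes          # code -> rubric
    def rubric(self, code):
        return self.codes.get(code)

class DocumentLayer:
    """Toy archetype/template layer: validates an entry against a
    constrained code set, consulting the terminology layer."""
    def __init__(self, allowed_codes, terminology):
        self.allowed_codes = allowed_codes
        self.terminology = terminology
    def validate(self, code):
        return code in self.allowed_codes and self.terminology.rubric(code) is not None

terminology = TerminologyLayer({"at0004": "oral"})
document = DocumentLayer({"at0004"}, terminology)
print(document.validate("at0004"))   # a valid, resolvable code
```

The point of the separation is that each layer can be swapped (e.g. a remote terminology server) without touching the others.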
An interesting paper. I'm not sure Rahil or Alan are on this list??
Perhaps they should be cc'ed in on any discussion. Many of
the points about the difficulties of doing archetype binding
to snomed are excellent. I was just wondering about this
quote..
--------
The intended purpose of archetypes is to empower clinicians
to define the content, semantics and data-entry interfaces of
systems independently from the information systems [1].
Archetypes were selected because of their feature to separate
the internal model data from terminology. The internal data is
assigned local names which can later be bound or mapped to
external terminology codes. This feature eliminates the risk of
making changes to the model whenever the terminology
changes.
---------
In particular, I am referring to the concept of being bound
'later'.
This is a point of view on archetypes that I had never really
considered. I had always assumed that the construction of
the archetype definition and the selection of terminology
binding would be part of the same process, either done by
the same person or the same clinical group. The paper discusses
some of the mapping problems that can occur when this
process is split, but surely that would never be the case?
In fact the paper does not state that problems in the mapping process occur because of this ‘split’ feature. However, it tries to highlight that one of the problems in finding suitable SNOMED matches arises because the mapping is done at a LATER stage. For instance, the evaluation of my MoST system was done by clinicians who were not the original authors of the archetype. This sometimes led to issues in understanding the need for a particular term in the archetype model, and sometimes its semantics. This problem arises mostly when different people model and map the same archetype. However, the aim is to include the mapping feature in Archetype Editors so that the original authors of the archetypes can perform the mapping themselves (with consensus if and when required). The system at present performs mappings on pre-modelled archetypes, depriving it of the luxury of having access to the author. That said, any model aimed at reuse should be explicit in its use and definitions to keep ambiguity to a minimum, which is yet another case in point!
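The lexical-matching step described here could be illustrated with a toy token-overlap scorer; the candidate codes and descriptions below are invented for the example, and MoST's actual matching is certainly more sophisticated:

```python
def token_overlap(a, b):
    """Jaccard overlap between the token sets of two rubrics."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def best_matches(rubric, candidates, threshold=0.5):
    """Rank candidate (code, description) pairs by lexical similarity."""
    scored = [(code, token_overlap(rubric, desc)) for code, desc in candidates.items()]
    return sorted([(c, s) for c, s in scored if s >= threshold],
                  key=lambda cs: cs[1], reverse=True)

candidates = {
    "386725007": "body temperature",
    "276885007": "core body temperature",
    "364075005": "heart rate",
}
print(best_matches("body temperature", candidates))
```

A purely lexical pass like this is exactly where the semantic issues the paper discusses arise: the top-scoring code is not necessarily the semantically correct one in the archetype's context.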
The intention would be that within some master archetype
repository (be this a local/organisational/national repository),
the archetypes would include a full set of terminology
codes? (I can understand how one might think this because
none of the sample archetypes in the openEHR repository have much
terminology data, but surely that is a temporary situation,
and the intention is that a real official repository, i.e. one run
by NEHTA or the NHS etc., would have the term codes for its realm?)
Which brings me onto a related point - at the snomed
workshop in Melbourne late last year there was an
(impressive) demonstration of some of the template building
tools written by Ocean. Part of the demonstration involved
creating a complex binding to snomed based on a
small query language (effectively the query was
"select all 'is_a' children of this snomed code up to a maximum
depth of 5"). This query binding was placed into the relevant
archetype as a URL reference to a web service. Doesn't relying
on a URL in the ADL definition
make archetypes quite brittle? I.e. when the archetype definition
is loaded into the clinical system, I either have to consult the
URL straight away and store the resulting codes, or else delay
the binding and risk having the terminology codes for my
ADL disappear in the future?
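The "all is_a children up to depth 5" query could in principle be expanded locally rather than via a web service; a rough sketch over an invented hierarchy:

```python
def is_a_children(hierarchy, root, max_depth=5):
    """Collect all descendants of `root` up to `max_depth` levels down."""
    result = set()
    frontier = [root]
    for _ in range(max_depth):
        next_frontier = []
        for code in frontier:
            for child in hierarchy.get(code, ()):
                if child not in result:
                    result.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return result

# parent -> children (codes invented for illustration)
hierarchy = {"route": ["oral", "parenteral"], "parenteral": ["iv", "im"]}
print(sorted(is_a_children(hierarchy, "route")))   # ['im', 'iv', 'oral', 'parenteral']
```

The traversal itself is trivial; the hard part, as the thread notes, is where the hierarchy lives and whether the result set stays stable over time.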
I agree that this URL feature sounds a bit complex. Not having complete knowledge of the Ocean methodology and objectives makes it rather difficult to comment, though. However, ‘is_a’ trees are only part of the solution to the binding/mapping process. There are a few archetypes that have ‘is_a’ terms and can be dealt with in a less complex way, i.e. without the use of URLs. Though I am not sure whether the Ocean team had something else in mind when using URLs.
Another aspect is to constrain free-text entries in archetype models with more intuitive/processable archetype definitions. A simple case of limiting the list of SNOMED codes that can act as ‘legal’ archetype entries should spare the clinician too much and too frequent access to back-end codes and procedures; otherwise it will simply discourage them from using the system!
Perhaps someone from the Ocean team might want to throw some more light on this URL feature. It’ll be interesting to know their viewpoint.
The Ocean Archetype Editor allows the designer to map one or more coding systems to a particular term or to map a set of terms from a coding system to a particular element to constrain the values for that element to that term set. The editor actually doesn’t dictate how the terms are retrieved in the latter instance. In the demo that I did at the Oz snomed conference in December, I showed how the term set can be defined using the Ocean terminology service, and in this particular instance, it was mapped to a web service running in another state of Australia. This is really just one way of making it work and could just as easily be retrieved from a locally cached query etc. A URL is just a unique way of identifying the query that is being used.
The point of the demonstration was that you could make SNOMED easier for clinicians to use by creating these subsets, e.g. medication route. For these subsets to be useful, they would need to be defined at a jurisdictional level or higher so that everyone can use the same one. This allows a change in the query to be distributed easily, and updates to the coding system to be distributed in the same way. To make this work, it will be necessary to have some centrally controlled repository in which the URL or other identifier can be matched to the query. Analogous to archetype identifiers, I guess. It may be possible to include term set queries in the archetype if some universal query language is available.
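The centrally controlled repository idea might look roughly like this: the archetype carries only a stable query identifier, and the system resolves it to a term set, falling back to a locally cached copy when the repository is unreachable. All identifiers and codes are invented:

```python
class TermSetResolver:
    """Toy resolver: stable query id -> term set, with a local cache."""

    def __init__(self, repository):
        self.repository = repository      # query id -> list of codes
        self.cache = {}

    def resolve(self, query_id):
        try:
            codes = self.repository[query_id]     # stands in for a remote call
            self.cache[query_id] = codes          # refresh the local cache
            return codes
        except KeyError:
            # repository unavailable or query retired: fall back to the cache
            return self.cache.get(query_id)

repo = {"termset:au:medication_route:v1": ["oral", "iv", "im"]}
resolver = TermSetResolver(repo)
print(resolver.resolve("termset:au:medication_route:v1"))

del repo["termset:au:medication_route:v1"]        # simulate the repository vanishing
print(resolver.resolve("termset:au:medication_route:v1"))   # served from the cache
```

The key design choice is that the identifier, like an archetype identifier, is location-independent; the URL is just one possible resolution mechanism.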
I agree with Gerard and Andrew that using a particular terminology set in a generic archetype probably makes the archetype more brittle; terminology sets should probably be used in templates, in a particular jurisdictional setting, where an archetype is constrained further for a specific use case.
It will be interesting to see how this all pans out.
The system at present is performing mappings on pre-modeled archetypes
depriving it the luxury of having access to the author.
This is what I meant by the 'split' case - a split between the people/group
constructing the archetype and the people doing the binding (in this
case, Sam writing the archetype and you guys doing the terminology stuff). It
doesn't lessen your points about the difficulties of doing the terminology
mapping - I just wanted to clarify that the plan in the 'best case' is
that there wouldn't be so much of a split (i.e. you'd be in communication
with the people writing the archetype, or it would all be done within one
tool by the same author)
I agree that this URL feature sounds a bit complex. Not having complete
knowledge of the Ocean methodology and objective makes it rather difficult
to comment though. However, 'is_a' trees are only part of the solution to
the binding/mapping process. There are a few archetypes that have 'is_a'
terms and can be dealt with in a less complex way i.e. without the use of
URL's.
Other than actually enumerating the term codes in the ADL file, what
mechanism is there other than URLs?
Though am not sure whether the Ocean team had something else in mind
when using URLs.
The URL system is not inherently bad - it solves the problem in a
relatively clean way that allows lots of room for future developments
in terminologies without constraining the solutions. I just worry that,
with complex terminologies like SNOMED being used more often,
it may be useful to have an in-between solution, i.e.
I actually didn’t use a powerpoint as the whole demo was done with live software showing the process of building a query, binding it to an archetype, creating a template, dragging an element on to a form, compiling the form into an application and then showing that the element was bound to a particular termset that could be searched and selected from.
The point of the demonstration was that you could make snomed easier for
clinicians to use by creating these subsets ie medication route. These
subsets to be useful would need to be defined at a jurisdictional level or
higher so that everyone can use the same one. This allows for a change in
the query to be distributed easily and updates to the coding system to be
distributed in the same way. To make this work, it will be necessary to
have some centrally controlled repository that the URL or other identifier
can match the query to. Analogous to archetype identifiers I guess. It may
be possible to include term set queries in the archetype if some universal
query language is available.
Just thinking long term: say some archetype was defined for
some little-used data entry. The archetype (which includes a URL
term binding) is put into the clinical system. Some data matching
the archetype is entered - the system checks that the terminology codes
for allowed data match those returned by the URL, and all is good. Ten
years down the
track, someone goes to do the same thing. Given Australian government
departments barely keep their names for more than a few years, what are
the chances the URL is still working? If it is a local reference, what are the
chances the machines still have the same IP addresses or names?
Can the clinical system still rely on the term codes it cached ten years
ago?
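The staleness worry could at least be made explicit in a system: record when a term set was retrieved, and refuse to silently trust codes cached beyond some policy window. The window and values below are invented:

```python
from datetime import datetime, timedelta

class CachedTermSet:
    """Toy cached term set that knows when it was retrieved."""

    def __init__(self, codes, retrieved_at, max_age=timedelta(days=365)):
        self.codes = codes
        self.retrieved_at = retrieved_at
        self.max_age = max_age

    def is_stale(self, now):
        # beyond the policy window, the codes need revalidation, not blind trust
        return now - self.retrieved_at > self.max_age

cached = CachedTermSet(["oral", "iv"], retrieved_at=datetime(2007, 1, 1))
print(cached.is_stale(datetime(2017, 1, 1)))   # ten years on: flag it
print(cached.is_stale(datetime(2007, 6, 1)))   # within the window: still fine
```

This doesn't solve the dead-URL problem, but it turns silent decay into an explicit, auditable state.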
I think having URLs in the archetype definition mixes the 'configuration' of
the system with the definition. I guess I would like to see some sort of
query language in this space, so that one could say
<"at0004"> = <
<"snomed"> = <all 'route of medication' where refset('australia')>
<"icd-10"> = <12312-23>
>
or something similar in the actual ADL. Each terminology/classification
would probably need its own language and I have no idea how to
define any of this, so I guess I'm not really being much help, but I just wanted
to see what the general thinking was about this problem.
Archetypes do allow for the possibility of an internal code set - each
internal code can map to one or more terminologies - analogous to the "list
of codes typed in '123123', '3242342', '123123'".
This solution is probably the best for very small sets such as patient sex
or similar.
<"at0004"> = <
<"snomed"> = <all 'route of medication' where refset('australia')>
<"icd-10"> = <12312-23>
>
I should clarify I'm talking about the "constraint_binding" section of
the ADL file here (section 7.6.6 in the ADL spec)
(and I have completely butchered the syntax here because I was
doing it off the top of my head)
I don't think any query language would be necessary for
the "term_binding" section of an archetype (those codes could just be
enumerated) - though, as pointed out in Rahil's paper, there are
still issues in finding the correct codes to put in.
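For what it's worth, a constraint-binding query of the kind sketched above could be evaluated per terminology; here is a toy illustration in which the query shape, the refset, and all the codes are invented:

```python
def eval_snomed_query(concept, refset_name, snomed):
    """All members of `concept`, restricted to a national reference set."""
    return [c for c in snomed["concepts"].get(concept, [])
            if c in snomed["refsets"].get(refset_name, set())]

# an invented stand-in for a terminology service's data
snomed = {
    "concepts": {"route of medication": ["oral", "iv", "im"]},
    "refsets": {"australia": {"oral", "iv"}},
}

# an internal archetype code bound per terminology, mirroring the ADL sketch:
# snomed gets a query, icd-10 a simple enumerated code
binding = {
    "at0004": {
        "snomed": lambda: eval_snomed_query("route of medication", "australia", snomed),
        "icd-10": lambda: ["12312-23"],
    }
}

print(binding["at0004"]["snomed"]())   # ['oral', 'iv']
```

As the thread suggests, each terminology would need its own evaluator; the binding itself only names the query, keeping configuration out of the definition.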
Archetypes do allow for the possibility of an internal code set - each
internal code can map to one or more terminologies - analogous to the "list
of codes typed in '123123', '3242342', '123123'".
Yes, sorry, I should have been clearer - archetypes already have the simple
list form and the more powerful yet complex URL form.
I'm arguing for an (as yet unspecified) middle ground to be added.
The system at present is performing mappings on pre-modeled archetypes
depriving it the luxury of having access to the author.
This is what I meant by the ‘split’ case - a split between the people/group
constructing the archetype, and the people doing the binding (in this
Sam writing the archetype and you guys doing the terminology stuff). It
doesn’t lessen your points about the difficulties of doing the terminology
mapping - I just wanted to clarify that the plan in the ‘best case’ is
that there wouldn’t be so much of a split (i.e. you’d be in communication
with the people writing the archetype, or it would all be done within one
tool by the same author)
Right, that’s fine. At least there is a consensus in thought here!
I agree that this URL feature sounds a bit complex. Not having complete
knowledge of the Ocean methodology and objective makes it rather difficult
to comment though. However, ‘is_a’ trees are only part of the solution to
the binding/mapping process. There are a few archetypes that have ‘is_a’
terms and can be dealt with in a less complex way i.e. without the use of
URL’s.
Other than actually enumerating the term codes in the ADL file, what other
mechanism is there other than URLs?
I have a question about the context in which the term URL is being used in this discussion. What does the URL lead to? Is it some sort of web page with a block of the terminology displayed, enabling the user to pick a few relevant codes, or is it some sort of metadata giving the location of the list of relevant codes?
Another form could be a tree graph of a subset of the entire terminology, with access to only those paths (to depth 5, or more or less) that belong to the set of ‘allowable’ codes. It could make the task easier and probably more interesting to the user. However, the graph should be able to display the concept definitions and/or annotations, to enable an informed decision when selecting codes for mapping.
Just thinking long term, just say some archetype was defined for
some little used data entry. The archetype (which includes a URL
term binding) is put into the clinical system. Some data matching
the archetype is entered - the system checks that the terminology codes
for allowed data match those returned in the URL and all is good. 10
years down the
track, someone goes to do the same thing. Given Australian government
departments barely keep their names for more than a few years, what are
the chances the URL is still working? If it is a local reference, what are the
chances the machines still have the same IP addresses or names?
Can the clinical system still rely on the term codes it cached 10 years
ago?
All this is precisely the reason for EuroRec (the European Institute for Health Records) to develop:
an Archetype Repository
an Archetype Inventory
plus a Quality Control Service.
We are doing this in a European project: Q-Rec.
Francois Mennerat is leading this task.
We could perform this task for others as well.
With regards,
Gerard Freriks
– –
Gerard Freriks, MD
Huigsloterdijk 378
2158 LR Buitenkaag
The Netherlands
The URL system is not inherently bad - it solves the problem in a
relatively clean way that allows lots of room for future developments
in terminologies without constraining the solutions. I just worry that
with complex terminologies like snomed being used more often
it may be useful to have an inbetween solution i.e.
simplest)
list of codes typed in ‘123123’, ‘3242342’, ‘123123’
Within our team in Linköping we’ve discussed the potential of using OWL as a query language for constraint bindings. It might be something worth looking into.
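As a very rough illustration of the OWL idea: a constraint binding could be expressed as a class expression and handed to a reasoner. Real OWL tooling is out of scope here; this toy evaluator handles only a single "SubClassOf" pattern over an invented hierarchy:

```python
def subclasses_of(hierarchy, cls):
    """All direct and indirect subclasses of `cls` (transitive closure)."""
    result = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for child in hierarchy.get(current, ()):
            if child not in result:
                result.add(child)
                stack.append(child)
    return result

# invented class names standing in for terminology concepts
hierarchy = {
    "RouteOfAdministration": ["Oral", "Parenteral"],
    "Parenteral": ["Intravenous", "Intramuscular"],
}

# the constraint "values must be a SubClassOf RouteOfAdministration"
allowed = subclasses_of(hierarchy, "RouteOfAdministration")
print(sorted(allowed))
```

A genuine OWL approach would buy more than this (defined classes, consistency checking across terminologies), which is presumably the attraction.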