constraint binding error

Hello,

Does anybody know what the correct type of a constraint binding is? In
all the examples I have checked, and in the ADL grammar (adl.jj), it is
specified as a URL.

for instance:

["ac0001"] = <http://terminology.org?query_id=12345>

but I have seen in other archetypes something like this:

["ac0001"] = <[CONSULTA::1]>

The ADL parser throws an error with this last one. Is that right?

Thanks!!

Cati Martínez wrote:

["ac0001"] = <[CONSULTA::1]>

The ADL parser throws an error with this last one. is it right?

Hi Cati,

That last one is not a valid constraint binding. It has to be a valid
URI.

- Peter
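
As a rough illustration of why a parser accepts the first form and rejects the second, here is a minimal Python sketch of a syntactic check. It is not the logic of the actual ADL parser; the function name looks_like_uri is hypothetical, and the check covers only the RFC 3986 scheme prefix and character set, not full URI validation.

```python
import re

# RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")
# Characters RFC 3986 permits in a URI (unreserved, reserved, and "%")
_URI_CHARS = re.compile(r"^[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*$")


def looks_like_uri(value: str) -> bool:
    """Rough syntactic check only; hypothetical helper, not a full validator."""
    return bool(_SCHEME.match(value)) and bool(_URI_CHARS.match(value))


print(looks_like_uri("http://terminology.org?query_id=12345"))  # True
print(looks_like_uri("[CONSULTA::1]"))                          # False
```

The second value fails because it does not start with a scheme name followed by a colon.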

I know it is in the ADL specs, but why limit it to a URI? The second approach
could also be used to identify a subset.

I understand the need for URIs, but I can think of more than one occasion
where you have a defined termset and no URI for it.

Diego Boscá wrote:

I know it is on ADL specs, but why limit it to an URI? Second approach
could also be used to identify a subset

The URI approach is able to specify subsets, Diego. Here is an
example, generated by the current Archetype Editor beta release
(available from http://www.openehr.org/svn/knowledge_tools_dotnet/TRUNK/ArchetypeEditor/Help/index.html)
:

  constraint_bindings = <
    ["Snomed"] = <
      items = <
        ["ac0001"] = <terminology:Snomed/2002?subset=DrugForm>
      >
    >
  >

- Peter

If that is the valid way of defining a subset in URI form, it is
undocumented. The example should be added to the ADL specs.

And again, it is not that difficult to support both kinds of binding. In my
opinion, <ORGANIZATIONXXXXX::DrugFormSubset> is far more human
readable and needs the same degree of 'computer interpretation' as
the URI <terminology:...>

Hi Peter, Diego,

I think the URI way of defining constraint bindings can be ambiguous and can hide some of the semantics needed to understand where to find the terminology terms and codes.
Please correct me if I'm wrong:

One archetype can have this: ["ac0001"] = <terminology:Snomed/2002?subset=DrugForm>
And another this: ["ac0001"] = <terminology:Snomed/2002?s=DrugForm>

So, how can a machine know whether the two URIs are different or equivalent? (URIs are universal, but they are not a unique way to identify a terminology or a subset.)
How can we agree on using one URI or the other in global archetypes?
Do we need a centralized terminology/subset URI repository?

What do you think?

Diego Boscá wrote:

And again not that difficult to support both kind of bindings. In my
opinion, <ORGANIZATIONXXXXX::DrugFormSubset> is way more human
readable and needs the same degree of 'computer interpretation' than
the URI <terminology:...>

I would agree that the <TERMINOLOGY::subset> form may be more legible
to humans.

For computer interpretation, however, it would be a big problem. How
would an ADL parser know whether the value of the constraint binding
was a URI or not?

The ADL ontology section is a serialisation of the ontology AOM. Page
78 of http://www.openehr.org/svn/specification/TRUNK/publishing/architecture/am/aom1.5.pdf
  specifies that the ARCHETYPE_ONTOLOGY class's constraint_bindings
attribute contains DV_URI objects. This proposed <TERMINOLOGY::subset>
form is not a serialisation of a DV_URI, so the specification would
have to be changed somehow or other. I doubt that the resulting
serialisation would be just a matter of putting a
<TERMINOLOGY::subset> where currently we have a URI. There would have
to be some way of identifying whether it's a DV_URI or not. This would
complicate the ADL, probably not making it so nice for humans to read
after all.

But supposing that this did get done, and the ADL parsers were able to
differentiate whether it's a DV_URI or not. We still wouldn't have
solved all of the problems for computer interpretation, because all
tools which currently know how to deal with DV_URIs in the constraint
bindings would now have to cope with the possibility of some other
class of object. Not only would software that is currently working
have to be upgraded, the resulting software would be somewhat more
complicated than it is today.

- Peter

I think the same, Pablo

Another thing: why use a slash ('/') between terminology and version?
Why not use URI modifiers for that as well?
<terminology:Snomed?v=2002?s=Drugs>
or sub-subsets
<terminology:Snomed?v=2002?s=Drugs?ss=antiallergenic>
and we have also to deal with spaces!
<terminology:Snomed?v=2002?s=Antiallergenic drugs (product)>

pablo pazos wrote:

I think the URI way to define constraint bindings can be ambiguous
and hide some semantics needed to understand where to find the
terminology terms and codes.
Please correct me if I'm wrong:

One archetype can have this: ["ac0001"] = <terminology:Snomed/2002?subset=DrugForm>
And another this: ["ac0001"] = <terminology:Snomed/2002?s=DrugForm>

The latest Archetype Editor beta release that I mentioned before would
output the first one. It wouldn't know that s=DrugForm is supposed to
be a subset in the second one.

As Diego mentioned before, this needs to be documented in a
specification somewhere. I don't write the specs, though, so I'm not
sure where that would be ;-)

- Peter
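
To make the subset= versus s= problem concrete, here is a small Python sketch using the standard urllib.parse module. It only shows that a generic URI library sees two unrelated parameter names; the helper name query_params is hypothetical, and nothing here reflects how the Archetype Editor actually reads these values.

```python
from urllib.parse import parse_qs, urlsplit


def query_params(uri: str) -> dict:
    """Return the query parameters of a URI as a dict of lists."""
    return parse_qs(urlsplit(uri).query)


a = query_params("terminology:Snomed/2002?subset=DrugForm")
b = query_params("terminology:Snomed/2002?s=DrugForm")

print(a)       # {'subset': ['DrugForm']}
print(b)       # {'s': ['DrugForm']}
print(a == b)  # False
```

Without a specification stating that s and subset mean the same thing, software has no basis for treating the two bindings as equivalent.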

Diego Boscá wrote:

and we have also to deal with spaces!
<terminology:Snomed?v=2002?s=Antiallergenic drugs (product)>

Spaces are illegal in URIs. The correct form for the subset would be:

  subset=Antiallergenic%20drugs%20(product)

- Peter
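
For reference, the percent-encoded form above can be produced with Python's standard library. This is only a sketch of the encoding step; keeping the parentheses unescaped (safe="()") is a choice made here to match the form shown above.

```python
from urllib.parse import quote, unquote

term = "Antiallergenic drugs (product)"

# Percent-encode the space; leave "(" and ")" as they are.
encoded = quote(term, safe="()")
print(encoded)                   # Antiallergenic%20drugs%20(product)

# Decoding recovers the original term.
print(unquote(encoded) == term)  # True
```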

That was my point: that was a real Snomed attribute, with the spaces.
The encoded result is even more unreadable.

I'm confused as to whether the intention here was really URI, URL or
URN?

My understanding was that the use of DV_URI for term binding in archetypes was
more in the vein of global identification of resources (more like a URN)
rather than actually telling the software how to get to the resource
(à la URL).

So I imagined that each EHR implementation would need to somehow
have a lookup table from terminology resources -> actual terminology
resource access mechanism. I mean, you can't actually use the
terminology reference as an actual URL and do a GET on it? What is
the return value? In the absence of an agreed terminology service
layer definition within openEHR I can't see how that could possibly work.

So that is why I thought that they would be more along the lines of

["ac0001"] = <[urn:fdc:nehta.gov.au:2010:ctppconcepts]>

or (in URL format, but clearly not an actual web location)

["ac0001"] = <[http://nehta.gov.au/cttpconcepts]>

So to answer Pablo's question

So, how can a machine know the difference or
equivalency of both URIs? (URIs are universal, but
not a unique way to identify a terminology or a subset).

They are equivalent if the canonical forms of the URIs are equal.
If the archetype is published globally, then the URI needs to be
globally unique, and every implementation that wants to deal
with the archetype will need to have a mechanism for turning
the globally unique URI into an actual terminology set.

I don't think implementations are meant to be looking inside the
URI to try to extract parameters, subsets etc. If they are then
we are at least one entire specification short of implementing
archetypes..

Andrew
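
A minimal Python sketch of comparing bindings by canonical form, treating everything after the scheme as opaque. It applies only two of the normalizations from RFC 3986 (lower-casing the scheme and upper-casing the hex digits of percent-escapes); the function names canonical and same_binding are hypothetical, and a real implementation would need an agreed, fuller set of rules.

```python
import re


def canonical(uri: str) -> str:
    """Partial RFC 3986 normalization: scheme case and percent-escape case."""
    scheme, sep, rest = uri.partition(":")
    # Upper-case the hex digits in escapes such as %2f -> %2F
    rest = re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), rest)
    return scheme.lower() + sep + rest


def same_binding(a: str, b: str) -> bool:
    """Two bindings are treated as equal only if their canonical forms match."""
    return canonical(a) == canonical(b)


print(same_binding("URN:fdc:nehta.gov.au:2010:ctppconcepts",
                   "urn:fdc:nehta.gov.au:2010:ctppconcepts"))  # True
print(same_binding("terminology:Snomed/2002?subset=DrugForm",
                   "terminology:Snomed/2002?s=DrugForm"))      # False
```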

Just to clarify some more, my contention is that you cannot
look inside an arbitrary URI to pick out values without
consulting the formal, scheme-dependent spec.

So in the case of a 'http' URI, we can read the spec and know
what the bits mean - _for the purposes of fetching data
from web servers using HTTP_. I can't imagine how that
is possibly what is intended by putting a URI into an
archetype - we can't seriously be suggesting that everyone
who uses the archetype is going to descend on
some poor webserver named in the URL and fetch data
in some arbitrary format?

So if you want a URI scheme that has identifiable bits
for snomed queries etc, someone needs to specify a

urn:snomed:xxxx,yyyy,zzzz

spec. If not, all you can do is compare URIs for equality
and assume there is some external mechanism for saying
what the URI actually means.

Andrew
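
The "external mechanism" could be as simple as a local lookup table keyed on the opaque URI. The Python sketch below is only an illustration under that assumption: REGISTRY, resolve and load_ctpp_concepts are hypothetical names, and the codes returned are placeholders, not real terminology content.

```python
from typing import Callable, Dict, List


def load_ctpp_concepts() -> List[str]:
    """Hypothetical loader; a real one would query a local terminology service."""
    return ["placeholder-code-1", "placeholder-code-2"]


# Local mapping from opaque binding URIs to whatever gives access to the term set.
REGISTRY: Dict[str, Callable[[], List[str]]] = {
    "urn:fdc:nehta.gov.au:2010:ctppconcepts": load_ctpp_concepts,
}


def resolve(binding_uri: str) -> List[str]:
    """Look the URI up by equality only; never inspect its internal structure."""
    try:
        loader = REGISTRY[binding_uri]
    except KeyError:
        raise LookupError(f"No local mapping for binding {binding_uri!r}") from None
    return loader()


print(resolve("urn:fdc:nehta.gov.au:2010:ctppconcepts"))
```

Each implementation would maintain its own table; the archetype only carries the identifier.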

Also, binding to a URL seems like a bad decision for archetype
maintainability.

(Just to clarify) I know that constraint binding URIs are not actual working URIs that you can fetch à la HTTP. I understand that here they are used as identifiers which, with a mapping somewhere, let our system access the real terminology source.

By the centralized service I did not mean a way to get the content of the terminology, but a way to get the global, unique terminology identifiers for use in archetypes, so that for each terminology and subset we have only one id (URI/URN). We could have a mapping to an OID too (another global identifier, less human friendly, but it works).

The problems are:

  • We need some way to define/specify the canonical form of a URI/URN; we must agree on a terminology of names (of terminologies :D) and subsets.
  • Is Snomed the same as SNOMED? Is ICD10 the same as ICD 10 or CIE 10 (CIE = ICD in Spanish)?
  • We cannot rely on one tool implementation to make a decision that is not in the specs: other tools can make a different decision, so the generated archetypes will be inconsistent.

- We need some way to define/specify the canonical form of a
URI/URN; we must agree on a terminology of names (of terminologies :D) and
subsets.
- Is Snomed the same as SNOMED? Is ICD10 the same as ICD 10 or CIE 10
(CIE = ICD in Spanish)?
- We cannot rely on one tool implementation to make a decision that is not
in the specs: other tools can make a different decision, so the generated
archetypes will be inconsistent.

Yes - agree totally. I think the spec at various points says that the UMLS
terminology identifier name should be used, but I don't think this is
clear enough - UMLS is a bit haphazard in how they name things.

Would like to see an agreed upon list of canonical URI/URN for the
terminology bindings that people are using in practice with real
current terminology sources so that we can get some harmonization.

Andrew

And just as a comment, in the ADL 1.4 specs the example shows a URL.
Maybe it would be better if a URN were shown.

Surely spaces should not be an issue here as these strings do not really identify anything. Instead, one should be using SCTIDs as in:
terminology:Snomed?v=2002?s=135394005

Further issues include:

* the version should be specified using an ISO 8601 basic representation of YYYYMMDD (or YYYYMMDDThhmmss Z for development versions),
* "Snomed" is insufficient - is this the International release, or SNOMED CT-AU or ... My understanding of Release Format 2 (see 7.4.4.13 in the Technical Implementation Guide) indicates that the moduleId (also an SCTID) is the appropriate thing to use, and
* one may also (almost always) wish to use a ReferenceSet for terminology binding. These are also designated by an SCTID (and would require a moduleId and version as well)

This would then give us something like:

    terminology:SNOMED?m=32506021000036107&v=20101130&s=135394005
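
A Python sketch of assembling a URI of this shape from its parts. The parameter names m, v and s are taken from the example above; the function name snomed_binding and the validation rules (all-digit SCTIDs, an eight-digit YYYYMMDD version) are assumptions made for illustration, not part of any published scheme.

```python
from datetime import datetime
from urllib.parse import urlencode


def snomed_binding(module_id: str, version: str, refset_id: str) -> str:
    """Build a terminology:SNOMED URI; hypothetical helper for illustration."""
    # Version must be an ISO 8601 basic date, YYYYMMDD.
    if len(version) != 8:
        raise ValueError(f"version must be YYYYMMDD, got {version!r}")
    datetime.strptime(version, "%Y%m%d")  # raises ValueError for invalid dates

    # Assumption: SCTIDs are written as plain digit strings.
    for sctid in (module_id, refset_id):
        if not sctid.isdigit():
            raise ValueError(f"not a numeric SCTID: {sctid!r}")

    query = urlencode({"m": module_id, "v": version, "s": refset_id})
    return "terminology:SNOMED?" + query


print(snomed_binding("32506021000036107", "20101130", "135394005"))
# terminology:SNOMED?m=32506021000036107&v=20101130&s=135394005
```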

Diego Boscá wrote:

and we have also to deal with spaces!
<terminology:Snomed?v=2002?s=Antiallergenic drugs (product)>

Spaces are illegal in URIs. The correct form for the subset would be:

        subset=Antiallergenic%20drugs%20(product)

- Peter

Hi Michael,

Not every terminology version is a date. In ICD 10, the version is “10”. I think the version to be a valid date is not a problem here.

Indeed, in Australia, it would be ICD-10-AM but the version would correspond to the particular "Edition" you're using. Hence my example URI still included the string SNOMED so that one knows how to interpret the v=, s=, m= elements. Clearly every standard terminology is going to have its own mechanisms for indicating versions, etc.

But in this instance, my real point was that SCTIDs need to be used rather than terms.

michael