TERMINOLOGY_ID

I am trying to create a CODE_PHRASE object for
a region such as 'australia'. The code_string is
"AU", but I can't work out if the terminology_id
should be "countries" (the openehr code set
id described in Terminology.pdf) or "ISO_3166-1"
(the external identifier).

If it is meant to be the external identifier ISO_3166,
what is the openehr codeset id for?

Andrew

It should be the external identifier. The ids like "countries" are
internal to the openEHR models and terminology service, as a means of
ensuring that the models never directly mention particular code-sets when
other ones might be used in the future. I realise that the modelling of
this is not as clear as it should be, but it will probably have to do
for now.

- thomas
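
A minimal sketch of the distinction described above, not the openEHR reference implementation. The class `CodePhrase`, the mapping `OPENEHR_CODE_SET_IDS` and the helper `country_code_phrase` are hypothetical names, and the mapping entries are assumed from this thread rather than copied from the terminology specification. The internal id "countries" is only used to look up which external code set is in use; the CODE_PHRASE itself carries the external identifier.

```python
from dataclasses import dataclass

# Hypothetical mapping from openEHR-internal code set ids to the external
# identifiers currently in use. Entries are assumptions taken from this thread.
OPENEHR_CODE_SET_IDS = {
    "countries": "ISO_3166-1",
    "languages": "ISO_639-1",
}


@dataclass(frozen=True)
class CodePhrase:
    """Illustrative stand-in for the openEHR CODE_PHRASE type."""
    terminology_id: str
    code_string: str


def country_code_phrase(code_string: str) -> CodePhrase:
    # The model only refers to "countries"; the external id is resolved here.
    external_id = OPENEHR_CODE_SET_IDS["countries"]
    return CodePhrase(terminology_id=external_id, code_string=code_string)


australia = country_code_phrase("AU")
print(australia)
```

If a different country code set were adopted later, only the mapping entry would change; code that asks for "countries" would stay the same.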

OK. We should perhaps have a list of "used" external terminology
identifiers, because I can see people constructing the terminology_id
strings in lots of different ways (dashes or underscores etc).

Do you envisage different TERMINOLOGY_IDs for the alpha-2 and
alpha-3 variants of ISO 3166-1? I.e. should there be the
code phrases ("AU", "ISO_3166-1_alpha-2") and
("AUS", "ISO_3166-1_alpha-3"), or just ("AU", "ISO_3166-1") and
("AUS", "ISO_3166-1"), with the kernel intelligently dealing with
the two and three letter codes (and numeric country codes)?

Andrew
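
To make the second option in the question above concrete, here is a rough sketch of how a kernel could tell the variants apart from the shape of the code alone. The function names and the "ISO_3166-1_" plus variant naming scheme are hypothetical (the scheme is the one proposed in the question, not an agreed openEHR id). It checks only the shape of a code, not whether ISO has actually assigned it.

```python
import re

# Shapes of the three ISO 3166-1 code variants.
_VARIANT_PATTERNS = {
    "alpha-2": re.compile(r"[A-Z]{2}"),
    "alpha-3": re.compile(r"[A-Z]{3}"),
    "numeric": re.compile(r"[0-9]{3}"),
}


def iso3166_variant(code_string: str) -> str:
    """Infer which ISO 3166-1 variant a code belongs to from its shape alone."""
    for variant, pattern in _VARIANT_PATTERNS.items():
        if pattern.fullmatch(code_string):
            return variant
    raise ValueError(f"not an ISO 3166-1 shaped code: {code_string!r}")


def explicit_terminology_id(code_string: str) -> str:
    # Hypothetical naming scheme: "ISO_3166-1_" plus the variant name.
    return "ISO_3166-1_" + iso3166_variant(code_string)
```

The shapes do not overlap, so inference is possible, but an explicit terminology id per variant avoids every receiver having to repeat this guess.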

We are already using them - ISO_639-1 means ISO 639-1, which is the
2-character form; ISO_639-2 means the 3-character form (the use of '1'
and '2' is ISO's, not our invention). Both forms are allowed in any
CODE_PHRASE attribute of the appropriate type (e.g. DV_TEXT.language);
this is why we write the invariants the way we do - the CODE_PHRASE is
simply constrained to be a code within a code-set within the 'languages'
or 'countries' logical group in the TERMINOLOGY_SERVICE.

- thomas
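
The invariant described above can be sketched roughly as follows. `CODE_SET_GROUPS` and `code_phrase_in_group` are hypothetical names, not the TERMINOLOGY_SERVICE interface, and only a handful of language codes are listed for illustration.

```python
# Hypothetical contents of a terminology service:
# logical group -> code sets -> codes. Only a few codes are listed.
CODE_SET_GROUPS = {
    "languages": {
        "ISO_639-1": {"en", "de", "fr"},
        "ISO_639-2": {"eng", "deu", "fra"},
    },
}


def code_phrase_in_group(terminology_id: str, code_string: str, group: str) -> bool:
    """True if the code belongs to a code set registered under the logical group."""
    code_sets = CODE_SET_GROUPS.get(group, {})
    return code_string in code_sets.get(terminology_id, set())
```

Under this reading, both ("en", "ISO_639-1") and ("eng", "ISO_639-2") satisfy a constraint on the 'languages' group, while a 3-character code paired with the 2-character code set id does not.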

Ok, but for _countries_ the two and three letter variants are
all encompassed within ISO 3166-1. So should we be using
the terminology ids ISO_3166-1_alpha-2,
ISO_3166-1_alpha-3 and ISO_3166-1_numeric? I think it is important
that the openEHR documentation has a list of the terminology
ids in use (i.e. not just say that we use the ISO 3166 standard, but say
exactly what its terminology id string is).
I would hate for systems down the road to fail to interoperate over
something stupid like whether I used "_2" or "-2" in a terminology
name.

Andrew
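
One way to picture the list asked for above is a published set of exact id strings, checked strictly, with a hint when a string differs only in its separators. This is a sketch under assumptions: `KNOWN_TERMINOLOGY_IDS` and `check_terminology_id` are hypothetical, and the ids listed are the spellings that appear in this thread, not an official openEHR list.

```python
import re

# Hypothetical published list of terminology id strings (spellings from this thread).
KNOWN_TERMINOLOGY_IDS = {
    "ISO_3166-1_alpha-2",
    "ISO_3166-1_alpha-3",
    "ISO_3166-1_numeric",
    "ISO_639-1",
    "ISO_639-2",
}


def _separator_insensitive(terminology_id: str) -> str:
    # Collapse every run of separators so "_2", "-2" and " 2" compare equal.
    return re.sub(r"[\s_:\-]+", "-", terminology_id.strip()).lower()


def check_terminology_id(terminology_id: str) -> str:
    """Return the id if it is on the list; otherwise raise, naming a near miss if any."""
    if terminology_id in KNOWN_TERMINOLOGY_IDS:
        return terminology_id
    wanted = _separator_insensitive(terminology_id)
    for known in sorted(KNOWN_TERMINOLOGY_IDS):
        if _separator_insensitive(known) == wanted:
            raise ValueError(
                f"{terminology_id!r} is not registered; did you mean {known!r}?"
            )
    raise ValueError(f"{terminology_id!r} is not a registered terminology id")
```

Rejecting near misses instead of silently accepting them keeps a single spelling on the wire, which is the interoperability concern raised here.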

Andrew,
See
http://svn.openehr.org/specification/BRANCHES/Release-1.1-candidate/publishing/architecture/terminology.pdf.

Heath

Ok, I have seen that doc - my confusion is that ISO 3166-1 encompasses
both the 2 and 3 letter code variants (and is disambiguated by
appending "alpha-2" or "alpha-3" or "numeric"). I'm happy for openEHR
to say that we are using only the alpha-2 codes (and calling the code set
"ISO_3166-1"), but this is not really made clear
(for instance, the accompanying terminology.xml
file lists numeric, alpha-2 and alpha-3 codes for each territory).

Andrew

Sorry - didn't mean to be dense - I thought that ISO were naming the
3166 code groups the same way as the 639 ones - i.e. -1 and -2. Our
general approach is to construct an id that is as close as possible to
the ISO approach, but with no spaces in the name (that complicates
things unnecessarily); so in this instance we would do
"ISO_3166-alpha-1" and "ISO_3166-alpha-2". What we are really doing here
is selecting columns from the standard ISO 3166 multi-column table
(which has other columns like "name"); I am not sure I really like this,
but don't see what the alternative is - I think we need to treat 2-char
codes and 3-char codes as if they were in fact separate code-sets (i.e.
variants of the same semantic idea, viz. 3166 country codes) - in fact we
have to, since "au" and "aus" are synonyms, not alternatives.
But the approach in ISO doesn't seem very systematic...

What do others think?

- thomas
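
The "selecting columns" idea above can be sketched as follows. The names `ISO_3166_1_ROWS`, `code_set_from_column`, `CODE_SETS` and `same_territory` are hypothetical, and the terminology id strings are one possible spelling, assumed here for illustration. Each code column of the table becomes its own code set, and two code phrases are synonyms when they resolve to the same table row.

```python
# A few rows of the ISO 3166-1 table, one row per territory.
ISO_3166_1_ROWS = [
    {"alpha-2": "AU", "alpha-3": "AUS", "numeric": "036", "name": "Australia"},
    {"alpha-2": "NZ", "alpha-3": "NZL", "numeric": "554", "name": "New Zealand"},
]


def code_set_from_column(column: str) -> dict:
    """Project one code column into its own code set: code -> full table row."""
    return {row[column]: row for row in ISO_3166_1_ROWS}


# Hypothetical terminology ids, one per code column.
CODE_SETS = {
    "ISO_3166-1_" + column: code_set_from_column(column)
    for column in ("alpha-2", "alpha-3", "numeric")
}


def same_territory(phrase_a: tuple, phrase_b: tuple) -> bool:
    """Two (terminology_id, code_string) pairs are synonyms if they hit the same row."""
    row_a = CODE_SETS[phrase_a[0]][phrase_a[1]]
    row_b = CODE_SETS[phrase_b[0]][phrase_b[1]]
    return row_a is row_b
```

With this layout the 2-character and 3-character codes live in separate code sets, yet a receiver can still establish that ("AU", alpha-2) and ("AUS", alpha-3) name the same country.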