Suggestion wrt XML Archetypes & Templates

Adam Flinton wrote:

I would like though to enquire wrt the rationale of containing _id info
in a separate <value/> element.

If you are being consistent
instead of :

       <terminology_id>
           <value>ISO_639-1</value>
       </terminology_id>

it should be simply:

       <terminology_id>ISO_639-1</terminology_id>

or <terminology_id value="ISO_639-1"/>

Adam
  

There is no special rationale. It is simply the default serialisation of
the type TERMINOLOGY_ID.

Lisa

Adam & Lisa,
There is a very specific rationale and it is consistent. The XML Schema is
a direct serialisation of the UML openEHR reference models. Every class is
an XML schema type and every attribute is an element except for
archetype_node_id as it is a metadata attribute. So in the case of
CODE_PHRASE, it has an attribute of terminology_id (hence the element
terminology_id) of type TERMINOLOGY_ID. TERMINOLOGY_ID has an attribute of
value of type string (hence the element value). Paths are absolutely
critical in openEHR and they are based on XPath. If we start arbitrarily
changing the schema based on what someone thinks is good XML we will break
the openEHR path to XPath correspondence making any mapping rules more
complex and error prone. This is why the terminology ID value is not
represented as a value attribute or element text node. Yes, it might seem
inefficient, but it was deemed important to keep the logical model to
implementation mapping consistent and to ensure paths worked. In
comparison to HL7 v3, the data:noise ratio is still considerably better.
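
The path correspondence described above can be sketched as follows. This is
only an illustration, not the openEHR tooling: the wrapper element name
code_phrase, the helper resolve() and the sample code_string value are
assumptions, and ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Current style: one element per model attribute (wrapper name is illustrative).
NESTED = """
<code_phrase>
  <terminology_id>
    <value>ISO_639-1</value>
  </terminology_id>
  <code_string>en</code_string>
</code_phrase>
"""

# Hand-compacted style: the 'value' attribute of TERMINOLOGY_ID is folded away.
COMPACT = (
    "<code_phrase><terminology_id>ISO_639-1</terminology_id>"
    "<code_string>en</code_string></code_phrase>"
)


def resolve(root, model_path):
    """Use a slash-separated attribute path directly as an element path.

    Hypothetical helper: returns the text of the matched element, or None.
    """
    node = root.find(model_path)
    return None if node is None else node.text


root = ET.fromstring(NESTED)
print(resolve(root, "terminology_id/value"))                     # path resolves as-is
print(resolve(ET.fromstring(COMPACT), "terminology_id/value"))   # no match: needs a special rule
```

With the nested form the model path works unchanged; with the compacted form
the same path finds nothing, so a per-type mapping rule would be needed.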

Regards

Heath

Adam,
when you say it 'should' be - either pulled up a level, with an object
attribute removed OR represented as an XML attribute - what is the
driver? Is it semantic (you think there is something wrong with the
representation of the object structure defined by the specification) or
is it to do with space/signal-to-noise (using one of the last two
methods uses fewer characters)?

The way it currently is is due to a direct machine-performed object
serialisation process - in other words, it simply follows the same rules
for transforming any object data into XML. Your suggestion (I presume)
is a special case of the general idea of representing all so-called
basic types (Strings, Integers, dates etc) as XML attributes rather than
as XML elements. But we have already just discussed and agreed that long
text strings (especially containing unicode, backslash quoting and
whitespace) should be XML elements.
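
One concrete reason whitespace-bearing text fits elements better than
attributes is XML attribute-value normalisation: a conforming parser
replaces literal newlines and tabs inside an attribute value with spaces,
while element content keeps them. A minimal sketch (the element and
attribute names item and note are made up for illustration):

```python
import xml.etree.ElementTree as ET

# The same two-line text stored once as an attribute and once as element content.
doc = '<item note="first line\nsecond\tline"><note>first line\nsecond\tline</note></item>'

item = ET.fromstring(doc)
attr_value = item.get("note")        # whitespace normalised by the parser
elem_value = item.find("note").text  # whitespace preserved

print(repr(attr_value))
print(repr(elem_value))
```

Preserving a newline in an attribute would require writing it as a character
reference such as &#10;, which is one more special case for the serialiser.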

As I have said before, what I think is most important is regular
encoding from data to and from XML, so that a) software is as simple and
clean as possible and b) changes are not needed due to particular
content (i.e. data). Now, ideally we would minimise use of bandwidth /
space with the representation as well. The problem is that XML is pretty
poorly designed for efficiently representing data, and has a poor signal
to noise ratio...making data serialise in a way that is either 'more
aesthetic' or smaller always implies more complex software containing
exceptional rules. Further, although XML isn't well designed for data
representation, in its original design, 'attributes' were intended for
meta-data items, rather than 'data'. Whether this semantic needs to be
retained in the XML we are talking about here is a question.
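
The "regular encoding" argument can be illustrated with a serialiser that
has exactly one rule. This is a sketch only, not the actual openEHR
serialisation code: the two dataclasses are cut-down stand-ins for the
reference model classes, and to_xml() and the element name language are
hypothetical.

```python
from dataclasses import dataclass, fields, is_dataclass
import xml.etree.ElementTree as ET


@dataclass
class TerminologyId:
    value: str


@dataclass
class CodePhrase:
    terminology_id: TerminologyId
    code_string: str


def to_xml(name, obj):
    """Single rule: every attribute of an object becomes a child element."""
    elem = ET.Element(name)
    if is_dataclass(obj):
        for f in fields(obj):
            elem.append(to_xml(f.name, getattr(obj, f.name)))
    else:
        elem.text = str(obj)  # basic types become element text
    return elem


phrase = CodePhrase(TerminologyId("ISO_639-1"), "en")
xml_text = ET.tostring(to_xml("language", phrase), encoding="unicode")
print(xml_text)
```

Collapsing <terminology_id><value>...</value></terminology_id> into a single
element or an attribute would mean adding a type-specific branch to to_xml()
and a matching one to whatever reads the data back.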

So the question is: at what level do we include exceptional processing
to reduce space wastage, since this complicates the software? How much
do we compromise the intended semantics of XML, where attributes are
designed for holding meta-data (including real meta-data, e.g. things
like xsi:type etc)?

Any idea of saving space has to be done on the basis of a study of high
volumes of representatively diverse data. Saving 10 bytes is not
interesting, but saving 10Gb/minute in a large data processing system
is. I will go out on a limb and say that 'style' has no place in good
engineering, only good engineering does - correctness, performance,
maintainability etc.
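
As a starting point for such a study, the three encodings discussed in this
thread can be compared by byte count over a sample of (tag, value) pairs.
This is a rough sketch: the function names are hypothetical, the single
sample is a placeholder for real data, and values are not XML-escaped.

```python
def encodings(tag, value):
    """Return the three candidate encodings of one identifier."""
    return {
        "nested": f"<{tag}><value>{value}</value></{tag}>",
        "text": f"<{tag}>{value}</{tag}>",
        "attribute": f'<{tag} value="{value}"/>',
    }


def total_bytes(samples):
    """Sum the UTF-8 size of each encoding over all (tag, value) samples."""
    totals = {"nested": 0, "text": 0, "attribute": 0}
    for tag, value in samples:
        for form, xml in encodings(tag, value).items():
            totals[form] += len(xml.encode("utf-8"))
    return totals


# Placeholder sample; a real study would feed in representative record data.
samples = [("terminology_id", "ISO_639-1")]
totals = total_bytes(samples)
print(totals)
```

The per-item difference is a fixed number of bytes, so whether it matters
depends entirely on how often such items occur in real data and on whether
the transport already compresses the XML.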

With all that in mind - if the community wants to make the appropriate
analysis of data and propose a more space-efficient schema, I am not
against it. But the needs of correctness (= patient safety) must be
satisfied.

- thomas beale

When will the tooling decorate the generated XML archetypes with the
required attribute?

Pretty printing is the norm.

The text should be normalized & the normalization should be enforceable.

Adam