openEHR XML-schema Questions

Dear all,

Sam Heard at Ocean Informatics has built a schema for openEHR, to be published open source on openEHR.org as soon as possible. There remain some open questions about how best to componentise XML schemas. The current structure is three schemas, which correspond to three pragmatic groupings of classes in the openEHR reference model. The three schemas are available (in HTML viewable form) at:

The problems we might have are:

  1. A top-level type called LOCATABLE is defined in both the Composition and Content schemas above, but in the openEHR reference model it is really a base type that is inherited into nearly everything. Would it be better to put such classes into a separate small schema and inherit them (without namespacing)?
  2. A bunch of classes representing the openEHR clinical data structures are in the Content schema, along with the openEHR ENTRY, OBSERVATION, SECTION and other classes. The data structure classes will be required by the openEHR demographic schema when we write it, but the ENTRY etc. classes will not. Should we split the data structure classes out into another includable schema?

When the DSTC did an original openEHR XML schema about 2 years ago, they had one schema per information model (i.e. per major top-level package) in openEHR, and did not use namespacing with inclusion (i.e. no prefixes on references to included types). They also had a “core” schema which included everything, giving the effect of one big schema. Is this approach better, or the other? It should be noted that the schemas posted are already in use and work reasonably well, so this is not a question of making them work at all; it is a question of good design.
- thomas beale

If you have any questions about using this list, please send a message to d.lloyd@openehr.org

In my opinion the best design is to have one schema per information model.
This facilitates a logical division of the designed types and keeps the
reference model in mind. The use of namespaces is a good idea to clarify
things (I believe). Using this philosophy to represent the RM in OWL gives
an easy way to understand the ontologies, so I think it will be good for
XML schema too.
Regards
Isabel Román

Hello all

I have divided the schema into three pieces and used ‘import’ to reference the content (section, entries, data structures, clusters and elements) and basetype schemas (datatypes, common and support). The idea was that the schemas could be used standalone with queries for returning compositions (see the ‘composition’ element in the composition schema), returning composition fragments (see the ‘items’ element in the Content schema) and data values (the ‘values’ element in the BaseTypes schema).

The alternative is to use the include statement, but then nothing in a type reference indicates which schema document the type comes from.
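
To make the trade-off concrete, here is a minimal sketch of the two mechanisms. The namespace URIs and file names are placeholders, not those of the published schemas; only the `bt` prefix and the DV_CODED_TEXT type name are taken from this thread. With xs:import the referenced schema has its own target namespace, so references to its types carry a prefix:

```xml
<!-- Sketch only: namespace URIs and file names are hypothetical. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:bt="http://example.org/openehr/basetypes"
           targetNamespace="http://example.org/openehr/content"
           elementFormDefault="qualified">
  <!-- The prefix bt shows which schema DV_CODED_TEXT comes from. -->
  <xs:import namespace="http://example.org/openehr/basetypes"
             schemaLocation="BaseTypes.xsd"/>
  <xs:element name="value" type="bt:DV_CODED_TEXT"/>
</xs:schema>
```

With xs:include the included document must have the same target namespace as the including one (or none at all), so its types are referenced without a distinguishing prefix:

```xml
<!-- Sketch only: namespace URI and file name are hypothetical. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://example.org/openehr"
           targetNamespace="http://example.org/openehr"
           elementFormDefault="qualified">
  <!-- Included definitions become part of this schema's own namespace. -->
  <xs:include schemaLocation="BaseTypes.xsd"/>
  <xs:element name="value" type="DV_CODED_TEXT"/>
</xs:schema>
```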

Sam

Dear All

The schemas on the Ocean site are now in four sections and use inheritance of LOCATABLE across the schemas.
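
A minimal sketch of what sharing LOCATABLE through inheritance could look like. The `st` prefix and the `st:LOCATABLE` base follow the COMPOSITION fragment quoted further down this thread; the namespace URIs, the file names and the members given to LOCATABLE here are placeholders, not the contents of the published schemas.

```xml
<!-- Structure.xsd (hypothetical file name): LOCATABLE is defined once. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/openehr/structure"
           elementFormDefault="qualified">
  <!-- Abstract: only concrete subtypes may appear in data. -->
  <xs:complexType name="LOCATABLE" abstract="true">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="archetype_node_id" type="xs:string" use="required"/>
  </xs:complexType>
</xs:schema>
```

```xml
<!-- Composition.xsd (hypothetical file name): imports and extends the shared type. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:st="http://example.org/openehr/structure"
           targetNamespace="http://example.org/openehr/composition"
           elementFormDefault="qualified">
  <xs:import namespace="http://example.org/openehr/structure"
             schemaLocation="Structure.xsd"/>
  <xs:complexType name="COMPOSITION">
    <xs:complexContent>
      <!-- COMPOSITION inherits name and archetype_node_id from LOCATABLE. -->
      <xs:extension base="st:LOCATABLE">
        <xs:attribute name="rm_version" type="xs:string" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

Defining the base type in one place means a second schema (for example a Content schema) can extend the same `st:LOCATABLE` instead of carrying its own copy.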

Sam

Hi,

Just a question.

Why are the class names not included in the XML schema? For instance, they would be convenient for multi-valued attributes, in order to distinguish one data instance from another (imagine the potential mess if the attribute is not ordered). They would also make it “more compatible” with archetype paths of the form class\attribute\class\attribute. Besides, much typing information is lost, which makes the XML document less self-descriptive.

regards

Hi Alberto,

To be more precise, I suspect you are asking why the class names are not in the data - they are in the schema. I think it would be better. But can you provide specific comments to show why the paths won't match the archetype paths? Matching paths is what we want, so if it doesn't work, we will change the schema.

- thomas

Hi Tom,

Yes, you are right: my concern is the absence of elements named after classes in the data. Of course, in the schema you can find the class name in the construct:

<xs:complexType name="class_name" …>

but there are no elements of the form <xs:element name="class_name" type="…">

Thus, paths of the form class\attribute\class… are not possible in the XML data.

For instance,

<xs:complexType name="COMPOSITION">
  <xs:complexContent>
    <xs:extension base="st:LOCATABLE">
      <xs:sequence>
        <xs:element name="category" type="bt:DV_CODED_TEXT"/>

      </xs:sequence>
      <xs:attribute name="rm_version" type="xs:string" use="required"/>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

let’s focus on attribute category of type DV_CODED_TEXT:

<xs:complexType name="DV_CODED_TEXT">
  <xs:complexContent>
    <xs:extension base="DV_TEXT">
      <xs:sequence>
        <xs:element name="defining_code" type="CODE_PHRASE"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

and the type CODE_PHRASE is:

<xs:complexType name="CODE_PHRASE">
  <xs:sequence>
    <xs:element name="code_string" type="xs:string" maxOccurs="1" minOccurs="1"/>
    <xs:element name="terminology_id" type="TERMINOLOGY_ID" maxOccurs="1" minOccurs="1"/>
  </xs:sequence>
</xs:complexType>

so in the XML documents the element “category” would be:

<category> ........ (attributes defined in DV_TEXT) <defining_code><code_string>here_a_code</code_string> ..... </defining_code></category>

so all the paths in the data are chains of attribute names.

I have probably misunderstood the aim of the XML schema; I had assumed that it defines XML documents compliant with the RM.

regards

Jose

Thank you for your input. The design of the schema is based on two principles:

  1. The data should conform to the reference model.
  2. The paths to the data should be as close as possible to the paths in the archetypes.

Due to the recursive nature of the schema (e.g. clusters can contain clusters or elements), a generic term for these classes (i.e. items) is used within the path. The item is then typed as either an element or a cluster. This means that the path is ‘/items[atXXXX]/items[atXXXX]..’ regardless of the actual class name. The typing approach also means that inheritance can be used to ensure correctness (e.g. the generic type ITEM, which is a superclass of ELEMENT and CLUSTER, is the declared type wherever either of these classes is valid).

The type of the item is always available in the @type= attribute.
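
As an illustration of this typing approach, a data fragment might look like the sketch below. It assumes that the type attribute mentioned above is the standard `xsi:type` attribute from the XML Schema instance namespace, and that the node id is carried in an attribute called `archetype_node_id`; both the attribute name and the node ids are made up for the example and are not taken from the published schemas.

```xml
<!-- Sketch only: node ids and the archetype_node_id attribute are illustrative.
     Unprefixed type names assume the types are in no namespace (or in a
     default namespace declared on an enclosing element). -->
<items xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:type="CLUSTER" archetype_node_id="at0001">
  <!-- Same element name at every level; xsi:type names the concrete class. -->
  <items xsi:type="ELEMENT" archetype_node_id="at0002"/>
  <items xsi:type="CLUSTER" archetype_node_id="at0003">
    <items xsi:type="ELEMENT" archetype_node_id="at0004"/>
  </items>
</items>
```

Here the path to the innermost element is /items[at0001]/items[at0003]/items[at0004] whichever class each node has, and a consumer reads `xsi:type` to recover the class name.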

Although the more traditional tagging by class name was investigated, the current approach works much better with available tools.

Cheers, Sam
