openEHR XML-schema Questions

thomas.beale · 20 October 2005 13:22

Dear all,

Sam Heard at Ocean Informatics has built a schema for openEHR, to be published as open source on openEHR.org as soon as possible. There remain some open questions about how best to componentise XML-schemas. The current structure consists of three schemas, which correspond to three pragmatic groupings of classes in the openEHR reference model. The three schemas are available (in HTML-viewable form) at:

http://oceaninformatics.biz/schema/documentation/Composition.xsd.html
http://oceaninformatics.biz/schema/documentation/Content.xsd.html
http://oceaninformatics.biz/schema/documentation/BaseTypes.xsd.html

The two higher-level schemas use the ones below them, in a simple chained-inclusion fashion; each schema is given its own local prefix when used in other schemas, e.g. types defined in the third schema are denoted bt:XXXX where they are used in the other schemas (“bt” = base types).

The problems we might have are:

A top-level type called LOCATABLE is defined in both the Composition and Content schemas above, but in the openEHR reference model it is really a base type which is inherited into nearly everything. Would it be better to put such classes into a separate small schema and inherit them (without namespacing)?
A number of classes representing the openEHR clinical data structures are in the Content schema, along with the openEHR ENTRY, OBSERVATION, SECTION and other classes. The data structure classes will be required by the openEHR demographic schema, when we write it, but the ENTRY etc. classes will not. Should we split the data structure classes out into another includable schema?

When the DSTC did an original openEHR XML-schema about two years ago, they had one schema per information model (i.e. per major top-level package) in openEHR, and did not use namespacing with inclusion (i.e. no prefixes on references to included types). They also had a “core” schema which included everything, giving the effect of one big schema. Is this approach better, or the other one? It should be noted that the schemas posted are already in use and work reasonably well, so this is not a question of making them work at all; it is a question of good design.

thomas beale
If you have any questions about using this list, please send a message to d.lloyd@openehr.org

Isabel_Roman · 20 October 2005 17:00

In my opinion the best design is to have one schema per information model.
This facilitates a logical division of the designed types and helps to keep
the reference model in mind. Using namespaces is a good idea to clarify
things, I believe. Using this philosophy to represent the RM in OWL gives
an easy way to understand the ontologies, so I think it will be good for
XML-schema too.
Regards
Isabel Román

Sam · 20 October 2005 22:18

Hello all

I have divided the schema into three pieces and used ‘import’ to reference the content (section, entries, data structures, clusters and elements) and basetype schemas (datatypes, common and support). The idea was that the schemas could be used standalone with queries for returning compositions (see the ‘composition’ element in the composition schema), returning composition fragments (see the ‘items’ element in the Content schema) and data values (the ‘values’ element in the BaseTypes schema).

The alternative is to use the ‘include’ statement, but then a type reference gives no indication of which schema document the type comes from.
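
To make the difference concrete: with xs:import each imported schema has its own target namespace, so a prefixed reference such as bt:DV_CODED_TEXT can be traced to a namespace and from there to a schema document, whereas xs:include pulls definitions into the including schema's own target namespace, so references carry no such hint. The Python sketch below resolves the prefixed references in a cut-down stand-in for the Composition schema. The namespace URIs, the schemaLocation values and the mapping of the st prefix to a structure schema are assumptions made for the illustration, not taken from the published schemas.

```python
import io
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Cut-down stand-in for the Composition schema. The namespace URIs and
# schemaLocation values are invented placeholders (assumptions).
COMPOSITION_XSD = b"""<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:bt="http://example.org/openehr/basetypes"
    xmlns:st="http://example.org/openehr/structure"
    targetNamespace="http://example.org/openehr/composition">
  <xs:import namespace="http://example.org/openehr/basetypes"
             schemaLocation="BaseTypes.xsd"/>
  <xs:import namespace="http://example.org/openehr/structure"
             schemaLocation="Structure.xsd"/>
  <xs:complexType name="COMPOSITION">
    <xs:complexContent>
      <xs:extension base="st:LOCATABLE">
        <xs:sequence>
          <xs:element name="category" type="bt:DV_CODED_TEXT"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>"""


def type_origins(xsd_bytes):
    """Map each prefixed type/base reference to the imported schema document."""
    prefixes = {}
    root = None
    # "start-ns" events expose the prefix-to-namespace declarations, which
    # ElementTree otherwise discards while parsing.
    events = ET.iterparse(io.BytesIO(xsd_bytes), events=("start-ns", "start"))
    for event, payload in events:
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif root is None:
            root = payload  # first element seen is the xs:schema root

    # namespace -> schemaLocation, as declared by the xs:import elements
    locations = {
        imp.get("namespace"): imp.get("schemaLocation")
        for imp in root.iter(f"{{{XS}}}import")
    }

    origins = {}
    for node in root.iter():
        for attr in ("type", "base"):
            ref = node.get(attr)
            if ref and ":" in ref:
                prefix = ref.split(":", 1)[0]
                origins[ref] = locations.get(prefixes.get(prefix))
    return origins


origins = type_origins(COMPOSITION_XSD)
for ref, document in origins.items():
    print(ref, "is defined in", document)
```

With included schemas the same references would be unprefixed (or all share one prefix), so a lookup of this kind would have nothing to work from.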

Sam

Sam · 21 October 2005 01:26

Dear All

The schemas on the Ocean site are now in four parts and use inheritance of LOCATABLE across the schemas.

http://oceaninformatics.biz/schema/documentation/Composition.xsd.html
http://oceaninformatics.biz/schema/documentation/Content.xsd.html
http://oceaninformatics.biz/schema/documentation/Structure.xsd.html
http://oceaninformatics.biz/schema/documentation/BaseTypes.xsd.html

This addresses some of the comments I have received.

Sam

Jose_Alberto_Maldona · 21 October 2005 15:16

Hi,

Just one question.

Why are the class names not included in the XML-Schema? For instance, it would be convenient for multi-valued attributes, in order to distinguish one data instance from another (imagine the potential mess if the attribute is not ordered). It would also make it “more compatible” with archetype paths of the form class\attribute\class\attribute. Besides, much typing information is lost, which makes the XML document less self-descriptive.

regards

thomas.beale · 21 October 2005 16:07

Hi Alberto,

To be more precise, I suspect you are asking why the class names are not in the data - they are in the schema. I think it would be better. But can you provide specific comments to show why the paths won't match the archetype paths (which is what we want - so if it doesn't work, we will change the schema)?

- thomas

Jose_Alberto_Maldona · 21 October 2005 17:04

Hi Tom,

Yes, you are right, my concern is about the absence of elements named after classes in the data. Of course, in the schema you can find the class name in the construct:

<xs:complexType name="class_name" …>

but not in elements of the form <xs:element name="class_name" type="…">

Thus it is not possible to have paths of the form class\attribute\class… in the XML data.

For instance,

<xs:complexType name="COMPOSITION">
  <xs:complexContent>
    <xs:extension base="st:LOCATABLE">
      <xs:sequence>
        <xs:element name="category" type="bt:DV_CODED_TEXT"/>
        …
      </xs:sequence>
      <xs:attribute name="rm_version" type="xs:string" use="required"/>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

Let’s focus on the attribute category, of type DV_CODED_TEXT:

<xs:complexType name="DV_CODED_TEXT">
  <xs:complexContent>
    <xs:extension base="DV_TEXT">
      <xs:sequence>
        <xs:element name="defining_code" type="CODE_PHRASE"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

and the type CODE_PHRASE is:

<xs:complexType name="CODE_PHRASE">
  <xs:sequence>
    <xs:element name="code_string" type="xs:string" maxOccurs="1" minOccurs="1"/>
    <xs:element name="terminology_id" type="TERMINOLOGY_ID" maxOccurs="1" minOccurs="1"/>
  </xs:sequence>
</xs:complexType>

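so in the XML documents the element “category” would be:

<category>
  ........ (attributes defined in DV_TEXT)
  <defining_code>
    <code_string>here_a_code</code_string>
    <terminology_id>.....</terminology_id>
  </defining_code>
</category>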

so all the paths in the data are chains of attribute names.

I have probably misunderstood the aim of the XML-Schema; I assumed that it defines XML documents compliant with the RM.
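
The observation that data paths are chains of attribute names can be checked with a small Python sketch. It parses a hypothetical category instance built from the three type definitions quoted above and lists the path to every leaf element. The instance is an assumption: the DV_TEXT content is left out, terminology_id is simplified to plain text because TERMINOLOGY_ID is not shown in this thread, and the terminology value is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance fragment (assumption): DV_TEXT content omitted,
# terminology_id simplified to text, values invented for the example.
CATEGORY_XML = """<category>
  <defining_code>
    <code_string>here_a_code</code_string>
    <terminology_id>some_terminology</terminology_id>
  </defining_code>
</category>"""


def leaf_paths(element, prefix=""):
    """Return the slash-separated path of every leaf element under `element`."""
    path = f"{prefix}/{element.tag}"
    children = list(element)
    if not children:
        return [path]
    paths = []
    for child in children:
        paths.extend(leaf_paths(child, path))
    return paths


paths = leaf_paths(ET.fromstring(CATEGORY_XML))
for p in paths:
    print(p)
```

Every step in the resulting paths is an attribute name (category, defining_code, code_string); the class names DV_CODED_TEXT and CODE_PHRASE appear only in the schema.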

regards

Sam · 23 October 2005 21:57

Jose

Thank you for your input. The design of the schema is based on two principles:

The data should conform to the reference model.
The paths to the data should be as close as possible to the paths in the archetypes.

Due to the recursive nature of the schema (e.g. clusters can contain clusters or elements), a generic term for these classes (i.e. items) is used within the path. Each item is then typed as either an element or a cluster. This means that the path is ‘/items[atXXXX]/items[atXXXX]..’ regardless of the actual class name. The typing approach also means that inheritance can be used to ensure correctness (e.g. the generic type ITEM, which is a superclass of ELEMENT and CLUSTER, is the declared type wherever either of these classes is valid).

The type of the item is always available in the @type= attribute.
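
A minimal Python sketch of this idea follows: every node is an items element, the concrete class is carried in a type attribute, and a path of the form /items[atXXXX]/items[atXXXX] is followed step by step. Several details are assumptions made for the illustration: the type attribute is taken to be the standard xsi:type, the node code is assumed to live in an attribute called archetype_node_id, and the at-codes and the enclosing data element are invented.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
XSI_TYPE = f"{{{XSI}}}type"

# Hypothetical data (assumption): attribute name archetype_node_id, the
# at-codes and the wrapping <data> element are invented for this sketch.
DATA_XML = f"""<data xmlns:xsi="{XSI}">
  <items xsi:type="CLUSTER" archetype_node_id="at0001">
    <items xsi:type="ELEMENT" archetype_node_id="at0002"/>
    <items xsi:type="CLUSTER" archetype_node_id="at0003">
      <items xsi:type="ELEMENT" archetype_node_id="at0004"/>
    </items>
  </items>
</data>"""


def resolve(root, path):
    """Follow a path such as '/items[at0001]/items[at0002]' starting at `root`.

    Returns the matching element, or None if any step has no match.
    """
    node = root
    for step in path.strip("/").split("/"):
        name, _, code = step.partition("[")
        code = code.rstrip("]")
        node = next(
            (
                child
                for child in node
                if child.tag == name and child.get("archetype_node_id") == code
            ),
            None,
        )
        if node is None:
            return None
    return node


root = ET.fromstring(DATA_XML)
for path in ("/items[at0001]", "/items[at0001]/items[at0003]/items[at0004]"):
    print(path, "->", resolve(root, path).get(XSI_TYPE))
```

The path syntax is the same whether a step lands on a cluster or an element; the class is read from the type attribute only after the node has been found.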

Although the more traditional tagging by class name was investigated, the current approach works much better with available tools.

Cheers, Sam
