XML serializer (retry due to too large message)

Hi all,

I’ve just finished implementing an XML serializer. I have followed the XML schema specifications carefully, but I’ve deviated from some parts, particularly the ontology section. In the terms element (defined as ARCHETYPE_TERM [1..]) found in language_set of term_definitions and constraint_definitions I found the definition of ARCHETYPE_TERM (found here: http://svn.openehr.org/specification/BRANCHES/Release-1.1-candidate/publishing/its/XML-schema/documentation/Archetype.xsd.html_h-47810521.html) to be a little bit strange. You will see that the items element has a hashTableStringString, but I think the entire items element should be replaced with dictionaryItem [1..] or something similar, since there is no need to have a hashTableStringString when there is only one local definition per at/ac-code.

However, I am not sure if I have followed the specifications correctly in some cases - for example when printing the hashTableStringString in the description part. Have a look at my example below and at the specification and provide me with feedback if something is wrong. If there are many errors, maybe the XML specifications should be clarified :wink: I hope to contribute with the XMLSerializer component to the openEHR Java reference implementation.

I also found some other issues in the XML specification, e.g. spelling mistakes, which I will report in a separate mail. While going through the specification I started thinking that some things could probably be made a bit more compact. Why not include rm_type_name and node_id/cardinality as attributes of the parent element instead of as separate elements? This would make the XML archetypes slightly smaller…
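
To make the size argument concrete, here is a minimal sketch in Python (standard library only) that serializes one node both ways and compares the lengths. The attribute-based layout and the function names are assumptions for illustration; neither layout is confirmed by the schema here.

```python
import xml.etree.ElementTree as ET


def as_child_elements(rm_type_name, node_id):
    """Layout with one sub-element per property."""
    node = ET.Element("children")
    ET.SubElement(node, "rm_type_name").text = rm_type_name
    ET.SubElement(node, "node_id").text = node_id
    return ET.tostring(node, encoding="unicode")


def as_xml_attributes(rm_type_name, node_id):
    """Hypothetical compact layout: the same properties as XML attributes."""
    node = ET.Element("children", {"rm_type_name": rm_type_name, "node_id": node_id})
    return ET.tostring(node, encoding="unicode")


verbose = as_child_elements("ELEMENT", "at0004")
compact = as_xml_attributes("ELEMENT", "at0004")
print(len(verbose), verbose)
print(len(compact), compact)
```

The saving per node is small, but it repeats for every constraint node in the definition section.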

Example of openEHR-EHR-OBSERVATION.height.v1

XML: http://www.imt.liu.se/~matfo/openEHR-EHR-OBSERVATION.height.v1.xml
ADL: http://www.imt.liu.se/~matfo/openEHR-EHR-OBSERVATION.height.v1.adl

Regards,

Mattias Forss

Hi Mattias, I've attached my attempt at a serialized adl instance - perhaps
we can converge on a consensus as to what they should look like!

Mine is incomplete - especially around the ontology section - but I have
done the attributes and children nodes differently, using xsi:type to
indicate the sub-type. This is similar to the way Sam did the reference
model serializations I saw, so I thought similar techniques would be
applied here.

Anyhow, I'm off on another project for a few weeks, but I thought I'd
send you this instance as food for thought.

Andrew

<at:archetype xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:at="openEHR/v1/Archetype">
  <archetype_id>openEHR-EHR-OBSERVATION.blood_pressure.v1</archetype_id>
  <concept_code>at0000</concept_code>
  <original_language>
    <code_string>en</code_string>
    <terminology_id>iso</terminology_id>
  </original_language>
  <description>
    <original_author>
      <item>
        <key>name</key>
        <value>Sam Heard</value>
      </item>
      <item>
        <key>organisation</key>
        <value>Ocean Informatics</value>
      </item>
      <item>
        <key>date</key>
        <value>22/03/2006</value>
      </item>
      <item>
        <key>email</key>
        <value>sam.heard@oceaninformatics.biz</value>
      </item>
    </original_author>
    <other_contributors />
    <lifecycle_state>AuthorDraft</lifecycle_state>
    <details>
      <language>
        <code_string>en</code_string>
        <terminology_id>iso639-1</terminology_id>
      </language>
      <purpose>To record the systemic blood pressure of a person. The
measurement records the systolic and the diastolic pressure by some
means suitable for the result to be seen as a surrogate for the
general and systemic blood pressure.</purpose>
      <keywords>
        <item>observations</item>
        <item>blood pressure</item>
        <item>measurement</item>
      </keywords>
      <use>All blood pressure measurements are recorded using this
archetype. There is a rich state model for use with exercise ECGs and
Tilt Table measurements.</use>
      <misuse>Not to be used for intravascular pressure.</misuse>
    </details>
    <resource_package_uri>http://www.wow.com</resource_package_uri>
  </description>
  <definition>
    <rm_type_name>OBSERVATION</rm_type_name>
    <node_id>at0000</node_id>
    <attributes xsi:type="at:C_SINGLE_ATTRIBUTE" minOccurs="0" maxOccurs="0">
      <rm_attribute_name>guideline_id</rm_attribute_name>
    </attributes>
    <attributes xsi:type="at:C_SINGLE_ATTRIBUTE" minOccurs="1" maxOccurs="1">
      <rm_attribute_name>data</rm_attribute_name>
      <children xsi:type="at:C_COMPLEX_OBJECT">
        <rm_type_name>HISTORY</rm_type_name>
        <node_id>at0001</node_id>
        <attributes xsi:type="at:C_MULTIPLE_ATTRIBUTE" minOccurs="1"
maxOccurs="1">
          <rm_attribute_name>events</rm_attribute_name>
          <children xsi:type="at:C_COMPLEX_OBJECT">
            <rm_type_name>EVENT</rm_type_name>
            <node_id>at0006</node_id>
            <attributes xsi:type="at:C_SINGLE_ATTRIBUTE" minOccurs="1"
maxOccurs="1">
              <rm_attribute_name>data</rm_attribute_name>
              <children xsi:type="at:C_COMPLEX_OBJECT">
                <rm_type_name>ITEM_LIST</rm_type_name>
                <node_id>at0003</node_id>
                <attributes xsi:type="at:C_MULTIPLE_ATTRIBUTE"
minOccurs="1" maxOccurs="1">
                  <rm_attribute_name>items</rm_attribute_name>
                  <children xsi:type="at:C_COMPLEX_OBJECT">
                    <rm_type_name>ELEMENT</rm_type_name>
                    <occurrences>
                      <includes_maximum>true</includes_maximum>
                      <includes_minimum>true</includes_minimum>
                      <maximum>1</maximum>
                      <minimum>0</minimum>
                    </occurrences>
                    <node_id>at0004</node_id>
                    <attributes xsi:type="at:C_SINGLE_ATTRIBUTE"
minOccurs="1" maxOccurs="1">
                      <rm_attribute_name>value</rm_attribute_name>
                      <children xsi:type="at:C_QUANTITY">
                        <property>
                          <code_string>125</code_string>
                          <terminology_id>openehr</terminology_id>
                        </property>
                        <list>
                          <magnitude>
                            <maximum>1000</maximum>
                            <minimum>0</minimum>
                          </magnitude>
                          <units>mm[Hg]</units>
                        </list>
                      </children>
                    </attributes>
                  </children>
                  <children xsi:type="at:C_COMPLEX_OBJECT">
                    <rm_type_name>ELEMENT</rm_type_name>
                    <occurrences>
                      <maximum>1</maximum>
                      <minimum>0</minimum>
                    </occurrences>
                    <node_id>at0005</node_id>
                    <attributes xsi:type="at:C_SINGLE_ATTRIBUTE"
minOccurs="1" maxOccurs="1">
                      <rm_attribute_name>value</rm_attribute_name>
                      <children xsi:type="at:C_QUANTITY">
                        <property>
                          <code_string>125</code_string>
                          <terminology_id>openehr</terminology_id>
                        </property>
                        <list>
                          <magnitude>
                            <maximum>1000</maximum>
                            <minimum>0</minimum>
                          </magnitude>
                          <units>mm[Hg]</units>
                        </list>
                      </children>
                    </attributes>
                  </children>
                  <cardinality>
                    <is_ordered>true</is_ordered>
                    <is_unique>false</is_unique>
                    <interval>
                      <maximum>0</maximum>
                      <minimum>0</minimum>
                    </interval>
                  </cardinality>
                </attributes>
              </children>
            </attributes>
            <attributes xsi:type="at:C_SINGLE_ATTRIBUTE" minOccurs="1"
maxOccurs="1">
              <rm_attribute_name>state</rm_attribute_name>
              <children xsi:type="at:C_COMPLEX_OBJECT">
                <rm_type_name>ITEM_LIST</rm_type_name>
                <node_id>at0007</node_id>
                <attributes xsi:type="at:C_MULTIPLE_ATTRIBUTE"
minOccurs="1" maxOccurs="1">
                  <rm_attribute_name>items</rm_attribute_name>
                  <cardinality>
                    <is_ordered>true</is_ordered>
                    <is_unique>false</is_unique>
                    <interval>
                      <maximum>0</maximum>
                      <minimum>0</minimum>
                    </interval>
                  </cardinality>
                </attributes>
              </children>
            </attributes>
          </children>
          <cardinality>
            <is_ordered>false</is_ordered>
            <is_unique>true</is_unique>
            <interval>
              <maximum>0</maximum>
              <minimum>1</minimum>
            </interval>
          </cardinality>
        </attributes>
      </children>
    </attributes>
  </definition>
  <ontology>
    <term_defintions>
      <language>en</language>
      <terms>
        <code>at0000</code>
        <items>
          <item>
            <key>description</key>
            <value>the measurement of system arterial blood pressure
which is deemed to represent the actual systemic blood
pressure</value>
          </item>
          <item>
            <key>text</key>
            <value>Blood pressure measurement</value>
          </item>
        </items>
      </terms>
      <terms>
        <code>at0001</code>
        <items>
          <item>
            <key>description</key>
            <value>history structural node</value>
          </item>
          <item>
            <key>text</key>
            <value>history</value>
          </item>
        </items>
      </terms>
      <terms>
        <code>at0002</code>
        <items>
          <item>
            <key>description</key>
            <value>baseline event in event history</value>
          </item>
          <item>
            <key>text</key>
            <value>baseline reading</value>
          </item>
        </items>
      </terms>
      <terms>
        <code>at0003</code>
        <items>
          <item>
            <key>description</key>
            <value>systemic arterial blood pressure</value>
          </item>
          <item>
            <key>text</key>
            <value>blood pressure</value>
          </item>
        </items>
      </terms>
    </term_defintions>
  </ontology>
</at:archetype>

Hi Andrew,

I looked at your example and I think using xsi:type to indicate sub-types could be a good approach where the number of child elements is specified to be exactly one. This means that we don’t need to add an extra sub-element, e.g. (details here: http://svn.openehr.org/specification/BRANCHES/Release-1.1-candidate/publishing/its/XML-schema/documentation/Archetype.xsd.html_h439624612.html). However, I don’t think the XML schema specification of the AOM explicitly states that xsi:type should be used in XML archetypes. I would appreciate it if openEHR published some XML archetypes that exemplified the standard way to express them. I don’t like the idea of having several ways of representing archetypes in XML, so it would be nice if some examples were published to lead the way.

When there is more than one child inside an element, the xsi:type approach requires us to repeat the container element for each child instead of placing all children inside a single container element, so you have

...



instead of

... ... ...

The first example is of course more compact, but the element name “children” doesn’t make sense, since it doesn’t contain all of the attribute’s children. The second example collects all the children in one single container element, but again, I don’t know what the specification means by the occurrences brackets, e.g. what does [0..] refer to in C_OBJECT [0..] ? Does it refer to the element or to the C_OBJECT element? This should be clarified. I have been dealing a lot with ADL and I can say that the second example seems more plausible to me; I see the children element as equivalent to an attribute’s “matches {}” in ADL.
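
Since the two layouts are easy to mix up, here is a rough Python sketch that builds both with the standard library. It assumes the first layout repeats a children element carrying xsi:type, and the second wraps type-named elements in a single children container. The function names are made up for illustration, and neither output is claimed to be what the schema mandates.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("xsi", XSI)


def repeated_layout(child_types):
    """One 'children' element per child, sub-type given by xsi:type."""
    attribute = ET.Element("attributes")
    for type_name in child_types:
        ET.SubElement(attribute, "children", {f"{{{XSI}}}type": type_name})
    return attribute


def container_layout(child_types):
    """A single 'children' container holding one type-named element per child."""
    attribute = ET.Element("attributes")
    container = ET.SubElement(attribute, "children")
    for type_name in child_types:
        ET.SubElement(container, type_name)
    return attribute


types = ["C_COMPLEX_OBJECT", "C_COMPLEX_OBJECT"]
repeated = repeated_layout(types)
contained = container_layout(types)
print(ET.tostring(repeated, encoding="unicode"))
print(ET.tostring(contained, encoding="unicode"))
```

In the first layout the parent has as many children elements as it has child objects; in the second it always has exactly one.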

Any thoughts about this?

Regards,

Mattias

2006/11/16, Andrew Patterson <andrewpatto@gmail.com>:

Hi Mattias,
I would have to agree with Andrew's approach to serializing the AOM to XML.
The existence of an XML element representing the AOM type would be
inconsistent with the openEHR RM XML schema and would not work well with
common development framework XML serializers. Ocean has also implemented an
XML archetype serializer but I currently don't have the details on its
output. I am sure Sam will make comment when he grounds himself from
travel.

I think we need a more formal approach to gaining consensus on these
kinds of technical artefacts before they are released for community
consumption. Having said that, we don't want this to turn into an endless
process, which agreeing on an XML representation easily can.

Regards

Heath

Heath Frankel
Product Development Manager
Ocean Informatics

Ground Floor, 64 Hindmarsh Square
Adelaide, SA, 5000
Australia

ph: +61 (0)8 8223 3075
fax:+61 (0)8 8223 2570
mb: +61 (0)412 030 741
email: heath.frankel@oceaninformatics.biz

Mattias,
You don’t seem to follow the AOM when generating your XML instances. For example, the C_MULTIPLE_ATTRIBUTE class has a property of ‘members’ which is a list of C_OBJECT. This property name should be used in the XML instance so you would get:




The alternative is to have the following, but I would suggest that members is not quite right here, just like your use of children below.



I would also suggest that we follow the AOM more closely and use an existence element rather than minOccurs and maxOccurs. By using the latter you are mimicking ADL rather than following the AOM. Following the openEHR RM for the Interval type, you would therefore get:

1 1 ...
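
As an illustration of the suggestion above, here is a small Python sketch that moves minOccurs/maxOccurs from XML attributes into an existence element. The child element names lower and upper are an assumption (they follow the Interval naming mentioned elsewhere in this thread), as is placing existence first; the schema may well name or order them differently.

```python
import xml.etree.ElementTree as ET


def use_existence_element(attribute):
    """Replace minOccurs/maxOccurs attributes with an <existence> interval element."""
    lower = attribute.attrib.pop("minOccurs")
    upper = attribute.attrib.pop("maxOccurs")
    existence = ET.Element("existence")
    # 'lower' and 'upper' are assumed names, not taken from the published XSD.
    ET.SubElement(existence, "lower").text = lower
    ET.SubElement(existence, "upper").text = upper
    attribute.insert(0, existence)
    return attribute


source = (
    '<attributes minOccurs="1" maxOccurs="1">'
    "<rm_attribute_name>data</rm_attribute_name>"
    "</attributes>"
)
converted = use_existence_element(ET.fromstring(source))
print(ET.tostring(converted, encoding="unicode"))
```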




Regards

Heath

Hi Heath,

I know that the C_MULTIPLE_ATTRIBUTE class has a property of ‘members’ in the AOM (since I know the AOM very much in detail), but it’s not in the XML schema specification. I have not followed the AOM, because I thought I was only supposed to look at the schema. Here’s the XML schema and XML instance of C_MULTIPLE_ATTRIBUTE with the property ‘children’ and not ‘members’ as in the AOM: http://svn.openehr.org/specification/BRANCHES/Release-1.1-candidate/publishing/its/XML-schema/documentation/Archetype.xsd.html_h783584366.html . If you could explain a bit more which strategy I should use when generating XML instances I would be very grateful. It seems you suggest I should follow the AOM more closely instead of the XML schema of AOM and its instance representations.

By the way, your second example representation of ‘members’ is similar to Andrew’s example and not mine. I have one container element called ‘children’, but no xsi:type specified. Where do you get the element name from? I can’t find it in either the XML schema or the AOM.

Regards,

Mattias

2006/11/17, Heath Frankel <heath.frankel@frankelinformatics.com>:

I know that the C_MULTIPLE_ATTRIBUTE class has a property of 'members' in
the AOM (since I know the AOM very much in detail), but it's not in the XML
schema specification. I have not followed the AOM, because I thought I was
only supposed to look at the schema.

The AOM is at fault in this instance - the AOM has a field defined in
C_ATTRIBUTE called 'children', and then proceeds to rename this field
to 'attributes' and 'members' in the two subclasses C_SINGLE_ATTRIBUTE
and C_MULTIPLE_ATTRIBUTE. This of course is not really implementable
in any OO style language or XML.. the XML schema does the correct
thing and just defines 'children' in the base C_ATTRIBUTE class.

I have followed the XSD exactly in my serialization.. I believe the
intention is that the archetype XSD reflects the AOM model 1:1
(as much as possible). I see the archetype XSD as a formal
definition of the content of the AOM document.

Andrew

Mattias,

the usage of xsi:type is solely because object hierarchies are being
used in the AOM. Using xsi:type allows serializers to know the type
they are getting before having to parse it in.. however, even without
xsi:type, your serialization would still not be correct for the xsd
given (i.e. let us pretend there is only a C_ATTRIBUTE, with no
subclasses). Any reference to an element of type C_ATTRIBUTE
in xml should result in an xml entry named by the 'element with
type C_ATTRIBUTE' i.e. 'children'. You never put type names into
the actual xml instances, merely element names. And for sequences,
this means repeating the xml entry name 'children' (augmented in
this case by an xsi:type to help with the subclasses).
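
To show what this buys a reader of the XML, here is a minimal Python sketch that walks the repeated children elements and reads each xsi:type before deciding how to parse the node. The sample fragment is contrived for illustration and is not taken from any published archetype.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Contrived fragment: the element name is always 'children', the subtype is in xsi:type.
SAMPLE = """<attributes xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xmlns:at="openEHR/v1/Archetype"
            xsi:type="at:C_MULTIPLE_ATTRIBUTE">
  <rm_attribute_name>items</rm_attribute_name>
  <children xsi:type="at:C_COMPLEX_OBJECT"><rm_type_name>ELEMENT</rm_type_name></children>
  <children xsi:type="at:C_QUANTITY"/>
</attributes>"""


def child_types(attribute):
    """Return the declared subtype of each repeated 'children' element."""
    return [child.get(XSI_TYPE) for child in attribute.findall("children")]


attribute = ET.fromstring(SAMPLE)
print(attribute.get(XSI_TYPE))
print(child_types(attribute))
```

Note that ElementTree returns the xsi:type value as the raw string, prefix included; resolving the at: prefix to its namespace is left to the caller.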

Andrew

Mattias,
Sorry, I didn’t realise this schema was available (I overlooked your reference to it in your original email). OK, so based on this schema the instance is similar to my second example (but using children as the element name rather than members) and your first example, which neither of us like due to the plural nature of the element name for a singular element. I think we need to pass this feedback on to Sam and see if we can ensure that the schema fully reflects the Reference Model including element names that reflect model attribute names such as members and existence.

The usual way a list is represented is a container with multiple items; this is how I came up with this representation of a members element with item child elements. You are right in stating that this is not in the XML schema or AOM, I was looking at this from first principles.

Looking deeper into how the openEHR RM XML schemata represent containment, I find that they use the pattern suggested in the Archetype XML schema. For example, SECTION has an element called items that is repeatable. I guess we need to continue with that pattern unless we change the openEHR RM XML schemata as well. The problem with changing this is that the openEHR paths are designed to be compatible with XPath, and converting a path such as /content[openEHR-EHR-SECTION-findings.v1]/items[openEHR-EHR-OBSERVATION-laboratory.v1] into XPath and evaluating it will expect to find an XML element called items within an element called content.
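
A rough Python sketch of that path conversion follows. It assumes the bracketed identifier maps to an archetype_node_id XML attribute, and both the sample document and the second items identifier are hypothetical. ElementTree only supports a subset of XPath, so this is an illustration rather than a full converter.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: 'items' is repeated directly inside 'content'.
DOC = """<composition>
  <content archetype_node_id="openEHR-EHR-SECTION-findings.v1">
    <items archetype_node_id="openEHR-EHR-OBSERVATION-laboratory.v1"/>
    <items archetype_node_id="openEHR-EHR-OBSERVATION-other.v1"/>
  </content>
</composition>"""


def to_xpath(openehr_path):
    """Turn '/content[a]/items[b]' into an ElementTree-compatible relative path."""
    steps = []
    for step in openehr_path.strip("/").split("/"):
        name, _, predicate = step.partition("[")
        if predicate:
            node_id = predicate.rstrip("]")
            steps.append(f"{name}[@archetype_node_id='{node_id}']")
        else:
            steps.append(name)
    return "/".join(steps)


path = "/content[openEHR-EHR-SECTION-findings.v1]/items[openEHR-EHR-OBSERVATION-laboratory.v1]"
root = ET.fromstring(DOC)
matches = root.findall(to_xpath(path))
print(to_xpath(path))
print(len(matches))
```

If the children were wrapped in an extra container element, every such path would need an additional step, which is the compatibility problem described above.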

Therefore I suggest that based on the current XML schema your instance should look like your first example:

...



However, I would advocate that we should submit a change request to change the schema to use the element name of members rather than children. There are probably other AOM alignments required.

Additionally I would like to see the use of an existence element of type INTEGER_INTERVAL (i.e. INTERVAL) rather than minOccurs & maxOccurs. Thoughts?

Heath

The AOM is at fault in this instance - the AOM has a field
defined in C_ATTRIBUTE called 'children', and then proceeds
to rename this field to 'attributes' and 'members' in the two
subclasses C_SINGLE_ATTRIBUTE and C_MULTIPLE_ATTRIBUTE. This
of course is not really implementable in any OO style
language or XML.. the XML schema does the correct thing and
just defines 'children' in the base C_ATTRIBUTE class.

I have followed the XSD exactly in my serialization.. I
believe the intention is that the archetype XSD reflects the
AOM model 1:1 (as much as possible). I see the archetype XSD
as a formal definition of the content of the AOM document.

Oh, so that's why I got confused why members was implemented as a method
rather than an attribute, I didn't make the correlation between members and
children (perhaps I should have read the words rather than just the picture
:>).

In that case, the XML schema does not require a change request for this
issue. I would still like to explore the use of an existence element rather
than minOccurs and maxOccurs attributes. I don't see why existence and
occurrences in C_OBJECT are treated differently. And then I think the
interval_of_integer type should use elements lower and upper as per the
Interval assumed type specified in the openEHR Support package.

Heath

2006/11/17, Andrew Patterson <andrewpatto@gmail.com>:

I know that the C_MULTIPLE_ATTRIBUTE class has a property of ‘members’ in
the AOM (since I know the AOM very much in detail), but it’s not in the XML
schema specification. I have not followed the AOM, because I thought I was
only supposed to look at the schema.

The AOM is at fault in this instance - the AOM has a field defined in
C_ATTRIBUTE called ‘children’, and then proceeds to rename this field
to ‘attributes’ and ‘members’ in the two subclasses C_SINGLE_ATTRIBUTE
and C_MULTIPLE_ATTRIBUTE. This of course is not really implementable
in any OO style language or XML.. the XML schema does the correct
thing and just defines ‘children’ in the base C_ATTRIBUTE class.

No, it proceeds to rename this field to ‘alternatives’ (not attributes) and ‘members’.

I agree, that it’s not OO style, but why isn’t it implementable in XML? XML isn’t OO, it’s just a way of storing structured information, and the guys building the XML parsers to create the AOM objects again can probably deal with that. But since the AOM is OO I guess it wouldn’t be a bad idea if the contents of the XML instances were in OO style.

Mattias

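Well, I’m no expert on XSD since I never cared about learning it… but if I go back to your example, why didn’t you use xsi:type in some places, for example:

<description>
    <original_author>
        <item> ...

If you used it here it would be:

<description xsi:type="RESOURCE_DESCRIPTION">
    <original_author xsi:type="hashTableStringString">
        <item xsi:type="dictionaryItem"> ...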

Regards,

Mattias

2006/11/17, Andrew Patterson <andrewpatto@gmail.com>:

go back to your example, why didn't you use xsi:type in some places, for
example:

<description>
    <original_author>
        <item> ...

If you used it here it would be:

<description xsi:type="RESOURCE_DESCRIPTION">
    <original_author xsi:type="hashTableStringString">
        <item xsi:type="dictionaryItem"> ...

because there are no doubts about the actual type of
'description' (i.e. it has no subclasses), when the
serializer gets to the 'description' node it knows
what to expect. If we had a RESOURCE_DESCRIPTION_EXTENDED
complexType that was based on RESOURCE_DESCRIPTION,
then we would need to use xsi:type.
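
The rule can be sketched in a few lines of Python. The subtype table below is hypothetical and only lists types mentioned in this thread; a real serializer would derive it from the schema.

```python
# Hypothetical table: declared type -> concrete subtypes that may appear in its place.
SUBTYPES = {
    "C_ATTRIBUTE": {"C_SINGLE_ATTRIBUTE", "C_MULTIPLE_ATTRIBUTE"},
    "RESOURCE_DESCRIPTION": set(),  # no subclasses, so the type is never in doubt
}


def needs_xsi_type(declared_type):
    """xsi:type is only needed when the declared type has subtypes."""
    return bool(SUBTYPES.get(declared_type))


print(needs_xsi_type("C_ATTRIBUTE"))
print(needs_xsi_type("RESOURCE_DESCRIPTION"))

# If an extension were ever added, 'description' would start needing xsi:type.
SUBTYPES["RESOURCE_DESCRIPTION"].add("RESOURCE_DESCRIPTION_EXTENDED")
print(needs_xsi_type("RESOURCE_DESCRIPTION"))
```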

Andrew

Heath, I must say that I have come to agree with you (and Andrew concerning the repeating of ‘children’ for multiple children) - but let’s hear what Sam has to say about the changes you propose.

Mattias

2006/11/17, Heath Frankel <heath.frankel@frankelinformatics.com>:

Mattias Forss wrote:

Well, I'm no expert on XSD since I never cared about learning it...
but if I go back to your example, why didn't you use xsi:type in some
places, for example:

I don't know if you ever saw this site, I did, it was helpful

Bert

I agree, that it's not OO style, but why isn't it implementable in XML? XML
isn't OO, it's just a way of storing structured information, and the guys
building the XML parsers to create the AOM objects again can probably deal
with that.

The use of complexType with extensions in XSD follows the OO
model. So if it has a field called 'children' in C_ATTRIBUTE, that
field is going to be in all extensions - called 'children'.. if those
sub classes also define a similar field, then they will have two
fields. I just presumed that the AOM had a textual mistake
(whilst the 'alternatives' and 'members' are more correct descriptions
of the attribute, they technically are still 'children' so I don't see
a problem with them having that inherited field and using it to
store alternatives and members respectively).

Andrew

Bert Verhees wrote:

Mattias Forss wrote:

Well, I'm no expert on XSD since I never cared about learning it...
but if I go back to your example, why didn't you use xsi:type in some
places, for example:

I don't know if you ever saw this site, I did, it was helpful

http://www.w3schools.com/schema/default.asp

(forgot :wink:

2006/11/17, Heath Frankel <heath.frankel@frankelinformatics.com>:

The AOM is at fault in this instance - the AOM has a field
defined in C_ATTRIBUTE called ‘children’, and then proceeds
to rename this field to ‘attributes’ and ‘members’ in the two
subclasses C_SINGLE_ATTRIBUTE and C_MULTIPLE_ATTRIBUTE. This
of course is not really implementable in any OO style
language or XML.. the XML schema does the correct thing and
just defines ‘children’ in the base C_ATTRIBUTE class.

I have followed the XSD exactly in my serialization.. I
believe the intention is that the archetype XSD reflects the
AOM model 1:1 (as much as possible). I see the archetype XSD
as a formal definition of the content of the AOM document.

Oh, so that’s why I got confused why members was implemented as a method
rather than an attribute, I didn’t make the correlation between members and
children (perhaps I should have read the words rather than just the picture
:>).

Oops, I also missed that it was a function :slight_smile:

Maybe the specs could distinguish attributes, functions, etc. with different coloring.

In that case, the XML schema does not require a change request for this
issue. I would still like to explore the use of an existence element rather
than minOccurs and maxOccurs attributes. I don’t see why existence and
occurrences in C_OBJECT are treated differently. And then I think the
interval_of_integer type should use elements lower and upper as per the
Interval assumed type specified in the openEHR Support package.

Agree

Mattias

All right then, so now I know how I should proceed. Thank you very much for your help. It was a lot faster to understand the schema with your input than by starting to read some XSD documentation.

Regards,

Mattias

2006/11/17, Andrew Patterson <andrewpatto@gmail.com>:

Actually, the AOM represents alternatives and members as methods not as
attributes. These methods obviously return the children attribute.
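
In code this reading of the AOM looks roughly like the Python sketch below. The class names are Pythonized versions of the AOM names and the constructor is an assumption; the point is only that a single stored children field backs both methods, so the XML needs just one element name.

```python
class CAttribute:
    """Base class: the only stored field for child constraints is 'children'."""

    def __init__(self, rm_attribute_name, children=None):
        self.rm_attribute_name = rm_attribute_name
        self.children = list(children or [])


class CSingleAttribute(CAttribute):
    def alternatives(self):
        # A view over the inherited field, not a second field.
        return self.children


class CMultipleAttribute(CAttribute):
    def members(self):
        # Same inherited field, named for its role in a container attribute.
        return self.children


items = CMultipleAttribute("items", ["at0004", "at0005"])
value = CSingleAttribute("value", ["C_QUANTITY"])
print(items.members())
print(value.alternatives())
```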

Heath