(sorry for the resend - not sure my first one went
through)
Hello all, long time lurker - first time poster..
I have been playing around with the Java reference
implementation and was wondering if anyone could
clarify some of my thinking. I'll just write down some
statements that I believe to be true; if any of them
are wrong, setting me straight will sort
out my misconceptions (there is a question
at the end).
- The XML ITS defines the on-the-wire format for an
XML instance of archetyped data (as
opposed to an XML format for the archetype itself).
- The XSD schemas that are part of this ITS are
constructed by hand and not automatically
generated
- The XSD schemas that are part of the ITS are
just for the reference model classes and one would
need to use these to build larger schemas for any
real archetypes
- There is no existing tool for automatically
generating an XSD schema from an ADL file
- If one was to build one of these, a sensible place
to do this would be from the Java AOM classes
(see ADLOutputter.java)
- Any XSD schema can't possibly capture all the
constraints that are able to be expressed in an
ADL archetype
- An ADL archetype could be converted into
an XSD schema to enforce basic structure, and
some other set of constraint statements in another
language
So the question is, is anyone already
working on a tool to do conversion from ADL/AOM to
XSD (or in fact any XML schema definition language,
Schematron etc.) and if not, would it be useful?
(Or have I completely missed the point somewhere?)
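To make the idea of "XSD for basic structure" concrete, here is a rough Java sketch of what such a converter might emit. It does not use the real AOM classes: ConstraintNode and XsdSketch are hypothetical stand-ins, and only occurrences and a basic leaf type are carried over. Everything else an archetype can express would still need the "other set of constraint statements" mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, much simplified stand-in for an archetype constraint node (not the real AOM API). */
final class ConstraintNode {
    final String name;        // element name to emit; assumed to be a valid XML name
    final int minOccurs;      // lower bound of the occurrences constraint
    final Integer maxOccurs;  // upper bound, or null for "unbounded"
    final String xsdType;     // XSD built-in type for leaf nodes; assumed non-null for leaves
    final List<ConstraintNode> children = new ArrayList<ConstraintNode>();

    ConstraintNode(String name, int minOccurs, Integer maxOccurs, String xsdType) {
        this.name = name;
        this.minOccurs = minOccurs;
        this.maxOccurs = maxOccurs;
        this.xsdType = xsdType;
    }

    ConstraintNode add(ConstraintNode child) {
        children.add(child);
        return this;
    }
}

final class XsdSketch {
    /** Emits a schema whose single global element mirrors the node tree. */
    static String toXsd(ConstraintNode root) {
        StringBuilder sb = new StringBuilder();
        sb.append("<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">\n");
        appendElement(sb, root, "  ", true);
        sb.append("</xs:schema>\n");
        return sb.toString();
    }

    private static void appendElement(StringBuilder sb, ConstraintNode node,
                                      String indent, boolean global) {
        sb.append(indent).append("<xs:element name=\"").append(node.name).append('"');
        if (!global) {
            // Occurrence attributes are not allowed on global element declarations.
            String max = node.maxOccurs == null ? "unbounded" : node.maxOccurs.toString();
            sb.append(" minOccurs=\"").append(node.minOccurs).append('"');
            sb.append(" maxOccurs=\"").append(max).append('"');
        }
        if (node.children.isEmpty()) {
            // Leaf: only a built-in type survives the conversion.
            sb.append(" type=\"").append(node.xsdType).append("\"/>\n");
            return;
        }
        sb.append(">\n");
        sb.append(indent).append("  <xs:complexType>\n");
        sb.append(indent).append("    <xs:sequence>\n");
        for (ConstraintNode child : node.children) {
            appendElement(sb, child, indent + "      ", false);
        }
        sb.append(indent).append("    </xs:sequence>\n");
        sb.append(indent).append("  </xs:complexType>\n");
        sb.append(indent).append("</xs:element>\n");
    }
}
```

Calling XsdSketch.toXsd on a small tree (say a "blood_pressure" container with a mandatory "systolic" decimal and an optional repeating "comment" string, names made up for illustration) gives a nested xs:element / xs:complexType / xs:sequence structure. A real tool would presumably walk the parsed AOM objects instead.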
We have been thinking about it for a long time. Turning ADL into an XML instance is not too hard - that's just a serialisation of AOM objects, where the XSD corresponds to the AOM itself. To convert an ADL archetype to an XSD means losing semantics, and making some painful conversions, since XSD basic types are more limited, and constraints don't include much more than min/maxCardinality. The real question is: what purpose does this serve? The obvious one is that people want to use "standard tools" to compare data to archetypes. The problem is that the current XML schema standards are not nearly strong enough to do the job, so the checking that standard tools can do in this area is quite limited. The only real answer is that new tools are needed. That's why archetype processing kernels are being built. Once these exist, they can be downloaded to compare any data to any archetype, and they become part of the information processing ecosystem, adding a lot of power to existing formalisms.
However, one reason to make XSD expressions of archetypes (templates in fact, not archetypes) might be so they can act as message specifications to message generating systems, such as pathology systems. This would enable such systems to know (more or less) how to generate template-compliant messages such as path results, that could be received and processed by openEHR EHR systems.
Knowing what to do requires some examination of the perceived needs, so I think we should talk about that first before going too far with solutions.
The obvious one is that people want to use "standard
tools" to compare data to archetypes. The problem is that the current
XML schema standards are not nearly strong enough to do the job, so the
checking that standard tools can do in this area is quite limited. The
only real answer is that new tools are needed. That's why archetype
processing kernels are being built. Once these exist, they can be
downloaded to compare any data to any archetype, and they become part of
the information processing eco-system, adding a lot of power to existing
formalisms.
I worry that a full archetype processing kernel may be a bit
too heavyweight for some situations, i.e. message generation is the
obvious one, where an XSD schema will help out when a system is
generating data to submit to an EHR system. But even in terms of
processing openEHR XML, having an XSD schema enables all sorts of
tools to let systems play around with the data in a lightweight
fashion (albeit without all the
strong consistency checks that can be specified in ADL - but in a
system that is merely processing data, not altering it, the ADL assertions
etc. can be checked at the boundary of the system - I don't then need
a full kernel continually checking my internal object model).
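As a rough illustration of the lightweight boundary check I have in mind, the standard javax.xml.validation API is enough to test structure against an XSD. The class and method names below (BoundaryCheck, firstSchemaError) are made up for the example, and the check only covers what the schema can express; the ADL assertions would still have to be checked by something else.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

final class BoundaryCheck {
    /**
     * Returns null if the XML instance is valid against the XSD,
     * otherwise the message of the first error reported.
     * A broken schema is reported the same way as an invalid instance.
     */
    static String firstSchemaError(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, the validator throws on the first error.
            validator.validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        } catch (IOException e) {
            // Not expected for in-memory strings; surfaced unchanged if it happens.
            throw new UncheckedIOException(e);
        }
    }
}
```

A system would call this once when data enters or leaves, rather than keeping a kernel attached to its internal object model.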
On another note, if the generation of XSD is an open question, where
exactly does serialisation to and from XML (i.e. the XML ITS) fit into the
big picture? Is it something that is envisaged for the reference
implementations? Should the Java reference model classes be able to
serialise themselves back and forth to XML?
Andrew,
I don’t believe that the RM classes have to serialize themselves to XML.
In my opinion this would be better addressed by a different layer that transforms
RM Objects <=> XML (Object-to-XML Mapping).
Actually, I am developing such a layer (using spring-oxm and JAXB2), but there are blocking incompatibilities between the XML Schemas and the current Java reference implementation.
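To show the separation I mean, here is a minimal hand-written sketch. It is not the spring-oxm/JAXB2 layer mentioned above: it uses plain DOM, and SimpleQuantity, SimpleQuantityXmlMapper and the element names are hypothetical and do not follow the XML ITS. The point is only that the model class carries no XML knowledge and the mapper owns the XML shape.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical stand-in for a reference model class; it knows nothing about XML. */
final class SimpleQuantity {
    final double magnitude;
    final String units;

    SimpleQuantity(double magnitude, String units) {
        this.magnitude = magnitude;
        this.units = units;
    }
}

/** Separate mapping layer: object to XML and back, kept outside the model class. */
final class SimpleQuantityXmlMapper {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds a detached <quantity> element owned by the given document. */
    static Element toXml(SimpleQuantity q, Document doc) {
        Element quantity = doc.createElement("quantity");
        Element magnitude = doc.createElement("magnitude");
        magnitude.setTextContent(Double.toString(q.magnitude));
        Element units = doc.createElement("units");
        units.setTextContent(q.units);
        quantity.appendChild(magnitude);
        quantity.appendChild(units);
        return quantity;
    }

    /** Reads back an element produced by toXml; assumes both children are present. */
    static SimpleQuantity fromXml(Element quantity) {
        String magnitude = quantity.getElementsByTagName("magnitude").item(0).getTextContent();
        String units = quantity.getElementsByTagName("units").item(0).getTextContent();
        return new SimpleQuantity(Double.parseDouble(magnitude), units);
    }
}
```

With this split, a change in the XML Schemas only touches the mapper, which is where incompatibilities like the ones mentioned would have to be resolved.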
I worry that a full archetype processing kernel may be a bit
too heavyweight for some situations.
Ultimately I don't believe it will be. It will do less work than an XML instance/XSD processor. We just think that that's not much work because it is ubiquitous. When these kernels are fully engineered and widely available, I believe they will be just like XSD processors.
i.e. message generation is the
obvious one, where an XSD schema will help out when a system is
generating data to submit to an EHR system. But even in terms of
processing openEHR XML, having an XSD schema enables all sorts of
tools to let systems play around with the data in a lightweight
fashion (albeit without all the
strong consistency checks that can be specified in ADL - but in a
system that is merely processing data, not altering it, the ADL assertions
etc. can be checked at the boundary of the system - I don't then need
a full kernel continually checking my internal object model).
It remains to be seen what can be done with just XSD - certainly message generation should be possible. However, for validation purposes, archetypes integrate various terminology validity checks as well as being semantically stronger just in terms of constraints. I am not saying we shouldn't attempt something with XSD, but we need to carefully see what we lose and whether it is worth it (for message generation it probably is worth the trouble).
On another note, if the generation of XSD is an open question, where
exactly does serialisation to and from XML (i.e. the XML ITS) fit into the
big picture? Is it something that is envisaged for the reference
implementations? Should the Java reference model classes be able to
serialise themselves back and forth to XML?
They already can in the current implementation, although the libraries that do this have been found to be buggy. I can't remember the exact problems, but can find out.
I do not believe the others have read your message fully.
The XML ITS defines the on-the-wire format for an
XML instance of archetyped data (as
opposed to an XML format for the archetype itself).
Yes
The XSD schemas that are part of this ITS are
constructed by hand and not automatically
generated
Yes - it is optimised for XML tools
The XSD schemas that are part of the ITS are
just for the reference model classes and one would
need to use these to build larger schemas for any
real archetypes
No - any data is an instance of this schema - no matter what archetypes are used.
There is no existing tool for automatically
generating an XSD schema from an ADL file
This is not what needs to happen - you could have an XSD for archetypes (see Thomas’s response).
If one was to build one of these, a sensible place
to do this would be from the Java AOM classes
(see ADLOutputter.java)
Any XSD schema can’t possibly capture all the
constraints that are able to be expressed in an
ADL archetype
That is true
An ADL archetype could be converted into
an XSD schema to enforce basic structure, and
some other set of constraint statements in another
language
I realise from your second message that you are thinking about something we are thinking about - a specific XSD for a template or set of archetypes in a composition. This schema would allow simplification of implementation for a specific purpose - such as the CCR from ASTM. BUT, the schema would actually specify what is expressed in the archetypes and the reference model.
This has been of interest to a number of us for some time as it would make it easy for minor applications to build openEHR compliant data. You would need to go from the template to a new XSD. It would be good to try it by hand first to see what it might look like. I can provide you with some ‘prototype’ data (unpopulated XML) which is an instance of a set of archetypes - you will be able to see the archetype IDs and see if you can write a schema that produces the same data (and nothing else!).