Trying to understand the openEHR Information Model

ok but we just agreed that XSD doesn't do the kind of validation that is needed by archetypes, so I think what you are really proposing is XML based on Relax NG as a sufficiently powerful approach that would a) implement the required constraint semantics of archetypes and b) create data that can be queried by Xpath/Xquery out of the box?

Is that your suggestion?

If so, my reaction would be: let's investigate as a community. I don't think anyone has sufficiently investigated Relax NG or Schematron for openEHR purposes, and in hindsight we probably should have. It would be very interesting indeed to see how much better it would work than XSD.

I think if you make any progress in your work on these questions, let's turn them into specs / guidelines / whatever for general use in openEHR.

- thomas

Well, we have explored the use of Schematron and how it can be
automatically generated from archetypes (the idea was to rewrite CDA
implementation guides as CDA archetypes and generate schematron
automatically from them). I won't go into much detail, but we were
able to generate assert and report instructions easily. The only thing
unexplored was the use of phases. I would say we are in the 80 of the
80/20 rule :slight_smile:
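As a rough illustration of the generation step described above, here is a minimal sketch that emits Schematron assert and report elements from a list of constraints. The tuple layout, the function name `build_schematron` and the example contexts are assumptions made for illustration; they are not taken from the work described in this message, and phases are left out just as they were there.

```python
import xml.etree.ElementTree as ET

SCH_NS = "http://purl.oclc.org/dsdl/schematron"  # ISO Schematron namespace


def build_schematron(constraints):
    """Build a Schematron schema from (context, kind, test, message) tuples.

    The tuple layout is a hypothetical intermediate form, not an AOM structure.
    """
    ET.register_namespace("sch", SCH_NS)
    schema = ET.Element(f"{{{SCH_NS}}}schema")
    pattern = ET.SubElement(schema, f"{{{SCH_NS}}}pattern")
    rules = {}  # one <rule> element per distinct context
    for context, kind, test, message in constraints:
        if kind not in ("assert", "report"):
            raise ValueError(f"unknown check kind: {kind}")
        if context not in rules:
            rules[context] = ET.SubElement(
                pattern, f"{{{SCH_NS}}}rule", {"context": context}
            )
        check = ET.SubElement(rules[context], f"{{{SCH_NS}}}{kind}", {"test": test})
        check.text = message
    return schema


# Hypothetical constraints: an assert fires when its test is false,
# a report fires when its test is true.
constraints = [
    ("observation", "assert", "count(code) = 1",
     "An observation must have exactly one code."),
    ("observation", "report", "not(value)",
     "Observation without a value."),
]
xml_text = ET.tostring(build_schematron(constraints), encoding="unicode")
print(xml_text)
```

Grouping checks by context keeps one rule per context node, which matters in Schematron because only the first matching rule in a pattern fires for a given node.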

hi Tom

you kind of need to set the ground rules for this. It’s not really practical to use schematron to do detailed terminology validation.
Most serious attempts end up creating some kind of web service terminology server that can be invoked from the schematron rules.

Once you’ve done that, you’ve got a relief valve for things that are too difficult to express in xpath - because xpath is really good for some things, and really not so good for others.

But also, once you’ve done that, you’re not really using schematron anymore…
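The split described above, structural checks in the rule language and terminology checks delegated to a service, might be sketched as follows. The function names, the value-set identifier and the in-memory lookup standing in for a terminology server are all hypothetical.

```python
import xml.etree.ElementTree as ET


def validate_codes(root, path, in_value_set, value_set):
    """Check every node found at `path` against a terminology lookup.

    `in_value_set(value_set, code)` is any callable; in a real system it
    would call out to a terminology service instead of a local table.
    """
    errors = []
    for node in root.findall(path):
        code = node.get("code")
        if code is None:
            errors.append("missing code attribute")
        elif not in_value_set(value_set, code):
            errors.append(f"code {code!r} is not in value set {value_set!r}")
    return errors


# In-memory stand-in for the terminology server (illustrative data only).
STUB_VALUE_SETS = {"example-diagnoses": {"A01", "B02"}}


def stub_lookup(value_set, code):
    return code in STUB_VALUE_SETS.get(value_set, set())


doc = ET.fromstring(
    '<record><diagnosis code="A01"/><diagnosis code="Z99"/><diagnosis/></record>'
)
errors = validate_codes(doc, "./diagnosis", stub_lookup, "example-diagnoses")
print(errors)
```

The path expression handles what XPath is good at (finding the nodes), while membership in a value set goes through the callable, which is the "relief valve" mentioned above.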

Grahame

Another very important restriction of XML Schema, in my opinion, is that you cannot have two or more elements with the same name but different data types; the types must be identical in every detail. XML Schema regards an Element with a Dv_Text as a different data type from an Element with a Dv_CodedText.

Both elements will be called “items” in an XML schema representing an OpenEhr data structure, and so they are not allowed to differ in the details of their data types. This brought Tim Cook to using GUIDs in the element names, which is unworkable in my opinion, and above all probably unnecessary, because in RelaxNG this restriction does not exist.
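The restriction being described corresponds to what the XML Schema specification calls the "Element Declarations Consistent" constraint: within one content model, element declarations sharing a name must share a type. The sketch below is a toy check of that rule over a flattened content model; the `(name, type)` list and the type names are simplifications invented for illustration, not real openEHR schema types.

```python
from collections import defaultdict


def edc_conflicts(particles):
    """Return element names declared with more than one type.

    `particles` is a list of (element_name, type_name) pairs standing in
    for the element declarations of a single XSD content model.
    """
    types_by_name = defaultdict(set)
    for name, type_name in particles:
        types_by_name[name].add(type_name)
    return {
        name: sorted(types)
        for name, types in types_by_name.items()
        if len(types) > 1
    }


# Two "items" children with different types, as in the message above.
cluster_content = [
    ("items", "ELEMENT_DV_TEXT"),
    ("items", "ELEMENT_DV_CODED_TEXT"),
    ("name", "DV_TEXT"),
]
conflicts = edc_conflicts(cluster_content)
print(conflicts)
```

Relax NG has no equivalent rule, so a choice between two `items` patterns with different content is legal there; in XSD the usual alternatives are a common base type plus `xsi:type`, or distinct element names.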

There are many reasons and benefits to using Type4 UUIDs. I cannot imagine that RelaxNG has any magic to allow global elements to be the same name and have different types or two elements at the same level have different types. Certainly no programming language allows that. There are other approaches that can be used and not use UUIDs. You can nest all of your complexTypes and create VERY wide artifacts. You can use intricate namespacing if you want.

Other tricks are also possible, for example augmenting element-names during validation-time, but also that is cumbersome code, and that just for avoiding the problems of an ivory tower stupid W3C standard?

Interesting here that you call it a stupid standard and then in a later email you praise it for its industry acceptance.

But, the things you continue to call tricks are not tricks, they are features of the standard that are implemented because one or more people presented sufficient use cases. Just because Priscilla Walmsley doesn’t bless their use doesn’t mean that they are any less valid.

So this is indeed an important restriction, which makes the clean use of XML Schema impossible in OpenEhr-rm, or any other ADL based multi level modeling system. Dirty use, tricks, ignoring validation errors, etc of course remain possible.

Yes, be specific. You probably can’t model an ADL based RM in Haskell or Erlang either. But that doesn’t mean they are not useful in multi-level modeling with a functional design. If your goal is to stay with ADL, then you have to live with those requirements. I chose not to in MLHIM for all the other benefits that come with adopting XML technologies.

There are more restrictions, but they are less important. For example, it is not possible to support the Dv_Time constraint pattern hh:??:??, and the same goes for Dv_DateTime. Dv_Date has the same problem, but it can be worked around with the “alternative” rule, although not in the way that rule is meant to be used.

There are very clean and efficient ways to allow for partial dates in XML Schema.

Anyway, after a few weeks I will probably define the OpenEhr RM and all possible constraints in RelaxNG.

I’m not sure why you think RelaxNG will work any better than XML Schema 1.1.
This is a three-part intro that was published in Jan. 2009: http://www.ibm.com/developerworks/xml/library/x-xml11pt1/index.html

–Tim

Hi Tim,

There are many reasons and benefits to using Type4 UUIDs. I cannot imagine that RelaxNG has any magic to allow global elements to be the same name and have different types or two elements at the same level

Sometimes study helps to expand imagination. You should go beyond your imagination, it has nothing to do with magic, but we had that discussion before.

two elements at the same level have different types. Certainly no programming language allows that.

First, an XML schema language is not a programming language but a schema language, kind of like ADL. Second, have you ever heard of polymorphism in programming languages?
http://en.wikipedia.org/wiki/Polymorphism_in_object-oriented_programming

There are other approaches that can be used and not use UUIDs. You can nest all of your complexTypes and create VERY wide artifacts. You can use intricate namespacing if you want.

Of course, I found some ways too, but I don’t like them. That is my choice. Why do you oppose it so strongly? What is the reason?

Other tricks are also possible, for example augmenting element-names during validation-time, but also that is cumbersome code, and that just for avoiding the problems of an ivory tower stupid W3C standard?

Interesting here that you call it a stupid standard and then in a later email you praise it for its industry acceptance.

No, you misunderstood my message, I “praise” XML for being the industry-standard, XML-schema is only a way of “doing” XML.

But, the things you continue to call tricks are not tricks, they are features of the standard that are implemented because one or more people presented sufficient use cases. Just because Priscilla Walmsley doesn’t bless their use doesn’t mean that they are any less valid.

So this is indeed an important restriction, which makes the clean use of XML Schema impossible in OpenEhr-rm, or any other ADL based multi level modeling system. Dirty use, tricks, ignoring validation errors, etc of course remain possible.

Yes, be specific. You probably can’t model an ADL based RM in Haskell or Erlang either. But that doesn’t mean they are not useful in multi-level modeling with a functional design. If your goal is to stay with ADL, then you have to live with those requirements. I chose not to in MLHIM for all the other benefits that come with adopting XML technologies.

Your choice, I make another choice. No problem.

There are more restrictions, but they are less important. For example, it is not possible to support the Dv_Time constraint pattern hh:??:??, and the same goes for Dv_DateTime. Dv_Date has the same problem, but it can be worked around with the “alternative” rule, although not in the way that rule is meant to be used.

Sorry Tim, I am quite sure about this. You can handle them as strings, and then they can have any format, but then you will have a problem with queries, because the indexes will treat them as strings too.
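To make the string-based workaround concrete, here is a minimal sketch that turns an ADL-style time pattern such as hh:??:?? into a regular expression. It assumes the usual reading of ADL 1.4 patterns ("??" optional, "XX" not allowed, "mm"/"ss" mandatory); the function name is hypothetical, and time zones and fractional seconds are ignored.

```python
import re

TWO_DIGITS = r":[0-5]\d"  # ":mm" or ":ss"


def time_pattern_to_regex(pattern):
    """Compile an ADL-style time pattern (e.g. 'hh:??:??') into a regex."""
    hours, minutes, seconds = pattern.split(":")
    if hours != "hh":
        raise ValueError("hours are always mandatory")
    # Seconds part: mandatory, optional, or not allowed.
    sec_re = {"ss": TWO_DIGITS, "??": f"(?:{TWO_DIGITS})?", "XX": ""}[seconds]
    # Minutes part wraps the seconds, since seconds need minutes first.
    if minutes == "mm":
        rest = TWO_DIGITS + sec_re
    elif minutes == "??":
        rest = f"(?:{TWO_DIGITS}{sec_re})?"
    elif minutes == "XX":
        rest = ""
    else:
        raise ValueError(f"unknown minutes marker: {minutes}")
    return re.compile(r"(?:[01]\d|2[0-3])" + rest)


partial = time_pattern_to_regex("hh:??:??")
full = time_pattern_to_regex("hh:mm:ss")
for value in ("14", "14:30", "14:30:15", "14:3"):
    print(value, bool(partial.fullmatch(value)), bool(full.fullmatch(value)))
```

This validates the lexical form only. The value is still a string afterwards, which is exactly the indexing and querying concern raised above.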

There are very clean and efficient ways to allow for partial dates in XML Schema.

Anyway, after a few weeks I will probably define the OpenEhr RM and all possible constraints in RelaxNG.

I’m not sure why you think RelaxNG will work any better than XML Schema 1.1

I said already, study it, don’t follow your intuition. I already gave you a simple example of how RelaxNG could have prevented your GUID element names, a few weeks ago.
Anyway, I will come back to this. In the meantime, take a look at the mail archive; you should find the example there.

Ah, I found it

I am not the slightest bit angry. I was just correcting your incorrect assertions about why I made design choices.
If you want to build your openEHR processor in COBOL, I couldn’t care less.
But when you mention my name or my project specifically and make incorrect assertions I will correct them.

Cheers,
Tim

> have ADL, AOM, and object transforms

What is missing is that xml offers validation and query out of the box, which means it has been developed and optimized for years by many companies and communities, and mostly is good quality software.

ok but we just agreed that XSD doesn't do the kind of validation that is needed by archetypes, so I think what you are really proposing is XML based on Relax NG as a sufficiently powerful approach that would a) implement the required constraint semantics of archetypes and b) create data that can be queried by Xpath/Xquery out of the box?

Is that your suggestion?

Exactly, that is my idea. I already did that for XML Schema, walking through AOM objects and defining the structure constraints and leaf-node constraints in XML Schema, but I don't like it very much because of the tricks I need to use. My hope is that Relax NG does not require these kinds of tricks.
I have not finished studying it yet, but I found that some important restrictions of XML Schema (concerning restrictions on structure) do not apply to Relax NG. I also checked the simple data types for leaf nodes, where it seems possible to define all ADL constraints.

So, I can build the structure of an archetype, keeping the paths in original names, keeping the constraints on the structure, and I can apply all leaf-node constraints.

But at this moment I am very busy doing something else, and there is (quite) some time-pressure on that, working in the evening, that kind of thing.

If so, my reaction would be: let's investigate as a community. I don't think anyone has sufficiently investigated Relax NG or Schematron for openEHR purposes, and in hindsight we probably should have. It would be very interesting indeed to see how much better it would work than XSD.

I will write a paper on it, how the structures and leaf-nodes are to be defined in Relax NG and publish that.
Maybe there is still something left to ask or discuss, that is always good. It will take a few weeks before I have time for that.

And what do I have then

ADL -> Relax NG conversion -> automatic validation based on good quality of code
XML-datasets -> XQuery -> full featured query language.

Optional is AQL -> XQuery syntax transformation, some people find that important, I think it is possible.

In fact, the complete OpenEHR ecosystem ready to work with XML, and almost everything running on matured code. That is the goal.
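As a small illustration of the "query out of the box" half of this plan, the sketch below runs a path query over an XML dataset using only the Python standard library, which supports a limited XPath subset (a full XQuery engine would be needed for the real thing). The element layout is invented for illustration and is not an openEHR serialization; only the archetype identifier is a real one.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML dataset; element names are illustrative only.
DATA = """
<composition>
  <observation archetype_node_id="openEHR-EHR-OBSERVATION.blood_pressure.v1">
    <systolic units="mm[Hg]">142</systolic>
    <diastolic units="mm[Hg]">91</diastolic>
  </observation>
  <observation archetype_node_id="openEHR-EHR-OBSERVATION.blood_pressure.v1">
    <systolic units="mm[Hg]">118</systolic>
    <diastolic units="mm[Hg]">76</diastolic>
  </observation>
</composition>
"""

root = ET.fromstring(DATA)

# Path query with an attribute predicate, roughly what an AQL path
# might be translated into.
query = (
    "./observation"
    "[@archetype_node_id='openEHR-EHR-OBSERVATION.blood_pressure.v1']"
    "/systolic"
)
readings = [int(node.text) for node in root.findall(query)]

# The value comparison is done in Python because ElementTree's XPath
# subset has no numeric comparison operators.
high = [value for value in readings if value >= 140]
print(readings, high)
```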

Bert

That is acceptable, but I thought you were criticizing me, and that is not needed to correct incorrect assertions. That gave me the impression that there was some anger.

But I am glad this is not the case.

Regards
Bert

I think Bert is comparing Relax NG to XML Schema, he’s not just talking generic ‘XML’.

out of interest Tim, did you look at Relax NG or Schematron?

They are, but I have to admit, in a pretty annoying way. But better than not being catered for…

The lack of support for hh:??:?? is actually the fault of the ISO8601 standard, and I suspect it’s because the writers never actually implemented a parser, and had the simple realisation that a partial date or time (e.g. “1995”, “12”) is impossible to distinguish syntactically from an integer in a mixed data stream - some other help is always needed.

XML Schema solves it with the data types gMonthDay, gYear etc. Ugly, but not really their fault.

A slightly better designed 8601 standard would have saved a lot of problems, and the ultimate fault in my view lies at the door of ISO: a completely wrong model of doing standards.

- thomas

out of interest Tim, did you look at Relax NG or Schematron?

Yes, and 1.1 implements everything that RelaxNG and Schematron had to be created for in the first place. I wasn’t involved, but it looks like they took the lessons learned from the community quite well. So my position is: why introduce the extra complexity of multiple languages?

The lack of support for hh:??:?? is actually the fault of the ISO8601 standard, and I suspect it’s because the writers never actually implemented a parser, and had the simple realisation that a partial date or time (e.g. “1995”, “12”) is impossible to distinguish syntactically from an integer in a mixed data stream - some other help is always needed.

Yes, the non-implementable 8601 is an entirely different issue.

XML Schema solves it with the data types gMonthDay, gYear etc. Ugly, but not really their fault.

The ‘g’ prefix is non-intuitive. But as you say, they had to implement multiple datatypes to cover the use cases.
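The ambiguity discussed here can be shown with a short sketch that lists which lexical forms a bare value could belong to. The regular expressions are simplified versions of the XML Schema lexical spaces (time zones and other details are omitted), and the helper name is made up for illustration.

```python
import re

# Simplified lexical forms; real XSD types also allow time zones.
LEXICAL_FORMS = {
    "xs:integer": re.compile(r"[+-]?\d+"),
    "xs:gYear": re.compile(r"-?\d{4,}"),
    "xs:gYearMonth": re.compile(r"-?\d{4,}-(0[1-9]|1[0-2])"),
    "xs:gMonthDay": re.compile(r"--(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])"),
    "xs:date": re.compile(r"-?\d{4,}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])"),
}


def candidate_types(text):
    """Return every type whose (simplified) lexical form matches `text`."""
    return [name for name, rx in LEXICAL_FORMS.items() if rx.fullmatch(text)]


for value in ("1995", "12", "1995-07", "--07-04", "1995-07-04"):
    print(value, candidate_types(value))
```

A bare "1995" matches both the integer and the year form, so the syntax alone cannot decide; the schema has to say which type applies, which is the role the separate g* types play.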

A slightly better designed 8601 standard would have saved a lot of problems, and the ultimate fault in my view lies at the door of ISO: a completely wrong model of doing standards.

The ISO issue is simply one of an organization not adapting to the changing times. It was probably a good model in the early days. But that is common in organizations. People protect their turf in many ways.

–Tim

Hi Bert,

The “define” is in the second part of the example, which is in what is called the compact notation, although there is also another notation for Relax NG which is easier to understand for people used to XML-Schema.

The other notation is like … (partially) etc … etc …

I can recommend the book of Eric van der Vlist, it is also free for download or read online

Just as in the example? This is one of the advantages of RelaxNG.

You wonder why RelaxNG still exists, 10 years after the definition of XML-Schema 1.1? Here you have an answer.

Bert

I meant ‘and never had the simple realisation…’



Hi Bert,

Tim, Bert isn’t trying to replace the multi-level archetype modelling architecture - that already does what it does, he’s talking about physically representing archetypes and templates in Relax NG for use in his production environment. That’s just a question of what you can losslessly serialise to. - thomas

Yes, that is it. I had more difficulty explaining this; it must be a rather unconventional way of thinking on my part :wink:

I "translate" ADL into RelaxNG for validation purposes, and for some other purposes, like type-attribute assignment for index optimization, which is good for XQuery, and I believe there is some interest in the schemas from GUI builders.

The story is that the kernel is not only multi-modeling but also multi-reference-model, and both at the same time. So whether you store or query EN13606 or OpenEHR doesn't matter to my kernel, even in a single statement.

Just for fun I wrote a "winecellar-RM" (type, grape country, region, year, oak), and I can query that too. So besides the EHR records of my patients (coming from OpenEHR or EN13606 environments) I also have their wine cellars in view, and I can query all three in a single statement.
Give me all patients who have influenza, have taken aspirin, and have Australian Cabernet Sauvignon in their cellar.

Of course the winecellar-RM is just a joke to demonstrate the flexibility of the kernel, but it is true: it is no problem.

I wanted to connect Tim's RM (MLHIM) too, but I stopped because of his GUID thing.

But the outside eco-system will remain ADL and AQL, and the base data-types are also OpenEHR, because they are sufficient, for now. Maybe I write additional basic datatypes later.
So, everything must be definable in ADL 1.4, that is the only requirement.

And you know what I get out of the box, very advanced features: Validation, querying, and versioning of data.

Bert

XML Schema doesn't do it, not even in 1.1
But that is a detail.

Bert

It is an incorrect detail, and one that I am sure many, many people using XML serializations of ADL archetypes will disagree with.

I just downloaded the famous blood pressure archetype, XML serialization, from the openEHR CKM.

Auto generated a 1.0 schema using oXygen and then auto generated sample instance files. There are some namespace quirks that have to be adjusted, that is all that keeps them from being valid out of the box.

But this is not multi-level modelling. There is no reference model for the schema to validate against. What confused me before was your use of “RM”. What you really did was create a knowledge model about wine against the openEHR RM.

While I do not intend to be pedantic about it, there are significant distinctions here that need to be clarified and understood so that the importance of multi-level modelling is well understood. Maybe others can comment as well.

Cheers,
Tim

I don’t know if the term “serialization of ADL archetypes” is correct; it leads to misunderstanding. I am talking about serializing the AOM belonging to a specific archetype and expressing all of its constraints in an XML schema, so that the schema is fit for validating an XML dataset belonging to that archetype. So it is not just serialization of the AOM, it is transforming its constraints into XML validation. I don’t know any other way to make this clear.

And for that purpose, XML Schema 1.1 does not offer enough flexibility, because the AOM is conceptually incompatible with the possibilities of XML-Schema. That is why you invented the GUID-thing, remember, Tim?

See, Tim, I know better, you are trying to fool us. It is easy to prove, too. I don’t know why you do that, but to me it is obvious that you do it.

You had two problems with XML-Schema, and they are the same problems that appear when you transform AOM constraints into XML-Schema 1.1 constraints. One problem is about restriction/extension, the other is about having two Complex Types with the same name but different content. Your traces about this can be found everywhere on the mailing lists of XML.org, Saxony, Oxygen. For example: Do you remember?

Are we talking about something different than XML serialization, from the openEHR CKM? You do understand, because you solved the problems of XML Schema yourself with the GUIDs. See my message from a few days ago, I explained this, and I explained why you brought in the GUIDs in element-names. OK?

But you are, but that is not important, that is your hangup, you try to be damaging.

There is some similarity in data-processing, it doesn’t matter if it is wine or healthcare. It was wrong to bring the example up, I thought I was among friends, understanding, etc, but of course, this is an open list. Now I have to spend time explaining the obvious.

The difference is not in the kernel-data-processing. The data-processing is in the modeling, and since we are talking about two/multi-level modeling, my kernel keeps all options open. That is exactly the point of two level modeling. Did you miss it for a moment? You can model health-care in it, to the complexity of OpenEHR or EN13606, or anything else, even MLHIM. I remember Rong having written a car-archetype, I borrowed it from him.

ADL is not only about health-care, it is, for example, also about Demographics, about Relations, it can be used inside the OpenEHR RM, in the EN13606 RM, but the AOM understands every RM, even the wheels of a car can be modeled in it. It is a modeling language, and a very good one, better than XML Schema 1.1. It is about archetypes, and very useful in health-care. You know that Tim, just as you know what the problem is with XML Schema 1.1.

I really have the impression that you are angry with me. Maybe it is because I don’t think your solution for the XML-Schema 1.1 problems is workable, and I talk freely about that. Maybe that annoys you, maybe you feel I am right in this. Maybe you don’t feel confident.

Whatever, Tim, I think we should not discuss this any further, because it can easily lead to damaging persons, and for me, the whole thing isn’t worth that. Have a nice day, and a nice product. I think I will stop discussing with you, except when it cannot be avoided.

Again, have a nice day, time for a drink, maybe, in Rio? Enjoy it.

Bert

Bert,

if you want to distribute that, it would be a great example RM for the ADL workbench - do you have it in BMM format?
plus some wine cellar archetypes?

We could add that to the ADL Workbench distribution. In case you want to make people envious of your wine collection :wink:

- thomas