Updated ADL, AOM 1.5 and new ODIN specifications

I have updated the ADL and AOM 1.5 specifications to reflect recent proposals for artefact identification. The main changes are that in the AOM, the archetype id as we know it today is constructed from pieces of meta-data, of which the version identifier is one.

A more interesting change for most people may be that I have now removed the ‘dADL’ part of the ADL specification and given it a new name and its own specification. For those who don’t know or remember, dADL is a pure, generic object serialisation syntax - yes - another thing like JSON, etc. Its new name is Object Data Instance Notation (ODIN) and the new spec can be found here. You can see this specification is in a new ‘syntaxes’ group at the bottom of the main table in the main specification baseline here.

I have set up an ODIN project at the openEHR Github, here, with the idea that we could collect the parsers and serialisers from various languages in this project, or else point to them from here.

Some may ask why we have ODIN (dADL), given that there is XML, JSON, YAML and other syntaxes. There are reasons: when dADL was first invented (about 2002), there was nothing except XML to use, and XML is not a particularly clean object serialisation syntax, nor realistically human readable. dADL was designed to be properly object oriented, human readable and writable, to have rich leaf data types, to support XPath pathing, and to enable much smaller texts than XML.

Amazingly, dADL / ODIN still has stronger leaf data types, as well as dynamic typing (a key feature lacking in JSON) and object identifiers.

For anyone interested in putting together ODIN parsers/serialisers for the various languages, please make yourself known, and let’s discuss how to do it. A survey of such syntaxes indicates that there is growing interest in non-XML / post-XML data syntaxes (e.g. recent Dr Dobbs article), and I think ODIN could have its place in the wider world.

  • thomas beale

Hi Thomas Beale,

Could you show me some examples of ODIN?
I would like to work on it and create machine-readable features with Cucumber.
Cucumber (http://cukes.info/) provides a behaviour-driven development
environment for many languages.

Regards,
Shinji

Hi Thomas

How are you?
Regarding your email:
“Updated ADL, AOM 1.5 and new ODIN specifications :For anyone interested in putting together
ODIN parsers/serialisers for the various languages, please make yourself known, and let’s discuss
how to do it. A survey of such syntaxes indicates that there is growing interest in non-XML /
post-XML data syntaxes (e.g. recent Dr Dobbs article), and I think ODIN could have its place
in the wider world.”
I’m interested in translating to Persian.

Regards
Shahla
Test Engineer

Hi Thomas Beale,

My comments:
1) Page 33, A2
JSON is not "Java Simple Object Notation"; it is JavaScript Object
Notation (http://www.json.org/).
2) How should binary data be encoded?
To serialise binary data in a text format, an encoding scheme such as
Base64 is needed. Which encoding scheme will you adopt for ODIN?
3) FYI: These presentation slides compare seven serialisation
techniques:
XML, JSON, Java binary serialisation, Protocol Buffers, Apache Avro,
Protostuff, and Rugson.
I think one of ODIN's best features compared with these serialisation
techniques is that it has a strict, object-oriented type system.

Regards,
Shinji

oops, getting old, memory going.

I wonder why not Base 128, since ODIN already assumes UTF-8 strings. The real question is: how do we detect that we have some binary data? ODIN works on the idea that every leaf type is inferrable syntactically. In theory we could just do my_binary_data = where some characters will be from the non-printing characters in the 0-127 range. That wouldn’t be a problem in itself, but it could be hard to distinguish from Integers, since some binary encoded data might come out as

my_binary_data = <952>

So I think some other marker is needed in ODIN. Maybe something simple like

my_binary_data = <#952#>

I guess you meant to include a link? I found it at Rugson.org. I have to admit, I don’t know how any data format without dynamic type markers can be used with real data…

- thomas
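The ambiguity described above can be illustrated with a minimal sketch. It assumes a heavily simplified leaf grammar (only integers and the tentative `<#...#>` marker), which is not the real ODIN grammar; the function name `infer_leaf_type` and both patterns are hypothetical and exist only for this illustration.

```python
import re

# Simplified, hypothetical leaf patterns; the real ODIN grammar is richer.
INTEGER_LEAF = re.compile(r"^<\s*([+-]?\d+)\s*>$")
# Tentative binary marker from the discussion: <# ... #>
BINARY_LEAF = re.compile(r"^<#(.*)#>$", re.DOTALL)


def infer_leaf_type(leaf: str):
    """Return (type_name, payload) for a leaf, inferred from syntax alone."""
    m = BINARY_LEAF.match(leaf)
    if m:
        # The payload is kept as text; decoding it is a separate concern.
        return ("Binary", m.group(1))
    m = INTEGER_LEAF.match(leaf)
    if m:
        return ("Integer", int(m.group(1)))
    return ("Unknown", leaf)


# Without a marker, binary data that happens to look like digits
# is indistinguishable from an Integer leaf.
print(infer_leaf_type("<952>"))
# With the marker, the same characters are recognised as binary.
print(infer_leaf_type("<#952#>"))
```

The point of the sketch is only that a syntactic marker lets a parser decide the leaf type before looking at the payload.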

Hi Thomas Beale,

First, I am very sorry about the missing link:
http://www.rugson.org/pdf/PSRCPattaya2012.pdf
In this presentation, serialisation formats are compared by
features and metrics, such as size and (de)serialisation performance.

Second, to handle binary data in ODIN, I think using
the type system might be good, e.g.
my_binary_data = (Binary:Base128)<211234blurblur>

that's not a bad idea at all. I need to think about that....

- thomas
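The type-marker idea proposed above could be sketched roughly as follows. This is an illustration only: the `(Binary:...)` syntax is a proposal from this thread, not part of any ODIN specification; Base64 is swapped in for Base128 because the Python standard library provides it; and the names `to_typed_leaf` and `from_typed_leaf` are hypothetical.

```python
import base64
import re

# Hypothetical pattern for: name = (Binary:Encoding)<payload>
TYPED_BINARY = re.compile(r"^(\w+)\s*=\s*\(Binary:(\w+)\)<([^<>]*)>$")


def to_typed_leaf(name: str, data: bytes) -> str:
    """Serialise bytes as a typed leaf, using Base64 for the payload."""
    payload = base64.b64encode(data).decode("ascii")
    return f"{name} = (Binary:Base64)<{payload}>"


def from_typed_leaf(text: str):
    """Parse a typed binary leaf back into (name, bytes)."""
    m = TYPED_BINARY.match(text.strip())
    if m is None:
        raise ValueError("not a typed binary leaf")
    name, encoding, payload = m.groups()
    if encoding != "Base64":
        # Only Base64 is handled in this sketch.
        raise ValueError(f"unsupported encoding: {encoding}")
    return name, base64.b64decode(payload)


line = to_typed_leaf("my_binary_data", b"\x03\xb8")
print(line)
print(from_typed_leaf(line))
```

Because the Base64 alphabet contains neither `<` nor `>`, the payload cannot collide with the leaf delimiters, and the explicit type marker removes the Integer ambiguity.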

By the way, we should use the momentum to also revamp the available
metadata. A few ideas:

- Move 'copyright' from language-specific information to general
metadata (it is not really being translated at the moment).
- Move 'references' from other_details to general metadata (it's
important enough IMHO).
- Information about date of validation, validity time and who validated it.
- RM version this archetype was based on.
- etc.

Personally I would agree with all of the above. I have already added the rm_release to the ARCHETYPE class now in the AOM (not yet pushed up), but for the others, I suggest we try to create a wider discussion to do this exercise with a small amount of discipline, but still be in a crowd-sourcing mode (is that possible? ;-)

To that end, I added a child page dedicated to meta-data. I added some tables where we can potentially review the current model and propose changes. If people think this isn’t sufficiently detailed, feel free to rework it in another way.

- thomas

I have added a “license” attribute. An archetype can need both a copyright and the applicable license.

David

In fact, 'license' could be translated, but translating 'copyright'
makes less sense

Clearly we are not in the business of creating translations of things like the CC licenses ourselves, which is the license of archetypes (at least openEHR ones). We would need to rely on the ones created by the creativecommons.org community, which has material on translating licences. It’s not obvious to me on a brief look, but I would expect any given canonical license URL to have equivalents in other languages, e.g. for Spanish.

I also suspect that for a CC (and other) license in English, and with ‘international’ as the jurisdiction, English is actually the official language of the license for all users, on the assumption that any court that might process a case based on one of these licenses would be an international court with English as its working language (like the Hague ICC does). The only use of translations - I think - is to enable users whose mother tongue is not English to understand the license more easily.

So we either treat the license field as a non-translated field and just include the canonical (EN) URL, and assume the user will go and find the translation if they need one - I think this will be easier. Or, if we treat it as a translatable field, then we probably have to figure out a correct URL for each translation, which might just be the ‘en’ one for languages in which the CC license is not yet available. This seems an annoyance with no real gain.

- thomas