Term set for DV_PARSABLE.formalism

Reading the specs, I realized that there is no term set for DV_PARSABLE.formalism; it is a free-text attribute.

Since formats like XML, JSON, CSV, etc. are in fact modeled by DV_PARSABLE, and those have an associated MIME type (text/xml, application/json, text/csv, …), shouldn’t we define a term set for that attribute, like we have for DV_MULTIMEDIA.media_type? (Of course, media_type is a CODE_PHRASE, not a String like “formalism”.)

It was probably left this way to deal with formalisms that don’t have an official MIME type, like ADL.

If that’s the case, we lose the coding system / terminology of the mime types that are defined.

It would be better to make DV_PARSABLE.formalism of type CODE_PHRASE instead of String, and use “local” as the terminology_id for those formalisms that don’t have a MIME type.

Thoughts?

Well, actually we could do that and put all those other formalisms into the openEHR terminology. The original idea was to allow (and encourage) MIME types as strings, and then, outside of MIME, to identify any other format by its own short string, e.g. ‘mp5’ (imagine it exists) or ‘adl’. There are a lot of formats that are essentially text/plain but are actually parsable, e.g. glif3, most programming languages and so on. So ‘text/plain’ isn’t that useful a thing to know. I wonder if we have to give in and have two fields: one that is a MIME type field (the current one), and a second field that has a term defining the semantic format, mostly applicable when the MIME type field is text/plain, text/xml or another text type. - thomas
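To make the two-field idea concrete, here is a rough sketch only. The class name `ParsableFormat`, the field names `media_type` and `semantic_format`, and the helper `effective_format` are all hypothetical; nothing like this exists in the openEHR specifications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsableFormat:
    """Hypothetical two-field description of a parsable value's format."""

    media_type: str                         # MIME type, e.g. "text/plain"
    semantic_format: Optional[str] = None   # assumed second field, e.g. "glif3"

    def effective_format(self) -> str:
        # Prefer the semantic format when one is given; otherwise the
        # MIME type is the most specific thing we know.
        return self.semantic_format or self.media_type


glif = ParsableFormat("text/plain", "glif3")
json_doc = ParsableFormat("application/json")

print(glif.effective_format())
print(json_doc.effective_format())
```

The second field only adds information for the text types, where the MIME type alone does not say how to parse the content.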

I think we can put the formalisms we know in the terminology, while staying flexible enough to allow local formalisms, e.g. by using “local” as the terminology_id. If we don’t do that, we’ll need to update the terminology for every new formalism.

Not sure about having two fields, it seems the role of both is the same. We can do that with parameters or suffixes on MIME types (the structure allows that): https://en.wikipedia.org/wiki/Internet_media_type
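As a small illustration of what suffixes and parameters give us, here is a minimal sketch that splits a media type string into its parts. The function name `parse_media_type` and the `formalism` parameter in the example are made up for illustration; only the general `type/subtype+suffix; key=value` shape comes from the media type syntax.

```python
def parse_media_type(value: str) -> dict:
    """Split 'type/subtype+suffix; key=value' into its components."""
    main, *param_parts = [part.strip() for part in value.split(";")]
    top, _, subtype = main.partition("/")

    # A structured syntax suffix follows the last '+' in the subtype.
    base, plus, suffix = subtype.rpartition("+")
    if not plus:
        base, suffix = subtype, None

    params = {}
    for part in param_parts:
        if not part:
            continue
        key, _, val = part.partition("=")
        params[key.strip().lower()] = val.strip().strip('"')

    return {
        "type": top.lower(),
        "subtype": base.lower(),
        "suffix": suffix.lower() if suffix else None,
        "params": params,
    }


opt = parse_media_type("text/opt+xml")
# 'formalism' is a hypothetical parameter, not a registered one.
glif = parse_media_type("text/plain; formalism=glif3")
```

With this, a single value like text/opt+xml says both "this is XML" (the suffix) and "this is an OPT" (the subtype), which is the role the second field would play.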

With CODE_PHRASE we can cover both cases, defined and non-defined MIME types.

If the MIME type is a String, we lose control over the values. I’m thinking of the case where I try to process something I receive and don’t “understand” that String. With terminology = local, I know that I may not have the code (if “local” belongs to another system), but for a MIME type from IANA I will have it. This also allows us to define our own openEHR types without needing to register them at IANA, like ADL or OPT (e.g. text/opt+xml).
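Roughly what I have in mind on the receiving side, as a sketch only: `CodePhrase` here is a simplified stand-in for the RM class, `KNOWN_CODES` is an illustrative subset, and the terminology id "IANA_media-types" is assumed to be the identifier used for the media type code set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodePhrase:
    """Simplified stand-in for CODE_PHRASE: a terminology id plus a code."""

    terminology_id: str
    code_string: str


# Illustrative subset of codes this receiver knows, keyed by terminology id.
# A sender's "local" codes are deliberately absent: we cannot assume them.
KNOWN_CODES = {
    "IANA_media-types": {"text/xml", "application/json", "text/csv"},
}


def can_interpret(formalism: CodePhrase) -> bool:
    """True if the receiver knows both the terminology and the code."""
    codes = KNOWN_CODES.get(formalism.terminology_id)
    return codes is not None and formalism.code_string in codes


print(can_interpret(CodePhrase("IANA_media-types", "application/json")))
print(can_interpret(CodePhrase("local", "adl")))
```

The terminology id tells me up front whether a lookup can be expected to succeed, which a bare String cannot do.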

Hi Pablo,

In openEHR, a terminology id of 'local' normally refers to terms inside an archetype; each archetype is its own terminology. This makes sense for clinical terms in an archetype, but I don't think it does for terms to do with data formats. We can easily add these to the openEHR vocabulary, which then provides a single reliable source of such terms and doesn't require archetype authors to repeatedly re-invent (different) terms for the same thing.

- thomas