Support Markdown in archetype/template metadata, descriptions, comments etc?

ian.mcnicoll · 12 July 2023 15:35

We are starting to see a requirement to support some more complex formatting in ADL ‘metadata’ items, in particular on PROMS scores where their licensed use demands some specific formatting markup, e.g. use of bold text in some questions or headers.

We have experimented with using markdown to represent bold text, which keeps the text human-readable, but it might be useful if tooling recognised this and rendered it to make it easier to read.

MSK-HQ archetype

It might also be really helpful in making the use/misuse aspects of the archetype more digestible, and would allow links, images etc. to be embedded.

One option might be to add some sort of keyed item to other_details to indicate that markdown is being used in a specific archetype.

[“markup”] = <“markdown”>

I’d suggest sticking with GitHub markdown (GitHub Flavored Markdown Spec) or a subset of it:

Bold, italic, headings, numbered lists, bulleted lists, image links, web links, and possibly tables

ian.mcnicoll · 12 July 2023 15:53

FHIR says the following, which seems pretty sensible …

About the markdown datatype:

This specification requires and uses the GFM (Github Flavored Markdown) extensions on CommonMark format

Note that GFM prohibits Raw HTML

Systems are not required to have markdown support, so the content of a string should be readable without markdown processing, per markdown philosophy

Markdown content SHALL NOT contain Unicode character points below 32, except for u0009 (horizontal tab), u0010 (carriage return) and u0013 (line feed)

Markdown is a string, and subject to the same rules (e.g. length limit)

Converting an element that has the type string to markdown in a later version of this FHIR specification is not considered a breaking change (neither is adding markdown as a choice to an optional element that already has a choice of data types)
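As a rough illustration of the control-character rule quoted above, here is a minimal Python sketch. It assumes the quoted numbers are meant as decimal code points 9, 10 and 13 (tab, line feed, carriage return); the function name is hypothetical and not taken from FHIR or openEHR.

```python
# Assumption: the quoted code points are read as decimal 9, 10 and 13
# (horizontal tab, line feed, carriage return).
ALLOWED_CONTROL_CODE_POINTS = {9, 10, 13}


def find_disallowed_control_chars(text: str) -> list[tuple[int, int]]:
    """Return (index, code point) pairs for characters below 32 that are not allowed."""
    return [
        (index, ord(char))
        for index, char in enumerate(text)
        if ord(char) < 32 and ord(char) not in ALLOWED_CONTROL_CODE_POINTS
    ]
```

An empty list means the string passes this one rule; it says nothing about whether the markdown itself is well formed.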

Note this is purely about archetype metadata not about data in the patient record.

pablo · 12 July 2023 16:55

We would need to define a list of supported formats and specify how these should be interpreted. For instance, markdown isn’t a single thing, there are many flavours.

ian.mcnicoll · 12 July 2023 17:02

That’s why (in line with FHIR) I was suggesting GitHub markdown

heather.leslie · 13 July 2023 01:59

Hi Ian,

A number of thoughts.

Firstly, I note that this archetype is currently going through local review, and it is not aligned with the Editorial guidelines currently in place for the international CKM archetype ecosystem: https://openehr.atlassian.net/wiki/spaces/healthmod/pages/304742407/Archetype+content+style+guide#Scores-and-Scales.

So at some point, if your version is published, it will not conform to the international modelling. To be included within the international CKM, it will require significant changes and we will end up with conflicting models and data divergence.

While the content will essentially be the same, from what I can see in your current version of the archetype, it appears the ID, metadata etc. will be significantly different. The international approach is largely based on the idea that the archetype is simply a data representation of a validated score, with enough metadata to support distinguishing one data element from another. We deliberately and strategically avoid representing what the score is or how it is to be used by clinicians - the references point to the original papers or copyright holders to manage that. It is not our job to educate anyone about the score or scale, only to represent it as a computable spec, utilising the original paper or copyright holder descriptions so as not to change the semantics and to avoid any breach of copyright. In that way, we attempt to dodge some (?many) of the copyright issues; however, we all know that this is a messy area and there are no clear rules.

Secondly, it would be helpful to understand more about the requirements/drivers for your proposed markup changes. It will potentially impact other archetypes if we go down this path. If it is about UI presentation, then that is a bigger discussion that needs to be unpacked and discussed, perhaps with this archetype as the use case. If it is about copyright approval from the copyright holders, it may be a matter of educating them that archetypes are about data representation and not directly about UI representation. Perhaps strict copyright approval might require a different approach - perhaps a combination of the archetype plus concurrent publication of UI instructions or similar - I don’t know, but your reasons for requesting change will be very helpful in formulating the best solution.

I would really prefer that we are not reactive and rush to a solution in the ADL, until the broader impact on the archetype modelling approach & existing models is carefully considered and agreed.

ian.mcnicoll · 13 July 2023 08:53

Many thanks Heather,

There are a few questions here.

We definitely do want this archetype to be fit for international publication and are aware that the current revision does not meet CKM Editorial guidance rules, but this was deliberately left unchanged from the original proposal to give our new Editors a bit of experience, and to allow our reviewers to get a better sense of the overall process. We also want to try to assess the cost/resource for this kind of ‘baseline’ editorial activity - setting up reviews, fixing stylistic issues, aligning with CKM guidance etc. You can be assured that all of these will be addressed as part of the review round.

Completely agree that until now we have avoided getting into any kind of representation of UI elements, but we do think there is a case to be made for some limited support for this solely in the context of formal PROMS, particularly where licensed use requires very specific wording not only for the questions and answers but also for some heading guidance.

Appreciate this is contentious and needs discussion so I’ll add this to the existing topic on how to handle ‘original questions’, which is related.

Using markdown to indicate e.g. bold text is a further refinement of that idea, but I think there is a much stronger case for supporting markdown in the metadata parts of the archetype - purpose, use, misuse etc. Possibly even more so at template level, now that we can create template overrides for these in AD.

It is used to very good effect, IMO, in FHIR documentation, IGs etc

I definitely don’t want to rush into a solution in ADL but just raised the possibility of using a ‘whole-archetype’ flag to indicate that markdown is used in a particular archetype, as this would prevent any impact on existing archetypes that might use markdown ‘characters’ like # or *

sebastian.garde · 13 July 2023 11:33

Let’s concentrate on whether we can reliably add markdown or at least one fixed flavour.

There is also markdown support now in DV_TEXT (Data Types Information Model), so I think there should be some consistency in how this is spec’ed and implemented if possible.

In particular Ian’s suggestion:
[“markup”] = <“markdown”>
could then be
[“formatting”] = <“markdown”>
and default to
[“formatting”] = <“plain”>
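To make the default concrete, here is a minimal Python sketch of how a tool might resolve the flag. It assumes `other_details` has already been parsed into a plain dict; the `formatting` key and its two values follow the suggestion above and are not part of any published specification, and the helper name is hypothetical.

```python
# Hypothetical values, taken from the suggestion in this thread.
KNOWN_FORMATS = {"plain", "markdown"}


def resolve_formatting(other_details: dict[str, str]) -> str:
    """Return the declared text formatting, falling back to "plain"."""
    declared = other_details.get("formatting", "plain").strip().lower()
    # Unknown values are treated as plain text so nothing is rendered by surprise.
    return declared if declared in KNOWN_FORMATS else "plain"
```

Existing archetypes carry no such key, so they would resolve to "plain" and continue to display exactly as they do today.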

It is strongly recommended in the DV_TEXT specs that the markdown format is CommonMark and I would say we should not “strongly recommend” two different flavours.

GH flavour’s extensions not covered by CM seem to be not so relevant for this?

In any case: a consistent recommendation would be very helpful.

Links in the metadata are, I believe, already recognized by CKM and simply presented as links.
It looks like one of the two markdown flavours (GH) does this as well, whereas the other (CM) needs explicit <> indicators around the link. So if CM is chosen and we want to be faithful to the spec, we cannot do this automatically.

This requires some analysis of implications for downstream artefacts, transforms and downstream tooling.

ian.mcnicoll · 13 July 2023 12:12

GFM is an extension of CommonMark.
The only potentially useful extension in GFM is tables, so I’d be happy to settle for
CommonMark, or even a subset of that.

thomas.beale · 13 July 2023 15:03

That means there is an extra thing to code for and an extra field for authors to forget to update when markdown is added to some text that never previously had any.

I’d suggest we should just assume all text could have markdown (we should have done this earlier!) and render it according to tool preferences (which might be not to do any rendering).

This seems like a different topic, but - on that topic, I’m inclined to agree with Heather - mainly on the basis that maintenance / change of the semantics of a score is down to the authors of papers or guidelines outside the archetype - so it is better to point to it, not create a snapshot of such documentation. (@heather.leslie - the question of how such external content is documented surely warrants its own topic?)

Do you mean that if there are other archetypes with markdown like asterisk bullet lists etc, these might now be automatically rendered? Or some other use of asterisks (markdown loves asterisks) that may be rendered in unintended ways?

That is probably true. I would suggest a first step would be to implement a Github flavoured markdown renderer in CKM or AD, and just run current archetypes through it and eyeball them. That will be slow, but probably unavoidable.
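Before the eyeballing step, a crude pre-filter could narrow down which texts to look at. The Python sketch below is only a heuristic under my own assumptions: the patterns and the function name are illustrative, and a real check would run the text through an actual CommonMark or GFM renderer and compare the result with the plain text.

```python
import re

# Illustrative patterns for characters that markdown treats as significant.
SUSPECT_PATTERNS = {
    # Up to three spaces of indent, one to six '#', then whitespace.
    "heading": re.compile(r"^ {0,3}#{1,6}\s", re.MULTILINE),
    # A line starting with '*', '+' or '-' followed by whitespace.
    "bullet": re.compile(r"^ {0,3}[*+-]\s", re.MULTILINE),
    # Text wrapped in matching single or double asterisks.
    "emphasis": re.compile(r"(\*{1,2})\S(?:.*?\S)?\1"),
}


def flag_markdown_like(text: str) -> list[str]:
    """Return the names of the patterns found in the text, in a fixed order."""
    return [name for name, pattern in SUSPECT_PATTERNS.items() if pattern.search(text)]
```

Anything flagged would still need a human to decide whether the rendering is intended, e.g. an asterisk bullet list (probably fine) versus asterisks used as footnote markers (probably not).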

ian.mcnicoll · 13 July 2023 16:11

We actually agree - we left the ‘incorrect Purpose’ as an exercise for our editors.

I originally thought we should just allow markdown by default as you suggest but it does run the risk of some collisions with current archetypes which was why I suggested the extra flag, perhaps setting it as defaulting to true for new archetypes.

@sebastian.garde and I will have a play with what might happen if we start to add markdown. I’m actually inclined initially to row back to CommonMark - GitHub is just an extension on that and from our POV only adds tables.

Kanthan_Theivendran · 17 July 2023 18:31

Thanks Heather. PROMs are validated, patient-rated symptoms that are fixed. Therefore the total score, with its minimum and maximum range, is helpful based on the original scoring guidance for that particular PROM and forms an integral part of the archetype for that PROM. However, I do agree that the arbitrary rating of the single score such as mild, moderate and severe according to score groups (e.g. 1–30 = mild, 30–60 = moderate etc.) should not be represented or modelled within the archetypes.

I agree that the use and misuse metadata should be included to follow the style guide.