Refreshing archetypes related to pathology reporting

At the SNOMED meeting in Atlanta (I’m attending virtually :upside_down_face:) we will hear about building FHIR questionnaires using the widget on the National Library of Medicine site. Nothing new, certainly, but as the CAP and ICCR cancer protocols are now almost fully bound to their SNOMED CT codes, it offers an exciting opportunity to see a functional, algorithmic digital protocol with SNOMED CT codes take shape.

The widget is at https://lhcformbuilder.nlm.nih.gov/ . Click “start with existing form” and “import from local file”, then select the attached .json file for colorectal cancer. Most of the work is by Alejandro Lopez Osornio and Hwang Ji Eun, a Korean postdoc working with Scott Campbell (University of Nebraska) at SNOMED International. I added some stuff; it’s dead easy. This is only a preliminary file shared a few weeks ago and still a work in progress, but since it came out I think Alejandro and Hwang have continued the build on colon and on other cancers, which they will present.

Colorectal-Cancer-STDU-v1.1.R4.json (42.2 KB)

I have as yet not found a way to extract the generated data into a report or summary, but from previous experience working with Sectra I know that that is not a difficult task - for the initiated! :innocent: What appeals to me is the ease with which one can create a “goal” for a form in openEHR. I see it as a platform to easily add national or regional variables, grasp how it all functions and share a protocol with learned communities / pathologists at an early stage to get their input. I will pursue the matter and keep you posted.
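As a rough illustration of what such an extraction could look like: a minimal sketch, assuming the completed form can be exported as a FHIR R4 QuestionnaireResponse in JSON. The sample response, its linkIds and the placeholder code are invented for illustration and are not taken from the attached file.

```python
def answer_to_text(answer):
    """Render one QuestionnaireResponse answer as readable text."""
    if "valueCoding" in answer:
        coding = answer["valueCoding"]
        label = coding.get("display") or coding.get("code", "")
        code = coding.get("code")
        return f"{label} [{code}]" if code else label
    if "valueQuantity" in answer:
        quantity = answer["valueQuantity"]
        return f"{quantity.get('value')} {quantity.get('unit', '')}".strip()
    # Simple value[x] variants (valueString, valueBoolean, ...);
    # other complex types would be printed as raw dicts.
    for key, value in answer.items():
        if key.startswith("value"):
            return str(value)
    return ""


def summarise(items, depth=0):
    """Walk nested items and return one indented summary line per item."""
    lines = []
    for item in items:
        label = item.get("text") or item.get("linkId", "?")
        answers = item.get("answer", [])
        indent = "  " * depth
        if answers:
            rendered = "; ".join(answer_to_text(a) for a in answers)
            lines.append(f"{indent}{label}: {rendered}")
        else:
            lines.append(f"{indent}{label}")
        # Child items can hang off the item itself or off an answer.
        lines.extend(summarise(item.get("item", []), depth + 1))
        for a in answers:
            lines.extend(summarise(a.get("item", []), depth + 1))
    return lines


# Hypothetical response; "PLACEHOLDER-1" is not a real SNOMED CT code.
sample_response = {
    "resourceType": "QuestionnaireResponse",
    "status": "completed",
    "item": [
        {"linkId": "tumour", "text": "Tumour", "item": [
            {"linkId": "tumour.type", "text": "Histological tumour type",
             "answer": [{"valueCoding": {
                 "system": "http://snomed.info/sct",
                 "code": "PLACEHOLDER-1",
                 "display": "Adenocarcinoma"}}]},
            {"linkId": "tumour.size", "text": "Maximum tumour dimension",
             "answer": [{"valueQuantity": {"value": 35, "unit": "mm"}}]},
        ]},
    ],
}

report_lines = summarise(sample_response["item"])
print("\n".join(report_lines))
```

For a real export, the dictionary would be loaded with `json.load` from the saved response file instead of being written inline.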

Meanwhile I’m working on the specifications files and MindMaps.

When have we planned our next meeting?

I would also like to know :slight_smile:

Good morning all,
Hope you are ok.

I was wondering, should we have a shared document/map for “mapping all data elements from ICCR reporting for the colon”, which is one of the action points from our last meeting?

And would it be good to reconvene in the first week of December?

The mapping for ICCR is pretty much completed in Excel and will be even more accessible in the FHIR questionnaire mentioned above, as this will contain the SNOMED CT codes too and allow checking for updates to SNOMED CT codes with something called the “FHIR Questionnaire Terminology Bindings Validation Tool”. I’m currently going through the Palga variables and adding them to the ICCR and CAP mappings, and hope to share the full list shortly. The RCPath mappings can come at a later stage. A shared document / map would be good, yes!
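To give an idea of how the bound codes could be pulled out of a questionnaire for review, here is a small sketch. It only lists the SNOMED CT codings found in `item.code` and `answerOption.valueCoding` of a FHIR R4 Questionnaire; it does not reproduce what the validation tool does, and the sample questionnaire and its `PLACEHOLDER-…` codes are made up for illustration.

```python
SNOMED_SYSTEM = "http://snomed.info/sct"


def collect_snomed_codings(items, found=None):
    """Return {code: display} for every SNOMED CT coding in nested items."""
    found = {} if found is None else found
    for item in items:
        # Codes bound to the question itself ...
        codings = list(item.get("code", []))
        # ... and codes bound to its answer options.
        codings += [opt["valueCoding"]
                    for opt in item.get("answerOption", [])
                    if "valueCoding" in opt]
        for coding in codings:
            if coding.get("system") == SNOMED_SYSTEM and "code" in coding:
                found.setdefault(coding["code"], coding.get("display", ""))
        # Recurse into child questions and groups.
        collect_snomed_codings(item.get("item", []), found)
    return found


# Hypothetical questionnaire fragment; codes are placeholders, not real ones.
sample_questionnaire = {
    "resourceType": "Questionnaire",
    "item": [{
        "linkId": "g1", "type": "group", "text": "Tumour",
        "item": [{
            "linkId": "q1", "type": "choice",
            "text": "Histological tumour type",
            "code": [
                {"system": SNOMED_SYSTEM, "code": "PLACEHOLDER-Q",
                 "display": "Histological tumour type"},
                {"system": "http://loinc.org", "code": "PLACEHOLDER-L"},
            ],
            "answerOption": [
                {"valueCoding": {"system": SNOMED_SYSTEM,
                                 "code": "PLACEHOLDER-A1",
                                 "display": "Adenocarcinoma"}},
                {"valueCoding": {"system": SNOMED_SYSTEM,
                                 "code": "PLACEHOLDER-A2",
                                 "display": "Mucinous adenocarcinoma"}},
            ],
        }],
    }],
}

bindings = collect_snomed_codings(sample_questionnaire["item"])
for code, display in bindings.items():
    print(code, display)
```

The resulting list could then be compared against the Excel mapping, or checked against a terminology server, whichever the group prefers.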

I do not know when the “official” FHIR questionnaire that Alejandro and colleagues are working on will be released by SNOMED International, but I’ve requested a sneak preview! :grin: I do not intend to continue building the colon FHIR questionnaire, since Alejandro and colleagues will already have done it.

I believe we should also spend our time on chopping up the protocols into common, digestible parts that could be converted to archetypes. I hope to get some options out within a week or two, certainly before our next meeting. The beginning of December sounds excellent.

I hope several people in this thread will be in Arnhem on the 17th of November!

OK, I’ve received the finished .json files on Breast and Colon from SNOMED; prostate to follow some time this week. I will check the files and submit them for your perusal together with the Excel files shortly.

Hi everyone,

I have created a poll (hopefully I’ve done it correctly) to try to find the best spot to meet:

Please mark your availability for: Pathology modelling - part 2 https://whenavailable.com/event/Uniz8e2BZj3adsBp6

I set it up so there’s time to reply until the 17th (next Friday)

Great Marlene, thanks. I’ve filled in the poll.

Hi all,
I have attached a draft meeting agenda to the calendar invitation. If you would like to add something to it, please let me know. I made it very high-level.
Marlene

Hi Marlene!
Thanks for your lead.
I can’t find the agenda, I must be looking in the wrong place – please advise!

Hi Stefan, I just sent an update of the invitation which should contain the agenda. I thought I had saved it correctly. Please could you confirm that it is attached in there?

It might be that if I only save the changes and do not send the update, it won’t show outside the organisation.

Sorry!

Hi Marlene! Yes, attached: all in good order – thank you!

Dear all,

Please excuse my long silence. In recent weeks, I have built, tested and discarded many concepts for capturing microscopy findings in openEHR. The three concepts I tested are: the specialist design, the generic design, and the name/value design using the existing Laboratory test result archetype with the Analyte cluster.

One challenge in mapping microscopic findings, or any diagnostic report, in openEHR is representing complex, interconnected data elements. For example, in microscopic assessments, all data elements of a tumor must be assignable to that tumor (e.g. squamous cell carcinoma, 2 cm, lung). Some information, like diameter, contains only one data element, while other information, like IHC analyses, contains at least four elements (TTF1, 90%, membranous, moderate) that also need linking together. And within one microscopic assessment, an adenocarcinoma could be found alongside the squamous cell carcinoma. The question is: how do you want to reflect this?

Here is an example to illustrate. Imagine we found two tumors in one microscopic assessment: how do you link the correct tumor diameter to the respective tumor? In the simplest design (name/value) we can address this by providing specific names, e.g. Tumor1 - 3 cm and Tumor2 - 5 cm. While this works for information with a single data element, it becomes difficult in scenarios with n data elements. For instance, how do you map the link among the data elements of IHC? E.g. in the example below, is ROS1 or ALK 50% positive?
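The ambiguity can be made concrete with a small sketch. The flat list mimics the name/value design; the nested classes are a hypothetical grouping (class and field names are invented here, and the values are illustrative only, not clinical data) in which every IHC result stays attached to its tumor.

```python
from dataclasses import dataclass, field
from typing import List

# Flat name/value pairs: nothing says which marker the 50% belongs to,
# or which tumor the IHC results were obtained from.
flat_pairs = [
    ("Tumor1 diameter", "3 cm"),
    ("Tumor2 diameter", "5 cm"),
    ("IHC marker", "ROS1"),
    ("IHC marker", "ALK"),
    ("IHC positive cells", "50%"),
]


@dataclass
class IhcResult:
    marker: str
    positive_cells_percent: int
    staining_pattern: str
    intensity: str


@dataclass
class Tumor:
    histology: str
    diameter_cm: float
    site: str
    ihc: List[IhcResult] = field(default_factory=list)


# Nested grouping: each IHC result is linked to exactly one tumor.
tumors = [
    Tumor("squamous cell carcinoma", 3.0, "lung"),
    Tumor("adenocarcinoma", 5.0, "lung", ihc=[
        IhcResult("ROS1", 50, "cytoplasmic", "moderate"),
        IhcResult("ALK", 0, "none", "none"),
    ]),
]


def positive_percent(tumor_list, histology, marker):
    """Look up the positive-cell percentage for one marker of one tumor."""
    for tumor in tumor_list:
        if tumor.histology == histology:
            for result in tumor.ihc:
                if result.marker == marker:
                    return result.positive_cells_percent
    return None


print(positive_percent(tumors, "adenocarcinoma", "ROS1"))
```

In the flat list the same question cannot be answered without relying on naming conventions or on the order of the pairs.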

The specialist archetype design
In the specialist archetype model, one archetype is created for each report type (prostate cancer, lung cancer, etc.). The design is straightforward: for every piece of information a data element is created and linked with the node ID path. Below is the example of the Microscopic findings – prostate cancer archetype that captures all required data elements for prostate carcinoma.

Advantages:

  • Easy querying, as data elements are addressable via specific node ID paths
  • Can comprehensively capture the complexity of diagnostic workflows
  • Intuitive archetype design as focused on one narrow domain
  • Allows optimizing data capture for certain applications

Disadvantages:

  • Significant duplication across many niche archetypes
  • Cumbersome to govern and maintain numerous archetypes
  • Updates required across multiple redundant archetypes
  • Lack of reusability leads to siloed data
  • Complex to query across diverse specialized archetypes
  • Risk of inconsistent data from slight differences in designs
  • Change management overhead multiplied manyfold

The generic archetype design
The generic archetype is designed for one examination modality, such as microscopic findings, MRI findings, etc. Here the Microscopic findings archetype contains the core information captured in a microscopic assessment. Customization for different findings can happen at the template level.

Picture 3

Since each data element has a standardized node ID path that is uniform for all microscopic findings, the information can be easily queried.
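A toy illustration of that point, using plain dictionaries rather than real openEHR compositions: the keys and the path string below are invented stand-ins for archetype node ID paths, and a real system would express this as an AQL query. The sketch only shows that one path can serve every report type when the structure is shared.

```python
def get_path(node, path):
    """Follow a slash-separated path through nested dictionaries."""
    for key in path.split("/"):
        node = node[key]
    return node


# Two reports of different cancer types sharing the same generic structure.
# All keys and values are hypothetical.
reports = [
    {"report_type": "prostate",
     "microscopic_findings": {
         "tumour": {"histological_type": "adenocarcinoma",
                    "maximum_dimension_mm": 12}}},
    {"report_type": "colorectal",
     "microscopic_findings": {
         "tumour": {"histological_type": "adenocarcinoma",
                    "maximum_dimension_mm": 35}}},
]

# One path works for every report because the structure is uniform.
DIMENSION_PATH = "microscopic_findings/tumour/maximum_dimension_mm"

sizes = {r["report_type"]: get_path(r, DIMENSION_PATH) for r in reports}
print(sizes)
```

With specialist archetypes, each report type would need its own path for the same question.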

Advantages:

  • Highly scalable since it promotes reuse across diverse clinical scenarios
  • Consistent structure enabling simpler governance
  • Easier to manage updates and extensions
  • Avoids duplication of data elements
  • Simplifies querying since node IDs are consistent across all reports

Disadvantages:

  • More conceptual complexity designing generically
  • Less optimized for specific workflow needs
  • Can underfit specialized use case requirements
  • Need for localized customization may emerge
  • Monolithic archetypes require bigger change sets
  • Lose intuitive real-world specificity

Name/value design
The last approach is the name/value design. As you can see, I plugged in the analyte cluster multiple times into the laboratory test result archetype to capture all values.

Picture 5

Advantages:

  • Very simple and flexible structure
  • Easy to add new named data points as needed
  • Enables recording almost any test-result combination
  • Loose coupling allows mixing diverse data elements
  • Querying across name-value pairs is straightforward

Disadvantages:

  • Lacks rich semantics to interpret findings
  • Very limited ability to structure relationships between findings
  • Unable to express complex clinical concepts
  • Risk of inconsistent naming for same elements
  • Requires significant terminology binding work
  • Inadequate representation of contextual meaning
  • Limited ability for computational inference

While this problem and potential solutions are described in the context of microscopic findings, it is a generic issue affecting radiology, genomics, immunology and other domains too.

I would be very interested in your thoughts.

Very many thanks for this. It brings back (un?) happy memories of the challenges we faced the last time around.

I’ll try to simplify the problem space by immediately excluding the name/value pair option. This essentially mimics the FHIR Observation / v2 lab pattern, and works well for simple Biochemistry-type labs, since these are generally simple name/value results.

The problem with applying this pattern to histopathology is that you end up having to apply increasingly complex constraints to these simple patterns to represent the constraints in a path report. You are essentially just deferring the complexity, unless there is no attempt to add constraints or validation to the models: fire-and-forget message exchange, essentially.

So that leaves

The generic archetype design vs. the specialist archetype design

What is probably not apparent is that in the earlier work we did actually start out with the aim of doing generic archetype design, albeit broken down into individual clusters like tumour_invasion or tumour_margins, but increasingly we found that at the leaf node level, in supporting direct care, the generic pattern tended to break down: we were having to add more and more cancer-specific elements, and to constrain out the generic elements.

So at that point we decided to build cancer-specific archetypes to handle those parts of the data tree that were unique to that tumour. The prostate microscopy cluster is actually a really good example: if you look closely you should see that the elements defined are all very specific to the way that e.g. tumour margins are defined for prostate cancer, quite differently from e.g. bowel cancer.

The generic pattern does allow more reuse, in theory, but as with the name-value pair option, the reality is that different cancers are reported quite differently in many cases, and you end up with just as much of a management burden downstream to turn the generic patterns into something more specific.

The generic pattern does also, in theory, have the potential to make cross-tumour querying possible, which is attractive for pre-populating registries, e.g. for “Tumour margins”.

Again, this is less easy than it seems, because very often the detailed direct care representation for prostate cancer is quite different from that for bowel or lung and is simply not directly cross-queryable from the original data.

However, this is not too dissimilar from the issue we already faced in Wales when looking at the challenge of Tumour staging - the direct care perspective is often cancer-specific, whereas the reporting/cross queryable dataset is often more generic. The solution/suggestion there is to carry both generic and specific representations, and wherever possible calculate the generic response from the specific in real-time. So using your examples, a prostate cancer template would carry both the generic and specialised archetypes.

For tumour margins, the specialised archetype would carry the detailed prostate-specific representation, but with a set of rules to also populate a simple generic representation such as “clear margins”.
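A minimal sketch of what such a rule could look like. The class names, field names, margin locations and status values are all invented for illustration; they are not taken from any existing archetype, and the real rules would need to be agreed clinically.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class GenericMarginStatus(Enum):
    CLEAR = "clear margins"
    INVOLVED = "involved margins"
    NOT_ASSESSABLE = "margins cannot be assessed"


@dataclass
class ProstateMargin:
    """Hypothetical prostate-specific margin entry."""
    location: str                      # e.g. "apical", "posterior"
    status: str                        # "not involved" | "involved" | "cannot be assessed"
    extent_mm: Optional[float] = None  # only meaningful when involved


def derive_generic_margin(margins: List[ProstateMargin]) -> GenericMarginStatus:
    """Compute the generic summary from the detailed, specific entries."""
    # Any involved margin makes the overall status involved.
    if any(m.status == "involved" for m in margins):
        return GenericMarginStatus.INVOLVED
    # No entries, or any unassessable margin, prevents a "clear" statement.
    if not margins or any(m.status == "cannot be assessed" for m in margins):
        return GenericMarginStatus.NOT_ASSESSABLE
    return GenericMarginStatus.CLEAR


specific = [
    ProstateMargin("apical", "not involved"),
    ProstateMargin("posterior", "involved", extent_mm=2.0),
]
print(derive_generic_margin(specific).value)
```

The detailed entries remain the source of truth; the generic value is recomputed from them whenever they change, so the two representations cannot drift apart.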

I think this is probably where we will land.

Great work!!

I agree, wonderful work, @Maurice247 ! And thank you very much for your critical appraisal and additions, @ian.mcnicoll - as well as the historical perspective! :wink:

I’ve had a little time to look at this using the ICCR protocols as the first reference for content (although they are not nearly as granular as one may wish, as I will demonstrate with the expanded NLM questionnaire for colon cancer). I would also aim for a hybrid generic-specialised distribution of data elements. We can certainly go a long way with generic, and perhaps dedicate one separate archetype that collects cancer-specific questions from different cancer types. This might help limit the number of changes/amendments that become necessary when further ICCR protocols are published.

Maurice and I have arranged to meet online ahead of the upcoming meeting on 8th December. We hope to align our thoughts and find further inroads into this very fundamental question, to be shared with you on the 8th.

Thanks Stefan,

Agree.

  1. Try to use generic pattern when we can
  2. Accept that this will break down or become cumbersome to constrain in quite a number of places. For those, accept the need for tumour-specific specialist clusters
  3. Where we need specialist clusters, try to pre-populate the generic and simplified representations from the specific, automatically in real time.

Great, thanks Ian.

I don’t understand the “automatically in real time”. Can you explain, or otherwise leave it for next week?

This might be a key point when working with structured health data, where the complexity and depth of the data is infinite. When going deeper into the structures, try to leave summaries at the higher-level models. This way different applications and users can use the same data in multiple contexts.

When adding a new CLUSTER to a shared/generic archetype, e.g. to represent specific details, leave some data in the generic archetype.

Is this what you meant @ian.mcnicoll ?

@ian.mcnicoll thank you for sharing your experience. During my testing, I met with a former Siemens engineer who worked on the AI Pathway Companion (a product to retrospectively organise clinical information). He confirmed your criticisms of the generic approach. I therefore believe that @SDubois’ hybrid proposal is promising. If we create a “core archetype” that covers the basic structure needed in the majority of cancer reports and clone it for the respective cancer reports, we would have the advantages of both worlds. The core model allows for centralized governance, provides a clear structure for how to specialize it further (e.g. how to nest information to preserve semantics) and enables standardized querying. I am super excited to discuss this further with the entire group and create a first draft.

We’re going places with this, nice!

Thank you @bna for your explanation.

I believe strongly that trying to assemble “all” available variables from all cancer protocols, be it in ICCR or elsewhere, should be the first step to have an overview of what is needed, what will be included in the generic and what will necessitate specialist clusters.

The centralized governance @Maurice247 brings up is paramount. Otherwise we may end up with all sorts of dialect protocols.

Thank you @mar.perezcolman for organizing the meeting! For the agenda, Stefan and I would like to suggest we invest some time in discussing how best to map relationships among data elements (semantics). My post about specialist vs. generalist approaches could be a starting point for the discussion.

Looking forward to meeting you!
