Semi-structured narrative data

Yes, within openEHR system environments whose software and models were written by semantically conscious people, this is all pretty reasonable.

Just be aware that when the data get sucked into some other environment, users there may make the assumption that the codes embedded in the data express the whole and true semantics of the data. If they do, incautious use of codes in our nice openEHR environment may have unintended consequences later.

This is not to say don’t do it; just that this is the kind of risk being run. It might be a low / no risk.

4 Likes

Aah yes, that’s a valid concern. And another reason I hate mapping to outside systems. Assumptions that make sense in one system are crazy dangerous in another. I hope we can agree that in an openEHR system, mapping (a piece of) a DV_TEXT in an EVALUATION.clinical_synopsis.synopsis to a SNOMED CT clinical finding does not make it a diagnosis. (And I would argue the same for other SNOMED uses: a terminology is not a fully computable information system, why else would we need openEHR archetypes?)
Then I’m willing to take the risk other people do something stupid.
But having said that: could it help to add a character to TERM_MAPPING.match indicating an approximate match, for example ~? This would make the intention of the mapping even clearer in openEHR. And if we used a URI as Ian suggested, via a markdown URL with the protocol set to openehr_mapping:://, that would be an indication for an implementer in another system to have a look at the information in the openEHR TERM_MAPPING class, and the ~ match would be a second warning not to assume too much.
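
To make the proposal concrete, here is a minimal Python sketch of a term mapping with a match character. The four existing values (`>`, `=`, `<`, `?`) are the ones defined for TERM_MAPPING.match in the openEHR reference model; `~` is only the suggestion from this post and is not part of the specification. The class, its field names and the helper method are simplified assumptions for illustration, not a real openEHR library API.

```python
from dataclasses import dataclass

# Match characters defined for TERM_MAPPING.match in the openEHR RM.
STANDARD_MATCHES = {
    ">": "target term is broader than the mapped text",
    "=": "target term is (supposedly) equivalent to the mapped text",
    "<": "target term is narrower than the mapped text",
    "?": "kind of match is unknown",
}

# Hypothetical extension discussed in this thread; NOT in the specification.
PROPOSED_MATCHES = {"~": "target term is only an approximate match"}


@dataclass(frozen=True)
class TermMapping:
    """Simplified stand-in for openEHR TERM_MAPPING (illustrative only)."""

    match: str
    terminology_id: str
    code_string: str

    def __post_init__(self) -> None:
        # Reject anything that is neither a standard nor the proposed character.
        if self.match not in STANDARD_MATCHES and self.match not in PROPOSED_MATCHES:
            raise ValueError(f"unknown match character: {self.match!r}")

    def is_equivalent(self) -> bool:
        # Only an explicit '=' should be read as "means the same as the text".
        return self.match == "="


# A piece of synopsis text loosely mapped to a SNOMED CT concept.
mapping = TermMapping(match="~", terminology_id="SNOMED-CT", code_string="254837009")
```

A consumer that only trusts `is_equivalent()` would then never mistake an approximate or unknown mapping for a statement about the patient.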

No. Most parts of SNOMED CT have a context, although the context is often expressed as a default context. See for example 6.2.3. Default Context - Search and Data Entry Guide.

Btw, I think that the entire SNOMED CT Search and Data Entry Guide would be interesting for this discussion.

1 Like

I think that you oversimplify SNOMED CT here. SNOMED CT is both a terminology and an ontology and you can therefore perfectly well interpret and draw conclusions based on the meaning of a SNOMED CT concept.

Hi Mikael,

I agree - my concern was not so much about the power of SNOMED CT itself but the ability of any NLP to correctly pick up the appropriate context and apply it, or associate other parts of attribution like dates. I know there has been a lot of interest in this approach and I have a UK colleague working on it - I’ll see if I can get him to do a demo of their narrative-> SNOMED CT solution

1 Like

Hi @mikael , interesting, I didn’t know about snomed default contexts. Thank you for educating me.
I read the default context for a finding (e.g. UTI) to be:

* The finding has actually occurred (vs. being absent or not found).
* It is occurring to the subject of the record (the patient).
* It is occurring currently or at a stated past time.

But this still leaves out a lot of the context you need to programmatically conclude that a patient ‘has’ a UTI. E.g. is it a diagnosis? Who made the diagnosis (doctor/nurse/neighbour/facebook)? Is the diagnosis clinically significant or just a mild bacteriuria? Etc.
Otherwise we wouldn’t need information models at all, right?

The downside of this default context is that terms that do not match that context (e.g. ‘family history of UTI’) are not codable in SNOMED.

The Search and Data Entry Guide sure seems interesting. Any recommendation on how to approach it? Aside from starting at page 1 and spending multiple weekend days before you end up at page 65? (a)
I do now better appreciate Ian’s concern about automated SNOMED encoding. But this also goes for average users: they won’t understand default context, which means the scope of usage of SNOMED is much smaller than I hoped.

1 Like

Yes @ian.mcnicoll, I know that there are good NLP solutions that can tag text with SNOMED CT concepts and I agree that capturing the context is the hard(est) part of the process.

However, I have also seen less than good NLP solutions for SNOMED CT tagging that have missed the default context and similar SNOMED CT features. Hence my comments.

3 Likes

Keep your comments coming - I think we are somewhat out of date on some aspects of recent Snomed technology, so don’t be afraid to correct us.

1 Like

Hi @joostholslag,

Yes, you have understood the default context correctly. I also agree that SNOMED CT, despite the default context, leaves quite a lot to the information model to specify.

It is also perfectly fine to override the default context of the clinical finding and procedure concepts in SNOMED CT. That is why the context is only a default and not a stated context. However, I would argue that the override needs to be done in some machine-readable format. It would be perfectly fine to use the SNOMED CT concept 254837009 | Malignant neoplasm of breast (disorder) | inside a Family history attribute in an archetype, and it would be formally interpreted as Family history of malignant neoplasm of breast. However, I would strongly advise against tagging free text like this:

The patient has a family history of <code="254837009 | Malignant neoplasm of breast (disorder) |">malignant neoplasm of breast</code>.

Because then the override of the default context would not be stored in a machine readable format and information systems would then, for good reasons, assume that the default context is present. This is the main reason why I think that we should be very careful with allowing partial tagging of free text.

It is true that quite few Family history of X concepts exist as stand-alone concepts in SNOMED CT. (Currently there are 680 of them. :smiley:) However, it is possible to use the SNOMED CT Compositional Grammar to express Family history of X with a post-coordinated expression, like

416471007 | Family history of clinical finding (situation) | : 
     246090004 |Associated finding (attribute)| = 254837009 |Malignant neoplasm of breast (disorder)|

for all clinical findings and procedures.

(In this specific case, there actually exists a SNOMED CT concept that expresses this, 429740004 | Family history of malignant neoplasm of breast (situation) |, and a classifier would automatically understand that this concept is semantically equivalent to the post-coordinated expression above.)
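
As a rough illustration of the point about machine-readable overrides, the Python sketch below builds the post-coordinated expression from above whenever the information model says the code sits in a family history slot, and otherwise leaves the concept alone so that the default context applies. The function names and the `model_context` label are assumptions made up for this example; only the concept identifiers and the expression shape come from the post.

```python
# Concept strings copied from the post-coordinated expression above.
FAMILY_HISTORY = "416471007 |Family history of clinical finding (situation)|"
ASSOCIATED_FINDING = "246090004 |Associated finding (attribute)|"
BREAST_CANCER = "254837009 |Malignant neoplasm of breast (disorder)|"


def family_history_of(finding: str) -> str:
    """Wrap a finding concept in a post-coordinated family history expression."""
    return f"{FAMILY_HISTORY} : {ASSOCIATED_FINDING} = {finding}"


def expression_for(finding: str, model_context: str) -> str:
    """Return the expression implied by the information-model context.

    `model_context` is a hypothetical label for the archetype slot holding
    the code; a real system would derive it from the archetype path.
    """
    if model_context == "family_history":
        return family_history_of(finding)
    # No machine-readable override: the SNOMED CT default context applies.
    return finding


in_family_history = expression_for(BREAST_CANCER, "family_history")
in_problem_list = expression_for(BREAST_CANCER, "problem_diagnosis")
```

The same concept therefore yields two different meanings, and the difference is carried entirely by something a program can read, which free text around a tag cannot provide.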

I haven’t read the Search and Data Entry Guide for a while, but I think that chapter 6. Data Entry is the most relevant for this use case.

2 Likes

Hi Mikael, this helps me a lot to better understand SNOMED. And it’s valuable advice to be careful with NLP tagging of free text. I understand the issue you present. But I’m curious what precautions you take (on query, or otherwise) so that a query does not return a breast cancer when the code sits in a family history archetype. Do you use AQL to filter for only SNOMED findings in problem/diagnosis archetypes? If so, we could do the same for the clinical synopsis archetype, right?

And could we use the SNOMED CT Compositional Grammar to do proper SNOMED encoding of free text with NLP?

I’m curious about actual use of SNOMED for querying datasets; it’s even harder than I thought to pick the right code. And I assume many errors are made in implementing systems? I’m quite sceptical that our implementation is reliable, now that I have learned more.

Hi @joostholslag,

I am happy to help.

But I’m curious what precautions you take (on query, or otherwise) so that a query does not return a breast cancer when the code sits in a family history archetype. Do you use AQL to filter for only SNOMED findings in problem/diagnosis archetypes? If so, we could do the same for the clinical synopsis archetype, right?

My view is that AQL combined with the SNOMED CT Expression Constraint Language is a good way to query this kind of content. And it can be used in all kinds of situations where the archetype specifies a specific context, including clinical synopsis.
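
The idea can be sketched in a few lines of Python: first restrict by archetype (the part AQL would do), then restrict by subsumption (the part an ECL "descendant or self" constraint would do). This is a toy model under stated assumptions, not real AQL or ECL: the `CodedEntry` class, the hand-written is-a table and the placeholder parent code `X-BREAST-DISORDER` are invented for the example, and the archetype ids are only illustrative. A real system would ask a terminology server for the descendants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedEntry:
    archetype_id: str  # which archetype the code was recorded in
    code: str          # SNOMED CT concept id as stored in the record


# Toy is-a table (child -> parent). The parent is a made-up placeholder.
IS_A = {"254837009": "X-BREAST-DISORDER"}


def descendants_or_self(ancestor: str, is_a: dict[str, str]) -> set[str]:
    """Collect the ancestor and everything below it in the toy hierarchy."""
    result = {ancestor}
    changed = True
    while changed:
        changed = False
        for child, parent in is_a.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def query(entries, archetype_id, ancestor, is_a):
    """Keep entries from one archetype whose code is subsumed by `ancestor`."""
    wanted = descendants_or_self(ancestor, is_a)
    return [e for e in entries if e.archetype_id == archetype_id and e.code in wanted]


entries = [
    CodedEntry("openEHR-EHR-EVALUATION.problem_diagnosis.v1", "254837009"),
    CodedEntry("openEHR-EHR-EVALUATION.family_history.v2", "254837009"),
]
hits = query(
    entries, "openEHR-EHR-EVALUATION.problem_diagnosis.v1", "X-BREAST-DISORDER", IS_A
)
```

Here `hits` contains only the problem/diagnosis entry: the family history entry carries the same code but is excluded by the archetype filter.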

And could we use the SNOMED CT Compositional Grammar to do proper SNOMED encoding of free text with NLP?

Yes, as long as we are careful. :slight_smile: If the example above were changed to
<code="416471007 | Family history of clinical finding (situation) | : 246090004 |Associated finding (attribute)| = 254837009 |Malignant neoplasm of breast (disorder)|"> The patient has a family history of malignant neoplasm of breast </code>.
it would be a correct free text tagging.

I’m curious about actual use of SNOMED for querying datasets; it’s even harder than I thought to pick the right code. And I assume many errors are made in implementing systems? I’m quite sceptical that our implementation is reliable, now that I have learned more.

Well, as usual in the healthcare sector, I assume that you need dedicated people with good knowledge of each method you use. But that also applies to openEHR. :slight_smile:

1 Like

Have a look at the PEN&PAD user interface from the 90’s and pair it with currently available voice recognition and some other context-aware AI. Also look at the generated text summary in the upper right corner of PEN&PAD. https://youtu.be/PGEAmJJ4frU (Demo starts at 11:25)

2 Likes

Finally found time to read chapter 6 of the SNOMED guide. It now makes much more sense to me. The key takeaway for me is that the soft default context can be overruled by the information model, but the override must be computer processable. So a finding’s default context can be overruled by using it in a family history archetype. But it can’t be overruled from free text, since computers cannot be assumed to understand that. I do hope all SNOMED implementers are aware of this, and that they don’t just collect a list of all SNOMED codes for a patient in a single db column (without specifying the information model context).

3 Likes

Hi

You can only override the default context if the adopted context is compatible. In “Family History”, clinical findings are not stated to have occurred in the patient, so the meaning of the concept used can be critically affected. The editorial guide has a reference explaining this situation.

4 Likes

Thank you! (I haven’t read the Editorial guide for a while, so I didn’t know that it had been clarified.)

1 Like

Dear Nuno, thank you for your reply.
I’m struggling to understand that piece of the editorial guide. Am I to understand that, contrary to the earlier conclusions @mikael and I drew from the SNOMED CT Search and Data Entry Guide, it’s not acceptable to record findings in a family history field?

No. It only seems to be a more complicated way of expressing what we had already concluded.

1 Like

Maybe @joostholslag and others would be interested in this free webinar about “SNOMED CT Terminology Binding - a state-of-the Art Review with Recommendations for Practice and Research” that is scheduled for 2022-01-19 15:00 UTC? More information can be found on the page SNOMED - Events if you scroll down to the “Upcoming Research Webinars” heading.

3 Likes

Hi @joostholslag
It depends on how you want to use it. You can use it as a value set for the user to pick from, but the expression recorded has to be similar to the example shown by @mikael with the compositional grammar. I think we should use precoordinated terms if available, but it’s virtually impossible to cover all the needs. So, having the possibility to use compositional grammar seems a good way to go.

Just came across this: Amazon does something similar in their snomed omop API:

  • BeginOffset and EndOffset – The beginning and ending location of the text in the input note, respectively.
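
A minimal Python sketch of what offset-based tagging looks like, using the malignant neoplasm of breast example from earlier in the thread. The `TaggedSpan` class and `tag_first` helper are invented for illustration and are not Amazon's API; the field names merely echo BeginOffset and EndOffset, and treating the end offset as exclusive is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaggedSpan:
    begin_offset: int  # index of the first tagged character
    end_offset: int    # index just past the last tagged character (assumed exclusive)
    code: str          # SNOMED CT concept id attached to the span


def tag_first(text: str, phrase: str, code: str) -> Optional[TaggedSpan]:
    """Tag the first occurrence of `phrase` in `text`, or return None if absent."""
    start = text.find(phrase)
    if start == -1:
        return None
    return TaggedSpan(start, start + len(phrase), code)


note = "The patient has a family history of malignant neoplasm of breast."
span = tag_first(note, "malignant neoplasm of breast", "254837009")

# The offsets point back into the original note without altering it.
covered = note[span.begin_offset:span.end_offset]
```

Note that the span covers only "malignant neoplasm of breast"; the words "family history of" fall outside it, so offsets alone do not solve the default-context problem discussed above.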
1 Like