Decision trees for risk stratification deployed at point-of-care on top of openEHR? (Or other informatics / AI / machine learning models)

jaan · 22 February 2022 16:27

Hi,

I have built machine learning models that use unstructured data (such as clinical notes) and structured data (such as observations, labs, diagnoses, treatments, and conditions) to give clinicians a risk score for patient readmission, maternal health, mental health, and other areas of medicine. An example is [1904.05342] ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission, which has been used at academic medical centers in the US.

Do you know if anyone or any organization has deployed clinical decision support tools on top of openEHR that use machine learning, such as a decision tree model?

Would love to see the implementation to understand what would be involved, and build a technical roadmap to implement such a thing. Feel free to forward if someone in your network may know who I should chat with.
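
To make the question concrete, here is a minimal sketch of the kind of decision tree risk model meant above. It is not taken from any deployed system. It assumes the structured values have already been pulled out of the record (for example by an AQL query) into flat rows; the feature names, the toy data, and the helpers `build_tree` and `predict_risk` are all made up for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Return (weighted_gini, feature, threshold) of the best split, or None."""
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(rows, labels, features, max_depth=2):
    """Grow a small tree; each leaf stores the observed event rate as the risk."""
    risk = sum(labels) / len(labels)
    split = None
    if max_depth > 0 and 0 < risk < 1:
        split = best_split(rows, labels, features)
    if split is None:
        return {"risk": risk}
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], features, max_depth - 1),
        "right": build_tree([r for r, _ in right], [y for _, y in right], features, max_depth - 1),
    }


def predict_risk(tree, row):
    """Walk from the root to a leaf and return that leaf's risk."""
    while "risk" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["risk"]


# Hypothetical toy data: 1 = readmitted within 30 days, 0 = not readmitted.
rows = [
    {"prior_admissions": 0, "age": 34},
    {"prior_admissions": 1, "age": 51},
    {"prior_admissions": 0, "age": 72},
    {"prior_admissions": 3, "age": 68},
    {"prior_admissions": 4, "age": 45},
    {"prior_admissions": 2, "age": 80},
]
labels = [0, 0, 0, 1, 1, 1]

tree = build_tree(rows, labels, ["prior_admissions", "age"])
new_patient = {"prior_admissions": 5, "age": 60}
print(tree)
print(predict_risk(tree, new_patient))
```

A real model would be trained on far more data and validated clinically; the point is only that the tree consumes named, typed values directly, which is what I would hope to read out of openEHR.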

Thanks!
Jaan

Jaan Altosaar, PhD
Columbia University Irving Medical Center
NewYork-Presbyterian Hospital

P.S. Sorry if this isn’t the right place to ask! Please let me know and I’ll repost elsewhere

Seref · 22 February 2022 16:41

Probably overkill, but I happen to have a PhD on the topic (more or less, since I aimed for the moon with Bayesian Networks…)

I’m not aware of decision trees running on top of openEHR. Implementation-wise @rong.chen is the person who’s probably ahead of everybody in terms of using openEHR for CDS.

ian.mcnicoll · 22 February 2022 16:49

Not my field at all, but on a quick read of your paper @jaan, it looks as if, by providing highly structured, contextualised data, openEHR should remove the need for the kind of clinical note processing that you are doing.

Seems like an interesting new project to do a comparison of highly structured data ‘at source’ vs. what you can do with narrative processing, as input to the decision support?

jaan · 22 February 2022 16:56

Fantastic, thanks so much Seref! Downloaded your dissertation and excited to read.

Seref · 22 February 2022 16:58

Very kind of you Jaan. Always happy to have a chat about it and anything remotely related

jaan · 22 February 2022 16:58

You nailed it @ian.mcnicoll - that is the conclusion @thomas.beale and I arrived at as well

There is no ETL needed with openEHR, so we could train a very large version of the model I have built – and incorporate all the semantics.

Where could we get the resources to do this, and on which data?

Technically, I can build the machine learning component. (At Columbia, we work with ohdsi.org, but this is a single-level system built on OMOP, described in Chapter 4, "The Common Data Model", of The Book of OHDSI, so it is not great for semantic scalability.)

borut.jures · 7 January 2023 11:57

I thought I had added your thesis to my reading list just a few weeks ago. In reality it was more like a year ago (11 months)

Well I finally read it. Thank you for your clear writing that is easy to follow and enjoyable to read. Even if it spans 257 pages

Thank you for my first introduction to Bayesian Networks @Seref !

Seref · 9 January 2023 13:59

you’re welcome @borut.jures , thanks for the kind words.

I assure you it’s better to be on the reading side of 257 pages than the writing side…

I think Bayesian Networks are great, but they became a niche technique, especially with neural networks blowing up once the problem of training them at scale in parallel was solved. I suspect there’s some convergence now, but BNs in general have fallen out of favour. I personally think it’s a shame but it is what it is.