Decision trees for risk stratification deployed at point-of-care on top of openEHR? (Or other informatics / AI / machine learning models)


I have built machine learning models that rely on unstructured data such as clinical notes, and structured data such as observations, labs, diagnoses, treatments, and conditions, to give clinicians a risk score for patient readmission, maternal health, mental health, and other areas of medicine. One example, which has been used at academic medical centers in the US, is [1904.05342] ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission.

Do you know if anyone or any organization has deployed clinical decision support tools on top of openEHR that use machine learning, such as a decision tree model?

Would love to see the implementation to understand what would be involved, and build a technical roadmap to implement such a thing. Feel free to forward if someone in your network may know who I should chat with.


Jaan Altosaar, PhD
Columbia University Irving Medical Center
NewYork-Presbyterian Hospital

P.S. Sorry if this isn’t the right place to ask! Please let me know and I’ll repost elsewhere :slight_smile:

Probably overkill, but I happen to have a PhD on the topic :slight_smile: (more or less, since I aimed for the moon with Bayesian Networks…)

I’m not aware of decision trees running on top of openEHR. Implementation-wise, @rong.chen is probably ahead of everybody in terms of using openEHR for CDS.

Not my field at all, but on a quick read of your paper @jaan, it looks as if, by providing highly structured, contextualised data, openEHR should remove the need for the kind of clinical note processing that you are doing.

Seems like an interesting new project: comparing highly structured data ‘at source’ with what you can do with narrative processing, as input to the decision support?

Fantastic, thanks so much Seref! Downloaded your dissertation and excited to read.

Very kind of you Jaan. Always happy to have a chat about it and anything remotely related :slight_smile:

You nailed it @ian.mcnicoll - that is the conclusion @thomas.beale and I arrived at as well :slight_smile:

There is no ETL needed with openEHR, so we could train a very large version of the model I have built – and incorporate all the semantics.

Where could we get the resources to do this, and on which data?

Technically, I can build the machine learning component. (At Columbia, we work with a single-level system built on OMOP, described in Chapter 4 The Common Data Model | The Book of OHDSI, so it is not great for semantic scalability.)