New! Discourse AI / LLM driven features.
Hi, all - just a little announcement about the start of some interesting new features we are exploring in the openEHR Discourse instance. Having started off as a bit of an AI-sceptic, I am now relatively positive about the potential benefits for knowledge discovery and agentic coding. I’m sure there’s a range of views out there in the openEHR community.
Discourse have quietly been releasing a range of AI/LLM-driven features over the past few years and they were fairly early adopters of AI when it first became a commodity resource that could be consumed via API, around 2023.
There are a LOT of potential features and we probably won’t be enabling them all right away. Many of them require a bit of configuration and refinement to be useful. Some will have to be confined to specific user groups for cost control.
On this Discourse
Initially I have enabled:
AI Spam Detection
This uses an LLM to identify spam, resulting in much higher detection rates and a low false-positive rate. It’s a quality-of-life improvement for our moderators and should help us keep the spam level even lower than it already is.
AI Assisted Search
Discourse has a ‘just about OK’ built-in search, but it relies heavily on exact lexical matching of the search term. It can be maddening trying to find something you know is in here when you can’t hit on the exact wording. Adding AI means that, alongside the lexical matches, you will also get matches based on embeddings of the semantic meaning of what you searched for.
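To illustrate the idea (this is a toy sketch, not Discourse’s actual implementation): each post and each query is turned into an embedding vector, and posts are ranked by cosine similarity to the query, so a post can match even when it shares no exact words with the search term.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings"; real embedding models produce
# vectors with hundreds or thousands of dimensions.
query = [0.9, 0.1, 0.3]
posts = {
    "archetype design":   [0.8, 0.2, 0.4],  # semantically close to the query
    "server maintenance": [0.1, 0.9, 0.2],  # unrelated
}

# Rank posts by semantic similarity to the query, best match first.
ranked = sorted(posts, key=lambda p: cosine_similarity(query, posts[p]),
                reverse=True)
```

In a real system the embeddings come from a dedicated embedding model and the similarity search runs over a vector index rather than a Python loop, but the ranking principle is the same.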
For now this gets us set up, and we can see how the cost of the API usage looks. Feedback is welcome and essential! Just reply here (unless there is a security- or privacy-related issue, in which case PM me).
In The Future?
I would like feedback on these ideas in order to prioritise and shape the plan. Reading the Discourse AI docs will help you understand the range of what is possible here.
AI Chat Bot
An ‘openEHR chat’ feature could be useful, especially for new users of openEHR. It would allow a user to interact with the forum’s extensive knowledge base in natural language, through a chatbot-type interface similar to ChatGPT. Answers given by the bot are referenced back to the original topics for further reading.
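The “answers referenced back to the original topics” behaviour is the retrieval-augmented pattern such bots typically use. A hedged sketch (the function names `embed` and `llm` are placeholders for an embedding model and an LLM call, not Discourse’s actual API):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def answer_with_references(question, topics, embed, llm, k=3):
    """topics: list of dicts with 'title', 'url', and 'text' keys.
    embed(text) -> vector and llm(prompt) -> string are hypothetical
    stand-ins for the embedding model and the chat model."""
    q_vec = embed(question)
    # Retrieve the k forum topics most semantically similar to the question.
    scored = sorted(topics, key=lambda t: cosine(q_vec, embed(t["text"])),
                    reverse=True)
    context = scored[:k]
    # Ask the LLM to answer using only the retrieved topics as context.
    prompt = "Answer using only these forum topics:\n"
    prompt += "\n".join(f"- {t['title']}: {t['text']}" for t in context)
    prompt += f"\nQuestion: {question}"
    # Return the answer plus the source topic URLs as references.
    return llm(prompt), [t["url"] for t in context]
```

Because the answer is generated only from the retrieved topics, the bot can hand back their URLs as citations for further reading.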
Clinical Modelling AI Assistant?
With some work on our part to understand what kind of assistance would be genuinely useful (any clinical modeller volunteers?), we might be able to create a tool that helps clinical modellers research existing archetypes and templates much more quickly and comprehensively. Essentially we would be adding a custom prompt to an agent. Standard agents are getting very good now, though - is this even necessary? Would it be measurably better than ChatGPT5.3 searching the GitHub CKM mirror?
(For all the above, we would have to be mindful of the AI resource usage costs this would incur for openEHR CIC).
FAQs:
Which AI provider are we using? We are using OpenRouter, an aggregator API that passes queries on to a range of LLM providers. For spam detection we are using the open-source Llama 3.3 70B Instruct, and for embeddings (which is how Search works) we are using
Is data from the forum used to improve the model? No. All the models and providers we’re using clearly state that data is not used for model training, and OpenRouter submits our queries under an anonymous userID.
Are private messages and chat indexed via embeddings? No.