openEHR Discourse AI (LLM) Features

New! Discourse AI / LLM driven features.

Hi, all - just a little announcement about the start of some interesting new features we are exploring in the openEHR Discourse instance. Having started off as a bit of an AI-sceptic, I am now relatively positive about the potential benefits for knowledge discovery and agentic coding. I’m sure there’s a range of views out there in the openEHR community.

Discourse have quietly been releasing a range of AI/LLM-driven features over the past few years and they were fairly early adopters of AI when it first became a commodity resource that could be consumed via API, around 2023.

There are a LOT of potential features and we probably won’t be enabling them all right away. Many of them require a bit of configuration and refinement to be useful. Some will have to be confined to specific user groups for cost control.

On this Discourse

Initially I have enabled:

AI Spam Detection

This uses an LLM to identify spam, resulting in much higher detection rates and a low false-positive rate. It's a quality-of-life improvement for our moderators and should help us keep the spam level even lower than it already is.

AI Assisted Search

Discourse has a 'just about OK' built-in search, but it is overly dependent on exact lexical matching of the search term. It can be maddening trying to find something you know is in here when you can't hit on the exact term. Adding AI means that, as well as the lexical matches, you will get matches based on embeddings of the semantic meaning of what was searched for.
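The idea behind embedding search, in miniature: both the query and each post are mapped to vectors, and results are ranked by similarity rather than exact word overlap. A toy sketch (hand-rolled bag-of-words vectors stand in here for a real embedding model, which Discourse calls via API; the example documents are invented):

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector.
    # A real embedding model maps text to a dense semantic vector instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

docs = [
    "archetype slots and cluster reuse",
    "how to model blood pressure in openEHR",
    "forum moderation guidelines",
]

def search(query: str) -> str:
    # Rank documents by similarity to the query and return the best match.
    return max(docs, key=lambda d: cosine(embed(query), embed(d)))
```

With real embeddings, a query like 'hypertension recording' would also match the blood-pressure topic even with zero shared words; the toy version above only captures the ranking mechanics.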

For now this gets us set up and lets us see how the cost of the API usage looks. Feedback is essential! Just reply here (unless there is a security or privacy-related issue, in which case PM me).

In The Future?

I would like feedback on these ideas in order to prioritise and shape the plan. Reading the Discourse AI docs will help you understand the range of what is possible here.

AI Chat Bot

An 'openEHR chat' feature could be useful, especially for new users of openEHR. It would allow a user to interact with the forum's extensive knowledge base in natural language, through a chatbot-style interface similar to ChatGPT. Answers given by the bot would be referenced back to the original topics for further reading.

Clinical Modelling AI Assistant?

With some work on our part to understand what kind of assistance would be genuinely useful (any clinical modeller volunteers?), we might be able to create a tool that helps clinical modellers research existing archetypes and templates much more quickly and comprehensively. Essentially we would be adding a custom prompt to an agent. Standard agents are getting very good now, though - is this even necessary? Would it be measurably better than ChatGPT5.3 searching the GitHub CKM mirror?

(For all the above, we would have to be mindful of the AI resource usage costs this would incur for openEHR CIC).

FAQs:

What is the AI provider we are using? We are using OpenRouter, an aggregator API that passes queries on to a range of LLM providers. For Spam we are using the open-source Llama 3.3 70B Instruct, and for Embeddings (which is how Search works) we are using

Is data from the forum used to improve the model? No, with all the models and providers we’re using, there is a clearly written statement to the effect that data is not used for model training and OpenRouter submits our queries using an anonymous userID.

Are private messages and chat indexed via embeddings? No.
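For anyone curious what these API calls actually look like: OpenRouter exposes an OpenAI-compatible chat-completions endpoint, so a spam-classification query is shaped roughly like the following. This is a sketch only - the prompt and the exact request shape are my assumptions for illustration, not the Discourse plugin's actual code:

```python
import json

# Hypothetical spam-check request, shaped for OpenRouter's
# OpenAI-compatible endpoint: POST https://openrouter.ai/api/v1/chat/completions
def build_spam_check(post_text: str) -> str:
    payload = {
        "model": "meta-llama/llama-3.3-70b-instruct",
        "messages": [
            {"role": "system",
             "content": "Classify the forum post as SPAM or NOT_SPAM."},
            {"role": "user", "content": post_text},
        ],
        "temperature": 0,  # deterministic classification
    }
    return json.dumps(payload)
```

The request would be sent with an `Authorization: Bearer <OpenRouter key>` header; OpenRouter then routes it to the underlying provider under an anonymous user ID, as noted above.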


Have you seen Cadasto/openehr-assistant-mcp: A Model Context Protocol (MCP) Server to assist you on various openEHR related tasks and APIs ?

It’s already making it quite easy to work with archetypes using Copilot or other LLMs, including some instructions on language use and translation (see resources/guides/archetypes/language-standards.md).


Thanks @siljelb - yes, I had seen this; it's excellent work. MCP is one way to give LLMs better access to 'actions', but other ways are becoming popular, such as Anthropic's 'Claude Skills', which are Markdown files with guidance for LLMs on how to achieve a task. For some use-cases these are as effective as MCP servers, without the technical overhead.

I am working on adding and supporting such Skills as we speak :slight_smile: - as Claude and Cursor plugins.

Also, I am considering/planning a CLI interface, for better integration with OpenClaw. Which brings me to another subject I wanted to ask you about, @marcusbaw: are you aware of any Discourse MCP/Connector/CLI that could be used to connect (safely) our Discourse to such personal agents (like OpenClaw), or to workflow automation tools like n8n (n8n.ai)?

The other thing I am thinking about is Atlassian Rovo: we have a lot of information in Confluence+Jira, and some more on Discourse; is there any way in this AI agentic space to bridge these two worlds together, so that we get some productivity enhancements?

I'm not sure that is the case if we take the other approach: exposing endpoints and interfaces, and letting end-users use them in their own tooling. This way we wouldn't put too much burden on openEHR funding - hence my thoughts above about personal agents.


This all sounds like great work @sebastian.iancu! If you need a call to set anything up let me know.
I think LLM tools can be an enormous accelerator for openEHR.

Excellent point, and maybe this should be our primary AI strategy, in that LLM costs are borne by the user not openEHR. It also gives users more freedom to use whatever models they want. Discourse’s ‘in-browser’ AI implementation is actually not all that amazing.

Discourse MCP

Discourse.org, as I say, have been pretty hot on implementing LLM-compatible features. There is a CLI MCP server tool available, which you run locally and use to authenticate to this Discourse as your own user.

It is read-only by default, and User API keys are already enabled on this instance. I can restrict the scope of these if we start getting posts from :openclaw: OpenClaw asking to book a restaurant table :rofl:
If anyone needs a (scoped) Admin API key, for a specific use-case and approved work, then I can facilitate this.

Github

I found this rather useful site, https://gitmcp.io/, which instantly creates a read-only informational MCP server for any public GitHub repo - this might be helpful.

In essence: “Simply change the domain from github.com or github.io to gitmcp.io and get instant AI context for any GitHub repository”

Example using openEHR CKM Mirror: GitMCP
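That domain swap is mechanical enough to script. A trivial helper, just to make the rewrite rule concrete for github.com repository URLs (the repo path in the test is a hypothetical example; github.io pages follow a slightly different pattern per the GitMCP site):

```python
def github_to_gitmcp(url: str) -> str:
    # Rewrite a github.com repository URL to its GitMCP equivalent
    # by swapping the host and leaving the owner/repo path untouched.
    return url.replace("://github.com/", "://gitmcp.io/", 1)
```

The resulting URL can then be given to any MCP-capable client as a remote server address.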

Jira/Confluence

I have deliberately kept out of Jira/Confluence because it’s a maddening labyrinth (I have instead focused on the maddening labyrinth that is Discourse :rofl:… ) but in the Discourse MCP demo gif you can see Falco uses an MCP to get information from Meta and make tickets in Jira. I presume he’s using the Atlassian MCP Server.


General thoughts of my own

  • Make it accessible: We need to make sure we document what we are doing in an accessible way (guides, tutorials, videos), so that it becomes an enabler even for newbie clinical modellers and implementers - not just a tool for the already-experienced that leaves everyone else in our dust. The pace is so fast that we need to pave the path for those coming after us. I am happy to bring people onto Everything Digital Health to demo this kind of work, and in addition we need simpler written Getting Started guides for openEHR that reference using agents and MCP.

  • Don't forget the open models: While I understand that the cutting edge of agentic coding is Claude 4.6 and GPT-5.3, and these are so much better than the open-source coding models right now (I say this as a regular user of the paid models), we should also put some effort into feeding the open-source model ecosystem - for example, integrations that work with DeepSeek, Qwen, opencode, and LM Studio (to pick a few). That said, I don't expect this all to fall on @sebastian.iancu; these comments are for the community as a whole.