`oct` - a new open clinical terminology

I am proposing that nothing is ever deleted from the Namespace.
When the preferred way to describe a concept changes, new codes will be added to represent the new description.
A term which is no longer appropriate to use will be made inactive.
We will support ways to discover only active concepts (e.g. an index of active concepts).
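To make the "never delete, only deactivate" idea concrete, here is a minimal sketch of what such a store might look like. The class name `Namespace`, its methods, and the example codes are all hypothetical illustrations, not a proposed `oct` API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Namespace:
    """Append-only store: codes can be added or made inactive, never deleted."""

    descriptions: Dict[str, str] = field(default_factory=dict)
    inactive: Set[str] = field(default_factory=set)
    replaced_by: Dict[str, str] = field(default_factory=dict)

    def add(self, code: str, description: str) -> None:
        # A code, once issued, is never reused for something else.
        if code in self.descriptions:
            raise ValueError(f"code {code!r} already exists")
        self.descriptions[code] = description

    def deactivate(self, code: str, replaced_by: Optional[str] = None) -> None:
        # Deactivation only flags the code; the description stays readable.
        if code not in self.descriptions:
            raise KeyError(code)
        if replaced_by is not None:
            if replaced_by not in self.descriptions:
                raise KeyError(replaced_by)
            self.replaced_by[code] = replaced_by
        self.inactive.add(code)

    def lookup(self, code: str) -> str:
        # Works for active and inactive codes, so old records stay readable.
        return self.descriptions[code]

    def is_active(self, code: str) -> bool:
        return code in self.descriptions and code not in self.inactive

    def active_index(self) -> Dict[str, str]:
        # The "index of active concepts": everything not flagged inactive.
        return {c: d for c, d in self.descriptions.items() if c not in self.inactive}


ns = Namespace()
ns.add("A1", "old wording of a concept")   # hypothetical codes
ns.add("B2", "new wording of a concept")
ns.deactivate("A1", replaced_by="B2")
```

Old records that reference `A1` can still be resolved with `lookup`, while anything that only wants current terms reads `active_index()`.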

This is true, but it is more of a problem for those extracting data from the system than for those inputting it at the time they input it (which is generally ‘now’, historically speaking). The correct solution for extracting meaningful data over a long period of time (e.g. a retrospective longitudinal study) will vary according to the type of project, the question being asked, and the clinical situation.

Well, it forces you to be very clear about which parts of the descriptions are differentiating and which are clarifying. But yes, as the set of codes gets larger and the granularity gets finer, the codes get longer and harder to manage. Unique numbers, as in SCT, don’t have this problem, but they have obvious disadvantages.

One chooses one’s poison :frowning:


Seems people here don’t like RDF; does that mean you don’t like OWL (in any form/dialect) either? Whatever you pick, it would be good to be able to run a reasoner (see also this) to find inconsistencies etc.; otherwise things will be hard to maintain. Reasoner capabilities come for free with OWL and some other formalisms with a sufficiently strong logical foundation.

If not using OWL, at least going for some graph that can be queried via GQL would be good. Having the terminology system and the EHR in the same GQL-queryable database would be nice.

Also, of course, starting a new SNOMED CT competitor would have been madness yesterday. Maybe it still is today, or maybe not, given clever use of AI for information gathering and structuring.

[Same madness level goes for starting a new Wikipedia, right Elon? (But Grokipedia does not necessarily contribute to more openness in the world…)]

P.S. Also have a chat with @Daniel_Karlsson


Regarding unique identifiers, I would go for short non-semantic (alphanumeric) ones, but also allow an optional canonical (unique) alias (likely in English) for those who prefer to maintain and use that for some common codes. And make sure APIs etc. can accept both and cross-translate.
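As a rough sketch of the "accept both and cross-translate" part, something like the following could sit behind an API. Everything here is made up for illustration: the `IdentifierRegistry` class, the example ID `7K2M9QX`, and the alias spelling. It also assumes aliases are matched case-insensitively and may never coincide with an existing ID.

```python
from typing import Dict, Optional


class IdentifierRegistry:
    """Maps short non-semantic IDs to optional unique aliases and back."""

    def __init__(self) -> None:
        self._alias_by_id: Dict[str, Optional[str]] = {}
        self._id_by_alias: Dict[str, str] = {}

    def register(self, concept_id: str, alias: Optional[str] = None) -> None:
        if concept_id in self._alias_by_id:
            raise ValueError(f"ID {concept_id!r} already registered")
        if alias is not None:
            key = alias.lower()
            # Aliases must be unique and must not shadow an ID.
            if key in self._id_by_alias or alias in self._alias_by_id:
                raise ValueError(f"alias {alias!r} is not unique")
            self._id_by_alias[key] = concept_id
        self._alias_by_id[concept_id] = alias

    def resolve(self, token: str) -> str:
        """Accept either an ID or an alias and return the canonical ID."""
        if token in self._alias_by_id:
            return token
        return self._id_by_alias[token.lower()]  # KeyError if unknown

    def alias_for(self, token: str) -> Optional[str]:
        """Return the alias (or None) for an ID or alias."""
        return self._alias_by_id[self.resolve(token)]


registry = IdentifierRegistry()
registry.register("7K2M9QX", alias="type-2-diabetes-mellitus")
registry.register("3F8ZP0A")  # no alias: the short ID is the only handle
```

Stored data would only ever carry the short ID; the alias is a convenience that `resolve` translates at the edge.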


OWL and OBO are my favorite. And also SPARQL.


I like this idea. Kind of gives us the best of both worlds.

If we have a vast, immutable namespace, the ‘graphs’ built on top of that can be anything we want. They can add hierarchies, groupings/refsets, or convenience alternative naming layers.

At the moment I am leaning towards using Crockford base32 for the identifier, which gets us 5 bits per character of ID length, while being URL-safe and case-insensitive-filesystem-safe.

A base32 ID 7 chars long gives us 35 bits, i.e. about 34 billion terms. Is this enough for the time being?
SNOMED has a 10^9 = 1 billion namespace size, of which it is currently using ~360k.
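For anyone who wants to check the arithmetic or play with the format, here is a small sketch of fixed-width Crockford base32 IDs. The function names and the 7-character default are my own choices for illustration. The alphabet and the decoding rules (case-insensitive, `O` read as `0`, `I` and `L` read as `1`, hyphens ignored) follow Crockford's scheme; the optional check symbol is left out.

```python
# Crockford base32 alphabet: digits plus letters, excluding I, L, O and U.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Lenient decoding: O is read as 0, I and L are read as 1.
_DECODE_FIXUPS = str.maketrans("OIL", "011")


def encode(n: int, length: int = 7) -> str:
    """Encode a non-negative integer as a fixed-width Crockford base32 string."""
    if not 0 <= n < 32 ** length:
        raise ValueError(f"{n} does not fit in {length} base32 characters")
    chars = []
    for _ in range(length):
        n, remainder = divmod(n, 32)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode(text: str) -> int:
    """Decode a Crockford base32 string, tolerating case and hyphens."""
    cleaned = text.upper().replace("-", "").translate(_DECODE_FIXUPS)
    n = 0
    for ch in cleaned:
        n = n * 32 + ALPHABET.index(ch)  # ValueError on invalid symbols such as U
    return n


capacity = 32 ** 7  # 2**35 = 34,359,738,368 distinct 7-character IDs
```

Each extra character multiplies the capacity by 32, so if 7 characters ever run out, 8 gives roughly a trillion.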

Alternatively, a truncated hash of the Description (i.e. a UUIDv5) might work; this would look a little bit like a Git commit hash. Yes, we won’t be able to change the Description without changing the ID, but since we would have an insanely large namespace this wouldn’t matter. We’d just keep track of the changes in the graphs.
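A quick sketch of the hash-based variant, using Python's standard `uuid` module. The namespace URL `https://example.org/oct`, the whitespace normalisation, and the 12-character truncation are all assumptions made for the example, not decisions.

```python
import uuid

# Hypothetical namespace UUID for the terminology; the URL is a placeholder.
OCT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/oct")


def description_id(description: str, length: int = 12) -> str:
    """Derive a deterministic ID from a Description via UUIDv5 (SHA-1 based).

    Whitespace is collapsed first (an assumption) so that trivial spacing
    differences do not produce a different ID.
    """
    normalized = " ".join(description.split())
    return uuid.uuid5(OCT_NAMESPACE, normalized).hex[:length]


first = description_id("Fracture of neck of femur")
second = description_id("Fracture  of neck of  femur")  # same text, extra spaces
other = description_id("Fracture of shaft of femur")
```

One caveat on truncation: 12 hex characters is only 48 bits, so by the birthday bound accidental collisions start to become plausible at around 2^24 (roughly 17 million) terms. Keeping more of the hash, or checking for collisions when minting, would be needed well before the namespace feels "full".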
