Revisiting symptom/sign

The current Symptom/sign archetype was first published in October 2015 following a seven-month review process.

Since then the archetype has had several breaking changes applied, and has been sitting in the ‘Reassess’ state since April 2018.

The COVID-19 related work of early 2020 led to the creation of the new ‘screening questionnaire’ family of archetypes, of which Symptom/sign screening questionnaire is one of the most central members. The creation of this archetype also means we can rectify one of the most awkward modelling choices of the original Symptom/sign archetype: the ‘Nil significant’ element. This was intended to support structured questioning where the subject could respond whether they experienced a symptom or not. In effect it was a kind of negation element in an otherwise positive presence archetype, which leads to potential safety issues when querying the data. In the latest revision of the ‘Symptom/sign’ archetype the ‘Nil significant’ element has been removed, to be replaced by the ‘Symptom/sign screening questionnaire’ archetype where appropriate.

Other significant changes include:

  • Revision of the ‘Occurrence’ element (breaking change)
  • Separation of the ‘Precipitating/Resolving factor’ cluster into two separate clusters, removing the need for the run-time name constraint (breaking change).
  • Addition of the ‘Character’ element (non-breaking change)
  • Various non-breaking updates and corrections
  • Corrections to SNOMED CT bindings

We’d like to publish the current trunk revision as v2 of the archetype, but would like community input first, to make sure we catch any errors or other suggestions that would need breaking changes.

I think these are good changes.

‘Nil significant’ was always a bit tricky and the new screening questionnaire is better suited for that ‘closed questioning’ type of situation.

The only other change to Symptom (I have just submitted a CR) is to widen the Severity rating limit to 0…100 from 0…10, as we have come across a few places where 0 to 100 is the range used.

Could leaving it unconstrained be another option? It’s not inconceivable that other requirements may pop up?

I’d be happy with that - you can be sure someone will want 0…101 :frowning:

Which means values in data will not be computably comparable, e.g. graphing over time of severity of symptoms of chronic arthritis would no longer work (well, it might work by accident, but it would not be reliable). If it had to be unconstrained I’d say you should require a numerator and a denominator, i.e. a ratio. Then you can compare severities across Observations…
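To make the ratio idea concrete, here is a minimal Python sketch. It is only an illustration, not openEHR tooling: `SeverityRatio` and its fields are hypothetical names, and the readings are made-up values. It shows that ratings recorded against different maxima become comparable once each value carries its own denominator.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class SeverityRatio:
    """Hypothetical severity rating stored as numerator/denominator."""

    numerator: int
    denominator: int

    def __post_init__(self) -> None:
        if self.denominator <= 0:
            raise ValueError("denominator must be positive")
        if self.numerator < 0:
            raise ValueError("numerator must not be negative")

    def as_fraction(self) -> Fraction:
        # Exact ratio, so 5/10 and 50/100 compare as equal.
        return Fraction(self.numerator, self.denominator)


# Made-up readings over time, recorded on 0..10 and 0..100 scales.
readings = [SeverityRatio(7, 10), SeverityRatio(55, 100), SeverityRatio(4, 10)]

# One comparable series, suitable for graphing over time.
trend = [float(r.as_fraction()) for r in readings]
print(trend)
```

The sketch deliberately does not cap the numerator at the denominator, so a value above 100% can still be stored.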


In theory that should be possible (though probably not as useful as you might think), but it requires everyone to be working off the same hymn sheet and, like it or not, there are a million different approaches to recording 'severity'. The best we can do is mandate at national/regional/condition-specific contexts where we can get clinical agreement.

The use of proportion might be worth considering but will be counter-intuitive to a lot of clinical folks - I can see the value, but I do know of one score which allows for >100%.

How about keeping the DV_QUANTITY at 0…10, and adding DV_PROPORTION as an alternative data type for other ranges?

That would work technically, but it would need quite a good explanation of how to use proportion, let’s say to model a score from 0 to 153, just to be awkward! That could be done in description and comment, of course.

Support strongly that values in an individual are reliably computable: time-serialisation is central to practice.
I have researched 20 commonly-used scoring systems, and would observe:

  • those that are indexes are expressed as %, so values never exceed 100.
  • of the others, max value range is 27, for PHQ9, except for the Minnesota Multiphasic Personality Inventory 2 with a max score of 567!
    Resolution is another matter: can a MH measure really be valid to 3 decimal places? Such high-resolution data is very rare in clinical practice IMO.

So suggest that nearly all use cases can be quantified as up to 100, and edge cases such as MMPI could be converted to % of max value limited to 2 decimal places?
Or is there another way to directly model those few scales that use higher-resolution data?
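
The "% of max value, limited to 2 decimal places" conversion suggested above could be sketched like this. The function name `percent_of_max` is made up for illustration; the only figures taken from this thread are the maximum scores of 27 (PHQ9) and 567 (MMPI-2), and the raw scores are invented.

```python
def percent_of_max(raw_score: float, max_score: float) -> float:
    """Express a raw score as a percentage of the scale maximum, to 2 decimals."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    if not 0 <= raw_score <= max_score:
        raise ValueError("raw_score must lie between 0 and max_score")
    return round(100 * raw_score / max_score, 2)


# Invented raw scores against the maxima quoted in the thread.
print(percent_of_max(9, 27))     # a PHQ9-sized scale (max 27)
print(percent_of_max(567, 567))  # an MMPI-2-sized scale (max 567)
```

On a 0 to 567 scale one raw point is about 0.18 percentage points, so two decimal places keep neighbouring raw scores distinct after conversion.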


Most standardised scores and scales would be modelled as separate OBSERVATION archetypes, I think.

Agree they are their own entities with their own very specific definitions

So for generic “severity” there are currently 3 semantically-progressive texts that align with SCT codes, and also refer externally to “activity.” That is low resolution.

Likert scales are most often 5-point, ranging from 2 to 10 - so resolution is to 1 decimal. The downside of hi resolution is to magnify inter-user variation, whereas lo resolution forces choices that may be wrong.

% is suggested, and @ian.mcnicoll wrote “we have come across a few places where 0 to 100 is the range used.” Did these use this generic archetype but ideally would have a separate OBSERVATION archetype?

As others have suggested, SCT has 2 further terms, for Trivial 162466003, and for Very Severe 162471005.
So adding these would give a 5-point text scale at 1-decimal resolution.
As these extend the current range, data already recorded would be compatible?

Suggest to define Trivial as “The intensity … is not present during normal activity”
For Very Severe “The intensity … prevents any activity”

(In Severe, the def seems to have a typo: “The intensity … causes prevents normal activity.” but “causes” is redundant)


Could this work?

(from this branch: Clinical Knowledge Manager)