Data quality 2

Thanks for all the responses. I’m glad I’m not the only one struggling with this.

If I try to summarize what I’ve learned from these responses, it is that there is ‘real’ data (although that can also include subjective scales, for instance for pain) and there is a ‘personal’ interpretation that can make the data ‘trustworthy’ and therefore reusable by others.
As far as I’m concerned my questions aren’t about accuracy itself. That part will be taken care of by the certifying body. A device can only be certified if its accuracy is within a certain tight bandwidth. They’re also not about modelling data reliability. They’re about modelling trust by applying quality systems.

My point is that if we’re not able to provide ‘trustworthy’ data for large groups of healthcare providers, people will only use data provided by themselves or their ‘trusted’ peers. All other data will be ‘rejected’. The latter will even become an argument against the use of EHRs when healthcare providers are forced to ‘compete’ for efficacy. In order to maintain an affordable healthcare system, healthcare professionals will be forced to re-use data generated by others. This could then result in (the perception of) a lower income for the ‘user’ of that data. In such circumstances ‘lack of trust’ will be a strong argument against re-using data. If data can’t or won’t be re-used, what’s the use of an EHR and why should society invest in it?

Therefore it’s important to install quality assurance systems that prove that data is trustworthy. Luckily there is a lot of experience with such quality systems, for instance in clinical research used for the development and testing of potentially new pharmaceuticals (GCP, GMP, GLP). Also in most clinical laboratories quality systems are in place and working well.

The crux of a quality system is that all parties involved agree upon what the criteria are for ‘something’ to become reliable. I agree that it can be cumbersome to establish those criteria (although in many cases they’re already known; see for instance the NHS criteria for ‘self blood pressure measurement at home’). I disagree with the suggestion that it will become an infinite loop, with the possible exception of experimental and non-regular processes. In a normal situation every aspect of a process has criteria and those can be described. If they can’t be described, one should seriously reconsider what one is doing. Without criteria it becomes a lottery: the lucky ones win something, but most people will leave empty-handed.

So I guess I have several questions, which I’ll try to separate.

The first part is: how/where can we add the accessory data that is necessary to establish data quality? As Thomas pointed out, most of these things are already recorded. I would like to have clarity on two things:

  1. device information (calibration, maintenance, etc.). Thomas suggested that one should store those in the protocol part of an observation. That’s a possibility but then one has to fill out that data every time an observation is made. Imagine a GP that performs 20 blood pressure measurements per day. I’m certain that that’s not going to work.
    Sam suggested that you could re-use this data once it’s registered (and unchanged) with a cluster. Sam, can you please explain how that works or could work? I don’t really understand that cluster concept well enough.

My suggestion to create a separate archetype (similar to the demographic one) originates from the idea that (especially in hospitals) devices will be maintained by technical staff (a third party). This could result in a situation where a healthcare provider doesn’t know all these details about calibration and maintenance and therefore is unable to provide that data. The service person should be the one to record data for that device.
At the user interface/application level the healthcare provider will get a reminder if (based on the data in that device archetype) the device isn’t calibrated/maintained properly. In that case the device shouldn’t be used, because its ‘outcome’ isn’t trustworthy.

So if we can do all of this through a cluster and everybody feels that’s the way to go, it’s fine with me. If not I still would suggest creating a separate device archetype.

  2. capabilities of a non-professional, for instance to determine whether someone is capable of self-measuring x, y or z. Here in the Netherlands patients determine their INR by themselves at home and adjust their medication accordingly! One can imagine that you want to test the capabilities of that person rigorously beforehand. If they pass that test, they’re capable. My suggestion is to store that capability as part of the demographic archetype. Thomas stated ‘whether the person who did the observation is professionally capable of doing it is another matter not really controllable by the ICT alone (other than by access control).’ This remark I don’t understand either. Thomas, can you please elaborate on that? From what I understood from the demographic IM, there is a Capability class that holds ‘the qualifications of the performer of the role for this capability’. I guess my question is: is there a role self/citizen/non-professional (I don’t know what we should call that role) and can we attach capabilities to that role?

The second part is about the data quality criteria. Although these criteria are subjective, they’re real and useful for the parties who agreed upon those criteria. Look for instance at the ‘report of the independent advisory group on blood pressure monitoring in clinical practice’ by the MHRA. This is an example of quality criteria one could agree upon. Based on such criteria one could rate the recorded data as good or poor (or even on a more refined scale if required).
Now my question: where can/should we store such a data quality marker? I figured that since these are local/subjective criteria you don’t want to put them in the original/root archetype that’s intended for ‘worldwide’ use. So my suggestion is to create a ‘local’ specialization of this root archetype in which both the criteria and the data quality label are added (see my previous mail).
Is this a good suggestion, and if not, where can I store that data after all? I don’t want to create a second system for this and in my opinion this type of data belongs somewhere in an EHR.

Cheers,

Stef

Hi Stef

No, the cluster is a reusable structure inside the protocol. The codeterminants (as Gerard calls them) are appropriate for the protocol - any information on the actual quality should be in the data - see the laboratory archetype.

You are now raising the idea of credentials - does someone have the training to do X or Y? Generally it is not appropriate to store the credentials of clinicians in each health record - but those of a person (for self management) or a guardian (for home care) may well be worthwhile. I guess patients could ask for a professional to supply their credentials…?

This is a new area for health records…it has generally just been recorded with systems as terms (general practitioner, home care nurse etc). Credentials would have to come from some authority and be able to be validated. I am not sure that the world is ready for this formally as yet - but we need to keep it on the map - your home care brings up a lot of issues.

Cheers, Sam

Hi Stef,

You say that "it’s important to install quality assurance systems that
prove that data is trustworthy" and "I disagree with the suggestion that
it will become an infinite loop". However, the "archetypal" example of
quality assurance, the Deming Wheel (or PDCA cycle) is essentially an
infinite loop.

It means that "quality" is not something you can establish, but
something you must continuously be looking for.

In a Plan-Do-Check-Act cycle, the global process is made of several
parts, for example:
- an organization that elects quality criteria and sets target levels for
these criteria (Plan)
- an EHR to record information (Do)
- a data mining process to compare quality assurance objectives vs
actual data (Check)
- an organization that studies the actual achievement and decides what
happens next (for example certify some entities and exclude some
others) (Act)
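
The Check step in the list above could look roughly like the sketch below. It is only an illustration: the criterion used (every record carries a device identifier) and the 90% target are invented for the example, and the record layout is hypothetical rather than anything defined by an EHR standard.

```python
def check(records, criterion, target_fraction):
    """Compare the achieved fraction of records meeting a criterion with a target.

    'criterion' is any function returning True when a record meets the
    agreed quality criterion; both it and the target come from the Plan step.
    """
    met = sum(1 for r in records if criterion(r))
    achieved = met / len(records) if records else 0.0
    return achieved, achieved >= target_fraction


# Hypothetical extract sent from the EHR (Do) to the data mart.
records = [
    {"device_id": "12345"},
    {"device_id": None},
    {"device_id": "67890"},
    {"device_id": "12345"},
]

achieved, target_met = check(
    records,
    criterion=lambda r: r["device_id"] is not None,
    target_fraction=0.9,
)
print(achieved, target_met)
```

The result of this step is what the Act organization would look at; the EHR itself only supplies the records.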

As you can see, the EHR is one of several components here, and its aim
is "just" to store information to be sent to the data mart, and maybe to
help determine who is certified and who is not.

Just to say that, unless you build a comprehensive PDCA system that
includes all these components, the EHR by itself is not supposed to
manage such a process.

Hope it helps,

Philippe

Stef Verlinden wrote:

So I guess I have several questions, which I try to separate.

The first part is: how/where can we add the accessory data that is
necessary to establish data quality? As Thomas pointed out, most of
these things are already recorded. I would like to have clarity on two
things:

1. device information (calibration, maintenance, etc.). Thomas
suggested that one should store those in the protocol part of an
observation. That’s a possibility but then one has to fill out that
data every time an observation is made.

the 'proper' way to do this is to have a registry of devices in your
provider enterprise, each with identifiers. In the protocol, you can
record the individual identifier, using a DV_IDENTIFIER. Then it is up
to the health information system as a whole to flag if a certain device
with id 12345 is defective. You could then track down all the
measurements done with that particular device and review them.
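
A minimal sketch of that idea, assuming a simple in-memory registry. The class and function names are hypothetical and are not openEHR classes; the device_id field merely stands in for the DV_IDENTIFIER recorded in the protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    device_id: str           # identifier issued by the (hypothetical) registry
    description: str
    defective: bool = False  # maintained by technical staff, not by the clinician


@dataclass
class Observation:
    patient_id: str
    value: float
    units: str
    device_id: str           # stands in for the DV_IDENTIFIER in the protocol


@dataclass
class DeviceRegistry:
    devices: dict = field(default_factory=dict)

    def register(self, device):
        self.devices[device.device_id] = device

    def flag_defective(self, device_id):
        self.devices[device_id].defective = True

    def is_defective(self, device_id):
        return self.devices[device_id].defective


def observations_to_review(observations, registry):
    """Return every observation taken with a device now flagged as defective."""
    return [o for o in observations if registry.is_defective(o.device_id)]


registry = DeviceRegistry()
registry.register(Device("12345", "BP monitor, room 2"))
registry.register(Device("67890", "BP monitor, room 3"))

observations = [
    Observation("p1", 132.0, "mm[Hg]", "12345"),
    Observation("p2", 118.0, "mm[Hg]", "67890"),
    Observation("p3", 141.0, "mm[Hg]", "12345"),
]

registry.flag_defective("12345")
to_review = observations_to_review(observations, registry)
print([o.patient_id for o in to_review])
```

The clinician only ever records the identifier; calibration and maintenance status live with the device entry and can change without touching the observations.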

The same argument holds for observations or any data in fact coming from
a particular external provider. The demographic ids of the providers are
in the Entries, Composition and also the system user id and system id
are in the audit trail. Again, to represent trust or reliability you
would need to tag the demographic entities with appropriate markers in
the demographic database, and then use that to generate appropriate
indications on the screen or whatever.

Identifying measurements made by the patient is easy in openEHR - every
patient entered item has 'self' as the author in the audit trail, and
also Entry.provider.

But what will you do with this information? Downgrade or ignore Quantity
data from a path lab with only 'medium' trustworthiness? If the patient
is using a home BP machine and you have low confidence in their ability
to measure their BP, what do you do? My aged father's GP taught him how
to use the machine, compared his measurements with her last
measurement in the clinic, and now she takes his measurements to be as
good as hers. This is like a mini professional training for a patient.
Sam has already alluded to studies that show patients in general produce
better quality data on themselves (e.g. BP, blood glucose, weight) than
doctors.

I am not sure what can usefully be done with this information.

Imagine a GP that performs 20 blood pressure measurements per day. I’m
certain that that’s not going to work.
Sam suggested that you could re-use this data once it’s registered
once (and unchanged) with a cluster. Sam can you please explain how
that works/ could work. I don’t really understand that cluster concept
well enough.

no this is not the approach; see above.

My suggestion to create a separate archetype (similar to the
demographic one) originates from the idea that (especially in
hospitals) devices will be maintained by technical staff (a third
party). This could result in a situation where a healthcare provider
doesn’t know all these details about calibration and maintenance and
therefore is unable to provide that data. The service person should
be the one to record data for that device.

you will see that the openEHR Demographic model
(http://www.openehr.org/uml/release-1.0.1/Browsable/_9_5_76d0249_1118674798473_6021_0Report.html)
already includes the subtype Agent of Actor; this is intended to be
archetyped to represent software agents and devices.

On user interface/ application level the healthcare provider will get
a reminder if (based on the data in that device archetype) the device
isn’t calibrated/maintained properly. In that case the device
shouldn’t be used, because its ‘outcome’ isn’t trustworthy.

So if we can do all of this through a cluster and everybody feels
that’s the way to go, it’s fine with me. If not I still would suggest
creating a separate device archetype.

no, to do it properly it has to be done as a demographic entity, with a
DV_IDENTIFIER link from the protocol of the relevant observation.

2. capabilities of a non-professional, for instance to determine whether
someone is capable of self-measuring x, y or z. Here in the
Netherlands patients determine their INR by themselves at home and
adjust their medication accordingly! One can imagine that you want to
test the capabilities of that person rigorously beforehand. If they
pass that test, they’re capable. My suggestion is to store that
capability as part of the demographic archetype.

yes, you could do this. I must admit we had not thought of storing
patient competences with particular devices there, but the information
structure does in fact exactly suit this purpose. It would need some
demographic archetypes for the Capability class.

Thomas stated ‘whether the person who did the observation is
professionally capable of doing it is another matter not really
controllable by the ICT alone (other than by access control).’ This
remark I don’t understand either. Thomas, can you please elaborate on
that?

all I mean is that you can't stop a patient entering data into the
record of particular measurements if the system already lets them add to
the record. If it has been decided to do this, then I think it is a
matter of 10 minutes' training by the GP or specialist (remembering that
this time will easily be saved in the future by patients generating
their own measurements rather than coming in for a checkup). The IT
system can help this by allowing the GP to add capabilities to the
patient demographic record, e.g. just to say that she has seen them do
an INR competently.
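
As a rough sketch of what that could look like, assuming a very simplified patient demographic record: the classes below are hypothetical stand-ins, not the openEHR Capability class itself, and the identifiers and date are arbitrary example values.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Capability:
    procedure: str     # e.g. "INR self-measurement"
    attested_by: str   # id of the clinician who saw the patient do it
    attested_on: date


@dataclass
class PatientRecord:
    patient_id: str
    capabilities: list = field(default_factory=list)

    def add_capability(self, procedure, attested_by, attested_on):
        self.capabilities.append(Capability(procedure, attested_by, attested_on))

    def is_capable_of(self, procedure):
        return any(c.procedure == procedure for c in self.capabilities)


patient = PatientRecord("p1")
# The GP notes that she has seen the patient do an INR competently.
patient.add_capability("INR self-measurement", "gp-001", date(2008, 1, 15))

print(patient.is_capable_of("INR self-measurement"))
print(patient.is_capable_of("blood glucose self-measurement"))
```

An application could consult is_capable_of when displaying patient-entered values, without the system ever blocking the patient from entering them.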

From what I understood from the demographic IM, there is a Capability
class that holds ‘the qualifications of the performer of the role for
this capability’. I guess my question is: is there a role self/
citizen/non-professional (I don’t know what we should call that role)
and can we attach capabilities to that role?

I can see you are already ahead of me;-)

The second part is about the data quality criteria. Although these
criteria are subjective, they’re real and useful for the parties who
agreed upon those criteria. Look for instance at the ‘report of the
independent advisory group on blood pressure monitoring in clinical
practice’ by the MHRA. This is an example of quality criteria one
could agree upon. Based on such criteria one could rate the recorded
data as good or poor (or even on a more refined scale if required).
Now my question: where can/should we store such a data quality
marker? I figured that since these are local/subjective criteria you
don’t want to put them in the original/root archetype that’s intended
for ‘worldwide’ use. So my suggestion is to create a ‘local’
specialization of this root archetype in which both the criteria and
the data quality label are added (see my previous mail).
Is this a good suggestion, and if not, where can I store that data after
all? I don’t want to create a second system for this and in my opinion
this type of data belongs somewhere in an EHR.

Various places it could go in the current openEHR model:
- in the protocol (it is essentially meta-data to do with 'how' the
measurement was done)
- in the 'data' part of the archetype, which might be sensible,
particularly for patient questionnaires

but this is assuming that you don't want to use the Quantity.accuracy
approach. You could already record +/-20% (95% confidence, normal
distribution) in a Quantity. Some people here, e.g. Tim Churches, who
has a population health perspective, have correctly suggested that to be
comprehensive, we should allow for other distributions, and therefore a
more elaborate model of 'accuracy'. If the community indeed sees the
added complexity (over and above the simple approach of recording
accuracy = +/-5% etc) is worth it, then we will add it to the model.
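
To make the simple approach concrete, here is a small sketch of what a percentage accuracy implies for a recorded value. The function names are hypothetical and this is not the openEHR Quantity class; the standard-deviation step assumes the reading given above, i.e. that the +/- range is a 95% interval of a normal distribution.

```python
from statistics import NormalDist


def accuracy_bounds(magnitude, accuracy_percent):
    """Return (low, high) for a value recorded with +/- accuracy_percent."""
    delta = abs(magnitude) * accuracy_percent / 100.0
    return magnitude - delta, magnitude + delta


def implied_sigma(magnitude, accuracy_percent, confidence=0.95):
    """Standard deviation implied if the +/- range is a two-sided
    confidence interval of a normal distribution (an assumption)."""
    delta = abs(magnitude) * accuracy_percent / 100.0
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    return delta / z


low, high = accuracy_bounds(120.0, 5.0)
sigma = implied_sigma(120.0, 20.0)
print(low, high, round(sigma, 2))
```

Supporting other distributions would mean replacing the single percentage with a description of the distribution itself, which is the added complexity being weighed here.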

- thomas

Dear all,

There are several types of models:
-first order: models about what really happened
-second order: models about observations of what happened in real life (perhaps). The direct observation.
-third order: models about what somebody (something) observed and reports about. The indirect observation.

In our discussion about Observation Archetypes we need to be clear what we are talking about.

In contrast to HL7 (they mix all three in one bag and create chaos), we (CEN, openEHR) never talk about first order models.
We talk about second and third order models.

The Observation archetype must have two distinct variants: the direct observation and the indirect observation.

A Blood pressure measurement by a clinician recorded in his ICT-system is a direct observation.
A Blood pressure measurement by a patient recorded by the clinician in his ICT-system is an indirect observation.
A Blood pressure measurement by an automatic device recorded in the ICT system of the patient or clinician is an indirect measurement.
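
One possible reading of these three examples is that an observation is direct when the party who made it is also the one who records it, and indirect otherwise. The sketch below encodes only that reading; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecordedObservation:
    observer: str  # who or what made the measurement
    recorder: str  # who entered it into the ICT system (the author)


def kind(obs):
    """Direct if the observer is also the author of the record, else indirect."""
    return "direct" if obs.observer == obs.recorder else "indirect"


by_clinician = RecordedObservation(observer="clinician", recorder="clinician")
by_patient = RecordedObservation(observer="patient", recorder="clinician")
by_device = RecordedObservation(observer="device-12345", recorder="patient")

print(kind(by_clinician), kind(by_patient), kind(by_device))
```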

In all cases the origin of the observation and its characteristics must be recordable.
In many cases these actors (entities) are to be found in a registry where this resource is managed.
When information is exchanged between jurisdictions (that do not use the same resource registries) this information ends up in a message/template.

In the example of the Blood Pressure Archetype we see aspects from the second and third order models.
Second order models do not describe the source explicitly because this is the author.
Third order models define the actors in one more level of detail. The source and its characteristics must be described.
The characteristics of persons differ from those of machines.
Machines have device dependent characteristics but many times include persons as operators.

This brings me to a cascade of archetypic building blocks.
These building blocks I would like to call atomic archetypes.
And many GPICS (a European CEN standard) will provide input to archetypes.

1 Observation
   a- first order (implicit actor)
   b- second order (explicit actors)
      I- Person characteristics
      II- Medical device
         i- Hardware, software characteristics
         ii- Person as operator characteristics

Based on all recorded objective characteristics each author allowing data in his ICT-system must make a subjective value judgement.
Next to each entry it must be possible to indicate at a certain time the value judgement.
Many times this indication will be empty, sometimes it will indicate its trust value.

Gerard Freriks

conexis

Hi Stef,

You say that "it’s important to install quality assurance systems that
prove that data is trustworthy" and "I disagree with the suggestion that
it will become an infinite loop". However, the "archetypal" example of
quality assurance, the Deming Wheel (or PDCA cycle) is essentially an
infinite loop.

It means that "quality" is not something you can establish, but
something you must continuously be looking for.

I agree partially. It's true that you re-evaluate quality based on
the feedback that you receive and changing insights (the PDCA cycle)
but that doesn't mean that it's a moving target. Once a good sense of
quality is established by all parties involved and the underlying
insights don't change (for instance through new lines of research), a
quite stable situation can occur. Here in the Netherlands we have a lot
of experience with quality protocols to assess common ailments in the GP
office. Some of these protocols have been stable for over 5 years now.
So yes, the quality cycle is infinite but the work to create and
maintain such a system is not.

In a Plan-Do-Check-Act cycle, the global process is made of several
parts, for example:
- an organization that elects quality criteria and sets target levels
for these criteria (Plan)
- an EHR to record information (Do)
- a data mining process to compare quality assurance objectives vs
actual data (Check)
- an organization that studies the actual achievement and decides what
happens next (for example certify some entities and exclude some
others) (Act)

As you can see, the EHR is one of several components here, and its aim
is "just" to store information to be sent to the data mart, and maybe to
help determine who is certified and who is not.

Just to say that, unless you build a comprehensive PDCA system that
includes all these components, the EHR by itself is not supposed to
manage such a process.

I agree, the thing I'm looking/asking for is that we can store
additional data required for a quality system. I realize that,
especially in the beginning, quality systems will only be used in a
few (sub)processes and probably will never apply to all of them. As
Ian McNicoll points out, the cost/benefit factor will kick in there,
but I totally agree with Thomas that in the end it's all about the
quality of care we provide to the patient/citizen.

Hope it helps,

Absolutely, thanks,

stef

Dear all,

There are several types of models:
-first order: models about what really happened
-second order: models about observations of what happened in real life (perhaps). The direct observation.
-third order: models about what somebody (something) observed and reports about. The indirect observation.

In our discussion about Observation Archetypes we need to be clear what we are talking about.

In contrast to HL7 (they mix all three in one bag and create chaos), we (CEN, openEHR) never talk about first order models.
We talk about second and third order models.

The Observation archetype must have two distinct variants: the direct observation and the indirect observation.

A Blood pressure measurement by a clinician recorded in his ICT-system is a direct observation.
A Blood pressure measurement by a patient recorded by the clinician in his ICT-system is an indirect observation.
A Blood pressure measurement by an automatic device recorded in the ICT system of the patient or clinician is an indirect measurement.

Nice distinctions, this works for me.

In all cases the origin of the observation and its characteristics must be recordable.
In many cases these actors (entities) are to be found in a registry where this resource is managed.
When information is exchanged between jurisdictions (that do not use the same resource registries) this information ends up in a message/template.

In the example of the Blood Pressure Archetype we see aspects from the second and third order models.
Second order models do not describe the source explicitly because this is the author.
Third order models define the actors in one more level of detail. The source and its characteristics must be described.
The characteristics of persons differ from those of machines.
Machines have device dependent characteristics but many times include persons as operators.

Very well put, I guess this is exactly the thing I’m struggling with.

This brings me to a cascade of archetypic building blocks.
These building blocks I would like to call atomic archetypes.
And many GPICS (a European CEN standard) will provide input to archetypes.

1 Observation
   a- first order (implicit actor)
   b- second order (explicit actors)
      I- Person characteristics
      II- Medical device
         i- Hardware, software characteristics
         ii- Person as operator characteristics

I really like this model and would love to see if we can use this to ‘model trust’

Based on all recorded objective characteristics each author allowing data in his ICT-system must make a subjective value judgement.
Next to each entry it must be possible to indicate at a certain time the value judgement.
Many times this indication will be empty, sometimes it will indicate its trust value.

Indeed, but we have to start somewhere.

Cheers,

Stef