# Data quality 2

**Category:** [Clinical (archive)](https://discourse.openehr.org/c/clinical-archive/153)
**Created:** 2007-07-12 21:19 UTC
**Views:** 1
**Replies:** 6
**URL:** https://discourse.openehr.org/t/data-quality-2/14660

---

## Post #1 by @Stef_Verlinden1

Thanks for all the responses. I’m glad I’m not the only one struggling with this.

If I try to summarize what I’ve learned from these responses, it is that there is ‘real’ data (although that can also include subjective scales, for instance for pain) and there is a ‘personal’ interpretation that can make the data ‘trustworthy’ and therefore re-usable by others.

As far as I’m concerned, my questions aren’t about accuracy itself. That part will be taken care of by the certifying body: a device can only be certified if its accuracy is within a certain tight bandwidth. They’re also not about modelling data reliability. They’re about modelling trust by applying quality systems.

My point is that if we’re not able to provide ‘trustworthy’ data for large groups of healthcare providers, people will only use data provided by themselves or their ‘trusted’ peers. All other data will be ‘rejected’. This will even become an argument against the use of EHRs once healthcare providers are forced to ‘compete’ on efficacy. In order to maintain an affordable healthcare system, healthcare professionals will be forced to re-use data generated by others. This could then result in (the perception of) a lower income for the ‘user’ of that data. In such circumstances ‘lack of trust’ will be a strong argument against re-using data. If data can’t or won’t be re-used, what’s the use of an EHR and why should society invest in it?

Therefore it’s important to install quality assurance systems that prove that data is trustworthy. Luckily there is a lot of experience with such quality systems, for instance in clinical research used for the development and testing of potential new pharmaceuticals (GCP, GMP, GLP).
Also, in most clinical laboratories quality systems are in place and working well.

The crux of a quality system is that all parties involved agree upon the criteria for ‘something’ to be considered reliable. I agree that it can be cumbersome to establish those criteria (although in many cases they’re already known; see for instance the NHS criteria for ‘self blood pressure measurement at home’). I disagree with the suggestion that it will become an infinite loop, with the possible exception of experimental and non-regular processes. In a normal situation every aspect of a process has criteria and those can be described. If they can’t be described, one should seriously reconsider what one is doing. Without criteria it becomes a lottery: the lucky ones win something, but most people will leave empty-handed.

So I guess I have several questions, which I will try to separate.

The first part is: how and where can we add the accessory data that is necessary to establish data quality? As Thomas pointed out, most of these things are already recorded. I would like to have clarity on two things:

1. Device information (calibration, maintenance, etc.). Thomas suggested that one should store this in the protocol part of an observation. That’s a possibility, but then one has to fill out that data every time an observation is made. Imagine a GP that performs 20 blood pressure measurements per day. I’m certain that that’s not going to work. Sam suggested that you could re-use this data once it has been registered (and is unchanged) with a cluster. Sam, can you please explain how that works or could work? I don’t really understand the cluster concept well enough.

My suggestion to create a separate archetype (similar to the demographic one) originates from the idea that (especially in hospitals) devices will be maintained by technical staff (a third party).
This could result in a situation where a healthcare provider doesn’t know all these details about calibration and maintenance and is therefore unable to provide that data. The service person should be the one to record data for that device.

At the user interface/application level the healthcare provider will get a reminder if (based on the data in that device archetype) the device isn’t calibrated or maintained properly. In that case the device shouldn’t be used, because its ‘outcome’ isn’t trustworthy.

So if we can do all of this through a cluster and everybody feels that’s the way to go, it’s fine with me. If not, I would still suggest creating a separate device archetype.

2. Capabilities of a non-professional, for instance to determine whether someone is capable of self-measuring x, y or z. Here in the Netherlands patients determine their INR by themselves at home and adjust their medication accordingly! One can imagine that you want to test the capabilities of that person rigorously beforehand. If they pass that test, they’re capable. My suggestion is to store that capability as part of the demographic archetype.

Thomas stated ‘whether the person who did the observation is professionally capable of doing it is another matter not really controllable by the ICT alone (other than by access control).’ I don’t understand this remark either. Thomas, can you please elaborate on that?

From what I understood from the demographic IM, there is a capability class that holds ‘the qualifications of the performer of the role for this capability’. I guess my question is: is there a role self/citizen/non-professional (I don’t know what we should call that role), and can we attach capabilities to that role?

The second part is about the data quality criteria. Although these criteria are subjective, they’re real and useful for the parties who agreed upon those criteria.
Look, for instance, at the ‘report of the independent advisory group on blood pressure monitoring in clinical practice’ by the MHRA. This is an example of quality criteria one could agree upon. Based on such criteria one could rate the recorded data as good or poor (or on an even more refined scale if required).

Now my question: where can or should we store such a data quality marker? I figured that since these are local/subjective criteria, you don’t want to put them in the original/root archetype that’s intended for ‘worldwide’ use. So my suggestion is to create a ‘local’ specialization of this root archetype in which both the criteria and the data quality label are added (see my previous mail).

Is this a good suggestion, and if not, where can I store that data? I don’t want to create a second system for this, and in my opinion this type of data belongs somewhere in an EHR.

Cheers,

Stef

---

## Post #2 by @Sam

Hi Stef

No, the cluster is a reusable structure inside the protocol. The codeterminants (as Gerard calls them) are appropriate for the protocol; any information on the actual quality should be in the data (see the laboratory archetype).

You are now raising the idea of credentials: does someone have the training to do X or Y? Generally it is not appropriate to store the credentials of clinicians in each health record, but those of a person (for self-management) or a guardian (for home care) may well be worthwhile. I guess patients could ask for a professional to supply their credentials...? This is a new area for health records; until now it has generally just been recorded within systems as terms (general practitioner, home care nurse, etc.). Credentials would have to come from some authority and be able to be validated. I am not sure that the world is ready for this formally as yet, but we need to keep it on the map. Your home care example brings up a lot of issues.
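
To make the credential idea a little more concrete, here is a minimal Python sketch. It is not openEHR API: the `Capability` class, its fields, and `is_valid_on` are hypothetical names, and the only assumptions taken from the discussion are that a capability is attested by some authority and can be checked for validity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Set


@dataclass(frozen=True)
class Capability:
    """Hypothetical record of a competence attested for a patient or guardian."""
    holder_id: str                      # demographic id of the person
    skill: str                          # e.g. "INR self-measurement"
    issuer: str                         # authority that attested the competence
    valid_from: date
    valid_until: Optional[date] = None  # None means no expiry was recorded

    def is_valid_on(self, day: date, trusted_issuers: Set[str]) -> bool:
        """True if the issuer is trusted and `day` lies in the validity window."""
        if self.issuer not in trusted_issuers:
            return False
        if day < self.valid_from:
            return False
        return self.valid_until is None or day <= self.valid_until


# Illustrative data only; the ids and issuer name are made up.
cap = Capability(
    holder_id="patient-001",
    skill="INR self-measurement",
    issuer="anticoagulation-clinic",
    valid_from=date(2007, 1, 1),
    valid_until=date(2008, 1, 1),
)
trusted = {"anticoagulation-clinic"}

print(cap.is_valid_on(date(2007, 7, 12), trusted))  # within the window
print(cap.is_valid_on(date(2008, 6, 1), trusted))   # after the recorded expiry
```

Whether such a record lives in the demographic service or elsewhere is a design choice this sketch does not settle.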

Cheers,

Sam

---

## Post #3 by @Philippe_AMELINE

Hi Stef,

You say that "it’s important to install quality assurance systems that prove that data is trustworthy" and "I disagree with the suggestion that it will become an infinite loop". However, the "archetypal" example of quality assurance, the Deming Wheel (or PDCA cycle), is essentially an infinite loop.

It means that "quality" is not something you can establish, but something you must continuously be looking for.

In a Plan-Do-Check-Act cycle, the global process is made of several parts, for example:

- an organization that elects quality criteria and sets target levels for these criteria (Plan)
- an EHR to record information (Do)
- a data mining process to compare quality assurance objectives with actual data (Check)
- an organization that studies the actual achievement and decides what happens next (for example, certifies some entities and excludes others) (Act)

As you can see, the EHR is one of several components here, and its aim is "just" to store information to be sent to the data mart, and maybe to help in knowing who is certified and who is not.

Just to say that, unless you build a comprehensive PDCA system that includes all these components, the EHR by itself is not supposed to manage such a process.

Hope it helps,

Philippe

---

## Post #4 by @thomas.beale

Stef Verlinden wrote:

> So I guess I have several questions, which I try to separate.
>
> The first part is: how/where can we add the accessory data that is necessary to establish data quality. As Thomas pointed out, most of these things are already recorded. I would like to have clarity on two things:
>
> 1. Device information (calibration, maintenance, etc.). Thomas suggested that one should store those in the protocol part of an observation. That’s a possibility but then one has to fill out that data every time an observation is made.


The 'proper' way to do this is to have a registry of devices in your provider enterprise, each with identifiers. In the protocol, you can record the individual identifier, using a DV_IDENTIFIER. Then it is up to the health information system as a whole to flag if a certain device with id 12345 is defective. You could then track down all the measurements done with that particular device and review them.

The same argument holds for observations, or in fact any data, coming from a particular external provider. The demographic ids of the providers are in the Entries and the Composition, and the system user id and system id are in the audit trail. Again, to represent trust or reliability you would need to tag the demographic entities with appropriate markers in the demographic database, and then use that to generate appropriate indications on the screen or wherever.

Identifying measurements made by the patient is easy in openEHR: every patient-entered item has 'self' as the author in the audit trail, and also in Entry.provider. But what will you do with this information? Downgrade or ignore Quantity data from a path lab with only 'medium' trustworthiness? If the patient is using a home BP machine and you have low confidence in their ability to measure their BP, what do you do? My aged father's GP taught him how to use the machine, compared his measurements with her last measurement in the clinic, and now she takes his measurements to be as good as hers. This is like a mini professional training for a patient. Sam has already alluded to studies that show patients in general produce better quality data on themselves (e.g. BP, blood glucose, weight) than doctors do. I am not sure what can usefully be done with this information.

> Imagine a GP that performs 20 blood pressure measurements per day. I’m certain that that’s not going to work.

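
One way to picture the registry approach, in which each measurement carries only a device identifier rather than full calibration data, is the rough Python sketch below. It is an illustration under stated assumptions, not the openEHR reference model: `Observation`, `DeviceRegistry`, and their methods are hypothetical names, and the plain string `device_id` merely stands in for the DV_IDENTIFIER recorded in the protocol.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Observation:
    """Hypothetical, simplified observation; not the openEHR OBSERVATION class."""
    subject_id: str
    value: str
    device_id: str  # stands in for the DV_IDENTIFIER stored in the protocol


@dataclass
class DeviceRegistry:
    """Hypothetical enterprise registry mapping device id -> defective flag."""
    defective: Dict[str, bool] = field(default_factory=dict)

    def register(self, device_id: str) -> None:
        # New devices start out as not defective.
        self.defective.setdefault(device_id, False)

    def flag_defective(self, device_id: str) -> None:
        # Recorded by technical staff, not by the clinician taking the measurement.
        self.defective[device_id] = True

    def needs_review(self, observations: List[Observation]) -> List[Observation]:
        """Return the observations made with a device currently flagged defective."""
        return [o for o in observations if self.defective.get(o.device_id, False)]


registry = DeviceRegistry()
registry.register("12345")
registry.register("67890")

# Illustrative data only.
observations = [
    Observation("patient-001", "120/80 mmHg", "12345"),
    Observation("patient-002", "135/85 mmHg", "67890"),
]

registry.flag_defective("12345")
for obs in registry.needs_review(observations):
    print(obs.subject_id, obs.value)
```

The point of the design is that the clinician records one identifier per measurement, while calibration and defect status are maintained once, in the registry.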

> Sam suggested that you could re-use this data once it’s registered once (and unchanged) with a cluster. Sam, can you please explain how that works or could work? I don’t really understand that cluster concept well enough.

No, this is not the approach; see above.

> My suggestion to create a separate archetype (similar to the demographic one) originates from the idea that (especially in hospitals) devices will be maintained by technical staff (a third party). This could result in a situation where a healthcare provider doesn’t know all these details about calibration and maintenance and therefore is unable to provide that data. This service person should be the one to record data for that device.

You will see that the openEHR Demographic model (http://www.openehr.org/uml/release-1.0.1/Browsable/_9_5_76d0249_1118674798473_6021_0Report.html) already includes the subtype Agent of Actor; this is intended to be archetyped to represent software agents and devices.

> On user interface/application level the healthcare provider will get a reminder if (based on the data in that device archetype) the device isn’t calibrated/maintained properly. In that case the device shouldn’t be used, because its ‘outcome’ isn’t trustworthy.
>
> So if we can do all of this through a cluster and everybody feels that’s the way to go, it’s fine with me. If not I still would suggest creating a separate device archetype.

No, to do it properly it has to be done as a demographic entity, with a DV_IDENTIFIER link from the protocol of the relevant observation.

> 2. Capabilities of a non-professional, for instance to determine whether someone is capable of self-measuring x, y or z. Here in the Netherlands patients determine their INR by themselves at home and adjust their medication accordingly! One can imagine that you want to test the capabilities of that person rigorously beforehand.
> If they pass that test, they’re capable. My suggestion is to store that capability as part of the demographic archetype.

Yes, you could do this. I must admit we had not thought of storing patient competences with particular devices there, but the information structure does in fact exactly suit this purpose. It would need some demographic archetypes for the Capability class.

> Thomas stated ‘whether the person who did the observation is professionally capable of doing it is another matter not really controllable by the ICT alone (other than by access control).’ This remark I don’t understand either. Thomas, can you please elaborate on that?

All I mean is that you can't stop a patient entering particular measurements into the record if the system already lets them add to the record. If it has been decided to allow this, then I think it is a matter of 10 minutes' training by the GP or specialist (remembering that this time will easily be saved in the future by patients generating their own measurements rather than coming in for a checkup). The IT system can help by allowing the GP to add capabilities to the patient demographic record, e.g. just to say that she has seen them do an INR competently.

> From what I understood from the demographic IM, there is a capability class in which ‘the qualifications of the performer of the role for this capability’. I guess my question is, is there a role self/citizen/non-professional (I don’t know how we should call that role) and can we attach capabilities to that role?

I can see you are already ahead of me ;-)

> The second part is about the data quality criteria. Although these criteria are subjective, they’re real and useful for the parties who agreed upon those criteria. Look for instance at the ‘report of the independent advisory group on blood pressure monitoring in clinical practice’ by the MHRA.
> This is an example of quality criteria one could agree upon. Based on such criteria one could rate the recorded data as good or poor (or on an even more refined scale if required).
>
> Now my question. Where can/should we store such a data quality marker? I figured that since these are local/subjective criteria you don’t want to put them in the original/root archetype that’s intended for ‘worldwide’ use. So my suggestion is to create a ‘local’ specialization of this root archetype in which both the criteria and the data quality label are added (see my previous mail).
>
> Is this a good suggestion and if not where can I store that data after all? I don’t want to create a second system for this and in my opinion this type of data belongs somewhere in an EHR.

There are various places it could go in the current openEHR model:

- in the protocol (it is essentially meta-data to do with 'how' the measurement was done)
- in the 'data' part of the archetype, which might be sensible, particularly for patient questionnaires

But this is assuming that you don't want to use the Quantity.accuracy approach. You could already record +/-20% (95% confidence, normal distribution) in a Quantity. Some people here, e.g. Tim Churches, who has a population health perspective, have correctly suggested that to be comprehensive we should allow for other distributions, and therefore a more elaborate model of 'accuracy'. If the community indeed sees the added complexity (over and above the simple approach of recording accuracy = +/-5% etc.) as worth it, then we will add it to the model.

\- thomas

---

## Post #5 by @system

Dear all,

There are several types of models:

- first order: models about what really happened
- second order: models about observations of what happened in real life (perhaps). The direct observation.
- third order: models about what somebody (something) observed and reports about. The indirect observation.

In our discussion about Observation Archetypes we need to be clear about what we are talking about.

In contrast to HL7 (they mix all three in one bag and create chaos), we, CEN and openEHR, never talk about first order models. We talk about second and third order models.

The Observation archetype must have two distinct variants: the direct observation and the indirect observation.

- A blood pressure measurement by a clinician, recorded in his ICT system, is a direct observation.
- A blood pressure measurement by a patient, recorded by the clinician in his ICT system, is an indirect observation.
- A blood pressure measurement by an automatic device, recorded in the ICT system of the patient or clinician, is an indirect measurement.

In all cases the origin of the observation and its characteristics must be recordable. In many cases these actors (entities) are to be found in a registry where this resource is managed. When information is exchanged between jurisdictions (that do not use the same resource registries), this information ends up in a message/template.

In the example of the Blood Pressure Archetype we see aspects from the second and third order models. Second order models do not describe the source explicitly, because the source is the author. Third order models define the actors in one more level of detail: the source and its characteristics must be described. The characteristics of persons differ from those of machines. Machines have device-dependent characteristics but often include persons as operators.

This brings me to a cascade of archetypic building blocks. These building blocks I would like to call atomic archetypes. And many GPICS (a European CEN standard) will provide input to archetypes.

1. Observation
   - a. first order (implicit actor)
   - b. second order (explicit actors)
     - I. Person characteristics
     - II. Medical device
       - i. Hardware, software characteristics
       - ii. Person as operator characteristics

Based on all recorded objective characteristics, each author allowing data into his ICT system must make a subjective value judgement. Next to each entry it must be possible to indicate, at a certain time, the value judgement. Many times this indication will be empty; sometimes it will indicate its trust value.

Gerard Freriks
**conexis**

---

## Post #6 by @Stef_Verlinden1

> Hi Stef,
>
> You say that "it’s important to install quality assurance systems that prove that data is trustworthy" and "I disagree with the suggestion that it will become an infinite loop". However, the "archetypal" example of quality assurance, the Deming Wheel (or PDCA cycle), is essentially an infinite loop.
>
> It means that "quality" is not something you can establish, but something you must continuously be looking for.

I agree partially. It's true that you re-evaluate quality based on the feedback that you receive and on changing insights (the PDCA cycle), but that doesn't mean that it's a moving target. Once a good sense of quality is established by all parties involved and the underlying insights don't change (for instance through new lines of research), a quite stable situation can occur. Here in the Netherlands we have a lot of experience with quality protocols to assess common ailments in the GP office. Some of these protocols have been stable for over 5 years now. So yes, the quality cycle is infinite, but the work to create and maintain such a system is not.

> In a Plan-Do-Check-Act cycle, the global process is made of several parts, for example:
>
> - an organization that elects quality criteria and sets target levels for these criteria (Plan)
> - an EHR to record information (Do)
> - a data mining process to compare quality assurance objectives with actual data (Check)
> - an organization that studies the actual achievement and decides what happens next (for example, certifies some entities and excludes others) (Act)
>
> As you can see, the EHR is one of several components here, and its aim is "just" to store information to be sent to the data mart, and maybe to help in knowing who is certified and who is not.
>
> Just to say that, unless you build a comprehensive PDCA system that includes all these components, the EHR by itself is not supposed to manage such a process.

I agree; the thing I'm looking/asking for is that we can store the additional data required for a quality system. I realize that, especially in the beginning, quality systems will only be used in a few (sub)processes and probably will never apply to all of them. As Ian McNicoll points out, the cost/benefit factor will kick in there, but I totally agree with Thomas that in the end it's all about the quality of care we provide to the patient/citizen.

> Hope it helps,

Absolutely, thanks,

stef

---

## Post #7 by @Stef_Verlinden1

> Dear all,
>
> There are several types of models:
>
> - first order: models about what really happened
> - second order: models about observations of what happened in real life (perhaps). The direct observation.
> - third order: models about what somebody (something) observed and reports about. The indirect observation.
>
> In our discussion about Observation Archetypes we need to be clear about what we are talking about.
>
> In contrast to HL7 (they mix all three in one bag and create chaos), we, CEN and openEHR, never talk about first order models. We talk about second and third order models.
>
> The Observation archetype must have two distinct variants: the direct observation and the indirect observation.
>
> - A blood pressure measurement by a clinician, recorded in his ICT system, is a direct observation.
> - A blood pressure measurement by a patient, recorded by the clinician in his ICT system, is an indirect observation.
> - A blood pressure measurement by an automatic device, recorded in the ICT system of the patient or clinician, is an indirect measurement.

Nice distinctions, this works for me.

> In all cases the origin of the observation and its characteristics must be recordable. In many cases these actors (entities) are to be found in a registry where this resource is managed. When information is exchanged between jurisdictions (that do not use the same resource registries), this information ends up in a message/template.
>
> In the example of the Blood Pressure Archetype we see aspects from the second and third order models. Second order models do not describe the source explicitly, because the source is the author. Third order models define the actors in one more level of detail: the source and its characteristics must be described. The characteristics of persons differ from those of machines. Machines have device-dependent characteristics but often include persons as operators.

Very well put, I guess this is exactly the thing I'm struggling with.

> This brings me to a cascade of archetypic building blocks. These building blocks I would like to call atomic archetypes. And many GPICS (a European CEN standard) will provide input to archetypes.
>
> 1. Observation
>    - a. first order (implicit actor)
>    - b. second order (explicit actors)
>      - I. Person characteristics
>      - II. Medical device
>        - i. Hardware, software characteristics
>        - ii. Person as operator characteristics

I really like this model and would love to see if we can use it to 'model trust'.

> Based on all recorded objective characteristics, each author allowing data into his ICT system must make a subjective value judgement. Next to each entry it must be possible to indicate, at a certain time, the value judgement. Many times this indication will be empty; sometimes it will indicate its trust value.

Indeed, but we have to start somewhere.

Cheers,

Stef

---

**Canonical:** https://discourse.openehr.org/t/data-quality-2/14660
**Original content:** https://discourse.openehr.org/t/data-quality-2/14660