# persistence layer

**Category:** [Technical (archive)](https://discourse.openehr.org/c/technical-archive/156)
**Created:** 2006-03-03 13:04 UTC
**Views:** 8
**Replies:** 26
**URL:** https://discourse.openehr.org/t/persistence-layer/14526

---

## Post #1 by @system

Are there plans to work out concepts for a persistence layer?

Thanks in advance for an answer\.

---

## Post #2 by @thomas.beale

Bert Verhees wrote:

> Are there plans to work out concepts for a persistence layer?
>
> Thanks in advance for an answer\.

We are working on two or three variants at the moment\. The general form of the stored information is logically a 2\-column table of:

<path to info node, serialised info node>

You can convert an entire object network to this form and store it efficiently\. Because the paths are archetype\-based paths, you can query quickly into the data\.

We are working on the details\.

\- thomas

---

## Post #3 by @system

> We are working on two or three variants at the moment\. The general form of the stored information is logically a 2\-column table of:
>
> <path to info node, serialised info node>
>
> You can convert an entire object network to this form and store it efficiently\. Because the paths are archetype\-based paths, you can query quickly into the data\.
>
> We are working on the details\.
>
> \- thomas

Thanks, I would really appreciate this being ready\. I am glad experienced people are working on this\.
Not to push, but to make my plans: is there any idea when some information about this will be released?

Thanks
Bert

---

## Post #4 by @thomas.beale

Bert Verhees wrote:

>> We are working on two or three variants at the moment\. The general form of the stored information is logically a 2\-column table of:
>>
>> <path to info node, serialised info node>
>>
>> You can convert an entire object network to this form and store it efficiently\. Because the paths are archetype\-based paths, you can query quickly into the data\.
>>
>> We are working on the details\.
>>
>> \- thomas
>
> Thanks, I would really appreciate this being ready\. I am glad experienced people are working on this\.
> Not to push, but to make my plans: is there any idea when some information about this will be released?

I will put up an FAQ on openEHR persistence in a day or two\.\.\.\. just as a start\.

\- thomas

---

## Post #5 by @Tom_Tuddenham

Hi folks

Standard qualifier for probably not understanding what's going on, but isn't XML persistence slightly orthogonal to managing Archetypes? I mean to say, persistence is important but can't it be deferred to systems that already offer XML native based persistence, e\.g\. http://www.sleepycat.com/products/bdbxml.html or one of the existing XML\-to\-RDBMS mapping solutions?

Again, I may misunderstand what you're trying to achieve, but won't storing serialized data in the second column make for potentially complex queries, especially if you're doing keyword searches? I'm undoubtedly showing my cluelessness here \- but wouldn't an XQuery based interrogation of the repository be more applicable than SQL?

Any advice appreciated\.

Cheers,
Tom

---

## Post #6 by @thomas.beale

Tom Tuddenham wrote:

> Hi folks
>
> Standard qualifier for probably not understanding what's going on, but isn't XML persistence slightly orthogonal to managing Archetypes? I mean to say, persistence is important but can't it be deferred to systems that already offer XML native based persistence, e\.g\. http://www.sleepycat.com/products/bdbxml.html or one of the existing XML\-to\-RDBMS mapping solutions?
>
> Again, I may misunderstand what you're trying to achieve, but won't storing serialized data in the second column make for potentially complex queries, especially if you're doing keyword searches? I'm undoubtedly showing my cluelessness here \- but wouldn't an XQuery based interrogation of the repository be more applicable than SQL?

None of this is clueless, don't worry\.
In fact you have hit several nails on the head\. I am still in the middle of writing an FAQ on all this, which I will try to get up in the next few days\. In summary:

* persisting openEHR data structures, e\.g\. Compositions etc, is no different from persisting any other typically fine\-grained hierarchical / object data
* so you can do normal analysis on it; you can store serialised data blobs, and queries do then have to be thought about more carefully; but the other side of the coin is that not every fine\-grained node needs to be directly queryable\.\.\.\.
* with one little exception \- due to being archetyped, the data is full of node ids and hence every node is guaranteed to be addressable with an XPath\-style path \(we do a couple of machine\-processable shortcuts on our paths to reduce the length, but otherwise they are XPaths; see the section on Paths in http://svn.openehr.org/specification/TRUNK/publishing/architecture/overview.pdf). So one very interesting option for storing the data is as <path, serialised node> pairs, which is dead easy in a relational database, and is very queryable, because the paths are semantically meaningful \(they are defined by archetypes\)\.

What I aim to do is to just put up an initial FAQ; there is no doubt a lot of expertise in object and O/R persistence here in the community, and I would like to get others to contribute their knowledge as well \(without trying to write just a general purpose lesson on O/R persistence \- there are many of those already available\)\.

\(Probably we should set up a wiki at some point\.\.\.\.\)

\- thomas

---

## Post #7 by @Kevin_Coonan_MD

Shouldn't heavy weight queries be limited to a separate OLAP platform? This would include most QA/QM and research uses\.
If you set up the persistence to populate the OLAP as it is saving clinical information to the data repository, you can even have real\-time OLAP\.

I don't know how much real difference the RDBMS optimization for transaction\-based \(i\.e\. the CDR\) versus query\-based \(OLAP\) use makes when using a persistence layer like Hibernate\.

Kevin

Kevin M\. Coonan, MD
Adjunct Assistant Professor, Division of Emergency Medicine
NLM Fellow, Department of Medical Informatics
University of Utah School of Medicine
Co\-chair, HL7 Emergency Care Special Interest Group
kevin\.coonan@utah\.edu

---

## Post #8 by @system

Kevin M\. Coonan, M\.D\. wrote:

> Shouldn't heavy weight queries be limited to a separate OLAP platform? This

This is what I have seen at a large data\-processing company where I used to work\. They ran separate tables, optimized for OLAP queries \(wide flat ugly tables with lots of data redundancy\); some tables were kept up to date with triggers, other OLAP tables were updated once a day, at night, with SQL select/insert queries, depending on the purpose\. The 'normal' data processing application did not suffer any notable performance loss\.

Oracle supports this in SQL\. They call it MATERIALIZED VIEW, http://www.akadia.com/services/ora_materialized_views.html \(REFRESH and FAST are optional\):

    CREATE MATERIALIZED VIEW mv_emp_pk
      REFRESH FAST START WITH SYSDATE
      NEXT SYSDATE + 1/48
      WITH PRIMARY KEY
      AS SELECT * FROM emp@remote_db;

Bert

---

## Post #9 by @mikael

Dear all,

I think before we discuss how we are going to build a persistence layer we need to discuss how we are going to use it\.

Is it to support a simple electronic healthcare record application which only collects basic information and prints it on a computer screen or on paper at a small primary health care centre?
Or is it to support an information system for electronic healthcare record information used everywhere in a large hospital \(or a country\!\), where the system is able, among other things, to support data\-intensive applications like real\-time data\-driven decision support systems?

In the first case the persistence layer can be built in many different ways, some of which are simple and fast to build\. In the second case there are far fewer ways to build the persistence layer, and probably none of them is simple and fast to build\.

Regards,
Mikael

Mikael Nyström
M Sc in Computer Science and Engineering
Ph D student in Medical Informatics
Department of Biomedical Engineering
Linköping University

---

## Post #10 by @system

Dear Mikael, since I initially started the discussion, I will explain what my needs/opinions are\.

A good persistence\-layer reflects the classes which want to be persistent\.
The classes do not need to know what DB\-vendor is involved\.
There will be no db\-vendor specific code in the main application; the db\-vendor specific code will be in the persistence layer\.
There is an abstraction between DB\-vendor and classes that want to be persistent\.
Post\-relational\-db's, XML\-db's, relational DB's of all vendors should be accessible over the same code\-interface\.

The purpose doesn't matter, the complexity doesn't matter\.
This concept is scalable: just changing a few strings makes the application work against another database, of another or the same vendor, on another or the same machine\.

It lets an application run on a notebook with a small DB as well as against a remote DB\-server: the same application code, with a different branch below it in the persistence layer, which is already programmed generically against most databases and situations\.

That is what I mean when talking about a persistence layer\.

Mikael Nyström wrote:

---

## Post #11 by @Karsten_Hilbert

I want to use it for that\.
Karsten

---

## Post #12 by @rong.chen

Bert Verhees wrote:

> Dear Mikael, since I initially started the discussion, I will explain what my needs/opinions are\.
>
> A good persistence\-layer reflects the classes which want to be persistent\.

Do you have any specific need for queries? If you don't need to query the internals of the objects \(or tree of them\), it will be quite simple to just serialize the whole object\.

> The classes do not need to know what DB\-vendor is involved\.
> There will be no db\-vendor specific code in the main application; the db\-vendor specific code will be in the persistence layer\.
> There is an abstraction between DB\-vendor and classes that want to be persistent\.
> Post\-relational\-db's, XML\-db's, relational DB's of all vendors should be accessible over the same code\-interface\.

Agree with all these\. Hiding these details behind some interface for accessing the persistence layer would work\.

Rong

---

## Post #13 by @mikael

Hi all,

Rong Chen wrote:

> Bert Verhees wrote:
>> Dear Mikael, since I initially started the discussion, I will explain what my needs/opinions are\.
>>
>> A good persistence\-layer reflects the classes which want to be persistent\.
>
> Do you have any specific need for queries? If you don't need to query the internals of the objects \(or tree of them\), it will be quite simple to just serialize the whole object\.

I agree with Rong that in simple cases serialized objects are enough\. \(But I must say that the alternative with applications like "real time data driven decision support systems" is much more scientifically interesting and fun\. Otherwise we will probably not use the full potential of openEHR\.\)

>> The classes do not need to know what DB\-vendor is involved\.
>> There will be no db\-vendor specific code in the main application; the db\-vendor specific code will be in the persistence layer\.
>> There is an abstraction between DB\-vendor and classes that want to be persistent\.
>> Post\-relational\-db's, XML\-db's, relational DB's of all vendors should be accessible over the same code\-interface\.
>
> Agree with all these\. Hiding these details behind some interface for accessing the persistence layer would work\.

Of course it is possible to hide everything behind a persistence layer \_interface\_ and not say anything about what we have behind the interface\. But since the subject and quite a few of the mails have been about the persistence layer \(and not the interface\), it is necessary to know what kind of application it should support\.

Greetings,
Mikael

---

## Post #14 by @rong.chen

Mikael Nyström wrote:

> Hi all,
>
> Rong Chen wrote:
>
>> Bert Verhees wrote:
>>
>>> Dear Mikael, since I initially started the discussion, I will explain what my needs/opinions are\.
>>>
>>> A good persistence\-layer reflects the classes which want to be persistent\.
>>
>> Do you have any specific need for queries? If you don't need to query the internals of the objects \(or tree of them\), it will be quite simple to just serialize the whole object\.
>
> I agree with Rong that in simple cases serialized objects are enough\. \(But I must say that the alternative with applications like "real time data driven decision support systems" is much more scientifically interesting and fun\. Otherwise we will probably not use the full potential of openEHR\.\)

This could be achieved by querying in\-memory objects, so why does it have to be done in the persistence layer? Maybe you can give some example, Mikael\. A generic, full\-featured query service is tricky enough to do, so why not separate the persistence concern from query\-related logic?

>>> The classes do not need to know what DB\-vendor is involved\.
>>> There will be no db\-vendor specific code in the main application; the db\-vendor specific code will be in the persistence layer\.
>>> There is an abstraction between DB\-vendor and classes that want to be persistent\.
>>> Post\-relational\-db's, XML\-db's, relational DB's of all vendors should be accessible over the same code\-interface\.
>>
>> Agree with all these\. Hiding these details behind some interface for accessing the persistence layer would work\.
>
> Of course it is possible to hide everything behind a persistence layer \_interface\_ and not say anything about what we have behind the interface\. But since the subject and quite a few of the mails have been about the persistence layer \(and not the interface\), it is necessary to know what kind of application it should support\.

Well, I would prefer to see a generic, all\-purpose persistence layer, defined by a clear interface\. Of course, the characteristics will be different depending on which implementation is actually used\. Applications should probably be built on top of EHR services, which are also generic and all\-purpose\. So I wouldn't agree that the persistence layer should be very much defined by any particular application\. These two, application and persistence, should be rather decoupled, and the services layer will be in between\.

\[application specific logic\]

\[some advanced use: decision support, automated alert etc\]

\[generic service layer: EHR, Demographics, Terminology, Security\.\.\., QueryService\(?\) \]

\[generic persistence layer\]

Cheers,
Rong

---

## Post #15 by @Kenneth_Bogelund_Ahr

I have tried to be removed from this list several times \- how can I be removed?

\-\-\-\-\-Original message\-\-\-\-\-

---

## Post #16 by @system

Rong Chen wrote:

> Bert Verhees wrote:
>
>> Dear Mikael, since I initially started the discussion, I will explain what my needs/opinions are\.
>>
>> A good persistence\-layer reflects the classes which want to be persistent\.
>
> Do you have any specific need for queries? If you don't need to query the internals of the objects \(or tree of them\), it will be quite simple to just serialize the whole object\.
This would be bad from the point of view of control: MS\-Word does this, and serializes all kinds of uninitialized buffers which contain random memory data\. Also, it is not possible to apply data mining techniques to the data\.

But serializing objects is a valid way of storing them, so it should be possible over the same interface through which all other databases are connected\.

I would not recommend it\. No way\.

Bert

---

## Post #17 by @system

> Of course it is possible to hide everything behind a persistence layer \_interface\_ and not say anything about what we have behind the interface\. But since the subject and quite a few of the mails have been about the persistence layer \(and not the interface\), it is necessary to know what kind of application it should support\.

It was me bringing up the subject a few times, and it was because that was a part which was not worked out in the openEHR specs, and because I am quite new here, I thought I had missed something\.

It was difficult to make myself clear, and therefore difficult to get a clear answer; that is why several mails about this were sent\.

I thought there had to be a clear spot one could point to, like with Lego blocks \(www\.lego\.com\): you take a persistence layer block and put it below the reference kernel, put the archetype kernel block on top and a GUI block on top of that, and you have built an information system\.

I couldn't find a clear spot where to put the persistence layer block, nor could I find a description of the interface that openEHR was prepared for, and I heard people in Holland shout "just put a database below and run it", and I thought, am I a fool?

But now it is clear to me\. Those shouting people did not have much knowledge of the subject\.

There is no persistence layer defined in the openEHR specs; Thomas said there will be specs for this after some time\.

Agree?
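For illustration only, the kind of vendor\-neutral interface asked about in this thread could look like the sketch below\. It is not taken from the openEHR specifications; `NodeStore`, `InMemoryStore`, `SqliteStore`, `open_store` and the `memory:` / `sqlite:` URL scheme are hypothetical names, and Python with its built\-in `sqlite3` and `json` modules is an arbitrary choice for the sketch\. Nodes are stored as <path, serialised node> pairs, as in Thomas's description\.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class NodeStore(ABC):
    """Hypothetical persistence interface: the application only sees this."""

    @abstractmethod
    def save(self, path: str, node: dict) -> None: ...

    @abstractmethod
    def load(self, path: str) -> dict: ...


class InMemoryStore(NodeStore):
    """Backend that keeps serialised nodes in a plain dictionary."""

    def __init__(self) -> None:
        self._nodes: dict[str, str] = {}

    def save(self, path: str, node: dict) -> None:
        self._nodes[path] = json.dumps(node)

    def load(self, path: str) -> dict:
        return json.loads(self._nodes[path])


class SqliteStore(NodeStore):
    """Backend that keeps <path, serialised node> rows in a 2-column table."""

    def __init__(self, database: str) -> None:
        self._conn = sqlite3.connect(database)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS node ("
            "path TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )

    def save(self, path: str, node: dict) -> None:
        with self._conn:  # commits on success
            self._conn.execute(
                "INSERT OR REPLACE INTO node (path, body) VALUES (?, ?)",
                (path, json.dumps(node)),
            )

    def load(self, path: str) -> dict:
        row = self._conn.execute(
            "SELECT body FROM node WHERE path = ?", (path,)
        ).fetchone()
        if row is None:
            raise KeyError(path)
        return json.loads(row[0])


def open_store(url: str) -> NodeStore:
    """Pick a backend from a configuration string (made-up URL scheme)."""
    scheme, _, target = url.partition(":")
    if scheme == "memory":
        return InMemoryStore()
    if scheme == "sqlite":
        return SqliteStore(target)
    raise ValueError(f"unknown store scheme: {scheme!r}")


def record_and_read(store: NodeStore) -> float:
    """Application code: identical for every backend."""
    store.save("/demo/temperature", {"value": 37.2})
    return store.load("/demo/temperature")["value"]


# Changing one string switches the backend; the application code is untouched.
results = [record_and_read(open_store(u)) for u in ("memory:", "sqlite::memory:")]
```

The application function `record_and_read` never mentions a database vendor; only the string passed to `open_store` changes\.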
Bert

---

## Post #18 by @system

> There is no persistence layer defined in the openEHR specs; Thomas said there will be specs for this after some time\.

While reading the rest of the mails \(I was a week or more behind\), I see there is a lot of work being done \(I have not yet read it all\)\.

I agree with Rong Chen that persistence should be generic and decoupled, and a good persistence layer fits almost every application and every db\. It is like magic glue\.

The paper of Scott W\. Ambler is very interesting in this context \(google for it\)\.

Bert

---

## Post #19 by @system

> The paper of Scott W\. Ambler is very interesting in this context \(google for it\)\.

http://www.ambysoft.com/essays/persistenceLayer.html

---

## Post #20 by @thomas.beale

Kenneth Bøgelund Ahrensberg wrote:

> I have tried to be removed from this list several times \- how can I be removed?

Just log in at http://www.openehr.org/cgi-bin/membDB-login and remove yourself\.

\- thomas beale

---

## Post #21 by @thomas.beale

There is some basic architecture overview information in section 6 of http://svn.openehr.org/specification/TRUNK/publishing/architecture/overview.pdf

Otherwise the persistence notes I just uploaded may help the discussion\.

\- thomas

Bert Verhees wrote:

---

## Post #22 by @James_Heywood

Dear Group,

That only works if you know your username and password\. I do not have those and did not sign up for the mailing list\. Emails to the webmaster have not produced any reply\.

Help\!

\-Jamie

James Allen Heywood
d'Arbeloff Founding Director
ALS Therapy Development Foundation
215 First Street
Cambridge MA 02142
617 441 7222 \- www\.als\.net \- jheywood@als\.net

---

## Post #23 by @Conrado_Vina

Hi,

I am in the exact same situation.

Thanks in advance,
Conrado

James Heywood wrote on 17/03/2006 at 01:18 p.m.:

---

## Post #24 by @mikael

Hi all,

Rong Chen wrote:

> Mikael Nyström wrote:
>> Rong Chen wrote:
>>>
>>> Do you have any specific need for queries?
>>> If you don't need to query the internals of the objects \(or tree of them\), it will be quite simple to just serialize the whole object\.
>>
>> I agree with Rong that in simple cases serialized objects are enough\. \(But I must say that the alternative with applications like "real time data driven decision support systems" is much more scientifically interesting and fun\. Otherwise we will probably not use the full potential of openEHR\.\)
>
> This could be achieved by querying in\-memory objects, so why does it have to be done in the persistence layer? Maybe you can give some example, Mikael\.

A data\-driven real\-time system \(in this example a decision support system\) relies heavily on fast and full access to the data\. Such systems often work with rules like “if parameter A and parameter B are < C then check if parameters D1, D2, D3, D4, D5 and D6 are > 10 and in that case trigger rules R1, R2, R3, R4, R5, R6 and R7”\.

Of course it is possible to implement data\-driven systems completely in Java and rely on a database management system only to store serialized data objects in BLOBs\. For every parameter the system needs to check, it then has to take the path to the object which contains the parameter, ask the database management system to retrieve the serialized Java object from its BLOB storage in the database, deserialize the object to a readable form, store the object in the computer's memory, read the needed parameter from the object, and discard the newly created object, without getting back the memory used for it until the garbage collector's next run\. But this strategy needs large resources to run\.
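The read path described above can be sketched as follows\. This is only an illustration under stated assumptions: Python's built\-in `sqlite3` and `pickle` modules stand in for the database management system and for Java serialization, and the table `node_blob`, the path string and the field names are hypothetical, not taken from openEHR or from any system mentioned in this thread\.

```python
import pickle
import sqlite3

# Hypothetical archetype-style path and node; both are made up for the sketch.
NODE_PATH = "/content[at0001]/data[at0002]"
NODE = {
    "systolic": 142,
    "diastolic": 91,
    "notes": "x" * 10_000,  # bulk that travels along on every read
}


def open_blob_store() -> sqlite3.Connection:
    """Create an in-memory table of <path, serialised node> rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node_blob (path TEXT PRIMARY KEY, body BLOB NOT NULL)"
    )
    return conn


def save_node(conn: sqlite3.Connection, path: str, node: dict) -> int:
    """Serialise the whole node into one BLOB; return the BLOB size in bytes."""
    body = pickle.dumps(node)
    with conn:  # commits on success
        conn.execute(
            "INSERT OR REPLACE INTO node_blob (path, body) VALUES (?, ?)",
            (path, body),
        )
    return len(body)


def read_parameter(conn: sqlite3.Connection, path: str, name: str):
    """Fetch the BLOB, deserialise the whole object, then read one field."""
    row = conn.execute(
        "SELECT body FROM node_blob WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        raise KeyError(path)
    node = pickle.loads(row[0])  # the entire object is rebuilt in memory
    return node[name]            # ...although only one value is needed


conn = open_blob_store()
blob_size = save_node(conn, NODE_PATH, NODE)
systolic = read_parameter(conn, NODE_PATH, "systolic")
```

Every call to `read_parameter` transfers and rebuilds the whole serialised node \(here more than 10,000 bytes\) in order to return a single number, which is the cost the paragraph above points at\.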
If the parameters are stored in directly readable form in the database management system, the application can just ask for the single data item it needs, and avoids spending time on deserializing BLOBs into readable objects, storing large objects instead of just the needed parameter in the computer’s memory, and having large objects garbage collected\.

An even more efficient alternative is to implement the rules directly in the database management system as triggers\. Then the rules are applied by the database management system in the same native way as the data are handled there\. But this alternative is only possible when the database management system is able to read the single data values and does not, as in the serialized objects alternative, only know that the single data values are somewhere inside a BLOB\.

If we are talking about small data\-driven systems I can accept the “pure Java guy’s alternative” and do most of the processing in Java\. But if we would like to have large data\-driven systems, this alternative in most cases needs too many resources to work, and we then need to use the “database engineer’s alternatives” instead\.

> A generic, full\-featured query service is tricky enough to do, so why not separate the persistence concern from query\-related logic?

I can accept that it is trickier to implement a good persistence layer without any serialized objects, but it is definitely not rocket science to do it\. So why not try, so we are able to use the data in more effective ways?

> Well, I would prefer to see a generic, all\-purpose persistence layer, defined by a clear interface\. Of course, the characteristics will be different depending on which implementation is actually used\. Applications should probably be built on top of EHR services, which are also generic and all\-purpose\.
> So I wouldn't agree that the persistence layer should be very much defined by any particular application\.
> These two, application and persistence, should be rather decoupled, and the services layer will be in between\.

I agree that we need one well\-defined, independent and good persistence layer interface\. We therefore need to design the persistence layer interface for the most demanding applications\.

Greetings,
Mikael

---

## Post #25 by @Karsten_Hilbert

Well, even in a GP EMR system we want decision support\. I don't think there's any such thing as a "small" system in your sense\. It's rather a question of what the system is going to be used for, no matter whether large or small\.

Karsten

---

## Post #26 by @mikael

Hi all,

Karsten Hilbert wrote:

> Mikael Nyström wrote:
>
>> If we are talking about small data\-driven systems I can accept the “pure Java guy’s alternative” and do most of the processing in Java\. But if we would like to have large data\-driven systems, this alternative in most cases needs too many resources to work, and we then need to use the “database engineer’s alternatives” instead\.
>
> Well, even in a GP EMR system we want decision support\. I don't think there's any such thing as a "small" system in your sense\. It's rather a question of what the system is going to be used for, no matter whether large or small\.

I used "small" and "large" in a broad sense\. If anyone feels more comfortable reading "small" as "low usage" and "large" as "high usage", or "not intensive" and "intensive", or "few users" and "many users", or similar things, that is OK with me\. :\-\) I just need to be able to compare two alternatives with different demands\.

Greetings,
Mikael Nyström

---

## Post #27 by @David_Forslund

This is a good reference, and there are now numerous persistence\-enabling systems built around it\. One which we have found to work very well and quite efficiently is ObjectRelationalBridge \(OJB\) from Apache\. We use a rather generic persistence interface in OpenEMed and map it to a number of persistence mechanisms\.
This enables us to leverage the performance of different systems for different applications\. The most efficient is usually an object database\.

Dave Forslund

Bert Verhees wrote:

---

**Canonical:** https://discourse.openehr.org/t/persistence-layer/14526
**Original content:** https://discourse.openehr.org/t/persistence-layer/14526