Are there plans to work out concepts for a persistence layer?
Thanks in advance for an answer.
Bert Verhees wrote:
Are there plans to work out concepts for a persistence layer?
Thanks in advance for an answer
We are working on two or three variants at the moment. The general form of the stored information is logically a two-column table of:
<path to info node, serialised info node>
You can convert an entire object network to this form and store it efficiently. Because the paths are archetype-based, you can query into the data quickly.
We are working on the details.
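As an illustration only (this is not from the original message, and it is not the openEHR implementation), the two-column idea can be sketched in Python with SQLite. The nested dictionary, the node ids such as at0004, and the helper names flatten, store and fetch are all hypothetical; JSON stands in for whatever serialisation is really used.

```python
import json
import sqlite3


def flatten(node, path=""):
    """Yield a (path, serialised node) pair for every node in a nested dict."""
    yield (path or "/", json.dumps(node, sort_keys=True))
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, f"{path}/{key}")


def store(conn, root):
    """Write the whole object network as rows of the two-column table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node (path TEXT PRIMARY KEY, body TEXT)"
    )
    conn.executemany("INSERT INTO node VALUES (?, ?)", list(flatten(root)))


def fetch(conn, path):
    """Return the deserialised node stored at `path`, or None if absent."""
    row = conn.execute(
        "SELECT body FROM node WHERE path = ?", (path,)
    ).fetchone()
    return None if row is None else json.loads(row[0])


# Hypothetical data: keys imitate archetype-based path segments.
composition = {
    "content[at0001]": {
        "data[at0002]": {
            "systolic[at0004]": {"magnitude": 120, "units": "mm[Hg]"},
            "diastolic[at0005]": {"magnitude": 80, "units": "mm[Hg]"},
        }
    }
}

conn = sqlite3.connect(":memory:")
store(conn, composition)
systolic = fetch(conn, "/content[at0001]/data[at0002]/systolic[at0004]")
```

Every node is stored once under its own path and again inside each ancestor's serialised form, so the sketch trades space for the ability to fetch any subtree with a single lookup.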
- thomas
Thanks, I would really appreciate this being ready. I am glad experienced people are working on this.
I don't mean to push, but for my own planning: is there any idea when some information about this will be released?
Thanks
Bert
Bert Verhees wrote:
Thanks, I would really appreciate this being ready. I am glad experienced people are working on this.
I don't mean to push, but for my own planning: is there any idea when some information about this will be released?
I will put up an FAQ on openEHR persistence in a day or two, just as a start.
- thomas
Hi folks
Standard qualifier for probably not understanding what's going on, but isn't
XML persistence slightly orthogonal to managing Archetypes? I mean to say,
persistence is important, but can't it be deferred to systems that already
offer native XML persistence, e.g.
http://www.sleepycat.com/products/bdbxml.html or one of the existing
XML-to-RDBMS mapping solutions?
Again, I may misunderstand what you're trying to achieve, but won't storing
serialized data in the second column make for potentially complex queries,
especially if you're doing keyword searches? I'm undoubtedly showing my
cluelessness here, but wouldn't an XQuery-based interrogation of the
repository be more applicable than SQL?
Any advice appreciated.
Cheers,
Tom
None of this is clueless, don't worry. In fact you have hit several nails on the head. I am still in the middle of writing an FAQ on all this, which I will try to get up in the next few days. In summary:
* persisting openEHR data structures, e.g. Compositions, is no
  different from persisting any other typically fine-grained
  hierarchical / object data
* so you can do normal analysis on it; you can store serialised
  data blobs, and queries then do have to be thought about more
  carefully; but the other side of the coin is that not every
  fine-grained node needs to be directly queryable
* with one little exception: due to being archetyped, the
  data is full of node ids and hence every node is guaranteed to be
  addressable with an XPath-style path (we use a couple of
  machine-processable shortcuts on our paths to reduce the length,
  but otherwise they are XPaths; see the section on Paths in
  http://svn.openehr.org/specification/TRUNK/publishing/architecture/overview.pdf).
  So one very interesting option for storing the data is as <path,
  serialised node> pairs, which is dead easy in a relational
  database, and is very queryable, because the paths are
  semantically meaningful (being defined by archetypes).
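To make the "very queryable" point concrete, here is a hedged sketch of how such pairs might be queried in a relational database. It is an illustration, not the approach the FAQ will describe. The record_id column, the example paths and node ids, and the helpers records_where and subtree are assumptions added for the example; JSON stands in for the real serialisation.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: a record_id column is added so many records share one table.
conn.execute("CREATE TABLE node (record_id TEXT, path TEXT, body TEXT)")
conn.execute("CREATE INDEX node_path ON node (path)")

# Hypothetical archetype-style paths.
SYSTOLIC = "/content[at0001]/data[at0002]/systolic[at0004]"
DIASTOLIC = "/content[at0001]/data[at0002]/diastolic[at0005]"

conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [
        ("rec-1", SYSTOLIC, json.dumps({"magnitude": 150, "units": "mm[Hg]"})),
        ("rec-2", SYSTOLIC, json.dumps({"magnitude": 118, "units": "mm[Hg]"})),
        ("rec-2", DIASTOLIC, json.dumps({"magnitude": 76, "units": "mm[Hg]"})),
    ],
)


def records_where(conn, path, predicate):
    """Ids of records whose node at `path` satisfies `predicate`."""
    cursor = conn.execute(
        "SELECT record_id, body FROM node WHERE path = ? ORDER BY record_id",
        (path,),
    )
    return [rid for rid, body in cursor if predicate(json.loads(body))]


def subtree(conn, record_id, prefix):
    """All stored paths of one record that start with `prefix`."""
    cursor = conn.execute(
        "SELECT path FROM node WHERE record_id = ? AND substr(path, 1, ?) = ? "
        "ORDER BY path",
        (record_id, len(prefix), prefix),
    )
    return [path for (path,) in cursor]


high = records_where(conn, SYSTOLIC, lambda n: n["magnitude"] >= 140)
```

The path column does the selecting, so only the matching nodes are deserialised; keyword search over the serialised bodies would still need separate thought, as Tom points out.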
What I aim to do is to just put up an initial FAQ; there is no doubt a lot of expertise in object and O/R persistence here in the community, and I would like to get others to contribute their knowledge as well (without trying to write just a general purpose lesson on O/R persistence - there are many of those already available).
(Probably we should set up a wiki at some point....)
- thomas
Shouldn't heavyweight queries be limited to a separate OLAP platform? This
would include most QA/QM and research uses. If you set up the persistence
layer to populate the OLAP store as it saves clinical information to the data
repository, you can even have real-time OLAP. I don't know how much real
difference the RDBMS optimizations for a transaction-based workload (i.e. the
CDR) versus a query-based one (OLAP) make when using a persistence layer like
Hibernate.
Kevin
Kevin M. Coonan, MD
Adjunct Assistant Professor, Division of Emergency Medicine
NLM Fellow, Department of Medical Informatics
University of Utah School of Medicine
Co-chair, HL7 Emergency Care Special Interest Group
kevin.coonan@utah.edu
Kevin M. Coonan, M.D. wrote:
Shouldn't heavy weight queries be limited to a separate OLAP platform?
This is what I saw at a large data-processing company where I used to work.
They ran separate tables optimized for OLAP queries (wide, flat, ugly tables with lots of data redundancy). Some tables were kept up to date with triggers; other OLAP tables were updated once a day, at night, with SQL select/insert queries, depending on the purpose.
The 'normal' data-processing application did not suffer any noticeable performance loss.
Oracle supports this in SQL; it calls the feature MATERIALIZED VIEW, see http://www.akadia.com/services/ora_materialized_views.html
The REFRESH and FAST keywords are optional:
CREATE MATERIALIZED VIEW mv_emp_pk
  REFRESH FAST
  START WITH SYSDATE
  NEXT SYSDATE + 1/48  -- refresh every 30 minutes (1/48 of a day)
  WITH PRIMARY KEY
  AS SELECT * FROM emp@remote_db;
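The trigger-maintained variant mentioned above can be sketched without Oracle. The following is a minimal illustration using SQLite from Python; the table names observation and report_summary, their columns, and the trigger name are all hypothetical, and SQLite has no materialized views, so a plain table plus a trigger plays that role here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Transactional table, written by the 'normal' application.
    CREATE TABLE observation (
        id INTEGER PRIMARY KEY,
        patient_id TEXT,
        code TEXT,
        value REAL
    );

    -- Flat reporting table, kept up to date by the trigger below.
    CREATE TABLE report_summary (
        patient_id TEXT,
        code TEXT,
        n INTEGER,
        total REAL,
        PRIMARY KEY (patient_id, code)
    );

    CREATE TRIGGER observation_to_report AFTER INSERT ON observation
    BEGIN
        -- Make sure a summary row exists, then fold the new value in.
        INSERT OR IGNORE INTO report_summary
            VALUES (NEW.patient_id, NEW.code, 0, 0.0);
        UPDATE report_summary
            SET n = n + 1, total = total + NEW.value
            WHERE patient_id = NEW.patient_id AND code = NEW.code;
    END;
    """
)

conn.executemany(
    "INSERT INTO observation (patient_id, code, value) VALUES (?, ?, ?)",
    [("p1", "sbp", 120.0), ("p1", "sbp", 140.0), ("p2", "sbp", 110.0)],
)

summary = conn.execute(
    "SELECT patient_id, code, n, total FROM report_summary ORDER BY patient_id"
).fetchall()
```

Reporting queries then read only report_summary, at the cost of a little extra work on every insert into the transactional table.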
Bert
Dear all,
I think that before we discuss how we are going to build a persistence layer, we
need to discuss how we are going to use it. Is it to support a simple
electronic healthcare record application that only collects basic
information and prints it on a computer screen or on paper at a
small primary health care centre? Or is it to support an information
system for electronic healthcare records used everywhere in a
large hospital (or a country!), where the system is able to support, among
other things, data-intensive applications such as real-time data-driven
decision support systems? In the first case the persistence layer can
be built in many different ways, some of which are simple and
fast to build. In the second case there are far fewer ways to build
the persistence layer, and probably none of them is simple or fast to
build.
Regards,
Mikael
Mikael Nyström
M Sc in Computer Science and Engineering
Ph D student in Medical Informatics
Department of Biomedical Engineering
Linköping University
Dear Mikael, since I started this discussion, let me explain my needs and opinions.
A good persistence layer reflects the classes that need to be persistent.
The classes do not need to know which DB vendor is involved.
There will be no DB-vendor-specific code in the main application; the DB-vendor-specific code will be in the persistence layer.
There is an abstraction between the DB vendor and the classes that need to be persistent.
Post-relational DBs, XML DBs and relational DBs of all vendors should be accessible through the same code interface.
The purpose doesn't matter; the complexity doesn't matter.
This concept is scalable: just changing a few strings makes the application work against another database, from the same or another vendor, on the same or another machine.
It makes an application fit to run on a notebook with a small DB as well as against a remote DB server: the same application code, with a different branch underneath in the persistence layer, which is already programmed generically against most databases and situations.
That is what I mean when I talk about a persistence layer.
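A minimal sketch of the kind of layer described above, assuming nothing about openEHR itself: one abstract interface, two interchangeable backends, and a single string that selects between them. The names Repository, MemoryRepository, SqliteRepository and open_repository, and the "memory:" / "sqlite:" string format, are invented for the illustration.

```python
import json
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class Repository(ABC):
    """The only persistence API the application code sees."""

    @abstractmethod
    def save(self, key: str, obj: dict) -> None: ...

    @abstractmethod
    def load(self, key: str) -> Optional[dict]: ...


class MemoryRepository(Repository):
    """Backend for tests or a small notebook installation."""

    def __init__(self) -> None:
        self._items = {}

    def save(self, key, obj):
        self._items[key] = json.dumps(obj)

    def load(self, key):
        raw = self._items.get(key)
        return None if raw is None else json.loads(raw)


class SqliteRepository(Repository):
    """Backend holding all the database-specific code."""

    def __init__(self, filename: str) -> None:
        self._conn = sqlite3.connect(filename)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS item (id TEXT PRIMARY KEY, body TEXT)"
        )

    def save(self, key, obj):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO item VALUES (?, ?)",
                (key, json.dumps(obj)),
            )

    def load(self, key):
        row = self._conn.execute(
            "SELECT body FROM item WHERE id = ?", (key,)
        ).fetchone()
        return None if row is None else json.loads(row[0])


def open_repository(url: str) -> Repository:
    """Pick a backend from a configuration string such as 'sqlite:ehr.db'."""
    scheme, _, rest = url.partition(":")
    if scheme == "memory":
        return MemoryRepository()
    if scheme == "sqlite":
        return SqliteRepository(rest)
    raise ValueError(f"unsupported backend: {scheme!r}")
```

Application code calls only open_repository, save and load, so switching from "memory:" to "sqlite:ehr.db" changes one string and nothing else.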
Mikael Nyström wrote:
I want to use it for that.
Karsten
Bert Verhees wrote:
Dear Mikael, since I initially started the discussion, I explain what my needs/opinions are.
A good persistence-layer reflects the classes which want to be persistent.
Do you have any specific need for queries? If you don't need to query the internals of the objects (or tree of them), it will be quite simple to just serialize the whole object.
The classes do not need to know what DB-vendor is involved.
There will be no db-vendor specific code in the main application, the db-vendor specific code will be in the persistence layer
There is an abstraction between DB-vendor and classes that want to be persistent.
Post-relational-db's, XML-db's, relational DB's of all vendors should be accessible over the same code-interface.
Agree with all these. Hiding these details behind some interface for accessing the persistence layer would work.
Rong
Hi all,
Rong Chen wrote:
Bert Verhees wrote:
Dear Mikael, since I initially started the discussion, I explain what my needs/opinions are.
A good persistence-layer reflects the classes which want to be persistent.
Do you have any specific need for queries? If you don't need to query
the internals of the objects (or tree of them), it will be quite simple
to just serialize the whole object.
I agree with Rong that in simple cases serialized objects are enough.
(But I must say that the alternative, with applications like "real time data
driven decision support systems", is much more scientifically interesting and
fun. Otherwise we will probably not use the full potential of openEHR.)
The classes do not need to know what DB-vendor is involved.
There will be no db-vendor specific code in the main application, the
db-vendor specific code will be in the persistence layer
There is an abstraction between DB-vendor and classes that want to be persistent.
Post-relational-db's, XML-db's, relational DB's of all vendors should
be accessible over the same code-interface.
Agree with all these. Hiding these details behind some interface for
accessing the persistence layer would work.
Of course it is possible to hide everything behind a persistence layer
_interface_ and say nothing about what lies behind the
interface. But the subject line and quite a few of the mails have been about the
persistence layer itself (not the interface), and for that it is necessary to know
what kind of application it should support.
Greetings,
Mikael
Mikael Nyström wrote:
Hi all,
Rong Chen wrote:
Bert Verhees wrote:
Dear Mikael, since I initially started the discussion, I explain what my needs/opinions are.
A good persistence-layer reflects the classes which want to be persistent.
Do you have any specific need for queries? If you don't need to query
the internals of the objects (or tree of them), it will be quite simple
to just serialize the whole object.
I agree with Rong that in simple cases serialized objects are enough.
(But I must say that the alternative, with applications like "real time data
driven decision support systems", is much more scientifically interesting and
fun. Otherwise we will probably not use the full potential of openEHR.)
This could be achieved by querying in-memory objects; why does it have to be done in the persistence layer? Maybe you can give an example, Mikael.
A generic, full-featured query service is tricky enough to build, so why not separate the persistence concern from query-related logic?
The classes do not need to know what DB-vendor is involved.
There will be no db-vendor specific code in the main application, the db-vendor specific code will be in the persistence layer There is an abstraction between DB-vendor and classes that want to be persistent.
Post-relational-db's, XML-db's, relational DB's of all vendors should be accessible over the same code-interface.
Agree with all these. Hiding these details behind some interface for
accessing the persistence layer would work.
Of course it is possible to hide everything behind a persistence layer
_interface_ and say nothing about what lies behind the
interface. But the subject line and quite a few of the mails have been about the
persistence layer itself (not the interface), and for that it is necessary to know
what kind of application it should support.
Well, I would prefer to see a generic, all-purpose persistence layer defined by a clear interface. Of course, its characteristics will differ depending on which implementation is actually used. Applications should probably be built on top of EHR services, which are also generic and all-purpose. So I wouldn't agree that the persistence layer should be defined to any great extent by a particular application. The two, application and persistence, should be decoupled, with the services layer in between.
[application specific logic] [some advanced use: decision support, automated alert etc]
[generic service layer: EHR, Demographics, Terminology, Security..., QueryService(?) ]
[generic persistence layer]
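The three layers in this diagram can be sketched roughly as follows. This is only one illustrative reading of it: the names Persistence, ListPersistence, QueryService and alerts, and the record fields, are hypothetical, and the query service here simply filters in-memory objects as suggested earlier in this message.

```python
from typing import Callable, Iterable, List, Protocol


class Persistence(Protocol):
    """Generic persistence layer: knows how to hand back stored objects."""

    def all(self) -> Iterable[dict]: ...


class ListPersistence:
    """Stand-in backend; a database-backed class could replace it."""

    def __init__(self, items: Iterable[dict]) -> None:
        self._items = list(items)

    def all(self) -> Iterable[dict]:
        return iter(self._items)


class QueryService:
    """Generic service layer: queries run over in-memory objects."""

    def __init__(self, store: Persistence) -> None:
        self._store = store

    def select(self, predicate: Callable[[dict], bool]) -> List[dict]:
        return [obj for obj in self._store.all() if predicate(obj)]


def alerts(service: QueryService, limit: int) -> List[str]:
    """Application-specific logic: patients whose systolic exceeds `limit`."""
    return [o["patient"] for o in service.select(lambda o: o["systolic"] > limit)]


store = ListPersistence(
    [
        {"patient": "p1", "systolic": 150},
        {"patient": "p2", "systolic": 118},
    ]
)
flagged = alerts(QueryService(store), 140)
```

The application function never touches the backend directly, so either layer can be swapped without changing the other; whether in-memory filtering scales to the data-intensive cases Mikael describes is exactly the open question in this thread.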
Cheers,
Rong
I have tried to be removed from this list several times. How can I be removed?
Rong Chen wrote:
Bert Verhees wrote:
Dear Mikael, since I initially started the discussion, I explain what my needs/opinions are.
A good persistence-layer reflects the classes which want to be persistent.
Do you have any specific need for queries? If you don't need to query the internals of the objects (or tree of them), it will be quite simple to just serialize the whole object.
This would be bad from the point of view of control. MS Word does this, and serializes all kinds of uninitialized buffers which contain random memory data.
It is also not possible to apply data-mining techniques to the data.
But serializing objects is a valid way of storing them, so it should be possible through the same interface by which all other databases are connected.
I would not recommend it, though. No way.
Bert
Of course it is possible to hide everything behind a persistence layer
_interface_ and say nothing about what lies behind the
interface. But the subject line and quite a few of the mails have been about the
persistence layer itself (not the interface), and for that it is necessary to know
what kind of application it should support.
It was me who brought up the subject a few times, because it was a part that was not worked out in the openEHR specs, and because I am quite new here, I thought I had missed something.
It was difficult to make myself clear, and therefore difficult to get a clear answer; that is why several mails about this were sent.
I thought there had to be a clear place one could point to, like Lego blocks (www.lego.com): you take a persistence layer block and put it below the reference kernel, put the archetype kernel block on top and a GUI block on top of that, and you have built an information system.
I couldn't find a clear spot where to put the persistence layer block, nor could I find a description of the interface that openEHR was prepared for. And I heard people in Holland shout, "just put a database below and run it", and I thought, am I a fool?
But now it is clear to me. Those shouting people did not have much knowledge of the subject.
There is no persistence layer defined in the openEHR specs; Thomas said there will be specs for this after some time.
Agree?
Bert
There is no persistence layer defined in the openEHR specs; Thomas said there will be specs for this after some time.
While reading the rest of the mails (I was a week or more behind), I see there is a lot of work being done (I have not yet read it all).
I agree with Rong Chen that persistence should be generic and decoupled, and a good persistence layer fits on almost every application and on every db. It is like magic glue.
The paper of Scott W. Ambler is very interesting in this context (google for it)
Bert
Kenneth Bøgelund Ahrensberg wrote:
I have tried to be removed from this list several times. How can I be removed?
Just log in at http://www.openehr.org/cgi-bin/membDB-login and remove yourself.
- thomas beale