Hi @vidi42 the example in the clip you linked to looks like it will work 100% with this use case we are looking at right now, that is great!
Regarding @Chrizi 's second question I think it is semi related but maybe with FEEDER_AUDIT you are avoiding the problem in using CONTAINS now, however, just to be sure…
I think the original problem as well as the 2nd problem seem to mostly be rooted in the fact that these are arrays of elements we are dealing with as opposed to a cluster+items or list of archetypes (meaning, with a specific archetype type or archetype_node_id, which is the case that does seem to work well with arrays).
I have tried to troubleshoot this now a bit by turning on full logging in the database and examining the actual SQL query that is executed against the database when trying different type of AQL queries. Basically what I have seen is that as long as you are querying against arrays of items which have an archetype_node_id (and/or maybe with the attributes “name” and “value” – the “name_value_predicate” case that can be found in your code?) then the query generated against the database seems to be handling it in the right way (specifically that it is eventually using the PostgreSQL function jsonb_array_elements
to first parse and split the array into elements before then trying to extract the values), but if you can’t specify an archetype_node_id in the query (and/or maybe this “name_value_predicate” case which does not match List<DV_IDENTIFIER>
of things like originating_system_item_ids
either), then it is not doing anything with jsonb_array_elements
but instead trying to execute jsonb_extract_path_text
directly against the array itself (which will never match anything, I think?).
I have tried a lot of different “hacks” now to try and force the path to be interpreted as an array, for example to SELECT
with the following different variations:
c/feeder_audit/originating_system_item_ids/id
- (same as original example) rows are returned but values are null
c/feeder_audit/originating_system_item_ids[type='SomeSystemId']/id
- rows are returned but values are null
c/feeder_audit/originating_system_item_ids[DV_IDENTIFIER]/id
- ParseCancellationException from API
c/feeder_audit/originating_system_item_ids[DvIdentifier]/id
- ParseCancellationException from API
c/feeder_audit/originating_system_item_ids[*]/id
- ParseCancellationException from API
c/feeder_audit/originating_system_item_ids[0]/id
- ParseCancellationException from API
c/feeder_audit/originating_system_item_ids[]/id
-ParseCancellationException from API
- etc etc
All with no real success, unfortunately.
When I monitor the logs from the database at the same time, it never seems to succeed in interpreting this path as a JSON array (never tries to use jsonb_array_elements
to handle the array itself).
Here is the database query I am seeing from the original example:
select
jsonb_extract_path_text(cast("composition_join"."feeder_audit" as jsonb), 'originating_system_item_ids', 'id') as "/feeder_audit/originating_system_item_ids/id"
from "ehr"."entry"
right outer join "ehr"."composition" as "composition_join"
on "composition_join"."id" = "ehr"."entry"."composition_id"
right outer join "ehr"."ehr" as "ehr_join"
on "ehr_join"."id" = "composition_join"."ehr_id"
where
("ehr"."entry"."template_id" = $1 and ("ehr_join"."id" = 'bbe596e6-567d-44e9-871e-b718a66f8beb'))
If I take my second variant (c/feeder_audit/originating_system_item_ids[type='SomeSystemId']/id
) then the query becomes like this:
select
array_239963547_43.var_239963547_44 as "/feeder_audit/originating_system_item_ids[type='SomeSystemId']/id"
from "ehr"."entry"
right outer join "ehr"."composition" as "composition_join"
on "composition_join"."id" = "ehr"."entry"."composition_id"
right outer join "ehr"."ehr" as "ehr_join"
on "ehr_join"."id" = "composition_join"."ehr_id"
left outer join lateral
(select
(select jsonb_extract_path_text(cast("composition_join"."feeder_audit" as jsonb), '/originating_system_item_ids[type=''SomeSystemId'']', '/id') as "/feeder_audit/originating_system_item_ids[type='SomeSystemId']/id"
where
cast(jsonb_extract_path_text(cast("ehr"."js_composition"(cast("composition_join"."id" as uuid), cast('local.ehrbase.org' as text)) as jsonb), 'type') as varchar)
= cast('SomeSystemId' as varchar)) as "var_239963547_44") as "array_239963547_43"
on $1
where
("ehr"."entry"."template_id" = $2 and ("ehr_join"."id" = 'bbe596e6-567d-44e9-871e-b718a66f8beb'))
(a lot more complex but still never parsing the array via jsonb_array_elements
so it will still not match the array items correctly, plus it seems that it is taking the full syntax including the constraint value as plain text of the field name? field name it is looking for is originating_system_item_ids[type=''SomeSystemId'']
instead of originating_system_item_ids
?)
If I execute an AQL query against an array which has archetype types / node IDs then you see a totally different kind of query being generated (which does execute jsonb_array_elements
(your wrapped version of it, anyway)):
SELECT c/context/other_context/items[openEHR-EHR-CLUSTER.organisation.v1]/items[at0001] FROM EHR e[ehr_id/value='bbe596e6-567d-44e9-871e-b718a66f8beb'] CONTAINS COMPOSITION c
becomes:
select (array_77380713_39.var_77380713_40)::TEXT as "/context/other_context/items[openEHR-EHR-CLUSTER.organisation.v1]/items[at0001]" from "ehr"."entry" right outer join "ehr"."composition" as "composition_join" on "composition_join"."id" = "ehr"."entry"."composition_id" join "ehr"."event_context" on "ehr"."event_context"."composition_id" = "ehr"."entry"."composition_id" right outer join "ehr"."ehr" as "ehr_join" on "ehr_join"."id" = "composition_join"."ehr_id" join lateral (
select jsonb_extract_path_text(
ehr.xjsonb_array_elements(cast(jsonb_extract_path(
cast("ehr"."js_context"("ehr"."event_context"."id") as jsonb),
'other_context',
'/items[openEHR-EHR-CLUSTER.organisation.v1]'
) as jsonb)),
'/items[at0001]'
)
AS var_77380713_40) as "array_77380713_39" on true where ("ehr"."entry"."template_id" = $1 and ("ehr_join"."id" = 'bbe596e6-567d-44e9-871e-b718a66f8beb'))
I think this then becomes the key question for me now: how will this work in general with arrays in the new release of EHRbase? Is there a new way to specify these paths when the array items do not have an archetype type or node id, and/or when querying directly on the path without constraining the array (via []
) will it return a row in the AQL response for each item in the array?
Good that we will be able to use CONTAINS
on what seems like any object type now (including FEEDER_AUDIT) which seems to handle this well (based on the clip) but I assume that there will be other cases where one might still need to do something against a path without using CONTAINS where there could be arrays of elements like this?