Aql teaser 4 for implementers

Seref · 3 March 2020 09:37

Dear implementers:
@bna @matijap @ian.mcnicoll

Given the composition on the left, and the AQL query after that, what should the result set for the select clause on the right be? Same as before: think on it whenever you have some free time

SELECT
    c,s,i_1,i_2
FROM
    COMPOSITION C c
        CONTAINS
    SECTION s
        CONTAINS 
    (INSTRUCTION i_1 OR INSTRUCTION i_2)

<c|s|i|> on the leftmost tree is a composition|section|instruction instance, i.e. a single composition instance.

matijap · 4 March 2020 08:00

In my head, the answer is clear: you get two rows, one with i1 in i_1 and i_2=null, and one where i_1=null and i_2=i1.

That is not what you’d get with SQL outer join and an additional condition that i_1 != i_2: you’d get no rows, because there would be no row with any of the variables being null before applying the non-equality condition.

To my horror, that is even not what you get with Better EHR Server: you also get no rows. Which seems wrong to me. I’m sure @bna’s team handles this properly given what he said in teaser #1 (and with what I tend do disagree, but when applied to this case it would probably lead to more logica behaviour).

Seref · 4 March 2020 09:05

Thank you. That’s exactly what I’d suggest as the correct behaviour.

Not only that, but in more complex cases, outer joins would still fail you depending on the results of previous outer joins, if you’re thinking in terms of relational algebra. I won’t go into that here, but let me just say that I use this example to show that CONTAINS semantics with disjunction cannot be represented with relational joins as a first class concept.

Don’t worry, you’re not the only one

And this is why I’m laying out different cases, to build an argument towards consistent behaviour of logical operators. The price of consistency is inconvenience but dealing with inconvenience is much,much cheaper in cost compared to cost of inconsistency, especially among vendors.