AQL ORDER BY doesn't specify the columns should appear in the projection

Reading about SQL ORDER BY, detected it is mandatory that a projection is done over the column that appears in the ORDER BY which makes sense because without that projection the client won’t see any order.

But we don’t have anything in the AQL spec mentioning this constraint.

I think we need to add some note about that. What do others think?

Funny thing about hierarchical data is the plain data rules of SQL don’t apply, so we could have something like:

SELECT TOP 5 c FROM COMPOSITION c ORDER BY c

Is that even valid? What is the sorting criteria?

Or, is something like this valid? (note the order by attribute is not in the projection, but is a subnode of it)

SELECT TOP 5 c FROM COMPOSITION c ORDER BY c/context/start_time/value

is certainly valid (or should be).

The issue of what is comparable or sortable is up dfor discussion - Bjorn leading the way but possibly largely figure out by Cambio as part of GDL. I can’t see how anything other than some Element values (with specific datatypes) could be, or should be sortable.

What can come after ORDER BY could be limited to types that are descendants of the primitive type Ordered, as defined here in the Foundation Types. That includes Integer, Real, String, the various Iso8601_xx types, and also DV_ORDERED descendants.

Note that you would not even have to use ORDER by c/context/start_time/value, since the path c/context/start_time is already an Ordered type.

This is another thing that can be determined from today’s BMM, simply by getting the descendants of the type Ordered (e.g. using Archie), or as per earlier discussions, could be represented more simply as a table or list (but that won’t get updated with the BMM of course, so it has to be maintained).

I guess simple attributes from the RM are also sortable.

But related to the first question, do we need to project the attribute mentioned in the ORDER BY?

In some cases I think yes, like when projecting only data, but in other cases it could be a containment situation (when the projection contains the attribute used in the ORDER BY) like:

SELECT TOP 5 c FROM COMPOSITION c ORDER BY c/context/start_time/value

That is something missing from the AQL spec that has to do with query validity, and might be related with the missing definition of the internal processing of AQL expressions (which we agreed to work on in the near future).

No SQL implementation has ever imposed such a limit on me and I think it’s completely silly and we should definitely not make it a part of the specification. What’s wrong with the query “give me patient’s body weight ordered from the most recent one to the least recent one” without also inquiring about the exact timestamp of the measurement?

I really see no use in forcing constraints into language specification that do not simplify implementation nor usage and also do not make the use of language any safer. I think the one who wrote the query is entirely aware of the fact that she’s doing ordering on an attribute she chose not to see.

4 Likes

I second @matijap Agreed on both counts.