AQL: 3. Rules for parameter substitution on AQL: which types will add or not quotes

According to the current grammar:

DATE:	'\'' DIGIT DIGIT DIGIT DIGIT DIGIT DIGIT DIGIT DIGIT 'T' DIGIT DIGIT DIGIT DIGIT DIGIT DIGIT '.' DIGIT DIGIT DIGIT '+' DIGIT DIGIT DIGIT DIGIT '\'';

STRING:  '\'' ( ESC_SEQ | ~('\\'|'\'') )* '\''
    	|  '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
    	;

So this means that:

  • ISO 8601 date/times are currently the non-expanded form, i.e. without dashes / colon (and are all full date/times, with TZ included), and require single quotes;
  • Strings are either single-quoted or double-quoted.

I could not find any other use of quoting.

For the future, we need to allow the expanded ISO8601 patterns, also dates without times etc. If we want to get rid of quotes on date/times, we need type checking in the AQL environment (e.g. running off a BMM file), because of the problem of partial years, e.g. ‘2002’ as a string is a valid ISO 8601 date (the developers of that standard didn’t have the common sense to require something like ‘2002-??-??’, which would have ensured partial dates are not confused with integers or strings…

About this issue, IIRC I detected inconsistencies between implementations on the quotes applied to the parameters, for instance:

select e/ehr_id, e/time_created, e/system_id from EHR e CONTAINS COMPOSITION c WHERE c/archetype_node_id=’$archetype_id’

vs.

select e/ehr_id, e/time_created, e/system_id from EHR e CONTAINS COMPOSITION c WHERE c/archetype_node_id=$archetype_id

If the implementation does a simple substitution of values, option 1 would be correct, since what finishes in the WHERE condition is a string constant. But if the replacement already considers adding the quotes if those are missing, the second example will work as well.

The issue is: there is no specification around this, or at least I didn’t find it.

This might also be considered as one of those processing rules that we need to define, maybe a “pre-processing” rule in this case.

1 Like

I would vote for (1), simply on the basis that it is the just much easier for people to understand the rules. - where a string is required , it must be quoted. Otherwise we need a whole other set of rules that need to be agreed, documented and communicated on very fine points of the RM.

2 Likes

I second Ian’s vote since for developers familiar with SQL, single quoted text (string) is standard.

2 Likes

@thomas.beale @ian.mcnicoll @christian I have added a decision page on the wiki about this https://openehr.atlassian.net/wiki/spaces/spec/pages/624754806/AQL+quotes+for+parameter+substitution

There was another topic about this: AQL: Clarify use of parameters in AQL queries

As suggested there, the option-1 " ... WHERE c/archetype_node_id=’$archetype_id’ ... " is (or might be) unsafe against SQL injection. It is going to work but, is considered bad practice.

For backward compatibility we should still support it, with an appropriate notice in the spec about related risks. More important is what should we should facilitate and promote: it is actually the option-2, where the parameters are substituted by the engine in a smart way, depending on expected types. This is also done by most of the SQL engines (if I’m not mistaken).

As @sebastian.iancu pointed out later, this is rather unsafe if the engine just inserts verbatim the parameter value passed to it: if it contains a quote, the AQL will become invalid or, if carefully crafted by an expert, will have an entirely different meaning which will make it possible to leak data from the repository by people that do not have direct access to it (but only through an application that ensures access control).

Our (Better[tm], but perhaps not the best in this case) solution is to have another parameter placeholder type. This type actually inserts parameter value verbatim, but :archetype_id would add quotes (after figuring out the value was a string) and properly “escape” any quotes that might be within the string. This is all equivalent to what the proper solution would be implementation-wise that the parameters values are not inserted into the AQL text, but are properly supported all the way up to the AQL engine…