Feature #3026

CQP, straighten CQL queries before using them

Added by Serge Heiden about 1 month ago.

Status: New
Start date: 02/25/2021
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: SearchEngine
Spent time: -
Target version: TXM 0.8.2

Description

Users edit and manage CQL queries in word processing software (MS Word and LibreOffice Writer).

In such software, automatic writing assistants can silently replace some characters while a query is being written or edited, and the replacement can be difficult to see.

For example, <space> characters typed before or after certain characters can be replaced by <unbreakable space> characters for typographic reasons (depending on the language of the document), and the user cannot see the difference. Since <unbreakable space> characters are not <space> characters in CQL query syntax, those queries may produce syntax errors when parsed by CQP.

To prevent some difficult-to-diagnose CQL syntax errors, we can straighten CQL queries before using them.

Solution

  • before sending CQL queries to CQP, replace all <unbreakable space> characters (U+00A0 NO-BREAK SPACE, UTF-8: c2 a0, UTF-16BE: 00a0, Decimal: 160, Octal: \0240) with <space> characters
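
A minimal sketch of the replacement step, written in Java on the assumption that it would run on the TXM side just before the query string is handed to CQP. The class name `CqlQueryNormalizer` and the method `straighten` are hypothetical and do not refer to existing TXM code.

```java
/**
 * Hypothetical helper (not part of the existing TXM code base):
 * straightens a CQL query before it is sent to CQP.
 */
public final class CqlQueryNormalizer {

    /** U+00A0 NO-BREAK SPACE, as inserted by word processors. */
    private static final char NO_BREAK_SPACE = '\u00A0';

    private CqlQueryNormalizer() {
        // utility class, no instances
    }

    /**
     * Replaces every U+00A0 character in the query with a plain U+0020 space.
     * Returns null for a null query and leaves all other characters untouched.
     */
    public static String straighten(String query) {
        if (query == null) {
            return null;
        }
        return query.replace(NO_BREAK_SPACE, ' ');
    }
}
```

A caller would pass the user's query through `CqlQueryNormalizer.straighten(query)` and send the result to CQP. As proposed above, the replacement applies to all occurrences, so a no-break space typed deliberately inside a quoted pattern would also become a plain space.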
