Bug #2413: RCP: 0.8.0, Sending edition text selection to concordance doesn't work if the selection contains punctuation marks - Plateforme TXM - Forge du Centre Blaise Pascal

Updated by Serge Heiden almost 7 years ago

Sending edition text selection to concordance doesn't work if the selection contains punctuation.

Whitespace in the selection is also a problem: it can lead to a query such as {"donc" ""}, which does nothing in the concordance.

The query may need further fixing to strip out the punctuation characters.
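As an illustration of the character-cleaning approach, here is a minimal sketch that splits a selection into word forms and drops punctuation and empty pieces, so that an empty "" token never reaches the query. The class and method names are hypothetical and are not taken from SendEditionTo.java. The split rule (whitespace plus Unicode punctuation) is an assumption and will not necessarily agree with the corpus tokenization, for instance on apostrophes and hyphens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of TXM.
final class SelectionCleaner {

    /** Splits the selected text into word forms, dropping punctuation and empty pieces. */
    static List<String> words(String selection) {
        List<String> result = new ArrayList<>();
        // Assumption: any run of whitespace or Unicode punctuation separates two words.
        for (String piece : selection.split("[\\s\\p{P}]+")) {
            if (!piece.isEmpty()) { // a leading separator yields one empty piece
                result.add(piece);
            }
        }
        return result;
    }
}
```

With this rule, a selection such as "donc, " yields the single form donc instead of the pair "donc" "".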

New location after split:
/org.txm.edition.rcp/src/org/txm/edition/rcp/handlers/SendEditionTo.java

h3. Solution

SLH: an edition text _characters_ selection should really be treated as an edition text *words* selection, that is, the internal CQP tokens whose word forms are displayed by the characters selected in the browser.

So it is more a matter of matching and selecting the correct word tokens under the surface than of cleaning the selected characters.

See the URS Annotation extension code for how to select the token IDs under the browser character selection, and how to extend the character selection to the characters corresponding to the real graphical boundaries of the word forms.
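The idea of extending a character selection to word boundaries can be sketched on a plain string. This is not the URS Annotation extension code: the class name, the method name and the "letter or digit" definition of a word character are all assumptions made for illustration. In the real edition the boundaries would come from the elements carrying the token IDs, not from character classes.

```java
// Hypothetical helper, not part of TXM or the URS extension.
final class SelectionBounds {

    /**
     * Extends the half-open range [start, end) so that it does not cut a word.
     * Assumption: a word character is any letter or digit.
     */
    static int[] expandToWords(String text, int start, int end) {
        // Move the start leftwards while the preceding character belongs to a word.
        while (start > 0 && Character.isLetterOrDigit(text.charAt(start - 1))) {
            start--;
        }
        // Move the end rightwards while the next character belongs to a word.
        while (end < text.length() && Character.isLetterOrDigit(text.charAt(end))) {
            end++;
        }
        return new int[] {start, end};
    }
}
```

For the text "les domaines politiques sont", a selection covering "maines polit" is extended to "domaines politiques".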

h3. Solution 2

# define the wanted behavior for multiple words/token selection
## sequential or non-sequential query? -> SLH: always sequential. If the character sequence denotes a sequence of words, the query must search for a sequence of words.
# fix the query according to the above choice, e.g. [word = "domaines"][word = "politiques"] (sequential) or [word = "domaines"]|[word = "politiques"] (non-sequential)
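The two query shapes above could be built from a list of word forms roughly as follows. This is a sketch with hypothetical names, not the code of SendEditionTo.java. It assumes that the word values are interpreted by CQP as regular expressions and that a backslash escapes both regex metacharacters and the double quote; check this against the CQP version in use.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of TXM.
final class CqlQueryBuilder {

    /** Backslash-escapes characters assumed to be special in a CQP string. */
    static String escape(String form) {
        StringBuilder sb = new StringBuilder();
        for (char c : form.toCharArray()) {
            if ("\\\"[](){}.*+?|^$".indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** One token expression per word form, e.g. [word = "domaines"]. */
    private static String token(String form) {
        return "[word = \"" + escape(form) + "\"]";
    }

    /** Sequential query: the forms must occur one after the other. */
    static String sequence(List<String> forms) {
        return forms.stream().map(CqlQueryBuilder::token).collect(Collectors.joining());
    }

    /** Non-sequential query: any one of the forms matches. */
    static String alternatives(List<String> forms) {
        return forms.stream().map(CqlQueryBuilder::token).collect(Collectors.joining("|"));
    }
}
```

Given the SLH decision above, only sequence would be needed; alternatives is shown for comparison.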

NOTE: the behavior of sequential queries depends on the current CQP matching strategy; by default, the concordance doesn't build the pivot with the whole sequence.
