Bug #871

RCP: 0.7, Under Windows, TXM does not work with accented paths

Added by Matthieu Decorde almost 5 years ago. Updated almost 3 years ago.

Status: New
Start date: 06/18/2014
Priority: Urgent
Due date:
Assignee: -
% Done: 80%
Category: SearchEngine
Spent time: -
Target version: TXM 0.7.8

Description

When using the search engine in 'memory mode', CQP cannot access corpora whose paths contain accented or special characters.

A) Document the problem: See (FR) https://groupes.renater.fr/wiki/txm-users/public/faq#sous_windows_txm_07_et_versions_ulterieures_aucune_requete_cql_ne_fonctionne_sur_aucun_corpus

B) Temporary Solution:
  • under Windows, check whether the corpus directory path obeys the current CQP character constraints (no accented or special characters)
  • if not, ask the user to change the 'TXM User Home directory' preference to something compatible with the current CQP constraints
  • corpus directory path must be checked at installation and when the 'TXM User Home directory' preference is changed
  • TXM should not let the user believe that it works until the 'TXM User Home directory' complies with the current CQP constraints
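
The check described in B) could be sketched as follows. This is only an illustration: the exact CQP character constraints are not spelled out in this ticket, so the sketch assumes that only printable ASCII characters are safe. The function names (find_unsafe_characters, is_cqp_compatible) are hypothetical and are not part of TXM or CQP.

```python
def find_unsafe_characters(path: str) -> list[str]:
    """Return the distinct characters of `path` outside printable ASCII.

    Assumption: CQP only handles printable ASCII (codes 32..126) in paths.
    The real constraint may be stricter or looser.
    """
    return sorted({ch for ch in path if not (32 <= ord(ch) < 127)})


def is_cqp_compatible(path: str) -> bool:
    """True when `path` contains no character flagged as unsafe."""
    return not find_unsafe_characters(path)


def describe_problem(path: str) -> str:
    """Build a message that could be shown to the user (hypothetical wording)."""
    bad = find_unsafe_characters(path)
    if not bad:
        return "Path is compatible with the assumed CQP constraints."
    return (
        "The 'TXM User Home directory' contains unsupported characters: "
        + " ".join(bad)
    )
```

Such a check would run at installation and each time the 'TXM User Home directory' preference is changed, for example is_cqp_compatible("C:\\Users\\Hélène\\TXM") returns False under the assumption above.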

C) Definitive Solution

Change the CQP registry directory access code to comply with the pathname constraints of current operating systems.

Solution

  • Replace FileSystem IO functions with glib FileSystem IO functions (fopen -> g_fopen, etc.)
  • Change the registry files encoding to UTF-8: on Windows and Mac OS X the system encoding is not UTF-8, and these files contain the DATA path to the index files.
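
The second point (registry files in UTF-8) could be illustrated with the sketch below. It is not the actual fix, which belongs in the CQP C code. It assumes that existing registry files were written in the Windows-1252 system encoding; the function name reencode_to_utf8, the sample file name and the sample HOME line are hypothetical.

```python
import tempfile
from pathlib import Path


def reencode_to_utf8(registry_file: Path, source_encoding: str = "cp1252") -> None:
    """Rewrite `registry_file` in UTF-8.

    Assumption: the file is currently encoded in `source_encoding`.
    Bytes are read and written directly so line endings are left untouched.
    """
    text = registry_file.read_bytes().decode(source_encoding)
    registry_file.write_bytes(text.encode("utf-8"))


# Small demonstration on a throwaway file (illustrative content only).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "voeux"
    line = 'HOME "C:\\Users\\Hélène\\TXM\\corpora\\VOEUX\\data"\n'
    sample.write_bytes(line.encode("cp1252"))
    reencode_to_utf8(sample)
    converted = sample.read_text(encoding="utf-8")
```

Note that the conversion is not idempotent: running it twice on the same file would decode UTF-8 bytes as Windows-1252 and corrupt the accented characters, so a real migration would need to record which files have already been converted.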

See changes: https://groupes.renater.fr/wiki/txm-info/specs_search_engine

Validation test

In a Windows TXM session:
  • if necessary, create a user account whose name contains accented characters
  • if necessary, install TXM
  • run TXM (from C:/Program files/TXM/TXM.bat if it was installed by another user)
  • when the VOEUX corpus is selected, the corpus size must be shown in the TXM status bar.

History

#1 Updated by Serge Heiden almost 5 years ago

  • Subject changed from With Windows, corpus queries don't work when the user home directory contains special characters to Under Windows, CQL queries don't work when the user home directory contains accented or special characters

#2 Updated by Matthieu Decorde almost 5 years ago

  • Category set to SearchEngine

When using the memory mode of CQi, registry paths with accents (or special characters) are not correctly encoded, and CQi fails to find the corpus registry files.

#3 Updated by Serge Heiden almost 5 years ago

  • Subject changed from Under Windows, CQL queries don't work when the user home directory contains accented or special characters to Under Windows, impossible to use search engine (CQL queries) when the user home directory contains accented or special characters
  • Description updated (diff)

#4 Updated by Serge Heiden almost 5 years ago

  • Tracker changed from Support to Bug
  • Description updated (diff)
  • Target version changed from Known bugs to TXM 0.7.7

#5 Updated by Serge Heiden over 4 years ago

  • Priority changed from Normal to High

#6 Updated by Matthieu Decorde over 4 years ago

  • Subject changed from Under Windows, impossible to use search engine (CQL queries) when the user home directory contains accented or special characters to RCP: 0.7, Under Windows, impossible to use search engine with accented registry directory path
  • Description updated (diff)
  • % Done changed from 0 to 60

#7 Updated by Matthieu Decorde over 4 years ago

  • Subject changed from RCP: 0.7, Under Windows, impossible to use search engine with accented registry directory path to RCP: 0.7, Under Windows, TXM does not work with accented paths

#8 Updated by Matthieu Decorde over 4 years ago

  • Description updated (diff)

#9 Updated by Matthieu Decorde over 3 years ago

  • Priority changed from High to Urgent
  • Target version changed from TXM 0.7.7 to TXM 0.8.0a (split/restructuration)
  • % Done changed from 60 to 50

#10 Updated by Matthieu Decorde almost 3 years ago

  • Description updated (diff)
  • Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.7.8
  • % Done changed from 50 to 80

#11 Updated by Matthieu Decorde almost 3 years ago

  • Description updated (diff)
