<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
        <title></title>
    </head>
    <body>
        <div style="padding-left:25px;">
            <h1>SRCMF corpus: TIGERSearch web interface</h1>
            <h2>Contents</h2>
            <ul>
                <li><a href="#interface">Using the TIGERSearch web interface</a></li>
                <li><a href="#query">Writing a simple query</a></li>
                <li><a href="#concordances">Exporting a concordance</a></li>
                <li><a href="#tags">Tagset used</a></li>
                <li><a href="#sample">Sample queries</a></li>
            </ul>
            <h2><a name="interface"></a>Using the TIGERSearch web interface</h2>
            <h3>Writing a query and browsing the results</h3>
            <p>In the TigerSearch tab, queries are entered in the top panel, and matching sentences
                are shown in tree form in the bottom panel. A tutorial on TigerSearch queries may be
                found in the section “<a href="#query">Writing a simple query</a>”.</p>
            <ul>
                <li>Type your query in the top panel (e.g. <tt>#pivot:[word = "Tristran"]</tt>)</li>
                <li>Click on the ‘Search’ button at the bottom right of the panel.</li>
            </ul>
            <p>If the query is well-formed, and if there are matching results in the corpus, the
                first tree in the forest will appear in the bottom panel.</p>
            <p>The central bar gives the number of matches and the position of the sentence in the
                corpus, in the form <i>sent: [sentence number] [match number] / [total matching
                    sentences].</i> Note that subgraph navigation is not yet implemented, and the
                interface does <strong>not</strong> show the total number of matches, only the
                number of matching sentences. You can navigate through the forest of matches using
                the forward and back arrows on this bar. The ‘Export’ button displays the current
                tree as an .SVG file in the browser, which can be saved and downloaded. The ‘Export
                Concordance’ button allows matching sentences to be exported in <a
                    href="#concordances">concordance form</a>.</p>
            <h3>Exporting the results</h3>
            <p>To export the results of your query, click the ‘Export Concordance’ button. An export
                window will appear, with the following options:</p>
            <ul>
                <li><p><strong>Type</strong></p>
                    <p>Three concordances are currently implemented:</p>
                    <ul>
                        <li>basic concordance</li>
                        <li>single word pivot concordance</li>
                        <li>pivot and block concordance</li>
                    </ul>
                    <p>It is important to note that these concordances use the names of TigerSearch
                        variables from the query to structure the concordance. <strong>No
                            concordance will be produced if your query does not contain a
                                <tt>#pivot</tt> variable.</strong> The pivot and block concordance
                        requires at least one additional <tt>#blockXX</tt> variable.</p>
                    <p>Further documentation for these concordances may be found in the section “<a
                            href="#concordances">Exporting a concordance</a>”.</p></li>
                <li><p><strong>Context (number of words)</strong></p>
                    <p>Sets the size of the context preceding and following the pivot.</p></li>
                <li><p><strong>Restore punctuation</strong></p>
                    <p>Adds punctuation from the BFM’s digitized edition to the exported
                        concordance. It will also restore words excluded from the TIGERSearch corpus
                        (e.g. lacunae, AOI in the <i>Chanson de Roland</i>).</p></li>
                <li><p><strong>Properties to show in concordance</strong></p>
                    <p>Select which features of terminal and non-terminal nodes should be shown in
                        the concordance. This function is only active for the ‘pivot and block
                        concordance’.</p></li>
            </ul>
            <p>When you have filled in the form:</p>
            <ul>
                <li>Click the ‘OK’ button.</li>
            </ul>
            <p>After a short delay, a new tab will open in your browser, containing the concordance
                in plain text tabular format (.csv).</p>
            <ul>
                <li>Save this file to disk using the ‘File &gt; Save As...’ menu in your
                    browser.</li>
            </ul>
            <h3>Viewing the concordance</h3>
            <p>To view and manipulate the concordance, you will need to use a spreadsheet
                package.</p>
            <ul>
                <li>Open the spreadsheet application.</li>
                <li>Select ‘File &gt; Open...’ from the menu bar.</li>
                <li>Ensure that the file list is showing either ‘All files’ or ‘CSV text
                    files’.</li>
                <li>Select the saved .csv file.</li>
            </ul>
            <p>You will need to correctly configure your spreadsheet software to read the file. We
                recommend using LibreOffice or OpenOffice Calc, which will prompt the user for
                settings whenever a .csv file is opened. The following settings are required for the
                import to function:</p>
            <ul>
                <li>Character set: Unicode (UTF-8);</li>
                <li>Separated by Tab (ONLY);</li>
                <li>Merge delimiters OFF;</li>
                <li>Text delimiter: NONE (empty box)</li>
            </ul>
            <p>Troubleshooting likely problems:</p>
            <ul>
                <li>If accented characters do not appear correctly, check that the character set is
                    UTF-8;</li>
                <li>If some rows do not seem to have the correct number of columns, check that
                    Text Delimiter is set to nothing (the default is usually double quote, which
                    will cause an error where the text contains double quotes), that merge
                    delimiters is OFF, and that TAB is the only separator selected;</li>
                <li>If zeros appear rather than punctuation (unlikely), use the ‘Fields’ section
                    of the import window to set every column type to ‘Text’ rather than
                    ‘Standard’.</li>
            </ul>
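            <p>For readers who prefer to process the export with a script rather than a
                spreadsheet, the import settings listed above can be mirrored in code. The
                following is a minimal Python sketch using the standard <tt>csv</tt> module. The
                function name <tt>read_concordance</tt> and the sample rows are invented for
                illustration and are not part of the SRCMF tools.</p>

```python
import csv
import io


def read_concordance(stream):
    """Parse a tab-separated concordance export into a list of rows."""
    reader = csv.reader(
        stream,
        delimiter="\t",          # Tab is the only separator
        quoting=csv.QUOTE_NONE,  # no text delimiter: double quotes are ordinary characters
    )
    # Consecutive tabs are never merged, so empty cells are preserved.
    return [row for row in reader]


# Invented sample data standing in for an exported file.
sample = 'id_1\tleft "quoted" words\tTristran\tright words\nid_2\t\t\tsalüee\n'
rows = read_concordance(io.StringIO(sample))
for row in rows:
    print(len(row), row)
```

            <p>For a saved export, the file would be opened with UTF-8 encoding, for example
                <tt>open(path, encoding="utf-8", newline="")</tt>, where <tt>path</tt> is a
                placeholder for the location of the saved .csv file.</p>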
            <h2><a name="query"></a>Writing a simple query</h2>
            <p>The following section will enable you to write simple TIGERSearch queries for the
                SRCMF corpus. It is not comprehensive, and must be read in conjunction with:</p>
            <ul>
                <li>chapter III of the <a target="_blank"
                        href="http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERSearch/manual_html.html"
                        >TIGERSearch user’s guide</a></li>
            </ul>
            <h3>Nodes in the TS graph</h3>
            <p>A TigerSearch graph is made up of two types of nodes: terminal and non-terminal
                nodes. In the graph viewer, terminal nodes appear at the bottom of the graph, while
                non-terminal nodes are represented by labelled white ovals, as shown in the example
                    <i>je puis dire</i>.</p>
            <img src="images/jepuisdire.png" alt="Example TIGERSearch tree" />
            <p>Each node has a number of features (see section “<a href="#tags">Tagset used</a>”).</p>
            <h4>SRCMF: ‘split’ nodes</h4>
            <p>In a true dependency graph, words form the only nodes.</p>
            <p>In the TigerXML SRCMF corpus, each ‘word’ in the dependency structure is in fact
                split between a terminal node (which contains the lexical form and the PoS tag of
                the word itself) and a non-terminal node (which contains the syntactic features of
                the structure headed by the word). The non-terminal node and the terminal node are
                linked by an edge labelled ‘L’ (for lexical realization).</p>
            <p>In the example tree, an ‘L’ edge links:</p>
            <ul>
                <li>the terminal node <i>puis</i> to the non-terminal node ‘Snt’: these nodes
                    represent the finite verb which heads the sentence;</li>
                <li>the terminal node <i>je</i> to the non-terminal node ‘SjPer’: these nodes
                    represent the subject of the sentence <i>je</i>;</li>
                <li>the terminal node <i>dire</i> to the non-terminal node ‘AuxA’: these nodes
                    represent the infinitive verb <i>dire</i>.</li>
            </ul>
            <p>A ‘D’ edge links the ‘Snt’ node to the non-terminal nodes ‘SjPer’ and ‘AuxA’: this
                indicates that the subject <i>je</i> and the ‘auxiliated’ infinitive <i>dire</i>
                depend on the main verb <i>puis</i>.</p>
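            <p>The split between terminal and non-terminal nodes described above can be pictured
                as a small data structure. The following Python sketch models the example tree
                <i>je puis dire</i>. The node identifiers (<tt>t1</tt>, <tt>n1</tt>, etc.) and the
                helper functions are invented for illustration; they do not reflect the actual
                TigerXML encoding of the corpus.</p>

```python
# Terminal nodes in word order (only the word form is kept here).
terminals = {"t1": "je", "t2": "puis", "t3": "dire"}

# Non-terminal nodes with their 'cat' feature.
nonterminals = {"n1": "Snt", "n2": "SjPer", "n3": "AuxA"}

# Edges as (parent, label, child): 'L' = lexical realization, 'D' = dependency.
edges = [
    ("n1", "L", "t2"),
    ("n2", "L", "t1"),
    ("n3", "L", "t3"),
    ("n1", "D", "n2"),
    ("n1", "D", "n3"),
]


def children(parent, label):
    """Return the nodes reached from `parent` by an edge with the given label."""
    return [c for p, l, c in edges if p == parent and l == label]


def head_word(nonterminal):
    """Return the word linked to a non-terminal node by its single 'L' edge."""
    (terminal,) = children(nonterminal, "L")
    return terminals[terminal]


for node, cat in nonterminals.items():
    dependants = [nonterminals[c] for c in children(node, "D")]
    print(cat, head_word(node), dependants)
```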
            <h4>SRCMF corpus node features</h4>
            <p>The SRCMF corpus has the following node features:</p>
            <p><i>Terminal nodes:</i></p>
            <ul>
                <li><tt>word</tt>: the word form</li>
                <li><tt>pos</tt>: part-of-speech tag (Cattex)</li>
                <li><tt>form</tt>: whether the text is verse or prose, and position of the word in
                    the line of verse.</li>
            </ul>
            <p><i>Non-terminal nodes:</i></p>
            <ul>
                <li><tt>cat</tt>: function of the structure headed by the node</li>
                <li><tt>type</tt>: morpho-syntactic category of the node (VFin, VPar, VInf, NV)</li>
                <li><tt>headpos</tt>: part-of-speech tag of the head word</li>
                <li><tt>coord</tt>: set to ‘y’ if the structure forms part of a coordination</li>
                <li><tt>dom</tt>: underscore-separated list of all functions dominated by the node
                    (e.g. for the ‘Snt’ node above ‘AuxA_SjPer’)</li>
            </ul>
            <p>For simple queries, we will focus mainly on the <tt>word</tt>, <tt>pos</tt> and
                    <tt>cat</tt> features.</p>
            <h4>Defining the feature specifications of a node</h4>
            <p>Node feature specifications are written between [square brackets] and take the
                following form:</p>
            <ul>
                <li><tt>[feature operator "value"]</tt></li>
            </ul>
            <p>where <i>value</i> is a string or</p>
            <ul>
                <li><tt>[feature operator /value/]</tt></li>
            </ul>
            <p>where <i>value</i> is a regular expression. Permitted <i>operators</i> are ‘=’
                (equals) and ‘!=’ (does not equal). For example, the following expression identifies
                all nodes where <tt>cat</tt> is "SjPer" (personal subject):</p>
            <ul>
                <li><tt>[cat = "SjPer"]</tt></li>
            </ul>
            <p>If we wish to include impersonal subjects (i.e. "SjPer" and "SjImp") we can use a
                regular expression:</p>
            <ul>
                <li><tt>[cat = /Sj.*/]</tt></li>
            </ul>
            <p>We can identify all nodes which are <i>not</i> subjects:</p>
            <ul>
                <li><tt>[cat != /Sj.*/]</tt></li>
            </ul>
            <p>We may also use the conjunction (&amp;) operator within the square brackets to specify
                several properties. For example, we can search for subordinate clause subjects by
                requiring the subject to be headed by a finite verb (<tt>type</tt> is "VFin"):</p>
            <ul>
                <li><tt>[cat = /Sj.*/ &amp; type = "VFin"]</tt></li>
            </ul>
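            <p>To make the behaviour of feature specifications concrete, the following Python
                sketch evaluates the operators ‘=’, ‘!=’ and ‘&amp;’ against a few invented nodes.
                It assumes that a regular expression must match the whole feature value, which is
                consistent with the use of <tt>/.*Sj.*/</tt> elsewhere in this guide to mean
                ‘contains Sj’. The function <tt>matches</tt> and the node data are hypothetical
                and only imitate the query engine.</p>

```python
import re


def matches(node, spec):
    """Return True if the node satisfies every (feature, operator, value) triple.

    A compiled pattern plays the role of /value/ and must match the whole
    feature value (an assumption); a plain string plays the role of "value".
    """
    for feature, operator, value in spec:
        actual = node.get(feature, "")
        if isinstance(value, re.Pattern):
            hit = value.fullmatch(actual) is not None
        else:
            hit = actual == value
        if operator == "!=":
            hit = not hit
        if not hit:
            return False
    return True


# Invented non-terminal nodes using the SRCMF feature names.
nodes = [
    {"cat": "SjPer", "type": "NV"},
    {"cat": "SjImp", "type": "VFin"},
    {"cat": "AuxA", "type": "VInf"},
]

# Counterpart of [cat = /Sj.*/ & type = "VFin"]
finite_subjects = [("cat", "=", re.compile(r"Sj.*")), ("type", "=", "VFin")]
# Counterpart of [cat != /Sj.*/]
non_subjects = [("cat", "!=", re.compile(r"Sj.*"))]

print([n["cat"] for n in nodes if matches(n, finite_subjects)])
print([n["cat"] for n in nodes if matches(n, non_subjects)])
```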
            <h4>Assigning a variable name to a node</h4>
            <p>A variable name may be assigned to the node definition. These are useful to refer to
                the same node several times in a complex query and are also used to indicate the
                pivot node to concordance scripts.</p>
            <p>Variable definitions adopt the following syntax:</p>
            <ul>
                <li><tt>#name:[&lt;definition&gt;]</tt></li>
            </ul>
            <p>where <i>definition</i> is a feature specification as described above. Note that
                variable names must begin with hash (#) and are separated from their definition by a
                colon (:).</p>
            <p>For example, we may wish to construct a concordance in which the subject forms the pivot.
                We define the #pivot variable as follows:</p>
            <ul>
                <li><tt>#pivot:[cat = /Sj.*/]</tt></li>
            </ul>
            <h3>Node relations</h3>
            <p>All but the simplest queries will require more than one node to be defined, and
                will usually require the relationship between the nodes to be specified.</p>
            <p>For example, suppose we wish to identify all subjects headed by the word
                    <i>Tristran</i>. First, we define the subject:</p>
            <ul>
                <li><tt>#subject:[cat = /Sj.*/]</tt></li>
            </ul>
            <p>Second, we define the word Tristran as a terminal node:</p>
            <ul>
                <li><tt>#tristran:[word = "Tristran"]</tt></li>
            </ul>
            <p>Finally, we must indicate the relationship between the nodes. The relationship
                between a non-terminal node and the terminal node representing its lexical content
                in the TigerSearch graph is one of direct dominance, labelled ‘L’ (lexical).</p>
            <h4>Direct dominance</h4>
            <p>In TigerSearch, direct dominance is expressed by using the operator ‘&gt;’ with the
                following syntax:</p>
            <ul>
                <li><tt>node &gt;[label] node2</tt></li>
            </ul>
            <p>where <i>node</i> and <i>node2</i> are feature specifications or node variables, and
                label (optional) is a string.</p>
            <p>To identify subjects headed by the word <i>Tristran</i>, the relationship between
                nodes #subject and #tristran is expressed as follows:</p>
            <ul>
                <li><tt>#subject &gt;L #tristran</tt></li>
            </ul>
            <h4>Left corner dominance</h4>
            <p>The ‘>@l’ operator specifies the leftmost terminal node dominated at any depth by a
                non-terminal node. It has the following syntax:</p>
            <ul>
                <li><tt>node &gt;@l tnode</tt></li>
            </ul>
            <p>where <i>node</i> and <i>tnode</i> are feature specifications or node variables, and
                    <i>tnode</i> is a terminal node.</p>
            <p>For example, instead of searching for all subjects which are headed by the word
                    <i>Tristran</i>, we may wish to identify all subjects <strong>beginning</strong>
                with the word <i>Tristran</i>. This relation would be written as follows:</p>
            <ul>
                <li><tt>#subject &gt;@l #tristran</tt></li>
            </ul>
            <p>Note that there is also a right corner dominance operator ‘>@r’.</p>
            <h4>Precedence</h4>
            <p>The precedence operator ‘.*’ permits the user to specify the word order of two
                terminal nodes with the following syntax:</p>
            <ul>
                <li><tt>tnode .* tnode2</tt></li>
            </ul>
            <p>where <i>tnode</i> and <i>tnode2</i> are feature specifications or node variables
                representing terminal nodes.</p>
            <p> For example, suppose we wish to identify all sentences in which the word Tristran
                heads the subject and precedes the main clause verb.</p>
            <p>We need to add two additional conditions to the query in the previous section. First,
                we need to identify the terminal node containing the main verb of the sentence: i.e.
                the lexical realization of the non-terminal node ‘Snt’:</p>
            <ul>
                <li><tt>#snt:[cat = "Snt"] &gt;L #verb</tt></li>
            </ul>
            <p>You may have noticed that #verb has no feature specification. This is perfectly valid
                in TigerSearch query syntax. In practice, we know that only one node can be linked
                to #snt by an ‘L’ relation in the corpus. #verb is thus defined by its relation to
                #snt rather than by its features.</p>
            <p>We then need to specify that the word Tristran precedes the verb:</p>
            <ul>
                <li><tt>#tristran .* #verb</tt></li>
            </ul>
            <p>Finally, we need to clarify that #subject is the subject of #snt. Otherwise, we
                risk finding subjects of a subordinate clause which happen to precede the main
                clause verb:</p>
            <ul>
                <li><tt>#snt &gt;D #subject</tt></li>
            </ul>
            <p>Putting it all together, the query is as follows:</p>
            <ul>
                <li><tt>#subject:[cat = /Sj.*/] &gt;L #tristran:[word = "Tristran"] <br /> &amp;
                        #snt:[cat = "Snt"] &gt;L #verb <br /> &amp; #tristran .* #verb <br /> &amp;
                        #snt &gt;D #subject</tt></li>
            </ul>
            <p>There is also a direct precedence operator, ‘.’, which specifies that the two
                terminal nodes must be directly adjacent.</p>
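            <p>The difference between the two precedence operators can be illustrated with word
                positions. The Python sketch below uses the words of the example tree and two
                invented helper functions; it assumes that each word occurs only once in the
                sentence, which real corpus sentences do not guarantee.</p>

```python
# Words of the example sentence in their original order.
words = ["je", "puis", "dire"]


def precedes(first, second):
    """Counterpart of 'tnode .* tnode2': first occurs somewhere before second."""
    return words.index(first) < words.index(second)


def directly_precedes(first, second):
    """Counterpart of 'tnode . tnode2': second immediately follows first."""
    return words.index(second) - words.index(first) == 1


print(precedes("je", "dire"))
print(directly_precedes("je", "dire"))
print(directly_precedes("je", "puis"))
```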
            <h4>Negation</h4>
            <p>It is important to learn one (extremely frustrating) golden rule of Tiger query
                syntax:</p>
            <ul>
                <li>you can negate a feature specification (e.g. <tt>[cat != "SjPer"]</tt>);</li>
                <li>you can negate a relation between nodes (e.g. <tt>#subject !&gt;L
                    #tristran</tt>)</li>
                <li><strong>but you can’t negate the existence of a node!</strong></li>
            </ul>
            <p>In practice, this means that when we write:</p>
            <ul>
                <li><tt>#snt:[cat = "Snt"] !&gt;D #subject:[cat = /Sj.*/]</tt></li>
            </ul>
            <p>we have <strong>not</strong> found all null subject main clauses. Instead, we have
                asked for sentences (#snt) which contain a subject node (#subject) which is
                    <strong>not</strong> the subject of a sentence. TigerSearch will return all
                sentences with subjects in a subordinate clause.</p>
            <p>The SRCMF corpus provides a partial work-around for this problem by using the
                    <i>dom</i> feature. The <i>dom</i> feature of a non-terminal node lists the
                <i>cat</i> values of all nodes linked to it by a ‘D’ edge, in alphabetical order
                and separated by underscores. For example, the ‘Snt’ node in the example tree has two dependants:
                SjPer and AuxA. It therefore has a <i>dom</i> property ‘AuxA_SjPer’.</p>
            <p>As a result, we can identify all main clauses without subjects by negating the
                    <i>dom</i> feature:</p>
            <ul>
                <li><tt>#snt:[cat = "Snt" &amp; dom != /.*Sj.*/]</tt></li>
            </ul>
            <p>This will return all ‘Snt’ nodes whose <i>dom</i> property does not contain the
                characters ‘Sj’: in other words, a main clause without an expressed subject.</p>
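            <p>The construction of the <i>dom</i> feature and the effect of negating it can be
                sketched in Python as follows. Both function names are invented, Python's default
                string sorting stands in for ‘alphabetical order’, and the regular expression is
                assumed to match the whole value. How the corpus encodes a node with no
                dependants at all is not specified here, so the empty string is only a guess.</p>

```python
import re


def dom_feature(dependant_cats):
    """Join the cat values of a node's 'D' dependants, sorted, with underscores."""
    return "_".join(sorted(dependant_cats))


def has_no_subject(dom):
    """Counterpart of dom != /.*Sj.*/ : true when 'Sj' occurs nowhere in dom."""
    return re.fullmatch(r".*Sj.*", dom) is None


example = dom_feature(["SjPer", "AuxA"])
print(example)                       # AuxA_SjPer
print(has_no_subject(example))       # False: the node dominates a subject
print(has_no_subject(dom_feature(["AuxA"])))  # True: no subject among the dependants
```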
            <h4>Syntactic variation</h4>
            <p>TigerSearch syntax is quite flexible, and we may express queries in a number of ways.
                For example, the query identifying all subjects headed by the word <i>Tristran</i>
                may be expressed using three statements...</p>
            <ul>
                <li><tt>#subject:[cat = /Sj.*/] <br /> &amp; #tristran:[word = "Tristran"] <br />
                        &amp; #subject &gt;L #tristran</tt></li>
            </ul>
            <p>... or two statements, e.g.:</p>
            <ul>
                <li><tt>#subject:[cat = /Sj.*/] <br /> &amp; #subject &gt;L #tristran:[word =
                        "Tristran"]</tt></li>
            </ul>
            <p>... or one statement:</p>
            <ul>
                <li><tt>#subject:[cat = /Sj.*/] &gt;L #tristran:[word = "Tristran"]</tt></li>
            </ul>
            <p>... or without variable names:</p>
            <ul>
                <li><tt>[cat = /Sj.*/] &gt;L [word = "Tristran"]</tt></li>
            </ul>
            <p>Where multiple statements are used, the order of statements is irrelevant.
                Confusingly for programmers, you may reference variables before assigning a value,
                e.g.:</p>
            <ul>
                <li><tt>#subject &gt;L #tristran &amp; #tristran:[word = "Tristran"] &amp;
                        #subject:[cat = /Sj.*/]</tt></li>
            </ul>
            <h2><a name="concordances"></a>Exporting a concordance</h2>
            <p>The SRCMF project has developed a number of concordances to present the results of
                TigerSearch queries in tabular format. Three concordances are currently
                implemented:</p>
            <ul>
                <li>basic concordance</li>
                <li>single word pivot concordance</li>
                <li>pivot and block concordance</li>
            </ul>
            <p>These concordances produce a text CSV file.</p>
            <h3>Principles</h3>
            <p>The concordances use the names of variables from the TigerSearch query to identify
                the syntactic constituents which should form the focus of the table. All
                concordances require a #pivot variable to be present in the query.</p>
            <p>For example, the following query is correct in TigerSearch, but <strong>will
                    not</strong> produce a concordance:</p>
            <ul>
                <li><tt>[word = /Tristr?a[nm][sz]?/]</tt></li>
            </ul>
            <p>To produce a concordance, the query must identify a node as the #pivot, for
                example:</p>
            <ul>
                <li><tt><strong>#pivot:</strong>[word = /Tristr?a[nm][sz]?/]</tt></li>
            </ul>
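            <p>Two points from this section can be checked with a short Python sketch: which
                spellings the regular expression <tt>Tristr?a[nm][sz]?</tt> accepts, and whether a
                query names a #pivot variable before it is submitted for export. The helper
                <tt>has_pivot</tt> is a hypothetical client-side check, not the test performed by
                the web interface, and the candidate spellings are illustrative rather than forms
                attested in the corpus. Whole-value matching of the pattern is assumed.</p>

```python
import re

SPELLINGS = re.compile(r"Tristr?a[nm][sz]?")


def has_pivot(query):
    """Rough check that a query string names a #pivot variable."""
    return re.search(r"#pivot\b", query) is not None


candidates = ["Tristran", "Tristan", "Tristrans", "Tristram", "Tristranz", "Tristrant"]
for form in candidates:
    print(form, SPELLINGS.fullmatch(form) is not None)

print(has_pivot("[word = /Tristr?a[nm][sz]?/]"))
print(has_pivot("#pivot:[word = /Tristr?a[nm][sz]?/]"))
```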
            <h3>Basic concordance</h3>
            <p>The basic concordance has four columns:</p>
            <ul>
                <li>sentence ID</li>
                <li>left context</li>
                <li>pivot</li>
                <li>right context</li>
            </ul>
            <p>The #pivot can be any node in the syntactic tree, either a single word or a larger
                structure. Currently, only lexical information (not annotation) can be shown in the
                basic concordance.</p>
            <p>For example, we may wish to create a concordance of all the main clause subjects
                containing the word ‘Tristran’:</p>
            <ul>
                <li><tt>#snt:[cat = "Snt"] &gt;D #pivot:[cat = "SjPer"] &amp; #pivot &gt;* [word =
                        /Tristr?a[nm][sz]?/]</tt></li>
            </ul>
            <p>Note that the #pivot variable is attached to the subject node (cat = "SjPer").</p>
            <p>Below is a selection of the results from the concordance:</p>
            <table border="1">
                <tr>
                    <th>ID</th>
                    <th>contexte gauche</th>
                    <th>pivot</th>
                    <th>contexte droite</th>
                </tr>
                <tr>
                    <td>beroul_pb:8_lb:234_1263227636.06</td>
                    <td>di por averté Ce saciés vos de verité Atant s' en est Iseut tornee</td>
                    <td>Tristran</td>
                    <td>l' a plorant salüee Sor le perron de marbre bis Tristran s' apuie ce</td>
                </tr>
                <tr>
                    <td>beroul_pb:13_lb:415_1264876249.02</td>
                    <td># croiz Einz croiz parole fole et vaine Ma bone foi me fera saine Tristran
                        [remest] a qui * mot poise </td>
                    <td>Tristran tes niés </td>
                    <td>vint soz cel pin Qui * est laienz en cel jardin Si me manda</td>
                </tr>
                <tr>
                    <td>beroul_pb:134_lb:4365_1268928771.68</td>
                    <td>moi le reçoive En sus l' atent s' espee tient Goudoïne autre voie tient</td>
                    <td>Tristran [remest] a qui * mot poise</td>
                    <td>Ist du * buison cela part toise Mais por noient quar cil s' esloigne</td>
                </tr>
            </table>
            <p>Note that the pivot may be one or more words.</p>
            <h3>What do the square brackets ([]), slashes (/), asterisks (*) and hashes (#)
                mean?</h3>
            <p>The third example in the above table contains [square brackets] in the pivot. These
                are used in all concordances to indicate <strong>words which occur between parts of
                    a discontinuous syntactic constituent</strong>.</p>
424 3 alavrent
            <p>The annotated subject in this sentence is <i>Tristran ... a qui mot poise</i>. The
425 3 alavrent
                main verb of the sentence, <i>remest</i>, is not part of the subject, but occurs
426 3 alavrent
                between its two parts. The verb <i>remest</i> is included in the pivot column, but
427 3 alavrent
                surrounded by square brackets.</p>
428 3 alavrent
            <p>This means that:</p>
429 3 alavrent
            <ul>
430 3 alavrent
                <li>the pivot column contains <strong>all parts</strong> of discontinuous
431 3 alavrent
                    pivots;</li>
432 3 alavrent
                <li>reading the concordance from left to right will always give the original
433 3 alavrent
                    sentence.</li>
434 3 alavrent
            </ul>
435 3 alavrent
            <p>Slashes (/) indicate division between sentences in the syntactic annotation. These
436 3 alavrent
                will not correspond to the editor’s division into sentences as shown in the
437 3 alavrent
                punctuation.</p>
438 3 alavrent
            <p>Asterisks (*) indicate that the preceding word has two syntactic functions (e.g.
439 3 alavrent
                    <i>qui</i> in <i>a qui mot poise</i> is both a relator and a subject). They may
440 3 alavrent
                usually be ignored.</p>
441 3 alavrent
            <p>Hashes (#) are related to the representation of coordination, and may always be
442 3 alavrent
                ignored.</p>
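            <p>As a rough illustration of the conventions described above, the following Python
                sketch rebuilds a plain sentence from one row of the basic concordance by joining
                the text columns from left to right and dropping the marker characters. The
                function name <tt>reconstruct_sentence</tt> is hypothetical and is not part of
                TXM or TIGERSearch. The sketch assumes that cells contain whitespace-separated
                tokens, as in the table above, and that each bracketed word carries its own pair
                of brackets.</p>

```python
def reconstruct_sentence(left, pivot, right):
    """Join concordance columns left to right and drop annotation markers."""
    markers = {"#", "*", "/"}
    words = []
    for cell in (left, pivot, right):
        for token in cell.split():
            if token in markers:
                continue  # coordination, double-function and sentence-break marks
            # Square brackets only flag words lying between the two parts of a
            # discontinuous constituent; the word itself belongs to the sentence.
            words.append(token.strip("[]"))
    return " ".join(words)


row = (
    "moi le reçoive En sus l' atent s' espee tient Goudoïne autre voie tient",
    "Tristran [remest] a qui * mot poise",
    "Ist du * buison cela part toise Mais por noient quar cil s' esloigne",
)
print(reconstruct_sentence(*row))
```

            <p>Dropping the slashes merges adjacent SRCMF sentences into one string; keep them
                if the sentence boundaries matter for your analysis.</p>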
443 3 alavrent
            <h3>Single word pivot concordance</h3>
444 3 alavrent
            <p>The single word pivot concordance has a variable number of columns, based on the
445 3 alavrent
                following structure:</p>
446 3 alavrent
            <ul>
447 3 alavrent
                <li>ID</li>
448 3 alavrent
                <li>Left context outside the SRCMF sentence containing the pivot</li>
449 3 alavrent
                <li>Left context within the SRCMF sentence containing the pivot</li>
450 3 alavrent
                <li>Pivot</li>
451 3 alavrent
                <li>Structure headed by the pivot</li>
452 3 alavrent
                <li>Function of the structure headed by the pivot</li>
453 3 alavrent
                <li>Right context within the SRCMF sentence containing the pivot</li>
454 3 alavrent
                <li>Right context outside the SRCMF sentence containing the pivot</li>
455 3 alavrent
            </ul>
456 3 alavrent
            <p>The single word pivot concordance is designed to give as much information as possible
457 3 alavrent
                about a single word. For example, a concordance could be created around the word
458 3 alavrent
                "Tristran":</p>
459 3 alavrent
            <ul>
460 3 alavrent
                <li><tt>#pivot:[word = /Tristr?a[nm][sz]?/]</tt></li>
461 3 alavrent
            </ul>
462 3 alavrent
            <p>Below is a selection of the results from the concordance (some columns are
463 3 alavrent
                omitted):</p>
464 3 alavrent
            <table border="1">
465 3 alavrent
                <tr>
466 3 alavrent
                    <th>Left context in sentence</th>
467 3 alavrent
                    <th>Pivot</th>
468 3 alavrent
                    <th>Pivot-headed structure</th>
469 3 alavrent
                    <th>Right context in sentence</th>
470 3 alavrent
                </tr>
471 3 alavrent
                <tr>
472 3 alavrent
                    <td>Sire</td>
473 3 alavrent
                    <td>Tristran</td>
474 3 alavrent
                    <td>Tristran</td>
475 3 alavrent
                    <td>por Deu le roi Si grant pechié avez de moi Qui * me mandez a itel ore</td>
476 3 alavrent
                </tr>
477 3 alavrent
                <tr>
478 3 alavrent
                    <td></td>
479 3 alavrent
                    <td>Tristran</td>
480 3 alavrent
                    <td>Tristran tes niés</td>
481 3 alavrent
                    <td>tes niés vint soz cel pin Qui * est laienz en cel jardin</td>
482 3 alavrent
                </tr>
483 3 alavrent
                <tr>
484 3 alavrent
                    <td># Que por Yseut que por</td>
485 3 alavrent
                    <td>Tristranz</td>
486 3 alavrent
                    <td>que por Tristranz</td>
487 3 alavrent
                    <td>Mervellose joie menoient</td>
488 3 alavrent
                </tr>
489 3 alavrent
            </table>
490 3 alavrent
            <p>The ‘pivot-headed structure’ gives the noun phrase of which the word <i>Tristran</i>
491 3 alavrent
                is head. In the second example, for instance, the word <i>Tristran</i> heads the
492 3 alavrent
                structure <i>Tristran tes niés</i>.</p>
493 3 alavrent
            <p>Note that words appearing in the ‘pivot-headed structure’ column are also found in
494 3 alavrent
                the two context columns. The original sentence may be read across the columns left
495 3 alavrent
                context — pivot — right context.</p>
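            <p>Before running a query such as the one above on the corpus, it can help to check
                which spellings its pattern covers. The Python sketch below does this under one
                assumption: that a TIGERSearch regular expression has to match the whole word
                form, which <tt>re.fullmatch</tt> imitates. The list of candidate spellings is
                purely illustrative; only <i>Tristran</i> and <i>Tristranz</i> appear in the
                tables above.</p>

```python
import re

# Pattern copied from the query above, without the surrounding slashes.
PATTERN = re.compile(r"Tristr?a[nm][sz]?")

# Illustrative spellings; only some of them are attested in the tables above.
candidates = ["Tristran", "Tristranz", "Tristan", "Tristram", "Tristrant", "tristran"]

matched = [w for w in candidates if PATTERN.fullmatch(w)]
print(matched)
```

            <p>In this sketch the forms ending in <i>-t</i> and the lower-case form are
                rejected, because the pattern has no optional final <i>t</i> and is
                case-sensitive.</p>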
496 3 alavrent
            <h3>Pivot and block concordance</h3>
497 3 alavrent
            <h4>Introduction</h4>
498 3 alavrent
            <p>The pivot and block concordance is designed to highlight the position of certain
499 3 alavrent
                constituents, called ‘blocks’ (e.g. the subject) with respect to a pivot (e.g. the
500 3 alavrent
                verb). The resulting CSV files are complex, with a large number of columns, and are
501 3 alavrent
                intended as the basis for more detailed analysis in spreadsheet software.</p>
502 3 alavrent
            <p>The pivot and block concordance has the following basic structure:</p>
503 3 alavrent
            <ul>
504 3 alavrent
                <li>ID</li>
505 3 alavrent
                <li>Left context outside the SRCMF sentence containing the pivot</li>
506 3 alavrent
                <li>Left context within the SRCMF sentence containing the pivot</li>
507 3 alavrent
                <li>Pre-pivot blocks</li>
508 3 alavrent
                <li>Pivot</li>
509 3 alavrent
                <li>Post-pivot blocks</li>
510 3 alavrent
                <li>Right context within the SRCMF sentence containing the pivot</li>
511 3 alavrent
                <li>Right context outside the SRCMF sentence containing the pivot</li>
512 3 alavrent
            </ul>
513 3 alavrent
            <p>As with the other concordances, TigerSearch queries must define a #pivot variable.
514 3 alavrent
                However, any number of variables whose name begins ‘#block’ may be defined. At least
515 3 alavrent
                one ‘#blockXX’ variable is required.</p>
516 3 alavrent
            <p>For example, the following query will generate a pivot and block concordance to show
517 3 alavrent
                the position of the subject (#block1) with respect to the finite verb (#pivot):</p>
518 3 alavrent
            <ul>
519 3 alavrent
                <li><tt>#snt:[cat = "Snt"] &gt;D #block1:[cat = "SjPer"] &amp; #snt &gt;L
520 3 alavrent
                        #pivot</tt></li>
521 3 alavrent
            </ul>
522 3 alavrent
            <p>In essence, the central section of the resulting concordance will take the following
523 3 alavrent
                form:</p>
524 3 alavrent
            <table border="1">
525 3 alavrent
                <tr>
526 3 alavrent
                    <th>Left context</th>
527 3 alavrent
                    <th>Block</th>
528 3 alavrent
                    <th>Pivot</th>
529 3 alavrent
                    <th>Block</th>
530 3 alavrent
                    <th>Right context</th>
531 3 alavrent
                </tr>
532 3 alavrent
                <tr>
533 3 alavrent
                    <td></td>
534 3 alavrent
                    <td>Li rois</td>
535 3 alavrent
                    <td>pense</td>
536 3 alavrent
                    <td></td>
537 3 alavrent
                    <td>que par folie Sire Tristran vos aie amé</td>
538 3 alavrent
                </tr>
539 3 alavrent
                <tr>
540 3 alavrent
                    <td>Si</td>
541 3 alavrent
                    <td></td>
542 3 alavrent
                    <td>voient</td>
543 3 alavrent
                    <td>il</td>
544 3 alavrent
                    <td># Deu et son reigne</td>
545 3 alavrent
                </tr>
546 3 alavrent
            </table>
547 3 alavrent
            <p>Where the subject is pre-verbal, it appears in the block column to the left of the
548 3 alavrent
                pivot. Where it is post-verbal, it appears in the block column to the right of the
549 3 alavrent
                pivot.</p>
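            <p>Since the CSV files are intended for further analysis, the position of the block
                can also be read off with a short script instead of a spreadsheet. The Python
                sketch below is only an illustration: it works on a hypothetical extract limited
                to the five central columns shown above, whereas a real export contains more
                columns and may use different header names. The function name
                <tt>subject_positions</tt> is likewise an assumption.</p>

```python
import csv
import io

# Hypothetical extract with the five central columns shown above; a real
# export has more columns and its header names may differ.
SAMPLE = """Left context,Block,Pivot,Block,Right context
,Li rois,pense,,que par folie Sire Tristran vos aie amé
Si,,voient,il,# Deu et son reigne
"""


def subject_positions(csv_text):
    """Return (pivot, position) pairs, position being 'pre', 'post' or 'none'."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows)  # skip the header row
    result = []
    for left, pre_block, pivot, post_block, right in rows:
        if pre_block.strip():
            position = "pre"
        elif post_block.strip():
            position = "post"
        else:
            position = "none"
        result.append((pivot, position))
    return result


print(subject_positions(SAMPLE))
```

            <p>The columns are addressed by position rather than by name because the header
                ‘Block’ occurs twice. A row with blocks on both sides of the pivot is reported
                as ‘pre’ in this sketch; adapt the test if such rows need separate
                treatment.</p>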
550 3 alavrent
            <h4>Why are there square brackets ([]) and curly brackets ({}) in the concordance?</h4>
551 3 alavrent
            <p>As with other concordances, square brackets denote <strong>words occurring between
552 3 alavrent
                    two parts of a discontinuous unit</strong>. The difference in this concordance
553 3 alavrent
                is that blocks may be discontinuous, as well as the pivot.</p>
554 3 alavrent
            <p>Curly brackets denote <strong>words which occur between the block and the
555 3 alavrent
                    pivot</strong> (or, in more complex examples, between two blocks).</p>
556 3 alavrent
            <table border="1">
557 3 alavrent
                <tr>
558 3 alavrent
                    <th>Left context</th>
559 3 alavrent
                    <th>Block</th>
560 3 alavrent
                    <th>Pivot</th>
561 3 alavrent
                    <th>Block</th>
562 3 alavrent
                    <th>Right context</th>
563 3 alavrent
                </tr>
564 3 alavrent
                <tr>
565 3 alavrent
                    <td></td>
566 3 alavrent
                    <td>Vos {n'}</td>
567 3 alavrent
                    <td>entendez</td>
568 3 alavrent
                    <td></td>
569 3 alavrent
                    <td>pas la raison</td>
570 3 alavrent
                </tr>
571 3 alavrent
                <tr>
572 3 alavrent
                    <td>Dex qel pitié</td>
573 3 alavrent
                    <td></td>
574 3 alavrent
                    <td>Faisoit</td>
575 3 alavrent
                    <td>{a} {mainte} {gent} li chiens</td>
576 3 alavrent
                    <td></td>
577 3 alavrent
                </tr>
578 3 alavrent
                <tr>
579 3 alavrent
                    <td></td>
580 3 alavrent
                    <td>Ta parole [est] [tost] [entendue] Que li rois la roïne prent</td>
581 3 alavrent
                    <td>est</td>
582 3 alavrent
                    <td></td>
583 3 alavrent
                    <td>tost entendue Que li rois la roïne prent</td>
584 3 alavrent
                </tr>
585 3 alavrent
                <tr>
586 3 alavrent
                    <td></td>
587 3 alavrent
                    <td>Tuit [s'] [escrïent] la gent du * reigne {s'}</td>
588 3 alavrent
                    <td>escrïent</td>
589 3 alavrent
                    <td></td>
590 3 alavrent
                    <td>la gent du * reigne</td>
591 3 alavrent
                </tr>
592 3 alavrent
            </table>
593 3 alavrent
            <p>In the table above, note the use of curly brackets in the first example to mark the
594 3 alavrent
                negative adverb <i>n’</i>, which occurs between the subject-block <i>vos</i> and the
595 3 alavrent
                verb-pivot <i>entendez</i>. In the second example, the prepositional phrase <i>a
596 3 alavrent
                    mainte gent</i> is marked with curly brackets, as it separates the verb-pivot
597 3 alavrent
                    <i>Faisoit</i> from the post-verbal subject-block <i>li chiens</i>.</p>
598 3 alavrent
            <p>In the third example, a discontinuous subject <i>Ta parole ... que li rois la roïne
599 3 alavrent
                    prent</i> appears in a pre-verbal block. <strong>The pre- or post-verbal
600 3 alavrent
                    position of a block is determined by the position of its first word relative to
601 3 alavrent
                    the pivot</strong>. The words <i>est tost entendue</i>, which separate the two
602 3 alavrent
                parts of the block, are marked with square brackets. </p>
603 3 alavrent
            <p>In the fourth example, the word <i>s’</i> appears (i) in square brackets, between the
604 3 alavrent
                two halves of a discontinuous subject-block and (ii) in curly brackets, between the
605 3 alavrent
                first part of the discontinuous subject <i>Tuit</i> and the verb-pivot
606 3 alavrent
                    <i>escrïent</i>.</p>
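            <p>When block cells are processed automatically, the two kinds of brackets usually
                need to be separated from the words of the block itself. The following Python
                sketch shows one way of doing this. The function name
                <tt>split_block_cell</tt> and the dictionary keys are hypothetical, and the
                sketch assumes that every bracketed word carries its own pair of brackets, as in
                the tables above.</p>

```python
def split_block_cell(cell):
    """Sort the tokens of a block cell by the kind of bracket around them."""
    parts = {"block": [], "gap": [], "separator": []}
    for token in cell.split():
        if token.startswith("[") and token.endswith("]"):
            parts["gap"].append(token[1:-1])        # between two parts of the block
        elif token.startswith("{") and token.endswith("}"):
            parts["separator"].append(token[1:-1])  # between the block and the pivot
        elif token not in ("*", "#"):
            parts["block"].append(token)            # a word of the block itself
    return parts


print(split_block_cell("Tuit [s'] [escrïent] la gent du * reigne {s'}"))
```

            <p>For the fourth example this yields the block words <i>Tuit la gent du
                reigne</i>, with <i>s’</i> and <i>escrïent</i> recorded as words inside the gap
                and <i>s’</i> recorded once more as the word separating the block from the
                pivot.</p>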
607 3 alavrent
            <h4>Why are there so many columns? I only asked for one block!</h4>
608 3 alavrent
            <p>The pivot and block concordance shows <strong>only one result per pivot</strong>.
609 3 alavrent
                Continuing to work with the same example, if a single verb-pivot has multiple
610 3 alavrent
                subject-blocks (which is quite possible in cases of coordination), each subject
611 3 alavrent
                occupies a separate column:</p>
612 3 alavrent
            <table border="1">
613 3 alavrent
                <tr>
614 3 alavrent
                    <th>Block3</th>
615 3 alavrent
                    <th>Block2</th>
616 3 alavrent
                    <th>Block1</th>
617 3 alavrent
                    <th>Pivot</th>
618 3 alavrent
                    <th>Block</th>
619 3 alavrent
                </tr>
620 3 alavrent
                <tr>
621 3 alavrent
                    <td>Ne tor</td>
622 3 alavrent
                    <td>ne mur</td>
623 3 alavrent
                    <td>ne fort chastel {Ne} {me}</td>
624 3 alavrent
                    <td>tendra</td>
625 3 alavrent
                    <td></td>
626 3 alavrent
                </tr>
627 3 alavrent
            </table>
628 3 alavrent
            <p>However, due to the way the number of columns is calculated, it is possible that some
629 3 alavrent
                will be empty. These may be deleted in the spreadsheet software, if you wish.</p>
630 3 alavrent
            <p>Note that the concordance will <strong>never</strong> represent the two halves of a
631 3 alavrent
                    <strong>single discontinuous</strong> block in separate columns. The following
632 3 alavrent
                representation therefore indicates a coordination:</p>
633 3 alavrent
            <table border="1">
634 3 alavrent
                <tr>
635 3 alavrent
                    <th>Left context</th>
636 3 alavrent
                    <th>Block</th>
637 3 alavrent
                    <th>Pivot</th>
638 3 alavrent
                    <th>Block</th>
639 3 alavrent
                    <th>Right context</th>
640 3 alavrent
                </tr>
641 3 alavrent
                <tr>
642 3 alavrent
                    <td></td>
643 3 alavrent
                    <td>Tristran {en}</td>
644 3 alavrent
                    <td>bese</td>
645 3 alavrent
                    <td>{la} {roïne} {Et} ele</td>
646 3 alavrent
                    <td>lui par la saisine</td>
647 3 alavrent
                </tr>
648 3 alavrent
            </table>
649 3 alavrent
            <p>The SRCMF annotation of the sentence in this table identifies <strong>two coordinated
650 3 alavrent
                    subjects</strong> of the verb <i>bese</i>. One is pre-verbal (<i>Tristran</i>),
651 3 alavrent
                one is post-verbal (<i>ele</i>); both occupy separate blocks.</p>
652 3 alavrent
            <h3>Adding annotation information</h3>
653 3 alavrent
            <p>When a concordance is launched from the TXM-web interface, you may specify which
654 3 alavrent
                properties of terminal and non-terminal nodes you wish to see in the
655 3 alavrent
                concordance.</p>
656 3 alavrent
            <ul>
657 3 alavrent
                <li>On the ‘Export Concordance’ form, use the drop-down lists of ‘Non-terminal
658 3 alavrent
                    features’ and ‘Terminal Features’.</li>
659 3 alavrent
                <li>Select the features of terminal and non-terminal nodes that you wish to show in
660 3 alavrent
                    the concordance from the two drop-down lists.</li>
661 3 alavrent
                <li>Click ‘OK’.</li>
662 3 alavrent
            </ul>
663 3 alavrent
            <p>Each added property will be placed in a separate column next to the block or pivot.
664 3 alavrent
                For example, if the ‘cat’ property is selected for non-terminal nodes, and the ‘pos’
665 3 alavrent
                property is selected for terminal nodes, the query above will produce the following
666 3 alavrent
                concordance:</p>
667 3 alavrent
            <table border="1">
668 3 alavrent
                <tr>
669 3 alavrent
                    <th>Left context</th>
670 3 alavrent
                    <th>Block</th>
671 3 alavrent
                    <th>Block Cat</th>
672 3 alavrent
                    <th>Pivot</th>
673 3 alavrent
                    <th>Pivot Pos</th>
674 3 alavrent
                    <th>Block</th>
675 3 alavrent
                    <th>Block Cat</th>
676 3 alavrent
                    <th>Right context</th>
677 3 alavrent
                </tr>
678 3 alavrent
                <tr>
679 3 alavrent
                    <td></td>
680 3 alavrent
                    <td>Li rois</td>
681 3 alavrent
                    <td>SjPer</td>
682 3 alavrent
                    <td>pense</td>
683 3 alavrent
                    <td>VERcjg</td>
684 3 alavrent
                    <td></td>
685 3 alavrent
                    <td></td>
686 3 alavrent
                    <td>que par folie Sire Tristran vos aie amé</td>
687 3 alavrent
                </tr>
688 3 alavrent
                <tr>
689 3 alavrent
                    <td>Si</td>
690 3 alavrent
                    <td></td>
691 3 alavrent
                    <td></td>
692 3 alavrent
                    <td>voient</td>
693 3 alavrent
                    <td>VERcjg</td>
694 3 alavrent
                    <td>il</td>
695 3 alavrent
                    <td>SjPer</td>
696 3 alavrent
                    <td># Deu et son reigne</td>
697 3 alavrent
                </tr>
698 3 alavrent
            </table>
699 3 alavrent
            <h2><a name="tags"></a>Tagset</h2>
700 3 alavrent
            <h3>Non-terminal nodes</h3>
701 3 alavrent
            <p>Non-terminal nodes have the following properties and values:</p>
702 3 alavrent
            <h4>cat</h4>
703 3 alavrent
            <p>Gives the syntactic function of the element. For more details, please refer to the <a
704 3 alavrent
                    target="_blank" href="http://srcmf.org">SRCMF
705 3 alavrent
                    website</a>.</p>
706 3 alavrent
            <ul>
707 3 alavrent
                <li><a name="Apst"></a><strong>Apst</strong>: Vocative (fr. apostrophe)</li>
708 3 alavrent
                <li><a name="AtObj"></a><strong>AtObj</strong>: Object attribute</li>
709 3 alavrent
                <li><a name="AtRfc"></a><strong>AtRfc</strong>: Attribute of reflexive pronoun</li>
710 3 alavrent
                <li><a name="AtSj"></a><strong>AtSj</strong>: Subject attribute</li>
711 3 alavrent
                <li><a name="Aux"></a><strong>Aux</strong>: Auxiliated non-finite verb (neither
712 3 alavrent
                    passive nor active)</li>
713 3 alavrent
                <li><a name="AuxA"></a><strong>AuxA</strong>: Auxiliated non-finite verb
714 3 alavrent
                    (active)</li>
715 3 alavrent
                <li><a name="AuxP"></a><strong>AuxP</strong>: Auxiliated non-finite verb
716 3 alavrent
                    (passive)</li>
717 3 alavrent
                <li><a name="Circ"></a><strong>Circ</strong>: Adjunct (fr. circonstant)</li>
718 3 alavrent
                <li><a name="Cmpl"></a><strong>Cmpl</strong>: Complement</li>
719 3 alavrent
                <li><a name="Coo"></a><strong>Coo</strong>: Coordination</li>
720 3 alavrent
                <li><a name="GpCoo"></a><strong>GpCoo</strong>: Coordinated group (conjunct)</li>
721 3 alavrent
                <li><a name="Insrt"></a><strong>Insrt</strong>: Inserted clause</li>
722 3 alavrent
                <li><a name="Intj"></a><strong>Intj</strong>: Interjection</li>
723 3 alavrent
                <li><a name="ModA"></a><strong>ModA</strong>: Modifier (attached)</li>
724 3 alavrent
                <li><a name="ModD"></a><strong>ModD</strong>: Dislocated (detached) modifier</li>
725 3 alavrent
                <li><a name="Ng"></a><strong>Ng</strong>: Negation</li>
726 3 alavrent
                <li><a name="NgPrt"></a><strong>NgPrt</strong>: Negative particle (e.g. <i>pas</i>,
727 3 alavrent
                        <i>mie</i>)</li>
728 3 alavrent
                <li><a name="nSnt"></a><strong>nSnt</strong>: Non-sentence</li>
729 3 alavrent
                <li><a name="Obj"></a><strong>Obj</strong>: Object</li>
730 3 alavrent
                <li><a name="RelC"></a><strong>RelC</strong>: Coordinating relator</li>
731 3 alavrent
                <li><a name="RelNC"></a><strong>RelNC</strong>: Non-coordinating relator</li>
732 3 alavrent
                <li><a name="Regim"></a><strong>Regim</strong>: Regime</li>
733 3 alavrent
                <li><a name="Rfc"></a><strong>Rfc</strong>: Reflexive pronoun</li>
734 3 alavrent
                <li><a name="Rfx"></a><strong>Rfx</strong>: Doubled reflexive pronoun (e.g. <i>nous
735 3 alavrent
                        ... <strong>nous-mêmes</strong></i>)</li>
736 3 alavrent
                <li><a name="SjImp"></a><strong>SjImp</strong>: Impersonal subject</li>
737 3 alavrent
                <li><a name="SjPer"></a><strong>SjPer</strong>: Personal subject</li>
738 3 alavrent
                <li><a name="Snt"></a><strong>Snt</strong>: Sentence</li>
739 3 alavrent
            </ul>
740 3 alavrent
            <h4>type</h4>
741 3 alavrent
            <p>Gives the syntactic category of the head of the structure.</p>
742 3 alavrent
            <ul>
743 3 alavrent
                <li><a name="VFin"></a><strong>VFin</strong>: Finite verb form</li>
744 3 alavrent
                <li><a name="VInf"></a><strong>VInf</strong>: Infinitive</li>
745 3 alavrent
                <li><a name="VPar"></a><strong>VPar</strong>: Participle</li>
746 3 alavrent
                <li><a name="nV"></a><strong>nV</strong>: Non-verbal</li>
747 3 alavrent
            </ul>
748 3 alavrent
            <h4>dom</h4>
749 3 alavrent
            <p>A ‘dom’ property is added to each non-terminal node in the tree listing the functions
750 3 alavrent
                of all its dependants and relators in alphabetical order, separated by underscores.
751 3 alavrent
                For example, if a finite verb has a subject, object and two adjuncts, the property
752 3 alavrent
                [dom = "Circ_Circ_Obj_SjPer"] will be added.</p>
753 3 alavrent
            <p>This resolves to an extent the problem of ‘negative’ queries. Recall that it is
754 3 alavrent
                impossible to query the non-existence of a node:</p>
755 3 alavrent
            <ul>
756 3 alavrent
                <li><tt>#clause:[type = "VFin"] !&gt;D #suj:[cat = "SjPer"]</tt></li>
757 3 alavrent
            </ul>
758 3 alavrent
            <p>Contrary to appearances, this query DOES NOT mean ‘node #suj does not exist’: it
759 3 alavrent
                means that the node #suj exists, but is not dependent on #clause.</p>
760 3 alavrent
            <p>However, it is possible to find all finite verbs without a subject by using the dom
761 3 alavrent
                property of the finite verb:</p>
762 3 alavrent
            <ul>
763 3 alavrent
                <li><tt>#clause:[type = "VFin" &amp; dom != /.*SjPer.*/]</tt></li>
764 3 alavrent
            </ul>
765 3 alavrent
            <p>The query specifies that we wish to find a node #clause which is a finite verb and
766 3 alavrent
                does not have the string ‘SjPer’ in the list of dependent nodes given by the dom
767 3 alavrent
                property.</p>
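            <p>The construction of the dom value can be pictured with the short Python sketch
                below. It is not the code used to build the corpus: the function names are
                hypothetical, and plain string sorting is assumed to be what ‘alphabetical
                order’ means here (Python sorts upper-case letters before lower-case ones, which
                may or may not match the corpus for a label such as <tt>nSnt</tt>).</p>

```python
def dom_value(functions):
    """Build a dom string: function labels sorted and joined by underscores."""
    return "_".join(sorted(functions))


def has_function(dom, label):
    """True if `label` is one of the functions listed in a dom string."""
    return label in dom.split("_")


dom = dom_value(["SjPer", "Obj", "Circ", "Circ"])
print(dom)                         # Circ_Circ_Obj_SjPer
print(has_function(dom, "SjPer"))  # True
print(has_function(dom, "Aux"))    # False
```

            <p>Splitting on the underscore avoids false hits when one label is contained in
                another (for instance <tt>Aux</tt> inside <tt>AuxA</tt>). The same caution
                applies to regular expressions on dom: a pattern such as
                <tt>/.*SjPer.*/</tt> is safe because no other label in the cat list above
                contains the string ‘SjPer’.</p>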
768 3 alavrent
            <h4>coord</h4>
769 3 alavrent
            <p>A ‘coord’ property is added to each non-terminal node in the tree. If the node
770 3 alavrent
                represents a coordinated structure, it has the property [coord = "y"].</p>
771 3 alavrent
            <p>For example, in the sentence <i>Sade et douz est quanqu’est de li</i> (gcoin1: p. 3,
772 3 alavrent
                l. 31), <i>sade</i> and <i>douz</i> are coordinated AtSj. The non-terminal nodes
773 3 alavrent
                dominating the words <i>sade</i> and <i>douz</i> have the properties [cat = "AtSj"
774 3 alavrent
                &amp; coord="y"].</p>
775 3 alavrent
            <p>The ‘coord’ property exists primarily to allow non-coordinated structures to be
776 3 alavrent
                identified. In the original format, this is not possible, as it would require a
777 3 alavrent
                query specifying the non-existence of a node [cat = "Coo"]. However, with the coord
778 3 alavrent
                property, it is possible to restrict a query to non-coordinated structures only:</p>
779 3 alavrent
            <ul>
780 3 alavrent
                <li><tt>#suj:[cat = "SjPer" &amp; coord != "y"]</tt></li>
781 3 alavrent
            </ul>
782 3 alavrent
            <h4>headpos</h4>
783 3 alavrent
            <p>A ‘headpos’ property is added to each non-terminal node in the tree. If the text is
784 3 alavrent
                correctly annotated at the deep level, each non-terminal node representing a
785 3 alavrent
                structure should directly dominate at most one terminal node in the tree, the word
786 3 alavrent
                representing the lexical content of the head of the structure. If this is the case,
787 3 alavrent
                the ‘headpos’ property is equal to the ‘pos’ property of the dominated terminal
788 3 alavrent
                node. Thus:</p>
789 3 alavrent
            <ul>
790 3 alavrent
                <li><tt>#node:[headpos = "NOMcom"]</tt></li>
791 3 alavrent
            </ul>
792 3 alavrent
            <p>is equivalent to:</p>
793 3 alavrent
            <ul>
794 3 alavrent
                <li><tt>#node &gt;L #lexnode:[pos = "NOMcom"]</tt></li>
795 3 alavrent
            </ul>
796 3 alavrent
            <p>The headpos property does not improve the usability of the corpus in TigerSearch, but
797 3 alavrent
                is useful in producing concordances, providing a more detailed morpho-syntactic tag
798 3 alavrent
                for the head of a structure than the SRCMF ‘nV’ (non-verbal) type tag.</p>
799 3 alavrent
            <p>If the non-terminal node directly dominates more than one terminal node, the
800 3 alavrent
                algorithm generating the headpos property makes a calculated guess as to which word
801 3 alavrent
                is the head, and inserts the tag of this word as the ‘headpos’. For example, if a
802 3 alavrent
                non-terminal node dominates a word with pos ‘NOMcom’ and a word with pos ‘DETdef’,
803 3 alavrent
                the algorithm will guess that the noun is the head, and insert the headpos
804 3 alavrent
                ‘NOMcom?’.</p>
805 3 alavrent
            <p>Note that headpos values which have been ‘guessed’ are always suffixed by a question
806 3 alavrent
                mark (e.g. NOMcom?). There will be no guessed headpos values in texts with full NP
807 3 alavrent
                annotation.</p>
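            <p>The behaviour described above can be summarised in a small Python sketch. The
                actual rules used to guess the head are not documented here, so the heuristic
                below (prefer the first noun, otherwise take the first word) is a hypothetical
                stand-in, as is the function name <tt>headpos</tt>. Only the two documented
                facts are taken from the description: a single dominated word supplies the
                value directly, and a guessed value ends in a question mark.</p>

```python
def headpos(terminal_pos_tags):
    """Return a headpos value for the pos tags of directly dominated words."""
    if not terminal_pos_tags:
        return None  # case not covered by the description above
    if len(terminal_pos_tags) == 1:
        return terminal_pos_tags[0]  # the single word is the head
    # Hypothetical heuristic: prefer the first noun, else the first word.
    nouns = [tag for tag in terminal_pos_tags if tag.startswith("NOM")]
    guess = nouns[0] if nouns else terminal_pos_tags[0]
    return guess + "?"  # guessed values carry a question mark


print(headpos(["NOMcom"]))            # NOMcom
print(headpos(["DETdef", "NOMcom"]))  # NOMcom?
```

            <p>In a concordance, guessed values can therefore be filtered out or reviewed by
                testing whether the headpos value ends in ‘?’.</p>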
808 3 alavrent
            <h3>Terminal nodes</h3>
809 3 alavrent
            <p>Terminal nodes have the following properties:</p>
810 3 alavrent
            <h4>pos</h4>
811 3 alavrent
            <p>Part-of-speech tag (Cattex). For more information, please refer to the <a
812 3 alavrent
                    target="_blank" href="http://bfm.ens-lyon.fr/article.php3?id_article=323">Cattex
813 3 alavrent
                    documentation</a> on the <a target="_blank" href="http://bfm.ens-lyon.fr/">BFM website</a>.</p>
814 3 alavrent
            <h4>form</h4>
815 3 alavrent
            <p>Each word has a property “form”. For texts in prose, the value of the “form” tag is
816 3 alavrent
                always “prose”. For texts in verse, the form tag is:</p>
817 3 alavrent
            <ul>
818 3 alavrent
                <li>“vers_first” for the first word in a line;</li>
819 3 alavrent
                <li>“vers_end” for the last word in a line;</li>
820 3 alavrent
                <li>“vers” for other words.</li>
821 3 alavrent
            </ul>
822 3 alavrent
            <p>It is thus possible to formulate a TS query focusing on words at the beginning or end
823 3 alavrent
                of a line of verse:</p>
824 3 alavrent
            <ul>
825 3 alavrent
                <li><tt>[word = "Tristran" &amp; form = "vers_end"]</tt></li>
826 3 alavrent
            </ul>
827 3 alavrent
            <p>In <i>Aucassin and Nicolete</i>, the form tag correctly distinguishes the verse and
828 3 alavrent
                prose sections of the text.</p>
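            <p>The assignment of form values can be pictured as follows. This Python sketch is
                an illustration only, not the procedure used to annotate the corpus: the
                function name <tt>form_tags</tt> is hypothetical, the treatment of a one-word
                line is an assumption, and the sample words are assumed to make up a single line
                of verse.</p>

```python
def form_tags(words, verse=True):
    """Pair each word of one line with its form tag."""
    if not verse:
        return [(word, "prose") for word in words]
    tagged = []
    last = len(words) - 1
    for index, word in enumerate(words):
        if index == 0:
            tag = "vers_first"  # also used for a one-word line (assumption)
        elif index == last:
            tag = "vers_end"
        else:
            tag = "vers"
        tagged.append((word, tag))
    return tagged


print(form_tags("Atant s' en est Iseut tornee".split()))
```

            <p>In this sketch the first word receives ‘vers_first’, the last word receives
                ‘vers_end’, and every word in between receives ‘vers’.</p>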
829 3 alavrent
            <h4>q</h4>
830 3 alavrent
            <p>Each word has a property “q”. This is equal to ‘y’ when the word occurs as part of
831 3 alavrent
                direct discourse, and ‘n’ when it does not. This annotation is automatically
832 3 alavrent
                generated by the BFM team from the position of quote marks in the text.</p>
833 3 alavrent
            <h2><a name="sample"></a>Sample queries</h2>
834 3 alavrent
            <p> The following sample queries may be tested by copying and pasting into the query
835 3 alavrent
                panel. </p>
836 3 alavrent
            <p>Find all main clause verbs:<br />
837 3 alavrent
                <tt>[cat = "Snt"]</tt></p>
838 3 alavrent
            <p>Find all structures introduced by a preposition:<br />
839 3 alavrent
                <tt>#n >R #relnc:[cat = "RelNC"]<br /> &amp; #relnc >L [pos = /PRE.*/]</tt><br />
840 3 alavrent
            </p>
841 3 alavrent
            <p>Find all post-verbal NP subjects:<br />
842 3 alavrent
                <tt>#verb:[type = "VFin"] >D #suj:[cat = "SjPer" &amp; type="nV"]<br /> &amp; #suj
843 3 alavrent
                    >L [pos = /NOM.*/] <br /> &amp; #suj >@l #sword<br /> &amp; #verb >L
844 3 alavrent
                    #vword<br /> &amp; #vword .* #sword</tt></p>
845 3 alavrent
            <p>Find indefinite subjects introduced by <q>qui</q>:<br />
846 3 alavrent
                <tt>[type = "VFin"] >D #suj:[cat = "SjPer"]<br /> &amp; #suj >R #relnc:[cat =
847 3 alavrent
                    "RelNC"]<br /> &amp; ( #relnc >L [word = /[QqKk]u?i/]<br /> | #relnc >~dupl
848 3 alavrent
                    [word = /[QqKk]u?i/] )</tt><br /></p>
849 3 alavrent
            <p>Find sentences with coordinated subjects:<br />
850 3 alavrent
                <tt>#coo:[cat = "Coo"] >~coord #sj1:[cat = "SjPer"]<br /> &amp; #coo >~coord
851 3 alavrent
                    #sj2:[cat = "SjPer"]<br /> &amp; #sj1 $ #sj2</tt></p>
852 3 alavrent
            <p>Find sentences with possible <q>gapping</q> of the finite verb (i.e. coordination of
853 3 alavrent
                subject–predicate pairs):<br />
854 3 alavrent
                <tt>#gpcoo1:[cat = "GpCoo"] >~ #suj1:[cat = "SjPer"]<br /> &amp; #gpcoo1 $.*
855 3 alavrent
                    #gpcoo2:[cat = "GpCoo"]<br /> &amp; #gpcoo2 >~ #suj2:[cat = "SjPer"]<br /> &amp;
856 3 alavrent
                    #gpcoo1 >~ #pred1:[cat = /Cmpl|Obj|AtSj/]<br /> &amp; #gpcoo2 >~ #pred2:[cat =
857 3 alavrent
                    /Cmpl|Obj|AtSj/]<br /></tt>
858 3 alavrent
            </p>
859 3 alavrent
            <h1>Useful links</h1>
860 3 alavrent
            <ul>
861 3 alavrent
                <li><a target="_blank" href="https://listes.cru.fr/wiki/srcmf/index">SRCMF wiki</a></li>
862 3 alavrent
                <li><a target="_blank" href="http://srcmf.org">SRCMF website</a></li>
863 3 alavrent
                <li><a
864 3 alavrent
                    target="_blank" href="http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERSearch/oldindex.shtml"
865 3 alavrent
                        >TIGERSearch website</a></li>
866 3 alavrent
                <li><a target="_blank" href="http://bfm.ens-lyon.fr/">BFM website</a></li>
867 3 alavrent
                <li><a target="_blank" href="http://textometrie.ens-lyon.fr/?lang=en">TXM website</a></li>
868 3 alavrent
            </ul>
869 3 alavrent
        </div>
870 3 alavrent
    </body>
871 3 alavrent
</html>