Revision 3926
TXM/trunk/bundles/org.txm.treetagger.core.linux/res/linux/FILES (revision 3926) | ||
---|---|---|
1 | 1 |
|
2 | 2 |
This package contains the TreeTagger, a probabilistic part-of-speech |
3 |
tagger developed by Helmut Schmid. All rights are reserved by the |
|
4 |
Institute for Computational Linguistics at the University of |
|
5 |
Stuttgart. The programs have been statically compiled for x86 computers
|
|
6 |
running Linux.
|
|
3 |
tagger developed by Helmut Schmid. All rights are reserved by the
|
|
4 |
Institute for Computational Linguistics at the University of Stuttgart.
|
|
5 |
The programs have been compiled for PCs with the Linux operating system.
|
|
6 |
Many thanks to Arne Fitschen for doing this!
|
|
7 | 7 |
|
8 | 8 |
Files contained in this package: |
9 | 9 |
|
... | ... | |
13 | 13 |
- bin/train-tree-tagger training program |
14 | 14 |
- bin/tree-tagger tagger program |
15 | 15 |
- bin/separate-punctuation program for tokenization (used by the shell scripts) |
16 |
- cmd/lookup.perl Perl script for pretagging |
|
16 | 17 |
- doc/nemlap94.ps paper describing the TreeTagger |
17 | 18 |
- doc/sigdat95.ps paper describing the TreeTagger |
18 | 19 |
|
TXM/trunk/bundles/org.txm.treetagger.core.linux/res/linux/README (revision 3926) | ||
---|---|---|
1 | 1 |
|
2 |
/***************************************************************************/ |
|
3 |
/* How to use the TreeTagger */ |
|
4 |
/* Author: Helmut Schmid, University of Stuttgart, Germany */ |
|
5 |
/***************************************************************************/ |
|
2 |
/*****************************************************************************/ |
|
3 |
/* How to use the TreeTagger */ |
|
4 |
/*****************************************************************************/ |
|
6 | 5 |
|
7 | 6 |
|
8 | 7 |
The TreeTagger consists of two programs: train-tree-tagger is used to |
... | ... | |
24 | 23 |
input will be read from stdin. If neither an input file nor an output file |
25 | 24 |
is specified, the tagger will print to stdout. |
26 | 25 |
|
27 |
tree-tagger {-options-} <parameter file> {<input file> {<output file>}} |
|
26 |
tree-tagger <parameter file> <input file> <output file> {-eps <epsilon>} |
|
27 |
{-base} {-proto} {-sgml} {-token} {-lemma} {-beam <threshold>} |
|
28 | 28 |
|
29 | 29 |
Description of the command line arguments: |
30 | 30 |
|
31 | 31 |
* <parameter file>: Name of a parameter file which was created with the |
32 | 32 |
train-tree-tagger program. |
33 | 33 |
* <input file>: Name of the file which is to be tagged. Each token in this |
34 |
file has to be on a separate line. Tokens may contain blanks. It is possible
|
|
34 |
file must be on a separate line. Tokens may contain blanks. It is possible
|
|
35 | 35 |
to override the lexical information contained in the parameter file of the |
36 | 36 |
tagger by specifying a list of possible tags after a token. This list has |
37 |
to be preceded by a tab character and the elements are separated by tab |
|
38 |
characters. This pretagging feature could be used e.g. to ensure that |
|
37 |
to be preceded by a tab character. The tags are optionally followed by a |
|
38 |
floating point value to specify the probability of the tag. Adding such |
|
39 |
tag information in the tagger's input is sometimes useful to ensure that |
|
39 | 40 |
certain text-specific expressions are tagged properly. |
40 | 41 |
Punctuation marks must be on separate lines as well. Clitics (like "'s", |
41 | 42 |
"'re", and "'d" in English or "-la" and "-t-elle" in French) should be |
42 | 43 |
separated if they were separated in the training data. (The French and |
43 |
English parameter files available by ftp expect separation of clitics). |
|
44 |
English parameter files available by ftp expect separation of clitics).
|
|
44 | 45 |
Sample input file: |
45 | 46 |
He |
46 | 47 |
moved |
47 | 48 |
to |
48 |
New York City NP |
|
49 |
New York City NP 1.0
|
|
49 | 50 |
. |
50 | 51 |
* <output file>: Name of the file to which the tagger should write its output. |
51 | 52 |
|
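As a rough illustration of the input format described above, the following Python sketch writes one token per line and appends optional pretags after a tab character. The function name `format_tagger_input` is hypothetical, and the single space between a tag and its probability is an assumption (whitespace in this listing may have been collapsed); neither is part of the TreeTagger distribution.

```python
from typing import Iterable, List, Optional, Sequence, Tuple, Union

# A pretag is a tag plus an optional probability, e.g. ("NP", 1.0) or ("NP", None).
Pretag = Tuple[str, Optional[float]]
# An input item is either a bare token or a token with a list of pretags.
Item = Union[str, Tuple[str, Sequence[Pretag]]]


def format_tagger_input(items: Iterable[Item]) -> str:
    """Render tokens in one-token-per-line format.

    Tokens may contain blanks. Pretags follow the token, each preceded
    by a tab; a probability, if given, follows its tag after one space
    (assumed separator).
    """
    lines: List[str] = []
    for item in items:
        if isinstance(item, str):
            lines.append(item)
            continue
        token, pretags = item
        fields = [token]
        for tag, prob in pretags:
            fields.append(tag if prob is None else f"{tag} {prob}")
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


sample = format_tagger_input(
    ["He", "moved", "to", ("New York City", [("NP", 1.0)]), "."]
)
```

Here `sample` holds five lines, with punctuation on its own line and the multi-word token "New York City" kept on a single line together with its pretag.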
52 | 53 |
Further optional command line arguments: |
53 | 54 |
|
54 |
* -token: The words/tokens are printed in addition to the POS tags |
|
55 |
* -lemma: Lemmas are printed as well. |
|
56 |
* -sgml: This option instructs the tagger to ignore tokens which start |
|
57 |
with '<' and end with '>' (SGML tags). |
|
58 |
* -lex <f>: The file <f> contains additional lexicon entries to be used |
|
59 |
by the tagger. The file format is identical to the format of the lexicon |
|
60 |
argument of the training program (see below). |
|
61 |
* -no-unknown: If an unknown word is encountered, emit the word form |
|
62 |
as lemma. This was previously the default behaviour. Now, the default |
|
63 |
behaviour is to print "<unknown>" as lemma. |
|
64 |
* -threshold <p>: This option tells the tagger to print all tags of a |
|
65 |
word with a probability higher than <p> times the largest probability. |
|
66 |
(The tagger will use a different algorithm in this case and the set of |
|
67 |
best tags might be different from the tags generated without this |
|
68 |
option.) |
|
69 |
* -prob: Print tag probabilities (in combination with option -threshold) |
|
70 |
* -pt-with-prob: If this option is specified, then each pretagging tag |
|
71 |
(see above) has to be followed by a whitespace and a tag probability |
|
72 |
value. |
|
73 |
* -pt-with-lemma: If this option is specified, then each pretagging tag |
|
74 |
(see above) has to be followed by a whitespace and a lemma. Lemmas may |
|
75 |
contain blanks. |
|
76 |
If both -pt-with-prob and -pt-with-lemma have been specified, then each |
|
77 |
pretagging tag is followed by a probability and a lemma in that order. |
|
78 |
* -hyphen-heuristics: needed for chunking. See below for more information |
|
79 |
about how to train a chunk parameter file with hyphen-heuristics. |
|
55 |
* -token: tells the tagger to print the words also. |
|
56 |
* -lemma: tells the tagger to print the lemmas of the words also. |
|
57 |
* -sgml: tells the tagger to ignore tokens starting with '<' and ending |
|
58 |
with '>' (SGML tags). |
|
59 |
* -threshold p: tells the tagger to print all tags of a word with a probability higher than p times the largest tag probability. (The tagger will use the Forward-Backward algorithm instead of the Viterbi algorithm to annotate the data if this option is used.) |
|
60 |
* -lex file: tells the tagger to read additional lexicon entries from "file". The lexicon format is the same as that of the lexicon file used for training. |
|
80 | 61 |
|
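The selection rule behind -threshold can be illustrated with a small Python sketch. It only shows the filtering step (keep every tag whose probability is higher than p times the largest probability); it does not reproduce how the tagger computes those probabilities. The function name `tags_above_threshold` and the example values are made up for this illustration.

```python
from typing import Dict, List, Tuple


def tags_above_threshold(tag_probs: Dict[str, float], p: float) -> List[Tuple[str, float]]:
    """Return the (tag, probability) pairs whose probability is strictly
    higher than p times the largest probability, most probable first."""
    if not tag_probs:
        return []
    cutoff = p * max(tag_probs.values())
    kept = [(tag, prob) for tag, prob in tag_probs.items() if prob > cutoff]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical tag distribution for a single word.
example = {"NN": 0.6, "VB": 0.3, "JJ": 0.05}
loose = tags_above_threshold(example, 0.2)   # cutoff is 0.2 * 0.6
strict = tags_above_threshold(example, 0.9)  # cutoff is 0.9 * 0.6
```

A small p keeps more alternatives per word, while a p close to 1 keeps only tags that are nearly as probable as the best one.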
81 |
The options below are for advanced users. Please, read the papers on the
|
|
82 |
TreeTagger to fully understand their meaning.
|
|
62 |
The options below are for advanced users. Read the papers on the TreeTagger
|
|
63 |
to fully understand their meaning. |
|
83 | 64 |
|
84 | 65 |
* -proto: If this option is specified, the tagger creates a file named |
85 | 66 |
"lexicon-protocol.txt", which contains information about the degree of |
... | ... | |
90 | 71 |
hyphen has been found in the fullform lexicon. |
91 | 72 |
* -eps <epsilon>: Value which is used to replace zero lexical frequencies. |
92 | 73 |
This is the case if a word/tag pair is contained in the lexicon but not |
93 |
in the training corpus. The choice of this parameter has only minor |
|
94 |
influence on the tagging accuracy. |
|
74 |
in the training corpus. The default is 0.1. The choice of this parameter |
|
75 |
has some minor influence on tagging accuracy. |
|
76 |
* -beam <threshold>: If the tagger is slow, this option can be used to speed it up. |
|
77 |
Good values for <threshold> are in the range 0.001-0.00001. |
|
95 | 78 |
* -base: If this option is specified, only lexical information is used |
96 | 79 |
for tagging but no contextual information about the preceding tags. |
97 | 80 |
This option is only useful in order to obtain a baseline result |
98 | 81 |
to which to compare the actual tagger output. |
99 | 82 |
|
83 |
There is another tagger program called "tree-tagger-flush" which |
|
84 |
flushes the output after reading an empty line. It expects a parameter |
|
85 |
file as argument and reads from stdin and writes to stdout. No command |
|
86 |
line options are supported. This program might be useful for |
|
87 |
implementing wrappers. |
|
100 | 88 |
|
101 | 89 |
|
90 |
|
|
91 |
|
|
102 | 92 |
Training |
103 | 93 |
-------- |
104 | 94 |
|
105 | 95 |
Training is done with the *train-tree-tagger* program. It expects at least |
106 | 96 |
four command line arguments which are described below. |
107 | 97 |
|
108 |
train-tree-tagger {options} <lexicon> <open class file> <input file> <output file> |
|
98 |
train-tree-tagger <lexicon> <open class file> <input file> <output file> |
|
99 |
{-cl <context length>} {-dtg <min. decision tree gain>} |
|
100 |
{-ecw <eq. class weight>} {-atg <affix tree gain>} {-st <sent. tag>} |
|
109 | 101 |
|
110 | 102 |
Description of the command line arguments: |
111 | 103 |
|
112 | 104 |
* <lexicon>: name of a file which contains the fullform lexicon. Each line |
113 | 105 |
of the lexicon corresponds to one word form and contains the word form |
114 |
and a sequence of tag-lemma pairs. Each tag is preceded by a tab character
|
|
115 |
and each lemma is preceded by a blank or tab character.
|
|
106 |
itself followed by a Tab character and a sequence of tag-lemma pairs.
|
|
107 |
The tags and lemmata are separated by whitespace.
|
|
116 | 108 |
Example: |
117 | 109 |
|
118 | 110 |
aback RB aback |
119 | 111 |
abacuses NNS abacus |
120 |
abandon VB abandon VBP abandon
|
|
121 |
abandoned JJ abandoned VBD abandon VBN abandon
|
|
112 |
abandon VB abandon VBP abandon
|
|
113 |
abandoned JJ abandoned VBD abandon VBN abandon
|
|
122 | 114 |
abandoning VBG abandon |
123 | 115 |
|
124 |
Attention: Ordinal and cardinal numbers which consist of digits |
|
125 |
(like 1, 13, 1278 or 2. and 75.) should not be included in the |
|
126 |
lexicon. Otherwise, the tagger will not be able to learn how to tag |
|
127 |
numbers which are not listed in the lexicon. Numbers with unusual |
|
128 |
tags should be added to the lexicon, however. If the training |
|
129 |
program reports an error because the POS tag used for numbers is |
|
130 |
unknown, you should add a lexicon entry for one number. |
|
116 |
Remark: The tagger doesn't need the lemmata actually. If you do not have |
|
117 |
the lemma information or if you do not plan to annotate corpora with |
|
118 |
lemmas, you can replace the lemma with a dummy value, e.g. "-". |
|
131 | 119 |
|
132 |
Remark: The tagger doesn't need the lemmas for tagging actually. If |
|
133 |
you do not have the lemma information or if you do not plan to |
|
134 |
annotate corpora with lemmas, you can replace the lemma with a dummy |
|
135 |
value, e.g. "-". |
|
136 |
|
|
137 |
You can use the Perl script make-lex.perl as follows in order to |
|
138 |
create a tagger lexicon from the training corpus: |
|
139 |
cmd/make-lex.perl corpus > lexicon |
|
140 |
|
|
141 |
If you have additional lexicon entries stored in a separate file "lex" |
|
142 |
with entries like this (The POS tag is preceded by a tab character.) |
|
143 |
aback RB aback |
|
144 |
aback RP aback |
|
145 |
abacs NNS abac |
|
146 |
you can include them as follows: cmd/make-lex.perl corpus lex > lexicon |
|
147 |
|
|
148 |
If train-tree-tagger complains about unknown tags, just add another |
|
149 |
entry to the lexicon with the respective POS tag. |
|
150 |
|
|
151 |
* <open class file>: name of a file which contains a list of open class tags |
|
152 |
i.e. possible tags of unknown word forms separated by whitespace. |
|
120 |
* <open class file>: name of a file which contains a list of open class |
|
121 |
tags i.e. possible tags of unknown word forms separated by whitespace. |
|
153 | 122 |
The tagger will use this information when it encounters unknown words, |
154 | 123 |
i.e. words which are not contained in the lexicon. |
155 | 124 |
Example: (for Penn Treebank tagset) |
... | ... | |
159 | 128 |
* <input file>: name of a file which contains tagged training data. The data |
160 | 129 |
must be in one-word-per-line format. This means that each line contains |
161 | 130 |
one token and one tag in that order separated by a tabulator. |
162 |
Punctuation marks are considered as tokens and must be tagged as well. |
|
163 |
The file should neither contain empty lines nor untagged SGML markup. |
|
131 |
Punctuation marks are considered as tokens and must have been tagged as well. |
|
164 | 132 |
Example: |
165 | 133 |
|
166 | 134 |
Pierre NP |
... | ... | |
177 | 145 |
|
178 | 146 |
* -st <sent. tag>: the end-of-sentence part-of-speech tag, i.e. the tag which |
179 | 147 |
is assigned to sentence punctuation like ".", "!", "?". |
180 |
Default is "SENT". You have to use this option, if your tag for sentence |
|
181 |
punctuation is not "SENT". If you have more than one such tag, choose the |
|
182 |
most frequent one. |
|
183 |
* -utf8 assume that the data is encoded with UTF8 |
|
148 |
Default is "SENT". It is important to set this option properly, if your |
|
149 |
tag for sentence punctuation is not "SENT". |
|
184 | 150 |
* -cl <context length>: number of preceding words forming the statistical |
185 | 151 |
context. The default is 2 which corresponds to a trigram context. For |
186 | 152 |
small training corpora and/or large tagsets, it could be useful to reduce |
187 | 153 |
this parameter to 1. |
188 | 154 |
* -dtg <min. decision tree gain>: Threshold - If the information gain at a |
189 | 155 |
leaf node of the decision tree is below this threshold, the node is deleted. |
190 |
* -sw <weight>: A smoothing parameter, which determines how much the |
|
191 |
probability distribution of some decision tree node is smoothed with the |
|
192 |
probability distribution of the parent node. |
|
156 |
The default value is 0.7. |
|
193 | 157 |
* -ecw <eq. class weight>: weight of the equivalence class based probability |
194 |
estimates. |
|
158 |
estimates. The default is 0.15.
|
|
195 | 159 |
* -atg <affix tree gain>: Threshold - If the information gain at a leaf of an |
196 | 160 |
affix tree is below this threshold, it is deleted. The default is 1.2. |
197 | 161 |
|
198 |
The accuracy of the TreeTagger usually improves, if different settings |
|
199 |
of the above parameters are tested and the best combination is chosen. |
|
200 |
|
|
201 |
|
|
202 |
Caveat: Make sure that the lexicon and the training corpus contain no |
|
203 |
extra blanks. If the word form, for instance, is followed by a blank |
|
204 |
and a tab character, the blank will be considered part of the word. |
|
205 |
|
|
206 |
The script 'cmd/create-pos-parameter-file' can be used to train a |
|
207 |
parameter file, provided the file: 'lib/open-class-tags' exists. |
|
208 |
The script creates a lexicon and a parameter file |
|
209 |
that both will be stored in the lib-directory. |
|
210 |
|
|
211 |
|
|
212 |
Training a parameter file for chunking |
|
213 |
--------------------------------------- |
|
214 |
|
|
215 |
(This section of the README file was created by Wiebke Wagner.) |
|
216 |
|
|
217 |
Training is done with the *train-tree-tagger* program just like the training |
|
218 |
of part-of-speech parameter files. The input files differ (see below). |
|
219 |
|
|
220 |
train-tree-tagger {options} <lexicon> <open class file> <input file> <output file> |
|
221 |
|
|
222 |
Description of the command line arguments: |
|
223 |
|
|
224 |
* <lexicon>: name of a file which contains the fullform lexicon. Each line |
|
225 |
of the lexicon corresponds to one word form and contains the word form |
|
226 |
with its pos-tag and a sequence of chunktag-lemma pairs. Since there is |
|
227 |
no lemma for a string containing a word and its pos, lemma is just a |
|
228 |
dummy-placeholder. Each chunktag |
|
229 |
is preceded by a tab character and each dummy-lemma is preceded by a blank or tab character. |
|
230 |
Example: |
|
231 |
|
|
232 |
Abs-NP NP/I-NC # NP/B-NC # |
|
233 |
Academic-NP NP/I-NC # |
|
234 |
Activated-VVN VBN/B-VC # VBN/B-NC # |
|
235 |
Activation-NN NN/I-NC # NN/B-NC # |
|
236 |
VB VB/I-VC # |
|
237 |
|
|
238 |
The lexicon must contain specific entries that contain a hyphen |
|
239 |
('WORD-POS POS/IOB DUMMY-LEMMA') and general entries without a hyphen |
|
240 |
('POS POS/IOB DUMMY-LEMMA'). The hyphen heuristics enable the |
|
241 |
program to select the general entry if no specific entry is available. |
|
242 |
|
|
243 |
You can use the Perl script make-chunk-lex.perl as follows in order to |
|
244 |
create a tagger-chunker lexicon from the training corpus: |
|
245 |
'cmd/make-chunk-lex.perl' corpus > chunker-lexicon |
|
246 |
|
|
247 |
Attention: if 'train-tree-tagger' shows the error message: |
|
248 |
'ERROR: Sentence punctuation tag "SENT" is not in lexicon!' |
|
249 |
add the option: [-st 'SENT/O'] to the system call. |
|
250 |
|
|
251 |
If train-tree-tagger complains about unknown tags, just add another |
|
252 |
entry to the lexicon with the respective POS tag. |
|
253 |
|
|
254 |
* <open class file>: name of a file which contains only the dummy entry: |
|
255 |
NN/B-NC. If there are general entries in the lexicon for every |
|
256 |
word class, and if the hyphen heuristic is activated no tags |
|
257 |
have to be guessed. |
|
258 |
|
|
259 |
* <input file>: name of a file which contains training data annotated |
|
260 |
with part-of-speech tags and chunk tags. The data |
|
261 |
must be in one-word-per-line format. This means that each line contains |
|
262 |
one token with its mark-up: |
|
263 |
WORD-POS POS/IOB DUMMY-LEMMA |
|
264 |
The file should neither contain empty lines nor untagged SGML markup. |
|
265 |
Example: |
|
266 |
|
|
267 |
Activation-NN NN/I-NC |
|
268 |
of-IN IN/B-PC |
|
269 |
the-DT DT/B-NC |
|
270 |
CD28-NP NP/I-NC |
|
271 |
surface-NN NN/I-NC |
|
272 |
|
|
273 |
* <output file>: name of the file in which the resulting tagger parameters |
|
274 |
are stored. |
|
275 |
|
|
276 |
The optional parameters are the same as for training a part-of-speech parameter file. |
|
277 |
|
|
278 |
The accuracy of the TreeTagger usually improves, if different settings |
|
279 |
of the above parameters are tested and the best combination is chosen. |
|
280 |
|
|
281 |
Caveat: Make sure that the lexicon and the training corpus contain no |
|
282 |
extra blanks. If the word form, for instance, is followed by a blank |
|
283 |
and a tab character, the blank will be considered part of the word. |
|
284 |
|
|
285 |
The script 'cmd/create-chunk-parameter-file' can be used to train a chunk |
|
286 |
parameter file, provided the file: 'lib/open-class-chunks' exists and |
|
287 |
contains the dummy entry: |
|
288 |
NN/B-NC |
|
289 |
The script creates a lexicon and a parameter file |
|
290 |
that both will be stored in the lib-directory. |
|
162 |
The accuracy of the TreeTagger is usually slightly improved, if different |
|
163 |
settings of the above parameters are tested and the best combination is |
|
164 |
chosen. |
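The lexicon and training-corpus formats described in the Training section of the README can be sketched in Python as follows. This is not the bundled make-lex.perl script, only a minimal illustration under stated assumptions: the function name `build_lexicon_lines` is hypothetical, the lemma is always replaced by the dummy value "-", and each tag is preceded by a tab with the lemma following after a blank.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_lexicon_lines(corpus_lines: Iterable[str], dummy_lemma: str = "-") -> List[str]:
    """Build lexicon lines from a one-word-per-line tagged corpus.

    Each corpus line must be '<token><TAB><tag>'. Blanks are not stripped,
    because a blank before the tab would count as part of the word form.
    """
    tags_by_word: Dict[str, List[str]] = defaultdict(list)
    for number, line in enumerate(corpus_lines, start=1):
        line = line.rstrip("\n")
        parts = line.split("\t")
        if len(parts) != 2 or not parts[0] or not parts[1]:
            raise ValueError(f"line {number}: expected '<token><TAB><tag>', got {line!r}")
        word, tag = parts
        if tag not in tags_by_word[word]:
            tags_by_word[word].append(tag)  # keep first-seen order of tags
    return [
        word + "".join(f"\t{tag} {dummy_lemma}" for tag in tags)
        for word, tags in sorted(tags_by_word.items())
    ]


corpus = ["abandon\tVB\n", "abandon\tVBP\n", "aback\tRB\n", "abandon\tVB\n"]
lexicon = build_lexicon_lines(corpus)
```

The sketch rejects empty lines and lines without exactly one tab, since the training file is expected to hold one token and one tag per line.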
TXM/trunk/bundles/org.txm.treetagger.core.win32/res/win32/INSTALL.txt (revision 3926) | ||
---|---|---|
1 |
|
|
2 |
/****************************************************************************/ |
|
3 |
/* How to install the Windows version of the TreeTagger */ |
|
4 |
/* */ |
|
5 |
/* Author: Helmut Schmid, CIS, Ludwig-Maximilians-Universität, Germany */ |
|
6 |
/****************************************************************************/ |
|
7 |
|
|
8 |
This is the Windows distribution of the TreeTagger. |
|
9 |
|
|
10 |
It contains the following files: |
|
11 |
|
|
12 |
- tree-tagger.exe: the tagger program |
|
13 |
|
|
14 |
- train-tree-tagger.exe: the training program |
|
15 |
|
|
16 |
- utf8-tokenize.perl: A Perl script which transforms the tagger input |
|
17 |
into one-word-per-line format |
|
18 |
|
|
19 |
- *-abbreviations: abbreviation lists required by the tokenizer |
|
20 |
|
|
21 |
- tag-*.bat: batch files for different languages which call |
|
22 |
the tokeniser and the tagger |
|
23 |
|
|
24 |
- chunk-*.bat batch files for POS tagging and chunking |
|
25 |
|
|
26 |
|
|
27 |
Installation |
|
28 |
------------ |
|
29 |
|
|
30 |
1. Install a Perl interpreter (if you have not already installed one). |
|
31 |
You can download a Perl interpreter for Windows for free at |
|
32 |
http://www.activestate.com/activeperl/ |
|
33 |
|
|
34 |
2. Move the TreeTagger directory to the root directory of drive C:. |
|
35 |
|
|
36 |
3. Download the PC parameter files for the languages you need, decompress |
|
37 |
them (e.g. using Winzip or 7zip) and move them to the subdirectory lib. |
|
38 |
Rename the parameter files to <language>-utf8.par |
|
39 |
Example: Rename french-par-linux-3.2-utf8.bin to french-utf8.par |
|
40 |
|
|
41 |
Non-UTF8 parameter files are not supported anymore. |
|
42 |
|
|
43 |
4. Add the path C:\TreeTagger\bin to the PATH environment variable. |
|
44 |
|
|
45 |
5. Open a shell and type the command |
|
46 |
set PATH=C:\TreeTagger\bin;%PATH% |
|
47 |
|
|
48 |
6. Change to the directory C:\TreeTagger |
|
49 |
|
|
50 |
7. Now you can test the tagger, e.g. by analyzing this file with the command |
|
51 |
tag-english INSTALL.txt |
|
52 |
|
|
53 |
If you install the TreeTagger in a different directory, you have to |
|
54 |
modify the first path in the batch files tag-*.bat. |
|
55 |
|
|
56 |
|
|
57 |
Michaela Atterer told me that she had difficulty installing the |
|
58 |
TreeTagger on a Windows XP system. She recommends the following |
|
59 |
work-around. |
|
60 |
|
|
61 |
4. Windows XP: |
|
62 |
-Right click on "My Computer" |
|
63 |
-Select the "Advanced" tab |
|
64 |
-Click on "Environment Variables" |
|
65 |
-click on New: enter PATH and C:\TreeTagger\bin\;%PATH% |
|
66 |
|
|
67 |
If the files have been unpacked into a single directory, you should |
|
68 |
restore the following directory structure: |
|
69 |
|
|
70 |
TreeTagger: |
|
71 |
INSTALL.txt README.txt bin cmd lib |
|
72 |
|
|
73 |
TreeTagger/bin: |
|
74 |
tag-english.bat tag-german.bat tag-spanish.bat tree-tagger.exe |
|
75 |
tag-french.bat tag-italian.bat train-tree-tagger.exe |
|
76 |
|
|
77 |
TreeTagger/cmd: |
|
78 |
mwl-lookup.perl tokenize.pl |
|
79 |
|
|
80 |
TreeTagger/lib: |
|
81 |
english-abbreviations german-abbreviations spanish-abbreviations |
|
82 |
french-abbreviations italian-abbreviations spanish-mwls |
|
83 |
|
|
84 |
|
|
85 |
Note that the TreeTagger comes without a graphical interface. You have |
|
86 |
to run it by entering a command in a command line window. If you prefer |
|
87 |
a graphical interface, try the one provided by Ciarán Ó Duibhín at |
|
88 |
http://www.smo.uhi.ac.uk/~oduibhin/oideasra/interfaces/winttinterface.htm |
|
89 |
|
|
1 |
|
|
2 |
/****************************************************************************/ |
|
3 |
/* How to install the Windows version of the TreeTagger */ |
|
4 |
/****************************************************************************/ |
|
5 |
|
|
6 |
This is the Windows distribution of the TreeTagger. |
|
7 |
|
|
8 |
It contains the following files: |
|
9 |
|
|
10 |
- tree-tagger.exe: the tagger program |
|
11 |
|
|
12 |
- train-tree-tagger.exe: the training program |
|
13 |
|
|
14 |
- utf8-tokenize.perl: A Perl script which transforms the tagger input |
|
15 |
into one-word-per-line format |
|
16 |
|
|
17 |
- *-abbreviations: abbreviation lists required by the tokenizer |
|
18 |
|
|
19 |
- tag-*.bat: batch files for different languages which call |
|
20 |
the tokeniser and the tagger |
|
21 |
|
|
22 |
- chunk-*.bat batch files for POS tagging and chunking |
|
23 |
|
|
24 |
|
|
25 |
Installation |
|
26 |
------------ |
|
27 |
|
|
28 |
1. Install a Perl interpreter (if you have not already installed one). |
|
29 |
You can download a Perl interpreter for Windows for free at |
|
30 |
http://www.perl.org |
|
31 |
|
|
32 |
2. Extract the zip file (if it was not extracted yet) and move the |
|
33 |
TreeTagger directory to the root directory of drive C:\. |
|
34 |
|
|
35 |
3. Download the parameter files for the languages you need, decompress |
|
36 |
them (e.g. using Winzip or 7zip) and move them to the subdirectory TreeTagger/lib. |
|
37 |
|
|
38 |
4. Add the path C:\TreeTagger\bin to the PATH environment |
|
39 |
variable. The necessary steps differ from one Windows version to the other. |
|
40 |
|
|
41 |
5. Open a command prompt window and type the command |
|
42 |
set PATH=C:\TreeTagger\bin;%PATH% |
|
43 |
|
|
44 |
6. Go to the directory C:\TreeTagger |
|
45 |
cd c:\TreeTagger |
|
46 |
|
|
47 |
7. Now you can test the tagger, e.g. by analyzing this file with the command |
|
48 |
tag-english INSTALL.txt |
|
49 |
|
|
50 |
The input should be encoded in UTF8. |
|
51 |
|
|
52 |
If you install the TreeTagger in a different directory, you have to |
|
53 |
modify the first path in the batch files tag-*.bat using an editor |
|
54 |
such as Wordpad. |
|
55 |
|
|
56 |
If the files have been unpacked into a single directory, you should |
|
57 |
restore the following directory structure: |
|
58 |
|
|
59 |
TreeTagger: |
|
60 |
INSTALL.txt README.txt bin cmd lib |
|
61 |
|
|
62 |
TreeTagger/bin: |
|
63 |
tag-english.bat tag-german.bat tag-spanish.bat tree-tagger.exe |
|
64 |
tag-french.bat tag-italian.bat train-tree-tagger.exe |
|
65 |
|
|
66 |
TreeTagger/cmd: |
|
67 |
mwl-lookup.perl tokenize.pl |
|
68 |
|
|
69 |
TreeTagger/lib: |
|
70 |
english-abbreviations german-abbreviations spanish-abbreviations |
|
71 |
french-abbreviations italian-abbreviations spanish-mwls |
|
72 |
|
|
73 |
|
|
74 |
Note that the TreeTagger comes without a graphical interface. You have |
|
75 |
to run it by entering a command in a command line window. If you prefer |
|
76 |
a graphical interface, try the one provided by Ciarán Ó Duibhín at |
|
77 |
http://www.smo.uhi.ac.uk/~oduibhin/oideasra/interfaces/winttinterface.htm |
|
78 |
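The directory structure listed in INSTALL.txt can be checked before running the batch files. The Python sketch below is only an illustration: the helper name `missing_subdirs` is hypothetical, and it checks nothing beyond the three subdirectories (bin, cmd, lib) named in the listing.

```python
from pathlib import Path
from typing import List

# Subdirectories that INSTALL.txt lists under the TreeTagger directory.
EXPECTED_SUBDIRS = ("bin", "cmd", "lib")


def missing_subdirs(root: Path) -> List[str]:
    """Return the expected subdirectories that do not exist under root."""
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # C:\TreeTagger is the install location suggested in INSTALL.txt.
    missing = missing_subdirs(Path(r"C:\TreeTagger"))
    if missing:
        print("Missing subdirectories:", ", ".join(missing))
    else:
        print("Directory structure looks complete.")
```

If the archive was unpacked into a single flat directory, the function reports all three names, which signals that the structure has to be restored by hand.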
|
TXM/trunk/bundles/org.txm.treetagger.core.win32/res/win32/lib/english-abbreviations (revision 3926) | ||
---|---|---|
1 |
&Co. |
|
2 |
AUG. |
|
1 | 3 |
Adm. |
2 | 4 |
Ala. |
5 |
Ald. |
|
6 |
App.Cas. |
|
3 | 7 |
Ariz. |
4 | 8 |
Ark. |
9 |
Assn. |
|
10 |
Assoc. |
|
11 |
Att. |
|
5 | 12 |
Aug. |
13 |
Av. |
|
6 | 14 |
Ave. |
7 | 15 |
Bancorp. |
16 |
Bde. |
|
8 | 17 |
Bhd. |
18 |
Blvd. |
|
9 | 19 |
Brig. |
10 | 20 |
Bros. |
21 |
C.-in-C. |
|
11 | 22 |
CO. |
12 | 23 |
CORP. |
13 | 24 |
COS. |
... | ... | |
16 | 27 |
Canada-U.S. |
17 | 28 |
Canadian-U.S. |
18 | 29 |
Capt. |
30 |
Cas. |
|
31 |
Ch. |
|
32 |
Ch.App. |
|
33 |
Ch.D. |
|
19 | 34 |
Cia. |
20 | 35 |
Cie. |
36 |
Cm. |
|
37 |
Cmd. |
|
38 |
Cmnd. |
|
21 | 39 |
Co. |
22 | 40 |
Col. |
23 | 41 |
Colo. |
24 | 42 |
Conn. |
25 | 43 |
Corp. |
26 | 44 |
Cos. |
45 |
Cowp. |
|
46 |
Cr.App.R. |
|
47 |
Crim.L.R. |
|
27 | 48 |
D-Mass. |
49 |
D.Litt. |
|
50 |
D.Phil. |
|
51 |
DFl. |
|
28 | 52 |
Dec. |
29 | 53 |
Del. |
54 |
Dep. |
|
30 | 55 |
Dept. |
56 |
Deut. |
|
57 |
Diod. |
|
58 |
Div. |
|
31 | 59 |
Dr. |
32 | 60 |
Drs. |
61 |
Dtr. |
|
62 |
Durn. |
|
63 |
E.g. |
|
64 |
ESQ. |
|
65 |
Eph. |
|
66 |
Eq. |
|
67 |
Eqn. |
|
68 |
Eqns. |
|
69 |
Esq. |
|
33 | 70 |
Etc. |
71 |
Exch. |
|
72 |
Exod. |
|
73 |
Ext. |
|
74 |
FIG. |
|
75 |
Fam. |
|
34 | 76 |
Feb. |
77 |
Fig. |
|
78 |
Figs. |
|
35 | 79 |
Fla. |
36 | 80 |
Ft. |
81 |
G.m.b.H. |
|
37 | 82 |
Ga. |
38 | 83 |
Gen. |
39 | 84 |
Gov. |
85 |
Hdt. |
|
40 | 86 |
Hon. |
41 | 87 |
INC. |
88 |
Ibid. |
|
42 | 89 |
Ill. |
43 | 90 |
Inc. |
44 | 91 |
Ind. |
92 |
InfoCorp. |
|
93 |
Intercorp. |
|
94 |
Invest. |
|
95 |
JJ. |
|
96 |
JR. |
|
45 | 97 |
Jan. |
46 | 98 |
Japan-U.S. |
47 | 99 |
Jr. |
100 |
Jud. |
|
48 | 101 |
Kan. |
49 | 102 |
Korean-U.S. |
50 | 103 |
Ky. |
104 |
L.JJ. |
|
105 |
L.R.Ir. |
|
106 |
LL.M. |
|
107 |
LTD. |
|
51 | 108 |
La. |
52 | 109 |
Lt. |
110 |
Lt.-Col. |
|
53 | 111 |
Ltd. |
112 |
Ltda. |
|
113 |
M.Ed. |
|
114 |
M.Litt. |
|
115 |
M.Phil. |
|
54 | 116 |
Maj. |
55 | 117 |
Mass. |
56 | 118 |
Md. |
119 |
Me.T.A. |
|
57 | 120 |
Messrs. |
58 | 121 |
Mfg. |
59 | 122 |
Mich. |
60 | 123 |
Minn. |
61 | 124 |
Miss. |
62 | 125 |
Mo. |
126 |
Mod.Rep. |
|
127 |
Mont. |
|
63 | 128 |
Mr. |
64 | 129 |
Mrs. |
65 | 130 |
Ms. |
66 | 131 |
Neb. |
67 | 132 |
Nev. |
68 | 133 |
No. |
134 |
Non-U.S. |
|
69 | 135 |
Nos. |
70 | 136 |
Nov. |
71 | 137 |
Oct. |
138 |
Oe. |
|
72 | 139 |
Okla. |
73 | 140 |
Ont. |
141 |
Op. |
|
74 | 142 |
Ore. |
143 |
P.o.s. |
|
75 | 144 |
Pa. |
76 | 145 |
Ph. |
146 |
Ph.D. |
|
147 |
Pp. |
|
148 |
Prev. |
|
77 | 149 |
Prof. |
78 | 150 |
Prop. |
151 |
Pte. |
|
152 |
Ptr. |
|
79 | 153 |
Pty. |
154 |
Reg. |
|
155 |
Regt. |
|
80 | 156 |
Rep. |
81 | 157 |
Reps. |
158 |
Repub. |
|
159 |
Ret. |
|
82 | 160 |
Rev. |
161 |
Rom. |
|
83 | 162 |
S.p.A. |
163 |
Sec. |
|
84 | 164 |
Sen. |
85 | 165 |
Sens. |
86 | 166 |
Sept. |
87 | 167 |
Sgt. |
168 |
Sh.Ct. |
|
88 | 169 |
Sino-U.S. |
170 |
Soc. |
|
171 |
Som. |
|
172 |
Soviet-U.S. |
|
173 |
Sp. |
|
89 | 174 |
Sr. |
90 | 175 |
St. |
91 | 176 |
Ste. |
177 |
Suff. |
|
178 |
Syll. |
|
179 |
T.B.G.A.S. |
|
92 | 180 |
Tenn. |
93 | 181 |
Tex. |
182 |
Thess. |
|
183 |
Thuc. |
|
184 |
Transp. |
|
185 |
Trop. |
|
94 | 186 |
U.S.-U.K. |
95 | 187 |
U.S.-U.S.S.R. |
188 |
U.S.P.G.A. |
|
189 |
Univ. |
|
190 |
V.-C. |
|
96 | 191 |
Va. |
192 |
Vict. |
|
193 |
Vol. |
|
97 | 194 |
Vt. |
98 | 195 |
W.Va. |
99 | 196 |
Wash. |
100 | 197 |
Wis. |
101 | 198 |
Wyo. |
199 |
a-Ex-dividend. |
|
200 |
a.c. |
|
201 |
a.g.m. |
|
102 | 202 |
a.k.a. |
103 | 203 |
a.m. |
204 |
al. |
|
104 | 205 |
anti-U.S. |
206 |
approx. |
|
207 |
b.s. |
|
208 |
bldg. |
|
209 |
c.c.d. |
|
210 |
c.e.o. |
|
211 |
c.f. |
|
212 |
c.g. |
|
213 |
c.v. |
|
214 |
c/s. |
|
105 | 215 |
cap. |
106 |
days. |
|
216 |
cf. |
|
217 |
ch. |
|
218 |
cit. |
|
219 |
clar. |
|
220 |
co. |
|
221 |
col. |
|
222 |
cols. |
|
223 |
constr. |
|
224 |
cp. |
|
225 |
cwt. |
|
226 |
d.c. |
|
227 |
d.f. |
|
228 |
d.i.l. |
|
229 |
d.p.c. |
|
230 |
def. |
|
231 |
dw. |
|
232 |
e-Estimated. |
|
233 |
e.g. |
|
234 |
e.m.f. |
|
235 |
e.p.s.p. |
|
236 |
edn. |
|
237 |
edns. |
|
238 |
est. |
|
107 | 239 |
etc. |
240 |
ex-L.C.C. |
|
241 |
fig. |
|
242 |
fl. |
|
243 |
fol. |
|
108 | 244 |
ft. |
245 |
gen. |
|
246 |
govt. |
|
247 |
h.p. |
|
248 |
hon. |
|
249 |
hrs. |
|
250 |
i.c. |
|
109 | 251 |
i.e. |
252 |
ibid. |
|
253 |
inc. |
|
254 |
incl. |
|
255 |
juv. |
|
256 |
k.p.h. |
|
257 |
l.e.d. |
|
258 |
lbs. |
|
259 |
loc. |
|
260 |
m.d. |
|
261 |
m.p.h. |
|
262 |
msec. |
|
263 |
n.d. |
|
264 |
n.m.r. |
|
265 |
non-U.K. |
|
110 | 266 |
non-U.S. |
267 |
norw. |
|
268 |
nos. |
|
269 |
oz. |
|
270 |
ozs. |
|
271 |
p. |
|
272 |
p.a. |
|
273 |
p.c. |
|
111 | 274 |
p.m. |
275 |
p.o.s. |
|
276 |
p.p.m. |
|
277 |
p.s.i. |
|
278 |
p.w. |
|
279 |
pl. |
|
280 |
pls. |
|
281 |
pos. |
|
282 |
pp. |
|
283 |
pres. |
|
112 | 284 |
president-U.S. |
285 |
pro-U.S. |
|
286 |
q.v. |
|
287 |
qq.v. |
|
288 |
r.f. |
|
289 |
r.h. |
|
290 |
r.m.s. |
|
291 |
r.m.s.d. |
|
292 |
r.p.m. |
|
293 |
r.s.s. |
|
294 |
ref. |
|
295 |
s. |
|
296 |
s.a. |
|
297 |
s.a.e. |
|
298 |
s.d. |
|
299 |
s.e.m. |
|
113 | 300 |
s.r.l. |
301 |
s.t.p. |
|
302 |
spp. |
|
303 |
sq.ft. |
|
304 |
sq.m. |
|
305 |
subss. |
|
114 | 306 |
v. |
115 | 307 |
v.B. |
116 | 308 |
v.w. |
309 |
var. |
|
310 |
viz. |
|
311 |
vol. |
|
312 |
vols. |
|
117 | 313 |
vs. |
314 |
w.c. |
TXM/trunk/bundles/org.txm.treetagger.core.win32/res/win32/lib/german-abbreviations (revision 3926) | ||
---|---|---|
1 |
-hdg. |
|
2 |
-tlg. |
|
3 |
12Tel. |
|
4 |
12st. |
|
5 |
1856ff. |
|
6 |
20j. |
|
7 |
31Tel. |
|
8 |
3jähr. |
|
9 |
40j. |
|
10 |
48So. |
|
11 |
62j. |
|
1 | 12 |
A. |
13 |
A.-Br. |
|
2 | 14 |
A.-G. |
15 |
A.-Kuntz-Str. |
|
16 |
A.D. |
|
3 | 17 |
A.G. |
18 |
A.N.A.C.R.F. |
|
19 |
A.T. |
|
4 | 20 |
ADN-Korr. |
21 |
AF-Volkskorr. |
|
22 |
AGA-Annonc. |
|
23 |
AR-Volkskorr. |
|
5 | 24 |
AT-Mot. |
25 |
AWG-Wohng. |
|
26 |
Aach. |
|
27 |
Aachn. |
|
28 |
Aamtl. |
|
6 | 29 |
Abb. |
30 |
Abbestr. |
|
31 |
Abdeckpl. |
|
32 |
Abdseg. |
|
33 |
Abendrestaur. |
|
34 |
Abenthumstr. |
|
7 | 35 |
Abess. |
36 |
Abf. |
|
37 |
Abg. |
|
38 |
Abk. |
|
8 | 39 |
Abl. |
9 | 40 |
Ablief.-Gew. |
41 |
Ablös. |
|
10 | 42 |
Abm. |
43 |
Abrechn. |
|
11 | 44 |
Abs. |
45 |
Abschlußprfg. |
|
46 |
Abschlußprüfg. |
|
47 |
Abschn. |
|
48 |
Abschr. |
|
49 |
Absetzbkt. |
|
50 |
Abspr. |
|
51 |
Abst. |
|
52 |
Abstellfl. |
|
53 |
Abstellpl. |
|
12 | 54 |
Abt. |
55 |
Abteistr. |
|
13 | 56 |
Abtlg. |
57 |
Acc. |
|
58 |
Achenbachstr. |
|
59 |
Achterkab. |
|
60 |
Achtermannstr. |
|
61 |
Ackermannstr. |
|
62 |
Act. |
|
63 |
Ad. |
|
64 |
Adalbertstr. |
|
65 |
Add. |
|
66 |
Adelheidstr. |
|
67 |
Adlerflychtstr. |
|
68 |
Adolfsbr. |
|
69 |
Adolfstr. |
|
70 |
Adr. |
|
71 |
Adt. |
|
72 |
Ag. |
|
73 |
Aggrip. |
|
14 | 74 |
Agl. |
15 | 75 |
Agt. |
76 |
Ahornstr. |
|
77 |
Aindorferstr. |
|
78 |
Ainmillerstr. |
|
79 |
AirC. |
|
80 |
Airl. |
|
81 |
Akad. |
|
82 |
Akk. |
|
83 |
Akkermannstr. |
|
84 |
Akt. |
|
85 |
Akt.-G. |
|
16 | 86 |
Akt.-Ges. |
87 |
Akt.-Nr. |
|
17 | 88 |
Aktbr. |
89 |
Aktien-Nr. |
|
90 |
Aktv. |
|
91 |
Aktz. |
|
92 |
Akz. |
|
93 |
Alarmanl. |
|
94 |
Albanusstr. |
|
95 |
Albertusstr. |
|
96 |
Albing. |
|
97 |
Alcan. |
|
98 |
Alem. |
|
99 |
Alemannenstr. |
|
100 |
Alex. |
|
101 |
Alexanderstr. |
|
102 |
Alexanderw. |
|
103 |
Alexandrastr. |
|
104 |
Alfa-Montr. |
|
105 |
Alfm. |
|
106 |
Alfredstr. |
|
18 | 107 |
Alg. |
108 |
Algem. |
|
109 |
Alk. |
|
110 |
Alleestr. |
|
111 |
Alleinbeauftr. |
|
112 |
Alleinbeauftragt. |
|
19 | 113 |
Alleininh. |
114 |
Allerstr. |
|
20 | 115 |
Allg. |
116 |
Allg.-Bildg. |
|
117 |
Allgem. |
|
118 |
Allianz-Leb. |
|
119 |
Allianz-Vers. |
|
120 |
Almanasrh. |
|
121 |
Alnmillerstr. |
|
122 |
Alph. |
|
123 |
Alt-Schmargend. |
|
124 |
Alt-Schönebg. |
|
125 |
Alt.-Kalt. |
|
126 |
Altbt. |
|
127 |
Altenbg. |
|
128 |
Altentagesst. |
|
129 |
Altersbeschr. |
|
130 |
Altheimstr. |
|
131 |
Altkönigstr. |
|
132 |
Altluth. |
|
133 |
Altsäng. |
|
134 |
Altw. |
|
21 | 135 |
Altwageneint. |
22 | 136 |
Alu-Felg. |
137 |
Alu. |
|
138 |
Alufelg. |
|
23 | 139 |
Alum. |
24 | 140 |
Am. |
141 |
Amat. |
|
142 |
Amb. |
|
143 |
Ambg. |
|
144 |
Amer. |
|
25 | 145 |
Amp. |
146 |
Amsterd. |
|
147 |
Amtl. |
|
148 |
Anal. |
|
149 |
Anb. |
|
150 |
Andlerflychtstr. |
|
151 |
Andreasstr. |
|
152 |
Andrestr. |
|
153 |
Andréstr. |
|
154 |
Anerkg. |
|
26 | 155 |
Anf. |
27 | 156 |
Anfr. |
28 | 157 |
Anfrag. |
158 |
Anfäng. |
|
29 | 159 |
Ang. |
160 |
Angab. |
|
30 | 161 |
Angb. |
31 | 162 |
Angeb. |
163 |
Angeh. |
|
164 |
Angertorstr. |
|
165 |
Angest. |
|
32 | 166 |
Angl. |
167 |
Anglo-Am. |
|
168 |
Anh. |
|
33 | 169 |
Anhängerkuppl. |
170 |
Ank. |
|
34 | 171 |
Anl. |
172 |
Anl.-Obj. |
|
173 |
Anl.-Verm. |
|
174 |
Anlageverm. |
|
35 | 175 |
Anleih. |
176 |
Anm. |
|
177 |
Anmeld. |
|
36 | 178 |
Ann.-Exp. |
37 | 179 |
Ann.-Exped. |
180 |
Annastr. |
|
181 |
Annenstr. |
|
182 |
Annoncen-Exp. |
|
183 |
Annoncen-Exped. |
|
184 |
Annoncen-Expedit. |
|
185 |
Anr. |
|
186 |
Anschl. |
|
187 |
Anspr. |
|
188 |
Anst. |
|
38 | 189 |
Ant. |
39 | 190 |
Anten. |
191 |
Antiquitätenhandlg. |
|
192 |
Antoniterstr. |
|
193 |
Antr. |
|
194 |
Antw. |
|
195 |
Antwerp. |
|
40 | 196 |
Anz. |
197 |
Anz.-Ann. |
|
41 | 198 |
Anz.-Exp. |
199 |
Anz.-Exped. |
|
200 |
Anz.-Ges. |
|
201 |
Anz.-Mittl. |
|
42 | 202 |
Anz.-Verm. |
43 | 203 |
Anzahlg. |
204 |
Anzeig.-Exped. |
|
205 |
Anzeig.-Vermittl. |
|
206 |
Anzhl. |
|
44 | 207 |
Anzhlg. |
208 |
Ap. |
|
209 |
Apart. |
|
210 |
Apostelgem. |
|
45 | 211 |
Apoth. |
212 |
Apothek. |
|
46 | 213 |
App. |
214 |
Appart. |
|
47 | 215 |
Appartem. |
216 |
Appel-Fk. |
|
217 |
Apr. |
|
218 |
April-Abl. |
|
48 | 219 |
April-Lief. |
220 |
April-Mai-Abl. |
|
221 |
Apt. |
|
222 |
Arb. |
|
223 |
Arbeitsgemeinsch. |
|
224 |
Arbg. |
|
225 |
Arbn. |
|
226 |
Arbrechtl. |
|
227 |
Archäol. |
|
228 |
Arcisstr. |
|
49 | 229 |
Argent. |
230 |
Argentin. |
|
231 |
Argonnenstr. |
|
232 |
Armenruhstr. |
|
233 |
Arminiusstr. |
|
234 |
Arndstr. |
|
235 |
Arndtstr. |
|
236 |
Arnsburgerstr. |
|
237 |
Arnsburgg. |
|
238 |
Art. |
|
239 |
Arzneim.-Großhlg. |
|
240 |
Asahi-Chem. |
|
241 |
Asc. |
|
242 |
Aschaffb. |
|
243 |
Aschaffenb. |
|
244 |
Aschaffenbg. |
|
245 |
Ascheb. |
|
246 |
Aschffbg. |
|
247 |
Asiat. |
|
248 |
Ass. |
|
249 |
Assist. |
|
250 |
At. |
|
251 |
Ath. |
|
252 |
Athosstr. |
|
50 | 253 |
Atl. |
254 |
Atlas-Lev. |
|
255 |
Atlasw. |
|
256 |
Atmosph. |
|
257 |
Audenstr. |
|
51 | 258 |
Aufb. |
259 |
Aufbaugeb. |
|
260 |
Aufbrauch-Erkr. |
|
261 |
Aufg. |
|
262 |
Aufgabengeb. |
|
263 |
Aufl. |
|
264 |
Aufn. |
|
265 |
Aufnahmebed. |
|
52 | 266 |
Aufst. |
267 |
Auftr. |
|
268 |
Auftr.-Nr. |
|
269 |
Aufz. |
|
53 | 270 |
Aug. |
271 |
Aug.-Abl. |
|
272 |
Aug.-Wilh. |
|
54 | 273 |
Augsb. |
274 |
Augsburgerstr. |
|
275 |
August-Bebel-Str. |
|
276 |
Augustinusstr. |
|
277 |
Aunelstr. |
|
278 |
Aureliusstr. |
|
279 |
Ausbau-Abt. |
|
280 |
Ausbaumöglk. |
|
281 |
Ausbeutg. |
|
282 |
Ausbild. |
|
283 |
Ausbildg. |
|
284 |
Ausf. |
|
285 |
Ausführ. |
|
286 |
Ausführg. |
|
55 | 287 |
Ausg. |
56 | 288 |
Ausgl. |
57 | 289 |
Ausk. |
290 |
Auskft. |
|
58 | 291 |
Ausl. |
59 | 292 |
Ausl.-Akt. |
293 |
Ausl.-Anl. |
|
60 | 294 |
Auslandsanl. |
61 | 295 |
Auslandsb. |
296 |
Auslief. |
|
297 |
Auslieferungsgew. |
|
298 |
Auslieferungslag. |
|
299 |
Ausn. |
|
300 |
Auss. |
|
301 |
Ausseh. |
|
62 | 302 |
Ausst. |
303 |
Ausstatt. |
|
63 | 304 |
Ausstattg. |
305 |
Ausstellg. |
|
306 |
Ausstellungspl. |
|
307 |
Ausstg. |
|
308 |
Aust. |
|
309 |
Austatt. |
|
310 |
Austr. |
|
64 | 311 |
Austral. |
312 |
Austrofds. |
|
313 |
Auswandg. |
|
314 |
Auswert. |
|
65 | 315 |
Ausz. |
66 | 316 |
Aut. |
317 |
Aut.-Mittelsch. |
|
318 |
Auto-Abstellpl. |
|
319 |
Autob. |
|
320 |
Autob.-Abf. |
|
321 |
Autobahnanschl. |
|
322 |
Autobahnausf. |
|
323 |
Autohs. |
|
67 | 324 |
Autom. |
68 |
Automat. |
|
69 | 325 |
Automin. |
326 |
Automob. |
|
327 |
Autopist. |
|
328 |
Autopl. |
|
329 |
Av. |
|
330 |
Avda. |
|
331 |
Averhoffstr. |
|
332 |
Az. |
|
70 | 333 |
B. |
334 |
B.-Ang. |
|
335 |
B.-Angest. |
|
336 |
B.-E. |
|
337 |
B.-Str. |
|
338 |
B.O.A.C. |
|
71 | 339 |
B.P. |
340 |
B.P.a. |
|
341 |
B.Wttbg. |
|
72 | 342 |
BGBl. |
343 |
BMW-Niederl. |
|
344 |
Babenh. |
|
345 |
Bachmannstr. |
|
346 |
Bachmannstrr. |
|
347 |
Bachstr. |
|
348 |
Backhausstr. |
|
349 |
Bad. |
|
350 |
Bad.-Württ. |
|
351 |
Bad.-Württemb. |
|
352 |
Badeabt. |
|
353 |
Badeabtlg. |
|
354 |
Baden-württ. |
|
355 |
Badenw. |
|
356 |
Badenwk. |
|
357 |
Badezi. |
|
358 |
Badezim. |
|
359 |
Badstr. |
|
360 |
Bagh. |
|
361 |
Bahnanschl. |
|
362 |
Bahnhofsstr. |
|
363 |
Bahnhofst. |
|
73 | 364 |
Bahnhofstr. |
365 |
Bahnstr. |
|
366 |
Bal. |
|
74 | 367 |
Balk. |
368 |
Balkonzim. |
|
369 |
Baltim. |
|
370 |
Bamb. |
|
371 |
Bambg. |
|
372 |
Bandelstr. |
|
373 |
Bangertstr. |
|
374 |
Bankdir. |
|
375 |
Bankges. |
|
376 |
Bankhyp. |
|
377 |
Bankkaufm. |
|
378 |
Bankprax. |
|
379 |
Bankv. |
|
380 |
Bansastr. |
|
381 |
Baranz. |
|
382 |
Barauszhlg. |
|
383 |
Barbarossastr. |
|
384 |
Barckhausstr. |
|
385 |
Bardostr. |
|
386 |
Barkap. |
|
387 |
Barpr. |
|
388 |
Barz. |
|
389 |
Barzahl. |
|
390 |
Barzahlg. |
|
391 |
Basaltstr. |
|
392 |
Basketb. |
|
393 |
Bastelausst. |
|
394 |
Battonnstr. |
|
395 |
Battonstr. |
|
75 | 396 |
Bau-Ing. |
397 |
Bau-Zem. |
|
398 |
Bauabwickl. |
|
399 |
Baugenehmig. |
|
76 | 400 |
Bauges. |
401 |
Baugesch. |
|
402 |
Baugrdst. |
|
403 |
Baugroßhdlg. |
|
404 |
Baugrundst. |
|
405 |
Bauing. |
|
77 | 406 |
Bauj. |
407 |
Bauk.-Zusch. |
|
408 |
Baukostenzusch. |
|
409 |
Baumasch. |
|
410 |
Baumbest. |
|
411 |
Baumstr. |
|
412 |
Baumw. |
|
413 |
Baumwollsaatöl. |
|
414 |
Baupl. |
|
415 |
Bauspard. |
|
416 |
Baust. |
|
417 |
Bauw. |
|
418 |
Bauzust. |
|
78 | 419 |
Bay. |
79 |
Bayer. |
|
420 |
Bay.-ObLG. |
|
421 |
Bayern-Münch. |
|
422 |
Bayernwk. |
|
423 |
Bayerstr. |
|
424 |
Bayr. |
|
80 | 425 |
Bb. |
426 |
Bch. |
|
81 | 427 |
Bd. |
428 |
Bd.-Reg. |
|
429 |
Bd.-Str. |
|
430 |
Bdbahn. |
|
431 |
Bde. |
|
432 |
Bde.. |
|
433 |
Bds. |
|
434 |
Beamt. |
|
435 |
Beamtenhaush. |
|
436 |
Bearbeitg. |
|
437 |
Beauftr. |
|
438 |
Bechtenwaldstr. |
|
82 | 439 |
Bed. |
440 |
Bedg. |
|
441 |
Bedieng. |
|
83 | 442 |
Beding. |
443 |
Beethovenstr. |
|
444 |
Befäh. |
|
445 |
Beg. |
|
446 |
Begl. |
|
447 |
Begr. |
|
448 |
Begründg. |
|
449 |
Behandl. |
|
450 |
Behdl. |
|
451 |
Behrenstr. |
|
452 |
Behring-Krkh. |
|
453 |
Beiersd. |
|
454 |
Beih. |
|
455 |
Beisetzg. |
|
456 |
Beisp. |
|
457 |
Beitr. |
|
458 |
Beitr.-rückerst. |
|
459 |
Beitr.rückerst. |
|
460 |
Beitragserst. |
|
461 |
Beiw. |
|
462 |
Beleucht. |
|
463 |
Belg. |
|
464 |
Bem. |
|
465 |
Benf. |
|
466 |
Benz. |
|
467 |
Beobacht. |
|
468 |
Bequemlichk. |
|
84 | 469 |
Ber. |
470 |
Berat. |
|
85 | 471 |
Beratg. |
472 |
Beratungsst. |
|
86 | 473 |
Bereif. |
474 |
Berg. |
|
87 | 475 |
Bergb. |
476 |
Bergbl. |
|
477 |
Bergerstr. |
|
88 | 478 |
Bergstr. |
479 |
Bergw. |
|
480 |
Berh. |
|
481 |
Berl. |
|
482 |
Berlinerstr. |
|
89 | 483 |
Bernh. |
484 |
Bernusstr. |
|
485 |
Bertramstr. |
|
486 |
Berufl. |
|
487 |
Berufsauss. |
|
488 |
Berufserfahr. |
|
489 |
Berufsvorbildg. |
|
90 | 490 |
Bes. |
491 |
Bes.-Gr. |
|
492 |
Besch. |
|
493 |
Beschl. |
|
91 | 494 |
Besichtig. |
495 |
Bespr. |
|
496 |
Best. |
|
497 |
Best.-Nr. |
|
498 |
Bestellnr. |
|
499 |
Bestz. |
|
92 | 500 |
Bestzust. |
501 |
Bestzustd. |
|
502 |
Besucherz. |
|
503 |
Bet. |
|
504 |
Betckestr. |
|
93 | 505 |
Beteil. |
94 | 506 |
Beteilig. |
507 |
Bethl. |
|
508 |
Bethmannstr. |
|
95 | 509 |
Betr. |
96 |
Bett. |
|
510 |
Betr.-Akad. |
|
511 |
Betr.-Ing. |
|
512 |
Betr.-Nr. |
|
513 |
Betramstr. |
|
514 |
Betriebsabr. |
|
515 |
Betriebsabrechn. |
|
516 |
Betriebsführ. |
|
517 |
Betriebswirtsch. |
|
518 |
Bettinastr. |
|
519 |
Bev. |
|
97 | 520 |
Bew. |
521 |
Bewegungsgym. |
|
522 |
Beweism. |
|
98 | 523 |
Bewerb. |
99 | 524 |
Bewerbg. |
525 |
Bewert. |
|
100 | 526 |
Bez. |
527 |
Bezahl. |
|
528 |
Bezieh. |
|
529 |
Beziehg. |
|
530 |
Bezirks-Redakt. |
|
531 |
Bfrs. |
|
532 |
Bg. |
|
533 |
Bgb. |
|
101 | 534 |
Bgl. |
535 |
Bgm. |
|
102 | 536 |
Bhf. |
537 |
Bib. |
|
538 |
Biblioth. |
|
539 |
Bieberstr. |
|
540 |
Biebr. |
|
541 |
Bielef. |
|
103 | 542 |
Bierbr. |
543 |
Bietigh. |
|
544 |
Bildoff. |
|
104 | 545 |
Bildzuschr. |
105 | 546 |
Bilf. |
547 |
Bill-Br. |
|
548 |
Bill-Brauer. |
|
549 |
Billbr. |
|
550 |
Bin. |
|
551 |
Binding-Br. |
|
552 |
Biolog. |
|
553 |
Birkenstr. |
|
554 |
Birkenwaldstr. |
|
555 |
Bischöfl. |
|
556 |
Bismarckstr. |
|
557 |
Bismarkstr. |
|
106 | 558 |
Bj. |
107 | 559 |
Bk. |
560 |
BkS. |
|
561 |
Bkges. |
|
562 |
Bkvers. |
|
108 | 563 |
Bkz. |
564 |
Bkz.-Finanz. |
|
565 |
Bkz.-Finanzierg. |
|
109 | 566 |
Bl. |
567 |
Bleibiskopfstr. |
|
568 |
Bleichstr. |
|
569 |
Blk. |
|
110 | 570 |
Bln. |
571 |
Bln.-Charlottenbg. |
|
572 |
Blumenstr. |
|
573 |
Blumenthalstr. |
|
574 |
Blutspendezentr. |
|
575 |
Blz. |
|
576 |
Blücherstr. |
|
577 |
Bm. |
|
578 |
Bmstr. |
|
111 | 579 |
Boch. |
580 |
Boch.-Gels. |
|
581 |
Bockenh. |
|
112 | 582 |
Bod. |
583 |
Bodelschwinghstr. |
|
584 |
Bodensieckhs. |
|
585 |
Bogenstr. |
|
586 |
Bolongarostr. |
|
587 |
Bongardstr. |
|
588 |
Bootshfn. |
|
589 |
Bootspl. |
|
113 | 590 |
Bor. |
591 |
Bornh. |
|
592 |
Borsigstr. |
|
593 |
Borsigw. |
|
594 |
Boruss. |
|
595 |
Botsch. |
|
596 |
Bottigstr. |
|
597 |
Bourg-St. |
|
114 | 598 |
Bov. |
599 |
Bozenerstr. |
|
600 |
Bpr. |
|
115 | 601 |
Br. |
602 |
Br.-Veg. |
|
603 |
Brahmsstr. |
|
604 |
Brahmstr. |
|
605 |
Branch. |
|
606 |
Brau. |
|
607 |
Braubachstr. |
|
608 |
Brauereifds. |
|
609 |
Brauind. |
|
610 |
Brauneggerstr. |
|
611 |
Braunfelsstr. |
|
612 |
Braunk. |
|
613 |
Braunl. |
|
614 |
Brauns. |
|
615 |
Braunschw. |
|
616 |
Braunschwg. |
|
617 |
Braunschwst. |
|
618 |
Brd. |
|
619 |
Bredenstr. |
|
620 |
Brehmstr. |
|
621 |
Breitbg. |
|
622 |
Breitenb. |
|
623 |
Breitenbg. |
|
624 |
Breitscheidstr. |
|
116 | 625 |
Brem. |
626 |
Bremsh. |
|
627 |
Brentanostr. |
|
628 |
Breslauerstr. |
|
629 |
Breubergstr. |
|
630 |
Brg. |
|
117 | 631 |
Brh. |
632 |
Brit. |
|
633 |
Brk. |
|
634 |
Brockhöfe-Bhf. |
|
118 | 635 |
Brok. |
636 |
Brosch. |
|
637 |
Brown. |
|
638 |
Brschr. |
|
639 |
Brschw. |