Revision 3926

TXM/trunk/bundles/org.txm.treetagger.core.linux/res/linux/FILES (revision 3926)
1 1

  
2 2
This package contains the TreeTagger, a probabilistic part-of-speech
3
tagger developed by Helmut Schmid. All rights are reserved by the
4
Institute for Computational Linguistics at the University of
5
Stuttgart. The programs have been statically compiled for x86 computers
6
running Linux.
3
tagger developed by Helmut Schmid. All rights are reserved by the 
4
Institute for Computational Linguistics at the University of Stuttgart.
5
The programs have been compiled for PCs with the Linux operating system.
6
Many thanks to Arne Fitschen for doing this!
7 7

  
8 8
Files contained in this package:
9 9

  
......
13 13
- bin/train-tree-tagger training program
14 14
- bin/tree-tagger       tagger program
15 15
- bin/separate-punctuation program for tokenization (used by the shell scripts)
16
- cmd/lookup.perl       Perl script for pretagging
16 17
- doc/nemlap94.ps       paper describing the TreeTagger
17 18
- doc/sigdat95.ps       paper describing the TreeTagger
18 19

  
TXM/trunk/bundles/org.txm.treetagger.core.linux/res/linux/README (revision 3926)
1 1

  
2
/***************************************************************************/
3
/* How to use the TreeTagger                                               */
4
/* Author: Helmut Schmid, University of Stuttgart, Germany                 */
5
/***************************************************************************/
2
/*****************************************************************************/
3
/* How to use the TreeTagger                                                 */
4
/*****************************************************************************/
6 5

  
7 6

  
8 7
The TreeTagger consists of two programs: train-tree-tagger is used to 
......
24 23
input will be read from stdin. If neither an input file nor an output file
25 24
is specified, the tagger will print to stdout.
26 25

  
27
tree-tagger {-options-} <parameter file> {<input file> {<output file>}}
26
tree-tagger <parameter file> <input file> <output file> {-eps <epsilon>}
27
       {-base} {-proto} {-sgml} {-token} {-lemma} {-beam <threshold>}
28 28

  
29 29
Description of the command line arguments:
30 30

  
31 31
* <parameter file>: Name of a parameter file which was created with the 
32 32
  train-tree-tagger program.
33 33
* <input file>: Name of the file which is to be tagged. Each token in this 
34
  file has to be on a separate line. Tokens may contain blanks. It is possible
34
  file must be on a separate line. Tokens may contain blanks. It is possible
35 35
  to override the lexical information contained in the parameter file of the
36 36
  tagger by specifying a list of possible tags after a token. This list has
37
  to be preceded by a tab character and the elements are separated by tab 
38
  characters. This pretagging feature could be used e.g. to ensure that
37
  to be preceded by a tab character. The tags are optionally followed by a
38
  floating point value to specify the probability of the tag. Adding such
39
  tag information in the tagger's input is sometimes useful to ensure that
39 40
  certain text-specific expressions are tagged properly.
40 41
  Punctuation marks must be on separate lines as well. Clitics (like "'s",
41 42
  "'re", and "'d" in English or "-la" and "-t-elle" in French) should be
42 43
  separated if they were separated in the training data. (The French and
43
  English parameter files available by ftp expect separation of clitics).
44
  English parameter files available by ftp, expect separation of clitics).
44 45
  Sample input file:
45 46
    He
46 47
    moved
47 48
    to
48
    New York City	NP
49
    New York City	NP 1.0
49 50
    .
50 51
* <output file>: Name of the file to which the tagger should write its output.
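The input format described above (one token per line, with an optional
tab-separated list of pretagging tags and probabilities) can be produced
with a few lines of Python. This is only a sketch: the function name and
the data layout are assumptions made for illustration, and separating
several tags with tab characters follows the older wording of this README.

```python
def format_tagger_input(tokens):
    """Render tokens as one-token-per-line tagger input.

    Each item is either a plain string or a (token, pretags) pair, where
    pretags is a list of (tag, probability-or-None) tuples.
    """
    lines = []
    for item in tokens:
        if isinstance(item, str):
            # Ordinary token: nothing but the token itself on the line.
            lines.append(item)
            continue
        token, pretags = item
        fields = [token]
        for tag, prob in pretags:
            # A tag may be followed by a blank and a probability value.
            fields.append(tag if prob is None else f"{tag} {prob}")
        # The tag list is preceded by a tab character.
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


sample = ["He", "moved", "to", ("New York City", [("NP", 1.0)]), "."]
text = format_tagger_input(sample)
print(text, end="")
```

The output reproduces the sample input file shown above, including the
blanks inside the token "New York City".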
51 52

  
52 53
Further optional command line arguments:
53 54

  
54
* -token: The words/tokens are printed in addition to the POS tags
55
* -lemma: Lemmas are printed as well.
56
* -sgml: This option instructs the tagger to ignore tokens which start
57
  with '<' and end with '>' (SGML tags).
58
* -lex <f>: The file <f> contains additional lexicon entries to be used
59
  by the tagger. The file format is identical to the format of the lexicon
60
  argument of the training program (see below).
61
* -no-unknown: If an unknown word is encountered, emit the word form
62
  as lemma. This was previously the default behaviour. Now, the default 
63
  behaviour is to print "<unknown>" as lemma.
64
* -threshold <p>: This option tells the tagger to print all tags of a
65
  word with a probability higher than <p> times the largest probability.
66
  (The tagger will use a different algorithm in this case and the set of
67
  best tags might be different from the tags generated without this
68
  option.)
69
* -prob: Print tag probabilities (in combination with option -threshold)
70
* -pt-with-prob: If this option is specified, then each pretagging tag
71
  (see above) has to be followed by a whitespace and a tag probability 
72
  value.
73
* -pt-with-lemma: If this option is specified, then each pretagging tag
74
  (see above) has to be followed by a whitespace and a lemma. Lemmas may 
75
  contain blanks.
76
  If both -pt-with-prob and -pt-with-lemma have been specified, then each
77
  pretagging tag is followed by a probability and a lemma in that order.
78
* -hyphen-heuristics: needed for chunking. See below for more information
79
  about how to train a chunk parameter file with hyphen-heuristics.
55
* -token: tells the tagger to print the words also.
56
* -lemma: tells the tagger to print the lemmas of the words also.
57
* -sgml: tells the tagger to ignore tokens starting with '<' and ending
58
  with '>' (SGML tags).
59
* -threshold p: tells the tagger to print all tags of a word with a probability higher than p times the largest tag probability. (The tagger will use the Forward-Backward algorithm instead of the Viterbi algorithm to annotate the data if this option is used.)
60
* -lex file: tells the tagger to read additional lexicon entries from "file". The lexicon format is the same as that of the lexicon file used for training.
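The selection rule stated for -threshold (keep every tag whose probability
is higher than p times the largest tag probability) can be illustrated
as follows. This is a sketch of the filtering rule only, not of the
tagger's internals; the function name and the probabilities are made up
for the example.

```python
def tags_above_threshold(tag_probs, p):
    """Keep tags whose probability exceeds p times the largest probability."""
    if not tag_probs:
        return []
    cutoff = p * max(tag_probs.values())
    kept = [(tag, prob) for tag, prob in tag_probs.items() if prob > cutoff]
    # Most probable tag first.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical tag probabilities for a single word.
example = {"NN": 0.70, "VB": 0.25, "JJ": 0.05}
print(tags_above_threshold(example, 0.1))
```

With p = 0.1 the cutoff is about 0.07, so NN and VB are kept and JJ is
dropped. Smaller values of p let more tags through.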
80 61

  
81
The options below are for advanced users. Please, read the papers on the 
82
TreeTagger to fully understand their meaning.
62
The options below are for advanced users. Read the papers on the TreeTagger
63
to fully understand their meaning.
83 64

  
84 65
* -proto: If this option is specified, the tagger creates a file named
85 66
  "lexicon-protocol.txt", which contains information about the degree of
......
90 71
  hyphen has been found in the fullform lexicon.
91 72
* -eps <epsilon>: Value which is used to replace zero lexical frequencies.
92 73
  This is the case if a word/tag pair is contained in the lexicon but not
93
  in the training corpus. The choice of this parameter has only minor
94
  influence on the tagging accuracy.
74
  in the training corpus. The default is 0.1. The choice of this parameter
75
  has some minor influence on tagging accuracy.
76
* -beam <threshold>: If the tagger is slow, this option can be used to speed it up.
77
  Good values for <threshold> are in the range 0.001-0.00001.
95 78
* -base: If this option is specified, only lexical information is used
96 79
  for tagging but no contextual information about the preceding tags.
97 80
  This option is only useful in order to obtain a baseline result
98 81
  to which to compare the actual tagger output.
99 82

  
83
There is another tagger program called "tree-tagger-flush" which
84
flushes the output after reading an empty line. It expects a parameter
85
file as argument and reads from stdin and writes to stdout. No command
86
line options are supported. This program might be useful for
87
implementing wrappers.
100 88

  
101 89

  
90

  
91

  
102 92
Training
103 93
--------
104 94

  
105 95
Training is done with the *train-tree-tagger* program. It expects at least
106 96
four command line arguments which are described below.
107 97

  
108
train-tree-tagger {options} <lexicon> <open class file> <input file> <output file>
98
train-tree-tagger <lexicon> <open class file> <input file> <output file> 
99
            {-cl <context length>} {-dtg <min. decision tree gain>}
100
            {-ecw <eq. class weight>} {-atg <affix tree gain>} {-st <sent. tag>}
109 101

  
110 102
Description of the command line arguments:
111 103

  
112 104
* <lexicon>: name of a file which contains the fullform lexicon. Each line 
113 105
  of the lexicon corresponds to one word form and contains the word form 
114
  and a sequence of tag-lemma pairs. Each tag is preceded by a tab character
115
  and each lemma is preceded by a blank or tab character.
106
  itself followed by a Tab character and a sequence of tag-lemma pairs.
107
  The tags and lemmata are separated by whitespace.
116 108
  Example:
117 109

  
118 110
aback	RB aback
119 111
abacuses	NNS abacus
120
abandon	VB abandon	VBP abandon
121
abandoned	JJ abandoned	VBD abandon	VBN abandon
112
abandon	VB abandon VBP abandon
113
abandoned	JJ abandoned VBD abandon VBN abandon
122 114
abandoning	VBG abandon
123 115

  
124
  Attention: Ordinal and cardinal numbers which consist of digits
125
  (like 1, 13, 1278 or 2. and 75.) should not be included in the
126
  lexicon. Otherwise, the tagger will not be able to learn how to tag
127
  numbers which are not listed in the lexicon. Numbers with unusual
128
  tags should be added to the lexicon, however. If the training
129
  program reports an error because the POS tag used for numbers is
130
  unknown, you should add a lexicon entry for one number.
116
  Remark: The tagger doesn't need the lemmata actually. If you do not have
117
  the lemma information or if you do not plan to annotate corpora with
118
  lemmas, you can replace the lemma with a dummy value, e.g. "-".
131 119

  
132
  Remark: The tagger doesn't need the lemmas for tagging actually. If
133
  you do not have the lemma information or if you do not plan to
134
  annotate corpora with lemmas, you can replace the lemma with a dummy
135
  value, e.g. "-".
136

  
137
  You can use the Perl script make-lex.perl as follows in order to
138
  create a tagger lexicon from the training corpus:
139
    cmd/make-lex.perl corpus > lexicon
140

  
141
  If you have additional lexicon entries stored in a separate file "lex"
142
  with entries like this (The POS tag is preceded by a tab character.)
143
    aback   RB aback
144
    aback   RP aback
145
    abacs   NNS abac
146
  you can include them as follows:  cmd/make-lex.perl corpus lex > lexicon
147

  
148
  If train-tree-tagger complains about unknown tags, just add another
149
  entry to the lexicon with the respective POS tag.
150

  
151
* <open class file>: name of a file which contains a list of open class tags
152
  i.e. possible tags of unknown word forms separated by whitespace.
120
* <open class file>: name of a file which contains a list of open class
121
  tags i.e. possible tags of unknown word forms separated by whitespace.
153 122
  The tagger will use this information when it encounters unknown words,
154 123
  i.e. words which are not contained in the lexicon.
155 124
  Example: (for Penn Treebank tagset)
......
159 128
* <input file>: name of a file which contains tagged training data. The data
160 129
  must be in one-word-per-line format. This means that each line contains 
161 130
  one token and one tag in that order separated by a tabulator. 
162
  Punctuation marks are considered as tokens and must be tagged as well.
163
  The file should neither contain empty lines nor untagged SGML markup.
131
  Punctuation marks are considered as tokens and must have been tagged as well.
164 132
  Example:
165 133

  
166 134
Pierre  NP
......
177 145

  
178 146
* -st <sent. tag>: the end-of-sentence part-of-speech tag, i.e. the tag which
179 147
  is assigned to sentence punctuation like ".", "!", "?". 
180
  Default is "SENT". You have to use this option, if your tag for sentence
181
  punctuation is not "SENT". If you have more than one such tag, choose the
182
  most frequent one.
183
* -utf8 assume that the data is encoded with UTF8
148
  Default is "SENT". It is important to set this option properly, if your
149
  tag for sentence punctuation is not "SENT".
184 150
* -cl <context length>: number of preceding words forming the statistical
185 151
  context. The default is 2 which corresponds to a trigram context. For
186 152
  small training corpora and/or large tagsets, it could be useful to reduce
187 153
  this parameter to 1.
188 154
* -dtg <min. decision tree gain>: Threshold - If the information gain at a 
189 155
  leaf node of the decision tree is below this threshold, the node is deleted.
190
* -sw <weight>: A smoothing parameter, which determines how much the
191
  probability distribution of some decision tree node is smoothed with the
192
  probability distribution of the parent node.
156
  The default value is 0.7.
193 157
* -ecw <eq. class weight>: weight of the equivalence class based probability
194
  estimates.
158
  estimates. The default is 0.15.
195 159
* -atg <affix tree gain> Threshold - If the information gain at a leaf of an
196 160
  affix tree is below this threshold, it is deleted. The default is 1.2.
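The lexicon format described above (word form, tab character, then
tag-lemma pairs separated by whitespace) can be checked before training
with a small parser. This is a sketch under the assumption that neither
tags nor lemmas contain whitespace; the function name is hypothetical.

```python
def parse_lexicon_line(line):
    """Split one lexicon line into (word form, [(tag, lemma), ...])."""
    # The word form is everything before the first tab; it may contain blanks.
    word, sep, rest = line.rstrip("\n").partition("\t")
    if not sep:
        raise ValueError(f"no tab after word form: {line!r}")
    # Tags and lemmas alternate, separated by blanks or tabs.
    fields = rest.split()
    if not fields or len(fields) % 2 != 0:
        raise ValueError(f"expected tag-lemma pairs: {line!r}")
    return word, list(zip(fields[0::2], fields[1::2]))


word, pairs = parse_lexicon_line("abandoned\tJJ abandoned VBD abandon VBN abandon")
print(word, pairs)
```

Because the pairs are split on any whitespace, both the tab-separated and
the blank-separated variants of the example lines are accepted.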
197 161

  
198
The accuracy of the TreeTagger usually improves, if different settings
199
of the above parameters are tested and the best combination is chosen.
200

  
201

  
202
Caveat: Make sure that the lexicon and the training corpus contain no
203
extra blanks. If the word form, for instance, is followed by a blank
204
and a tab character, the blank will be considered part of the word.
205

  
206
The script 'cmd/create-pos-parameter-file' can be used to train a 
207
parameter file, provided the file: 'lib/open-class-tags' exists.
208
The script creates a lexicon and a parameter file
209
that both will be stored in the lib-directory.
210

  
211

  
212
Training a parameter file for chunking
213
---------------------------------------
214

  
215
(This section of the README file was created by Wiebke Wagner.)
216

  
217
Training is done with the *train-tree-tagger* program just like the training
218
of part-of-speech parameter files. The input files differ (see below).
219

  
220
train-tree-tagger {options} <lexicon> <open class file> <input file> <output file>
221

  
222
Description of the command line arguments:
223

  
224
* <lexicon>: name of a file which contains the fullform lexicon. Each line 
225
  of the lexicon corresponds to one word form and contains the word form 
226
  with its pos-tag and a sequence of chunktag-lemma pairs. Since there is
227
  no lemma for a string containing a word and its pos, lemma is just a 
228
  dummy-placeholder. Each chunktag 
229
  is preceded by a tab character and each dummy-lemma is preceded by a blank or tab character.
230
  Example:
231

  
232
Abs-NP  NP/I-NC #       NP/B-NC #
233
Academic-NP     NP/I-NC #
234
Activated-VVN   VBN/B-VC #      VBN/B-NC #
235
Activation-NN   NN/I-NC #       NN/B-NC #
236
VB      VB/I-VC #
237

  
238
  The lexicon must contain specific entries that contain a hyphen
239
  ('WORD-POS	POS/IOB DUMMY-LEMMA') and general entries without a hyphen
240
  ('POS	   POS/IOB DUMMY-LEMMA'). The hyphen heuristics enable the
241
  program to select the general entry if no specific entry is available.
242

  
243
  You can use the Perl script make-chunk-lex.perl as follows in order to
244
  create a tagger-chunker lexicon from the training corpus:
245
  'cmd/make-chunk-lex.perl' corpus > chunker-lexicon
246

  
247
  Attention: if 'train-tree-tagger' shows the error message:
248
  'ERROR: Sentence punctuation tag "SENT" is not in lexicon!' 
249
  add the option: [-st 'SENT/O'] to the system call.
250
 
251
  If train-tree-tagger complains about unknown tags, just add another
252
  entry to the lexicon with the respective POS tag.
253

  
254
* <open class file>: name of a file which contains only the dummy entry:
255
  NN/B-NC. If there are general entries in the lexicon for every 
256
  word class, and if the hyphen heuristic is activated, no tags
257
  have to be guessed.
258

  
259
* <input file>: name of a file which contains training data annotated
260
  with part-of-speech tags  and chunk tags. The data
261
  must be in one-word-per-line format. This means that each line contains 
262
  one token with its mark-up: 
263
  WORD-POS	POS/IOB DUMMY-LEMMA
264
  The file should neither contain empty lines nor untagged SGML markup.
265
  Example:
266

  
267
Activation-NN   NN/I-NC
268
of-IN   IN/B-PC
269
the-DT  DT/B-NC
270
CD28-NP NP/I-NC
271
surface-NN      NN/I-NC
272

  
273
* <output file>: name of the file in which the resulting tagger parameters 
274
  are stored.
275

  
276
The optional parameters are the same as for training a part-of-speech parameter file.
277

  
278
The accuracy of the TreeTagger usually improves, if different settings
279
of the above parameters are tested and the best combination is chosen.
280

  
281
Caveat: Make sure that the lexicon and the training corpus contain no
282
extra blanks. If the word form, for instance, is followed by a blank
283
and a tab character, the blank will be considered part of the word.
284

  
285
The script 'cmd/create-chunk-parameter-file' can be used to train a chunk
286
parameter file, provided the file: 'lib/open-class-chunks' exists and 
287
contains the dummy entry: 
288
NN/B-NC  
289
The script creates a lexicon and a parameter file
290
that both will be stored in the lib-directory.
162
The accuracy of the TreeTagger is usually slightly improved, if different
163
settings of the above parameters are tested and the best combination is
164
chosen.
TXM/trunk/bundles/org.txm.treetagger.core.win32/res/win32/INSTALL.txt (revision 3926)
1

  
2
/****************************************************************************/
3
/* How to install the Windows version of the TreeTagger                     */
4
/*                                                                          */
5
/* Author: Helmut Schmid, CIS, Ludwig-Maximilians-Universität, Germany      */
6
/****************************************************************************/
7

  
8
This is the Windows distribution of the TreeTagger.
9

  
10
It contains the following files:
11

  
12
- tree-tagger.exe: the tagger program
13

  
14
- train-tree-tagger.exe: the training program
15

  
16
- utf8-tokenize.perl: A Perl script which transforms the tagger input
17
                    into one-word-per-line format
18

  
19
- *-abbreviations:  abbreviation lists required by the tokenizer
20

  
21
- tag-*.bat:        batch files for different languages which call 
22
                    the tokeniser and the tagger 
23

  
24
- chunk-*.bat	    batch files for POS tagging and chunking
25

  
26

  
27
Installation
28
------------
29

  
30
1. Install a Perl interpreter (if you have not already installed one).
31
   You can download a Perl interpreter for Windows for free at
32
   http://www.activestate.com/activeperl/
33

  
34
2. Move the TreeTagger directory to the root directory of drive C:.
35

  
36
3. Download the PC parameter files for the languages you need, decompress
37
   them (e.g. using Winzip or 7zip) and move them to the subdirectory lib.
38
   Rename the parameter files to <language>-utf8.par
39
   Example: Rename french-par-linux-3.2-utf8.bin to french-utf8.par
40

  
41
   Non-UTF8 parameter files are not supported anymore.
42

  
43
4. Add the path C:\TreeTagger\bin to the PATH environment variable.
44

  
45
5. Open a shell and type the command
46
   set PATH=C:\TreeTagger\bin;%PATH%
47

  
48
6. Change to the directory C:\TreeTagger
49

  
50
7. Now you can test the tagger, e.g. by analyzing this file with the command
51
   tag-english INSTALL.txt
52

  
53
If you install the TreeTagger in a different directory, you have to
54
modify the first path in the batch files tag-*.bat.
55

  
56

  
57
Michaela Atterer told me that she had difficulty installing the
58
TreeTagger on a Windows XP system. She recommends the following
59
work-around.
60

  
61
4. Windows XP:
62
-Right click on "My Computer"
63
-Select the "Advanced" tab
64
-Click on "Environment Variables"
65
-click on New: enter PATH and C:\TreeTagger\bin\;%PATH%
66

  
67
If the files have been unpacked into a single directory, you should
68
restore the following directory structure:
69

  
70
TreeTagger:
71
INSTALL.txt  README.txt  bin  cmd  lib
72

  
73
TreeTagger/bin:
74
tag-english.bat  tag-german.bat   tag-spanish.bat        tree-tagger.exe
75
tag-french.bat   tag-italian.bat  train-tree-tagger.exe
76

  
77
TreeTagger/cmd:
78
mwl-lookup.perl  tokenize.pl
79

  
80
TreeTagger/lib:
81
english-abbreviations  german-abbreviations   spanish-abbreviations
82
french-abbreviations   italian-abbreviations  spanish-mwls
83

  
84

  
85
Note that the TreeTagger comes without a graphical interface. You have
86
to run it by entering a command in a command line window. If you prefer 
87
a graphical interface, try the one provided by Ciarán Ó Duibhín at
88
http://www.smo.uhi.ac.uk/~oduibhin/oideasra/interfaces/winttinterface.htm
89

  
1

  
2
/****************************************************************************/
3
/* How to install the Windows version of the TreeTagger                     */
4
/****************************************************************************/
5

  
6
This is the Windows distribution of the TreeTagger.
7

  
8
It contains the following files:
9

  
10
- tree-tagger.exe: the tagger program
11

  
12
- train-tree-tagger.exe: the training program
13

  
14
- utf8-tokenize.perl: A Perl script which transforms the tagger input
15
                    into one-word-per-line format
16

  
17
- *-abbreviations:  abbreviation lists required by the tokenizer
18

  
19
- tag-*.bat:        batch files for different languages which call 
20
                    the tokeniser and the tagger 
21

  
22
- chunk-*.bat	    batch files for POS tagging and chunking
23

  
24

  
25
Installation
26
------------
27

  
28
1. Install a Perl interpreter (if you have not already installed one).
29
   You can download a Perl interpreter for Windows for free at
30
   http://www.perl.org
31

  
32
2. Extract the zip file (if it was not extracted yet) and move the
33
   TreeTagger directory to the root directory of drive C:\.
34

  
35
3. Download the parameter files for the languages you need, decompress
36
   them (e.g. using Winzip or 7zip) and move them to the subdirectory TreeTagger/lib.
37

  
38
4. Add the path C:\TreeTagger\bin to the PATH environment
39
   variable. The necessary steps differ from one Windows version to the other.
40

  
41
5. Open a command prompt window and type the command
42
   set PATH=C:\TreeTagger\bin;%PATH%
43

  
44
6. Go to the directory C:\TreeTagger
45
   cd c:\TreeTagger
46

  
47
7. Now you can test the tagger, e.g. by analyzing this file with the command
48
   tag-english INSTALL.txt
49

  
50
The input should be encoded in UTF8.
51

  
52
If you install the TreeTagger in a different directory, you have to
53
modify the first path in the batch files tag-*.bat using an editor
54
such as Wordpad.
55

  
56
If the files have been unpacked into a single directory, you should
57
restore the following directory structure:
58

  
59
TreeTagger:
60
INSTALL.txt  README.txt  bin  cmd  lib
61

  
62
TreeTagger/bin:
63
tag-english.bat  tag-german.bat   tag-spanish.bat        tree-tagger.exe
64
tag-french.bat   tag-italian.bat  train-tree-tagger.exe
65

  
66
TreeTagger/cmd:
67
mwl-lookup.perl  tokenize.pl
68

  
69
TreeTagger/lib:
70
english-abbreviations  german-abbreviations   spanish-abbreviations
71
french-abbreviations   italian-abbreviations  spanish-mwls
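
The directory structure listed above can be verified with a short script.
This is a sketch only: the file names are copied from the listing, while
the function name and the use of a temporary directory for the
demonstration are assumptions. Files that the listing does not mention
(for example parameter files in lib) are not checked.

```python
import tempfile
from pathlib import Path

# Expected layout, copied from the directory listing in this file.
EXPECTED_LAYOUT = {
    "bin": ["tag-english.bat", "tag-french.bat", "tag-german.bat",
            "tag-italian.bat", "tag-spanish.bat",
            "tree-tagger.exe", "train-tree-tagger.exe"],
    "cmd": ["mwl-lookup.perl", "tokenize.pl"],
    "lib": ["english-abbreviations", "french-abbreviations",
            "german-abbreviations", "italian-abbreviations",
            "spanish-abbreviations", "spanish-mwls"],
}


def missing_entries(root):
    """Return the expected files (relative paths) that are absent below root."""
    root = Path(root)
    missing = []
    for subdir, names in EXPECTED_LAYOUT.items():
        for name in names:
            rel = Path(subdir) / name
            if not (root / rel).exists():
                missing.append(rel.as_posix())
    return missing


# Demonstration on a temporary directory that contains only one of the files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "bin").mkdir()
    (Path(tmp) / "bin" / "tree-tagger.exe").touch()
    report = missing_entries(tmp)
    print(len(report), "entries missing")
```

For a real installation, call missing_entries with the TreeTagger
directory (C:\TreeTagger in the steps above); an empty list means every
listed file is in place.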
72

  
73

  
74
Note that the TreeTagger comes without a graphical interface. You have
75
to run it by entering a command in a command line window. If you prefer 
76
a graphical interface, try the one provided by Ciarán Ó Duibhín at
77
http://www.smo.uhi.ac.uk/~oduibhin/oideasra/interfaces/winttinterface.htm
78

  
TXM/trunk/bundles/org.txm.treetagger.core.win32/res/win32/lib/english-abbreviations (revision 3926)
1
&Co.
2
AUG.
1 3
Adm.
2 4
Ala.
5
Ald.
6
App.Cas.
3 7
Ariz.
4 8
Ark.
9
Assn.
10
Assoc.
11
Att.
5 12
Aug.
13
Av.
6 14
Ave.
7 15
Bancorp.
16
Bde.
8 17
Bhd.
18
Blvd.
9 19
Brig.
10 20
Bros.
21
C.-in-C.
11 22
CO.
12 23
CORP.
13 24
COS.
......
16 27
Canada-U.S.
17 28
Canadian-U.S.
18 29
Capt.
30
Cas.
31
Ch.
32
Ch.App.
33
Ch.D.
19 34
Cia.
20 35
Cie.
36
Cm.
37
Cmd.
38
Cmnd.
21 39
Co.
22 40
Col.
23 41
Colo.
24 42
Conn.
25 43
Corp.
26 44
Cos.
45
Cowp.
46
Cr.App.R.
47
Crim.L.R.
27 48
D-Mass.
49
D.Litt.
50
D.Phil.
51
DFl.
28 52
Dec.
29 53
Del.
54
Dep.
30 55
Dept.
56
Deut.
57
Diod.
58
Div.
31 59
Dr.
32 60
Drs.
61
Dtr.
62
Durn.
63
E.g.
64
ESQ.
65
Eph.
66
Eq.
67
Eqn.
68
Eqns.
69
Esq.
33 70
Etc.
71
Exch.
72
Exod.
73
Ext.
74
FIG.
75
Fam.
34 76
Feb.
77
Fig.
78
Figs.
35 79
Fla.
36 80
Ft.
81
G.m.b.H.
37 82
Ga.
38 83
Gen.
39 84
Gov.
85
Hdt.
40 86
Hon.
41 87
INC.
88
Ibid.
42 89
Ill.
43 90
Inc.
44 91
Ind.
92
InfoCorp.
93
Intercorp.
94
Invest.
95
JJ.
96
JR.
45 97
Jan.
46 98
Japan-U.S.
47 99
Jr.
100
Jud.
48 101
Kan.
49 102
Korean-U.S.
50 103
Ky.
104
L.JJ.
105
L.R.Ir.
106
LL.M.
107
LTD.
51 108
La.
52 109
Lt.
110
Lt.-Col.
53 111
Ltd.
112
Ltda.
113
M.Ed.
114
M.Litt.
115
M.Phil.
54 116
Maj.
55 117
Mass.
56 118
Md.
119
Me.T.A.
57 120
Messrs.
58 121
Mfg.
59 122
Mich.
60 123
Minn.
61 124
Miss.
62 125
Mo.
126
Mod.Rep.
127
Mont.
63 128
Mr.
64 129
Mrs.
65 130
Ms.
66 131
Neb.
67 132
Nev.
68 133
No.
134
Non-U.S.
69 135
Nos.
70 136
Nov.
71 137
Oct.
138
Oe.
72 139
Okla.
73 140
Ont.
141
Op.
74 142
Ore.
143
P.o.s.
75 144
Pa.
76 145
Ph.
146
Ph.D.
147
Pp.
148
Prev.
77 149
Prof.
78 150
Prop.
151
Pte.
152
Ptr.
79 153
Pty.
154
Reg.
155
Regt.
80 156
Rep.
81 157
Reps.
158
Repub.
159
Ret.
82 160
Rev.
161
Rom.
83 162
S.p.A.
163
Sec.
84 164
Sen.
85 165
Sens.
86 166
Sept.
87 167
Sgt.
168
Sh.Ct.
88 169
Sino-U.S.
170
Soc.
171
Som.
172
Soviet-U.S.
173
Sp.
89 174
Sr.
90 175
St.
91 176
Ste.
177
Suff.
178
Syll.
179
T.B.G.A.S.
92 180
Tenn.
93 181
Tex.
182
Thess.
183
Thuc.
184
Transp.
185
Trop.
94 186
U.S.-U.K.
95 187
U.S.-U.S.S.R.
188
U.S.P.G.A.
189
Univ.
190
V.-C.
96 191
Va.
192
Vict.
193
Vol.
97 194
Vt.
98 195
W.Va.
99 196
Wash.
100 197
Wis.
101 198
Wyo.
199
a-Ex-dividend.
200
a.c.
201
a.g.m.
102 202
a.k.a.
103 203
a.m.
204
al.
104 205
anti-U.S.
206
approx.
207
b.s.
208
bldg.
209
c.c.d.
210
c.e.o.
211
c.f.
212
c.g.
213
c.v.
214
c/s.
105 215
cap.
106
days.
216
cf.
217
ch.
218
cit.
219
clar.
220
co.
221
col.
222
cols.
223
constr.
224
cp.
225
cwt.
226
d.c.
227
d.f.
228
d.i.l.
229
d.p.c.
230
def.
231
dw.
232
e-Estimated.
233
e.g.
234
e.m.f.
235
e.p.s.p.
236
edn.
237
edns.
238
est.
107 239
etc.
240
ex-L.C.C.
241
fig.
242
fl.
243
fol.
108 244
ft.
245
gen.
246
govt.
247
h.p.
248
hon.
249
hrs.
250
i.c.
109 251
i.e.
252
ibid.
253
inc.
254
incl.
255
juv.
256
k.p.h.
257
l.e.d.
258
lbs.
259
loc.
260
m.d.
261
m.p.h.
262
msec.
263
n.d.
264
n.m.r.
265
non-U.K.
110 266
non-U.S.
267
norw.
268
nos.
269
oz.
270
ozs.
271
p.
272
p.a.
273
p.c.
111 274
p.m.
275
p.o.s.
276
p.p.m.
277
p.s.i.
278
p.w.
279
pl.
280
pls.
281
pos.
282
pp.
283
pres.
112 284
president-U.S.
285
pro-U.S.
286
q.v.
287
qq.v.
288
r.f.
289
r.h.
290
r.m.s.
291
r.m.s.d.
292
r.p.m.
293
r.s.s.
294
ref.
295
s.
296
s.a.
297
s.a.e.
298
s.d.
299
s.e.m.
113 300
s.r.l.
301
s.t.p.
302
spp.
303
sq.ft.
304
sq.m.
305
subss.
114 306
v.
115 307
v.B.
116 308
v.w.
309
var.
310
viz.
311
vol.
312
vols.
117 313
vs.
314
w.c.
TXM/trunk/bundles/org.txm.treetagger.core.win32/res/win32/lib/german-abbreviations (revision 3926)
1
-hdg.
2
-tlg.
3
12Tel.
4
12st.
5
1856ff.
6
20j.
7
31Tel.
8
3jähr.
9
40j.
10
48So.
11
62j.
1 12
A.
13
A.-Br.
2 14
A.-G.
15
A.-Kuntz-Str.
16
A.D.
3 17
A.G.
18
A.N.A.C.R.F.
19
A.T.
4 20
ADN-Korr.
21
AF-Volkskorr.
22
AGA-Annonc.
23
AR-Volkskorr.
5 24
AT-Mot.
25
AWG-Wohng.
26
Aach.
27
Aachn.
28
Aamtl.
6 29
Abb.
30
Abbestr.
31
Abdeckpl.
32
Abdseg.
33
Abendrestaur.
34
Abenthumstr.
7 35
Abess.
36
Abf.
37
Abg.
38
Abk.
8 39
Abl.
9 40
Ablief.-Gew.
41
Ablös.
10 42
Abm.
43
Abrechn.
11 44
Abs.
45
Abschlußprfg.
46
Abschlußprüfg.
47
Abschn.
48
Abschr.
49
Absetzbkt.
50
Abspr.
51
Abst.
52
Abstellfl.
53
Abstellpl.
12 54
Abt.
55
Abteistr.
13 56
Abtlg.
57
Acc.
58
Achenbachstr.
59
Achterkab.
60
Achtermannstr.
61
Ackermannstr.
62
Act.
63
Ad.
64
Adalbertstr.
65
Add.
66
Adelheidstr.
67
Adlerflychtstr.
68
Adolfsbr.
69
Adolfstr.
70
Adr.
71
Adt.
72
Ag.
73
Aggrip.
14 74
Agl.
15 75
Agt.
76
Ahornstr.
77
Aindorferstr.
78
Ainmillerstr.
79
AirC.
80
Airl.
81
Akad.
82
Akk.
83
Akkermannstr.
84
Akt.
85
Akt.-G.
16 86
Akt.-Ges.
87
Akt.-Nr.
17 88
Aktbr.
89
Aktien-Nr.
90
Aktv.
91
Aktz.
92
Akz.
93
Alarmanl.
94
Albanusstr.
95
Albertusstr.
96
Albing.
97
Alcan.
98
Alem.
99
Alemannenstr.
100
Alex.
101
Alexanderstr.
102
Alexanderw.
103
Alexandrastr.
104
Alfa-Montr.
105
Alfm.
106
Alfredstr.
18 107
Alg.
108
Algem.
109
Alk.
110
Alleestr.
111
Alleinbeauftr.
112
Alleinbeauftragt.
19 113
Alleininh.
114
Allerstr.
20 115
Allg.
116
Allg.-Bildg.
117
Allgem.
118
Allianz-Leb.
119
Allianz-Vers.
120
Almanasrh.
121
Alnmillerstr.
122
Alph.
123
Alt-Schmargend.
124
Alt-Schönebg.
125
Alt.-Kalt.
126
Altbt.
127
Altenbg.
128
Altentagesst.
129
Altersbeschr.
130
Altheimstr.
131
Altkönigstr.
132
Altluth.
133
Altsäng.
134
Altw.
21 135
Altwageneint.
22 136
Alu-Felg.
137
Alu.
138
Alufelg.
23 139
Alum.
24 140
Am.
141
Amat.
142
Amb.
143
Ambg.
144
Amer.
25 145
Amp.
146
Amsterd.
147
Amtl.
148
Anal.
149
Anb.
150
Andlerflychtstr.
151
Andreasstr.
152
Andrestr.
153
Andréstr.
154
Anerkg.
26 155
Anf.
27 156
Anfr.
28 157
Anfrag.
158
Anfäng.
29 159
Ang.
160
Angab.
30 161
Angb.
31 162
Angeb.
163
Angeh.
164
Angertorstr.
165
Angest.
32 166
Angl.
167
Anglo-Am.
168
Anh.
33 169
Anhängerkuppl.
170
Ank.
34 171
Anl.
172
Anl.-Obj.
173
Anl.-Verm.
174
Anlageverm.
35 175
Anleih.
176
Anm.
177
Anmeld.
36 178
Ann.-Exp.
37 179
Ann.-Exped.
180
Annastr.
181
Annenstr.
182
Annoncen-Exp.
183
Annoncen-Exped.
184
Annoncen-Expedit.
185
Anr.
186
Anschl.
187
Anspr.
188
Anst.
38 189
Ant.
39 190
Anten.
191
Antiquitätenhandlg.
192
Antoniterstr.
193
Antr.
194
Antw.
195
Antwerp.
40 196
Anz.
197
Anz.-Ann.
41 198
Anz.-Exp.
199
Anz.-Exped.
200
Anz.-Ges.
201
Anz.-Mittl.
42 202
Anz.-Verm.
43 203
Anzahlg.
204
Anzeig.-Exped.
205
Anzeig.-Vermittl.
206
Anzhl.
44 207
Anzhlg.
208
Ap.
209
Apart.
210
Apostelgem.
45 211
Apoth.
212
Apothek.
46 213
App.
214
Appart.
47 215
Appartem.
216
Appel-Fk.
217
Apr.
218
April-Abl.
48 219
April-Lief.
220
April-Mai-Abl.
221
Apt.
222
Arb.
223
Arbeitsgemeinsch.
224
Arbg.
225
Arbn.
226
Arbrechtl.
227
Archäol.
228
Arcisstr.
49 229
Argent.
230
Argentin.
231
Argonnenstr.
232
Armenruhstr.
233
Arminiusstr.
234
Arndstr.
235
Arndtstr.
236
Arnsburgerstr.
237
Arnsburgg.
238
Art.
239
Arzneim.-Großhlg.
240
Asahi-Chem.
241
Asc.
242
Aschaffb.
243
Aschaffenb.
244
Aschaffenbg.
245
Ascheb.
246
Aschffbg.
247
Asiat.
248
Ass.
249
Assist.
250
At.
251
Ath.
252
Athosstr.
50 253
Atl.
254
Atlas-Lev.
255
Atlasw.
256
Atmosph.
257
Audenstr.
51 258
Aufb.
259
Aufbaugeb.
260
Aufbrauch-Erkr.
261
Aufg.
262
Aufgabengeb.
263
Aufl.
264
Aufn.
265
Aufnahmebed.
52 266
Aufst.
267
Auftr.
268
Auftr.-Nr.
269
Aufz.
53 270
Aug.
271
Aug.-Abl.
272
Aug.-Wilh.
54 273
Augsb.
274
Augsburgerstr.
275
August-Bebel-Str.
276
Augustinusstr.
277
Aunelstr.
278
Aureliusstr.
279
Ausbau-Abt.
280
Ausbaumöglk.
281
Ausbeutg.
282
Ausbild.
283
Ausbildg.
284
Ausf.
285
Ausführ.
286
Ausführg.
55 287
Ausg.
56 288
Ausgl.
57 289
Ausk.
290
Auskft.
58 291
Ausl.
59 292
Ausl.-Akt.
293
Ausl.-Anl.
60 294
Auslandsanl.
61 295
Auslandsb.
296
Auslief.
297
Auslieferungsgew.
298
Auslieferungslag.
299
Ausn.
300
Auss.
301
Ausseh.
62 302
Ausst.
303
Ausstatt.
63 304
Ausstattg.
305
Ausstellg.
306
Ausstellungspl.
307
Ausstg.
308
Aust.
309
Austatt.
310
Austr.
64 311
Austral.
312
Austrofds.
313
Auswandg.
314
Auswert.
65 315
Ausz.
66 316
Aut.
317
Aut.-Mittelsch.
318
Auto-Abstellpl.
319
Autob.
320
Autob.-Abf.
321
Autobahnanschl.
322
Autobahnausf.
323
Autohs.
67 324
Autom.
68
Automat.
69 325
Automin.
326
Automob.
327
Autopist.
328
Autopl.
329
Av.
330
Avda.
331
Averhoffstr.
332
Az.
70 333
B.
334
B.-Ang.
335
B.-Angest.
336
B.-E.
337
B.-Str.
338
B.O.A.C.
71 339
B.P.
340
B.P.a.
341
B.Wttbg.
72 342
BGBl.
343
BMW-Niederl.
344
Babenh.
345
Bachmannstr.
346
Bachmannstrr.
347
Bachstr.
348
Backhausstr.
349
Bad.
350
Bad.-Württ.
351
Bad.-Württemb.
352
Badeabt.
353
Badeabtlg.
354
Baden-württ.
355
Badenw.
356
Badenwk.
357
Badezi.
358
Badezim.
359
Badstr.
360
Bagh.
361
Bahnanschl.
362
Bahnhofsstr.
363
Bahnhofst.
73 364
Bahnhofstr.
365
Bahnstr.
366
Bal.
74 367
Balk.
368
Balkonzim.
369
Baltim.
370
Bamb.
371
Bambg.
372
Bandelstr.
373
Bangertstr.
374
Bankdir.
375
Bankges.
376
Bankhyp.
377
Bankkaufm.
378
Bankprax.
379
Bankv.
380
Bansastr.
381
Baranz.
382
Barauszhlg.
383
Barbarossastr.
384
Barckhausstr.
385
Bardostr.
386
Barkap.
387
Barpr.
388
Barz.
389
Barzahl.
390
Barzahlg.
391
Basaltstr.
392
Basketb.
393
Bastelausst.
394
Battonnstr.
395
Battonstr.
75 396
Bau-Ing.
397
Bau-Zem.
398
Bauabwickl.
399
Baugenehmig.
76 400
Bauges.
401
Baugesch.
402
Baugrdst.
403
Baugroßhdlg.
404
Baugrundst.
405
Bauing.
77 406
Bauj.
407
Bauk.-Zusch.
408
Baukostenzusch.
409
Baumasch.
410
Baumbest.
411
Baumstr.
412
Baumw.
413
Baumwollsaatöl.
414
Baupl.
415
Bauspard.
416
Baust.
417
Bauw.
418
Bauzust.
78 419
Bay.
79
Bayer.
420
Bay.-ObLG.
421
Bayern-Münch.
422
Bayernwk.
423
Bayerstr.
424
Bayr.
80 425
Bb.
426
Bch.
81 427
Bd.
428
Bd.-Reg.
429
Bd.-Str.
430
Bdbahn.
431
Bde.
432
Bde..
433
Bds.
434
Beamt.
435
Beamtenhaush.
436
Bearbeitg.
437
Beauftr.
438
Bechtenwaldstr.
82 439
Bed.
440
Bedg.
441
Bedieng.
83 442
Beding.
443
Beethovenstr.
444
Befäh.
445
Beg.
446
Begl.
447
Begr.
448
Begründg.
449
Behandl.
450
Behdl.
451
Behrenstr.
452
Behring-Krkh.
453
Beiersd.
454
Beih.
455
Beisetzg.
456
Beisp.
457
Beitr.
458
Beitr.-rückerst.
459
Beitr.rückerst.
460
Beitragserst.
461
Beiw.
462
Beleucht.
463
Belg.
464
Bem.
465
Benf.
466
Benz.
467
Beobacht.
468
Bequemlichk.
84 469
Ber.
470
Berat.
85 471
Beratg.
472
Beratungsst.
86 473
Bereif.
474
Berg.
87 475
Bergb.
476
Bergbl.
477
Bergerstr.
88 478
Bergstr.
479
Bergw.
480
Berh.
481
Berl.
482
Berlinerstr.
89 483
Bernh.
484
Bernusstr.
485
Bertramstr.
486
Berufl.
487
Berufsauss.
488
Berufserfahr.
489
Berufsvorbildg.
90 490
Bes.
491
Bes.-Gr.
492
Besch.
493
Beschl.
91 494
Besichtig.
495
Bespr.
496
Best.
497
Best.-Nr.
498
Bestellnr.
499
Bestz.
92 500
Bestzust.
501
Bestzustd.
502
Besucherz.
503
Bet.
504
Betckestr.
93 505
Beteil.
94 506
Beteilig.
507
Bethl.
508
Bethmannstr.
95 509
Betr.
96
Bett.
510
Betr.-Akad.
511
Betr.-Ing.
512
Betr.-Nr.
513
Betramstr.
514
Betriebsabr.
515
Betriebsabrechn.
516
Betriebsführ.
517
Betriebswirtsch.
518
Bettinastr.
519
Bev.
97 520
Bew.
521
Bewegungsgym.
522
Beweism.
98 523
Bewerb.
99 524
Bewerbg.
525
Bewert.
100 526
Bez.
527
Bezahl.
528
Bezieh.
529
Beziehg.
530
Bezirks-Redakt.
531
Bfrs.
532
Bg.
533
Bgb.
101 534
Bgl.
535
Bgm.
102 536
Bhf.
537
Bib.
538
Biblioth.
539
Bieberstr.
540
Biebr.
541
Bielef.
103 542
Bierbr.
543
Bietigh.
544
Bildoff.
104 545
Bildzuschr.
105 546
Bilf.
547
Bill-Br.
548
Bill-Brauer.
549
Billbr.
550
Bin.
551
Binding-Br.
552
Biolog.
553
Birkenstr.
554
Birkenwaldstr.
555
Bischöfl.
556
Bismarckstr.
557
Bismarkstr.
106 558
Bj.
107 559
Bk.
560
BkS.
561
Bkges.
562
Bkvers.
108 563
Bkz.
564
Bkz.-Finanz.
565
Bkz.-Finanzierg.
109 566
Bl.
567
Bleibiskopfstr.
568
Bleichstr.
569
Blk.
110 570
Bln.
571
Bln.-Charlottenbg.
572
Blumenstr.
573
Blumenthalstr.
574
Blutspendezentr.
575
Blz.
576
Blücherstr.
577
Bm.
578
Bmstr.
111 579
Boch.
580
Boch.-Gels.
581
Bockenh.
112 582
Bod.
583
Bodelschwinghstr.
584
Bodensieckhs.
585
Bogenstr.
586
Bolongarostr.
587
Bongardstr.
588
Bootshfn.
589
Bootspl.
113 590
Bor.
591
Bornh.
592
Borsigstr.
593
Borsigw.
594
Boruss.
595
Botsch.
596
Bottigstr.
597
Bourg-St.
114 598
Bov.
599
Bozenerstr.
600
Bpr.
115 601
Br.
602
Br.-Veg.
603
Brahmsstr.
604
Brahmstr.
605
Branch.
606
Brau.
607
Braubachstr.
608
Brauereifds.
609
Brauind.
610
Brauneggerstr.
611
Braunfelsstr.
612
Braunk.
613
Braunl.
614
Brauns.
615
Braunschw.
616
Braunschwg.
617
Braunschwst.
618
Brd.
619
Bredenstr.
620
Brehmstr.
621
Breitbg.
622
Breitenb.
623
Breitenbg.
624
Breitscheidstr.
116 625
Brem.
626
Bremsh.
627
Brentanostr.
628
Breslauerstr.
629
Breubergstr.
630
Brg.
117 631
Brh.
632
Brit.
633
Brk.
634
Brockhöfe-Bhf.
118 635
Brok.
636
Brosch.
637
Brown.
638
Brschr.
639
Brschw.
... This diff was truncated because it exceeds the maximum size that can be displayed.
