Apache OpenNLP Pretrained Models
https://opennlp.apache.org
==========================

Version 1.2
###########
- All models have been trained on the Universal Dependencies corpus, version 2.15, using Apache OpenNLP 2.5.0 (and should work with previous OpenNLP versions >= 1.0).
- Training was conducted with 300 iterations instead of 100, which should result in better overall model performance.
- Please note:
  - The Lemmatizer model type was added for all existing and new languages.
  - We now provide models of all four types for the following 9 new languages:
    - "Armenian|hy|BSUT"
    - "Basque|eu|BDT"
    - "Catalan|ca|AnCora"
    - "Georgian|ka|GLC"
    - "Greek|el|GDT"
    - "Kazakh|kk|KTB"
    - "Korean|ko|Kaist"
    - "Icelandic|is|IcePaHC"
    - "Turkish|tr|BOUN"
- Refer to the opennlp-training-eval-logs-1.2-2.5.0.zip file for the individual model training and evaluation logs.

Version 1.1
###########
- Trained using Apache OpenNLP 2.4.0. (The models should work with other OpenNLP versions but were trained and tested using this version.)
- Trained on the Universal Dependencies corpus, version 2.14.
- The model file names match the specific Universal Dependencies training data.
- Unless specifically noted, all models were trained with default options using the Apache OpenNLP CLI.
- Refer to the opennlp-training-eval-logs-1.1-2.4.0.zip file for the individual model training and evaluation logs.
- Please note:
  - The French models achieved very low performance scores in version 1.0. This was due to an error in the FTB treebank, which is now discontinued. We now use GSD instead, and these models achieve much better performance.
  - We now provide models for the following new languages:
    - "Bulgarian|bg|BTB"
    - "Czech|cs|PDT"
    - "Croatian|hr|SET"
    - "Danish|da|DDT"
    - "Estonian|et|EDT"
    - "Finnish|fi|TDT"
    - "Latvian|lv|LVTB"
    - "Norwegian|no|Bokmaal"
    - "Polish|pl|PDB"
    - "Portuguese|pt|GSD"
    - "Romanian|ro|RRT"
    - "Russian|ru|GSD"
    - "Serbian|sr|SET"
    - "Slovenian|sl|SSJ"
    - "Slovak|sk|SNK"
    - "Spanish|es|GSD"
    - "Swedish|sv|Talbanken"
    - "Ukrainian|uk|IU"

Version 1.0
###########
- Trained using Apache OpenNLP 1.9.3. (The models should work with other OpenNLP versions but were trained and tested using this version.)
- Trained on the Universal Dependencies corpus.
- The model file names match the specific Universal Dependencies training data.
- Unless specifically noted, all models were trained with default options using the Apache OpenNLP CLI.
- Refer to the opennlp-training-eval-logs-1.0-1.9.3.zip file for the individual model training and evaluation logs.
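
Note on the language entries: the quoted entries in the language lists of versions 1.1 and 1.2 appear to follow a "Name|code|treebank" pattern. The sketch below shows one way such an entry could be split into its parts. It assumes the three fields are the language name, a two-letter language code, and the Universal Dependencies treebank name; this reading is inferred from the entries themselves and is not stated in these notes. The names `ModelEntry` and `parse_entry` are hypothetical helpers, not part of Apache OpenNLP.

```python
from typing import NamedTuple


class ModelEntry(NamedTuple):
    """One language entry; field meanings are assumptions, see the note above."""
    language: str  # e.g. "Basque"
    code: str      # e.g. "eu" (assumed: language code)
    treebank: str  # e.g. "BDT" (assumed: UD treebank name)


def parse_entry(entry: str) -> ModelEntry:
    """Split a quoted entry such as '"Basque|eu|BDT"' into its three fields."""
    parts = entry.strip().strip('"').split("|")
    if len(parts) != 3:
        raise ValueError(f"expected 3 '|'-separated fields, got {entry!r}")
    return ModelEntry(*parts)


# A few entries copied from the lists in these notes.
entries = ['"Armenian|hy|BSUT"', '"Basque|eu|BDT"', '"Norwegian|no|Bokmaal"']
parsed = [parse_entry(e) for e in entries]

# Index the parsed entries by their code for quick lookup.
by_code = {m.code: m for m in parsed}
```

With the entries indexed this way, `by_code["eu"].treebank` gives the treebank a Basque model was trained on, which can help when matching a model file to its training data.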