https://github.com/ryanmcd/uni-dep-tb

# uni-dep-tb
A set of treebanks for multiple languages annotated in basic Stanford-style dependencies.

NB: The guidelines are currently under revision and the project is migrating to
http://universaldependencies.github.io/docs/.
For further updates, check the new site or contact the project coordinators.

**v2.1**

* Identical to version 2.0, except that the license changes to CC-BY-SA, dropping the non-commercial restriction. The new license applies solely to the UD annotations, not to the underlying content.

**v2.0**

* Includes Brazilian-Portuguese, English, Finnish, French, German, Italian, Indonesian, Japanese, Korean, Spanish and Swedish
* Beta content-head version
* Bug fixes
* Description of universal relations

**v1.0**

* Includes English, French, German, Spanish, Swedish and Korean

**Releases**

* Version 2.0 - Bug fixes, new data, 5 new languages, content-head beta version
* Version 1.0 - Initial Release

**Relevant Documents**

* Universal Dependency Guidelines
  * https://uni-dep-tb.googlecode.com/svn/trunk/universal-guidelines.pdf
* Universal Dependency Annotation for Multilingual Parsing. McDonald et al. ACL 2013
  * http://ryanmcd.com/papers/treebanksACL2013.pdf
* English Stanford guidelines
  * http://www-nlp.stanford.edu/software/dependencies_manual.pdf
* Generating typed dependency parses from phrase structure parses. De Marneffe et al. LREC 2006.
* https://code.google.com/p/uni-dep-tb/

**Contributors and Acknowledgements**

* Project coordinators: Ryan McDonald, Joakim Nivre, Slav Petrov
* Data contributors include Yvonne Quirmbach-Brundage and others at Appen-Butler-Hill; Adam LaMontagne, Milan Soucek, Timo Jarvinen, Alessandra Radici and others at Lionbridge
* Joakim Nivre provided the harmonized version of the Swedish Treebank Talbanken portion
  * http://stp.lingfil.uu.se/~nivre/swedish_treebank/
* Filip Ginter and the group at Turku provided the Finnish data and assisted in the harmonization process
  * http://bionlp.utu.fi/fintreebank.html
* Maria Simi and other researchers at Pisa provided the harmonized Italian data
  * http://medialab.di.unipi.it/wiki/ISDT
* Thanks to Fernando Pereira, Alfred Spector, Dave Orr, Jennifer Bahk and others at Google for support.
* Thanks to Hans Uszkoreit for giving us permission to use sentences from the Tiger treebank.