https://github.com/danieldk/conllu-utils
Utilities for working with CoNLL-U
- Host: GitHub
- URL: https://github.com/danieldk/conllu-utils
- Owner: danieldk
- Created: 2020-03-21T13:07:30.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2022-11-30T20:03:48.000Z (over 3 years ago)
- Last Synced: 2025-04-23T04:18:35.745Z (about 1 year ago)
- Topics: conllu, utilities
- Language: Rust
- Homepage:
- Size: 74.2 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# CoNLL-U Utilities
## Introduction
This is a set of utilities to process files in the CoNLL-U format. The
`conllu` command provides the following subcommands:
* `accuracy`: compute the accuracy of a system based on two treebanks.
* `cleanup`: normalize Unicode and replace Unicode punctuation.
* `compare`: compare two treebanks on one or more layers.
* `from-text`: convert tokenized text files to CoNLL-U.
* `merge`: merge CoNLL-U files.
* `partition`: partition a CoNLL-U file into N files.
* `shuffle`: shuffle the sentences in a CoNLL-U file.
* `to-text`: convert CoNLL-U to tokenized plain text.
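
To make the data model behind these subcommands concrete, the following is a minimal sketch of a CoNLL-U to tokenized text conversion, written against the public CoNLL-U format only (blank-line-separated sentences, `#` comment lines, tab-separated columns with the word form in the second column). It is not the implementation used by `to-text`. The function name `conllu_to_text` is hypothetical, and skipping multiword-token ranges and empty nodes is an assumption of this sketch; the actual subcommand may treat them differently.

```rust
/// Hypothetical helper (not part of conllu-utils): convert CoNLL-U input
/// to tokenized text, one sentence per element, tokens joined by spaces.
fn conllu_to_text(input: &str) -> Vec<String> {
    let mut sentences = Vec::new();
    let mut tokens: Vec<&str> = Vec::new();

    for line in input.lines() {
        let line = line.trim_end();

        // A blank line ends the current sentence.
        if line.is_empty() {
            if !tokens.is_empty() {
                sentences.push(tokens.join(" "));
                tokens.clear();
            }
            continue;
        }

        // Lines starting with '#' are sentence-level comments.
        if line.starts_with('#') {
            continue;
        }

        // Token lines are tab-separated; column 1 is ID, column 2 is FORM.
        let mut columns = line.split('\t');
        let (id, form) = match (columns.next(), columns.next()) {
            (Some(id), Some(form)) => (id, form),
            _ => continue, // Ignore malformed lines in this sketch.
        };

        // Assumption: skip multiword-token ranges ("1-2") and empty
        // nodes ("1.1"), keeping only the syntactic words.
        if id.contains('-') || id.contains('.') {
            continue;
        }

        tokens.push(form);
    }

    // Flush the last sentence if the input lacks a trailing blank line.
    if !tokens.is_empty() {
        sentences.push(tokens.join(" "));
    }

    sentences
}
```

Calling `conllu_to_text` on the contents of a CoNLL-U file returns one string per sentence; writing each string on its own line gives tokenized plain text.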
## Usage
Run any subcommand with `--help` as an argument to print its usage
information.