https://github.com/dkpro/dkpro-bigdata

DKPro large scale processing support

# dkpro-bigdata

DKPro BigData makes it easy to run UIMA-based natural language processing pipelines on a Hadoop cluster.

### Features
* Large-scale NLP processing using UIMA and Hadoop
* Store your corpora on a Hadoop filesystem and access them from local or distributed pipelines
* Find patterns in your textual data using adaptable collocation extraction
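
The collocation extraction feature is not described further in this README. As a rough, generic illustration of the idea only (not DKPro BigData's implementation or API), the sketch below scores adjacent word pairs by pointwise mutual information (PMI), a common collocation measure. The class name `CollocationSketch`, the method `bigramPmi`, and the `minCount` threshold are hypothetical names chosen for this example, and tokens are assumed to contain no whitespace.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: scores adjacent token pairs by pointwise mutual information. */
final class CollocationSketch {

    /**
     * Returns the PMI (log base 2) of every adjacent token pair that occurs
     * at least minCount times. Keys have the form "first second".
     * Assumes tokens contain no whitespace.
     */
    static Map<String, Double> bigramPmi(List<String> tokens, int minCount) {
        Map<String, Integer> unigrams = new HashMap<>();
        Map<String, Integer> bigrams = new HashMap<>();

        // Count single tokens and adjacent pairs in one pass.
        for (int i = 0; i < tokens.size(); i++) {
            unigrams.merge(tokens.get(i), 1, Integer::sum);
            if (i + 1 < tokens.size()) {
                bigrams.merge(tokens.get(i) + " " + tokens.get(i + 1), 1, Integer::sum);
            }
        }

        double tokenTotal = tokens.size();
        double pairTotal = Math.max(1, tokens.size() - 1);
        Map<String, Double> scores = new HashMap<>();

        for (Map.Entry<String, Integer> entry : bigrams.entrySet()) {
            if (entry.getValue() < minCount) {
                continue; // rare pairs give unreliable PMI estimates
            }
            String[] words = entry.getKey().split(" ");
            double pPair = entry.getValue() / pairTotal;
            double pFirst = unigrams.get(words[0]) / tokenTotal;
            double pSecond = unigrams.get(words[1]) / tokenTotal;
            // PMI = log2( P(x, y) / (P(x) * P(y)) )
            scores.put(entry.getKey(), Math.log(pPair / (pFirst * pSecond)) / Math.log(2));
        }
        return scores;
    }

    public static void main(String[] args) {
        List<String> tokens =
                List.of("new", "york", "is", "big", "and", "new", "york", "is", "old");
        bigramPmi(tokens, 2)
                .forEach((pair, pmi) -> System.out.printf("%s\t%.3f%n", pair, pmi));
    }
}
```

On a cluster, the counting pass is the part that would typically be distributed, with the scoring step run over the aggregated counts.
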
### Details
* Execute DKPro pipelines on a Hadoop cluster with minimal adaptation
* Read data stored on an HDFS file system using DKPro Collection Readers
* Read and write serialized CASes from and to HDFS
### Contributors

* Hans-Peter Zorn
* Johannes Simon
* Martin Riedl
* Richard Eckart de Castilho
* Steffen Remus

## License
DKPro BigData is licensed under the Apache Software License (ASL), Version 2.0.

This project is a joint effort of UKP Lab and the Language Technology Group, Technical University of Darmstadt.