https://github.com/dkpro/dkpro-bigdata
DKPro large scale processing support
- Host: GitHub
- URL: https://github.com/dkpro/dkpro-bigdata
- Owner: dkpro
- License: other
- Created: 2015-04-13T09:57:20.000Z (about 11 years ago)
- Default Branch: master
- Last Pushed: 2018-05-18T19:06:33.000Z (about 8 years ago)
- Last Synced: 2025-04-09T12:11:19.298Z (about 1 year ago)
- Language: Java
- Homepage: https://dkpro.github.io/dkpro-bigdata
- Size: 1.45 MB
- Stars: 4
- Watchers: 8
- Forks: 3
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
# dkpro-bigdata
DKPro BigData makes it easy to run UIMA-based natural language processing pipelines on a Hadoop cluster.
### Features
* Large-scale NLP processing using UIMA and Hadoop
* Store your corpora on a Hadoop file system and access them from local or distributed pipelines
* Find patterns in your textual data using adaptable collocation extraction
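
To illustrate what collocation extraction means in general, here is a minimal, self-contained sketch that scores adjacent word pairs by pointwise mutual information (PMI). It is not DKPro BigData's implementation and uses none of its API: the class name `CollocationSketch`, the method `pmiScores`, and the `minCount` threshold are all hypothetical names chosen for this example, and PMI is only one of several association measures a collocation extractor might use.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of DKPro BigData.
public class CollocationSketch {

    /**
     * Scores each adjacent word pair by pointwise mutual information (in bits).
     * Pairs seen fewer than minCount times are skipped.
     * Assumes tokens contain no spaces, since a space joins the pair key.
     */
    public static Map<String, Double> pmiScores(List<String> tokens, int minCount) {
        Map<String, Integer> unigrams = new HashMap<>();
        Map<String, Integer> bigrams = new HashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            unigrams.merge(tokens.get(i), 1, Integer::sum);
            if (i + 1 < tokens.size()) {
                bigrams.merge(tokens.get(i) + " " + tokens.get(i + 1), 1, Integer::sum);
            }
        }

        double tokenCount = tokens.size();
        double pairCount = Math.max(1, tokens.size() - 1);
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Integer> entry : bigrams.entrySet()) {
            if (entry.getValue() < minCount) {
                continue;
            }
            String[] words = entry.getKey().split(" ");
            double pPair = entry.getValue() / pairCount;
            double pFirst = unigrams.get(words[0]) / tokenCount;
            double pSecond = unigrams.get(words[1]) / tokenCount;
            // PMI = log2( P(x, y) / (P(x) * P(y)) )
            scores.put(entry.getKey(), Math.log(pPair / (pFirst * pSecond)) / Math.log(2));
        }
        return scores;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "new", "york", "is", "big", "and", "new", "york", "is", "old");
        pmiScores(tokens, 2).forEach(
                (pair, pmi) -> System.out.printf("%s\t%.3f%n", pair, pmi));
    }
}
```

On a cluster, the counting step would typically be distributed (for example as a map and reduce phase) and the scoring applied to the aggregated counts; the sketch keeps everything in memory to stay short.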
### Details
* Execute DKPro pipelines on a Hadoop cluster with minimal adaptation
* Read data stored on an HDFS file system using DKPro Collection Readers
* Read and write serialized CASes from and to HDFS
### Contributors
* Hans-Peter Zorn
* Johannes Simon
* Martin Riedl
* Richard Eckart de Castilho
* Steffen Remus
## License
DKPro BigData is licensed under the Apache Software License (ASL) Version 2.0.
This project is a joint effort of UKP Lab and the Language Technology Group, Technical University of Darmstadt.