Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/opensemanticsearch/open-semantic-etl
Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, with an ingestor for Solr or Elasticsearch indexes and linked data graph databases
Last synced: 17 days ago
- Host: GitHub
- URL: https://github.com/opensemanticsearch/open-semantic-etl
- Owner: opensemanticsearch
- License: gpl-3.0
- Created: 2015-05-30T17:49:10.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2022-10-09T06:42:26.000Z (about 2 years ago)
- Last Synced: 2024-10-12T07:24:51.139Z (about 1 month ago)
- Topics: annotation, documents, elasticsearch, enrichment, etl, extract, extract-information, extract-text, extractor, ingest, ingestion-pipeline, ingests-documents, named-entity-recognition, nlp, ocr, pdf, python, rdf, solr, solr-dataimporter
- Language: Python
- Homepage: https://opensemanticsearch.org/etl
- Size: 615 KB
- Stars: 257
- Watchers: 27
- Forks: 69
- Open Issues: 41
Metadata Files:
- Funding: .github/FUNDING.yml
- License: LICENSE