Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-historical-newspaper-analysis
Awesome historical newspaper analysis tools and literature
https://github.com/Duke-Chronicle-Project/awesome-historical-newspaper-analysis
Data standards
hOCR
- hocr-tools
- hocrjs - Visualization of hOCR files.
- PAGEviewer - Visualization of page layout and OCR segmentation for PAGE XML, ALTO XML, FineReader XML and hOCR.
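For readers who want to look inside an hOCR file before reaching for the tools above, here is a minimal sketch using only the Python standard library. It relies on the hOCR convention that recognized words are `span` elements with class `ocrx_word` whose `title` attribute carries a `bbox x0 y0 x1 y1` property. The `HocrWordParser` class and the `SAMPLE` snippet are hypothetical names and data made up for illustration; real files may use nested markup that this sketch does not handle.

```python
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collect the text and bounding box of each ocrx_word span (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # word currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            # The title attribute holds ';'-separated properties, e.g. "bbox 10 20 90 50; x_wconf 93"
            props = {}
            for part in (attrs.get("title") or "").split(";"):
                fields = part.split()
                if fields:
                    props[fields[0]] = fields[1:]
            bbox = tuple(int(v) for v in props.get("bbox", []))
            self._current = {"text": "", "bbox": bbox}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        # Assumes word spans contain no nested span elements
        if tag == "span" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.words.append(self._current)
            self._current = None


# Made-up hOCR fragment for illustration only
SAMPLE = """
<span class='ocr_line' title='bbox 10 20 300 50'>
  <span class='ocrx_word' title='bbox 10 20 90 50; x_wconf 93'>Daily</span>
  <span class='ocrx_word' title='bbox 100 20 200 50; x_wconf 88'>Chronicle</span>
</span>
"""

parser = HocrWordParser()
parser.feed(SAMPLE)
words = parser.words
for word in words:
    print(word["text"], word["bbox"])
```

For production use, a dedicated package such as hocr-tools is likely a better fit than hand-rolled parsing.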
Optical character recognition
Document layout analysis, text enrichment and semantic segmentation
- layout-parser - Detectron2-based layout analysis tool.
- dhSegment
Text analysis
- TidyText - Manipulation of text data (R package; similar workflows are possible in Python with pandas).
- scikit-learn - Well-documented general purpose ML library.
- LdaSeqModel - Dynamic topic modeling in Python.
- The Impresso project - Text mining 200 years of historical newspapers.
- app - project.ch/blog/
- GitHub Organization
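The TidyText entry notes that the same kind of manipulation can be done with pandas. As a rough sketch of what that might look like, the snippet below builds a one-token-per-row table and counts words per article. The `unnest_tokens` helper here is a hypothetical Python function named after the tidytext one, not part of pandas, and the sample articles are made up for illustration. The tokenizer is a deliberately simple regular expression that assumes lowercase ASCII words.

```python
import pandas as pd


def unnest_tokens(df, text_col="text", token_col="word"):
    """Return one row per token (hypothetical helper, loosely mirroring tidytext)."""
    # Lowercase the text and extract simple alphabetic tokens as lists
    tokens = df[text_col].str.lower().str.findall(r"[a-z']+")
    out = (
        df.drop(columns=[text_col])
        .assign(**{token_col: tokens})
        .explode(token_col)  # one row per list element
    )
    # Rows with no tokens explode to NaN; drop them
    return out.dropna(subset=[token_col]).reset_index(drop=True)


# Made-up sample data for illustration only
articles = pd.DataFrame(
    {
        "article_id": [1, 2],
        "text": ["The strike ended today.", "The mayor spoke; the crowd cheered."],
    }
)

tidy = unnest_tokens(articles)
counts = tidy.groupby(["article_id", "word"]).size().reset_index(name="n")
print(counts.sort_values("n", ascending=False).head())
```

From a table in this shape, per-document word counts can be passed on to other tools, for example as input to a topic model.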
Quality evaluation
- Aletheia - Ground truth annotation tool.
Keywords
- ocr (2)
- lstm (1)
- machine-learning (1)
- ocr-engine (1)
- tesseract (1)
- tesseract-ocr (1)
- computer-vision (1)
- deep-learning (1)
- detectron2 (1)
- document-image-processing (1)
- document-layout-analysis (1)
- layout-analysis (1)
- layout-detection (1)
- layout-parser (1)
- object-detection (1)
- document-processing (1)
- historical-data (1)
- python3 (1)
- segmentation (1)
- tensorflow (1)
- natural-language-processing (1)
- r (1)
- text-mining (1)
- tidy-data (1)
- tidyverse (1)