Projects in Awesome Lists by impresso
A curated list of projects in awesome lists by impresso.
https://github.com/impresso/impresso-datalab-notebooks
🔬 Impresso Datalab Notebooks
api computational-humanities data-driven-methods historial-media-analysis historical-documents
Last synced: 17 Jan 2026
https://github.com/impresso/fair-sentence-transformers
Positionally fairer SentenceTransformers. Replication code for the preprint "Information Representation Fairness in Long-Document Embeddings: The Peculiar Interaction of Positional and Language Bias". All resources are publicly available and open source, in the hope of facilitating further research in Information Representation Fairness.
Last synced: 15 Apr 2026
https://github.com/impresso/impresso-frontend
🚀 The frontend application of the Impresso WebApp
design digital-library historical-documents historical-research information-extraction information-retrieval media-history natural-language-processing
Last synced: 17 Jan 2026
https://github.com/impresso/impresso-schemas
Repository of JSON schemas used in the Impresso project.
digital-humanities historical-newspapers json-schema
Last synced: 17 Jan 2026
https://github.com/impresso/impresso-pipelines
Reusable NLP pipelines: identify language, assess OCR quality, model topics, and extract news‑agency entities from any text.
Last synced: 14 Jan 2026
https://github.com/impresso/impresso-datalab
Impresso Datalab static Astro website
Last synced: 17 Jan 2026
https://github.com/impresso/impresso-passim
This repository contains code and sample data related to running the impresso corpus through the text reuse detection software passim.
historical-newspapers text-reuse
Last synced: 17 Jan 2026
https://github.com/impresso/impresso-consolidated-canonical-cookbook
Cookbook to produce the consolidated canonical format on S3.
Last synced: 13 Jan 2026
https://github.com/impresso/impresso-frontend-e2e
End-to-end tests for the Impresso Web Application
Last synced: 16 Jan 2026
https://github.com/impresso/impresso-text-embedder
Multilingual text vectorizer for semantic search and comparison.
Last synced: 17 Jan 2026
https://github.com/impresso/impresso-linguistic-processing
Code for running spaCy on rebuilt impresso data.
lemmatization ner-tagging pos-tagging
Last synced: 17 Jan 2026
https://github.com/impresso/impresso-make-cookbook
Repository for a Make-based cookbook of offline (NLP) processing steps.
Last synced: 17 Jan 2026
https://github.com/impresso/impresso-py
Impresso Python Library to interact with the Impresso Public API
api computational-humanities data-driven-historical-research historical-documents impresso impresso-datalab media-history python-library
Last synced: 17 Jan 2026
https://github.com/impresso/impresso-essentials
⚙️ Python package of highly reusable modules and functions used within impresso.
Last synced: 17 Jan 2026