{"id":38544671,"url":"https://github.com/impresso/impresso-datalab-notebooks","last_synced_at":"2026-01-17T07:17:12.236Z","repository":{"id":243858107,"uuid":"784074009","full_name":"impresso/impresso-datalab-notebooks","owner":"impresso","description":"🔬 Impresso Datalab Notebooks","archived":false,"fork":false,"pushed_at":"2025-12-18T15:45:37.000Z","size":9305,"stargazers_count":9,"open_issues_count":28,"forks_count":3,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-12-20T06:03:03.335Z","etag":null,"topics":["api","computational-humanities","data-driven-methods","historial-media-analysis","historical-documents"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/impresso.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-04-09T06:16:52.000Z","updated_at":"2025-12-17T11:59:03.000Z","dependencies_parsed_at":null,"dependency_job_id":"8e9382c4-d536-4489-8d32-d71534f460a3","html_url":"https://github.com/impresso/impresso-datalab-notebooks","commit_stats":null,"previous_names":["impresso/impresso-datalab-ner-notebooks","impresso/impresso-datalab-notebooks"],"tags_count":0,"template":false,"template_full_name":"impresso/impresso-datalab-starter-pack","purl":"pkg:github/impresso/impresso-datalab-notebooks","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/impresso%2Fimpresso-datalab-notebooks","tags_url":"https://repos.e
cosyste.ms/api/v1/hosts/GitHub/repositories/impresso%2Fimpresso-datalab-notebooks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/impresso%2Fimpresso-datalab-notebooks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/impresso%2Fimpresso-datalab-notebooks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/impresso","download_url":"https://codeload.github.com/impresso/impresso-datalab-notebooks/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/impresso%2Fimpresso-datalab-notebooks/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28503382,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T06:57:29.758Z","status":"ssl_error","status_checked_at":"2026-01-17T06:56:03.931Z","response_time":85,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","computational-humanities","data-driven-methods","historial-media-analysis","historical-documents"],"created_at":"2026-01-17T07:17:12.119Z","updated_at":"2026-01-17T07:17:12.215Z","avatar_url":"https://github.com/impresso.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Impresso Datalab Notebooks\n\n![License: AGPLV3+](https://img.shields.io/badge/License-AGPLV3+-brightgreen.svg) 
![Python](https://img.shields.io/badge/Python-\u003e=3.10-blue.svg) [![Generic badge](https://img.shields.io/badge/Status-WIP!-red.svg)](https://shields.io/)\n\n## About\n\nThe Impresso project develops application interfaces to facilitate historical transmedia research through:\n\n- the **[Impresso Web App](https://impresso-project.ch/app)**, a user interface for content exploration and visualisation.\n- the **[Impresso Datalab]()**, a suite of tools for data exploration and analysis.\n\nSpecifically, the Impresso Datalab enables custom analyses of the Impresso corpus, as well as the semantic indexing of external document collections with Impresso models. We offer access to the Impresso corpus, data and models via the Impresso Public API, a dedicated Python library, and Hugging Face. For more information, be sure to visit the [Datalab website](https://dev.impresso-project.ch/datalab/about/).\n\n## Contents\n\nThis repository contains notebooks that illustrate how to use the **[Impresso Public API](#)** (coming soon!) and **Impresso Models**, allowing you to search through Impresso data and use Impresso annotation models.    \n \n  \n- **Impresso Public API**: The software component that provides third-party access to the Impresso backend.    \n- **Impresso Python Library**: The preferred method for users to interact with the Impresso Public API.    \n- **Impresso Models**: A collection of models trained to annotate the Impresso Corpus, made publicly available to facilitate the annotation of external documents, enabling comparison and analysis of semantic enrichments. Impresso models can be accessed through the [Impresso Hugging Face organisation](https://huggingface.co/impresso-project) and via annotation services offered through the API.    \n\nBefore getting started, check out how to create an account and obtain an API token on the [Impresso Datalab website]().     
\n\n\n## Notebooks\n\n### Getting Started\n\nThe notebooks in the [`starter`](https://github.com/impresso/impresso-datalab-notebooks/tree/main/starter) folder will help you get started with the Impresso Public API and Python library:\n\n- [Introduction to the Impresso Python Library](https://github.com/impresso/impresso-datalab-notebooks/blob/main/starter/basics_ImpressoAPI.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/impresso/impresso-datalab-notebooks/blob/main/starter/basics_ImpressoAPI.ipynb)\n- [A quick guide to searching with Impresso library](https://github.com/impresso/impresso-datalab-notebooks/blob/main/starter/search_ImpressoAPI.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/impresso/impresso-datalab-notebooks/blob/main/starter/search_ImpressoAPI.ipynb)\n\n\n\n### Explore and Visualise your Impresso data\n\nThe notebooks in the [`explore-vis`](https://github.com/impresso/impresso-datalab-notebooks/tree/main/explore-vis) folder help you build complementary views on your Impresso data:\n\n- [Exploring Entity Co-occurrence Networks](https://github.com/impresso/impresso-datalab-notebooks/blob/main/explore-vis/entity_network.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/impresso/impresso-datalab-notebooks/blob/main/explore-vis/entity_network.ipynb)\n- [Visualising Place Entities on Maps](https://github.com/impresso/impresso-datalab-notebooks/blob/main/explore-vis/place-entities_map.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/impresso/impresso-datalab-notebooks/blob/main/explore-vis/place-entities_map.ipynb)\n- [Inspecting my collection with data visualisation 
tools](https://github.com/impresso/impresso-datalab-notebooks/blob/main/explore-vis/inspecting_my_collection.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/impresso/impresso-datalab-notebooks/blob/main/explore-vis/inspecting_my_collection.ipynb)\n\n### Annotate your Documents with Impresso Models\n\nThe notebooks in the [`annotate`](https://github.com/impresso/impresso-datalab-notebooks/tree/main/annotate) folder demonstrate how to use Impresso models, either from the [Hugging Face hub](https://huggingface.co/impresso-project) or through the Impresso API. These notebooks guide you in annotating your documents to produce annotations that are compatible with those in the Impresso corpus.\n\n- [Language Identification with impresso-pipelines Package](https://github.com/impresso/impresso-datalab-notebooks/blob/main/annotate/langident_pipeline_demo.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/impresso/impresso-datalab-notebooks/blob/main/annotate/langident_pipeline_demo.ipynb)\n- [OCR Quality Assessment with impresso-pipelines Package](https://github.com/impresso/impresso-datalab-notebooks/blob/main/annotate/ocrqa_pipeline_demo.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/impresso/impresso-datalab-notebooks/blob/main/annotate/ocrqa_pipeline_demo.ipynb)\n- [News Agencies Recognition and Linking with Impresso BERT models](https://github.com/impresso/impresso-datalab-notebooks/blob/main/annotate/newsagency-processing_ImpressoHF.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)]()\n\n\n\n## About Impresso\n\n### Impresso project\n\n[Impresso - Media Monitoring of the Past](https://impresso-project.ch) is an\ninterdisciplinary research project that aims to develop and consolidate tools for\nprocessing and 
exploring large collections of media archives across modalities, time,\nlanguages and national borders. The first project (2017-2021) was funded by the Swiss\nNational Science Foundation under grant\nNo. [CRSII5_173719](http://p3.snf.ch/project-173719) and the second project (2023-2027)\nby the SNSF under grant No. [CRSII5_213585](https://data.snf.ch/grants/grant/213585)\nand the Luxembourg National Research Fund under grant No. 17498891.\n\n### Copyright\n\nCopyright (C) 2024 The Impresso team.\n\n### License\n\nThis program is provided as open source under\nthe [GNU Affero General Public License](https://github.com/impresso/impresso-pyindexation/blob/master/LICENSE)\nv3 or later.\n\n---\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/impresso/impresso.github.io/blob/master/assets/images/3x1--Yellow-Impresso-Black-on-White--transparent.png?raw=true\" width=\"350\" alt=\"Impresso Project Logo\"/\u003e\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fimpresso%2Fimpresso-datalab-notebooks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fimpresso%2Fimpresso-datalab-notebooks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fimpresso%2Fimpresso-datalab-notebooks/lists"}