{"id":49726272,"url":"https://github.com/sdsc-ordes/debates-analytics","last_synced_at":"2026-05-09T04:08:12.175Z","repository":{"id":340927640,"uuid":"1082366195","full_name":"sdsc-ordes/debates-analytics","owner":"sdsc-ordes","description":"Debates Transcription and Translation by AI Whisper plus a dashboard to search in the debates","archived":false,"fork":false,"pushed_at":"2026-02-27T08:02:39.000Z","size":10908,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-27T12:39:40.832Z","etag":null,"topics":["debates","search-engine","transcription","translation"],"latest_commit_sha":null,"homepage":"https://sdsc-ordes.github.io/debates-analytics/index.html","language":"Svelte","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sdsc-ordes.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-24T06:18:12.000Z","updated_at":"2026-02-27T07:47:16.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/sdsc-ordes/debates-analytics","commit_stats":null,"previous_names":["sdsc-ordes/debates-analytics"],"tags_count":2,"template":false,"template_full_name":"sdsc-ordes/repository-template-python","purl":"pkg:github/sdsc-ordes/debates-analytics","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sdsc-ordes%2Fdebates-analytics","tags_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/sdsc-ordes%2Fdebates-analytics/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sdsc-ordes%2Fdebates-analytics/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sdsc-ordes%2Fdebates-analytics/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sdsc-ordes","download_url":"https://codeload.github.com/sdsc-ordes/debates-analytics/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sdsc-ordes%2Fdebates-analytics/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32806721,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-08T08:22:46.396Z","status":"online","status_checked_at":"2026-05-09T02:00:06.633Z","response_time":123,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["debates","search-engine","transcription","translation"],"created_at":"2026-05-09T04:08:11.402Z","updated_at":"2026-05-09T04:08:12.169Z","avatar_url":"https://github.com/sdsc-ordes.png","language":"Svelte","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./docs/assets/political_debates_logo.svg\" alt=\"debates logo\" width=\"250\"\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003e\n  Debates Analytics\n\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\n\u003c/p\u003e\n\u003cp 
align=\"center\"\u003e\n  \u003ca href=\"https://github.com/sdsc-ordes/debates-analytics/releases/latest\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/release/sdsc-ordes/debates-analytics.svg?style=for-the-badge\" alt=\"Current Release label\" /\u003e\u003c/a\u003e\n  \u003ca href=\"http://www.apache.org/licenses/LICENSE-2.0.html\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/LICENSE-Apache2.0-ff69b4.svg?style=for-the-badge\" alt=\"License label\" /\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n## About\n\nThis repository provides an app that transcribes and translates debates in\nwhich speakers take turns. Any such video or audio file in `mp4` or `wav`\nformat can be uploaded via a dashboard for analysis.\n\n- The analysis is performed with the Hugging Face component\n  [odtp-pyannote-whisper](https://github.com/sdsc-ordes/odtp-pyannote-whisper),\n  which was developed in the context of this project and can be accessed\n  directly via\n  [Hugging Face](https://huggingface.co/spaces/katospiegel/odtp-pyannote-whisper).\n\n- The results of that analysis are loaded into an S3-compatible object store\n  (Garage).\n\n- From there they are indexed into the Solr search engine. 
A MongoDB database\n  is used to manage the media processing results and status.\n\n- A dashboard is provided to make all processing and results available via a\n  common interface: it consists of a frontend, a backend, and a Redis queue for\n  decoupled processing of the long-running media analysis jobs on Hugging Face.\n\n## Authors\n\n- [Sabine Maennel](mailto:sabine.maennel@sdsc.ethz.ch)\n- [Carlos Vivar Rios](mailto:carlos.vivarrios@epfl.ch)\n- [Hannah Casey](mailto:hannah.casey@sdsc.ethz.ch)\n\n## Installation\n\nInstallation and its options are described in the\n[documentation](https://sdsc-ordes.github.io/debates-analytics/installation/overview/).\n\n## Usage\n\nUsage is described in the\n[documentation](https://sdsc-ordes.github.io/debates-analytics/userguide/roles/).\n\n## Development\n\nSee the\n[documentation](https://sdsc-ordes.github.io/debates-analytics/development/setup/).\n\n## Acknowledgement\n\nThis work was originally funded by the SNSF Spark Grant number 221139 “Debating\nHuman Rights”\n[SNSF Data Portal . 
Documentation: Political Debates](https://data.snf.ch/grants/grant/221139).\n\nThe goal of that project was to create specialized components for the analysis\nof videos from United Nations Human Rights Council (UNHRC) debates.\n\n- Sophisticated Transcription: Integrating and optimizing cutting-edge\n  transcription models (e.g., Whisper 3.0) to ensure accurate, multilingual\n  transcription of UNHRC debates.\n- Multimodal Data Handling: Developing components tailored to video/audio\n  processing, scene extraction, and diarization.\n- Specialized Database Integration: Designing and deploying a database structure\n  to effectively store debate transcripts, relevant metadata, and extracted\n  features.\n\nThis repo was created as a wrap-up of that project, to make the processing and\nresults available in a more general form.\n\n## Copyright\n\nCopyright © 2025-2028 Swiss Data Science Center (SDSC),\n[www.datascience.ch](http://www.datascience.ch/). All rights reserved. The SDSC\nis jointly established and legally represented by the École Polytechnique\nFédérale de Lausanne (EPFL) and the Eidgenössische Technische Hochschule Zürich\n(ETH Zürich). This copyright encompasses all materials, software, documentation,\nand other content created and developed by the SDSC.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsdsc-ordes%2Fdebates-analytics","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsdsc-ordes%2Fdebates-analytics","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsdsc-ordes%2Fdebates-analytics/lists"}