{"id":22729234,"url":"https://github.com/araule/echosis","last_synced_at":"2026-04-13T02:02:01.454Z","repository":{"id":267487199,"uuid":"895127471","full_name":"Araule/echosis","owner":"Araule","description":" Code developed for : Darenne, L. (2024). Propositions pour l'identification, la modélisation et la quantification des chambres d’écho : Expérimentation sur un corpus de commentaires YouTube. Master Thesis, Institut National des Langues et Civilisations Orientales.","archived":false,"fork":false,"pushed_at":"2025-05-12T14:35:37.000Z","size":12314,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-12T15:45:50.805Z","etag":null,"topics":["agreement-classification","analysis","echo-chamber","topic-modeling","toxicity-classification","youtube"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Araule.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-11-27T15:55:41.000Z","updated_at":"2025-05-12T14:35:41.000Z","dependencies_parsed_at":"2025-03-15T22:32:48.306Z","dependency_job_id":null,"html_url":"https://github.com/Araule/echosis","commit_stats":null,"previous_names":["araule/echosis"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Araule/echosis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Araule%2Fechosis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Araule%2Fechosis/tags","releases_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Araule%2Fechosis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Araule%2Fechosis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Araule","download_url":"https://codeload.github.com/Araule/echosis/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Araule%2Fechosis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270702562,"owners_count":24630877,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-16T02:00:11.002Z","response_time":91,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agreement-classification","analysis","echo-chamber","topic-modeling","toxicity-classification","youtube"],"created_at":"2024-12-10T18:09:01.776Z","updated_at":"2026-04-13T02:02:01.445Z","avatar_url":"https://github.com/Araule.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ECHO chamber analySIS\n\nA framework for the analysis of echo chambers on YouTube (for now...)\n\n## Installation\n\nIf you want to train a transformers model on GPU with SpaCy, you need to install extra libraries. See [here](https://spacy.io/usage) for more information. 
You also need to choose and download one [SpaCy model](https://spacy.io/models), which will be used to preprocess the corpus for topic modeling and for training a classification model. As a last step, create your Python environment.\n\n```bash\n# venv\npython -m venv my_env\nsource my_env/bin/activate\npip install -r requirements.txt\n```\n\n## First Steps\n\nBefore running the scripts, it helps to have an idea of your corpus structure. At the end, you will have 3 files:\n- one with the videos' metadata, captions and Gensim annotation;\n- one with the comments' metadata, Perspective API annotation and agree-disagree annotation;\n- one with the commentators' metadata.\nNo need to worry about directories: they are created when files or models are saved.\n\nYou need to get a key to access [YouTube Data API v3](https://developers.google.com/youtube/registering_an_application) and another one to access [Perspective API](https://developers.google.com/codelabs/setup-perspective-api#5). You can also request a quota increase for [YouTube](https://support.google.com/youtube/contact/yt_api_form) or [Perspective](https://developers.perspectiveapi.com/s/request-quota-increase?language=en_US), if you are particularly impatient or are scraping a large YouTube channel. I cannot guarantee that your requests will be granted.\n\n### YouTube Corpus\n\nYou can use our scripts to get a corpus from YouTube through the API. For more information, see the `misc` directory.\n\n## Documentation\n\nAll functions are commented, and Python files are in the docs directory to show you how to import and use every part of the processing chain. Soon, you will be able to use the framework through a command-line interface.\n\n## Cite\n\n```\nDarenne, L. (2024). Propositions pour l'identification, la modélisation et la quantification des chambres d’écho : Expérimentation sur un corpus de commentaires YouTube. 
Master Thesis, Institut National des Langues et Civilisations Orientales.\n```\n\n```\n@mastersthesis{darenne_2024,\n    author = \"Laura Darenne\",\n    title = \"Propositions pour l'identification, la modélisation et la quantification des chambres d’écho : Expérimentation sur un corpus de commentaires YouTube\",\n    school = \"Institut National des Langues et Civilisations Orientales\",\n    year = \"2024\"\n}\n```\n\n## Main libraries used\n\n\u003e Guillaume Plique, Pauline Breteau, Jules Farjas, Héloïse Théro, Jean Descamps, Amélie Pellé, Laura Miguel, César Pichon, \u0026 Kelly Christensen. (2019). Minet, a webmining CLI tool \u0026 library for python. Zenodo. [http://doi.org/10.5281/zenodo.4564399](http://doi.org/10.5281/zenodo.4564399)\n\n\u003e Gensim. [https://radimrehurek.com/gensim/models/ldamodel.html](https://radimrehurek.com/gensim/models/ldamodel.html)\n\n\u003e Perspective API. [https://current.withgoogle.com/the-current/toxicity/](https://current.withgoogle.com/the-current/toxicity/).\n\n\u003e SpaCy. [https://spacy.io/](https://spacy.io/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faraule%2Fechosis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faraule%2Fechosis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faraule%2Fechosis/lists"}