{"id":38554088,"url":"https://github.com/compnet/wikisynch","last_synced_at":"2026-01-17T07:39:35.753Z","repository":{"id":145486607,"uuid":"214163652","full_name":"CompNet/WikiSynch","owner":"CompNet","description":"Synchronization between two Wikipedia-based Corpora","archived":false,"fork":false,"pushed_at":"2024-10-05T18:40:35.000Z","size":32,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-09-10T05:06:28.525Z","etag":null,"topics":["abuse-detection","annotation","conversations","corpus"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CompNet.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2019-10-10T11:22:15.000Z","updated_at":"2024-10-05T18:40:39.000Z","dependencies_parsed_at":null,"dependency_job_id":"32b34544-d24d-4885-aaa5-17fade58fa34","html_url":"https://github.com/CompNet/WikiSynch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/CompNet/WikiSynch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompNet%2FWikiSynch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompNet%2FWikiSynch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompNet%2FWikiSynch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/CompNet%2FWikiSynch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CompNet","download_url":"https://codeload.github.com/CompNet/WikiSynch/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompNet%2FWikiSynch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28504356,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T06:57:29.758Z","status":"ssl_error","status_checked_at":"2026-01-17T06:56:03.931Z","response_time":85,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["abuse-detection","annotation","conversations","corpus"],"created_at":"2026-01-17T07:39:35.660Z","updated_at":"2026-01-17T07:39:35.726Z","avatar_url":"https://github.com/CompNet.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Wikipedia Abusive Conversations\n===================\n*Synchronization between two Wikipedia-based Corpora*\n\n* Copyright 2019-2020 Noé Cécillon\n\n`WikiSynch` is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation. 
For source availability and license information see licence.txt\n\n* **Lab site:** http://lia.univ-avignon.fr\n* **GitHub repo:** https://github.com/CompNet/Pang\n* **Contact:** Noé Cécillon \u003cnoe.cecillon@univ-avignon.fr\u003e\n\n-------------------------------------------------------------------------\n\n## Description\n*Wikipedia Abusive Conversations* (WAC) is a large corpus of Wikipedia conversations annotated for 3 types of abusive content (personal attack, aggression and toxicity). We developed a reconstruction pipeline to synchronize 2 existing corpora of Wikipedia comments and create WAC. This repository contains the source code used to perform this alignment.\n\nIf you use this source code or the associated data, please cite the article [[C'20](#references)]:\n```bibtex\n@InProceedings{Cecillon2020,\n  author    = {Cécillon, Noé and Labatut, Vincent and Dufour, Richard and Linarès, Georges},\n  title     = {{WAC}: A Corpus of {W}ikipedia Conversations for Online Abuse Detection},\n  booktitle = {12th Language Resources and Evaluation Conference},\n  year      = {2020},\n  pages     = {1375-1383},\n  address   = {Marseille, FR},\n  url       = {http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.172.pdf},\n}\n```\n\n\n## Dataset\nThe dataset itself is available for download on [Zenodo](https://doi.org/10.5281/zenodo.6817093). \nThe content of Wikipedia comments is distributed under the [CC-BY-SA 3.0](https://creativecommons.org/licenses/by-sa/3.0/) license. The dataset is distributed under the [CC0 1.0 Universal](https://creativecommons.org/publicdomain/zero/1.0/) license.\n\n## References\n* **[C'20]** N. Cécillon, V. Labatut, R. Dufour, and G. Linarès, *WAC: A Corpus of Wikipedia Conversations for Online Abuse Detection*, 12th Language Resources and Evaluation Conference (LREC), 2020, pp. 1375–1383. 
[⟨hal-02497514⟩](https://hal.archives-ouvertes.fr/hal-02497514) \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcompnet%2Fwikisynch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcompnet%2Fwikisynch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcompnet%2Fwikisynch/lists"}