{"id":18754080,"url":"https://github.com/dcavar/antisemitismdatathon2020","last_synced_at":"2025-10-04T08:50:33.664Z","repository":{"id":150220923,"uuid":"264660629","full_name":"dcavar/AntisemitismDatathon2020","owner":"dcavar","description":"This is project material for the Antisemitism Datathon and Hackathon 2020 at Indiana University","archived":false,"fork":false,"pushed_at":"2020-05-17T16:34:40.000Z","size":1270,"stargazers_count":6,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-06-11T20:37:25.174Z","etag":null,"topics":["antisemitism","corpus-data","flair","hatespeech","machine-learning","nltk","python","pytorch","social-media","spacy","tensorflow","twitter"],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dcavar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-17T12:29:25.000Z","updated_at":"2025-03-15T13:32:02.000Z","dependencies_parsed_at":"2023-04-08T10:38:04.302Z","dependency_job_id":null,"html_url":"https://github.com/dcavar/AntisemitismDatathon2020","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dcavar/AntisemitismDatathon2020","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dcavar%2FAntisemitismDatathon2020","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dcavar%2FAntisemitismDatathon2020/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/dcavar%2FAntisemitismDatathon2020/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dcavar%2FAntisemitismDatathon2020/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dcavar","download_url":"https://codeload.github.com/dcavar/AntisemitismDatathon2020/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dcavar%2FAntisemitismDatathon2020/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278289490,"owners_count":25962353,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-04T02:00:05.491Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["antisemitism","corpus-data","flair","hatespeech","machine-learning","nltk","python","pytorch","social-media","spacy","tensorflow","twitter"],"created_at":"2024-11-07T17:28:00.757Z","updated_at":"2025-10-04T08:50:33.651Z","avatar_url":"https://github.com/dcavar.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Antisemitism Datathon 2020\n\n(C) 2020 by [Damir Cavar] and [Günther Jikeli]\n\nThe information and code examples are licensed under the Apache License Version 2.0.\n\n\nThis is project material for the [Antisemitism Datathon and Hackathon 
2020](https://isca.indiana.edu/news-events/Antisemitism%20on%20Social%20Media%20Workshops%20in%20May%202020.html) at [Indiana University at Bloomington].\n\nThis [Datathon and Hackathon](https://isca.indiana.edu/news-events/Antisemitism%20on%20Social%20Media%20Workshops%20in%20May%202020.html) is a collaborative project of [Günther Jikeli] from the [Institute for the Study of Contemporary Antisemitism](https://isca.indiana.edu/) and [Damir Cavar]'s [NLP-Lab.org] at [Indiana University at Bloomington]!\n\n\n\n## Relevant Links\n\n- [Datathon and Hackathon 2020 Website](https://isca.indiana.edu/news-events/Antisemitism%20on%20Social%20Media%20Workshops%20in%20May%202020.html)\n\n\n## Technologies\n\nWe provide an NLP pipeline with detailed linguistic analysis: tokenization, lemmatization, splitting text into sentences, part-of-speech tagging, named entity annotation, dependency parsing, constituent parsing, sentiment detection, and coreference and anaphora resolution:\n\n- [NLP Pipeline as RESTful API](https://jnlp.semantic-tech.com/) (provided through the courtesy of [Semiring Inc.])\n\nThis pipeline is an integration of [RESTful Microservices] that take text as input and return [JSON-NLP]-formatted output. This service requires a login and password. We will share these with you during the meetings.\n\nThe linguistic annotations enable modeling of classifiers using deeper linguistic analysis.\n\nIn addition, we provide code examples for the following NLP and Machine Learning libraries, to develop probabilistic, neural, and/or symbolic classifiers for the corpus material:\n\n- [spaCy]\n- [Flair]\n- [NLTK]\n- [Tensorflow]\n- [PyTorch]\n\n\n## Data Sets and Formats\n\nThe Antisemitism Twitter corpus will be provided to you in a specific CSV format. We will also provide a CoNLL-formatted version of the data. 
These are formats that the different Machine Learning libraries for NLP mentioned above can read.\n\nYou might want to have a look at the different corpus or linguistic data formats:\n\n- [CoNLL-X](https://www.aclweb.org/anthology/W06-2920.pdf)\n- [CoNLL-U](https://universaldependencies.org/format.html)\n- [spaCy JSON](https://spacy.io/usage/training)\n\n\n\n## Tools\n\nFor testing the NLP API RESTful Microservices you might want to have a look at tools like:\n\n- [Postman]\n- [cURL] (for Windows users: check out Chocolatey to install cURL on your system)\n\n\n\n\n[Damir Cavar]: https://www.linkedin.com/in/damircavar/ \"Damir Cavar\"\n[Günther Jikeli]: https://isca.indiana.edu/about/faculty/jikeli-gunther.html \"Günther Jikeli\"\n[Indiana University at Bloomington]: http://www.indiana.edu/ \"IU Bloomington\"\n[spaCy]: https://spacy.io/ \"spaCy\"\n[Flair]: https://github.com/flairNLP/flair \"Flair\"\n[Tensorflow]: https://www.tensorflow.org/ \"Tensorflow\"\n[PyTorch]: https://pytorch.org/ \"PyTorch\"\n[Semiring Inc.]: https://semiring.com/ \"Semiring Inc.\"\n[NLTK]: https://www.nltk.org/ \"Natural Language Toolkit\"\n[RESTful Microservices]: https://blog.dreamfactory.com/restful-api-and-microservices-the-differences-and-how-they-work-together/ \"RESTful Microservices\"\n[JSON-NLP]: https://github.com/SemiringInc/JSON-NLP \"JSON-NLP Annotation Standard\"\n[NLP-Lab.org]: https://nlp-lab.org/ \"Damir Cavar's NLP Lab\"\n[Postman]: https://www.postman.com/ \"Postman\"\n[cURL]: https://curl.haxx.se/ \"cURL\"\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdcavar%2Fantisemitismdatathon2020","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdcavar%2Fantisemitismdatathon2020","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdcavar%2Fantisemitismdatathon2020/lists"}