{"id":18927577,"url":"https://github.com/gentaiscool/indonesian-nlp","last_synced_at":"2026-02-17T15:33:13.783Z","repository":{"id":37935285,"uuid":"496296869","full_name":"gentaiscool/indonesian-nlp","owner":"gentaiscool","description":"A curated list of research papers and resources on Indonesian languages","archived":false,"fork":false,"pushed_at":"2024-03-21T13:10:13.000Z","size":130,"stargazers_count":39,"open_issues_count":1,"forks_count":3,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-06-01T11:05:52.674Z","etag":null,"topics":["deep-learning","indonesian","javanese","local","local-languages","machine-learning","nlp","papers","research","speech","sundanese","survey"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gentaiscool.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-05-25T15:50:58.000Z","updated_at":"2023-10-23T22:11:07.000Z","dependencies_parsed_at":"2025-02-20T22:38:47.347Z","dependency_job_id":null,"html_url":"https://github.com/gentaiscool/indonesian-nlp","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/gentaiscool/indonesian-nlp","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gentaiscool%2Findonesian-nlp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gentaiscool%2Findonesian-nlp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gentaiscool%2Findonesian-nlp/releases","manifes
ts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gentaiscool%2Findonesian-nlp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gentaiscool","download_url":"https://codeload.github.com/gentaiscool/indonesian-nlp/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gentaiscool%2Findonesian-nlp/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279003406,"owners_count":26083581,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","indonesian","javanese","local","local-languages","machine-learning","nlp","papers","research","speech","sundanese","survey"],"created_at":"2024-11-08T11:19:36.913Z","updated_at":"2025-10-10T09:49:13.922Z","avatar_url":"https://github.com/gentaiscool.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Indonesian NLP Resources\nThis is the list of tutorials, workshops, talks, books, papers, and resources on computational linguistic approaches to research in Indonesian languages. \nThe list will be updated over time. You are welcome to send a pull request to update the list and be one of the contributors! 
🚀\n\n📌 If you are working on anything related to Indonesian or any local Indonesian language, don't hesitate to contact me or send a pull request! \n\n## 📔 Books\n- \u003cb\u003eJan Wira Gotama Putra (2019)\u003c/b\u003e \u003ci\u003ePengenalan Konsep Pembelajaran Mesin dan Deep Learning\u003c/i\u003e (in Indonesian). \u003ca href=\"https://wiragotama.github.io/resources/ebook/intro-to-ml-secured.pdf\"\u003e[Book]\u003c/a\u003e\n\n## 🔉 Talks\n- Bedah Paper Series by INACL (in Indonesian) \u003ca href=\"https://www.youtube.com/channel/UC4O5LY9sYN25M1oBTsqGSIw/videos\"\u003e[Video]\u003c/a\u003e\n\n## 📑 Research Papers\n\n### Position / Survey\n- \u003cb\u003eAji, et al. (2022)\u003c/b\u003e \u003ci\u003eOne Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia\u003c/i\u003e. ACL \u003ca href=\"https://aclanthology.org/2022.acl-long.500.pdf\"\u003e[Paper]\u003c/a\u003e\n\n### Datasets and Pretrained Models\n#### Public Benchmark\n- \u003cb\u003eWinata, et al. (2022)\u003c/b\u003e \u003ci\u003eNusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages\u003c/i\u003e. Preprint \u003ca href=\"https://arxiv.org/pdf/2205.15960.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/nusax\"\u003e[Benchmark]\u003c/a\u003e\n- \u003cb\u003eCahyawijaya, et al. (2021)\u003c/b\u003e \u003ci\u003eIndoNLG: Benchmark and Resources for Evaluating Indonesian Natural Language Generation\u003c/i\u003e. EMNLP \u003ca href=\"https://aclanthology.org/2021.emnlp-main.699.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlg\"\u003e[Benchmark]\u003c/a\u003e \u003ca href=\"https://huggingface.co/indobenchmark\"\u003e[Huggingface Models]\u003c/a\u003e\n- \u003cb\u003eWibowo, et al. (2021)\u003c/b\u003e \u003ci\u003eIndoCollex: A Testbed for Morphological Transformation of Indonesian Colloquial Words\u003c/i\u003e. 
ACL Findings \u003ca href=\"https://aclanthology.org/2021.findings-acl.280.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/haryoa/indo-collex\"\u003e[Benchmark]\u003c/a\u003e\n- \u003cb\u003eKoto, et al. (2020)\u003c/b\u003e \u003ci\u003eIndoLEM and IndoBERT: A benchmark dataset and pre-trained language model for Indonesian NLP\u003c/i\u003e. COLING \u003ca href=\"https://aclanthology.org/2020.coling-main.66.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://indolem.github.io/\"\u003e[Benchmark]\u003c/a\u003e\n- \u003cb\u003eFajri Koto, and Ikhwan Koto (2020)\u003c/b\u003e \u003ci\u003eTowards Computational Linguistics in Minangkabau Language: Studies on Sentiment Analysis and Machine Translation\u003c/i\u003e. PACLIC \u003ca href=\"https://aclanthology.org/2020.paclic-1.17.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/fajri91/minangNLP\"\u003e[Benchmark]\u003c/a\u003e\n- \u003cb\u003eWilie, et al. (2020)\u003c/b\u003e \u003ci\u003eIndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding\u003c/i\u003e. AACL \u003ca href=\"https://aclanthology.org/2020.aacl-main.85.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlu\"\u003e[Benchmark]\u003c/a\u003e \u003ca href=\"https://huggingface.co/indobenchmark\"\u003e[Huggingface Models]\u003c/a\u003e\n\n#### Language-Specific Model\n- \u003cb\u003eWongso, et al. (2022)\u003c/b\u003e \u003ci\u003ePre-Trained Transformer-Based Language Models for Sundanese\u003c/i\u003e. Journal of Big Data \u003ca href=\"https://link.springer.com/content/pdf/10.1186/s40537-022-00590-7.pdf\"\u003e[Paper]\u003c/a\u003e \n\n#### Morphology Analysis\n- \u003cb\u003ePimentel, et al. (2021)\u003c/b\u003e \u003ci\u003eSIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages\u003c/i\u003e. 
Workshop on Computational Research in Phonetics, Phonology, and Morphology \u003ca href=\"https://aclanthology.org/2021.sigmorphon-1.25.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/unimorph/ind\"\u003e[Dataset]\u003c/a\u003e\n\n#### POS Tagging\n- \u003cb\u003eDevin Hoesen and Ayu Purwarianti (2018)\u003c/b\u003e \u003ci\u003eInvestigating Bi-LSTM and CRF with POS Tag Embedding for Indonesian\nNamed Entity Tagger\u003c/i\u003e. International Conference on Asian Language Processing  \u003ca href=\"https://arxiv.org/ftp/arxiv/papers/2009/2009.05687.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlu\"\u003e[Benchmark]\u003c/a\u003e\n- \u003cb\u003eDinakaramani, et al. (2014)\u003c/b\u003e \u003ci\u003eDesigning an Indonesian Part of speech Tagset and Manually Tagged Indonesian Corpus\u003c/i\u003e. International Conference on Asian Language Processing  \u003ca href=\"https://web.archive.org/web/20200321100925id_/\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"http://bahasa.cs.ui.ac.id/postag/downloads/Designing%20an%20Indonesian%20Part%20of%20speech%20Tagset.pdf\"\u003e[Dataset]\u003c/a\u003e\n\n#### Named Entity Recognition\n- \u003cb\u003eDevin Hoesen and Ayu Purwarianti (2018)\u003c/b\u003e \u003ci\u003eInvestigating Bi-LSTM and CRF with POS Tag Embedding for Indonesian\nNamed Entity Tagger\u003c/i\u003e. International Conference on Asian Language Processing \u003ca href=\"https://arxiv.org/ftp/arxiv/papers/2009/2009.05687.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlu\"\u003e[Benchmark]\u003c/a\u003e \n- \u003cb\u003eMuhammad Fachri (2014)\u003c/b\u003e \u003ci\u003eNamed Entity Recognition for Indonesian Text using Hidden Markov Model\u003c/i\u003e. 
Undergraduate Thesis \u003ca href=\"http://etd.repository.ugm.ac.id/penelitian/detail/150411\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/yusufsyaifudin/indonesia-ner\"\u003e[Dataset]\u003c/a\u003e\n- \u003cb\u003eAlfina, et al. (2016)\u003c/b\u003e \u003ci\u003eDBpedia Entities Expansion in Automatically Building Dataset for Indonesian NER\u003c/i\u003e. International Conference on Advanced Computer Science and Information Systems \u003ca href=\"https://ieeexplore.ieee.org/abstract/document/7872784\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/ialfina/ner-dataset-modified-dee\"\u003e[Dataset]\u003c/a\u003e\n\n#### Word Sense Disambiguation\n- \u003cb\u003eMahendra, et al. (2018)\u003c/b\u003e \u003ci\u003eCross-Lingual and Supervised Learning Approach for Indonesian Word Sense Disambiguation Task\u003c/i\u003e. Global Wordnet Conference \u003ca href=\"https://aclanthology.org/2018.gwc-1.28.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/rmahendra/Indonesian-WSD\"\u003e[Dataset]\u003c/a\u003e\n\n#### Constituency Parsing\n- \u003cb\u003eArwidarasti, et al. (2019)\u003c/b\u003e \u003ci\u003eConverting an Indonesian Constituency Treebank to the Penn Treebank Format\u003c/i\u003e. International Conference on Asian\nLanguage Processing \u003ca href=\"https://colips.org/conferences/ialp2019/ialp2019.com/files/papers/IALP2019_086.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/ialfina/kethu\"\u003e[Dataset]\u003c/a\u003e\n- \u003cb\u003eMoeljadi, et al. (2018)\u003c/b\u003e \u003ci\u003eBuilding Cendana: a Treebank for Informal Indonesian\u003c/i\u003e. Global Wordnet Conference \u003ca href=\"http://jaslli.org/files/proceedings/18_paclic33_postconf.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"\"\u003e[Dataset]\u003c/a\u003e\n- \u003cb\u003eDavid Moeljadi (2017)\u003c/b\u003e \u003ci\u003eBuilding JATI: A Treebank for Indonesian\u003c/i\u003e. 
Global Wordnet Conference \u003ca href=\"http://compling.hss.ntu.edu.sg/who/david/slides/ConCorps2017_davidmoeljadi_slides.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/davidmoeljadi/INDRA/tree/master/tsdb/gold/Cendana\"\u003e[Dataset]\u003c/a\u003e\n  \n#### Dependency Parsing\n- \u003cb\u003eZeman, et al. (2018)\u003c/b\u003e \u003ci\u003eCoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies\u003c/i\u003e. CoNLL Shared Task \u003ca href=\"https://aclanthology.org/K18-2001v2.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/UniversalDependencies/UD_Indonesian-PUD\"\u003e[Dataset]\u003c/a\u003e\n- \u003cb\u003eMcDonald, et al. (2013)\u003c/b\u003e \u003ci\u003eUniversal Dependency Annotation for Multilingual Parsing\u003c/i\u003e. ACL \u003ca href=\"https://aclanthology.org/P13-2017.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/UniversalDependencies/UD_Indonesian-GSD\"\u003e[Dataset]\u003c/a\u003e\n\n#### Coreference Resolution\n- \u003cb\u003eArtari, et al. (2021)\u003c/b\u003e \u003ci\u003eA Multi-Pass Sieve Coreference Resolution for Indonesian\u003c/i\u003e. RANLP \u003ca href=\"https://aclanthology.org/2021.ranlp-1.10.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/valentinakania/indocoref\"\u003e[Dataset]\u003c/a\u003e\n\n#### Chatbot\n- \u003cb\u003eLin, et al. (2021)\u003c/b\u003e \u003ci\u003eXPersona: Evaluating Multilingual Personalized Chatbot\u003c/i\u003e. NLP4ConvAI \u003ca href=\"https://aclanthology.org/2021.nlp4convai-1.10.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlg\"\u003e[Benchmark]\u003c/a\u003e \u003ca href=\"https://github.com/HLTCHKUST/Xpersona\"\u003e[Dataset]\u003c/a\u003e\n\n#### Question Answering\n- \u003cb\u003eClark, et al. (2020)\u003c/b\u003e \u003ci\u003eTyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages\u003c/i\u003e. 
TACL \u003ca href=\"https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00317/96451\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/google-research-datasets/tydiqa\"\u003e[Dataset]\u003c/a\u003e\n- \u003cb\u003ePurwarianti, et al. (2007)\u003c/b\u003e \u003ci\u003eA Machine Learning Approach for Indonesian Question Answering System\u003c/i\u003e. RANLP \u003ca href=\"https://www.researchgate.net/profile/Ayu-Purwarianti/publication/221173808_A_machine_learning_approach_for_indonesian_question_answering_system/links/547404bd0cf245eb436dbcdc/A-machine-learning-approach-for-indonesian-question-answering-system.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlu\"\u003e[Benchmark]\u003c/a\u003e\n\n#### Summarization\n- \u003cb\u003eKemal Kurniawan and Samuel Louvan (2018)\u003c/b\u003e \u003ci\u003eA New Benchmark Dataset for Indonesian Text Summarization\u003c/i\u003e. International Conference\non Asian Language Processing \u003ca href=\"https://arxiv.org/pdf/1810.05334.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlg\"\u003e[Benchmark]\u003c/a\u003e \u003ca href=\"https://github.com/kata-ai/indosum\"\u003e[Dataset]\u003c/a\u003e\n- \u003cb\u003eKoto, et al. (2020)\u003c/b\u003e \u003ci\u003eA Large-scale Indonesian Dataset for Text Summarization\u003c/i\u003e. AACL \u003ca href=\"https://aclanthology.org/2020.aacl-main.60.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlg\"\u003e[Benchmark]\u003c/a\u003e \u003ca href=\"https://github.com/fajri91/sum_liputan6\"\u003e[Dataset]\u003c/a\u003e\n\n#### Keyphrase Extraction \n- \u003cb\u003eMahfuzh, et al. (2019)\u003c/b\u003e \u003ci\u003eImproving Joint Layer RNN based Keyphrase Extraction by Using Syntactical Features\u003c/i\u003e. 
International Conference of Advanced Informatics: Concepts, Theory and Applications \u003ca href=\"https://arxiv.org/pdf/2009.07119.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlu\"\u003e[Benchmark]\u003c/a\u003e\n\n#### Natural Language Inference\n- \u003cb\u003eMahendra, et al. (2021)\u003c/b\u003e \u003ci\u003eIndoNLI: A Natural Language Inference Dataset for Indonesian\u003c/i\u003e. EMNLP \u003ca href=\"https://aclanthology.org/2021.emnlp-main.821.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/ir-nlp-csui/indonli\"\u003e[Dataset]\u003c/a\u003e\n- \u003cb\u003eKen Nabila Setya and Rahmad Mahendra (2018)\u003c/b\u003e \u003ci\u003eSemi-supervised Textual Entailment on Indonesian Wikipedia Data\u003c/i\u003e. International Conference on Computational Linguistics and Intelligent Text Processing \u003ca href=\"http://www.cicling.org/2018/intranet/pre-print/papers/paper_55.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlu\"\u003e[Benchmark]\u003c/a\u003e\n\n#### Sentiment Analysis\n- \u003cb\u003eAyu Purwarianti and Ida Ayu Putu Ari Crisdayanti (2019)\u003c/b\u003e \u003ci\u003eImproving Bi-LSTM Performance for Indonesian Sentiment Analysis Using Paragraph Vector\u003c/i\u003e. International Conference of Advanced Informatics: Concepts, Theory and Applications \u003ca href=\"https://ieeexplore.ieee.org/abstract/document/8904199\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlu\"\u003e[IndoNLU Benchmark]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/nusax\"\u003e[NusaX Benchmark]\u003c/a\u003e\n- \u003cb\u003eAzhar, et al. (2019)\u003c/b\u003e \u003ci\u003eMulti-label Aspect Categorization with Convolutional Neural Networks and Extreme Gradient Boosting\u003c/i\u003e. 
International Conference on Electrical Engineering and Informatics \u003ca href=\"https://ieeexplore.ieee.org/document/8988898\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlu\"\u003e[Benchmark]\u003c/a\u003e\n- \u003cb\u003eIlmania, et al. (2018)\u003c/b\u003e \u003ci\u003eAspect Detection and Sentiment Classification Using Deep Neural Network for Indonesian Aspect-Based Sentiment Analysis\u003c/i\u003e. International Conference on Asian Language Processing \u003ca href=\"https://ieeexplore.ieee.org/document/8629181\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/IndoNLP/indonlu\"\u003e[Benchmark]\u003c/a\u003e\n\n#### Emotion Classification\n- \u003cb\u003eSaputri, et al. (2018)\u003c/b\u003e \u003ci\u003eEmotion Classification on Indonesian Twitter Dataset\u003c/i\u003e. International Conference on Asian Language Processing \u003ca href=\"https://ieeexplore.ieee.org/document/8629262\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/meisaputri21/Indonesian-Twitter-Emotion-Dataset\"\u003e[Dataset]\u003c/a\u003e\n\n#### Stance Detection\n- \u003cb\u003eJannati, et al. (2018)\u003c/b\u003e \u003ci\u003eStance Classification Towards Political Figures on Blog Writing\u003c/i\u003e. International Conference on Asian Language Processing \u003ca href=\"https://ieeexplore.ieee.org/document/8629144\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/reneje/id_stance_dataset_article-Stance-Classification-Towards-Political-Figures-on-Blog-Writing\"\u003e[Dataset]\u003c/a\u003e\n\n#### Hate Speech Detection\n- \u003cb\u003eAlfina, et al. (2017)\u003c/b\u003e \u003ci\u003eHate Speech Detection in the Indonesian Language: A Dataset and Preliminary Study\u003c/i\u003e. 
International Conference on Advanced Computer Science and Information Systems \u003ca href=\"Hate speech detection in the Indonesian language: A dataset and preliminary\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/ialfina/id-hatespeech-detection\"\u003e[Dataset]\u003c/a\u003e\n- \u003cb\u003eMuhammad Okky Ibrohim and Indra Budi (2018)\u003c/b\u003e \u003ci\u003eA Dataset and Preliminaries Study for Abusive Language Detection in Indonesian Social Media\u003c/i\u003e. International Conference on Computer Science and Computational Intelligence \u003ca href=\"https://www.sciencedirect.com/science/article/pii/S1877050918314583?via%3Dihub\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/haryoa/stif-indonesia\"\u003e[Dataset]\u003c/a\u003e\n- \u003cb\u003eMuhammad Okky Ibrohim and Indra Budi (2019)\u003c/b\u003e \u003ci\u003eMulti-label Hate Speech and Abusive Language Detection in Indonesian Twitter\u003c/i\u003e. Workshop on Abusive Language Online \u003ca href=\"https://aclanthology.org/W19-3506.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/okkyibrohim/id-abusive-language-detection\"\u003e[Dataset]\u003c/a\u003e\n\n#### Clickbait Detection\n- \u003cb\u003eAndika William and Yunita Sari (2020)\u003c/b\u003e \u003ci\u003eCLICK-ID: A Novel Dataset for Indonesian Clickbait Headlines\u003c/i\u003e. Data in Brief \u003ca href=\"https://www.sciencedirect.com/science/article/pii/S2352340920311252?via%3Dihub\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://data.mendeley.com/datasets/k42j7x2kpn/1\"\u003e[Dataset]\u003c/a\u003e\n\n#### Style Transfer\n- \u003cb\u003eWibowo, et al. (2020)\u003c/b\u003e \u003ci\u003eSemi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation\u003c/i\u003e. 
International Conference on Asian Language Processing \u003ca href=\"https://colips.org/conferences/ialp2020/proceedings/papers/IALP2020_P89.pdf\"\u003e[Paper]\u003c/a\u003e \u003ca href=\"https://github.com/haryoa/stif-indonesia\"\u003e[Dataset]\u003c/a\u003e\n\n## 🧪 Collaborative Project\nIndoNLP is going to start collecting new datasets at https://github.com/orgs/IndoNLP. Submissions will open starting mid-June 2022. Stay tuned!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgentaiscool%2Findonesian-nlp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgentaiscool%2Findonesian-nlp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgentaiscool%2Findonesian-nlp/lists"}