https://github.com/liyucheng09/cltk

CLTK - Creative Language Toolkit, one-stop toolkit for creative language processing. Metaphor, Sarcasm, Irony, Humor, Pun and more.

# CLTK - Creative Language Toolkit

Welcome to CLTK (Creative Language Toolkit), an open source project that provides a one-stop toolkit for creative language processing.

CLTK offers a comprehensive API for accessing creative language datasets and includes pre-trained models for creative language detection, interpretation, and generation. The project covers metaphor, sarcasm, irony, simile, idiom, and other types of creative language.

## Datasets

We collect available data resources for creative language processing and format them as Hugging Face Datasets, so you can load them directly from the Hugging Face Hub.

CLTK currently includes the following datasets:

**Metaphor:**
- [VUA20 (FigLang@ACL 2020)](https://huggingface.co/datasets/CreativeLang/vua20_metaphor)
- [TroFi (EACL 2006)](https://huggingface.co/datasets/CreativeLang/trofi_metaphor)
- [MOH (\*SEM 2016)](https://huggingface.co/datasets/CreativeLang/moh_metaphor)
- [Chinese_Metaphor_Corpus (CMC, COLING 2022)](https://huggingface.co/datasets/CreativeLang/chinese_metaphor_corpus)
- [UKP_Novel_Metaphor (EMNLP 2018)](https://huggingface.co/datasets/CreativeLang/ukp_novel_metaphor)

**Humor:**
- [ColBERT_Humor](https://huggingface.co/datasets/CreativeLang/ColBERT_Humor_Detection)

**Simile:**
- [Scope_Simile (EMNLP 2020)](https://huggingface.co/datasets/CreativeLang/scope_simile_generation)
- [WPS_Chinese_Simile (AAAI 2020)](https://huggingface.co/datasets/CreativeLang/wps_chinese_simile)

**Irony:**
- [EPIC (ACL 2023)](https://huggingface.co/datasets/CreativeLang/EPIC_Irony)

**Sarcasm:**
- [SARC (LREC 2018)](https://huggingface.co/datasets/CreativeLang/SARC_Sarcasm)

**Tongue Twister:**
- [TwistList (ACL 2023)](https://huggingface.co/datasets/CreativeLang/TwistList)

**Pun:**
- [SemEval-2017 Task 7: Pun Detection](https://huggingface.co/datasets/CreativeLang/pun_detection_semeval2017_task7)

Open an issue if you would like a dataset to be included in CLTK.
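
As a minimal sketch of loading one of these datasets with the Hugging Face `datasets` library: the Hub IDs below are copied from the links above, while the short aliases and the helper names `hub_id` and `load_creative_dataset` are hypothetical conveniences and not part of CLTK. Split and column names differ per dataset, so nothing is assumed about them here.

```python
# Hub IDs are taken from the dataset list in this README.
# The short alias keys are hypothetical and exist only for this example.
HUB_IDS = {
    "vua20": "CreativeLang/vua20_metaphor",
    "trofi": "CreativeLang/trofi_metaphor",
    "moh": "CreativeLang/moh_metaphor",
    "cmc": "CreativeLang/chinese_metaphor_corpus",
    "ukp_novel": "CreativeLang/ukp_novel_metaphor",
    "colbert_humor": "CreativeLang/ColBERT_Humor_Detection",
    "scope_simile": "CreativeLang/scope_simile_generation",
    "wps_simile": "CreativeLang/wps_chinese_simile",
    "epic": "CreativeLang/EPIC_Irony",
    "sarc": "CreativeLang/SARC_Sarcasm",
    "twistlist": "CreativeLang/TwistList",
    "pun_semeval2017": "CreativeLang/pun_detection_semeval2017_task7",
}


def hub_id(alias):
    """Map a short alias (case-insensitive) to its Hugging Face Hub ID."""
    try:
        return HUB_IDS[alias.lower()]
    except KeyError:
        known = ", ".join(sorted(HUB_IDS))
        raise ValueError(f"Unknown dataset {alias!r}; choose one of: {known}") from None


def load_creative_dataset(alias, **kwargs):
    """Download a dataset from the Hub; requires `pip install datasets`."""
    # Imported lazily so the alias table can be used without the dependency.
    from datasets import load_dataset

    return load_dataset(hub_id(alias), **kwargs)


# Example usage (needs network access):
#   ds = load_creative_dataset("vua20")
#   print(ds)  # shows the available splits and columns
```

Calling `load_dataset` without a `split` argument returns a `DatasetDict`, so printing the result is a quick way to check which splits and columns a given dataset provides before writing any processing code.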

## Models

**Metaphor Detection:** [FrameBERT (EACL 2023)](https://huggingface.co/CreativeLang/metaphor_detection_roberta_seq).
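
The checkpoint can likely be loaded with the `transformers` library. The sketch below assumes that the model exposes a token-classification head, which is a guess based on the `_seq` suffix in its name and is not confirmed here. The label names are also unknown, so the helper takes the metaphor label as an argument; `load_metaphor_tagger` and `select_tokens` are hypothetical names for this example.

```python
MODEL_ID = "CreativeLang/metaphor_detection_roberta_seq"  # from the link above


def load_metaphor_tagger(model_id=MODEL_ID):
    """Build a tagging pipeline; requires `pip install transformers torch`.

    Assumption: the checkpoint works with the "token-classification" task.
    """
    # Imported lazily so select_tokens can be used without the dependency.
    from transformers import pipeline

    return pipeline("token-classification", model=model_id)


def select_tokens(predictions, label, min_score=0.5):
    """Keep the tokens whose predicted label matches `label`.

    `predictions` is a list of dicts with "word", "entity" and "score" keys,
    the shape returned by a token-classification pipeline without aggregation.
    """
    return [
        p["word"]
        for p in predictions
        if p["entity"] == label and p["score"] >= min_score
    ]


# Example usage (needs network access):
#   tagger = load_metaphor_tagger()
#   print(tagger.model.config.id2label)  # inspect the real label names first
#   preds = tagger("He was drowning in paperwork.")
#   print(select_tokens(preds, label="<metaphor label from id2label>"))
```

Inspecting `id2label` before filtering avoids hard-coding a label name that the checkpoint may not use.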

## Contributing

We welcome contributions from the community! Join our Discord server (to be created soon...) or open an issue to join the discussion.

## License

CLTK is open source and is released under the [MIT License](https://opensource.org/licenses/MIT).

## Contact

If you have any questions, open an issue or [email](mailto:yucheng.li[at]surrey.ac.uk) me directly.