https://github.com/educationaltestingservice/sarcasm

shared tasks and research related to sarcasm detection

Host: GitHub
URL: https://github.com/educationaltestingservice/sarcasm
Owner: EducationalTestingService
Created: 2020-01-13T20:59:09.000Z (about 6 years ago)
Default Branch: master
Last Pushed: 2020-07-28T00:30:39.000Z (over 5 years ago)
Last Synced: 2025-06-03T18:08:54.474Z (8 months ago)
Size: 3.79 MB
Stars: 21
Watchers: 7
Forks: 6
Open Issues: 0
Metadata Files:
- Readme: README.md


# Automatic Sarcasm Detection

***The Shared Task (2nd FigLang Workshop at ACL 2020) is now over. Thanks a lot, participants :)***

Please refer to `reddit` and `twitter` sub-directories for further references on datasets.

For Twitter and Reddit, training and testing datasets are provided for sarcasm detection tasks in jsonlines format.

Each line contains a JSON object with the following fields :
- ***label*** : `SARCASM` or `NOT_SARCASM`
- ***id***: String identifier for the sample. This id will be required when making submissions.
  - **ONLY** in test data
- ***response*** : the response to be classified (possibly sarcastic), whether a Tweet or a Reddit post
- ***context*** : the conversation context of the ***response***
  - Note: the context is an ordered list of dialogue turns, i.e., if the context contains three elements, `c1`, `c2`, `c3`, in that order, then `c2` is a reply to `c1` and `c3` is a reply to `c2`. Further, if the response is `r`, then `r` is a reply to `c3`.
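
As a rough illustration of the format described above, the following sketch reads JSON Lines records with Python's standard `json` module. The helper name `read_jsonl` and the two in-memory sample records are hypothetical and are not taken from the datasets; real files would be opened from the `reddit` or `twitter` sub-directories instead.

```python
import io
import json


def read_jsonl(handle):
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical two-line sample standing in for a real training file.
sample = io.StringIO(
    '{"label": "SARCASM", "response": "r1", "context": ["c1", "c2"]}\n'
    '{"label": "NOT_SARCASM", "response": "r2", "context": ["c1", "c2", "c3"]}\n'
)

records = list(read_jsonl(sample))
labels = [rec["label"] for rec in records]

# The last context element is the turn the response replies to.
immediate = [rec["context"][-1] for rec in records]
```

For a file on disk, the same helper can be used as `with open(path, encoding="utf-8") as f: records = list(read_jsonl(f))`, where `path` points to whichever training or test file you have downloaded.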

For instance, for the following training example :

`{"label": "SARCASM", "response": "Did Kelly just call someone else messy? Baaaahaaahahahaha", "context": ["X is looking a First Lady should . #classact", "didn't think it was tailored enough it looked messy"]}`

The response tweet, "Did Kelly...", is a reply to its immediate context, "didn't think it was tailored...", which is itself a reply to "X is looking...". Your goal is to predict the label of the "response" while also using the context (i.e., either the immediate context or the full context).
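
The choice between the immediate and the full context can be sketched as follows. This is only one possible way to assemble a model input, not the organizers' method: the function name `build_input`, the `mode` parameter, and the `" | "` separator are all assumptions made for illustration.

```python
def build_input(example, mode="full", sep=" | "):
    """Join context turns and the response into one string.

    mode="immediate" keeps only the last context turn (the one the
    response replies to); mode="full" keeps every turn in order.
    """
    context = example["context"]
    if mode == "immediate":
        turns = context[-1:]
    elif mode == "full":
        turns = list(context)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return sep.join(turns + [example["response"]])


# The training example quoted above.
example = {
    "label": "SARCASM",
    "response": "Did Kelly just call someone else messy? Baaaahaaahahahaha",
    "context": [
        "X is looking a First Lady should . #classact",
        "didn't think it was tailored enough it looked messy",
    ],
}

immediate_text = build_input(example, mode="immediate")
full_text = build_input(example, mode="full")
```

Here `immediate_text` contains only the last context turn followed by the response, while `full_text` begins with the first turn of the conversation.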

***Dataset size statistics*** :

| Dataset | Train | Test |
|---------|-------|------|
| Reddit | 4400 | 1800 |
| Twitter | 5000 | 1800 |

For Test, we will provide the ***response*** and the ***context***. We will also provide the ***id*** (i.e., the identifier) needed to report the results.

***Submission Instructions*** : Please follow the given [link](submission_instructions.pdf)

***Main References:***

[A Report on the 2020 Sarcasm Detection Shared Task.](https://www.aclweb.org/anthology/2020.figlang-1.1.pdf) Debanjan Ghosh, Avijit Vajpayee, Smaranda Muresan. Proceedings of the Second Workshop on Figurative Language Processing.

---
***Note***: Since we collected our training data from popular social media platforms, a large portion of the utterances are on controversial and/or political and social topics. Although we have pre-processed the training data and lightly edited it to remove contentious text, many utterances still contain controversial perspectives (of the users) and informal language.
