https://github.com/alvations/stasis
Semantic Textual Similarity in Python
- Host: GitHub
- URL: https://github.com/alvations/stasis
- Owner: alvations
- Created: 2015-01-23T10:03:11.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2017-01-30T11:56:17.000Z (almost 8 years ago)
- Last Synced: 2023-08-05T08:51:34.158Z (over 1 year ago)
- Topics: dataset, python, semantic-textual-similarity, semeval
- Language: Jupyter Notebook
- Size: 31.3 MB
- Stars: 80
- Watchers: 9
- Forks: 20
- Open Issues: 2
## Stasis - Python wrapper for Semantic Similarity datasets
Under the auspices of the EXPERT project (http://expert-itn.eu/), we have written a Python wrapper for the STS datasets, and we hope it makes the datasets easy to manipulate.
If you just need a tab-separated file, `sts.csv` is available in this repository. The repo also contains other (maybe) useful datasets that the maintainer compiles manually in their free time.
**Disclaimer**: The repository is provided as is. It should **NOT** be considered the official SemEval Semantic Textual Similarity (STS) data, and it is not affiliated with the STS organizers. We created it so that people can easily do something like `pandas.read_csv('sts.csv')` or `graphlab.SFrame('sts.csv')` and work with the dataframes with little hassle.
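If you would rather not depend on pandas, the sketch below reads a tab-separated file with Python's standard `csv` module. It assumes the file is tab-delimited, as described above. The column layout of `sts.csv` is not documented in this README, so the demo writes a tiny stand-in file whose three columns (score, sentence 1, sentence 2) and contents are made up for illustration. The helper names `read_tsv` and `write_demo_file` are hypothetical and not part of this repository.

```python
import csv
import os
import tempfile


def read_tsv(path):
    """Read a tab-separated file and return its rows as lists of strings."""
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE treats quote characters inside sentences as ordinary text.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row for row in reader]


def write_demo_file():
    """Create a tiny stand-in file.

    The columns and sentences are invented for this example and are
    NOT the real layout or contents of sts.csv.
    """
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", encoding="utf-8", newline="") as handle:
        handle.write("4.5\tA man is playing a guitar.\tA man plays the guitar.\n")
        handle.write("0.8\tA dog runs in a park.\tThe stock market fell today.\n")
    return path


demo_path = write_demo_file()
rows = read_tsv(demo_path)
os.remove(demo_path)  # clean up the temporary stand-in file

print(len(rows), "rows;", len(rows[0]), "columns in the first row")
```

To try it on the real data, call `read_tsv('sts.csv')` from the repository root and inspect the first few rows to see the actual columns. With pandas, passing `sep='\t'` to `pandas.read_csv` should do the same job if the default comma separator does not parse the file.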
## Datasets
Below is a list of the datasets/wrappers you can find here:
- **STS**: [SemEval Semantic Textual Similarity](http://alt.qcri.org/semeval2016/task1/) (STS2012 - 2015)
- **CLSS**: [SemEval Cross-level Semantic Similarity](http://alt.qcri.org/semeval2014/task3/) (CLSS)
- **SICK**: [Sentences Involving Compositional Knowledge](http://www.lrec-conf.org/proceedings/lrec2014/pdf/363_Paper.pdf)
## Contribute
Please feel free to add datasets/wrappers to the repository, or post an issue to request a wrapper.
## Cite
**Please cite the respective references for the datasets when using them in your publication!**
If you want to cite this repository, you can cite [this paper](http://www.aclweb.org/anthology/S15-2015), where we created and used `sts.csv` for SemEval-2015.