https://github.com/nlpatvcu/smm4h
- Host: GitHub
- URL: https://github.com/nlpatvcu/smm4h
- Owner: NLPatVCU
- Created: 2020-06-16T18:31:40.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-12-27T15:36:43.000Z (about 2 years ago)
- Last Synced: 2023-03-02T07:36:43.024Z (almost 2 years ago)
- Language: Python
- Size: 15.8 MB
- Stars: 1
- Watchers: 4
- Forks: 0
- Open Issues: 15
# SMM4H
This repository contains the CNN that we propose for Task 2 of SMM4H 2020: the binary classification of tweets that contain ADEs. The dataset is highly imbalanced, and we propose three methods to address that: oversampling, desampling, and Keras class weights.
## Install
To install, clone this repository using Git:
```
git clone https://github.com/NLPatVCU/SMM4H.git
```
Then, create a virtual environment. You should use Python 3.6.
```
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
```
## Overview
The data is first preprocessed to remove extraneous information. It can then be passed through the Unbalanced class, where it is desampled or oversampled. Next, the data goes through the Model class, where it is prepared for the CNN. Finally, the CNN is run. At this stage, you can choose between cross-validation (CV) and a train-test split, use the Test data, or use Keras class weights.
## Running Experiments
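Before running anything, it may help to see what the two resampling options from the Overview amount to. The sketch below is an illustration only and is not the repository's Unbalanced class: the function name `resample`, its `mode` values, and the toy data are all assumptions made for this example. It balances classes by random duplication (oversampling) or random removal (desampling), using only the standard library.

```python
import random
from collections import defaultdict


def resample(texts, labels, mode="oversample", seed=0):
    """Balance a labeled dataset by random resampling (hypothetical helper).

    mode="oversample": duplicate random examples of the smaller classes
                       until every class matches the largest one.
    mode="desample":   keep a random subset of the larger classes so that
                       every class matches the smallest one.
    """
    if mode not in ("oversample", "desample"):
        raise ValueError("mode must be 'oversample' or 'desample'")

    rng = random.Random(seed)

    # Group examples by class label.
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    sizes = [len(items) for items in by_label.values()]
    target = max(sizes) if mode == "oversample" else min(sizes)

    balanced = []
    for label, items in by_label.items():
        if len(items) >= target:
            # Large enough: draw `target` examples without replacement.
            chosen = rng.sample(items, target)
        else:
            # Too small: keep everything and add random duplicates.
            chosen = items + rng.choices(items, k=target - len(items))
        balanced.extend((text, label) for text in chosen)

    rng.shuffle(balanced)
    new_texts = [text for text, _ in balanced]
    new_labels = [label for _, label in balanced]
    return new_texts, new_labels


# Toy data: 2 positive tweets and 8 negative tweets.
texts = [f"tweet {i}" for i in range(10)]
labels = [1, 1] + [0] * 8

over_texts, over_labels = resample(texts, labels, mode="oversample")
under_texts, under_labels = resample(texts, labels, mode="desample")

print(over_labels.count(1), over_labels.count(0))    # 8 8
print(under_labels.count(1), under_labels.count(0))  # 2 2
```

Resampling of this kind is normally applied to the training portion only, so that evaluation still reflects the original class distribution.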
To run an experiment:
```
python experiments.py
```
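The Keras class weights option mentioned in the Overview can be pictured as follows. This is a minimal sketch under stated assumptions: the helper `balanced_class_weights` is hypothetical, and the weighting formula is the common "balanced" heuristic (total samples divided by the number of classes times the class count), which may differ from the weights this repository actually uses. The resulting dictionary has the shape that the `class_weight` argument of Keras `Model.fit` accepts.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count).

    Rare classes get weights above 1 and frequent classes get weights below 1,
    so errors on the rare class contribute more to the loss. (Hypothetical helper.)
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy labels: 2 positive tweets and 8 negative tweets.
labels = [1, 1] + [0] * 8
weights = balanced_class_weights(labels)
print(weights[1], weights[0])  # 2.5 0.625

# With a compiled Keras model and prepared arrays (not defined here), the
# weights would be passed at training time, for example:
# model.fit(x_train, y_train, class_weight=weights)
```

Unlike resampling, class weighting leaves the dataset unchanged and only rescales each example's contribution to the loss.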
The experiments.py file contains an example, along with comments describing the other possible options.
## Docs
For more detailed documentation, check out: https://smm4h.readthedocs.io/en/latest
## Reference
```bibtex
@inproceedings{mahendran2020nlp,
title={NLP@VCU: Identifying adverse effects in English tweets for unbalanced data},
author={Mahendran, Darshini and Lewis, Cora and McInnes, Bridget},
booktitle={Proceedings of the Fifth Social Media Mining for Health Applications Workshop \& Shared Task},
pages={158--160},
year={2020}
}
```