https://github.com/prsdm/text-data-analysis-and-data-ethics



# Text-Analysis-and-Data-Ethics
## Academic Project
### Introduction
1. Pre-processing and analysis of Reddit data drawn from Covid-related subreddits as well as randomly selected subreddits.
2. Discussion of the Data Ethics Framework.

### About Datasets
The provided CSV dataset contains one row per post and carries information about three entities: posts, users, and subreddits. The column names are self-explanatory: columns starting with the prefix user_ describe users, those starting with the prefix subr_ describe subreddits, the subreddit column holds the subreddit name, and the remaining columns are post attributes (author, posted_at, title, the post text in the selftext column, the number of comments in num_comments, score, etc.).
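
The prefix convention described above can be used to sort columns by entity before analysis. The following is a minimal sketch, not the project's actual code: the helper `group_columns` and the column names `user_example` and `subr_example` are hypothetical placeholders, and placing the subreddit column in the subreddit group is an assumption.

```python
def group_columns(columns):
    """Group column names by entity, following the prefix convention."""
    groups = {"user": [], "subreddit": [], "post": []}
    for name in columns:
        if name.startswith("user_"):
            groups["user"].append(name)
        elif name.startswith("subr_") or name == "subreddit":
            # Assumption: the subreddit name column belongs with subr_ columns.
            groups["subreddit"].append(name)
        else:
            # Everything else is treated as a post attribute.
            groups["post"].append(name)
    return groups


# Hypothetical header: user_example and subr_example are placeholders,
# the other names are the ones mentioned in this README.
example_columns = [
    "author", "posted_at", "title", "selftext", "num_comments", "score",
    "subreddit", "user_example", "subr_example",
]
grouped = group_columns(example_columns)
```

With Pandas, the same helper could be applied to the column names of a loaded DataFrame to select the per-entity column subsets.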

### Programming Language
* Python (Pandas, NumPy, NLTK, datetime, ast.literal_eval, etc.)