https://github.com/victoriapm/weratedogs_twitter_data_wrangling

Data wrangling, analyzing and visualizing of the tweet archive of WeRateDogs

# Wrangle and Analyze a Dataset: WeRateDogs Twitter Archive

## Introduction
This project is the final task for part 4 of the Udacity Data Analyst Nanodegree.

The project consists of creating a Jupyter notebook that performs data wrangling tasks on a given dataset.

## Learning objectives
- Data wrangling, which consists of:
  - Gathering data (the archive file is downloadable from the Resources tab of the Udacity classroom)
  - Assessing data
  - Cleaning data
- Storing, analyzing, and visualizing the wrangled data
- Reporting on 1) the data wrangling efforts and 2) the data analyses and visualizations
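
The gather, assess, and clean steps listed above can be sketched with pandas as follows. This is a minimal illustration and not the code from wrangle_act.ipynb: the inline CSV, the column names (`tweet_id`, `text`, `retweeted_status_id`) and the helper functions are all assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical miniature archive standing in for the real tweet archive file.
RAW_CSV = """tweet_id,text,retweeted_status_id
1,This is Bella. 13/10,
2,Meet Rex. 12/10,
2,Meet Rex. 12/10,
3,RT @dog_rates: This is Bella. 13/10,1
"""


def gather(source):
    """Gather: load the raw archive, keeping tweet IDs as strings."""
    return pd.read_csv(source, dtype={"tweet_id": str})


def assess(df):
    """Assess: count a few quality problems programmatically."""
    return {
        "rows": len(df),
        "duplicate_rows": int(df.duplicated().sum()),
        "retweets": int(df["retweeted_status_id"].notna().sum()),
    }


def clean(df):
    """Clean: work on a copy, then drop duplicate rows and retweets."""
    cleaned = df.copy().drop_duplicates()
    cleaned = cleaned[cleaned["retweeted_status_id"].isna()]
    return cleaned.drop(columns="retweeted_status_id").reset_index(drop=True)


raw = gather(io.StringIO(RAW_CSV))
report = assess(raw)   # summary of the issues found
tidy = clean(raw)      # original tweets only, one row per tweet
print(report)
print(tidy)
```

Cleaning a copy keeps the raw data available, so the assessment can be re-run and compared after each cleaning step.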

## Analysis Description

The dataset wrangled (and analyzed and visualized) in this project is the tweet archive of Twitter user [@dog_rates](https://twitter.com/dog_rates), also known as [WeRateDogs](https://en.wikipedia.org/wiki/WeRateDogs).

WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc.
Why?
Because...
!["they're good dogs Brent."](https://i.kym-cdn.com/photos/images/newsfeed/001/225/812/2b3.png)

WeRateDogs has over 4 million followers and has received international media coverage.
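
Because the ratings described above follow a numerator/denominator pattern, they can be pulled out of the tweet text with a regular expression. The sketch below is one possible approach, not necessarily the one used in the notebook; `extract_rating` and the sample tweet texts are hypothetical.

```python
import re

# One or more digits, an optional decimal part, a slash, then the denominator.
RATING_PATTERN = re.compile(r"(\d+(?:\.\d+)?)/(\d+)")


def extract_rating(text):
    """Return (numerator, denominator) for the first fraction found, or None."""
    match = RATING_PATTERN.search(text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


print(extract_rating("This is Bella. 13/10 would pet"))
print(extract_rating("Meet Rex. He deserves a 9.75/10"))
print(extract_rating("No rating in this tweet"))
```

A pattern like this also matches fractions that are not ratings (for example "24/7"), so extracted values with a denominator other than 10 are worth checking by hand during assessment.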

## Contents

* wrangle_act.ipynb: the Jupyter notebook in which the analysis was developed, containing code and Markdown cells.
* .html file: the output of the .ipynb converted to a web version for easy viewing.
* twitter_archive_master.csv: the clean data used in the analysis.
* wrangle_report.pdf: a written report on the wrangling efforts, framed as an internal document.
* act_report.pdf: a written report that communicates the insights and displays the visualization(s) produced from the wrangled data, framed as an external document such as a blog post or magazine article.

## Pre-requisites
No installation is needed to view the analysis.
To reproduce the project, Python 3.5 and the following libraries are needed:
- pandas
- NumPy
- Matplotlib
- requests
- tweepy
- json (part of the Python standard library, so no separate installation is needed)
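
Before running the notebook, it can help to confirm that the libraries above are importable. The following is a small optional check, not part of the project itself; `missing_packages` is a hypothetical helper and the list uses the import names of the packages.

```python
import importlib.util

# Import names of the third-party libraries listed above.
REQUIRED = ["pandas", "numpy", "matplotlib", "requests", "tweepy"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```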