https://github.com/victoriapm/weratedogs_twitter_data_wrangling
Data wrangling, analyzing and visualizing of the tweet archive of WeRateDogs
- Host: GitHub
- URL: https://github.com/victoriapm/weratedogs_twitter_data_wrangling
- Owner: Victoriapm
- Created: 2020-04-16T15:49:15.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-04-29T13:52:15.000Z (about 5 years ago)
- Last Synced: 2025-01-17T05:43:57.787Z (5 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 4.81 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Wrangle and Analyze a Dataset: WeRateDogs Twitter Archive
## Introduction
This project is the final task for part 4 of the Udacity Data Analyst Nanodegree. It consists of creating a Jupyter notebook that performs data wrangling tasks on a given dataset.
## Learning objectives
- Data wrangling, which consists of:
  - Gathering data (downloadable file in the Resources tab in the leftmost panel of your classroom and linked in step 1 below)
  - Assessing data
  - Cleaning data
- Storing, analyzing, and visualizing your wrangled data
- Reporting on 1) your data wrangling efforts and 2) your data analyses and visualizations
## Analysis Description
The dataset that will be wrangled (and analyzed and visualized) in this project is the tweet archive of Twitter user [@dog_rates](https://twitter.com/dog_rates), also known as [WeRateDogs](https://en.wikipedia.org/wiki/WeRateDogs).
WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc.
Why?
Because...
WeRateDogs has over 4 million followers and has received international media coverage.
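
The rating format described above (a numerator over a denominator, such as 13/10) can be pulled out of tweet text with a regular expression. The following is a minimal sketch, not the code used in the notebook: the pattern, the `extract_rating` helper, and the sample tweets are all made up for illustration.

```python
import re

# Hypothetical pattern: an integer or decimal numerator, a slash, and an
# integer denominator (e.g. "13/10" or "9.75/10"). This is an assumption,
# not the expression used in wrangle_act.ipynb.
RATING_RE = re.compile(r"(\d+(?:\.\d+)?)/(\d+)")


def extract_rating(text):
    """Return (numerator, denominator) for the first rating in text, or None."""
    match = RATING_RE.search(text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


# Invented sample tweets, written only to exercise the helper.
sample_tweets = [
    "This is Bella. She hopes her smile made you smile. 13/10",
    "Meet Max. He is a very good boy. 11/10 would pet",
    "No rating in this one.",
]

ratings = [extract_rating(tweet) for tweet in sample_tweets]
print(ratings)  # [(13.0, 10), (11.0, 10), None]
```

Taking the first match is a simplification: text such as "24/7" or a date would also match, so extracted ratings with unusual denominators are worth checking by hand during the assessing step.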
## Contents
* wrangle_act.ipynb: the Jupyter notebook in which the analysis was developed, containing the code and accompanying Markdown cells.
* .html file: the .ipynb notebook converted to a web version for easy viewing.
* twitter_archive_master.csv: the clean data used in the analysis.
* wrangle_report.pdf: a written report about the wrangling efforts, framed as an internal document.
* act_report.pdf: a written report that communicates the insights and displays the visualization(s) produced from the wrangled data, framed as an external document, like a blog post or magazine article.
## Pre-requisites
No installation is needed to view the analysis.
To reproduce the project, Python 3.5 and the following libraries are needed:
- pandas
- NumPy
- Matplotlib
- requests
- tweepy
- json (part of the Python standard library, so no separate installation is needed)
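
Before running the notebook, it can help to confirm that the libraries listed above are importable. The sketch below uses only the standard library. The `missing_modules` helper is a hypothetical name, and the list assumes the usual lowercase import names (`numpy`, `matplotlib`).

```python
import importlib.util

# Import names for the libraries in the list above (assumed to be the
# standard lowercase module names).
REQUIRED = ["pandas", "numpy", "matplotlib", "requests", "tweepy", "json"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are available.")
```

Any library reported as missing can then be installed with the package manager of your choice before opening wrangle_act.ipynb.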