https://github.com/trainingbypackt/data-wrangling-with-python-elearning

Creating actionable data from raw sources
beautifulsoup4 data-science data-wrangling matplotlib numpy pandas python web-scraping


[![GitHub issues](https://img.shields.io/github/issues/TrainingByPackt/Data-Wrangling-with-Python-eLearning.svg)](https://github.com/TrainingByPackt/Data-Wrangling-with-Python-eLearning/issues)
[![GitHub forks](https://img.shields.io/github/forks/TrainingByPackt/Data-Wrangling-with-Python-eLearning.svg)](https://github.com/TrainingByPackt/Data-Wrangling-with-Python-eLearning/network)
[![GitHub stars](https://img.shields.io/github/stars/TrainingByPackt/Data-Wrangling-with-Python-eLearning.svg)](https://github.com/TrainingByPackt/Data-Wrangling-with-Python-eLearning/stargazers)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](https://github.com/TrainingByPackt/Data-Wrangling-with-Python-eLearning/pulls)

# Data Wrangling with Python [eLearning]
Data is the new oil, but it comes crude. To do anything meaningful with it (modeling, visualization, machine learning, or predictive analysis), you first need to wrangle it into shape. Data Wrangling with Python teaches you the essentials that will get you up and running with data wrangling in no time.

## What you will learn
* Use and manipulate complex and simple data structures
* Harness the full potential of DataFrames and numpy.array at run time
* Perform web scraping with BeautifulSoup4 and html5lib
* Execute advanced string search and manipulation with regular expressions (regex)
* Handle outliers and perform data imputation with pandas
* Use descriptive statistics and plotting techniques
* Practice data wrangling and modeling using data generation techniques
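
To give a feel for the outlier handling and imputation topics listed above, here is a minimal sketch using pandas. The column name `price`, the sample values, the median fill, and the 1.5 × IQR rule are all illustrative assumptions, not material taken from the course itself.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one missing value and one obvious outlier.
df = pd.DataFrame({"price": [10.0, 12.0, np.nan, 11.0, 500.0, 9.0]})

# Impute the missing value with the column median (NaN is skipped by default).
median = df["price"].median()
df["price_filled"] = df["price"].fillna(median)

# Flag outliers with the common 1.5 * IQR rule (an assumed choice of method).
q1 = df["price_filled"].quantile(0.25)
q3 = df["price_filled"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df["is_outlier"] = ~df["price_filled"].between(lower, upper)

# Keep only the rows that fall inside the accepted range.
cleaned = df.loc[~df["is_outlier"], "price_filled"]
print(cleaned.tolist())
```

The median is used here rather than the mean because the mean would be pulled toward the outlier before it has been removed; other imputation strategies may suit other data better.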

### Hardware requirements
For an optimal learning experience, we recommend the following hardware configuration:
* Processor: Intel Core i5 or equivalent
* Memory: 4 GB RAM (8 GB preferred)
* Storage: 35 GB available space

### Software requirements
You must also install the following software in advance:
* OS: Windows 7 SP1 64-bit, Windows 8.1 64-bit, Windows 10 64-bit, Ubuntu Linux, or the latest version of OS X
* Browser: latest version of Google Chrome or Mozilla Firefox
* Notepad++ or Sublime Text as an editor (optional, as you can practice everything using Jupyter Notebook in your browser)
* Python 3.4+ (latest is Python 3.7), installed from https://python.org
* Python libraries as needed (Jupyter, NumPy, pandas, Matplotlib, BeautifulSoup4, and so on)
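
One way to confirm that the requirements above are in place is a short check script. This is a sketch rather than part of the course: the helper names `missing_packages` and `python_ok` are hypothetical, and the list covers only the libraries named above by their import names (`bs4` is the import name for BeautifulSoup4). Jupyter is left out because it is usually launched from the command line.

```python
import importlib.util
import sys

# Import names of the libraries listed above (assumed list, extend as needed).
REQUIRED = ["numpy", "pandas", "matplotlib", "bs4"]


def missing_packages(names):
    """Return the top-level packages from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(min_version=(3, 4)):
    """Return True if the running interpreter is at least `min_version`."""
    return sys.version_info >= min_version


print("Python version OK:", python_ok())
print("Missing packages:", missing_packages(REQUIRED))
```

Any package reported as missing can be installed with `pip install`, using `beautifulsoup4` as the package name for `bs4`.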