https://github.com/tirthajyoti/web-database-analytics

Web scraping and related analytics using Python tools

analytics beautifulsoup4 data-science data-wrangling database json json-parser natural-language-processing nlp python regular-expression sql sqlite3 web-scraping xml-parser


# Web scraping, database and related analytics

[![GitHub issues](https://img.shields.io/github/issues/tirthajyoti/Web-Database-Analytics-Python.svg)](https://github.com/tirthajyoti/Web-Database-Analytics-Python/issues)
[![GitHub forks](https://img.shields.io/github/forks/tirthajyoti/Web-Database-Analytics-Python.svg)](https://github.com/tirthajyoti/Web-Database-Analytics-Python/network)
[![GitHub stars](https://img.shields.io/github/stars/tirthajyoti/Web-Database-Analytics-Python.svg)](https://github.com/tirthajyoti/Web-Database-Analytics-Python/stargazers)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](https://github.com/tirthajyoti/Web-Database-Analytics-Python/pulls)
[![Github commits](https://img.shields.io/github/commit-activity/y/tirthajyoti/Web-Database-Analytics-Python.svg)](https://github.com/tirthajyoti/Web-Database-Analytics-Python/stats/contributors)

### Dr. Tirthajyoti Sarkar ([You can connect with me on LinkedIn](https://www.linkedin.com/in/tirthajyoti-sarkar-2127aa7/))

---

### Requirements
* **Python 3.5+**
* **NumPy (`$ pip install numpy`)**
* **Pandas (`$ pip install pandas`)**
* **requests (`$ pip install requests`)**
* **BeautifulSoup4 (`$ pip install beautifulsoup4`)**
* **Matplotlib (`$ pip install matplotlib`)**

---

## [My new book on Data wrangling with Python](https://www.amazon.com/Data-Wrangling-Python-Creating-actionable-ebook/dp/B07JF26NGJ/)
![book-image](https://images-na.ssl-images-amazon.com/images/I/51-AuclWzTL.jpg)

---

## What types of notebooks are here?
* Web scraping and related analytics using Python tools
* [Fundamentals of **Reg**ular **ex**pressions (**Regex**)](https://github.com/tirthajyoti/Web-Database-Analytics-Python/blob/master/Regex_Basics.ipynb)
* Application of **urllib**
* Application of **BeautifulSoup for HTML parsing**
* [Application of **ElementTree for XML parsing**](https://github.com/tirthajyoti/Web-Database-Analytics-Python/blob/master/XML_reading_scraping.ipynb)
* Application of **Python json library for JSON parsing**
* [Application of **Python sqlite library** (building a personal movie database)](https://github.com/tirthajyoti/Web-Database-Analytics-Python/blob/master/Movie_Database_Build.ipynb)
---
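As a rough taste of two of the topics listed above, here is a minimal sketch that uses only the standard library: `re` for pattern matching and `xml.etree.ElementTree` for XML parsing. The sample text, the XML snippet, and the simplified e-mail pattern are assumptions made for illustration; they are not taken from the notebooks.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample text (not from the notebooks).
text = "Contact alice@example.com or bob@example.org for details."

# A deliberately simple e-mail pattern: fine for a demo, not RFC-complete.
email_pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
emails = email_pattern.findall(text)

# Hypothetical XML document with one attribute and one child element per book.
xml_doc = """
<library>
  <book year="1813"><title>Pride and Prejudice</title></book>
  <book year="1851"><title>Moby-Dick</title></book>
</library>
"""

root = ET.fromstring(xml_doc)

# Walk every <book> element and pull out the title text and the year attribute.
books = [(book.findtext("title"), int(book.get("year"))) for book in root.iter("book")]

print(emails)
print(books)
```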
### [How to design your own mini-IMDB movie database by scraping the web](https://github.com/tirthajyoti/Web-Database-Analytics-Python/blob/master/Movie_Database_Build.ipynb)?
---
**[Check out this article I wrote on Medium about this topic](https://towardsdatascience.com/step-by-step-guide-to-build-your-own-mini-imdb-database-fc39af27d21b)**
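
The storage side of such a project can be sketched with the standard `sqlite3` module. This is not the notebook's actual code: the table layout, the `build_movie_db` helper, and the sample rows (with illustrative, not real, ratings) are all assumptions.

```python
import sqlite3

# Hypothetical records standing in for scraped or API-fetched movie data.
# The ratings are illustrative values, not real data.
movies = [
    ("The Matrix", 1999, 8.7),
    ("Inception", 2010, 8.8),
    ("Toy Story", 1995, 8.3),
]


def build_movie_db(rows, path=":memory:"):
    """Create the movies table if needed and upsert the given rows."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movies ("
        "title TEXT PRIMARY KEY, year INTEGER, rating REAL)"
    )
    # Parameterized inserts avoid quoting problems in scraped titles.
    conn.executemany("INSERT OR REPLACE INTO movies VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


conn = build_movie_db(movies)

# Query the database like any other SQL store.
top = conn.execute(
    "SELECT title FROM movies WHERE rating >= 8.5 ORDER BY rating DESC"
).fetchall()
print(top)
```

Using the title as the primary key together with `INSERT OR REPLACE` means that re-running the scraper updates existing rows instead of duplicating them.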

---
### [How to scrape simple facts about various nations from the CIA website (this is harmless, I promise)](https://github.com/tirthajyoti/Web-Database-Analytics-Python/blob/master/CIA-Factbook-Analytics2.ipynb)?
**[Check out this article I wrote on Medium about this topic](https://towardsdatascience.com/data-analytics-with-python-by-web-scraping-illustration-with-cia-world-factbook-abbdaa687a84)**

---
### [How to build a Yelp crawler that can generate an interesting word cloud based on a particular city's cuisine and tastes](https://github.com/tirthajyoti/Web-Database-Analytics-Python/tree/master/Yelp_Review)?
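A word cloud is driven by word frequencies, so the counting step can be sketched on its own. The review texts, the stopword list, and the `word_frequencies` helper below are hypothetical; the crawler in the linked folder may tokenize and filter differently.

```python
import re
from collections import Counter

# Hypothetical review snippets standing in for crawled text.
reviews = [
    "Great tacos and great salsa, but the service was slow.",
    "The tacos were spicy and the salsa was fresh.",
]

# A tiny, assumed stopword list; real projects usually use a larger one.
STOPWORDS = {"the", "and", "but", "was", "were", "a", "an", "of"}


def word_frequencies(texts, stopwords=STOPWORDS):
    """Lowercase the texts, split them into words, and count non-stopwords."""
    words = re.findall(r"[a-z']+", " ".join(texts).lower())
    return Counter(word for word in words if word not in stopwords)


freqs = word_frequencies(reviews)
print(freqs.most_common(3))
```

The resulting mapping of word to count is the kind of input a word-cloud renderer typically accepts.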

---
### How to crawl the [Project Gutenberg](https://www.gutenberg.org/) portal and download the 100 most popular books automatically?
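The link-extraction step of such a crawler might look like the sketch below. It swaps in the standard library's `html.parser` rather than BeautifulSoup, works on a hypothetical HTML snippet instead of a live page, and assumes that book links have the form `/ebooks/<id>`. The `EbookLinkParser` class is illustrative and is not the notebook's code.

```python
import re
from html.parser import HTMLParser


class EbookLinkParser(HTMLParser):
    """Collect unique numeric book IDs from links shaped like /ebooks/<id>."""

    def __init__(self):
        super().__init__()
        self.book_ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = re.fullmatch(r"/ebooks/(\d+)", href)
        if match:
            book_id = int(match.group(1))
            # Keep first-seen order and skip repeated links.
            if book_id not in self.book_ids:
                self.book_ids.append(book_id)


# Hypothetical snippet; a real crawler would download the page first.
sample_html = """
<ol>
  <li><a href="/ebooks/1342">Pride and Prejudice</a></li>
  <li><a href="/ebooks/2701">Moby Dick</a></li>
  <li><a href="/about/">About</a></li>
  <li><a href="/ebooks/1342">Pride and Prejudice</a></li>
</ol>
"""

parser = EbookLinkParser()
parser.feed(sample_html)

# Cap the list at 100 entries, matching the "top 100" goal.
book_ids = parser.book_ids[:100]
print(book_ids)
```

Each collected ID can then be turned into a download URL. When fetching many files, pause between requests and follow the site's robot-access policy.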

---
### [How to use a free API to download basic information about countries around the world and build a database](https://github.com/tirthajyoti/Web-Database-Analytics-Python/blob/master/Countries-JSON-API.ipynb)?
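Once the API response has been downloaded, turning it into a database takes two steps: parse the JSON, then insert the rows. The sketch below skips the network call and uses a hypothetical payload with made-up countries. The field names (`name`, `capital`, `population`), the `to_rows` helper, and the table layout are assumptions, so a real API will need different keys.

```python
import json
import sqlite3

# Hypothetical payload with made-up countries; real field names will differ.
payload = """
[
  {"name": "Freedonia", "capital": "Fredville", "population": 1200000},
  {"name": "Elbonia", "capital": "Mudtown", "population": 850000},
  {"name": "Atlantis", "population": 0}
]
"""


def to_rows(raw):
    """Parse a JSON array and flatten each record into a tuple.

    Missing keys fall back to None (capital) or 0 (population) so that
    incomplete records do not abort the whole load.
    """
    records = json.loads(raw)
    return [(r["name"], r.get("capital"), r.get("population", 0)) for r in records]


rows = to_rows(payload)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE countries (name TEXT PRIMARY KEY, capital TEXT, population INTEGER)"
)
conn.executemany("INSERT INTO countries VALUES (?, ?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM countries").fetchone()[0]
print(count, rows[2])
```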