https://github.com/essien1990/webscraping_using_python

Web scraping with Python: scrape the Tonaton.com and Weather.gov websites to extract specific content, transform the data, and load it into a PostgreSQL database.

jupyter-lab jupyter-notebook pandas-dataframe pgadmin4 postgresql-database python3 regex requests scrapy scrapy-crawler scrapy-sp webscraping



# Web Scraping using Python (BeautifulSoup & Scrapy)
- BeautifulSoup, with the lxml parser, was used to parse the content of the website
- Requests was used to fetch the page from the website URL
- Some transformation was done, i.e. new columns were created to store the specific weather information needed
- Pandas was used to store the scraped data in a DataFrame
- The transformed data was written out in JSON and CSV file formats
- SQLAlchemy was used to create a PostgreSQL engine and connection to load the data into a PostgreSQL database
- [DataFrame stored in Database Weather](https://user-images.githubusercontent.com/5301791/137428662-06a7fbad-047e-436a-86f7-abca0dbdc8ed.png)
- [DataFrame stored in Database Weather in Schema forecasts](https://user-images.githubusercontent.com/5301791/137428668-0fe365f7-9c22-4fd1-8e0e-03f94f68d1b5.png)
- A regex pattern was used to extract specific data from the Tonaton web page
- The Scrapy framework was used to scrape a shop website, extracting the name, price, and link of each item and storing them in CSV and JSON.
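
The extract, transform, and store steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. The HTML snippet, the `ITEM_PATTERN` regex, and the helper names (`extract_items`, `to_json`, `to_csv`) are all hypothetical, since the real page structure is not shown in this README. To stay dependency-free, the sketch uses the standard library's `re`, `json`, and `csv` modules in place of the BeautifulSoup, Pandas, and Scrapy tooling the project itself used.

```python
import csv
import io
import json
import re

# Hypothetical listing markup, standing in for a fetched page.
SAMPLE_HTML = """
<li class="item"><a href="/ad/iphone-11">iPhone 11</a><span class="price">GHS 3,500</span></li>
<li class="item"><a href="/ad/hp-laptop">HP Laptop</a><span class="price">GHS 2,200</span></li>
"""

# Assumed pattern: one link followed directly by one price span per item.
ITEM_PATTERN = re.compile(
    r'<a href="(?P<link>[^"]+)">(?P<name>[^<]+)</a>'
    r'<span class="price">(?P<price>[^<]+)</span>'
)


def extract_items(html):
    """Extract name, link, and price from each item matched by the regex."""
    items = []
    for match in ITEM_PATTERN.finditer(html):
        raw_price = match.group("price").strip()
        items.append({
            "name": match.group("name").strip(),
            "link": match.group("link"),
            "price_text": raw_price,
            # Transformation step: drop non-digits to get a numeric column.
            "price": int(re.sub(r"[^\d]", "", raw_price)),
        })
    return items


def to_json(items):
    """Serialize the extracted items as a JSON array."""
    return json.dumps(items, ensure_ascii=False, indent=2)


def to_csv(items):
    """Serialize the extracted items as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "link", "price_text", "price"]
    )
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


items = extract_items(SAMPLE_HTML)
json_text = to_json(items)
csv_text = to_csv(items)
```

In a real run, the HTML would come from an HTTP response rather than a string constant, and the JSON and CSV text would be written to files. A regex is workable for small, regular fragments like this one, but a parser such as BeautifulSoup is generally more robust against markup changes.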