https://github.com/hugo-hattori/commodity_price_webscraping

Web Scraping and Automation Project.
automation jupyter jupyter-notebook pandas pandas-dataframe pandas-python python selenium web-scraping


# Commodity Price Web Scraping

This small project helps a fictional importing company decide when to purchase a
commodity by comparing the commodity's current price with the "ideal price"
established by the company.

### Packages used:
+ selenium
+ pandas

## Importing the database

The commodities to analyze and their respective "ideal prices" are stored in the
company's database, so we use the pandas package to extract that data.

https://github.com/Hugo-Hattori/Commodity_Price_WebScraping/blob/169347eef21857c6ccde28024b383e4ba54281f9/Commodity_Price_WebScraping.py#L5-L7

The simulated database is the "commodities.xlsx" file.

![img.png](img.png)
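
As a rough illustration of this step, the sketch below loads the spreadsheet with `pandas.read_excel` and checks that the expected columns are present. The column names `Product` and `Ideal Price` and the helper names are assumptions made for the example, not taken from the project; the real headers are whatever "commodities.xlsx" uses.

```python
import pandas as pd

# Hypothetical column names, chosen only for this sketch.
REQUIRED_COLUMNS = ("Product", "Ideal Price")


def check_columns(table: pd.DataFrame) -> pd.DataFrame:
    """Raise ValueError if any column this sketch relies on is missing."""
    missing = [name for name in REQUIRED_COLUMNS if name not in table.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return table


def load_commodities(path: str = "commodities.xlsx") -> pd.DataFrame:
    """Read the simulated database into a DataFrame and validate it."""
    # pd.read_excel needs an Excel engine such as openpyxl for .xlsx files.
    return check_columns(pd.read_excel(path))
```

Calling `load_commodities()` from the folder that contains "commodities.xlsx" would return the table as a DataFrame, provided the file uses the assumed headers.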

## Researching commodity prices

The next step is to look up each commodity's price, in this case on the
"www.melhorcambio.com" website. The site's URL scheme lets us reach a commodity's
price page by appending a string containing the commodity's name to the site's
address (e.g., "https://www.melhorcambio.com/commodity_name").

https://github.com/Hugo-Hattori/Commodity_Price_WebScraping/blob/169347eef21857c6ccde28024b383e4ba54281f9/Commodity_Price_WebScraping.py#L9-L18
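
The lookup described above could be sketched roughly as follows. Several details here are assumptions rather than facts about the project or the website: that names are lower-cased and stripped of accents before being appended to the URL, that the price sits in the `value` attribute of an input element, that it is written in Brazilian number format (e.g. `1.234,56`), and the `price_xpath` parameter itself, which is a hypothetical placeholder for whatever locator the page actually requires.

```python
import unicodedata
from urllib.parse import quote

BASE_URL = "https://www.melhorcambio.com"


def build_url(commodity_name: str) -> str:
    """Append a lower-cased, accent-free commodity name to the site address."""
    # Assumption: the site expects plain ASCII, lower-case names in the URL.
    ascii_name = (
        unicodedata.normalize("NFKD", commodity_name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return f"{BASE_URL}/{quote(ascii_name.strip().lower())}"


def parse_price(text: str) -> float:
    """Convert a Brazilian-formatted number such as '1.234,56' to a float."""
    # Assumption: '.' separates thousands and ',' marks the decimals.
    return float(text.strip().replace(".", "").replace(",", "."))


def fetch_price(driver, commodity_name: str, price_xpath: str) -> float:
    """Open the commodity page with a Selenium driver and read its price."""
    driver.get(build_url(commodity_name))
    # "xpath" is the string value behind selenium's By.XPATH constant.
    field = driver.find_element("xpath", price_xpath)
    # Assumption: the price is stored in the element's "value" attribute.
    return parse_price(field.get_attribute("value"))
```

With Selenium installed, `driver` would be something like `selenium.webdriver.Chrome()`, and `fetch_price` would be called once per row of the commodities table. Keeping `build_url` and `parse_price` free of Selenium makes them easy to check without opening a browser.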

## Updating the database

Now we need to update the database and compare the current prices with the "ideal"
ones. If the current price is lower than the "ideal" price, then it is the right time
to make the commodity purchase; otherwise, it is advisable to wait.

https://github.com/Hugo-Hattori/Commodity_Price_WebScraping/blob/169347eef21857c6ccde28024b383e4ba54281f9/Commodity_Price_WebScraping.py#L20-L21
https://github.com/Hugo-Hattori/Commodity_Price_WebScraping/blob/169347eef21857c6ccde28024b383e4ba54281f9/Commodity_Price_WebScraping.py#L24-L28

The updated database was named "commodities_atualizado.xlsx".

![img_1.png](img_1.png)
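
The comparison rule can be expressed as a single vectorized pandas operation. This is a minimal sketch under assumed column names (`Current Price`, `Ideal Price`, and a new `Buy` column); the project's actual headers may differ.

```python
import pandas as pd


def flag_purchases(table: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a boolean "Buy" column (hypothetical name).

    "Buy" is True when the current price is strictly lower than the ideal one.
    """
    updated = table.copy()  # leave the caller's DataFrame untouched
    updated["Buy"] = updated["Current Price"] < updated["Ideal Price"]
    return updated
```

The result could then be written out with `updated.to_excel("commodities_atualizado.xlsx", index=False)`, which requires an Excel engine such as openpyxl.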

Note: this project was developed for academic purposes, so the data in the
"commodities.xlsx" and "commodities_atualizado.xlsx" files is fictitious and
serves only to practice using the pandas and Selenium packages.