Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/hanan-nawaz/daraz-price-comparator

Daraz Price Comparator πŸŒπŸ›οΈ Compare prices effortlessly! πŸ”„ Find deals in πŸ‡΅πŸ‡° Pakistan, πŸ‡§πŸ‡© Bangladesh, πŸ‡³πŸ‡΅ Nepal, and πŸ‡±πŸ‡° Sri Lanka. Python, Requests, BeautifulSoup, and Airflow power smart shopping. Save time, shop savvy! πŸ›’πŸ’‘
https://github.com/hanan-nawaz/daraz-price-comparator

beautifulsoup bs4 darazpk python requests-python telemart webscraping

Last synced: about 2 months ago
JSON representation

Daraz Price Comparator πŸŒπŸ›οΈ Compare prices effortlessly! πŸ”„ Find deals in πŸ‡΅πŸ‡° Pakistan, πŸ‡§πŸ‡© Bangladesh, πŸ‡³πŸ‡΅ Nepal, and πŸ‡±πŸ‡° Sri Lanka. Python, Requests, BeautifulSoup, and Airflow power smart shopping. Save time, shop savvy! πŸ›’πŸ’‘

Awesome Lists containing this project

README

        

# Daraz Price ComparatorπŸŒπŸ›οΈ

Welcome to the Daraz Price Comparator repository! πŸŒπŸ›οΈ This project automates price search and comparison on Daraz across Pakistan, Nepal, Sri Lanka, and Bangladesh. Powered by Python, Requests, BeautifulSoup, and Airflow, it simplifies the task of finding budget-friendly deals. Shop smartly with this efficient tool! πŸ’‘πŸ›’

## Overview

### Core Features

1. **Dynamic E-commerce Search**: After user input, Initiate a Product search on Daraz.

2. **Automated Price Comparison**: The scraper automates Product searches and compares prices across multiple listings.

3. **Airflow Workflow**: Integrated with Apache Airflow for scheduled and automated execution.

4. **Data Storage**: Extracted Product details and prices are stored in a structured format.

## Project Process

This repository houses the code and configuration for the Daraz Price ComparatorπŸŒπŸ›οΈ project. The workflow encompasses initiating a search on Daraz across Pakistan, Nepal, Sri Lanka, and Bangladesh. It extracts details from the first page and efficiently compares prices. The entire process is seamlessly orchestrated using Airflow for automation, ensuring a hassle-free shopping experience. πŸ’»πŸ”„πŸ›’

### 1. E-commerce Search

- User enter its desired product name, Initiating a Product search on Daraz (Only first page).

### 2. Web Scraping

- Python scripts leverage Requests and BeautifulSoup for web scraping to extract product details and prices from both platforms.

### 3. Data Comparison (ETL)

- Extracted data is compared to identify the Product with the lowest price.
- We compare data of both platforms, Converting Currency and other relevant parameters and store in MongoDb.

### 4. Airflow Integration

- Airflow is configured to automate the entire process at scheduled intervals.

## Technology Stack

- **Web Scraping**: Python (Requests, BeautifulSoup)
- **Automation**: Apache Airflow
- **Data Storage**: Structured format for extracted data (MongoDb)

## File Structure

- `ecommerce-product-comparator/`: Contains the code and configuration for the Daraz Price ComparatorπŸŒπŸ›οΈ project.
- `process-imgs/`: Contains helping images related to the project.
- `ecommerce-product-comparator-logo.jpg`: Logo image for the project.
- `ecommerce-product-comparator-process.jpg`: Process/Roadmap image for the project.
- `main.py`: Contains code, generally calling other classes.
- `code/`: Conatins all the code files.
- `daraz_scraper.py`: Conatins code to scrape from Daraz.
- `extract.py`: Contains code to call both scrapers.
- `transform.py`: Contains code to implement Extract, Transform and Load.
- `load_to_mongodb.py`: Conatins code to load data into MongoDb using PyMongo.
- `README.md`: You are here, providing an overview of the project.

## Getting Started

To replicate this project, follow these steps:

1. Clone this project.

2. Run the Python scripts for web scraping and data comparison.

3. Optionally, integrate with Airflow for automated and scheduled execution.

4. The naming convention i use is `snaking case`.

Feel free to explore the project directory for more details on implementation and configuration.

## Contributors

- [Abdul Hanan Nawaz](https://www.github.com/Hanan-Nawaz)

## License

This project is licensed under the [MIT License](LICENSE).

## πŸ”— Links
[![portfolio](https://img.shields.io/badge/my_portfolio-000?style=for-the-badge&logo=ko-fi&logoColor=white)](https://hanannawaz.com/)
[![linkedin](https://img.shields.io/badge/linkedin-0A66C2?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/abdulhanan0/)
[![twitter](https://img.shields.io/badge/twitter-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white)](https://twitter.com/HananNawaz0/)