https://github.com/quantumudit/test-store-data-analysis

This repository showcases a web scraper with a pipeline structure for extracting and transforming data from websites. The tool can be adapted to other sites, and its output can be used for data analysis to support informed decision-making.

Host: GitHub
URL: https://github.com/quantumudit/test-store-data-analysis
Owner: quantumudit
License: other
Created: 2024-02-04T18:32:47.000Z (over 1 year ago)
Default Branch: master
Last Pushed: 2024-02-05T10:36:01.000Z (over 1 year ago)
Last Synced: 2024-12-26T08:42:27.092Z (6 months ago)
Topics: data, data-visualization, dataanalytics, python, python-webscraping, webscraper, webscraping-data
Language: Jupyter Notebook
Homepage:
Size: 481 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Test Store Data Analysis

---

Empowering users to scrape product data from the John's Test Store website.

built-with-love
powered-by-coffee
cc-nc-sa

Overview • Prerequisites • Architecture • Demo • Support • License

## Overview

The primary goal of this project is to retrieve comprehensive product data from the [John's Test Store][website_link] website and analyze it.

The project repository has the following structure:

```
Test-Store-Data-Analysis/
├── ⚙️.env
├── 📜.gitignore
├── ⚙️.pre-commit-config.yaml
├── 🔑LICENSE
├── 🐍main.py
├── 🔒poetry.lock
├── 📇pyproject.toml
├── 📝README.md
├── 🗒️requirements.txt
├── 🐍setup.py
├── 🐍template.py
├── 📁.github
│   └── 📂workflows
│       └── 📃actions.yaml
├── 📁conf
│   └── 📃configs.yaml
├── 📁data
│   ├── 📂external
│   │   ├── 📑products_link.csv
│   │   └── 📑scraped_products.csv
│   └── 📂processed
│       └── 📑products.csv
├── 📁images
│   └── 🖼️topmate_featured.png
├── 📁logs
│   └── 🧾2024_02_04_02_44_21_PM.log
├── 📁notebooks
│   ├── 📙01_web_scraping_tests.ipynb
│   └── 📙02_data_preprocessing.ipynb
├── 📁reports
│   └── .gitkeep
└── 📁src
    ├── 🐍constants.py
    ├── 🐍exception.py
    ├── 🐍logger.py
    ├── 🐍__init__.py
    ├── 📂components
    │   ├── 🐍data_preprocessor.py
    │   ├── 🐍link_extraction.py
    │   └── 🐍product_scraper.py
    ├── 📂pipelines
    │   ├── 🐍stage_01_data_extraction.py
    │   └── 🐍stage_02_data_preprocessor.py
    └── 📂utils
        └── 🐍basic_utils.py
```
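The layout above suggests a two-stage flow: link extraction and product scraping feed a preprocessing step that writes a cleaned CSV. The following is a minimal sketch of that idea using only the standard library, not the repository's actual code. The `product-link` CSS class, the `name` and `price` columns, and every function name here are assumptions made for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class ProductLinkParser(HTMLParser):
    """Collect hrefs from <a> tags carrying a hypothetical 'product-link' class."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "a" and "product-link" in classes and attr_map.get("href"):
            self.links.append(attr_map["href"])


def extract_links(html):
    """Stage 1 (extraction): return the product URLs found in a listing page."""
    parser = ProductLinkParser()
    parser.feed(html)
    return parser.links


def preprocess(rows):
    """Stage 2 (preprocessing): trim names and convert price strings to floats."""
    cleaned = []
    for row in rows:
        price = row["price"].replace("$", "").replace(",", "").strip()
        cleaned.append({"name": row["name"].strip(), "price": float(price)})
    return cleaned


def to_csv(rows):
    """Serialize cleaned rows to CSV text (a stand-in for writing a processed file)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Inline sample data keeps the sketch runnable without network access.
    sample_page = '<a class="product-link" href="/product/1">One</a><a href="/about">About</a>'
    print(extract_links(sample_page))
    scraped = [{"name": " Sample Item ", "price": "$1,299.50"}]
    print(to_csv(preprocess(scraped)))
```

Keeping each stage as a plain function that takes and returns simple data makes it possible to test the stages in isolation before wiring them to real HTTP requests and file paths.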

## Prerequisites

To follow the concepts and processes involved in this project, a working knowledge of the following is recommended:

- Fundamental knowledge of Python and modular coding
- Familiarity with the Python libraries listed in the 🗒️[requirements.txt][requirements] file
- Basic familiarity with data analytics and Power BI

This foundation will help ensure a smooth experience while working on this project.

> The selection of applications and their installation process may differ depending on personal preferences and computer configurations.

## Architecture

[CONTENT TO BE ADDED]

## Demo

[CONTENT TO BE ADDED]

## Support

If you have any questions, concerns, or suggestions, feel free to reach out to me through any of the following channels:

[![Linkedin Badge][linkedinbadge]][linkedin] [![Twitter Badge][twitterbadge]][twitter] [![Medium Badge][mediumbadge]][medium]

If you find my work valuable, you can show your appreciation by [buying me a coffee][buy_me_a_coffee].

## License

This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator. If you remix, adapt, or build upon the material, you must license the modified material under identical terms.

---

[project_logo]: ./images/ebooks_logo.png
[process_workflow]: ./images/process_workflow.png

[website_link]: https://gopher1.extrkt.com/
[webapp_link]: https://ebooks-extractor-app.streamlit.app/
[requirements]: ./requirements.txt

[app]: ./app.py
[scraper_funcs]: ./scraper_functions.py

[linkedin]: https://www.linkedin.com/in/uditkumarchatterjee/
[twitter]: https://twitter.com/quantumudit
[medium]: https://medium.com/@quantumudit
[buy_me_a_coffee]: https://www.buymeacoffee.com/quantumudit

[linkedinbadge]: https://img.shields.io/badge/-uditkumarchatterjee-0e76a8?style=flat&labelColor=0e76a8&logo=linkedin&logoColor=white
[twitterbadge]: https://img.shields.io/badge/-quantumudit-000000?style=flat&labelColor=000000&logo=x&logoColor=white&link=https://twitter.com/quantumudit
[mediumbadge]: https://img.shields.io/badge/-quantumudit-02b875?style=flat&labelColor=02b875&logo=medium&logoColor=white
