https://github.com/quantumudit/test-store-data-analysis
This repository showcases a web scraper with a pipeline structure for extracting and transforming data from websites. The tool can be adapted to other sites, and the scraped data can then be used for analysis.
data data-visualization dataanalytics python python-webscraping webscraper webscraping-data
- Host: GitHub
- URL: https://github.com/quantumudit/test-store-data-analysis
- Owner: quantumudit
- License: other
- Created: 2024-02-04T18:32:47.000Z (9 months ago)
- Default Branch: master
- Last Pushed: 2024-02-05T10:36:01.000Z (9 months ago)
- Last Synced: 2024-02-05T21:56:34.179Z (9 months ago)
- Topics: data, data-visualization, dataanalytics, python, python-webscraping, webscraper, webscraping-data
- Language: Jupyter Notebook
- Homepage:
- Size: 481 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Test Store Data Analysis
---
Empowering users to scrape product data from the John's Test Store website.
Overview • Prerequisites • Architecture • Demo • Support • License

## Overview
The primary goal of this project is to retrieve comprehensive product data from the [John's Test Store][website_link] website and analyze it.
The repository has the following structure:
```
Test-Store-Data-Analysis/
├── ⚙️.env
├── 📜.gitignore
├── ⚙️.pre-commit-config.yaml
├── 🔑LICENSE
├── 🐍main.py
├── 🔒poetry.lock
├── 📇pyproject.toml
├── 📝README.md
├── 🗒️requirements.txt
├── 🐍setup.py
├── 🐍template.py
├── 📁.github
│   └── 📂workflows
│       └── 📃actions.yaml
├── 📁conf
│   └── 📃configs.yaml
├── 📁data
│   ├── 📂external
│   │   ├── 📑products_link.csv
│   │   └── 📑scraped_products.csv
│   └── 📂processed
│       └── 📑products.csv
├── 📁images
│   └── 🖼️topmate_featured.png
├── 📁logs
│   └── 🧾2024_02_04_02_44_21_PM.log
├── 📁notebooks
│   ├── 📙01_web_scraping_tests.ipynb
│   └── 📙02_data_preprocessing.ipynb
├── 📁reports
│   └── .gitkeep
└── 📁src
    ├── 🐍constants.py
    ├── 🐍exception.py
    ├── 🐍logger.py
    ├── 🐍__init__.py
    ├── 📂components
    │   ├── 🐍data_preprocessor.py
    │   ├── 🐍link_extraction.py
    │   └── 🐍product_scraper.py
    ├── 📂pipelines
    │   ├── 🐍stage_01_data_extraction.py
    │   └── 🐍stage_02_data_preprocessor.py
    └── 📂utils
        └── 🐍basic_utils.py
```
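
The layout above suggests a two-stage flow: an extraction stage that writes raw rows under `data/external`, followed by a preprocessing stage that writes cleaned rows under `data/processed`. The sketch below illustrates that idea only; it is not the repository's code. The function names (`run_extraction`, `run_preprocessing`, `fake_scrape`), the `name` and `price` columns, and the cleaning rules are all assumptions made for illustration, and the scraper is a canned stand-in so that no network access is needed.

```python
import csv
import tempfile
from pathlib import Path

# Paths mirror the data/ folders in the tree above.
RAW_CSV = Path("data/external/scraped_products.csv")
CLEAN_CSV = Path("data/processed/products.csv")
# Assumed column names; the real CSV schema is not documented in this README.
FIELDS = ["name", "price"]


def write_csv(path: Path, rows: list[dict], fieldnames: list[str]) -> None:
    """Write rows to a CSV file, creating parent folders as needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_csv(path: Path) -> list[dict[str, str]]:
    """Read a CSV file into a list of row dictionaries."""
    with path.open(newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


def run_extraction(links, scrape, root: Path) -> Path:
    """Stage 1 (hypothetical): scrape every link and store the raw rows."""
    out_path = root / RAW_CSV
    write_csv(out_path, [scrape(link) for link in links], FIELDS)
    return out_path


def run_preprocessing(root: Path) -> Path:
    """Stage 2 (hypothetical): trim names, parse prices, drop duplicates."""
    seen: set[str] = set()
    cleaned = []
    for row in read_csv(root / RAW_CSV):
        name = row["name"].strip()
        if name in seen:
            continue  # keep only the first occurrence of each product
        seen.add(name)
        cleaned.append({"name": name, "price": float(row["price"].lstrip("$"))})
    out_path = root / CLEAN_CSV
    write_csv(out_path, cleaned, FIELDS)
    return out_path


def fake_scrape(link: str) -> dict[str, str]:
    """Stand-in for a real HTTP scraper; returns canned data, no network."""
    return {"name": f" Product {link[-1]} ", "price": "$10.00"}


if __name__ == "__main__":
    # Run both stages inside a temporary folder so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        run_extraction(["/p/1", "/p/2", "/p/1"], fake_scrape, root)
        print(read_csv(run_preprocessing(root)))
```

Passing the scraper in as a function keeps each stage testable offline; a real scraper that performs HTTP requests could be swapped in without changing either stage.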
## Prerequisites
To follow the concepts and processes in this project, a solid grounding in the following is recommended:
- Fundamental knowledge of Python and modular coding
- Familiarity with the Python libraries listed in the 🗒️[requirements.txt][requirements] file
- Basic familiarity with data analytics and Power BI

Having these skills as a foundation will help ensure a smooth and effective experience while working on this project.
> The selection of applications and their installation process may differ depending on personal preferences and computer configurations.
## Architecture
[CONTENT TO BE ADDED]
## Demo
[CONTENT TO BE ADDED]
## Support
If you have any questions, concerns, or suggestions, feel free to reach out to me through any of the following channels:
[![Linkedin Badge][linkedinbadge]][linkedin] [![Twitter Badge][twitterbadge]][twitter] [![Medium Badge][mediumbadge]][medium]
If you find my work valuable, you can show your appreciation by [buying me a coffee][buy_me_a_coffee].
## License
This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator. If you remix, adapt, or build upon the material, you must license the modified material under identical terms.
---
[project_logo]: ./images/ebooks_logo.png
[process_workflow]: ./images/process_workflow.png
[website_link]: https://gopher1.extrkt.com/
[webapp_link]: https://ebooks-extractor-app.streamlit.app/
[requirements]: ./requirements.txt
[app]: ./app.py
[scraper_funcs]: ./scraper_functions.py
[linkedin]: https://www.linkedin.com/in/uditkumarchatterjee/
[twitter]: https://twitter.com/quantumudit
[medium]: https://medium.com/@quantumudit
[buy_me_a_coffee]: https://www.buymeacoffee.com/quantumudit
[linkedinbadge]: https://img.shields.io/badge/-uditkumarchatterjee-0e76a8?style=flat&labelColor=0e76a8&logo=linkedin&logoColor=white
[twitterbadge]: https://img.shields.io/badge/-quantumudit-000000?style=flat&labelColor=000000&logo=x&logoColor=white&link=https://twitter.com/quantumudit
[mediumbadge]: https://img.shields.io/badge/-quantumudit-02b875?style=flat&labelColor=02b875&logo=medium&logoColor=white