https://github.com/drisskhattabi6/Data-Scraping-Tasks
This Repo contains My Data Scraping Projects
- Host: GitHub
- URL: https://github.com/drisskhattabi6/Data-Scraping-Tasks
- Owner: drisskhattabi6
- Created: 2024-08-12T16:53:13.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-20T08:28:30.000Z (about 1 year ago)
- Last Synced: 2024-08-21T10:28:28.095Z (about 1 year ago)
- Language: Jupyter Notebook
- Size: 304 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Data Scraping Tasks
Welcome to the **Data Scraping Tasks** repository! This repository showcases my data scraping tasks, in which I use Python and libraries such as `BeautifulSoup`, `Scrapy`, and `Requests` to extract, process, and analyze data from the web. Each project is stored in its own folder with detailed explanations, code, and datasets.
---
## Projects Overview
### 1. [Books to Scrape](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Books%20to%20Scrape)
This project extracts book information (title, price, rating, availability, etc.) from the [Books to Scrape website](http://books.toscrape.com/). The dataset can be useful for e-commerce analysis, book categorization, and price comparisons.

### 2. [Quotes to Scrape](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Quotes%20to%20Scrape)
In this project, quotes, authors, and tags are scraped from the [Quotes to Scrape website](http://quotes.toscrape.com/). This dataset is great for sentiment analysis or NLP experiments.

### 3. [Countries of the World](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Countries%20of%20the%20World)
This project focuses on extracting detailed information about countries, including names, capitals, and population. The dataset can support geographical and demographic analysis.

### 4. [Etsy Product](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Etsy%20Product)
Data is scraped from Etsy product listings, including product names, prices, and seller details. This project is useful for market research and competitor analysis in the e-commerce space.

### 5. [Hockey Teams](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Hockey%20Teams)
A project that collects information about hockey teams, their players, and match statistics. It is ideal for sports analytics and fan engagement projects.

### 6. [List of Largest Companies in the USA by Revenue](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/List%20of%20largest%20companies%20in%20the%20USA%20by%20Revenue)
This project extracts financial data about the largest companies in the USA by revenue. The dataset supports financial analysis and insights into the U.S. economy.

### 7. [Moviesjoy Movies Data](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Moviesjoy%20Movies%20Data)
Scrapes data from the Moviesjoy website, including movie titles, genres, and ratings. The dataset can be leveraged for building recommendation systems or film research.

### 8. [Wikipedia Page](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Wikipedia%20Page)
This project demonstrates how to scrape structured data from Wikipedia pages. It is a versatile tool for extracting general knowledge or niche information.

### 9. [World Countries Population](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/World%20Countries%20Population)
Scrapes data about the population of countries worldwide. This dataset is perfect for demographic studies and visualization projects.

### 10. [World Population Data](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/World%20Population%20Data)
Similar to the previous project but includes additional attributes like growth rate, density, and urban population percentages for deeper insights into global population trends.

---
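As a rough illustration of the extraction step described for the Books to Scrape project, the sketch below pulls title, price, rating, and availability out of a small HTML sample. It is not the code from the project folder. The sample markup and its class names (`product_pod`, `star-rating`, `price_color`, `availability`) are assumptions modeled on that site's listing pages, and the names `BookParser` and `parse_books` are hypothetical. It uses the standard library's `html.parser` instead of `BeautifulSoup` so that it runs without extra installs.

```python
from html.parser import HTMLParser

# Illustrative sample only: the structure and class names are assumptions,
# so check them against the live page before relying on them.
SAMPLE_HTML = """
<article class="product_pod">
  <p class="star-rating Three"></p>
  <h3><a href="book-1/index.html" title="A Light in the Attic">A Light in the ...</a></h3>
  <p class="price_color">&pound;51.77</p>
  <p class="instock availability">In stock</p>
</article>
<article class="product_pod">
  <p class="star-rating One"></p>
  <h3><a href="book-2/index.html" title="Tipping the Velvet">Tipping the Velvet</a></h3>
  <p class="price_color">&pound;53.74</p>
  <p class="instock availability">In stock</p>
</article>
"""

# The rating is encoded as a word in the class attribute.
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


class BookParser(HTMLParser):
    """Collects one dict per <article class="product_pod"> element."""

    def __init__(self):
        super().__init__()
        self.books = []
        self._current = None  # book being filled in, or None outside an article
        self._field = None    # text field expected next ("price"/"availability")

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "article" and "product_pod" in classes:
            self._current = {"title": None, "price": None,
                             "rating": None, "availability": None}
        elif self._current is None:
            return
        elif tag == "p" and "star-rating" in classes:
            word = next((c for c in classes if c in RATING_WORDS), None)
            self._current["rating"] = RATING_WORDS.get(word)
        elif tag == "a" and attrs.get("title"):
            # The full title lives in the title attribute; link text may be truncated.
            self._current["title"] = attrs["title"]
        elif tag == "p" and "price_color" in classes:
            self._field = "price"
        elif tag == "p" and "availability" in classes:
            self._field = "availability"

    def handle_data(self, data):
        text = data.strip()
        if self._current is None or self._field is None or not text:
            return
        if self._field == "price":
            # Strip the pound sign (U+00A3) and convert to a number.
            self._current["price"] = float(text.lstrip("\u00a3"))
        else:
            self._current["availability"] = text

    def handle_endtag(self, tag):
        if tag == "p":
            self._field = None
        elif tag == "article" and self._current is not None:
            self.books.append(self._current)
            self._current = None


def parse_books(html):
    """Return a list of book dicts parsed from listing-page HTML."""
    parser = BookParser()
    parser.feed(html)
    parser.close()
    return parser.books


if __name__ == "__main__":
    for book in parse_books(SAMPLE_HTML):
        print(book)
```

In a real run, the HTML would come from an HTTP request to the site rather than from an inline string, and the parsing logic would need adjusting if the page structure differs from the sample.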
## Key Features
- **Clean and Organized:** Each project includes a folder with code, README file, and scraped dataset for easy navigation and replication.
- **Scalable Codebase:** Designed to handle dynamic content and adapt to changes in website structures.
- **Reusable Scripts:** Scripts can be reused and customized for scraping similar data from other sources.

---
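One way to make a scraping script reusable, in the spirit of the features listed above, is to separate the page-following loop from the site-specific fetching and parsing. The sketch below is a minimal example of that idea and is not taken from the project code. The function `crawl_pages`, its `fetch` and `parse` parameters, and the fake site used to exercise it are all hypothetical.

```python
from urllib.parse import urljoin


def crawl_pages(start_url, fetch, parse, max_pages=50):
    """Follow next-page links from start_url and collect items from each page.

    fetch(url) returns the page content as text.
    parse(text) returns (items, next_href), where next_href is None on the last page.
    """
    items = []
    seen = set()  # guards against loops in the pagination links
    url = start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page_items, next_href = parse(fetch(url))
        items.extend(page_items)
        # Resolve relative links such as "/page/2/" against the current URL.
        url = urljoin(url, next_href) if next_href else None
    return items


# A stand-in for a real site, so the sketch runs without network access.
FAKE_PAGES = {
    "http://example.test/page/1/": "quote A\nquote B\nnext:/page/2/",
    "http://example.test/page/2/": "quote C",
}


def fake_fetch(url):
    return FAKE_PAGES[url]


def fake_parse(text):
    lines = text.splitlines()
    next_href = None
    if lines and lines[-1].startswith("next:"):
        next_href = lines.pop()[len("next:"):]
    return lines, next_href


if __name__ == "__main__":
    print(crawl_pages("http://example.test/page/1/", fake_fetch, fake_parse))
```

To target a different site, only `fetch` and `parse` would need to change; the loop, the duplicate-URL guard, and the page limit stay the same.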
## Tools and Technologies Used
- **Libraries:** `BeautifulSoup`, `Scrapy`, `toscrap`, `Requests`, `Pandas`
- **Languages:** Python
- **Applications:** Data analysis, visualization, and insights generation

---
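Scraped records are typically saved as a dataset before any analysis. The sketch below shows one possible way to turn a list of record dicts into CSV text. It uses the standard library's `csv` module rather than `Pandas` so that it runs without extra installs, and the function name `records_to_csv` and the sample records are hypothetical.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up records, only to show the shape of the input.
    sample = [
        {"title": "Example Book", "price": 12.5},
        {"title": "Another Book", "price": 8.0},
    ]
    print(records_to_csv(sample, ["title", "price"]))
```

Writing the returned text to a file, or passing an open file to `csv.DictWriter` directly, would produce a dataset that `Pandas` can load later for analysis.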
## How to Use
1. Clone this repository:
```bash
git clone https://github.com/drisskhattabi6/Data-Scraping-Tasks.git
```
2. Navigate to the desired project folder.
3. Follow the instructions in the project's README file to run the scraping script.

---
If you have any questions or suggestions, feel free to contact me.
Happy Scraping! 🚀