https://github.com/drisskhattabi6/data-scraping-tasks

This repository showcases my data scraping Tasks, where I have used Python libraries to extract, process, and analyze data from the web.
data-scraping data-scraping-projects etsy scraper scrapethissite scraping toscrape wikipedia worldometers


# Data Scraping Tasks

Welcome to the **Data Scraping Tasks** repository! It showcases my data scraping tasks, in which I use Python and libraries such as `BeautifulSoup`, `Scrapy`, and `Requests` to extract, process, and analyze data from the web. Each project lives in its own folder with a detailed explanation, the code, and the resulting dataset.

---

## Projects Overview

### 1. [Books to Scrape](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Books%20to%20Scrape)
This project extracts book information (title, price, rating, availability, etc.) from the [Books to Scrape website](http://books.toscrape.com/). The dataset can be useful for e-commerce analysis, book categorization, and price comparisons.
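
A minimal sketch of the kind of extraction described above. It parses an inline sample rather than the live site, and it uses the standard library's `html.parser` to stay dependency-free; the project's own script may well use `BeautifulSoup` or `Scrapy` instead. The sample markup, the class names (`product_pod`, `star-rating`, `price_color`, `availability`), and the names `BookParser` and `parse_books` are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical listing markup; the class names are assumed, not taken from the repo.
SAMPLE_HTML = """
<article class="product_pod">
  <p class="star-rating Three"></p>
  <h3><a href="example-book/index.html" title="Example Book">Example ...</a></h3>
  <p class="price_color">£10.00</p>
  <p class="instock availability">In stock</p>
</article>
"""

RATINGS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


class BookParser(HTMLParser):
    """Collect one dict per <article class="product_pod"> element."""

    def __init__(self):
        super().__init__()
        self.books = []
        self._book = None   # dict for the article currently being parsed
        self._field = None  # text field we are inside ("price" or "availability")

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "article" and "product_pod" in classes:
            self._book = {}
        elif self._book is None:
            return  # ignore everything outside a product article
        elif tag == "p" and "star-rating" in classes:
            # The rating is assumed to be encoded as an extra class name, e.g. "Three".
            word = next((c for c in classes if c in RATINGS), None)
            self._book["rating"] = RATINGS.get(word)
        elif tag == "a" and "title" in attrs:
            # The full title is read from the title attribute, not the truncated link text.
            self._book["title"] = attrs["title"]
        elif tag == "p" and "price_color" in classes:
            self._field = "price"
        elif tag == "p" and "availability" in classes:
            self._field = "availability"

    def handle_data(self, data):
        text = data.strip()
        if self._book is None or self._field is None or not text:
            return
        if self._field == "price":
            # Drop the currency symbol so the price can be used numerically.
            self._book["price"] = float(text.lstrip("£"))
        else:
            self._book["availability"] = text

    def handle_endtag(self, tag):
        if tag == "p":
            self._field = None
        elif tag == "article" and self._book is not None:
            self.books.append(self._book)
            self._book = None


def parse_books(html):
    """Return a list of book dicts found in the given HTML string."""
    parser = BookParser()
    parser.feed(html)
    return parser.books


print(parse_books(SAMPLE_HTML))
```

Storing the price as a float and the rating as an integer keeps the records ready for the price comparisons mentioned above.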

### 2. [Quotes to Scrape](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Quotes%20to%20Scrape)
In this project, quotes, authors, and tags are scraped from the [Quotes to Scrape website](http://quotes.toscrape.com/). This dataset is great for sentiment analysis or NLP experiments.
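
As a rough illustration of a first step toward the NLP experiments mentioned above, the sketch below counts tag frequencies over scraped records. The record layout (`text`, `author`, `tags`), the placeholder values, and the function name `tag_counts` are assumptions; the project's actual dataset may be shaped differently.

```python
from collections import Counter

# Placeholder records in an assumed shape; these are not real scraped quotes.
quotes = [
    {"text": "First placeholder quote.", "author": "Author A", "tags": ["life", "humor"]},
    {"text": "Second placeholder quote.", "author": "Author B", "tags": ["life"]},
    {"text": "Third placeholder quote.", "author": "Author A", "tags": ["books", "humor"]},
]


def tag_counts(records):
    """Count how often each tag appears across all records."""
    return Counter(tag for record in records for tag in record["tags"])


counts = tag_counts(quotes)
print(counts.most_common())
```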

### 3. [Countries of the World](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Countries%20of%20the%20World)
This project focuses on extracting detailed information about countries, including names, capitals, and population. The dataset can support geographical and demographic analysis.
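
One way scraped records like these could be turned into a dataset file is sketched below with the standard library's `csv` module. The field names, the placeholder rows, and the helper `to_csv` are hypothetical; the project may store its data differently (for example with `Pandas`).

```python
import csv
import io

# Placeholder rows in an assumed shape; these are not real scraped values.
countries = [
    {"name": "Country A", "capital": "Capital A", "population": 1000},
    {"name": "Country B", "capital": "Capital B", "population": 2500},
]


def to_csv(records, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_csv(countries, ["name", "capital", "population"]))
```

Writing to an in-memory buffer keeps the sketch side-effect free; passing an open file object to `csv.DictWriter` instead would write the same rows to disk.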

### 4. [Etsy Product](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Etsy%20Product)
Data is scraped from Etsy product listings, including product names, prices, and seller details. This project is useful for market research and competitor analysis in the e-commerce space.

### 5. [Hockey Teams](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Hockey%20Teams)
A project that collects information about hockey teams, their players, and match statistics. It is ideal for sports analytics and fan engagement projects.

### 6. [List of Largest Companies in the USA by Revenue](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/List%20of%20largest%20companies%20in%20the%20USA%20by%20Revenue)
This project extracts financial data about the largest companies in the USA by revenue. The dataset supports financial analysis and insights into the U.S. economy.
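
Scraped financial figures usually arrive as text, so some cleaning is needed before analysis. The sketch below shows one possible approach; the function name `parse_revenue` and the sample strings are made up, and the unit of the resulting number (for example millions of dollars) depends on the source table and is not assumed here.

```python
import re


def parse_revenue(text):
    """Convert a string such as '$1,234.5' to a float; return None if no number is found."""
    match = re.search(r"-?\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Remove thousands separators before converting.
    return float(match.group().replace(",", ""))


# Placeholder strings, not real company figures.
for raw in ["$1,234.5", "987,654", "n/a"]:
    print(raw, "->", parse_revenue(raw))
```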

### 7. [Moviesjoy Movies Data](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Moviesjoy%20Movies%20Data)
This project scrapes data from the Moviesjoy website, including movie titles, genres, and ratings. The dataset can be used to build recommendation systems or to support film research.

### 8. [Wikipedia Page](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/Wikipedia%20Page)
This project demonstrates how to scrape structured data from Wikipedia pages. It is a versatile tool for extracting general knowledge or niche information.
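
Much of the structured data on a wiki page sits in HTML tables. The sketch below turns a table into a list of dicts using only the standard library; it is an illustration under assumptions, not the project's script, which may rely on `BeautifulSoup` or `Pandas`. The sample table, the `TableParser` class, and the `table_to_records` helper are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical table markup standing in for a downloaded page.
SAMPLE_TABLE = """
<table class="wikitable">
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td><a href="/wiki/Alpha">Alpha</a></td><td>1</td></tr>
  <tr><td>Beta</td><td>2</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being parsed
        self._cell = None  # text fragments of the cell being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_records(html):
    """Use the first row as the header and return one dict per remaining row."""
    parser = TableParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


print(table_to_records(SAMPLE_TABLE))
```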

### 9. [World Countries Population](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/World%20Countries%20Population)
This project scrapes population data for countries worldwide. The dataset is well suited to demographic studies and visualization projects.

### 10. [World Population Data](https://github.com/drisskhattabi6/data-scraping-projects/tree/main/World%20Population%20Data)
This project is similar to the previous one but adds attributes such as growth rate, density, and urban population percentage for deeper insight into global population trends.
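
The source site may publish these attributes directly, but they can also be derived or cross-checked from raw counts. The sketch below uses the usual definitions, which are an assumption here; the function name `derived_metrics`, its parameters, and the input numbers are placeholders rather than real data.

```python
def derived_metrics(population, previous_population, area_km2, urban_population):
    """Compute growth rate, density, and urban share from raw counts."""
    return {
        # Percentage change relative to the earlier population figure.
        "growth_rate_pct": (population - previous_population) / previous_population * 100,
        # People per square kilometre.
        "density_per_km2": population / area_km2,
        # Share of the population living in urban areas.
        "urban_pct": urban_population / population * 100,
    }


# Placeholder inputs, not real country data.
metrics = derived_metrics(
    population=1_100_000,
    previous_population=1_000_000,
    area_km2=50_000,
    urban_population=550_000,
)
print(metrics)
```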

---

## Key Features
- **Clean and Organized:** Each project has its own folder containing the code, a README file, and the scraped dataset for easy navigation and replication.
- **Scalable Codebase:** Designed to handle dynamic content and adapt to changes in website structures.
- **Reusable Scripts:** Scripts can be reused and customized for scraping similar data from other sources.

---

## Tools and Technologies Used
- **Libraries:** `BeautifulSoup`, `Scrapy`, `Requests`, `Pandas`
- **Languages:** Python
- **Applications:** Data analysis, visualization, and insights generation

---

## How to Use
1. Clone this repository:
```bash
git clone https://github.com/drisskhattabi6/Data-Scraping-Tasks.git
```
2. Navigate to the desired project folder.
3. Follow the instructions in the project's README file to run the scraping script.

---
If you have any questions or suggestions, feel free to contact me.
Happy Scraping! 🚀