https://github.com/abdoomohamedd/beautifulsoup-web-scraping-projects

A collection of web scraping projects using BeautifulSoup, requests, and CSV modules to extract and analyze data from various websites.

beautifulsoup beautifulsoup4 csv requests web-scraper web-scraping


# BeautifulSoup-Web-Scraping-Projects

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

This repository contains projects demonstrating responsible web scraping using BeautifulSoup, requests, and CSV modules. These projects aim to extract, process, and save data from various websites in a structured manner.

## Table of Contents

- [Introduction](#introduction)
- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Projects](#projects)
- [Contributing](#contributing)
- [License](#license)

## Introduction

Web scraping is a technique used to extract information from websites. This repository showcases multiple projects where BeautifulSoup, requests, and CSV modules are used to perform web scraping tasks in a responsible manner. The goal is to collect data, process it, and store it in CSV files for further analysis.
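One common ingredient of responsible scraping is checking a site's `robots.txt` before requesting pages. The sketch below is an illustration only, not code taken from the projects: the user agent string `example-scraper`, the sample rules, and the `example.com` URLs are all assumptions, and the rules are parsed from an inline string so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a robots.txt parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, urls, user_agent="example-scraper"):
    """Keep only the URLs that the rules allow this user agent to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical rules, used here in place of a real site's robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

parser = build_parser(SAMPLE_ROBOTS)
urls = [
    "https://example.com/page/1",
    "https://example.com/private/data",
]
print(allowed_urls(parser, urls))
# Seconds to wait between requests, if the site declares a crawl delay.
print(parser.crawl_delay("example-scraper"))
```

Against a live site, `RobotFileParser.set_url()` followed by `read()` would load the real rules instead of the inline sample.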

## Features

- Extract data from multiple websites.
- Use BeautifulSoup for parsing HTML and XML documents.
- Utilize requests for making HTTP requests to fetch web pages.
- Save scraped data into CSV files for easy analysis and storage.
- Ensure responsible and ethical web scraping practices.
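The features above can be sketched as a small fetch, parse, and save pipeline. This is a minimal illustration under stated assumptions rather than the code used in any of the projects: the function names, the `li.item`, `.name`, and `.price` selectors, and the sample HTML are hypothetical, and the parsing step runs on an inline string so no request is actually sent.

```python
import csv
import io

import requests
from bs4 import BeautifulSoup


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its HTML, raising on HTTP error codes."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def parse_items(html: str) -> list:
    """Extract name/price pairs from the (assumed) li.item markup."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("li.item"):
        name = item.select_one(".name")
        price = item.select_one(".price")
        if name is None or price is None:
            continue  # skip entries that lack an expected field
        rows.append({
            "name": name.get_text(strip=True),
            "price": price.get_text(strip=True),
        })
    return rows


def write_csv(rows, fieldnames, out) -> None:
    """Write the rows, with a header line, to an open text file object."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical markup standing in for a downloaded page.
SAMPLE_HTML = """
<ul>
  <li class="item"><span class="name">Laptop A</span><span class="price">100</span></li>
  <li class="item"><span class="name">Laptop B</span><span class="price">250</span></li>
  <li class="item"><span class="name">No price listed</span></li>
</ul>
"""

rows = parse_items(SAMPLE_HTML)
buffer = io.StringIO()
write_csv(rows, ["name", "price"], buffer)
print(buffer.getvalue())
```

For a real page, `parse_items(fetch_html(url))` would replace the inline sample, and the output would go to a file opened with `open("output.csv", "w", newline="", encoding="utf-8")`.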

## Installation

To get started with these projects, clone the repository and install the required dependencies. The `csv` module is part of the Python standard library, so only `beautifulsoup4` and `requests` need to be installed.

```bash
git clone https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects.git
cd BeautifulSoup-Web-Scraping-Projects
pip install beautifulsoup4 requests
```

## Usage

Each project is contained in its own directory. You can navigate to a specific project and run the scraping script as follows:

```bash
cd project-directory
python scrape.py
```

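Replace `project-directory` with the actual directory name of the project you want to run, quoting it if it contains spaces (for example, `cd "Web Scraping Quotes"`), and replace `scrape.py` with the name of that project's script.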

## Project Structure

- `Web Scraping Alibaba Products/` - Scrapes product information from Alibaba. Collects product names, prices, and availability. Stores the data in a CSV file for easy access.
- `Web Scraping FIFA 2024 all Players/` - Scrapes player data from the FIFA 2024 database. Gathers player names, positions, and statistics. Saves the data in a CSV file for analysis.
- `Web Scraping Jumia Laptop Prices/` - Scrapes laptop prices from Jumia. Extracts laptop names, prices, and specifications. Outputs the data to a CSV file for comparison.
- `Web Scraping Quotes/` - Scrapes quotes from a quotes website. Collects quotes, authors, and tags. Saves the data in a CSV file for reference.
- `Web Scraping WUZZUF Jobs/` - Scrapes job listings from WUZZUF. Gathers job titles, companies, locations, and posting dates. Stores the data in a CSV file for job market analysis.

Each project directory typically contains the following files:

- `file.py` - Main script to perform web scraping.
- `output.csv` - Output file containing the scraped data.
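Once a project has produced its CSV file, the data can be loaded back for analysis with the standard `csv` module. The sketch below assumes hypothetical column names (`title`, `company`, `location`) and uses an inline sample in place of a real `output.csv`; the actual columns depend on each project.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column: str) -> Counter:
    """Count how often each value appears in one column of a CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# Hypothetical data standing in for a scraped output file.
SAMPLE_CSV = (
    "title,company,location\n"
    "Data Analyst,Acme,Cairo\n"
    "Backend Developer,Globex,Giza\n"
    "QA Engineer,Acme,Cairo\n"
)

counts = count_by_column(io.StringIO(SAMPLE_CSV), "location")
print(counts.most_common())
```

To read a real file, pass `open("output.csv", newline="", encoding="utf-8")` instead of the `io.StringIO` object.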

## Projects

### Project 1: Web Scraping Alibaba Products

Scrapes product information from Alibaba. Collects product names, prices, and availability. Stores the data in a CSV file for easy access.

[Project 1 - Web Scraping Alibaba Products](https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects/tree/main/Web%20Scraping%20Alibaba%20Products)

### Project 2: Web Scraping FIFA 2024 all Players

Scrapes player data from the FIFA 2024 database. Gathers player names, positions, and statistics. Saves the data in a CSV file for analysis.

[Project 2 - Web Scraping FIFA 2024 all Players](https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects/tree/main/Web%20Scraping%20FIFA%202024%20all%20Players)

### Project 3: Web Scraping Jumia Laptop Prices

Scrapes laptop prices from Jumia. Extracts laptop names, prices, and specifications. Outputs the data to a CSV file for comparison.

[Project 3 - Web Scraping Jumia Laptop Prices](https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects/tree/main/Web%20Scraping%20Jumia%20Laptop%20Prices)
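Comparing prices usually requires turning scraped price strings into numbers first. The following sketch shows one way to do that; it is not taken from the project, and it assumes prices use a comma as the thousands separator and a dot as the decimal point. The `parse_price` helper and the sample listings are hypothetical.

```python
import re
from decimal import Decimal
from typing import Optional


def parse_price(text: str) -> Optional[Decimal]:
    """Pull the first number out of a price string, or return None."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Assumption: commas are thousands separators, so they can be dropped.
    return Decimal(match.group().replace(",", ""))


# Hypothetical scraped rows: (laptop name, raw price text).
listings = [
    ("Laptop A", "EGP 12,999.00"),
    ("Laptop B", "EGP 8,450"),
    ("Laptop C", "Out of stock"),
]

# Keep only rows with a parsable price, then sort from cheapest to priciest.
priced = [(name, parse_price(raw)) for name, raw in listings]
priced = [(name, price) for name, price in priced if price is not None]
priced.sort(key=lambda pair: pair[1])
print(priced)
```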

### Project 4: Web Scraping Quotes

Scrapes quotes from a quotes website. Collects quotes, authors, and tags. Saves the data in a CSV file for reference.

[Project 4 - Web Scraping Quotes](https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects/tree/main/Web%20Scraping%20Quotes)
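As a rough illustration of collecting quotes, authors, and tags, the sketch below parses a small HTML sample with BeautifulSoup. The README does not name the quotes website, so the markup here (`div.quote`, `span.text`, `small.author`, `a.tag`) is an assumption, as are the function name and the choice to join tags with `|` so they fit in a single CSV column.

```python
from bs4 import BeautifulSoup


def parse_quotes(html: str) -> list:
    """Return one dict per quote block with its text, author, and tags."""
    soup = BeautifulSoup(html, "html.parser")
    quotes = []
    for block in soup.select("div.quote"):
        text = block.select_one("span.text")
        author = block.select_one("small.author")
        if text is None or author is None:
            continue  # skip blocks that do not match the assumed layout
        tags = [tag.get_text(strip=True) for tag in block.select("a.tag")]
        quotes.append({
            "quote": text.get_text(strip=True),
            "author": author.get_text(strip=True),
            "tags": "|".join(tags),  # several tags flattened into one field
        })
    return quotes


# Hypothetical markup standing in for a downloaded page.
SAMPLE_HTML = """
<div class="quote">
  <span class="text">Simplicity is the soul of efficiency.</span>
  <small class="author">Austin Freeman</small>
  <a class="tag">simplicity</a>
  <a class="tag">efficiency</a>
</div>
<div class="quote">
  <span class="text">A quote with no author element.</span>
</div>
"""

quotes = parse_quotes(SAMPLE_HTML)
print(quotes)
```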

### Project 5: Web Scraping WUZZUF Jobs

Scrapes job listings from WUZZUF. Gathers job titles, companies, locations, and posting dates. Stores the data in a CSV file for job market analysis.

[Project 5 - Web Scraping WUZZUF Jobs](https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects/tree/main/Web%20Scraping%20WUZZUF%20Jobs)

## Contributing

Contributions are welcome! If you have any suggestions or improvements, please open an issue or create a pull request. For any inquiries, you can reach me at [email protected].