Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/abdoomohamedd/beautifulsoup-web-scraping-projects
A collection of web scraping projects using BeautifulSoup, requests, and CSV modules to extract and analyze data from various websites.
- Host: GitHub
- URL: https://github.com/abdoomohamedd/beautifulsoup-web-scraping-projects
- Owner: AbdooMohamedd
- Created: 2024-07-11T21:20:15.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-09-03T22:48:54.000Z (4 months ago)
- Last Synced: 2024-11-06T17:54:59.194Z (2 months ago)
- Topics: beautifulsoup, beautifulsoup4, csv, requests, web-scraper, web-scraping
- Language: Jupyter Notebook
- Homepage:
- Size: 4.68 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# BeautifulSoup-Web-Scraping-Projects
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
This repository contains projects demonstrating responsible web scraping with BeautifulSoup, requests, and Python's built-in `csv` module. Each project extracts, processes, and saves data from a website in a structured form.
## Table of Contents
- [Introduction](#introduction)
- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Projects](#projects)
- [Contributing](#contributing)
- [License](#license)
## Introduction
Web scraping is a technique used to extract information from websites. This repository showcases multiple projects where BeautifulSoup, requests, and CSV modules are used to perform web scraping tasks in a responsible manner. The goal is to collect data, process it, and store it in CSV files for further analysis.
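The fetch, parse, and store flow described above can be sketched roughly as follows. This is a minimal illustration, not code taken from the projects: the helper names (`fetch_html`, `parse_titles`, `write_csv`), the `h2.title` selector, and the sample HTML are all assumptions made for the example.

```python
import csv
import io

import requests
from bs4 import BeautifulSoup


def fetch_html(url: str) -> str:
    """Download a page and return its HTML (hypothetical helper)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx responses
    return response.text


def parse_titles(html: str) -> list[str]:
    """Return the text of every <h2 class="title"> element (assumed markup)."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.select("h2.title")]


def write_csv(titles: list[str], handle) -> None:
    """Write a header row followed by one title per row."""
    writer = csv.writer(handle)
    writer.writerow(["title"])
    writer.writerows([title] for title in titles)


# Sample markup standing in for a downloaded page, so the sketch runs offline.
SAMPLE_HTML = """
<html><body>
  <h2 class="title"> First item </h2>
  <h2 class="title">Second item</h2>
  <h2>Not a title</h2>
</body></html>
"""

titles = parse_titles(SAMPLE_HTML)
buffer = io.StringIO()
write_csv(titles, buffer)
print(buffer.getvalue())
```

For a real page, the HTML would come from `fetch_html(url)` instead of `SAMPLE_HTML`, and the in-memory buffer would be replaced by a file opened with `newline=""`, as the `csv` module documentation recommends.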
## Features
- Extract data from multiple websites.
- Use BeautifulSoup for parsing HTML and XML documents.
- Utilize requests for making HTTP requests to fetch web pages.
- Save scraped data into CSV files for easy analysis and storage.
- Ensure responsible and ethical web scraping practices.
## Installation
To get started with these projects, clone the repository and install the required dependencies.
```bash
git clone https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects.git
cd BeautifulSoup-Web-Scraping-Projects
pip install beautifulsoup4 requests  # csv ships with the Python standard library
```
## Usage
Each project is contained in its own directory. You can navigate to a specific project and run the scraping script as follows:
```bash
cd project-directory
python scrape.py
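```
Replace `project-directory` with the actual directory name of the project you want to run, and `scrape.py` with that project's main script. The directory names contain spaces, so quote them, for example `cd "Web Scraping Quotes"`.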
## Project Structure
- `Web Scraping Alibaba Products/` - Scrapes product information from Alibaba. Collects product names, prices, and availability. Stores the data in a CSV file for easy access.
- `Web Scraping FIFA 2024 all Players/` - Scrapes player data from the FIFA 2024 database. Gathers player names, positions, and statistics. Saves the data in a CSV file for analysis.
- `Web Scraping Jumia Laptop Prices/` - Scrapes laptop prices from Jumia. Extracts laptop names, prices, and specifications. Outputs the data to a CSV file for comparison.
- `Web Scraping Quotes/` - Scrapes quotes from a quotes website. Collects quotes, authors, and tags. Saves the data in a CSV file for reference.
- `Web Scraping WUZZUF Jobs/` - Scrapes job listings from WUZZUF. Gathers job titles, companies, locations, and posting dates. Stores the data in a CSV file for job market analysis.
Each project directory typically contains the following files:
- `file.py` - Main script to perform web scraping.
- `output.csv` - Output file containing the scraped data.
## Projects
### Project 1: Web Scraping Alibaba Products
Scrapes product information from Alibaba. Collects product names, prices, and availability. Stores the data in a CSV file for easy access.
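Before scraping a large commercial site, one responsible step is to check what its `robots.txt` allows. The sketch below uses only the standard library's `urllib.robotparser`. The rules, the `example-scraper` user agent, and the `example.com` URLs are invented for illustration and say nothing about Alibaba's actual policy or this project's code.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content, parsed from a string so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

USER_AGENT = "example-scraper"  # hypothetical user agent name

# can_fetch() reports whether the rules permit this agent to request the URL.
allowed = parser.can_fetch(USER_AGENT, "https://example.com/products/laptops")
blocked = parser.can_fetch(USER_AGENT, "https://example.com/private/data")

# crawl_delay() returns the requested pause between requests, or None if unset.
delay = parser.crawl_delay(USER_AGENT)

print(allowed, blocked, delay)
```

Against a live site, `parser.set_url(...)` followed by `parser.read()` would load the real file instead of the inline string.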
[Project 1 - Web Scraping Alibaba Products](https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects/tree/main/Web%20Scraping%20Alibaba%20Products)
### Project 2: Web Scraping FIFA 2024 all Players
Scrapes player data from the FIFA 2024 database. Gathers player names, positions, and statistics. Saves the data in a CSV file for analysis.
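Collecting every player usually means walking through several result pages. The sketch below shows one way to do that politely, assuming the data is paginated and that an empty page marks the end. `scrape_all_pages`, `fake_fetch_page`, and the sample rows are hypothetical; in practice the fetch function would download and parse one page.

```python
import time
from typing import Callable, Dict, List


def scrape_all_pages(
    fetch_page: Callable[[int], List[dict]],
    delay_seconds: float = 1.0,
    max_pages: int = 1000,
) -> List[dict]:
    """Collect rows page by page until a page comes back empty."""
    rows: List[dict] = []
    for page in range(1, max_pages + 1):
        page_rows = fetch_page(page)
        if not page_rows:
            break  # assumed end-of-data signal
        rows.extend(page_rows)
        time.sleep(delay_seconds)  # pause between requests to limit server load
    return rows


# Stand-in for real page downloads so the sketch runs offline.
FAKE_PAGES: Dict[int, List[dict]] = {
    1: [{"name": "Player A"}, {"name": "Player B"}],
    2: [{"name": "Player C"}],
}


def fake_fetch_page(page: int) -> List[dict]:
    return FAKE_PAGES.get(page, [])


players = scrape_all_pages(fake_fetch_page, delay_seconds=0)
print(len(players))
```

The `max_pages` cap is a safety net so that a site which never returns an empty page cannot keep the loop running indefinitely.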
[Project 2 - Web Scraping FIFA 2024 all Players](https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects/tree/main/Web%20Scraping%20FIFA%202024%20all%20Players)
### Project 3: Web Scraping Jumia Laptop Prices
Scrapes laptop prices from Jumia. Extracts laptop names, prices, and specifications. Outputs the data to a CSV file for comparison.
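Comparing prices requires numbers rather than display strings. The sketch below shows one possible cleanup step. It assumes prices appear as text with a currency label, commas as thousands separators, and a dot as the decimal mark; the `parse_price` helper and the sample strings are illustrative and not taken from the project.

```python
import re
from decimal import Decimal
from typing import Optional


def parse_price(text: str) -> Optional[Decimal]:
    """Return the first number in a price string, or None if there is none."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Drop thousands separators before converting to an exact decimal value.
    return Decimal(match.group().replace(",", ""))


raw_prices = ["EGP 25,999.00", "EGP 1,299", "Price unavailable"]
parsed = [parse_price(price) for price in raw_prices]
print(parsed)
```

`Decimal` avoids the rounding surprises that binary floats can introduce when prices are later summed or compared.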
[Project 3 - Web Scraping Jumia Laptop Prices](https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects/tree/main/Web%20Scraping%20Jumia%20Laptop%20Prices)
### Project 4: Web Scraping Quotes
Scrapes quotes from a quotes website. Collects quotes, authors, and tags. Saves the data in a CSV file for reference.
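Extracting the quote, author, and tags from each entry might look like the sketch below. The README does not name the site, so the markup here (`div.quote`, `span.text`, `small.author`, `a.tag`) is an assumed structure, and `extract_quotes` and the placeholder quotes are hypothetical.

```python
from bs4 import BeautifulSoup

# Placeholder markup with an assumed structure; not copied from any real site.
SAMPLE_HTML = """
<div class="quote">
  <span class="text">First sample quote.</span>
  <small class="author">Author A</small>
  <a class="tag" href="#">sample</a>
  <a class="tag" href="#">demo</a>
</div>
<div class="quote">
  <span class="text">Second sample quote.</span>
  <small class="author">Author B</small>
</div>
"""


def extract_quotes(html: str) -> list[dict]:
    """Return one dict per quote block with its text, author, and tags."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.select("div.quote"):
        text = block.select_one("span.text")
        author = block.select_one("small.author")
        tags = [tag.get_text(strip=True) for tag in block.select("a.tag")]
        rows.append({
            # Fall back to an empty string when an element is missing.
            "quote": text.get_text(strip=True) if text else "",
            "author": author.get_text(strip=True) if author else "",
            # Join tags so they fit in a single CSV cell.
            "tags": ";".join(tags),
        })
    return rows


quotes = extract_quotes(SAMPLE_HTML)
for row in quotes:
    print(row)
```

Joining the tags with a semicolon keeps each quote on one CSV row; a separate tags file would be an alternative if the tags need to be queried individually.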
[Project 4 - Web Scraping Quotes](https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects/tree/main/Web%20Scraping%20Quotes)
### Project 5: Web Scraping WUZZUF Jobs
Scrapes job listings from WUZZUF. Gathers job titles, companies, locations, and posting dates. Stores the data in a CSV file for job market analysis.
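Job listings often omit a field such as the location or posting date. The sketch below shows one way to store such records with `csv.DictWriter`, whose `restval` argument fills in missing columns. The field names and sample records are invented for illustration and do not come from WUZZUF or the project.

```python
import csv
import io

FIELDS = ["title", "company", "location", "posted"]

# Invented records; the second one lacks a location and a posting date.
jobs = [
    {"title": "Data Analyst", "company": "Example Co",
     "location": "Cairo", "posted": "2 days ago"},
    {"title": "Backend Developer", "company": "Sample Ltd"},
]

buffer = io.StringIO()
# restval supplies the value written for any field missing from a record.
writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="N/A")
writer.writeheader()
writer.writerows(jobs)

print(buffer.getvalue())
```

To write to disk, the buffer would be replaced by a file opened with `newline=""` and an explicit encoding such as `utf-8`.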
[Project 5 - Web Scraping WUZZUF Jobs](https://github.com/AbdooMohamedd/BeautifulSoup-Web-Scraping-Projects/tree/main/Web%20Scraping%20WUZZUF%20Jobs)
## Contributing
Contributions are welcome! If you have any suggestions or improvements, please open an issue or create a pull request. For any inquiries, you can reach me at [email protected].