https://github.com/abhimanyu-gaurav/web-scraper-api
A lightweight API built with FastAPI for scraping static web content. Easily extract various HTML elements such as paragraphs, titles, links, and more from websites using BeautifulSoup.
- Host: GitHub
- URL: https://github.com/abhimanyu-gaurav/web-scraper-api
- Owner: Abhimanyu-Gaurav
- License: MIT
- Created: 2024-09-17T10:08:02.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-09-25T10:11:01.000Z (over 1 year ago)
- Last Synced: 2025-01-20T05:09:46.247Z (over 1 year ago)
- Topics: beautifulsoup4, fastapi, python, webscraping
- Language: Python
- Homepage: https://github.com/Abhimanyu-Gaurav/Web-Scraper-API
- Size: 9.77 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: License
README
# Web Scraper API
## Technologies Used
- **Python**: Core programming language for backend logic.
- **FastAPI**: For building a RESTful web API.
- **BeautifulSoup**: For parsing and extracting data from static HTML.
- **Requests**: For making HTTP requests to fetch web pages.
- **Swagger UI** and **Postman**: For testing and validating API requests and responses.
---
## Table of Contents
1. [Project Description](#project-description)
2. [Key Features](#key-features)
3. [Types of Websites We Can Scrape](#types-of-websites-we-can-scrape)
4. [Installation](#installation)
5. [How to Use](#how-to-use)
6. [License](#license)
---
## Project Description
The **Web Scraper API** is a lightweight tool built with **FastAPI** for scraping web content. It extracts HTML elements such as paragraphs, titles, and links from **static web pages** by fetching each page over HTTP and parsing the returned HTML with **BeautifulSoup**.
It is best suited to websites whose content is directly available in the HTML source, such as articles, product descriptions, and other static content.
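
The fetch-and-parse flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `extract_elements`, `scrape`, and `SAMPLE_HTML`, as well as the 10-second timeout, are assumptions chosen for the example.

```python
import requests
from bs4 import BeautifulSoup


def extract_elements(html: str, tag: str) -> list[str]:
    """Return the stripped text of every <tag> element in the given HTML.

    Hypothetical helper for illustration; not taken from the repository.
    """
    soup = BeautifulSoup(html, "html.parser")
    texts = (element.get_text(strip=True) for element in soup.find_all(tag))
    # Skip elements that contain no text at all.
    return [text for text in texts if text]


def scrape(url: str, tag: str = "p", timeout: float = 10.0) -> list[str]:
    """Fetch a static page over HTTP and extract the text of its <tag> elements.

    Hypothetical helper; the timeout value is an assumption.
    """
    response = requests.get(url, timeout=timeout)
    # Raise on 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return extract_elements(response.text, tag)


# A small static document used to try the parser without any network access.
SAMPLE_HTML = """
<html>
  <head><title>Example Page</title></head>
  <body>
    <p>First paragraph.</p>
    <p>  Second paragraph. </p>
    <p></p>
    <a href="https://example.com">A link</a>
  </body>
</html>
"""

if __name__ == "__main__":
    print(extract_elements(SAMPLE_HTML, "p"))
    print(extract_elements(SAMPLE_HTML, "title"))
```

Keeping the parsing step separate from the HTTP request means it can be exercised on a plain HTML string, as the sample at the bottom does. Because only the HTML returned by the server is parsed, content that a page injects later with JavaScript would not be visible to this approach.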
---
## Key Features
- **Web Scraping for Static Content:** Scrapes content from static HTML elements like paragraphs (`