https://github.com/ansh420/fastapi
API endpoints for extracting text from URLs and PDFs and finding similar text using cosine similarity.
- Host: GitHub
- URL: https://github.com/ansh420/fastapi
- Owner: Ansh420
- Created: 2024-09-03T10:22:05.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-15T05:16:52.000Z (12 months ago)
- Last Synced: 2025-04-05T19:19:02.660Z (6 months ago)
- Topics: docker-container, fastapi, ma, pdfs, python3, urls
- Language: Python
- Homepage:
- Size: 390 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README

# FastAPI
# **Demonstration**
https://www.loom.com/share/cd1b617cc86b46838e469e908a075ae3?sid=9c269a9a-f0d2-423e-b87f-61682c9553b1
# **Scraping content from a given URL**
This part of the project scrapes content from a given URL and stores it in a database using SQLAlchemy, within a FastAPI application. It incorporates the following key functionalities:

## URL Scraping:
- Extracts content from a specified URL using the requests library.
- Cleans the extracted content by removing scripts and styles (if desired).

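
The cleaning step above can be sketched as follows. This is a minimal illustration, not the project's actual code: it uses the standard library's `html.parser` in place of BeautifulSoup so that it runs without extra packages, and it leaves out the download step that requests would handle. The names `clean_html` and `_TextExtractor` are hypothetical.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and ignores anything inside <script> or <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def clean_html(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p {color: red}</style>"
    "<script>var x = 1;</script></head>"
    "<body><h1>Title</h1><p>Hello world</p></body></html>"
)
print(clean_html(sample))  # Title Hello world
```
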
## SQLAlchemy Integration:
- Creates a database engine and defines a SQLAlchemy model to represent the scraped data.
- Creates a database table based on the model.
- Uses a database session to add and commit scraped content to the database.

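
A minimal sketch of these three steps, assuming an in-memory SQLite database and SQLAlchemy 1.4 or newer. The model name `ScrapedContent`, its columns, and the helpers `save_content` and `load_content` are assumptions for illustration; the project's real schema may differ.

```python
from sqlalchemy import Column, Integer, String, Text, create_engine
from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class ScrapedContent(Base):
    """Hypothetical model: one row per scraped URL."""

    __tablename__ = "scraped_content"  # assumed table name

    id = Column(Integer, primary_key=True)
    chat_id = Column(String(64), unique=True, nullable=False)
    url = Column(String(2048), nullable=False)
    content = Column(Text, nullable=False)


# In-memory SQLite keeps the sketch self-contained; a real deployment
# would point the engine at a persistent database URL.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)  # create the table from the model
SessionLocal = sessionmaker(bind=engine)


def save_content(chat_id, url, content):
    """Add one scraped document and commit; roll back on database errors."""
    session = SessionLocal()
    try:
        session.add(ScrapedContent(chat_id=chat_id, url=url, content=content))
        session.commit()
    except SQLAlchemyError:
        session.rollback()
        raise
    finally:
        session.close()


def load_content(chat_id):
    """Return the stored text for a chat ID, or None if nothing is stored."""
    session = SessionLocal()
    try:
        row = session.query(ScrapedContent).filter_by(chat_id=chat_id).first()
        return row.content if row else None
    finally:
        session.close()
```
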
## FastAPI Endpoint:
- Exposes a POST endpoint (/process_url) that accepts a URL as input.
- Processes the URL, scrapes the content, and stores it in the database.
- Returns a response with a unique chat ID and a success message.

## Error Handling:
- Implements basic error handling for HTTP requests and database operations.

# **Extract text from uploaded PDF**

## Q&A Chatbot with Document Upload
- This is a FastAPI application for a simple Q&A chatbot that leverages uploaded documents.

## Features:
- Supports uploading documents through URLs or PDFs.
- Extracts text content from uploaded documents.
- Processes user chat requests with a specific chat ID.
- Finds the most relevant section within the uploaded document based on the user's question using cosine similarity (note: this currently uses a dummy embedding function, which needs to be replaced).
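
To make the cosine-similarity step concrete, here is a small sketch. The `embed` function below is a bag-of-words stand-in chosen only so the example runs with the standard library; it is not the project's embedding function, and `most_relevant_section` is a hypothetical helper name.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: lowercase word counts (an assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def most_relevant_section(question, sections):
    """Return the section whose embedding is closest to the question."""
    if not sections:
        return None
    question_vec = embed(question)
    return max(sections, key=lambda s: cosine_similarity(question_vec, embed(s)))


sections = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Bananas are rich in potassium.",
]
print(most_relevant_section("What is the capital of France?", sections))
# Paris is the capital of France.
```

Word counts only match exact terms, so a real deployment would likely swap `embed` for a proper sentence-embedding model while keeping the same similarity and selection logic.
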

# **FastAPI Chatbot with Document Upload**
This project implements a simple chatbot that retrieves relevant information from uploaded documents.

## Features:
- Supports uploading documents through URLs or PDFs.
- Extracts text content from uploaded documents.
- Processes user questions and identifies the most relevant section based on cosine similarity.

## Technologies:
- FastAPI: Web framework for building APIs
- Pydantic: Data validation and serialization
- requests: Making HTTP requests
- BeautifulSoup: Parsing HTML documents
- pdfminer: Extracting text from PDFs

## Usage:
### Run the application:
$ uvicorn main:app --host 127.0.0.1 --port 8000
### Upload documents:
#### Upload URL:
$ curl -X POST http://localhost:8000/upload_url/ -F chat_id=user1 -F url=https://www.example.com/article.html

#### Upload PDF: (using tools like curl or Postman)
- Set chat_id in the form data.
- Send a multipart request with the PDF file as file.

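
As an illustration of the multipart request described above, the sketch below builds one with the Python standard library. The form field names `chat_id` and `file` come from the steps above; the endpoint path `/upload_pdf/` is an assumption made by analogy with `/upload_url/`, and `build_pdf_upload` is a hypothetical helper.

```python
import urllib.request
import uuid


def build_pdf_upload(url, chat_id, filename, pdf_bytes):
    """Build a multipart/form-data POST request carrying chat_id and a PDF."""
    boundary = uuid.uuid4().hex  # random separator between form parts
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="chat_id"\r\n\r\n',
        chat_id.encode() + b"\r\n",
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes + b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# The path below is assumed; check the application's routes for the real one.
request = build_pdf_upload(
    "http://localhost:8000/upload_pdf/", "user1", "document.pdf", b"%PDF-1.4"
)
# With the server running, send it with: urllib.request.urlopen(request)
```
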
### Chat interaction:
$ curl -X POST http://localhost:8000/chat/ -H 'Content-Type: application/json' -d '{"chat_id": "user1", "question": "What is the capital of France?"}'
### Return
This returns a JSON response containing the section of the document uploaded for user1 that is most relevant to the question.