https://github.com/jcaperella29/ai_llm_set_up
AI-powered research paper summarization using local LLMs (Ollama). Extracts, processes, and summarizes PDFs with structured insights. Ideal for scientific papers & bioinformatics
- Host: GitHub
- URL: https://github.com/jcaperella29/ai_llm_set_up
- Owner: jcaperella29
- Created: 2025-02-19T16:20:25.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-19T16:25:08.000Z (over 1 year ago)
- Last Synced: 2025-02-19T17:27:25.484Z (over 1 year ago)
- Topics: ai, llm, machine-learning, nlp, ollama, pdf-processing, python, research
- Language: Python
- Homepage:
- Size: 3.91 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Ai_LLM_set_up
# AI LLM Paper Summarizer with Ollama
## Overview
This project automates the extraction and summarization of research papers using **Local Large Language Models (LLMs)** via **Ollama**. It processes **scientific PDFs**, extracts relevant text, and generates structured summaries for each section.
## Features
- 📄 **Extracts text from scientific PDFs**
- 🤖 **Summarizes research papers using Ollama** (Mistral, Gemma, or LLaMA models)
- 🏗️ **Processes large documents in chunks** for better accuracy
- 🔍 **Identifies key topics in life sciences & bioinformatics**
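The chunking step listed above can be sketched roughly as follows. This is an illustration, not the code in `LLM_test.py`: the function name `chunk_text` and the choice to break at whitespace are assumptions, and the default of 3000 characters simply mirrors the chunk size mentioned under Optimization Options.

```python
def chunk_text(text: str, chunk_size: int = 3000) -> list[str]:
    """Split text into pieces of at most chunk_size characters.

    Hypothetical helper: breaks at the last space inside each window
    so that words are not cut in half when a space is available.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Back up to the last space in the window, if there is one.
            split_at = text.rfind(" ", start, end)
            if split_at > start:
                end = split_at
        chunk = text[start:end].strip()
        if chunk:
            chunks.append(chunk)
        start = end
    return chunks
```

Each chunk can then be summarized on its own and the partial summaries combined, which keeps every prompt within a size the local model handles comfortably.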
## Setup Instructions
### 1️⃣ Install Dependencies
First, make sure you have Python installed. Then install the required libraries:
```sh
pip install pymupdf requests
```
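With PyMuPDF installed, text extraction might look like the sketch below. It assumes the library's `fitz.open` and `page.get_text()` calls; the helper names `extract_text` and `clean_page_text` are hypothetical and not taken from the repository's script.

```python
import re


def clean_page_text(raw: str) -> str:
    """Tidy raw PDF text (hypothetical helper)."""
    # Rejoin words split across line breaks, e.g. "bioinfor-\nmatics".
    # Caveat: this also joins genuine hyphenated compounds that happen
    # to wrap at the hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Collapse every run of whitespace, including newlines, into one space.
    return re.sub(r"\s+", " ", text).strip()


def extract_text(pdf_path: str) -> str:
    """Return the cleaned text of all pages, separated by blank lines."""
    # PyMuPDF is imported lazily so clean_page_text works without it.
    import fitz

    with fitz.open(pdf_path) as doc:
        pages = [clean_page_text(page.get_text()) for page in doc]
    return "\n\n".join(pages)
```

Calling `extract_text("paper.pdf")` (the file name is a placeholder) would return a single string ready to be split into chunks.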
### 2️⃣ Install & Run Ollama
Download and install **Ollama** from [Ollama's website](https://ollama.com).
Start the Ollama server:
```sh
ollama serve
```
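Once the server is running, it listens on `http://localhost:11434` by default. A summarization request could be sketched as below, assuming Ollama's `/api/generate` endpoint with streaming turned off. The prompt wording and the names `build_payload` and `summarize` are illustrative assumptions rather than the script's actual implementation.

```python
# Default local endpoint of the Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(section_text: str, model: str = "mistral") -> dict:
    """Build the JSON body for one non-streaming generate request."""
    prompt = (
        "Summarize the following section of a research paper "
        "in a few concise bullet points:\n\n" + section_text
    )
    # "stream": False asks Ollama to return one JSON object, not a stream.
    return {"model": model, "prompt": prompt, "stream": False}


def summarize(section_text: str, model: str = "mistral",
              timeout: float = 300.0) -> str:
    """Send one chunk to the local Ollama server and return its summary."""
    # Imported lazily so build_payload can be used without requests.
    import requests

    response = requests.post(
        OLLAMA_URL, json=build_payload(section_text, model), timeout=timeout
    )
    response.raise_for_status()
    # The generated text is in the "response" field of the reply.
    return response.json()["response"]
```

A generous timeout is used because generation on CPU can take a while for long chunks.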
Download a model before running the script. For a lighter model, pull **Gemma** or **Mistral**:
```sh
ollama pull gemma
ollama pull mistral
```
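To confirm that a pulled model is available before running the script, the server's model list can be queried. The sketch below assumes Ollama's `/api/tags` endpoint, which returns a `models` array whose entries carry a `name` such as `mistral:latest`; the function names are hypothetical.

```python
def installed_models(tags_response: dict) -> set[str]:
    """Return base model names (tag stripped) from an /api/tags reply."""
    return {
        entry["name"].split(":")[0]
        for entry in tags_response.get("models", [])
    }


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Ask the local Ollama server which models it has downloaded."""
    # Imported lazily so installed_models can be used without requests.
    import requests

    response = requests.get(f"{base_url}/api/tags", timeout=10)
    response.raise_for_status()
    return response.json()
```

For example, `"gemma" in installed_models(fetch_tags())` should be true after `ollama pull gemma` has completed.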
### 3️⃣ Run the Script
Activate the virtual environment (if using one), using the line that matches your platform:
```sh
cd path/to/project
.\venv\Scripts\Activate      # On Windows
source venv/bin/activate     # On macOS/Linux
```
Then execute the script:
```sh
python LLM_test.py
```
## Optimization Options
- Reduce **chunk size** (from 3000 to 1500 characters) for faster per-chunk responses, at the cost of sending more requests.
- Use **lighter models** like `gemma` for better speed.
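- Adjust **Ollama's thread settings** for better CPU performance. The command below uses POSIX shell syntax; this variable may not be honored by every Ollama version, in which case the per-request `num_thread` option is the alternative: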
```sh
OLLAMA_NUM_THREADS=8 ollama serve
```
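The trade-offs above can be made concrete with a small sketch. It assumes the `options` field of Ollama's generate API accepts `num_thread`, and both function names are hypothetical. Halving the chunk size doubles the number of requests, so whether the total run time improves depends on the model and hardware.

```python
import math


def estimate_requests(total_chars: int, chunk_size: int) -> int:
    """Number of chunks (and therefore requests) for a document."""
    return math.ceil(total_chars / chunk_size)


def build_payload(prompt: str, model: str = "gemma",
                  num_thread: int = 8) -> dict:
    """Request body that sets the CPU thread count for this request."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # Assumed per-request override of the number of CPU threads.
        "options": {"num_thread": num_thread},
    }


# A 60,000-character paper needs 20 requests at 3000 characters per chunk
# and 40 requests at 1500 characters per chunk.
print(estimate_requests(60_000, 3000), estimate_requests(60_000, 1500))
```

Setting `num_thread` close to the number of physical cores is a common starting point, though the best value is worth measuring on the target machine.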
## Next Steps
- 📌 **Citation & Figure Extraction** (Upcoming Feature)
- ⚡ **Parallel Processing** to speed up large document analysis
- 🌐 **Cloud Integration** for faster summaries with OpenAI API
## Contributing
Feel free to fork the repo, submit issues, or suggest improvements!
## License
MIT License - Free to use and modify!