https://github.com/pahul0303/llassist

A tool for processing and analyzing research articles using NLP and Large Language Models (LLMs).
csharp dotnet literature-review llm nlp

Last synced: 3 months ago

Host: GitHub
URL: https://github.com/pahul0303/llassist
Owner: pahul0303
License: other
Created: 2025-07-03T20:55:26.000Z (3 months ago)
Default Branch: main
Last Pushed: 2025-07-04T14:25:35.000Z (3 months ago)
Last Synced: 2025-07-04T15:58:33.814Z (3 months ago)
Topics: csharp, dotnet, literature-review, llm, nlp
Language: C#
Homepage:
Size: 18.6 MB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# LLAssist

LLAssist is a tool for processing and analyzing research articles using Natural Language Processing (NLP) techniques and Large Language Models (LLMs).

Note:
- The paper [LLAssist: Simple Tools for Automating Literature Review Using Large Language Models](https://doi.org/10.48550/arXiv.2407.13993) is based on [commits prior to `07caad7`](https://github.com/cyharyanto/llassist/tree/07caad7d954f9e64933ffa5aa34d0b745006feea), specifically [commit `3bf51a6`](https://github.com/cyharyanto/llassist/tree/3bf51a695b945e07c77eaa0a323c9aa3e57372bd).

## Features

- Read articles from CSV files
- Extract key semantics (topics, entities, keywords) from article titles and abstracts
- Estimate relevance of articles to research questions
- Generate embeddings for keywords
- Output results in both JSON and CSV formats

## Components

### Program.cs

The main entry point of the application. It orchestrates the process of:
1. Reading articles from a CSV file
2. Processing each article to extract semantics and estimate relevance
3. Writing results incrementally to a CSV file
4. Generating a final JSON output
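
The four steps above can be sketched roughly as follows. This is a minimal illustration and not the actual `Program.cs`: the `EstimateRelevance` stub, the `RunPipeline` helper, and the `Title`/`Relevance` columns are hypothetical stand-ins, and the real tool calls an LLM where the stub does a simple substring check.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text.Json;

// Hypothetical placeholder for the LLM-backed relevance step (assumption, not the real logic).
static double EstimateRelevance(string title, string question) =>
    title.Contains(question, StringComparison.OrdinalIgnoreCase) ? 1.0 : 0.0;

// Processes articles one at a time, appending each row to the CSV as soon as it is ready,
// then writes all collected results to a JSON file at the end.
static int RunPipeline(IEnumerable<string> titles, string question, string csvPath, string jsonPath)
{
    var results = new List<Dictionary<string, object>>();
    File.WriteAllText(csvPath, "Title,Relevance\n");

    foreach (var title in titles)
    {
        var score = EstimateRelevance(title, question);

        // Minimal CSV quoting: wrap the title in quotes and double any embedded quotes.
        var quotedTitle = "\"" + title.Replace("\"", "\"\"") + "\"";
        File.AppendAllText(csvPath, $"{quotedTitle},{score.ToString(CultureInfo.InvariantCulture)}\n");

        results.Add(new Dictionary<string, object> { ["Title"] = title, ["Relevance"] = score });
    }

    var options = new JsonSerializerOptions { WriteIndented = true };
    File.WriteAllText(jsonPath, JsonSerializer.Serialize(results, options));
    return results.Count;
}

// Demo run with in-memory titles instead of a real input CSV.
var demoDir = Path.GetTempPath();
var processed = RunPipeline(
    new[] { "LLMs for literature review", "Graph databases" },
    "literature review",
    Path.Combine(demoDir, "demo-result.csv"),
    Path.Combine(demoDir, "demo-result.json"));
Console.WriteLine($"Processed {processed} articles");
```

Appending to the CSV inside the loop means partial results survive if a long run is interrupted, while the JSON file is only written once every article has been processed.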

### Services

#### NLPService

Handles Natural Language Processing tasks:
- Extracting key semantics from text
- Estimating relevance of content to research questions
- Generating embeddings for keywords

#### LLMService

Manages connections to various Large Language Models:
- Ollama Gemma 2 (local)
- GPT-3.5 Turbo (OpenAI)
- GPT-4 (OpenAI)
- Text Embedding model (OpenAI)

#### ArticleService

Handles file I/O operations:
- Reading articles from CSV files
- Writing articles to JSON files
- Writing results to CSV files

## Usage

### Console Mode

```
dotnet run --project llassist.AppConsole
```

The command takes two arguments, passed after the project name:
- the path to the CSV file containing the articles
- the path to a text file containing the research questions (one per line)
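
As an illustration of the input format, the sketch below parses a research-questions file with one question per line. The helper names `ParseQuestions` and `ParseArgs` are hypothetical, and both the argument order (CSV path first, questions file second) and the skipping of blank lines are assumptions rather than documented behavior.

```csharp
using System;
using System.Linq;

// Splits the file content into questions: one per line, trimmed, blank lines skipped (assumption).
static string[] ParseQuestions(string text) =>
    text.Split('\n')
        .Select(line => line.Trim())
        .Where(line => line.Length > 0)
        .ToArray();

// Hypothetical argument handling: assumes the CSV path comes first, then the questions file.
static (string CsvPath, string QuestionsPath) ParseArgs(string[] arguments)
{
    if (arguments.Length != 2)
        throw new ArgumentException(
            "Expected two arguments: the articles CSV path and the research questions file path.");
    return (arguments[0], arguments[1]);
}

var demoQuestions = ParseQuestions(
    "How are LLMs used in literature reviews?\n\nWhich tasks are automated?\n");
Console.WriteLine(demoQuestions.Length); // prints 2

var demoArgs = ParseArgs(new[] { "articles.csv", "questions.txt" });
Console.WriteLine($"{demoArgs.CsvPath} + {demoArgs.QuestionsPath}");
```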

### Web Application

Start the services with Docker Compose from the repository root:
```
docker-compose up -d
```

Apply the database migrations from the ApiService directory:
```
dotnet ef database update
```

## Output

The program generates two output files:
1. A JSON file (name ending in `-result.json`) containing all processed articles with their semantics and relevance scores
2. A CSV file (name ending in `-result.csv`) with the same information in tabular format
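
One plausible way to derive the two output paths is sketched below. It assumes the part before `-result` is the input file name without its extension and that the outputs sit next to the input file. Neither detail is stated above, and the `ResultPaths` helper is hypothetical.

```csharp
using System;
using System.IO;

// Builds the two result paths from the input CSV path (naming scheme is an assumption).
static (string JsonPath, string CsvPath) ResultPaths(string inputCsvPath)
{
    var directory = Path.GetDirectoryName(inputCsvPath) ?? "";
    var stem = Path.GetFileNameWithoutExtension(inputCsvPath);
    return (
        Path.Combine(directory, stem + "-result.json"),
        Path.Combine(directory, stem + "-result.csv"));
}

var demoPaths = ResultPaths("articles.csv");
Console.WriteLine(demoPaths.JsonPath); // prints articles-result.json
Console.WriteLine(demoPaths.CsvPath);  // prints articles-result.csv
```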

## Dependencies

- Microsoft.SemanticKernel
- CsvHelper
- Microsoft.Extensions.Logging

## Notes

- The program uses a local Ollama instance for the Gemma 2 model. Ensure it's running on `http://localhost:11434` before executing the program.
- An OpenAI API key is required for the GPT models and embeddings. Set it in the `LLMService` constructor.
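
To confirm that the local Ollama instance is reachable before starting a run, a quick check along these lines may help. It is a sketch, not part of the tool: the `IsOllamaRunningAsync` helper and the 3-second timeout are assumptions, and it relies on Ollama's `GET /api/tags` endpoint, which lists the locally available models.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Returns true if an Ollama server answers at the given base URL (hypothetical helper).
static async Task<bool> IsOllamaRunningAsync(string baseUrl = "http://localhost:11434")
{
    using var client = new HttpClient { Timeout = TimeSpan.FromSeconds(3) };
    try
    {
        // GET /api/tags lists the models available in the local Ollama instance.
        using var response = await client.GetAsync(new Uri(new Uri(baseUrl), "/api/tags"));
        return response.IsSuccessStatusCode;
    }
    catch (HttpRequestException)
    {
        return false; // connection refused or other transport failure
    }
    catch (TaskCanceledException)
    {
        return false; // request timed out
    }
}

Console.WriteLine(await IsOllamaRunningAsync()
    ? "Ollama is reachable"
    : "Ollama is not reachable at http://localhost:11434");
```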

## Disclaimer

This tool is for research purposes.
