https://github.com/mithoon278/openmind-ai-genai-project

A compassionate mental health chatbot built using Retrieval-Augmented Generation (RAG). This project leverages advanced natural language processing techniques, including SentenceTransformers, Pinecone for vector storage, and fine-tuned LLaMA 3.3, to provide thoughtful, context-aware, and empathetic responses.

Topics: chunking, genai-chatbot, groq, langchain, mental-health-awareness, python, rag, sentence-transformers, support


# **OPENMIND_AI-RAG PROJECT**

This project implements a mental health support chatbot using a Retrieval-Augmented Generation (RAG) framework. The chatbot is designed to provide compassionate, thoughtful, and context-aware responses by leveraging a pre-existing dataset and integrating advanced natural language processing techniques.

---

## **Project Overview**
The chatbot:
- Uses a CSV file of psychological contexts and responses to generate insightful answers.
- Embeds data using `SentenceTransformers` and stores it in a Pinecone vector database.
- Retrieves relevant context dynamically during conversation.
- Fine-tunes an LLM to act as a compassionate, friendly personal assistant.
- Processes queries to provide meaningful, concise, and empathetic responses.

---

## **Workflow**
1. **Data Preprocessing**:
- The data is taken from [Hugging Face](https://huggingface.co/datasets/jkhedri/psychology-dataset/viewer/default/train?p=1); the third column is dropped because its responses are irrelevant.
- The remaining data is converted to a CSV file using pandas.
- The dataset is loaded with LangChain's `CSVLoader`.
- The data is formatted into question-answer pairs.
2. **Text Chunking**:
- Use LangChain's `RecursiveCharacterTextSplitter` for chunking large texts.
3. **Embeddings**:
- Generate vector embeddings using `SentenceTransformers` (`all-MiniLM-L6-v2`).
4. **Storage**:
- Store embeddings in Pinecone for fast and scalable retrieval (an indexing sketch follows this list).
5. **Retrieval-Augmented Generation**:
- Retrieve context relevant to the user's query.
- Use LLaMA 3.3 to generate responses augmented with the retrieved context (a retrieval-and-generation sketch follows this list).
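
The following is a minimal sketch of steps 1–4 above, not the project's exact code: the file name `psychology_qa.csv`, the dropped column position, the chunking parameters, the Pinecone index name `openmind-ai`, and the serverless spec are all assumptions made for illustration.

```python
import os

from datasets import load_dataset
from langchain_community.document_loaders import CSVLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from pinecone import Pinecone, ServerlessSpec
from sentence_transformers import SentenceTransformer

# 1. Data preprocessing: pull the Hugging Face dataset, drop the third column,
#    and write a clean CSV with pandas (the column layout is an assumption).
df = load_dataset("jkhedri/psychology-dataset", split="train").to_pandas()
df = df.drop(columns=[df.columns[2]])          # third column held irrelevant responses
df.to_csv("psychology_qa.csv", index=False)

# 2. Load the CSV with LangChain and chunk it (chunk sizes are assumptions).
docs = CSVLoader(file_path="psychology_qa.csv").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# 3. Embed each chunk with all-MiniLM-L6-v2 (384-dimensional vectors).
embedder = SentenceTransformer("all-MiniLM-L6-v2")
texts = [chunk.page_content for chunk in chunks]
vectors = embedder.encode(texts, show_progress_bar=True)

# 4. Store the embeddings in Pinecone (index name and serverless spec are assumptions).
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index_name = "openmind-ai"
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=384,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
index = pc.Index(index_name)

# Upsert in one call for brevity; a real pipeline would batch these requests.
index.upsert(
    vectors=[
        {"id": f"chunk-{i}", "values": vec.tolist(), "metadata": {"text": text}}
        for i, (vec, text) in enumerate(zip(vectors, texts))
    ]
)
```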

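Step 5 could be wired together roughly as below, assuming the chatbot calls LLaMA 3.3 through Groq (the repository topics include `groq`); the model id `llama-3.3-70b-versatile`, the system prompt, `top_k`, and the index name are illustrative assumptions rather than the project's actual configuration.

```python
import os

from groq import Groq
from pinecone import Pinecone
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
index = Pinecone(api_key=os.environ["PINECONE_API_KEY"]).Index("openmind-ai")
client = Groq(api_key=os.environ["GROQ_API_KEY"])

SYSTEM_PROMPT = (
    "You are OpenMind AI, a compassionate and friendly mental health assistant. "
    "Use the provided context to answer thoughtfully, concisely, and with empathy."
)

def answer(query: str, top_k: int = 3) -> str:
    """Retrieve relevant chunks from Pinecone and generate a grounded reply."""
    # Embed the user query and fetch the most similar stored chunks.
    query_vector = embedder.encode(query).tolist()
    results = index.query(vector=query_vector, top_k=top_k, include_metadata=True)
    context = "\n\n".join(match["metadata"]["text"] for match in results["matches"])

    # Generate a response with LLaMA 3.3, augmented with the retrieved context.
    completion = client.chat.completions.create(
        model="llama-3.3-70b-versatile",   # assumed Groq model id for LLaMA 3.3
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
        temperature=0.5,
    )
    return completion.choices[0].message.content

if __name__ == "__main__":
    print(answer("I've been feeling anxious lately and can't sleep. What can I do?"))
```
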
---

## **Technologies Used**
- Language Models: LLaMA 3.3
- Embeddings: SentenceTransformers (all-MiniLM-L6-v2)
- Database: Pinecone
- Frameworks: LangChain, Streamlit (a minimal front-end sketch follows this list)
- Programming Language: Python
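
Since Streamlit is listed among the frameworks, a chat front end might look roughly like the sketch below. The `rag_pipeline.answer` import refers to the hypothetical helper from the retrieval sketch above, not an actual module in this repository.

```python
import streamlit as st

# `answer` is the hypothetical retrieval-and-generation helper sketched above,
# assumed here to live in a module named rag_pipeline.py.
from rag_pipeline import answer

st.title("OpenMind AI – Mental Health Support Chatbot")

if "history" not in st.session_state:
    st.session_state.history = []

# Replay the conversation so far.
for role, text in st.session_state.history:
    with st.chat_message(role):
        st.write(text)

# Accept a new message, run the RAG pipeline, and display the reply.
if prompt := st.chat_input("How are you feeling today?"):
    st.session_state.history.append(("user", prompt))
    with st.chat_message("user"):
        st.write(prompt)

    reply = answer(prompt)
    st.session_state.history.append(("assistant", reply))
    with st.chat_message("assistant"):
        st.write(reply)
```

Run with `streamlit run app.py` (the file name is assumed).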