https://github.com/mithoon278/openmind-ai-genai-project
A compassionate mental health chatbot built using Retrieval-Augmented Generation (RAG). This project leverages advanced natural language processing techniques, including SentenceTransformers, Pinecone for vector storage, and fine-tuned LLaMA 3.3, to provide thoughtful, context-aware, and empathetic responses.
chunking genai-chatbot groq langchain mental-health-awareness python rag sentence-transformers support
- Host: GitHub
- URL: https://github.com/mithoon278/openmind-ai-genai-project
- Owner: Mithoon278
- License: mit
- Created: 2025-01-07T16:04:49.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-01-08T06:17:48.000Z (4 months ago)
- Last Synced: 2025-01-20T00:39:07.118Z (3 months ago)
- Topics: chunking, genai-chatbot, groq, langchain, mental-health-awareness, python, rag, sentence-transformers, support
- Language: Jupyter Notebook
- Homepage:
- Size: 2.69 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# **OPENMIND_AI-RAG PROJECT**
This project implements a mental health support chatbot using a Retrieval-Augmented Generation (RAG) framework. The chatbot is designed to provide compassionate, thoughtful, and context-aware responses by retrieving from an existing dataset and applying advanced natural language processing techniques.
---
## **Project Overview**
The chatbot:
- Uses a CSV file of psychological contexts and responses to generate insightful answers.
- Embeds data using `SentenceTransformers` and stores it in a Pinecone vector database.
- Retrieves relevant context dynamically during conversation.
- Fine-tunes an LLM to act as a compassionate, friendly personal assistant.
- Processes queries to provide meaningful, concise, and empathetic responses.
---
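The retrieval step described in the overview can be illustrated without any external services. The following is a minimal sketch, not the repository's code: `embed` is a hypothetical bag-of-words stand-in for the `SentenceTransformers` model, an in-memory list stands in for the Pinecone index, and `build_prompt` only assembles the text that would be sent to the LLM. The sample question-answer pairs and the prompt wording are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a sentence encoder)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, store, top_k=1):
    """Return the top_k stored texts most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[0]), reverse=True)
    return [text for _, text in ranked[:top_k]]


def build_prompt(query, contexts):
    """Assemble the text that would be sent to the LLM (wording is assumed)."""
    context_block = "\n\n".join(contexts)
    return (
        "You are a compassionate, friendly assistant.\n"
        f"Context:\n{context_block}\n\n"
        f"User: {query}\nAssistant:"
    )


# Invented sample documents; the real project stores dataset rows in Pinecone.
DOCS = [
    "Question: I feel anxious before exams. "
    "Answer: Try slow breathing and break revision into small steps.",
    "Question: I have trouble sleeping at night. "
    "Answer: Keep a regular bedtime and avoid screens before bed.",
]
# In-memory "index": a list of (vector, text) pairs.
STORE = [(embed(doc), doc) for doc in DOCS]

QUERY = "I can't stop feeling anxious about my exams"
PROMPT = build_prompt(QUERY, retrieve(QUERY, STORE))
print(PROMPT)
```

Swapping `embed` for a real sentence encoder and `STORE` for a vector database changes the quality and scale of retrieval, but not the shape of the loop: embed the query, rank stored items by similarity, and pass the best matches to the model as context.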
## **Workflow**
1. **Data Preprocessing**:
- The data is taken from a [Hugging Face dataset](https://huggingface.co/datasets/jkhedri/psychology-dataset/viewer/default/train?p=1); its third column was removed because the responses were irrelevant.
- The data is converted to CSV using pandas.
- The dataset is loaded using LangChain's `CSVLoader`.
- The data is formatted into question-answer pairs.
2. **Text Chunking**:
- Use LangChain's `RecursiveCharacterTextSplitter` for chunking large texts.
3. **Embeddings**:
- Generate vector embeddings using `SentenceTransformers` (`all-MiniLM-L6-v2`).
4. **Storage**:
- Store embeddings in Pinecone for fast and scalable retrieval.
5. **Retrieval-Augmented Generation**:
- Retrieve context based on user queries.
- Use LLaMA 3.3 to generate responses augmented with the retrieved context.
---
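Steps 1 and 2 of the workflow can be sketched in plain Python. This is an illustration under stated assumptions, not the project's notebook code: the column names `question` and `response` are hypothetical, the sample rows are invented, and `split_recursive` is a simplified stand-in for the idea behind LangChain's `RecursiveCharacterTextSplitter` (try coarse separators first, then fall back to finer ones), without chunk overlap.

```python
import csv
import io

# Invented sample rows; the column names are assumptions for illustration.
SAMPLE_CSV = """question,response
I feel lonely.,It is understandable to feel that way. Reaching out to one trusted person can be a gentle first step.
I am stressed at work.,Short breaks help. So does writing down the three tasks that matter most today.
"""


def to_qa_pairs(csv_text):
    """Format each CSV row as a single question-answer document."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        f"Question: {row['question']}\nAnswer: {row['response']}"
        for row in reader
    ]


def split_recursive(text, chunk_size, separators=("\n\n", "\n", ". ", " ")):
    """Split text into chunks of at most chunk_size characters.

    Separators are tried from coarsest to finest. Pieces are merged back
    together while they fit; a separator that falls on a chunk boundary
    is dropped.
    """
    if len(text) <= chunk_size:
        return [text]
    if not separators:
        # No separator left: fall back to a hard cut.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    pieces = text.split(sep)
    if len(pieces) == 1:
        # This separator does not occur; try the next, finer one.
        return split_recursive(text, chunk_size, rest)
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: split it with finer separators.
            chunks.extend(split_recursive(piece, chunk_size, rest))
            current = ""
    if current:
        chunks.append(current)
    return chunks


documents = to_qa_pairs(SAMPLE_CSV)
for chunk in split_recursive(documents[0], chunk_size=60):
    print(repr(chunk))
```

Each resulting chunk would then be embedded and written to the vector store together with its text, so that the retrieval step can return the original wording.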
## **Technologies Used**
- Language Models: LLaMA 3.3
- Embeddings: SentenceTransformers (all-MiniLM-L6-v2)
- Database: Pinecone
- Frameworks: LangChain, Streamlit
- Programming Language: Python