https://github.com/sakhileln/multimodal-chatbot-with-memory
A lightweight, CLI-based conversational agent that processes text and image inputs and remembers past interactions.
chatbot machine-learning nlp numpy opencv spacy tensorflow
- Host: GitHub
- URL: https://github.com/sakhileln/multimodal-chatbot-with-memory
- Owner: sakhileln
- License: gpl-3.0
- Created: 2025-03-23T10:23:50.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-16T18:18:41.000Z (about 1 year ago)
- Last Synced: 2025-07-23T19:06:03.077Z (11 months ago)
- Topics: chatbot, machine-learning, nlp, numpy, opencv, spacy, tensorflow
- Language: Python
- Homepage:
- Size: 929 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 6
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# Simple Multimodal Chatbot with Memory
A lightweight, CLI-based conversational agent that processes text and image inputs, remembers past interactions, and responds contextually. Built with Python, spaCy (text processing), MobileNet (image classification), and a simple memory system, it runs locally on a CPU.
## Overview
This project extends my [Simple Multimodal Agent](https://github.com/sakhileln/multimodal-agent) by adding conversational memory. It takes text prompts (e.g., "What is this?") and image files (e.g., "dog.jpg") via CLI, processes them, and maintains context across interactions. For example:
- Input 1: `--text "What is this?" --image "dog.jpg"` → "This is a dog."
- Input 2: `--text "Is it big?"` → "The dog doesn’t look very big."
The focus is on learning conversational AI, context management, and memory integration while keeping it simple and local.
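The follow-up exchange above works only if the most recent image label is remembered. Below is a minimal sketch of that idea using only the standard library. The names `Turn`, `ConversationMemory`, and `last_label` are hypothetical and chosen for illustration; they are not taken from the repository's `memory.py`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    """One exchange: the user's text, an optional image label, and the reply."""
    text: str
    label: Optional[str]
    reply: str


@dataclass
class ConversationMemory:
    """Hypothetical in-memory history; not the repository's actual class."""
    turns: List[Turn] = field(default_factory=list)

    def add(self, text: str, label: Optional[str], reply: str) -> None:
        self.turns.append(Turn(text, label, reply))

    def last_label(self) -> Optional[str]:
        # Walk backwards to find the most recent turn that carried an image label.
        for turn in reversed(self.turns):
            if turn.label is not None:
                return turn.label
        return None


memory = ConversationMemory()
memory.add("What is this?", "dog", "This is a dog.")
memory.add("Is it big?", None, "The dog doesn't look very big.")

# A follow-up without an image can still refer back to the earlier subject.
subject = memory.last_label()
```

Searching backwards for the latest labelled turn is one simple way to resolve "it" in a follow-up prompt; the actual project may resolve context differently.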
## Features
- Text intent extraction (e.g., "describe", "classify") using spaCy.
- Image classification using pre-trained MobileNet.
- Memory system to track conversation history (inputs and outputs).
- Context-aware responses based on prior interactions.
- Command-line interface built with `argparse`.
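As a rough illustration of the `argparse` interface, the sketch below defines the two flags shown in this README, `--text` and `--image`. Treating `--text` as required and `--image` as optional is an assumption based on the usage examples, and `build_parser` and `parse_args` are hypothetical helper names rather than the repository's code.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two flags shown in the README."""
    parser = argparse.ArgumentParser(description="Multimodal chatbot with memory")
    # Assumption: every invocation carries a text prompt.
    parser.add_argument("--text", required=True, help="Text prompt")
    # Assumption: the image is optional so follow-up prompts can omit it.
    parser.add_argument("--image", default=None, help="Optional path to an image file")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # Passing argv explicitly makes the parser easy to exercise without a shell.
    return build_parser().parse_args(argv)


args = parse_args(["--text", "What is this?", "--image", "dog.jpg"])
follow_up = parse_args(["--text", "Is it big?"])
```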
## Requirements
- Python 3.8+
- Libraries:
  - `spacy==3.7.2` (with `en_core_web_sm` model)
  - `opencv-python==4.9.0.80`
  - `tensorflow==2.15.0`
  - `numpy==1.26.4`
- No new dependencies beyond the above.
## Installation
1. Clone the repository:
```bash
git clone https://github.com/sakhileln/multimodal-chatbot-with-memory.git
cd multimodal-chatbot-with-memory
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Download the spaCy model:
```bash
python -m spacy download en_core_web_sm
```
## Usage
- Run the chatbot via CLI:
```bash
python chatbot.py --text "What is this?" --image "path/to/dog.jpg"
```
- Follow up with a context-aware prompt:
```bash
python chatbot.py --text "Is it big?"
```
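The two commands above run as separate processes, so the second can see the first only if the history is stored outside the process. This README does not say how that is done. One possibility, sketched here as an assumption, is a small JSON file; `load_history`, `append_turn`, and the `history.json` file name are hypothetical.

```python
import json
import os
import tempfile
from typing import Dict, List


def load_history(path: str) -> List[Dict[str, str]]:
    """Return saved turns, or an empty list if no history file exists yet."""
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def append_turn(path: str, user_text: str, reply: str) -> List[Dict[str, str]]:
    """Load the history, add one turn, write it back, and return it."""
    history = load_history(path)
    history.append({"text": user_text, "reply": reply})
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(history, handle, ensure_ascii=False, indent=2)
    return history


# Demonstration in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    history_path = os.path.join(tmp, "history.json")  # hypothetical file name
    append_turn(history_path, "What is this?", "This is a dog.")
    saved = append_turn(history_path, "Is it big?", "The dog doesn't look very big.")
```

Each call rereads the file before appending, which mirrors what two separate CLI runs would have to do.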
## Project Structure
- `chatbot.py`: Main script with CLI, memory, and response logic.
- `memory.py`: Memory system to store conversation history.
- `text_processor.py`: Text intent extraction with spaCy.
- `image_processor.py`: Image classification with MobileNet.
- `requirements.txt`: Dependencies list.
## Goals
- Deepen understanding of conversational AI and context management.
- Build a reusable, modular codebase for future experiments.
## License
This project is licensed under the GNU General Public License v3.0 (GPL-3.0). See the [LICENSE](LICENSE) file for details.
## Contact
- Sakhile III
- [LinkedIn Profile](https://www.linkedin.com/in/sakhile-ndlazi)
- [GitHub Profile](https://github.com/sakhileln)