https://github.com/yohasebe/monadic-chat

🤖 + 🐳 + 🐧 Monadic Chat is a locally hosted web application designed to create and utilize intelligent chatbots. By providing a Linux environment on Docker to GPT and other LLMs, it enables code execution and advanced tasks that require external tools.

ai chat chatbot education framework openai python ruby voice-conversion


## Overview

**Monadic Chat** is a locally hosted web application designed to create and utilize intelligent chatbots. By providing a Linux environment on Docker to GPT and other LLMs, it allows the execution of advanced tasks that require external tools. It supports voice interaction, image and video recognition and generation, and AI-to-AI chat, making it useful not only for various AI applications but also for developing and researching AI-powered applications.

Available for **Mac**, **Windows**, and **Linux** (Debian/Ubuntu) with easy-to-use installers.

[Changelog](https://yohasebe.github.io/monadic-chat/#/changelog)

## Getting Started

- [**Documentation**](https://yohasebe.github.io/monadic-chat) (English/Japanese)
- [**Installation**](https://yohasebe.github.io/monadic-chat/#/installation)

## What is Grounding?

Monadic Chat is an AI framework grounded in the real world. The term **grounding** here has two meanings.

Typically, discourse involves context and purpose, which are referenced and updated as the conversation progresses. Just as in human-to-human conversations, **maintaining and referencing context** is useful, or even essential, in conversations with AI agents. By defining the format and structure of meta-information in advance, it is expected that conversations with AI agents will become more purposeful. The process of users and AI agents advancing discourse while sharing a foundational background is the first meaning of "grounding."

Human users can use various tools to achieve their goals, but in many cases AI agents cannot. Monadic Chat enables AI agents to execute tasks using external tools by providing them with a **freely accessible Linux environment**. This allows AI agents to support users more effectively in achieving their goals. Because this environment runs in Docker containers, it does not affect the host system. This is the second meaning of "grounding."
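As a rough illustration of this second meaning, the sketch below shows how a host-side program could hand a shell command to a Linux container through the standard `docker exec` CLI. It is not Monadic Chat's actual implementation; the function names and the container name in the comment are hypothetical.

```python
import subprocess


def build_exec_command(container: str, command: str) -> list[str]:
    """Build the argv that runs a shell command inside a running container."""
    # `docker exec <container> sh -c <command>` executes the command in the
    # container, so the host system itself is not modified.
    return ["docker", "exec", container, "sh", "-c", command]


def run_in_container(container: str, command: str, timeout: float = 60.0) -> str:
    """Run `command` in `container` and return its standard output.

    Raises subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(
        build_exec_command(container, command),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout


# Example call (requires Docker and a running container; the name is made up):
#   run_in_container("example-linux-container", "python3 --version")
```

Keeping the command construction separate from the execution makes it possible to log or inspect exactly what an agent asked to run before it is executed.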

## Features

### Basic Structure

- 🤖 Use of **AI assistants** via various web and local APIs
- ⚛️ Easy Docker environment setup using a GUI app with **Electron**
- 📁 **Synchronized folder** for syncing local files with files inside Docker containers
- 📦 User-added **apps** and **containers** functionality
- 💬 Support for both **Human/AI chat** and **AI/AI chat**
- ✨ Chat functionality utilizing **multiple AI models**

### AI + Linux Environment

- 🐧 Provision of a **Linux environment** to AI agents
- 🐳 Tools available to LLMs via **Docker containers**
  - Linux (+ apt)
  - Ruby (+ gem)
  - Python (+ pip)
  - PGVector (+ PostgreSQL)
  - Selenium (+ Chrome/Chromium)
- ⚡️ Use of LLMs via **online and local** APIs
- 📦 Each container can be managed via **SSH**
- 📓 Integration with **Jupyter Notebook**

### Data Management

- 💾 **Export/import** chat data
- 📝 **Edit** chat data (add, delete, edit)
- 💬 Specify the number of messages to send to the API as **context size**
- 📜 Set **roles** for messages (user, assistant, system)
- 🔢 Generate and import/export **text embeddings** from PDFs
- 📼 **Logging** of code execution and tool/function use for debugging
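
The context size and role settings above can be pictured with a minimal sketch. It assumes that system messages are always kept and that the context size counts only the most recent user and assistant messages; that policy, the `Message` type, and the `select_context` name are illustrative assumptions, not Monadic Chat's documented behavior.

```python
from typing import TypedDict


class Message(TypedDict):
    role: str      # "user", "assistant", or "system"
    content: str


def select_context(messages: list[Message], context_size: int) -> list[Message]:
    """Return the messages that would be sent to the API.

    Assumed policy: keep every system message, then append the last
    `context_size` non-system messages in their original order.
    """
    if context_size < 0:
        raise ValueError("context_size must be non-negative")
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    # A slice of [-0:] would return the whole list, so handle zero explicitly.
    recent = others[-context_size:] if context_size > 0 else []
    return system + recent
```

With a context size of 2, a history of one system prompt and four chat turns would be reduced to the system prompt plus the two latest turns.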

### Voice Interaction

- 🔈 **Text-to-speech** for AI assistant responses (OpenAI or ElevenLabs)
- 🎙️ **Speech recognition** using the Speech-to-Text API (+ display of p-values)
- 🗺️ **Automatic language detection** for text-to-speech
- 🗣️ Choose the **language and voice** for text-to-speech
- 😊 **Interactive conversation** with AI agents using speech recognition and text-to-speech
- 🎧 Save AI assistant's spoken responses as **MP3 audio** files

### Image/Video Recognition and Generation

- 🖼️ **Image generation** using DALL·E 3 API
- 👀 Recognition and description of **uploaded images**
- 📚 Upload and recognition of **multiple images**
- 🎥 Recognition and description of **uploaded video content and audio**

### Configuration and Extension

- 💡 Specify and edit **API parameters** and **system prompts**
- 🧩 Create custom applications with **Monadic DSL** (Domain Specific Language)
- 💎 Extend functionality using the **Ruby** programming language
- 🐍 Extend functionality using the **Python** programming language
- 🔍 **Web search** capabilities using the [Tavily](https://tavily.com/) API and OpenAI's built-in search feature
- 🌎 Perform **web scraping** using Selenium
- 📦 Add custom **Docker containers**

### Support for Multiple LLM APIs

- 👥 Web API
  - [OpenAI GPT](https://platform.openai.com/docs/overview)
  - [Google Gemini](https://ai.google.dev/gemini-api)
  - [Anthropic Claude](https://www.anthropic.com/api)
  - [Cohere](https://cohere.com/)
  - [Mistral AI](https://docs.mistral.ai/api/)
  - [xAI Grok](https://x.ai/api)
  - [Perplexity](https://docs.perplexity.ai/home)
  - [DeepSeek](https://www.deepseek.com/)
- 🦙 [Ollama](https://ollama.com/) in the local Docker environment
  - Llama
  - Phi
  - Mistral
  - Gemma
  - DeepSeek
- 🤖💬🤖 **AI-to-AI** chat functionality

### Conversations as Monads

- ♻️ In addition to the main response from the AI assistant, it is possible to manage the (invisible) **state** of the conversation by obtaining additional responses and updating values within a predefined JSON object
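
The idea can be sketched as follows, assuming the model returns a JSON object with a `message` field (the visible reply) and a `context` field (the hidden state). These field names and the `update_state` helper are assumptions made for illustration; the JSON structure actually used is defined per application.

```python
import json


def update_state(state: dict, raw_response: str) -> tuple[str, dict]:
    """Unwrap a JSON response and fold its context values into the state.

    Returns the visible message and a new state dict; the input state is
    left unmodified so earlier turns can still be inspected.
    """
    payload = json.loads(raw_response)
    # Values in the response override values with the same key in the state.
    new_state = {**state, **payload.get("context", {})}
    return payload["message"], new_state


# One turn of a conversation: the reply is shown to the user, while the
# updated state is carried invisibly into the next turn.
state = {"topic": "travel", "turns": 1}
response = '{"message": "Where would you like to go?", "context": {"turns": 2}}'
message, state = update_state(state, response)
```

Each turn takes the previous state and a wrapped response and yields a new state, which is the sense in which the conversation behaves like a chain of monadic operations.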

## Developer

Yoichiro HASEBE

[[email protected]]([email protected])

## License

This software is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).