# Telegram AI Chat Member | Laylo v2
This project is a second experiment, following
[tg-local-llm](https://github.com/exposedcat/tg-local-llm), aimed at extending
the AI assistant's capabilities and improving response quality by exposing the
messenger API to the agent rather than manually performing complex response
parsing and processing.
## Tools
- `send_message(text)` sends a message in the chat
- `search_messages(query, dateMin?, dateMax?)` performs a semantic search in the
chat
- `finish()` signals that no further agent actions should be requested
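
For illustration, here is a minimal sketch of how these tools could be declared
for the model, assuming the OpenAI-style function schema that Ollama's tool
calling accepts; the descriptions and parameter types are assumptions, not the
project's actual declarations.

```typescript
// Hypothetical tool declarations in the OpenAI-style function schema
// accepted by Ollama's tool calling; descriptions are illustrative.
const tools = [
  {
    type: 'function',
    function: {
      name: 'send_message',
      description: 'Send a message in the current chat',
      parameters: {
        type: 'object',
        properties: { text: { type: 'string' } },
        required: ['text'],
      },
    },
  },
  {
    type: 'function',
    function: {
      name: 'search_messages',
      description: 'Semantic search over chat history, optionally bounded by date',
      parameters: {
        type: 'object',
        properties: {
          query: { type: 'string' },
          dateMin: { type: 'string', description: 'ISO 8601 date, optional' },
          dateMax: { type: 'string', description: 'ISO 8601 date, optional' },
        },
        required: ['query'],
      },
    },
  },
  {
    type: 'function',
    function: {
      name: 'finish',
      description: 'Stop the agent loop; no further actions are needed',
      parameters: { type: 'object', properties: {} },
    },
  },
];
```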
## Deep Dive
### Message Search
**Message search** is meant to let the agent efficiently find messages to handle
user requests such as search itself or selective chat summaries. Under the hood,
message search is implemented as a simple flow:
- Create an embedding for each message in a chat and store it in a
  [Qdrant](https://qdrant.tech/) database
- Provide the agent with a tool to perform a vector search on the database for a
  given query. Internally, each request is scoped to a chat ID. The agent is
  allowed (but not required) to set hard filters on message payloads, such as
  filtering messages by date, which lets it handle requests like a summary of
  the last 2 hours in the chat.
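
A minimal sketch of what such a scoped search could look like with the
[Qdrant JS client](https://github.com/qdrant/qdrant-js); the collection and
payload field names (`messages`, `chatId`, `timestamp`) are assumptions, and
`embed` stands in for the embedding step sketched under Tech below.

```typescript
import { QdrantClient } from '@qdrant/js-client-rest';

const qdrant = new QdrantClient({ url: 'http://localhost:6333' });

// Hypothetical payload fields: `chatId` (keyword) and `timestamp` (number).
export async function searchMessages(
  chatId: number,
  query: string,
  dateMin?: Date,
  dateMax?: Date,
) {
  const vector = await embed(query); // embedding step, sketched under Tech
  return qdrant.search('messages', {
    vector,
    limit: 10,
    filter: {
      must: [
        // Every request is hard-scoped to the chat it came from.
        { key: 'chatId', match: { value: chatId } },
        // Optional hard filter the agent may set, e.g. "last 2 hours".
        ...(dateMin || dateMax
          ? [
              {
                key: 'timestamp',
                range: {
                  gte: dateMin ? dateMin.getTime() : undefined,
                  lte: dateMax ? dateMax.getTime() : undefined,
                },
              },
            ]
          : []),
      ],
    },
  });
}
```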
## Tech
- Written in TypeScript with the [grammY](https://grammy.dev) Bot API framework
- Using MongoDB for persistence
- Using
[onnxruntime-node](https://onnxruntime.ai/docs/get-started/with-javascript/node.html)
and
[tokenizers](https://github.com/huggingface/tokenizers/tree/main/bindings/node)
for tokenization and embedding model invocation
- Using [Qdrant](https://qdrant.tech/) to store embeddings
- Using [Ollama](https://ollama.com/) for AI inference
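
A rough sketch of the embedding step, assuming the napi-based `tokenizers`
bindings and an ONNX export of the embedding model; exact method names, file
paths, and model input/output names are assumptions and may differ by version.

```typescript
import * as ort from 'onnxruntime-node';
import { Tokenizer } from 'tokenizers';

// Hypothetical paths to the exported tokenizer and ONNX model.
const tokenizer = Tokenizer.fromFile('./model/tokenizer.json');
const session = await ort.InferenceSession.create('./model/model.onnx');

export async function embed(text: string): Promise<number[]> {
  // Note: E5 instruct models typically expect a task prefix on queries
  // per the model card; omitted here for brevity.
  const encoding = await tokenizer.encode(text);
  const ids = encoding.getIds();
  const mask = encoding.getAttentionMask();
  const toTensor = (data: number[]) =>
    new ort.Tensor('int64', BigInt64Array.from(data.map(BigInt)), [1, data.length]);
  // Input names follow the typical HF ONNX export; adjust to your model.
  const outputs = await session.run({
    input_ids: toTensor(ids),
    attention_mask: toTensor(mask),
  });
  const hidden = outputs['last_hidden_state']; // shape [1, seqLen, hiddenSize]
  const [, seqLen, hiddenSize] = hidden.dims;
  const data = hidden.data as Float32Array;
  // Mean-pool over non-padding tokens, then L2-normalize.
  const pooled = new Array<number>(hiddenSize).fill(0);
  let count = 0;
  for (let t = 0; t < seqLen; t++) {
    if (!mask[t]) continue;
    count++;
    for (let h = 0; h < hiddenSize; h++) pooled[h] += data[t * hiddenSize + h];
  }
  const norm = Math.hypot(...pooled.map((v) => v / count));
  return pooled.map((v) => v / count / norm);
}
```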
## Hardware & Models
- Developed and tested on an RX 6800 XT with 16 GB VRAM
- Tested with [Qwen3 14B](https://ollama.com/library/qwen3:14b) as the main model
- Tested with
  [Multilingual E5 large instruct](https://huggingface.co/intfloat/multilingual-e5-large-instruct)
  as the embedding model
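
For reference, a minimal sketch of driving the main model through the
[ollama](https://github.com/ollama/ollama-js) JS client with the tool
declarations sketched under Tools; the loop structure is illustrative, not the
project's exact agent loop.

```typescript
import ollama, { type Message } from 'ollama';

// Illustrative agent loop: keep requesting actions until the model calls
// `finish` or stops emitting tool calls. `tools` is the declaration list
// sketched under Tools above.
const messages: Message[] = [
  { role: 'user', content: 'Summarize the last 2 hours of this chat' },
];
let done = false;
while (!done) {
  const response = await ollama.chat({ model: 'qwen3:14b', messages, tools });
  messages.push(response.message);
  for (const call of response.message.tool_calls ?? []) {
    if (call.function.name === 'finish') {
      done = true;
      break;
    }
    // Dispatch to send_message / search_messages here and append the
    // result as a `tool` role message so the model can continue.
  }
  if (!response.message.tool_calls?.length) done = true;
}
```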