An open API service indexing awesome lists of open source software.

https://github.com/mjunaidca/hello-sdd-chatagent


https://github.com/mjunaidca/hello-sdd-chatagent

Last synced: 5 months ago
JSON representation

Awesome Lists containing this project

README

          

# ChatWait - Real-time Streaming AI Chat

πŸ€– **A production-ready real-time AI chatbot** featuring token-by-token streaming responses, built with FastAPI, OpenAI Agents SDK, Gemini 2.5 Flash, and Chainlit.

## ✨ Features

### πŸš€ **Real-time Streaming Chat**
- **Token-by-token Streaming**: Watch AI responses generate in real-time
- **Server-Sent Events (SSE)**: Modern streaming protocol
- **Clean Interface**: Optimized for the streaming experience
- **Gemini 2.5 Flash**: Powered by Google's latest AI model

### πŸ› οΈ **Production Ready**
- Async-first architecture
- Comprehensive error handling
- Health monitoring
- Rate limiting support
- Clean separation of concerns

### πŸ”§ **Extensible Design**
- Plugin architecture for new features
- Session management ready
- Context preservation
- Function tool support

## πŸ—οΈ Architecture

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Chainlit UI β”‚ β”‚ FastAPI API β”‚ β”‚ OpenAI Agents β”‚
β”‚ β”‚ β”‚ β”‚ β”‚ SDK β”‚
β”‚ - Mode Selectionβ”‚ β”‚ - /chat/wait β”‚ β”‚ β”‚
β”‚ - Chat Interface│◄──►│ - /chat/stream │◄──►│ - Gemini LLM β”‚
β”‚ - Real-time UI β”‚ β”‚ - Health checks β”‚ β”‚ - Streaming β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

## πŸš€ Quick Start

### 1. Backend Setup
```bash
cd backend
uv run python main.py
```
- Server starts on http://localhost:8000
- API docs at http://localhost:8000/docs
- Health check at http://localhost:8000/health

### 2. Frontend Setup (Optional)
```bash
cd frontend
chainlit run app.py
```
- Frontend starts on http://localhost:8001
- Dual-mode chat interface

### 3. Test the API
```bash
# Test synchronous chat
curl -X POST http://localhost:8000/api/v1/chat/wait \
-H "Content-Type: application/json" \
-d '{"message": "Hello!"}'

# Test streaming chat
curl -N "http://localhost:8000/api/v1/chat/streaming?message=Hello%20streaming"
```

## πŸ“‘ API Endpoints

### Health Check
```http
GET /health
GET /api/v1/health
GET /api/v1/health/ready
GET /api/v1/health/live
```

### Chat Streaming (Real-time)
```http
GET /api/v1/chat/streaming?message=Your%20message&context_id=optional&last_token_index=0
Accept: text/event-stream
```

**Streaming Response:**
```
data: {"token": "Hello", "token_index": 0, "context_id": "conv-123", "event_type": "token"}
data: {"token": " ", "token_index": 1, "context_id": "conv-123", "event_type": "token"}
data: {"token": "world!", "token_index": 2, "context_id": "conv-123", "event_type": "token"}
data: {"type": "end", "context_id": "conv-123", "final_output": "Hello world!"}
```

### Chat Streaming
```http
GET /api/v1/chat/streaming?message=Your%20message&context_id=optional&last_token_index=0
Accept: text/event-stream
```

**Streaming Response:**
```
data: {"token": "Hello", "token_index": 0, "context_id": "conv-123", "event_type": "token"}
data: {"token": " ", "token_index": 1, "context_id": "conv-123", "event_type": "token"}
data: {"token": "world!", "token_index": 2, "context_id": "conv-123", "event_type": "token"}
data: {"type": "end", "context_id": "conv-123", "final_output": "Hello world!"}
```

## πŸŽ›οΈ Configuration

### Backend (.env)
```bash
# API Configuration
GEMINI_API_KEY=your_gemini_api_key_here
GEMINI_MODEL=gemini-2.5-flash
BACKEND_URL=http://localhost:8000

# Server Settings
HOST=localhost
PORT=8000
DEBUG=true

# Rate Limiting
RATE_LIMIT_REQUESTS_PER_MINUTE=60
```

### Frontend (.env)
```bash
# Backend Connection
BACKEND_URL=http://localhost:8000

# Chainlit Settings
CHAINLIT_HOST=localhost
CHAINLIT_PORT=8001
CHAINLIT_DEBUG=true

# UI Options
DEFAULT_CHAT_MODE=wait
SHOW_MODE_SELECTOR=true
```

## πŸ§ͺ Testing

### Automated API Testing
```bash
cd /path/to/chatwait
python test_api.py
```

### Manual API Testing
```bash
# Health check
curl http://localhost:8000/health

# Synchronous chat
curl -X POST http://localhost:8000/api/v1/chat/wait \
-H "Content-Type: application/json" \
-d '{"message": "What is AI?"}'

# Streaming chat
curl -N "http://localhost:8000/api/v1/chat/streaming?message=Explain%20machine%20learning"
```

### Interactive Testing
- **Swagger UI**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc
- **Chainlit UI**: http://localhost:8001 (if running)

## πŸ”§ Development

### Project Structure
```
chatwait/
β”œβ”€β”€ backend/ # FastAPI backend
β”‚ β”œβ”€β”€ src/chatwait_backend/ # Main package
β”‚ β”‚ β”œβ”€β”€ api/ # API endpoints
β”‚ β”‚ β”œβ”€β”€ models/ # Pydantic models
β”‚ β”‚ β”œβ”€β”€ services/ # Business logic
β”‚ β”‚ β”œβ”€β”€ routers/ # Route handlers
β”‚ β”‚ └── config.py # Configuration
β”‚ β”œβ”€β”€ tests/ # Backend tests
β”‚ β”œβ”€β”€ main.py # Server startup
β”‚ └── pyproject.toml # Dependencies
β”œβ”€β”€ frontend/ # Chainlit frontend
β”‚ β”œβ”€β”€ app.py # Main Chainlit app
β”‚ β”œβ”€β”€ modes/ # Mode implementations
β”‚ └── .env # Frontend config
└── specs/ # Specifications
└── 001-develop-a-scoped/ # Feature specs
```

### Adding New Features

1. **New Chat Modes**: Add to `frontend/modes/`
2. **New API Endpoints**: Add to `backend/src/chatwait_backend/routers/`
3. **New Models**: Add to `backend/src/chatwait_backend/models/`
4. **New Services**: Add to `backend/src/chatwait_backend/services/`

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `GEMINI_API_KEY` | Gemini API key | Required |
| `BACKEND_URL` | Backend API URL | http://localhost:8000 |
| `CHAINLIT_HOST` | Frontend host | localhost |
| `CHAINLIT_PORT` | Frontend port | 8001 |
| `DEBUG` | Enable debug mode | false |

## πŸš€ Deployment

### Docker (Recommended)
```dockerfile
# Backend
FROM python:3.12
WORKDIR /app
COPY backend/ .
RUN uv sync
CMD ["uv", "run", "python", "main.py"]

# Frontend
FROM python:3.12
WORKDIR /app
COPY frontend/ .
RUN uv sync
CMD ["chainlit", "run", "app.py", "--host", "0.0.0.0"]
```

### Production Considerations
- Set up API key management (AWS Secrets Manager, etc.)
- Configure rate limiting
- Set up monitoring and logging
- Use production WSGI server (Gunicorn, etc.)
- Configure health checks for load balancers

## πŸ“Š Performance

### Benchmarks
- **Response Time**: <200ms per token (streaming)
- **Throughput**: 60 requests/minute (rate limited)
- **Concurrent Users**: Supports multiple simultaneous streams
- **Memory Usage**: Optimized for production workloads

### Monitoring
- Health check endpoints for load balancers
- Structured logging with request IDs
- Performance metrics collection
- Error tracking and alerting

## 🀝 Contributing

1. Follow TDD approach (tests first)
2. Maintain async-first architecture
3. Keep clean separation of concerns
4. Add comprehensive error handling
5. Update documentation

## πŸ“ License

MIT License - see LICENSE file for details.

---

**ChatWait** - Built with ❀️ using FastAPI, OpenAI Agents SDK, and Chainlit