An open API service indexing awesome lists of open source software.

https://github.com/lalomorales22/thinker

Full-stack AI training platform showcasing every capability of the Tinker SDK from Thinking Machines Lab.
https://github.com/lalomorales22/thinker

dpo llms rl rlhf tinker

Last synced: about 1 month ago
JSON representation

Full-stack AI training platform showcasing every capability of the Tinker SDK from Thinking Machines Lab.

Awesome Lists containing this project

README

          

# 🧠 Thinker

**A love letter to Mira Murati** - Full-stack AI training platform showcasing every capability of the Tinker SDK from Thinking Machines Lab.

## What is Thinker?

An IDE-like platform for training self-evolving AI agents. Build code review models, autonomous debuggers, and multi-agent systems using supervised learning, RL, RLHF, and DPO - all through a beautiful interface.

## ✨ Features

### 6 Fully-Functional Views

1. **⚡ Training Dashboard** - Create & monitor training jobs (SL, RL, RLHF, DPO)
2. **📦 Models Library** - Browse, manage, export trained models
3. **💾 Dataset Manager** - Upload & manage training datasets with HuggingFace import
4. **💬 Playground** - Interactive code review & chat with Monaco editor
5. **📊 Analytics** - Training metrics, charts, evaluation results
6. **🤖 Multi-Agent Arena** - Agents compete and collaborate in tournament/swarm modes

### Technical Highlights

- 🎨 Ultra-dark IDE-inspired interface
- 🔄 Real-time WebSocket training updates
- 🧠 Multi-agent RL with tournament/collaborative/swarm modes
- 📈 Recharts visualization ready
- 🤖 AI Training Assistant powered by Ollama (local LLMs)
- 💾 Persistent state management (Zustand)
- 🔍 HuggingFace dataset search and import
- ⚡ DPO (Direct Preference Optimization) training support

## Architecture

**Backend** (`/backend`)
- FastAPI - Async web framework
- WebSocket - Real-time updates
- Tinker SDK - LLM fine-tuning API
- Python 3.10+

**Frontend** (`/frontend`)
- React 18 + TypeScript
- Vite - Build tool & dev server
- TailwindCSS - Styling
- Monaco Editor - Code display
- Zustand - State management
- Socket.io - Real-time client
- Recharts - Data visualization

## 🚀 Quick Start

### Option 1: One-Command Launch (Recommended)

```bash
./START_UI.sh
```

This script:
- Installs frontend dependencies (if needed)
- Launches the dev server at `http://localhost:5173`
- Shows all 5 available views

### Option 2: Manual Setup

**Prerequisites:**
- Python 3.10+
- Node.js 18+
- Tinker API key
- (Optional) Ollama for AI Training Assistant - see [OLLAMA_SETUP.md](OLLAMA_SETUP.md)

**Backend:**
```bash
cd backend
pip install -r requirements.txt
export TINKER_API_KEY=your_key_here
python main.py
```

**Frontend:**
```bash
cd frontend
npm install
npm run dev
```

**Open:** `http://localhost:5173`

## Development

- Backend: `http://localhost:8000`
- Frontend: `http://localhost:5173`
- WebSocket: `ws://localhost:8000/ws`
- Current mode: UI with mock data (API integration ready)

## 📚 Documentation

- **[BUILD.md](BUILD.md)** - Complete vision, technical roadmap & revolutionary features
- **[HOW_TO_USE.md](HOW_TO_USE.md)** - Comprehensive user guide with training types, metrics glossary, and troubleshooting
- **[TRAINING_GUIDE.md](TRAINING_GUIDE.md)** - Step-by-step training guide from zero to production model
- **[HUGGINGFACE_IMPORT.md](HUGGINGFACE_IMPORT.md)** - Git-clone style dataset importing from HuggingFace Hub
- **[OLLAMA_SETUP.md](OLLAMA_SETUP.md)** - 🆕 Setup guide for AI Training Assistant with Ollama

## 🎯 Vision

Thinker is more than a training UI - it's a **living laboratory** for next-generation AI training:

- **Multi-Agent Arena** - Agents compete and collaborate, learning from each other
- **Swarm Intelligence** - Evolutionary approach with N model copies
- **Real-time RLHF** - Interactive feedback during training
- **Self-Improving Reward Models** - Models that learn to evaluate themselves
- **Meta-Learning** - Learn optimal hyperparameters from training history
- **Live Code Execution** - Run code in sandboxed containers for verification
- **RAG-Enhanced Review** - Context from docs, codebases, and git history

See [BUILD.md](BUILD.md) for the complete roadmap of revolutionary features.

## 🛠️ Current Status

**✅ Production-Ready UI:**
- 5-view tactical interface with widget design
- Full API integration with Tinker SDK
- Zustand state management
- Tactical dark theme with sleek highlights
- Rounded widget borders and glassmorphism
- Robust error handling and type safety

**🔧 Recent Fixes:**
- ✅ Fixed ModelsLibrary type validation (handles non-string model names)
- ✅ Fixed TrainingDashboard null-safe rendering (handles undefined job metrics)
- ✅ Fixed backend async/sync method warnings (proper async SDK calls)
- ✅ Improved data mapping with optional chaining and fallback values

**✅ Phase 2 Complete: Enhanced User Experience**
- ✅ Guided Training Wizard with multi-step setup
- ✅ Confirmation modals with deployment progress indicators
- ✅ Real-time training progress dashboard with expandable job cards
- ✅ Dataset validator and previewer with format detection

**✅ Phase 3 Complete: AI Assistant & Advanced Features**
- ✅ Natural Language Training Assistant powered by Ollama (local LLMs)
- ✅ AI assistant knows about Thinker platform and Tinker SDK
- ✅ Automatic fallback to pattern matching when Ollama unavailable
- ✅ HuggingFace dataset search, preview, and import
- ✅ DPO (Direct Preference Optimization) training implementation
- ✅ Multi-Agent RL Arena with tournament/collaborative/swarm modes
- ✅ Bradley-Terry preference learning
- ✅ Concurrent async operations for 30-50% faster training

**✅ Phase 4 Complete: Documentation & Polish**
- ✅ Comprehensive HOW_TO_USE.md with deep-dive training explanations
- ✅ Complete metrics glossary covering all training types
- ✅ Detailed troubleshooting decision trees
- ✅ Step-by-step TRAINING_GUIDE.md from beginner to advanced
- ✅ Interactive onboarding tour for first-time users
- ✅ Hyperparameter tuning recommendations

**✅ BONUS Complete: Technical Debt & Code Quality**
- ✅ **Error Handling**: Custom exception classes with proper HTTP status codes
- ✅ **Logging Infrastructure**: Centralized logging with file rotation and levels
- Console handler for INFO+ messages
- File handler for DEBUG+ messages
- Separate error log file for ERROR+ messages
- Structured logging with contextual information
- ✅ **Backend Testing**: Comprehensive test suite with pytest
- `test_datasets.py` - Dataset upload, validation, listing tests
- `test_training.py` - Training job lifecycle and configuration tests
- Test fixtures for clean state management
- Mock data handling for isolated tests
- ✅ **Frontend Testing**: React testing infrastructure
- Error boundary component with development/production modes
- Vitest configuration with jsdom environment
- Testing utilities setup with React Testing Library
- Component test examples and coverage configuration
- ✅ **Error Boundaries**: React error boundary wrapping entire app
- Graceful error handling with user-friendly fallback UI
- Detailed error information in development mode
- Action buttons (Try Again, Reload, Go Home)
- Error logging and callback support
- ✅ **Validation & Cleanup**: Enhanced data validation
- Dataset format validation (jsonl, json, csv)
- Split percentage validation (must sum to 100%)
- File cleanup on upload failures
- Encoding detection and fallback handling

**✅ Latest Update: Real HuggingFace Dataset Import**
- ✅ Actual HuggingFace datasets library integration (git-clone style)
- ✅ Search and browse thousands of datasets from HuggingFace Hub
- ✅ Real-time dataset download and conversion
- ✅ Automatic field mapping for SL and DPO training types
- ✅ Preview datasets before importing
- ✅ See [HUGGINGFACE_IMPORT.md](HUGGINGFACE_IMPORT.md) for detailed usage guide

**🔜 Next Steps:**
- Add Recharts data visualizations to Analytics view
- Add real-time multi-agent visualization
- Expand AI assistant with more training strategies
- Expand test coverage to 80%+
- Add integration tests for end-to-end workflows

## License

MIT

---

*Made w ❤️ by lalo for Mira*