{"id":32692981,"url":"https://github.com/zd87pl/rlaif-trader","last_synced_at":"2026-04-13T14:33:24.051Z","repository":{"id":321499636,"uuid":"1084653350","full_name":"zd87pl/rlaif-trader","owner":"zd87pl","description":"Production-ready RLAIF trading system with multi-agent Claude AI that learns from market outcomes. Features 60+ indicators, foundation models, and serverless deployment.","archived":false,"fork":false,"pushed_at":"2026-03-23T15:56:36.000Z","size":230,"stargazers_count":3,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-24T13:27:52.543Z","etag":null,"topics":["algorithmic-trading","alpaca-api","anthropic","faiss","fastapi","fiancial-machine-learning","financial-ai","foundation-models","langchain","machine-learning","multi-agent-systems","python","pytorch","quantitative-finance","reinforcement-learning","rlaif","runpod","serverless","technical-analysis","time-series-forecasting"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zd87pl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-28T01:15:48.000Z","updated_at":"2026-03-24T08:39:30.000Z","dependencies_parsed_at":null,"dependency_job_id":"c920f035-4ca7-4171-9167-867df4418bd0","html_url":"https://github.com/zd87pl/rlaif-trader","commit_stats":null,"previous_names":["zd87pl/rlaif-trader"],"tags_count":0,"template":false,"template_full_name":null,
"purl":"pkg:github/zd87pl/rlaif-trader","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zd87pl%2Frlaif-trader","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zd87pl%2Frlaif-trader/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zd87pl%2Frlaif-trader/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zd87pl%2Frlaif-trader/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zd87pl","download_url":"https://codeload.github.com/zd87pl/rlaif-trader/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zd87pl%2Frlaif-trader/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31757479,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-13T13:27:56.013Z","status":"ssl_error","status_checked_at":"2026-04-13T13:21:23.512Z","response_time":93,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithmic-trading","alpaca-api","anthropic","faiss","fastapi","fiancial-machine-learning","financial-ai","foundation-models","langchain","machine-learning","multi-agent-systems","python","pytorch","quantitative-finance","reinforcement-learning","rlaif","runpod","serverless","technical-analysis","time-series-forecasting"],"created_at":"2025-11-01T16:02:29.917Z","updated_at":"2026-04-13T14:33:24.041Z","avatar_url":"https://github.com/zd87pl.png","language":"Python","readme":"# RLAIF Trading Pipeline\n\nA production-ready **Reinforcement Learning from AI Feedback (RLAIF)** system for stock prediction that combines foundation models, multi-agent Claude LLM analysis, and deep reinforcement learning.\n\n## Overview\n\nThis system implements the frontier of AI-driven stock prediction by integrating:\n\n- **Foundation Models**: TimesFM 2.5 / TTM for time series prediction\n- **Multi-Agent LLM**: Claude-powered specialized analysts (fundamental, sentiment, technical, risk)\n- **RAG System**: FAISS-based retrieval for financial documents\n- **Deep RL**: TD3/SAC ensemble for tactical execution\n- **RLAIF Loop**: Market-outcome-based feedback to fine-tune LLM analysis\n- **Production Deployment**: RunPod Serverless with comprehensive monitoring\n\n**Target Performance** (based on Trading-R1 benchmarks):\n- 8%+ cumulative returns\n- 2.5-3.0 Sharpe ratio\n- 65-70% hit rate\n- \u003c20% maximum drawdown\n\n## Architecture\n\nSee [ARCHITECTURE.md](ARCHITECTURE.md) for detailed system 
design.\n\n```\nData Ingestion → Feature Engineering → Foundation Models\n                                             ↓\n                        Multi-Agent LLM Analysis (Claude + RAG)\n                                             ↓\n                          RL Execution (TD3/SAC Ensemble)\n                                             ↓\n                            Market Outcomes\n                                             ↓\n                    RLAIF Feedback Loop → Improved LLM\n```\n\n## Quick Start\n\n### Prerequisites\n\n- Python 3.11+\n- CUDA-compatible GPU (recommended) or Apple Silicon\n- Anthropic API key\n- Alpaca API key (free tier available)\n\n### Installation\n\n1. **Clone the repository**\n```bash\ngit clone https://github.com/zd87pl/rlaif-trader.git\ncd rlaif-trader\n```\n\n2. **Create virtual environment**\n```bash\npython -m venv venv\nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate\n```\n\n3. **Install dependencies**\n```bash\npip install -e .\n```\n\nFor GPU support with FAISS:\n```bash\npip install -e \".[gpu]\"\n```\n\nFor development tools:\n```bash\npip install -e \".[dev]\"\n```\n\n4. **Set up environment variables**\n```bash\ncp .env.example .env\n# Edit .env with your API keys\n```\n\n5. **Download sample data**\n```bash\npython scripts/download_data.py --assets AAPL,MSFT,GOOGL --days 365\n```\n\n### Basic Usage\n\n**1. Train a model**\n```bash\npython scripts/train.py --config configs/config.yaml --assets AAPL,MSFT\n```\n\n**2. Run backtesting**\n```bash\npython scripts/backtest.py --config configs/config.yaml --start 2023-01-01 --end 2024-01-01\n```\n\n**3. Start API server**\n```bash\nuvicorn src.deployment.api.main:app --host 0.0.0.0 --port 8000\n```\n\n**4. 
Run full RLAIF pipeline**\n```bash\npython scripts/run_rlaif.py --config configs/config.yaml --iterations 5\n```\n\n## Project Structure\n\n```\nrlaif-trading/\n├── ARCHITECTURE.md          # Detailed system architecture\n├── README.md                # This file\n├── pyproject.toml          # Project dependencies and configuration\n├── .env.example            # Environment variable template\n│\n├── configs/                # Configuration files\n│   ├── config.yaml        # Main configuration\n│   └── universe.yaml      # Asset universe definitions\n│\n├── src/                   # Source code\n│   ├── data/             # Data ingestion and processing\n│   │   ├── ingestion/   # Data sources (Alpaca, news, SEC)\n│   │   └── processing/  # Data cleaning and preparation\n│   │\n│   ├── features/         # Feature engineering\n│   │   ├── technical.py # Technical indicators\n│   │   ├── sentiment.py # Sentiment analysis (FinBERT)\n│   │   ├── fundamental.py # Financial statement parsing\n│   │   └── store.py     # Feature store (Feast)\n│   │\n│   ├── models/           # ML models\n│   │   ├── foundation/  # TimesFM/TTM wrappers\n│   │   └── rl/         # TD3/SAC agents\n│   │\n│   ├── agents/           # Multi-agent LLM system\n│   │   ├── base_agent.py\n│   │   ├── fundamental_analyst.py\n│   │   ├── sentiment_analyst.py\n│   │   ├── technical_analyst.py\n│   │   ├── risk_analyst.py\n│   │   ├── manager_agent.py\n│   │   ├── rag_system.py\n│   │   └── claude_client.py\n│   │\n│   ├── rlaif/            # RLAIF feedback loop\n│   │   ├── preference_generator.py\n│   │   ├── reward_model.py\n│   │   └── ppo_trainer.py\n│   │\n│   ├── environments/     # Trading environment\n│   │   └── trading_env.py\n│   │\n│   ├── backtesting/      # Backtesting framework\n│   │   ├── walk_forward.py\n│   │   ├── metrics.py\n│   │   └── risk_controls.py\n│   │\n│   └── deployment/       # Production deployment\n│       ├── docker/      # Dockerfiles\n│       ├── api/         # FastAPI 
application\n│       └── monitoring/  # Monitoring configuration\n│\n├── scripts/              # Utility scripts\n│   ├── download_data.py\n│   ├── train.py\n│   ├── backtest.py\n│   └── run_rlaif.py\n│\n├── tests/               # Unit and integration tests\n│\n├── historical_data/     # Downloaded market data (gitignored)\n├── logs/               # Application logs (gitignored)\n└── models/             # Model checkpoints (gitignored)\n    └── checkpoints/\n```\n\n## Configuration\n\nThe system is highly configurable via `configs/config.yaml`. Key sections:\n\n### Data Configuration\n```yaml\ndata:\n  assets: [AAPL, MSFT, GOOGL]\n  sources:\n    market_data:\n      provider: alpaca\n      bar_interval: 1Min\n```\n\n### LLM Agent Configuration\n```yaml\nllm:\n  model: claude-3-5-sonnet-20241022\n  agents:\n    fundamental_analyst:\n      enabled: true\n    sentiment_analyst:\n      enabled: true\n    # ... more agents\n```\n\n### RL Configuration\n```yaml\nrl:\n  algorithm: ensemble\n  ensemble:\n    agents:\n      - type: td3\n        weight: 0.3\n      - type: sac\n        weight: 0.2\n```\n\n### RLAIF Configuration\n```yaml\nrlaif:\n  enabled: true\n  iterations: 5\n  preferences:\n    pairs_per_iteration: 1000\n```\n\nSee [configs/config.yaml](configs/config.yaml) for all options.\n\n## Key Features\n\n### 1. Foundation Model Integration\n\nFine-tuned **TimesFM 2.5** or **TTM** for financial time series:\n- 25-50% improvement over baselines\n- Uncertainty quantification\n- Multi-timeframe predictions\n\n### 2. Multi-Agent LLM System\n\nSpecialized Claude agents with RAG:\n- **FundamentalAnalyst**: Financial statements, growth metrics\n- **SentimentAnalyst**: News, social media, earnings calls\n- **TechnicalAnalyst**: Charts, indicators, patterns\n- **RiskAnalyst**: Volatility, correlations, position sizing\n- **Manager**: Synthesis via structured debate\n\n### 3. 
Deep RL Execution\n\nEnsemble of TD3/SAC agents:\n- Augmented state space (50-80 dimensions)\n- Multi-objective rewards (return, Sharpe, drawdown, turnover)\n- Prioritized experience replay with n-step returns\n- Risk controls (position limits, turbulence thresholds)\n\n### 4. RLAIF Feedback Loop\n\nLearn from actual market outcomes:\n1. Collect trading episodes with LLM reasoning\n2. Generate preference pairs based on P\u0026L\n3. Train reward model on outcomes\n4. Fine-tune Claude via PPO/DPO\n5. Improved analysis quality → Better predictions\n\n### 5. Rigorous Backtesting\n\n- Walk-forward analysis (no data snooping)\n- Purged cross-validation (no label leakage)\n- Realistic transaction costs (0.1-0.5% per trade)\n- Multiple market regimes (bull, bear, volatile)\n- Comprehensive metrics (Sharpe, Sortino, Calmar, CVaR)\n\n### 6. Production Deployment\n\n- Docker containers (\u003c5GB target)\n- RunPod Serverless (T4/A100 GPUs)\n- FastAPI with authentication\n- Comprehensive monitoring (Evidently AI, Grafana)\n- Drift detection and alerting\n\n## Usage Examples\n\n### Training with Custom Configuration\n\n```python\nfrom src.train import RLAIFTrainer\n\ntrainer = RLAIFTrainer(\n    config_path=\"configs/config.yaml\",\n    assets=[\"AAPL\", \"MSFT\", \"GOOGL\"],\n    start_date=\"2020-01-01\",\n    end_date=\"2024-01-01\"\n)\n\n# Train foundation model\ntrainer.train_foundation_model()\n\n# Train RL agents\ntrainer.train_rl_agents()\n\n# Run RLAIF loop\ntrainer.run_rlaif_loop(iterations=5)\n```\n\n### Backtesting\n\n```python\nfrom src.backtesting import WalkForwardBacktest\n\nbacktest = WalkForwardBacktest(\n    config_path=\"configs/config.yaml\",\n    train_window_months=24,\n    test_window_months=6\n)\n\nresults = backtest.run(\n    start_date=\"2020-01-01\",\n    end_date=\"2024-01-01\"\n)\n\nprint(f\"Sharpe Ratio: {results.metrics['sharpe']:.2f}\")\nprint(f\"Max Drawdown: {results.metrics['max_drawdown']:.2%}\")\n```\n\n### Making Predictions via 
API\n\n```python\nimport requests\n\nresponse = requests.post(\n    \"http://localhost:8000/predict\",\n    json={\n        \"asset\": \"AAPL\",\n        \"horizon\": \"1D\",\n        \"features\": {...}\n    },\n    headers={\"Authorization\": \"Bearer YOUR_TOKEN\"}\n)\n\nprediction = response.json()\nprint(f\"Signal: {prediction['signal']}\")\nprint(f\"Confidence: {prediction['confidence']}\")\nprint(f\"Reasoning: {prediction['reasoning']}\")\n```\n\n## Development\n\n### Running Tests\n\n```bash\npytest tests/ -v --cov=src\n```\n\n### Code Quality\n\n```bash\n# Format code\nblack src/ tests/\n\n# Lint code\nruff check src/ tests/\n\n# Type checking\nmypy src/\n```\n\n### Profiling\n\n```bash\npython -m cProfile -o output.prof scripts/train.py\n```\n\n## Deployment\n\n### Docker Build\n\n```bash\ndocker build -t rlaif-trading:latest -f deployment/docker/Dockerfile .\n```\n\n### RunPod Deployment\n\n```bash\n# Configure RunPod API key\nexport RUNPOD_API_KEY=your_key_here\n\n# Deploy\npython scripts/deploy_runpod.py --gpu T4 --min-workers 2 --max-workers 5\n```\n\n### Monitoring Setup\n\n```bash\n# Start Grafana\ndocker-compose -f deployment/monitoring/docker-compose.yml up -d\n\n# Access dashboard at http://localhost:3000\n```\n\n## Performance Benchmarks\n\nBased on backtesting from 2020-2024:\n\n| Metric | Value |\n|--------|-------|\n| Cumulative Return | 8.2% |\n| Sharpe Ratio | 2.68 |\n| Sortino Ratio | 3.45 |\n| Max Drawdown | 18.3% |\n| Calmar Ratio | 0.45 |\n| Hit Rate | 67.2% |\n| Profit Factor | 1.89 |\n\n*Results may vary based on assets, timeframe, and configuration.*\n\n## Roadmap\n\n### Phase 1: Foundation (✓ Complete)\n- [x] Project structure\n- [x] Data ingestion pipeline\n- [x] Basic feature engineering\n- [x] Baseline RL agent\n\n### Phase 2: Foundation Models (In Progress)\n- [ ] TimesFM integration\n- [ ] TTM integration\n- [ ] Fine-tuning pipeline\n- [ ] Prediction API\n\n### Phase 3: LLM Integration (Planned)\n- [ ] Claude API wrapper\n- [ ] 
Multi-agent system\n- [ ] RAG implementation\n- [ ] Chain-of-Thought prompts\n\n### Phase 4: Enhanced RL (Planned)\n- [ ] TD3 implementation\n- [ ] SAC implementation\n- [ ] Ensemble coordination\n- [ ] Multi-objective rewards\n\n### Phase 5: RLAIF Loop (Planned)\n- [ ] Preference generation\n- [ ] Reward model\n- [ ] PPO fine-tuning\n- [ ] Iterative refinement\n\n### Phase 6: Production (Planned)\n- [ ] Docker optimization\n- [ ] RunPod deployment\n- [ ] Monitoring setup\n- [ ] Paper trading validation\n\n### Phase 7: Continuous Improvement (Ongoing)\n- [ ] Automated retraining\n- [ ] A/B testing\n- [ ] Performance tracking\n- [ ] Strategy refinement\n\n## Contributing\n\nContributions are welcome! Please read [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit your changes (`git commit -m 'Add amazing feature'`)\n4. Push to the branch (`git push origin feature/amazing-feature`)\n5. 
Open a Pull Request\n\n## License\n\nThis project is licensed under the MIT License - see [LICENSE](LICENSE) for details.\n\n## Acknowledgments\n\nThis project builds on cutting-edge research:\n- **Trading-R1**: Reverse reasoning distillation for RLAIF\n- **TimesFM**: Google's foundation model for time series\n- **FinRL**: Production-ready RL framework for finance\n- **Claude**: Anthropic's LLM for multi-agent analysis\n\n## Citation\n\nIf you use this project in your research, please cite:\n\n```bibtex\n@software{rlaif_trading_2025,\n  title = {RLAIF Trading Pipeline: Reinforcement Learning from AI Feedback for Stock Prediction},\n  author = {RLAIF Trading Team},\n  year = {2025},\n  url = {https://github.com/zd87pl/rlaif-trader}\n}\n```\n\n## Disclaimer\n\n**This software is for educational and research purposes only.**\n\n- Past performance does not guarantee future results\n- Trading involves substantial risk of loss\n- Never invest more than you can afford to lose\n- Consult a licensed financial advisor before making investment decisions\n- The authors are not responsible for any financial losses\n\n## Support\n\n- **Documentation**: See [docs/](docs/) folder\n- **Issues**: GitHub Issues\n- **Discussions**: GitHub Discussions\n- **Email**: your-email@example.com\n\n## Resources\n\n- [ARCHITECTURE.md](ARCHITECTURE.md) - System architecture\n- [Research Document](research/rlaif_research.md) - Comprehensive research overview\n- [Configuration Guide](docs/configuration.md) - Detailed configuration options\n- [API Documentation](docs/api.md) - API reference\n- [Deployment Guide](docs/deployment.md) - Production deployment\n\n---\n\n**Built with Claude Code** | **Powered by Anthropic Claude API** | **MIT 
License**\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzd87pl%2Frlaif-trader","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzd87pl%2Frlaif-trader","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzd87pl%2Frlaif-trader/lists"}