{"id":30727590,"url":"https://github.com/dnakov/hrm-mlx","last_synced_at":"2025-09-16T16:43:30.989Z","repository":{"id":308202951,"uuid":"1031336567","full_name":"dnakov/hrm-mlx","owner":"dnakov","description":"MLX implementation of Hierarchical Reasoning Model (HRM) - Adaptive computation for complex reasoning tasks","archived":false,"fork":false,"pushed_at":"2025-08-27T01:20:50.000Z","size":94,"stargazers_count":26,"open_issues_count":0,"forks_count":4,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-27T10:02:58.799Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dnakov.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-03T14:28:06.000Z","updated_at":"2025-08-27T01:20:53.000Z","dependencies_parsed_at":"2025-08-28T16:46:10.247Z","dependency_job_id":null,"html_url":"https://github.com/dnakov/hrm-mlx","commit_stats":null,"previous_names":["dnakov/hrm-mlx"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dnakov/hrm-mlx","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dnakov%2Fhrm-mlx","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dnakov%2Fhrm-mlx/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dnakov%2Fhrm-mlx/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dnakov%2Fhrm-mlx/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dnakov","dow
nload_url":"https://codeload.github.com/dnakov/hrm-mlx/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dnakov%2Fhrm-mlx/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273453710,"owners_count":25108473,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-03T02:00:09.631Z","response_time":76,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-03T14:07:40.079Z","updated_at":"2025-09-16T16:43:25.931Z","avatar_url":"https://github.com/dnakov.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Hierarchical Reasoning Model (HRM) - MLX Implementation\n\n![HRM Architecture](https://raw.githubusercontent.com/sapientinc/HRM/main/assets/hrm.png)\n\nThis is a complete MLX (Apple Silicon optimized) implementation of the Hierarchical Reasoning Model from the paper [\"Hierarchical Reasoning Model\"](https://arxiv.org/abs/2506.21734). The implementation is mathematically identical to the [original PyTorch version](https://github.com/sapientinc/HRM) while leveraging MLX for efficient training on Apple Silicon devices.\n\n## Overview\n\nThe Hierarchical Reasoning Model (HRM) is a novel recurrent architecture inspired by hierarchical and multi-timescale processing in the human brain. 
With only 27 million parameters, HRM achieves exceptional performance on complex reasoning tasks using just 1000 training samples, without pre-training or Chain-of-Thought supervision.\n\n### Key Features\n\n- **Hierarchical Architecture**: Two interdependent recurrent modules operating at different timescales\n- **Adaptive Computation Time (ACT)**: Dynamic computation depth with Q-learning-based halting\n- **One-Step Gradient Approximation**: Memory-efficient training with O(1) memory in the number of recurrent steps\n- **Small-Sample Learning**: Near-perfect performance with only 1000 training examples\n- **MLX Optimized**: Efficient training on Apple Silicon (M1/M2/M3/M4)\n\n### Performance\n\nThis implementation achieves performance identical to the original:\n- **Sudoku-Extreme**: Near-perfect accuracy with 1000 samples\n- **Training Time**: ~10 minutes on laptop GPU (original takes similar time on 8x GPU)\n- **Parameters**: ~27M (exact match)\n\n## Installation\n\n### Requirements\n\n- macOS with Apple Silicon (M1/M2/M3/M4)\n- Python 3.8+\n- MLX framework\n\n### Quick Setup\n\n```bash\n# Clone the repository\ngit clone https://github.com/dnakov/hrm-mlx.git\ncd hrm-mlx\n\n# Install dependencies\npip install -r requirements.txt\n```\n\n## Quick Start\n\n### Demo: Sudoku Solver 🧩\n\nTrain a master-level Sudoku AI on your Mac:\n\n```bash\n# Quick training with default parameters\n./train_sudoku.sh\n\n# Or with custom parameters\npython pretrain.py \\\n    --batch_size 32 \\\n    --learning_rate 1e-4 \\\n    --weight_decay 1.0 \\\n    --train_samples 1000 \\\n    --halt_max_steps 8\n```\n\n### Evaluation\n\n```bash\n# Evaluate a trained model\npython evaluate.py \\\n    --checkpoint checkpoints/best_model.npz \\\n    --batch_size 32\n```\n\n## Architecture Details\n\n### Model Components\n\nThe implementation is organized into modular components matching the original structure:\n\n```\nmodels/\n├── __init__.py\n├── common.py          # Initialization utilities\n├── layers.py          # 
Core layers (Attention, SwiGLU, RMSNorm)\n├── losses.py          # Loss functions (StableMax, ACT losses)\n├── sparse_embedding.py # Sparse embeddings for puzzles\n└── hrm/\n    ├── __init__.py\n    └── hrm_act_v1.py  # Main HRM model with ACT\n```\n\n### Key Implementation Details\n\n1. **Exact Mathematical Match**: All operations match the original PyTorch implementation\n   - Truncated normal initialization with JAX-compatible formula\n   - StableMax activation with epsilon = 1e-30\n   - RMS normalization with float32 precision\n   - Rotary position embeddings (RoPE)\n\n2. **MLX Adaptations**:\n   - Standard attention (no FlashAttention)\n   - `mx.stop_gradient()` for buffer management\n   - MLX optimizers and checkpointing\n\n3. **ACT Implementation**:\n   - Q-learning-based halting without replay buffer\n   - Exploration with configurable probability\n   - Bootstrap target computation\n\n## Training Configuration\n\n### Recommended Settings\n\nBased on the original paper for Sudoku-Extreme:\n\n```python\n# Architecture\nd_model = 512         # Model dimension\nH_cycles = 2          # High-level reasoning cycles\nL_cycles = 2          # Low-level computation cycles\nH_layers = 4          # High-level transformer layers\nL_layers = 4          # Low-level transformer layers\n\n# Training\nlearning_rate = 1e-4  # Learning rate\nweight_decay = 1.0    # Weight decay (AdamW)\nbatch_size = 32       # Batch size\nhalt_max_steps = 8    # Maximum ACT steps\n\n# Data\ntrain_samples = 1000  # Training examples\nmin_difficulty = 20   # Minimum puzzle difficulty\n```\n\n### Known Issues\n\nAs documented in the original implementation:\n\u003e \"For Sudoku-Extreme (1,000-example dataset), late-stage overfitting may cause numerical instability during training and Q-learning. It is advisable to use early stopping once the training accuracy approaches 100%.\"\n\nIf you encounter NaN losses:\n1. The model has likely achieved good performance already\n2. 
Use early stopping or reduce learning rate\n3. Consider larger batch sizes for stability\n\n## File Structure\n\n```\nhrm-mlx/\n├── README.md              # This file\n├── requirements.txt       # Python dependencies\n├── pretrain.py           # Main training script\n├── evaluate.py           # Evaluation script\n├── train_sudoku.sh       # Quick training script\n├── models/               # Model implementation\n│   ├── common.py         # Common utilities\n│   ├── layers.py         # Neural network layers\n│   ├── losses.py         # Loss functions\n│   ├── sparse_embedding.py\n│   └── hrm/             # HRM specific modules\n├── data/                # Dataset directory\n│   └── sudoku-extreme/  # Sudoku dataset\n└── checkpoints/         # Saved models\n```\n\n## Differences from Original\n\nThis implementation is mathematically identical to the original with these adaptations for MLX:\n\n1. **Attention**: Standard scaled dot-product attention (no FlashAttention)\n2. **Buffers**: Uses `mx.stop_gradient()` instead of PyTorch buffers\n3. **Data Types**: Float32 throughout (MLX limitation for some operations)\n4. **Optimizers**: MLX's AdamW implementation\n5. 
**Checkpointing**: `.npz` format instead of PyTorch `.pt`\n\n## Advanced Usage\n\n### Custom Training\n\n```python\nfrom models.hrm import HierarchicalReasoningModel\nfrom pretrain import HRMTrainer\n\n# Create model with custom config\nmodel = HierarchicalReasoningModel(\n    vocab_size=vocab_size,\n    d_model=768,       # Larger model\n    H_cycles=4,        # More reasoning cycles\n    L_cycles=4,\n    halt_max_steps=16  # More computation time\n)\n\n# Train with custom settings\ntrainer = HRMTrainer(\n    model=model,\n    learning_rate=5e-5,\n    batch_size=64\n)\n```\n\n### Checkpointing\n\nThe trainer automatically:\n- Saves checkpoints every 10 steps\n- Keeps only the 2 most recent checkpoints\n- Saves best model based on validation accuracy\n- Supports auto-resume from latest checkpoint\n\n## Citation\n\nIf you use this implementation, please cite the original HRM paper:\n\n```bibtex\n@article{wang2025hierarchical,\n  title={Hierarchical Reasoning Model},\n  author={Wang, Guan and Li, Jin and Sun, Yuhao and Chen, Xing and Liu, Changling and Wu, Yue and Lu, Meng and Song, Sen and Yadkori, Yasin Abbasi},\n  journal={arXiv preprint arXiv:2506.21734},\n  year={2025}\n}\n```\n\n## Acknowledgments\n\n- Original HRM authors for the groundbreaking architecture\n- Apple MLX team for the excellent framework\n- The original implementation served as the exact reference\n\n## License\n\nThis implementation follows the same license as the original HRM repository.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdnakov%2Fhrm-mlx","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdnakov%2Fhrm-mlx","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdnakov%2Fhrm-mlx/lists"}