https://github.com/lucidrains/scaling-vin-pytorch
Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group
- Host: GitHub
- URL: https://github.com/lucidrains/scaling-vin-pytorch
- Owner: lucidrains
- License: mit
- Created: 2024-09-15T15:52:46.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-23T20:01:40.000Z (about 1 year ago)
- Last Synced: 2025-02-19T01:09:31.489Z (8 months ago)
- Topics: artificial-intelligence, deep-learning, planning, value-iteration-networks
- Language: Python
- Homepage:
- Size: 1.1 MB
- Stars: 36
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
## Scaling Value Iteration Networks
Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group
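For context, the planning computation that value iteration networks build on is the classical value iteration backup. The sketch below is plain Python on a toy grid, not code from this repository or the paper; the function names `value_iteration` and `greedy_move`, the deterministic four-way moves, and the reward-on-entry convention are all assumptions made for illustration. A value iteration network replaces this hand-written backup with learned convolutions followed by a max over action channels, so the planner can be trained end to end (Tamar et al., 2016).

```python
# Illustrative only: tabular value iteration on a small grid world.
# All names and conventions here are hypothetical, not the library's API.

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def value_iteration(reward, gamma=0.9, num_iters=50):
    """Repeatedly apply V(s) = max_a [ R(s') + gamma * V(s') ].

    reward[r][c] is the reward for entering cell (r, c).
    Moves that would leave the grid are not allowed.
    """
    rows, cols = len(reward), len(reward[0])
    value = [[0.0] * cols for _ in range(rows)]
    for _ in range(num_iters):
        new_value = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                q_values = []
                for dr, dc in MOVES:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        q_values.append(reward[nr][nc] + gamma * value[nr][nc])
                # the max over actions is the step a VIN implements
                # as a max over the action channels
                new_value[r][c] = max(q_values)
        value = new_value
    return value


def greedy_move(value, reward, position, gamma=0.9):
    """Pick the move with the highest one-step lookahead value."""
    rows, cols = len(value), len(value[0])
    r, c = position
    best_move, best_q = None, float("-inf")
    for dr, dc in MOVES:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            q = reward[nr][nc] + gamma * value[nr][nc]
            if q > best_q:
                best_move, best_q = (dr, dc), q
    return best_move


# a 3 x 3 grid with a single rewarding cell in the bottom-right corner
reward = [[0.0] * 3 for _ in range(3)]
reward[2][2] = 1.0

value = value_iteration(reward)
move = greedy_move(value, reward, (1, 2))
```

Each pass of the outer loop spreads reward information one cell further across the grid, which is why long-horizon planning calls for many iterations, or, in the network setting, many layers.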
## Usage
```python
import torch
from scaling_vin_pytorch import ScalableVIN

scalable_vin = ScalableVIN(
    state_dim = 3,
    reward_dim = 2,
    num_actions = 10
)

# a batch of 2 state and reward maps on a 32 x 32 grid

state = torch.randn(2, 3, 32, 32)
reward = torch.randn(2, 2, 32, 32)

agent_positions = torch.randint(0, 32, (2, 2))
target_actions = torch.randint(0, 10, (2,))

# passing target actions returns a loss for training

loss = scalable_vin(
    state,
    reward,
    agent_positions,
    target_actions
)

loss.backward()

# without target actions, the action logits are returned

action_logits = scalable_vin(
    state,
    reward,
    agent_positions
)
```

## Citations
```bibtex
@article{Wang2024ScalingVI,
title = {Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning},
author = {Yuhui Wang and Qingyuan Wu and Weida Li and Dylan R. Ashley and Francesco Faccio and Chao Huang and J{\"u}rgen Schmidhuber},
journal = {ArXiv},
year = {2024},
volume = {abs/2406.08404},
url = {https://api.semanticscholar.org/CorpusID:270391752}
}
```

```bibtex
@misc{pflueger2018soft,
title = {Soft Value Iteration Networks for Planetary Rover Path Planning},
author = {Max Pflueger and Ali Agha and Gaurav S. Sukhatme},
year = {2018},
url = {https://openreview.net/forum?id=Sktm4zWRb},
}
```

```bibtex
@inproceedings{Tamar2016ValueIN,
title = {Value Iteration Networks},
author = {Aviv Tamar and Sergey Levine and Pieter Abbeel and Yi Wu and Garrett Thomas},
booktitle = {Neural Information Processing Systems},
year = {2016},
url = {https://api.semanticscholar.org/CorpusID:11374605}
}
```