https://github.com/juliareinforcementlearning/reinforcementlearningtrajectories.jl
A generalized experience replay buffer for reinforcement learning
- Host: GitHub
- URL: https://github.com/juliareinforcementlearning/reinforcementlearningtrajectories.jl
- Owner: JuliaReinforcementLearning
- License: mit
- Created: 2022-04-06T15:45:46.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-05-11T12:53:27.000Z (over 1 year ago)
- Last Synced: 2024-10-24T06:16:24.778Z (about 1 year ago)
- Language: Julia
- Size: 269 KB
- Stars: 8
- Watchers: 4
- Forks: 8
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
# ReinforcementLearningTrajectories
[CI](https://github.com/JuliaReinforcementLearning/ReinforcementLearningTrajectories.jl/actions/workflows/CI.yml?query=branch%3Amain)
[codecov](https://codecov.io/gh/JuliaReinforcementLearning/ReinforcementLearningTrajectories.jl)
[PkgEval](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/report.html)
## Design
The relationship of several concepts provided in this package:
```
┌───────────────────────────────────┐
│            Trajectory             │
│ ┌───────────────────────────────┐ │
│ │ EpisodesBuffer wrapping a     │ │
│ │ AbstractTraces                │ │
│ │             ┌───────────────┐ │ │
│ │ :trace_A => │ AbstractTrace │ │ │
│ │             └───────────────┘ │ │
│ │                               │ │
│ │             ┌───────────────┐ │ │
│ │ :trace_B => │ AbstractTrace │ │ │
│ │             └───────────────┘ │ │
│ │     ...             ...       │ │
│ └───────────────────────────────┘ │
│           ┌───────────┐           │
│           │  Sampler  │           │
│           └───────────┘           │
│          ┌────────────┐           │
│          │ Controller │           │
│          └────────────┘           │
└───────────────────────────────────┘
```
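To make the nesting in the diagram concrete, here is a toy model of the container layer in plain Julia. It is only a sketch: `ToyTraces`, `toy_push!`, `toy_length`, and `toy_getindex` are hypothetical names invented for illustration, not types or functions from this package. It assumes that a container maps trace names to equally long vectors, and that pushing one step appends one element to each named trace.

```julia
# Toy model only: every name below is hypothetical and is NOT part of
# ReinforcementLearningTrajectories.jl.
struct ToyTraces{T<:NamedTuple}
    traces::T   # e.g. (a = Int[], b = Bool[]); one vector per named trace
end

# Append one step: each field of `step` goes to the trace with the same name.
function toy_push!(t::ToyTraces, step::NamedTuple)
    for name in keys(step)
        push!(getfield(t.traces, name), getfield(step, name))
    end
    return t
end

# Number of stored steps (all traces are assumed to have equal length).
toy_length(t::ToyTraces) = length(first(t.traces))

# Read step `i` back as a NamedTuple with the same field names.
toy_getindex(t::ToyTraces, i::Integer) = map(v -> v[i], t.traces)

container = ToyTraces((a = Int[], b = Bool[]))
for i in 1:5
    toy_push!(container, (a = i, b = iseven(i)))
end
```

In this sketch, `toy_getindex(container, 4)` returns the fourth step with the same field names that were pushed. The sampler and controller boxes in the diagram sit outside the container and only decide how and when to read from it.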
## `Trajectory`
A `Trajectory` contains 3 parts:
- A `container` to store data (usually an `AbstractTraces`).
- A `sampler` to determine how to sample a batch from the `container`.
- A `controller` to decide when to sample a new batch from the `container`.
Typical usage:
```julia
julia> using ReinforcementLearningTrajectories
julia> t = Trajectory(Traces(a=Int[], b=Bool[]), BatchSampler(3), InsertSampleRatioController(1.0, 3));
julia> for i in 1:5
           push!(t, (a=i, b=iseven(i)))
       end
julia> for batch in t
           println(batch)
       end
(a = [4, 5, 1], b = Bool[1, 0, 0])
(a = [3, 2, 4], b = Bool[0, 1, 1])
(a = [4, 1, 2], b = Bool[1, 0, 1])
```
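One way to picture how a controller decides when a batch may be drawn is the toy model below, written in plain Julia. It is a sketch under stated assumptions, not the package's implementation: `ToyRatioController`, `on_insert!`, and `try_sample!` are hypothetical names. The gating rule is assumed to be "no sampling before `threshold` inserts, then roughly `ratio` batches per additional insert", and batches are assumed to be drawn uniformly with replacement.

```julia
using Random

# Toy model only: hypothetical names, NOT the package's implementation.
mutable struct ToyRatioController
    ratio::Float64     # batches allowed per insert once past the threshold
    threshold::Int     # number of inserts required before any sampling
    n_inserted::Int
    n_sampled::Int
end
ToyRatioController(ratio, threshold) = ToyRatioController(ratio, threshold, 0, 0)

# Record one insert.
on_insert!(c::ToyRatioController) = (c.n_inserted += 1; c)

# Return true (and count the batch) if another batch may be drawn now.
function try_sample!(c::ToyRatioController)
    c.n_inserted >= c.threshold || return false
    c.n_sampled <= (c.n_inserted - c.threshold) * c.ratio || return false
    c.n_sampled += 1
    return true
end

controller = ToyRatioController(1.0, 3)
data = Int[]
rng = MersenneTwister(0)
batches = Vector{Vector{Int}}()

# Insert five items first, as in the usage example above.
for i in 1:5
    push!(data, i)
    on_insert!(controller)
end

# Draw batches of size 3 for as long as the controller allows it.
while try_sample!(controller)
    push!(batches, rand(rng, data, 3))  # uniform, with replacement (assumed)
end
```

Under these assumptions, five inserts with a threshold of 3 and a ratio of 1.0 allow three batches, which is consistent with the three batches printed in the usage example. The contents of each batch depend on the random number generator.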
**Traces**
- `Traces`
- `MultiplexTraces`
- `CircularSARTTraces`
- `NormalizedTraces`
**Samplers**
- `BatchSampler`
- `MetaSampler`
- `MultiBatchSampler`
- `EpisodesSampler`
**Controllers**
- `InsertSampleRatioController`
- `AsyncInsertSampleRatioController`
Please refer to the tests for common usage. (TODO: generate docs and add links to the data structures above.)
## Acknowledgement
This async version is mainly inspired by [deepmind/reverb](https://github.com/deepmind/reverb).