https://github.com/juliareinforcementlearning/reinforcementlearningtrajectories.jl

A generalized experience replay buffer for reinforcement learning

Host: GitHub
URL: https://github.com/juliareinforcementlearning/reinforcementlearningtrajectories.jl
Owner: JuliaReinforcementLearning
License: mit
Created: 2022-04-06T15:45:46.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2024-05-11T12:53:27.000Z (over 1 year ago)
Last Synced: 2024-10-24T06:16:24.778Z (about 1 year ago)
Language: Julia
Homepage:
Size: 269 KB
Stars: 8
Watchers: 4
Forks: 8
Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE

# ReinforcementLearningTrajectories

[![Build Status](https://github.com/JuliaReinforcementLearning/ReinforcementLearningTrajectories.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/JuliaReinforcementLearning/ReinforcementLearningTrajectories.jl/actions/workflows/CI.yml?query=branch%3Amain)

[![Coverage](https://codecov.io/gh/JuliaReinforcementLearning/ReinforcementLearningTrajectories.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/JuliaReinforcementLearning/ReinforcementLearningTrajectories.jl)

[![PkgEval](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/T/Trajectories.svg)](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/report.html)

## Design

The relationship of several concepts provided in this package:

```
┌───────────────────────────────────┐
│ Trajectory                        │
│ ┌───────────────────────────────┐ │
│ │ EpisodesBuffer wrapping an    │ │
│ │ AbstractTraces                │ │
│ │             ┌───────────────┐ │ │
│ │ :trace_A => │ AbstractTrace │ │ │
│ │             └───────────────┘ │ │
│ │                               │ │
│ │             ┌───────────────┐ │ │
│ │ :trace_B => │ AbstractTrace │ │ │
│ │             └───────────────┘ │ │
│ │  ...             ...          │ │
│ └───────────────────────────────┘ │
│          ┌───────────┐            │
│          │  Sampler  │            │
│          └───────────┘            │
│         ┌────────────┐            │
│         │ Controller │            │
│         └────────────┘            │
└───────────────────────────────────┘
```

## `Trajectory`

A `Trajectory` contains 3 parts:

- A `container` to store data (usually an `AbstractTraces`).

- A `sampler` to determine how to sample a batch from the `container`.

- A `controller` to decide when to sample a new batch from the `container`.

Typical usage:

```julia
julia> using ReinforcementLearningTrajectories

julia> t = Trajectory(Traces(a=Int[], b=Bool[]), BatchSampler(3), InsertSampleRatioController(1.0, 3));

julia> for i in 1:5
           push!(t, (a=i, b=iseven(i)))
       end

julia> for batch in t
           println(batch)
       end
(a = [4, 5, 1], b = Bool[1, 0, 0])
(a = [3, 2, 4], b = Bool[0, 1, 1])
(a = [4, 1, 2], b = Bool[1, 0, 1])
```
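To make the division of labour between the three parts more concrete, here is a self-contained toy sketch that does not use this package at all. Every name in it (`ToyTrajectory`, `ToyBatchSampler`, `ToyRatioController`, `can_sample!`, `sample_batch`, `drain!`) is hypothetical, and the controller rule is an assumption chosen only because it yields three batches for five inserts with a ratio of `1.0` and a threshold of `3`, as in the example above. It should not be read as the package's actual implementation.

```julia
using Random

# Hypothetical controller: permits sampling once `threshold` items have been
# inserted, then keeps the sample count within `ratio` times the inserts
# made past the threshold. This rule is an assumption, not the package's code.
mutable struct ToyRatioController
    ratio::Float64
    threshold::Int
    n_inserted::Int
    n_sampled::Int
end
ToyRatioController(ratio, threshold) = ToyRatioController(ratio, threshold, 0, 0)

# Hypothetical sampler: draws `batch_size` indices uniformly with replacement.
struct ToyBatchSampler
    batch_size::Int
end

# Hypothetical trajectory: a NamedTuple of vectors plus a sampler and a controller.
struct ToyTrajectory{C<:NamedTuple}
    container::C
    sampler::ToyBatchSampler
    controller::ToyRatioController
end

# Inserting appends one value to every trace and notifies the controller.
function Base.push!(t::ToyTrajectory, x::NamedTuple)
    for k in keys(x)
        push!(t.container[k], x[k])
    end
    t.controller.n_inserted += 1
    return t
end

# The controller decides whether another batch may be drawn right now.
function can_sample!(c::ToyRatioController)
    c.n_inserted >= c.threshold || return false
    c.n_sampled <= (c.n_inserted - c.threshold) * c.ratio || return false
    c.n_sampled += 1
    return true
end

# The sampler picks random indices and gathers them from every trace.
function sample_batch(rng::AbstractRNG, s::ToyBatchSampler, container::NamedTuple)
    n = length(first(container))
    inds = rand(rng, 1:n, s.batch_size)
    return map(trace -> trace[inds], container)
end

# Keep drawing batches until the controller refuses.
function drain!(rng::AbstractRNG, t::ToyTrajectory)
    batches = []
    while can_sample!(t.controller)
        push!(batches, sample_batch(rng, t.sampler, t.container))
    end
    return batches
end

t = ToyTrajectory((a=Int[], b=Bool[]), ToyBatchSampler(3), ToyRatioController(1.0, 3))
for i in 1:5
    push!(t, (a=i, b=iseven(i)))
end
batches = drain!(MersenneTwister(0), t)
println(length(batches))  # number of batches the toy controller allowed
```

In this sketch the container only stores data, the sampler only knows how to pick indices, and the controller only counts inserts and samples. Keeping those concerns separate is what lets each part be swapped independently.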

**Traces**

- `Traces`

- `MultiplexTraces`

- `CircularSARTTraces`

- `NormalizedTraces`

**Samplers**

- `BatchSampler`

- `MetaSampler`

- `MultiBatchSampler`

- `EpisodesSampler`

**Controllers**

- `InsertSampleRatioController` 

- `AsyncInsertSampleRatioController`

Please refer to the tests for common usage. (TODO: generate docs and add links to the data structures above.)

## Acknowledgement

This async version is mainly inspired by [deepmind/reverb](https://github.com/deepmind/reverb).
