https://github.com/axect/forger

Forger: Reinforcement Learning Library in Rust

Host: GitHub
URL: https://github.com/axect/forger
Owner: Axect
License: apache-2.0
Created: 2023-11-04T14:52:10.000Z (over 1 year ago)
Default Branch: master
Last Pushed: 2023-11-17T07:01:36.000Z (over 1 year ago)
Last Synced: 2025-01-28T17:53:17.200Z (4 months ago)
Topics: forger, machine-learning, reinforcement-learning, rust
Language: Rust
Homepage:
Size: 1.41 MB
Stars: 3
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE-APACHE


# Forger - Reinforcement Learning Library in Rust

## Introduction

Forger is a Reinforcement Learning (RL) library written in Rust. Its modular design separates agents, environments, policies, and utilities, so individual pieces can be swapped when experimenting with RL models.

## Features

- **Modular Components**: Includes agents, environments, and policies as separate modules.

- **Efficient and Safe**: Built in Rust for performance and memory safety.

- **Customizable Environments**: Provides a framework to create and manage different RL environments.

- **Flexible Agent Implementations**: Supports various agent strategies and learning algorithms.

- **Extensible Policy Framework**: Allows for the implementation of diverse action selection policies.
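
To make the "customizable environments" idea concrete, here is a minimal sketch of a trait-based environment. The names `SimpleEnv`, `Corridor`, and `Move`, as well as the reward values, are hypothetical and chosen for illustration; this is not Forger's actual `Env` trait or `LineWorld` implementation.

```rust
/// Hypothetical environment interface (an assumption, not Forger's `Env` trait).
pub trait SimpleEnv {
    type State;
    type Action;

    /// State in which every episode starts.
    fn init_state(&self) -> Self::State;

    /// Returns the next state (`None` when the episode ends) and the reward.
    fn transition(
        &self,
        state: &Self::State,
        action: &Self::Action,
    ) -> (Option<Self::State>, f64);
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Move {
    Left,
    Right,
}

/// A corridor of `len` cells (assumed `len >= 1`).
/// The episode ends when the agent reaches `goal` or falls back to cell 0.
pub struct Corridor {
    pub len: usize,
    pub start: usize,
    pub goal: usize,
}

impl SimpleEnv for Corridor {
    type State = usize;
    type Action = Move;

    fn init_state(&self) -> usize {
        self.start
    }

    fn transition(&self, state: &usize, action: &Move) -> (Option<usize>, f64) {
        // Move one cell, clamped to the corridor bounds.
        let next = match action {
            Move::Left => (*state).saturating_sub(1),
            Move::Right => (*state + 1).min(self.len - 1),
        };
        if next == self.goal {
            (None, 1.0) // illustrative reward for reaching the goal
        } else if next == 0 {
            (None, -1.0) // illustrative penalty for the terminal cell
        } else {
            (Some(next), 0.0)
        }
    }
}
```

Returning `Option<State>` from `transition` lets the caller detect the end of an episode without a separate "is terminal" query.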

## Modules

1. **Policy (`policy`)**:

   - Defines the interface for action selection policies.

   - Includes an implementation of Epsilon Greedy (with Decay) Policy.

2. **Agent (`agent`)**:

   - Outlines the structure for RL agents.

   - Implements every-visit Monte Carlo agents for state values (`VEveryVisitMC`) and action values (`QEveryVisitMC`).

3. **Environment (`env`)**:

   - Provides the `Env` trait to define RL environments.

   - Contains `LineWorld`, a simple linear world environment for experimentation.

4. **Prelude (`prelude`)**:

   - Exports commonly used items from the `env`, `agent`, and `policy` modules for convenient access.
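
As a rough illustration of the epsilon-greedy policy with decay listed above, the sketch below picks a random action with probability epsilon and the highest-valued action otherwise. The types `DecayingEpsilonGreedy` and `XorShift64` are hypothetical, the decay is assumed to be multiplicative, and the tiny generator only stands in for a real RNG crate so that the example needs no dependencies. It is not Forger's implementation.

```rust
/// Minimal xorshift64 generator, used here only to avoid external crates.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The xorshift state must be non-zero.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Uniform sample in [0, 1).
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical epsilon-greedy policy with multiplicative decay (an assumption).
pub struct DecayingEpsilonGreedy {
    pub epsilon: f64,
    pub decay: f64,
}

impl DecayingEpsilonGreedy {
    pub fn new(epsilon: f64, decay: f64) -> Self {
        Self { epsilon, decay }
    }

    /// Returns an index into `values`: a random one with probability
    /// `epsilon`, otherwise the index of the largest value.
    pub fn select(&self, values: &[f64], rng: &mut XorShift64) -> usize {
        assert!(!values.is_empty(), "need at least one action");
        if rng.next_f64() < self.epsilon {
            // Explore: pick any action.
            (rng.next_u64() % values.len() as u64) as usize
        } else {
            // Exploit: pick the first action with the highest value.
            let mut best = 0;
            for (i, v) in values.iter().enumerate() {
                if *v > values[best] {
                    best = i;
                }
            }
            best
        }
    }

    /// Shrinks epsilon so that later episodes explore less.
    pub fn decay_epsilon(&mut self) {
        self.epsilon *= self.decay;
    }
}
```

With `epsilon = 0.5` and `decay = 0.95`, one call to `decay_epsilon` leaves epsilon at 0.475, so exploration fades gradually over episodes.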

## Getting Started

### Prerequisites

- Rust Programming Environment

### Installation

In your project directory, run the following command:

```shell
cargo add forger
```

### Basic Usage

```rust
use forger::prelude::*;
use forger::env::lineworld::{LineWorld, LineWorldAction};

pub type S = usize;             // State
pub type A = LineWorldAction;   // Action
pub type P = EGreedyPolicy<A>;  // Policy
pub type E = LineWorld;         // Environment

fn main() {
    let env = LineWorld::new(
        5,      // number of states
        1,      // initial state
        4,      // goal state
        vec![0] // terminal states
    );

    let mut agent = QEveryVisitMC::<S, A, P, E>::new(0.9); // Every-visit MC over Q values (gamma = 0.9)
    let mut policy = EGreedyPolicy::new(0.5, 0.95);        // Epsilon Greedy Policy (epsilon = 0.5, decay = 0.95)

    for _ in 0 .. 200 {
        // Collect one episode as (state, action, reward) triples.
        let mut episode = vec![];
        let mut state = env.get_init_state();

        loop {
            let action = agent.select_action(&state, &mut policy, &env);
            let (next_state, reward) = env.transition(&state, &action);
            episode.push((state, action.unwrap(), reward));
            match next_state {
                Some(s) => state = s,
                None => break, // episode ended
            }
        }

        // Update once per episode, then shrink epsilon.
        agent.update(&episode);
        policy.decay_epsilon();
    }
}
```
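
For readers curious what an every-visit Monte Carlo update over such an episode can look like, here is a self-contained sketch. It assumes the update averages the discounted return of every `(state, action)` visit. The `EveryVisitQ` type and its use of plain `usize` action ids are hypothetical, and this is not necessarily how `QEveryVisitMC::update` is implemented.

```rust
use std::collections::HashMap;

/// Hypothetical every-visit Monte Carlo estimator of action values.
pub struct EveryVisitQ {
    gamma: f64,
    /// (state, action) -> (mean return, visit count)
    table: HashMap<(usize, usize), (f64, u32)>,
}

impl EveryVisitQ {
    pub fn new(gamma: f64) -> Self {
        Self { gamma, table: HashMap::new() }
    }

    /// Consumes one finished episode of (state, action, reward) triples.
    pub fn update(&mut self, episode: &[(usize, usize, f64)]) {
        let mut g = 0.0; // discounted return, accumulated from the end
        for &(s, a, r) in episode.iter().rev() {
            g = r + self.gamma * g;
            let entry = self.table.entry((s, a)).or_insert((0.0, 0));
            entry.1 += 1;
            // Incremental mean over every visit, not just the first one.
            entry.0 += (g - entry.0) / entry.1 as f64;
        }
    }

    /// Current estimate; unseen pairs default to 0.0.
    pub fn value(&self, s: usize, a: usize) -> f64 {
        self.table.get(&(s, a)).map_or(0.0, |e| e.0)
    }
}
```

Walking the episode backwards means each return is computed in constant time from the one after it, which is why the update can wait until the episode has finished.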

## Examples

1. [**Monte Carlo with Epsilon Decay in `LineWorld`**](./examples/lineworld_mc_edecay.rs):

   - Demonstrates the use of the every-visit Monte Carlo agent over action values (`QEveryVisitMC`) with an Epsilon Greedy Policy (with decay) in the `LineWorld` environment.

   - Illustrates the process of running multiple episodes, selecting actions, updating the agent, and decaying the epsilon value over time.

   - Updates the agent after each episode.

2. [**TD0 with Epsilon Decay in `GridWorld`**](./examples/gridworld_td0_edecay.rs):

   - Demonstrates the use of the TD0 (`TD0`) agent with an Epsilon Greedy Policy (with decay) in the `GridWorld` environment.

   - Illustrates the process of running multiple episodes, selecting actions, updating the agent, and decaying the epsilon value over time.

   - Updates the agent at every step of each episode.

   - Includes a test run of the trained agent.
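
The difference between the two examples is when the agent learns: Monte Carlo waits for the end of the episode, while TD(0) updates after every transition. The sketch below shows the standard tabular TD(0) rule, V(s) ← V(s) + α (r + γ V(s') − V(s)). The `Td0Values` type, its `alpha` parameter, and the `update_step` signature are hypothetical and do not describe Forger's `TD0` agent API.

```rust
use std::collections::HashMap;

/// Hypothetical tabular TD(0) state-value learner.
pub struct Td0Values {
    alpha: f64, // learning rate
    gamma: f64, // discount factor
    values: HashMap<usize, f64>,
}

impl Td0Values {
    pub fn new(alpha: f64, gamma: f64) -> Self {
        Self { alpha, gamma, values: HashMap::new() }
    }

    /// Current estimate; unseen states default to 0.0.
    pub fn value(&self, s: usize) -> f64 {
        *self.values.get(&s).unwrap_or(&0.0)
    }

    /// Applies one TD(0) update for the transition `s -> next` with reward `r`.
    /// `next == None` marks the end of the episode, so nothing is bootstrapped.
    pub fn update_step(&mut self, s: usize, r: f64, next: Option<usize>) {
        let bootstrap = next.map_or(0.0, |n| self.value(n));
        let current = self.value(s);
        let target = r + self.gamma * bootstrap;
        self.values.insert(s, current + self.alpha * (target - current));
    }
}
```

Calling `update_step` inside the episode loop, right after each transition, gives the per-step update pattern described for the `GridWorld` example.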

## Contributing

Contributions to Forger are welcome. To contribute, fork the repository, work on a feature branch, and open a pull request.

## License

Forger is licensed under the MIT License or the Apache 2.0 License.
