https://github.com/axect/forger
Forger: Reinforcement Learning Library in Rust
https://github.com/axect/forger
forger machine-learning reinforcement-learning rust
Last synced: 4 months ago
JSON representation
Forger: Reinforcement Learning Library in Rust
- Host: GitHub
- URL: https://github.com/axect/forger
- Owner: Axect
- License: apache-2.0
- Created: 2023-11-04T14:52:10.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2023-11-17T07:01:36.000Z (over 1 year ago)
- Last Synced: 2025-01-28T17:53:17.200Z (4 months ago)
- Topics: forger, machine-learning, reinforcement-learning, rust
- Language: Rust
- Homepage:
- Size: 1.41 MB
- Stars: 3
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE-APACHE
Awesome Lists containing this project
README
# Forger - Reinforcement Learning Library in Rust
![]()
## Introduction
Forger is a Reinforcement Learning (RL) library in Rust, offering a robust and efficient framework for implementing RL algorithms. It features a modular design with components for agents, environments, policies, and utilities, facilitating easy experimentation and development of RL models.
## Features
- **Modular Components**: Includes agents, environments, and policies as separate modules.
- **Efficient and Safe**: Built in Rust, ensuring high performance and safety.
- **Customizable Environments**: Provides a framework to create and manage different RL environments.
- **Flexible Agent Implementations**: Supports various agent strategies and learning algorithms.
- **Extensible Policy Framework**: Allows for the implementation of diverse action selection policies.## Modules
1. **Policy (`policy`)**:
- Defines the interface for action selection policies.
- Includes an implementation of Epsilon Greedy (with Decay) Policy.2. **Agent (`agent`)**:
- Outlines the structure for RL agents.
- Implements Value Iteration - Every Visit Monte Carlo (`VEveryVisitMC`) and Q-Learning - Every Visit Monte Carlo (`QEveryVisitMC`).3. **Environment (`env`)**:
- Provides the `Env` trait to define RL environments.
- Contains `LineWorld`, a simple linear world environment for experimentation.4. **Prelude (`prelude`)**:
- Exports commonly used items from the `env`, `agent`, and `policy` modules for convenient access.
## Getting Started
### Prerequisites
- Rust Programming Environment
### Installation
In your project directory, run the following command:
```shell
cargo add forger
```### Basic Usage
```rust
use forger::prelude::*;
use forger::env::lineworld::{LineWorld, LineWorldAction};pub type S = usize; // State
pub type A = LineWorldAction; // Action
pub type P = EGreedyPolicy; // Policy
pub type E = LineWorld; // Environmentfn main() {
let env = LineWorld::new(
5, // number of states
1, // initial state
4, // goal state
vec![0] // terminal states
);let mut agent = QEveryVisitMC::::new(0.9); // Q-learning (Everyvisit MC, gamma = 0.9)
let mut policy = EGreedyPolicy::new(0.5, 0.95); // Epsilon Greedy Policy (epsilon = 0.5, decay = 0.95)for _ in 0 .. 200 {
let mut episode = vec![];
let mut state = env.get_init_state();loop {
let action = agent.select_action(&state, &mut policy, &env);
let (next_state, reward) = env.transition(&state, &action);
episode.push((state, action.unwrap(), reward));
match next_state {
Some(s) => state = s,
None => break,
}
}agent.update(&episode);
policy.decay_epsilon();
}
}
```## Examples
1. [**Monte Carlo with Epsilon Decay in `LineWorld`**](./examples/lineworld_mc_edecay.rs):
- Demonstrates the use of the Q-Learning Every Visit Monte Carlo (`QEveryVisitMC`) agent with an Epsilon Greedy Policy (with decay) in the `LineWorld` environment.
- Illustrates the process of running multiple episodes, selecting actions, updating the agent, and decaying the epsilon value over time.
- Updates the agent after each episode.2. [**TD0 with Epsilon Decay in `GridWorld`**](./examples/gridworld_td0_edecay.rs):
- Demonstrates the use of the TD0 (`TD0`) agent with an Epsilon Greedy Policy (with decay) in the `GridWorld` environment.
- Illustrates the process of running multiple episodes, selecting actions, updating the agent, and decaying the epsilon value over time.
- Updates the agent every steps in each episode.
- Include test process of trained agent.## Contributing
Contributions to Forger are welcome! If you'd like to contribute, please fork the repository and use a feature branch. Pull requests are warmly welcome.
## License
Forger is licensed under the MIT License or the Apache 2.0 License.