https://github.com/AlgorithmicResearchGroup/ML-Research-Agent-Public

Public, general purpose agent for ML Research Benchmark. This agent provides a foundation for comparing and evaluating machine learning research and development tasks that agents can perform.

# ML Research Benchmark Baseline Agent

This is our public baseline research and development agent: a simple, single-agent system that uses a task planner and a set of tools to perform AI and machine learning tasks. It provides a foundation for comparing and evaluating the machine learning research and development tasks that agents can perform.

## Features
- Supports multiple AI/ML tasks
- Compatible with different LLM providers (OpenAI, Anthropic)
- Dockerized for easy deployment and reproducibility

[![Example Video](./img/example1.png)](https://www.youtube.com/watch?v=Xhpe8MHk56w)

## Available Tools

The ML Research Benchmark Baseline Agent comes equipped with tools for a range of AI and machine learning tasks:

1. **Bash Tool**: Executes bash commands and scripts.

2. **Code Tool**: Manages code operations including writing, inserting, replacing, and deleting code.

3. **GitHub Tool**: Interacts with GitHub repositories to get README files, list files, and retrieve file contents.

4. **Semantic Scholar Tool**: Searches for academic papers, retrieves paper details, citations, and downloads papers.

5. **Python Tool**: Executes Python code.

6. **Return Function Tool**: Handles task completion.

7. **Scratchpad Tool**: Provides a scratchpad for experiment note-taking and temporary storage.

8. **Thought Tool**: Allows the agent to process and record thoughts.

9. **Long-Term Memory Tool**: Manages long-term memory storage and retrieval.

These tools can be used individually or in combination to tackle a wide range of AI research and benchmark tasks. The agent can seamlessly switch between tools as needed for complex operations.

## Prerequisites

- Python 3.x
- Docker (for containerized execution)

## Installation

1. Clone this repository:
```bash
git clone https://github.com/AlgorithmicResearchGroup/ML-Research-Agent-Public.git
cd ML-Research-Agent-Public
```

2. Install dependencies:
```bash
pip install -r requirements.txt
```

## Usage

Step 1: Create a `.env` file that sets the following environment variables, with each value placed directly after the `=`:
```bash
OPENAI=
ANTHROPIC=
YOU_API_KEY=
GITHUB_ACCESS_TOKEN=
```
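Before starting a run, it can help to confirm that the `.env` file actually contains a value for each key. The following is a minimal sketch, not part of the repository: the function name `check_env_file` is hypothetical, and treating all four keys as required is an assumption, since the README does not say which keys a given provider needs.

```bash
#!/usr/bin/env bash
# Pre-flight check (illustrative): report keys from the README's .env template
# that are missing or empty in a given env file.
# Assumption: all four keys are treated as required.

REQUIRED_KEYS="OPENAI ANTHROPIC YOU_API_KEY GITHUB_ACCESS_TOKEN"

# Prints the name of each missing or empty key, one per line.
# Returns 0 when every key has a value, 1 otherwise.
check_env_file() {
  local env_file="$1" key value status=0
  for key in $REQUIRED_KEYS; do
    # Capture the text after "=" on the key's line, then strip whitespace.
    value=$(sed -n "s/^[[:space:]]*${key}[[:space:]]*=\(.*\)$/\1/p" "$env_file" \
      | head -n 1 | tr -d '[:space:]')
    if [ -z "$value" ]; then
      echo "$key"
      status=1
    fi
  done
  return "$status"
}
```

For example, `check_env_file .env || echo "fill in the keys listed above"` prints the unset keys followed by a reminder.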

### Running without Docker

Step 2a: Run the agent without Docker, passing the task prompt and the LLM provider:

```bash
python3 run.py --prompt "<your prompt>" --provider "<openai or anthropic>"
```
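A thin wrapper can catch an empty prompt or a mistyped provider before `run.py` starts. This is a sketch under assumptions: the function `run_agent` and the `DRY_RUN` switch are hypothetical, and the provider identifiers `openai` and `anthropic` are assumed from the feature list (only `openai` appears in the README's examples).

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around run.py; not part of the repository.

run_agent() {
  local prompt="$1" provider="$2"
  if [ -z "$prompt" ]; then
    echo "error: prompt must not be empty" >&2
    return 2
  fi
  case "$provider" in
    openai|anthropic) ;;  # assumed identifiers; "anthropic" is not confirmed by the README
    *)
      echo "error: unsupported provider '$provider'" >&2
      return 2
      ;;
  esac
  if [ -n "${DRY_RUN:-}" ]; then
    # Print the command instead of executing it.
    printf 'python3 run.py --prompt "%s" --provider "%s"\n' "$prompt" "$provider"
  else
    python3 run.py --prompt "$prompt" --provider "$provider"
  fi
}
```

Calling `DRY_RUN=1 run_agent "train an mlp on the mnist dataset" openai` prints the command that would be executed without running it.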

### Running with Docker

Step 2b: Build the image, then run the agent with Docker.

Build for CPU:
```bash
docker build --build-arg BASE_IMAGE=ubuntu:22.04 -t <image_name> .
```

Build for GPU:
```bash
docker build --build-arg BASE_IMAGE=nvidia/cuda:12.2.2-cudnn8-runtime-ubuntu22.04 -t <image_name> .
```
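The two build commands differ only in the `BASE_IMAGE` build argument, so a small helper can select it from a target name. This is an illustrative sketch: `select_base_image` is a hypothetical function, and the image tag in the usage comment is a placeholder rather than a name taken from the repository.

```bash
#!/usr/bin/env bash
# Map a build target to the BASE_IMAGE value used in the build commands above.

select_base_image() {
  case "$1" in
    cpu) echo "ubuntu:22.04" ;;
    gpu) echo "nvidia/cuda:12.2.2-cudnn8-runtime-ubuntu22.04" ;;
    *)
      echo "error: target must be 'cpu' or 'gpu'" >&2
      return 2
      ;;
  esac
}

# Usage (the tag "ml-research-agent" is a placeholder):
# docker build --build-arg BASE_IMAGE="$(select_base_image gpu)" -t ml-research-agent .
```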

Run the image with `run.sh`:

```bash
bash run.sh <image_name> \
<prompt> \
<provider> \
<"cpu" or gpu_ids eg. 0> \
\
<path/to/.env>
```

Example on CPU:
```bash
bash run.sh ghcr.io/algorithmicresearchgroup/ml-research-agent-public \
"train an mlp on the mnist dataset" \
openai \
"cpu" \
\
/root/ML-Research-Agent-Public/.env
```

Example on GPU:
```bash
bash run.sh ghcr.io/algorithmicresearchgroup/ml-research-agent-public \
"train an mlp on the mnist dataset" \
openai \
0 \
\
/path/to/.env
```

## Contributing

Contributions to improve the baseline agent or add new tasks are welcome. Please submit a pull request or open an issue to discuss proposed changes.

## License

AGPL-3.0

## Contact

For questions or support, please contact Algorithmic Research Group at [email protected]