https://github.com/filimoa/rd-tablebench
- Host: GitHub
- URL: https://github.com/filimoa/rd-tablebench
- Owner: Filimoa
- Created: 2025-01-14T22:42:53.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2025-01-14T23:45:35.000Z (over 1 year ago)
- Last Synced: 2025-03-17T18:02:05.523Z (over 1 year ago)
- Language: Python
- Size: 19.5 KB
- Stars: 35
- Watchers: 1
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
# RD Table Bench Invocation + Grading Code
This repo contains the code for invoking each provider and grading the results. It is a fork of Reducto's original repo, https://github.com/reductoai/rd-tablebench, with improved support for different LLM providers (OpenAI, Anthropic, Gemini).
The proprietary models that Reducto implemented have not been touched and will not work with the grading CLI.
## Installing Dependencies
```
pip install -r requirements.txt
```
## Downloading Data
See the dataset README on Hugging Face for download instructions: https://huggingface.co/datasets/reducto/rd-tablebench/blob/main/README.md
## Env Vars
Create a `.env` file with the following:
```
# directory where the huggingface dataset is downloaded
INPUT_DIR=
# directory where the output will be saved
OUTPUT_DIR=
# note: only need keys for providers you want to use
OPENAI_API_KEY=
GEMINI_API_KEY=
ANTHROPIC_API_KEY=
...
```
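Before running anything, it can help to confirm that the required entries are actually filled in. The following is a minimal sketch using only the standard library; `parse_env_text` and `missing_settings` are hypothetical helpers for illustration and are not part of this repo, which may load the file differently.

```python
# Hypothetical helpers for sanity-checking a .env file; not part of this repo.

def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and lines without '='."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(values, required=("INPUT_DIR", "OUTPUT_DIR")):
    """Return the required names that are absent or empty."""
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    from pathlib import Path

    env_path = Path(".env")
    if env_path.exists():
        missing = missing_settings(parse_env_text(env_path.read_text()))
        print("Missing settings:", missing or "none")
    else:
        print("No .env file found in the current directory")
```

Only `INPUT_DIR` and `OUTPUT_DIR` are treated as required here, since API keys are needed only for the providers you intend to use.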
## Parsing
`python -m providers.llm --model gemini-2.0-flash-exp --num-workers 10`
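The `--num-workers` flag suggests that files are parsed concurrently. As a rough illustration of that idea only, the sketch below fans a list of input files out over a thread pool. The names `parse_one` and `parse_all` are hypothetical, the provider call is replaced by a placeholder, and the actual implementation in `providers.llm` may differ.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def parse_one(path: Path) -> str:
    """Placeholder for a provider call (assumption: one request per input file)."""
    return f"<table><!-- parsed {path.name} --></table>"


def parse_all(paths, num_workers: int = 10, parse_fn=parse_one) -> dict[str, str]:
    """Run parse_fn over all paths with a bounded pool of worker threads."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map preserves input order, so results line up with paths.
        results = list(pool.map(parse_fn, paths))
    return {p.name: html for p, html in zip(paths, results)}


if __name__ == "__main__":
    demo = parse_all([Path("a.jpg"), Path("b.jpg")], num_workers=2)
    for name, html in demo.items():
        print(name, html)
```

Threads suit this kind of workload because each task mostly waits on a network response; a higher worker count trades throughput against provider rate limits.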
## Grading
`python -m grade_cli --model gemini-2.0-flash-exp`