Toy O
- Host: GitHub
- URL: https://github.com/hunkim/o
- Owner: hunkim
- License: MIT
- Created: 2024-09-14T20:18:41.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-21T08:27:13.000Z (over 1 year ago)
- Language: Python
- Size: 34.2 KB
- Stars: 16
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# o: Step-by-Step Reasoning Demos
This repository contains toy demos for step-by-step reasoning, inspired by:
- [STaR paper](https://arxiv.org/abs/2203.14465)
- OpenAI's o1
- Reflection techniques
- SkunkworksAI/reasoning-0.01 dataset
- Implementation from [bklieger-groq/g1](https://github.com/bklieger-groq/g1)
- And many more
## Quick Demo
- toy-o1: https://toy-o1.streamlit.app/
- toy-o2: https://toy-o2.streamlit.app/

## Overview
We present two approaches:
1. **o1**: Uses fixed reasoning steps drawn from the SkunkworksAI/reasoning-0.01 dataset; it can also leverage web search.
2. **o2**: Relies on the LLM's basic planning skills (code mostly reused from bklieger-groq/g1).
Both approaches have their pros and cons, providing interesting comparisons.
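As a rough illustration of the o2 (g1-style) loop, here is a minimal sketch: the model is asked for one JSON reasoning step at a time until it signals a final answer. The stub below stands in for a real LLM call, and the function and field names are illustrative, not the repo's actual code:

```python
import json

def call_llm(messages):
    # Stub standing in for a real chat-model call (e.g. Solar-Pro via LangChain).
    # A real implementation would send `messages` to the model and parse its JSON reply.
    step = len([m for m in messages if m["role"] == "assistant"]) + 1
    if step < 3:
        return json.dumps({"title": f"Step {step}", "content": "reason...", "next_action": "continue"})
    return json.dumps({"title": "Answer", "content": "42", "next_action": "final_answer"})

def reason(question, max_steps=10):
    # Collect one reasoning step per round until the model says it is done.
    messages = [
        {"role": "system", "content": "Reply in JSON with title, content, next_action."},
        {"role": "user", "content": question},
    ]
    steps = []
    for _ in range(max_steps):
        reply = json.loads(call_llm(messages))
        steps.append((reply["title"], reply["content"]))
        messages.append({"role": "assistant", "content": json.dumps(reply)})
        if reply["next_action"] == "final_answer":
            break
    return steps

steps = reason("What is 6 * 7?")
```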
We use [Solar-Pro Preview](https://huggingface.co/upstage/solar-pro-preview-instruct) as the base LLM, but you can swap in other models via LangChain.
## Running Locally
1. Clone this repository
2. Add `UPSTAGE_API_KEY` to your environment variables (via `.streamlit/secrets.toml` or a `.env` file)
3. Run `make o1` or `make o2`
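For step 2, one option is a `.env` file at the repo root (the value below is a placeholder, not a real key):

```
UPSTAGE_API_KEY=your_api_key_here
```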
## Changing Base LLMs
Replace the following code with your preferred LLM:
```python
from langchain_upstage import ChatUpstage as Chat

llm = Chat(model="solar-pro")
```
## Limitations
Solar-Pro Preview is a 22B-parameter model with a 4K-token context window; see https://huggingface.co/upstage/solar-pro-preview-instruct for details.
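Because the context window is only 4K tokens, long reasoning traces may need trimming. A crude character-based guard (a sketch, not the repo's code; real systems would count tokens with a proper tokenizer) could look like:

```python
def fit_context(messages, budget_chars=12000):
    # Keep the system prompt plus as many *recent* messages as fit the budget.
    # Rough heuristic: ~3 characters per token for a 4K-token window.
    system, rest = messages[0], messages[1:]
    kept, used = [], len(system["content"])
    for m in reversed(rest):
        if used + len(m["content"]) > budget_chars:
            break
        kept.append(m)
        used += len(m["content"])
    return [system] + list(reversed(kept))
```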
## Contributions
Comments and pull requests are always welcome.