https://github.com/kennethleungty/deepseek-r1-ollama-simple-evals

Run and Evaluate DeepSeek-R1 Distilled Models Locally with Ollama and OpenAI's simple-evals
# Evaluating DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI's simple-evals

## Context
- The recent launch of DeepSeek-R1 sent ripples across the global AI community. It delivered reasoning performance on par with leading models such as OpenAI's o1, reportedly at a fraction of the training time and cost.
- Beyond the headlines and online buzz, how can we assess the model's reasoning abilities against recognized benchmarks?
- DeepSeek's chat interface makes it easy to explore the model's capabilities, but using it programmatically offers deeper insight and smoother integration into real-world applications.
- Running such models locally also gives us greater control and offline access.
- In this project, we explore how to use Ollama and OpenAI's simple-evals to evaluate the reasoning capabilities of DeepSeek-R1's distilled models on GPQA-Diamond, a benchmark of graduate-level, "Google-proof" science questions.

**More details coming soon!**