https://github.com/tkellogg/lrm-reasoning

LRM answers to "if you're flying over a desert in a canoe and your wheels fall off, how many pancakes does it take to cover a dog house?"

# LRMs vs. The Absurdist Dog House

My family has long had this absurd riddle:

> If you're flying over a desert in a canoe and your wheels fall off, how many pancakes does it take to cover a dog house?

The answer is also absurd: "8 because lemons don't dance".

Initially, I sent this to [DeepSeek R1](deepseek-r1.md) and was blown away by how
extensive its thought trace was. I was also struck by how unconfident it was during the
trace, and how it flipped into a different mode, exuding confidence, when giving its final answer.

I've been using this question as a test whenever I see a new LRM (Large Reasoning Model,
an LLM that does chain-of-thought reasoning before answering). It's not quantifiable as
a benchmark, but it gives a feel for how the model behaves.

Click through the files in this repo. Each one is a different model, with the exception
of [7yo-nephew.md](7yo-nephew.md), which is my 7-year-old nephew. Also notable: [phi4](phi4.md)
is a regular LLM but seems to exhibit LRM behavior, and [SuperPrompt.gemini](SuperPrompt.gemini.md)
is regular Gemini prompted to do chain-of-thought reasoning.