https://github.com/shaheennabi/shaheennabi
aspiring research engineer focused on reasoning, thinking models and reinforcement learning
- Host: GitHub
- URL: https://github.com/shaheennabi/shaheennabi
- Owner: shaheennabi
- Created: 2024-08-08T06:31:17.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2026-05-13T14:38:46.000Z (25 days ago)
- Last Synced: 2026-05-13T16:38:38.757Z (25 days ago)
- Topics: ai-engineer, aws, data-scientist, devops, engineer, generative-ai-engineer, large-language-models, machine-learning-engineer, mlops, personal-readme, reasoning, reinforcement-learning, research, thinking-model
- Homepage:
- Size: 297 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Thanks for tuning in
---
# Who I am
```┌────────────────────────────────────────┐```
```│ research -- thinking, reasoning models │```
```└────────────────────────────────────────┘```
I study how large language models perform multi-step reasoning and how training and post-training methods can improve their reliability, efficiency, and scalability.
My work focuses on the post-training stack for LLMs: supervised fine-tuning (SFT), preference optimization, reinforcement learning methods such as RLVR, and inference-time compute strategies that improve reasoning without requiring larger models.
I'm also interested in the interpretability of reasoning models: understanding the internal mechanisms that support multi-step reasoning and diagnosing failures such as shortcut reasoning, reward hacking, and unfaithful chain-of-thought.
Currently building and open-sourcing implementations of reasoning-focused training pipelines and contributing to LLM infrastructure and post-training frameworks.
*I love SpaceX rockets*
