{"id":39495740,"url":"https://github.com/benchopt/benchmark_nanogpt","last_synced_at":"2026-01-18T05:42:10.267Z","repository":{"id":302528119,"uuid":"1012455676","full_name":"benchopt/benchmark_nanogpt","owner":"benchopt","description":null,"archived":false,"fork":false,"pushed_at":"2026-01-15T08:19:25.000Z","size":49,"stargazers_count":1,"open_issues_count":5,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-15T14:55:35.328Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/benchopt.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-07-02T11:07:20.000Z","updated_at":"2026-01-15T08:19:25.000Z","dependencies_parsed_at":"2025-07-02T22:18:56.398Z","dependency_job_id":"4c1b7f16-d34f-49c5-bf41-1ceee97f24a6","html_url":"https://github.com/benchopt/benchmark_nanogpt","commit_stats":null,"previous_names":["tommoral/benchmark_nanogpt","benchopt/benchmark_nanogpt"],"tags_count":0,"template":false,"template_full_name":"benchopt/template_benchmark","purl":"pkg:github/benchopt/benchmark_nanogpt","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benchopt%2Fbenchmark_nanogpt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benchopt%2Fbenchmark_nanogpt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benchopt%2Fbenchmark_nanogpt/relea
ses","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benchopt%2Fbenchmark_nanogpt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/benchopt","download_url":"https://codeload.github.com/benchopt/benchmark_nanogpt/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benchopt%2Fbenchmark_nanogpt/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28531341,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-18T00:39:45.795Z","status":"online","status_checked_at":"2026-01-18T02:00:07.578Z","response_time":98,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-18T05:42:10.212Z","updated_at":"2026-01-18T05:42:10.258Z","avatar_url":"https://github.com/benchopt.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\nBenchmarking deep learning optimization with nanoGPT\n====================================================\n|Build Status| |Python 3.10+|\n\nThis benchmark is dedicated to evaluating new deep learning optimization methods\non the nanoGPT architecture.\nThe optimization problem is defined as in the original speedrun of nanoGPT (see `modded nanogpt`_):\n\n- Training and validation are performed on `FineWeb`_ -- Do not change the dataloaders.\n- The training is stopped once the validation loss is below ``3.28``. 
(Still todo)\n\n\nFor now, the repository contains a single solver, Adam, and runs on CPU.\nThe dataloaders work, but only with a fixed sequence length of 128 tokens.\nWe use the original code from nanoGPT (`GPT2 from llm.c`_), but with the simple dataloader from `modded nanogpt`_.\n\nTODO:\n\n- Tweak the dataloaders to make them more efficient and less error-prone.\n- See if we want to add improvements to the architecture (QK-norm, Rotary embeddings, etc.).\n\nInstall\n--------\n\nThis benchmark can be run using the following commands:\n\n.. code-block::\n\n   $ pip install -U benchopt\n   $ git clone https://github.com/tomMoral/benchmark_nanogpt\n   $ benchopt run benchmark_nanogpt\n\nApart from the problem, options can be passed to ``benchopt run``, to restrict the benchmarks to some solvers or datasets, e.g.:\n\n.. code-block::\n\n\t$ benchopt run benchmark_nanogpt -s solver1 -d dataset2 --max-runs 10 --n-repetitions 10\n\n\nUse ``benchopt run -h`` for more details about these options, or visit https://benchopt.github.io/api.html.\n\n.. |Build Status| image:: https://github.com/tomMoral/benchmark_nanogpt/actions/workflows/main.yml/badge.svg\n   :target: https://github.com/tomMoral/benchmark_nanogpt/actions\n.. |Python 3.10+| image:: https://img.shields.io/badge/python-3.10%2B-blue\n   :target: https://www.python.org/downloads/release/python-3100/\n\n.. _FineWeb: https://huggingface.co/datasets/HuggingFaceFW/fineweb\n.. _modded nanogpt: https://github.com/KellerJordan/modded-nanogpt\n.. _GPT2 from llm.c: https://github.com/karpathy/llm.c\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenchopt%2Fbenchmark_nanogpt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbenchopt%2Fbenchmark_nanogpt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenchopt%2Fbenchmark_nanogpt/lists"}