{"id":21901863,"url":"https://github.com/evanatyourservice/flat-sophia","last_synced_at":"2026-02-03T09:41:38.115Z","repository":{"id":253810008,"uuid":"843954254","full_name":"evanatyourservice/flat-sophia","owner":"evanatyourservice","description":"sophia optimizer further projected towards flat areas of loss landscape","archived":false,"fork":false,"pushed_at":"2024-12-19T00:23:01.000Z","size":275,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-03T00:11:15.559Z","etag":null,"topics":["jax","optax","optimization","second-order-optimization","sophia"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/evanatyourservice.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-08-17T23:30:41.000Z","updated_at":"2024-12-19T00:23:05.000Z","dependencies_parsed_at":"2025-07-17T17:18:28.640Z","dependency_job_id":"a1b11055-c4c5-4194-ac2f-b2d517004acc","html_url":"https://github.com/evanatyourservice/flat-sophia","commit_stats":null,"previous_names":["evanatyourservice/flat-sophia"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/evanatyourservice/flat-sophia","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evanatyourservice%2Fflat-sophia","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evanatyourservice%2Fflat-sophia/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evanatyourservic
e%2Fflat-sophia/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evanatyourservice%2Fflat-sophia/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/evanatyourservice","download_url":"https://codeload.github.com/evanatyourservice/flat-sophia/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evanatyourservice%2Fflat-sophia/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29039824,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-03T09:33:44.148Z","status":"ssl_error","status_checked_at":"2026-02-03T09:33:43.343Z","response_time":96,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["jax","optax","optimization","second-order-optimization","sophia"],"created_at":"2024-11-28T15:15:22.234Z","updated_at":"2026-02-03T09:41:38.100Z","avatar_url":"https://github.com/evanatyourservice.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# flat-sophia\n\nSophia optimizer further projected towards flat areas of loss landscape\n\nIdeas come mainly from [this paper by Wang et al.](https://arxiv.org/abs/2405.20763)\n\nThey projected adam towards a flatter area using Hvp. 
Here, since sophia is already \nusing the Hvp, we keep a cheap int8 mask that further projects sophia's update towards\nflatter areas.\n\n\n## A small experiment\n\nrun_experiment.py is a sort of worst-case scenario experiment where a ViT is too \nwide and shallow and is prone to overfitting.\n\nThe baseline is the orange line and flat-sophia is the green line. Projecting updates towards flatter \nareas helped prevent overfitting and the rise in loss.\n\n![Loss](assets/loss.png)\n![Accuracy](assets/accuracy.png)\n\n\n## How it works\n\nThere are two pertinent values, `sharp_fraction` and `dampening_factor`. `sharp_fraction` \nis the fraction of sharpest updates that will be dampened, and `dampening_factor` is the \nfactor by which they'll be scaled down. The example uses `sharp_fraction=0.2` and \n`dampening_factor=10`.\n\nWhenever the preconditioner is updated, we also update the sharpness mask: entries for the largest \n`sharp_fraction` of Hvp values are set to `dampening_factor`, and the rest are set to 1. \nThe final update is then divided by this mask.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevanatyourservice%2Fflat-sophia","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fevanatyourservice%2Fflat-sophia","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevanatyourservice%2Fflat-sophia/lists"}