{"id":29830826,"url":"https://github.com/finite-sample/beamgd","last_synced_at":"2025-07-29T10:11:52.876Z","repository":{"id":301026976,"uuid":"1007932946","full_name":"finite-sample/beamgd","owner":"finite-sample","description":"Beam Search Based Gradient Descent","archived":false,"fork":false,"pushed_at":"2025-06-24T19:14:59.000Z","size":80,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-24T20:22:59.426Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/finite-sample.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-24T18:58:41.000Z","updated_at":"2025-06-24T19:15:03.000Z","dependencies_parsed_at":"2025-06-24T20:33:03.845Z","dependency_job_id":null,"html_url":"https://github.com/finite-sample/beamgd","commit_stats":null,"previous_names":["finite-sample/beamgd"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/finite-sample/beamgd","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fbeamgd","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fbeamgd/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fbeamgd/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fbeamgd/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/finite-s
ample","download_url":"https://codeload.github.com/finite-sample/beamgd/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fbeamgd/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267668843,"owners_count":24124972,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-29T02:00:12.549Z","response_time":2574,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-29T10:11:40.913Z","updated_at":"2025-07-29T10:11:51.330Z","avatar_url":"https://github.com/finite-sample.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"### Beam-GD: Beam Search Based Gradient Descent\n\nMany classic algorithms in statistics and machine learning rely on greedy principles: forward stepwise regression, hierarchical clustering, CART. At each step, they make the locally optimal choice, often because it's fast and works well enough. But greed has limits. Early choices constrain later options. The path you pick shapes what you can see—and what you miss.\n\nIn combinatorial problems, researchers have long explored ways to relax greediness. Beam search is one such strategy. Rather than committing to the single best option at each step, beam search keeps the top-k candidates and continues with all of them in parallel, selecting the best downstream performers. 
This allows for exploration without a full brute-force search.\n\nWhat if we brought the same idea to gradient descent?\n\n#### Beam Search for SGD\n\nWe implement a simple dynamic beam search variant for training a neural network classifier:\n\n* Maintain `k` models at each step (the beam).\n* Each model performs one step of SGD.\n* We optionally jitter parameters (add noise) to explore nearby regions.\n* Evaluate each model on a validation set.\n* Keep the top-k performers.\n* Repeat.\n\nThis is like running multiple SGD processes, but actively selecting the best `k` at each step based on validation performance, not just letting them run independently.\n\n#### Does It Work?\n\nWe compared standard SGD and dynamic beam search on a synthetic classification task. Both used the same architecture and training budget. Here's the comparison:\n\n* **Validation loss**: Beam search consistently achieves lower validation loss across epochs.\n* **Test loss**: Final test loss is also lower for beam search.\n\n```text\nBaseline Test Loss: 0.7224\nBeam Search Test Loss: 0.6632\n```\n\n#### Why It Works\n\nVanilla SGD is greedy: it descends the steepest slope it sees. But that slope may lead to a local minimum, saddle point, or flat region. By maintaining multiple optimization paths and selecting based on validation loss, beam search allows:\n\n* **Exploration**: Injecting noise expands the search.\n* **Selection**: Validation loss acts as an external guide.\n* **Adaptation**: Poor performers are dropped, good ones retained.\n\nIt’s a simple idea: keep more options open, then let performance decide.\n\n#### What's Next\n\nThis beam-based approach adds very little overhead for small models, and could be extended further:\n\n* Beam width decay over time\n* Crossover or ensembling between beams\n* Application to RL, transformers, or large-scale fine-tuning\n\nGreedy is fast. 
But when you need better solutions, sometimes it pays to be a little less greedy.\n\n---\n\nCode available on request or via Colab-ready snippet.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinite-sample%2Fbeamgd","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffinite-sample%2Fbeamgd","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinite-sample%2Fbeamgd/lists"}