{"id":20014276,"url":"https://github.com/dyth/dyth","last_synced_at":"2026-03-03T21:31:51.936Z","repository":{"id":82645399,"uuid":"376376042","full_name":"dyth/dyth","owner":"dyth","description":null,"archived":false,"fork":false,"pushed_at":"2026-01-16T23:16:27.000Z","size":80,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-17T11:36:11.875Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dyth.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-06-12T20:12:27.000Z","updated_at":"2026-01-16T23:16:31.000Z","dependencies_parsed_at":null,"dependency_job_id":"34fcdfd7-b5db-41b8-9720-b913cc0a492a","html_url":"https://github.com/dyth/dyth","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dyth/dyth","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dyth%2Fdyth","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dyth%2Fdyth/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dyth%2Fdyth/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dyth%2Fdyth/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dyth","download_url":"https://codeload.github.com/dyth/dyth/tar.gz/refs/heads/main","s
bom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dyth%2Fdyth/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30062393,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-03T18:21:05.932Z","status":"ssl_error","status_checked_at":"2026-03-03T18:20:59.341Z","response_time":61,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T07:39:53.671Z","updated_at":"2026-03-03T21:31:51.905Z","avatar_url":"https://github.com/dyth.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"### David Yu-Tung Hui, 許宇同\n\nI am an independent researcher interested in Deep Reinforcement Learning.\nMy research focuses on increasing the optimization stability of off-policy gradient-based $Q$-learning algorithms over a range of tasks and hyperparameters.\nI'm especially interested in developing algorithms to solve continuous control tasks.\n\nI've written two works along this research direction:\n\n1. 
**Stabilizing Q-Learning for Continuous Control**  \nDavid Yu-Tung Hui  \nMSc Thesis, University of Montreal, 2022  \nI showed that using LayerNorm in the critic of DDPG prevented divergence during training in MuJoCo and DeepMind Control continuous control environments, enabling non-trivial behaviors to be learned in the dog-run task of DeepMind Control.  \n[[.pdf]](https://papyrus.bib.umontreal.ca/xmlui/bitstream/handle/1866/32085/Hui_David_Yu-Tung_2022_memoire.pdf)\n[[Errata]](https://gist.github.com/dyth/0324b7a4c2ca4b0f3bab18583b5dc22b)\n\n2. **Double Gumbel Q-Learning**  \nDavid Yu-Tung Hui, Aaron Courville, Pierre-Luc Bacon  \nSpotlight at NeurIPS 2023  \nWe modeled noise introduced by a function approximator in $Q$-learning as a heteroscedastic Gumbel distribution and derived a loss function from this noise model that was effective in off-policy continuous control -- our resultant algorithm achieved ~2x the aggregate performance of SAC after 1M training timesteps.  \n[[.pdf]](https://proceedings.neurips.cc/paper_files/paper/2023/file/07956d40074d6523bad11112b3225c6e-Paper-Conference.pdf)\n[[Reviews]](https://openreview.net/forum?id=UdaTyy0BNB)\n[[Poster (.png)]](https://nips.cc/media/PosterPDFs/NeurIPS%202023/71497.png)\n[[5-min talk]](https://slideslive.com/39009623/double-gumbel-qlearning)\n[[1-hour seminar]](https://www.youtube.com/watch?v=GMNtHLA3bAE)\n[[Code (GitHub)]](https://github.com/dyth/doublegum)\n[[Errata]](https://gist.github.com/dyth/0abd5c5b87184144854a431437de7d44)\n\nIn 2023, I graduated with an MSc from Mila, University of Montreal.\nI'm looking for opportunities where I can continue my research.\n\nFor more information about me, see my [Google 
Scholar](https://scholar.google.com/citations?user=pXHOdMwAAAAJ\u0026hl=en).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdyth%2Fdyth","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdyth%2Fdyth","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdyth%2Fdyth/lists"}