{"id":20216309,"url":"https://github.com/thudm/autowebglm","last_synced_at":"2025-05-16T18:10:32.071Z","repository":{"id":231645398,"uuid":"781311731","full_name":"THUDM/AutoWebGLM","owner":"THUDM","description":"An LLM-based Web Navigating Agent (KDD'24)","archived":false,"fork":false,"pushed_at":"2024-09-27T10:08:29.000Z","size":13848,"stargazers_count":848,"open_issues_count":13,"forks_count":73,"subscribers_count":28,"default_branch":"main","last_synced_at":"2025-04-07T03:19:31.151Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/THUDM.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-03T06:28:11.000Z","updated_at":"2025-04-06T18:15:23.000Z","dependencies_parsed_at":"2024-11-14T06:38:03.570Z","dependency_job_id":null,"html_url":"https://github.com/THUDM/AutoWebGLM","commit_stats":null,"previous_names":["thudm/autowebglm"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THUDM%2FAutoWebGLM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THUDM%2FAutoWebGLM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THUDM%2FAutoWebGLM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THUDM%2FAutoWebGLM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/THUDM","download_url":"https://codeload.github.com/THUDM/AutoWebGLM/tar.gz/refs/heads/main","ho
st":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254582907,"owners_count":22095518,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-14T06:27:20.053Z","updated_at":"2025-05-16T18:10:32.029Z","avatar_url":"https://github.com/THUDM.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1\u003eAutoWebGLM: A Large Language Model-based Web Navigating Agent\u003c/h1\u003e\n\nThis is the official implementation of AutoWebGLM. If you find our open-sourced efforts useful, please 🌟 the repo to encourage our following development!\n\n# Overview\n\n![paper](./assets/framework.png)\n\nAutoWebGLM is a project aimed at building a more efficient language model-driven automated web navigation agent. This project is built on top of the ChatGLM3-6B model, extending its capabilities to navigate the web more effectively and tackle real-world browsing challenges better. 
\n\n## Features\n\n-   **HTML Simplification Algorithm**: Inspired by human browsing patterns, we've designed an algorithm to simplify HTML, making webpages more digestible for LLM agents while preserving crucial information.\n-   **Hybrid Human-AI Training**: We combine human and AI knowledge to build web browsing data for curriculum training, enhancing the model's practical navigation skills.\n-   **Reinforcement Learning and Rejection Sampling**: We enhance the model's webpage comprehension, browser operations, and efficient task decomposition abilities by bootstrapping it with reinforcement learning and rejection sampling.\n-   **Bilingual Web Navigation Benchmark**: We introduce AutoWebBench, a bilingual (Chinese and English) benchmark for real-world web browsing tasks. This benchmark provides a robust tool for testing and refining the capabilities of AI web navigation agents.\n\n# Evaluation\n\nWe have publicly released our evaluation code, data, and environment. You can reproduce the experiments by following the instructions below.\n\n## AutoWebBench \u0026 Mind2Web\n\nYou can find our evaluation datasets at \u003ca href=\"./autowebbench/\" alt=\"autowebbench\"\u003eAutoWebBench\u003c/a\u003e and \u003ca href=\"./mind2web/\" alt=\"mind2web\"\u003eMind2Web\u003c/a\u003e. \nFor the code to perform model inference, please refer to \u003ca href=\"https://huggingface.co/THUDM/chatglm3-6b\" alt=\"chatglm3-6b\"\u003eChatGLM3-6B\u003c/a\u003e.\nOnce you have the output file, compute the score with ```python eval.py [result_path]```.\n\n## WebArena\n\nWe have made modifications to the WebArena environment to fit the interaction of our system; see \u003ca href=\"./webarena/\" alt=\"webarena\"\u003eWebArena\u003c/a\u003e. 
The modifications and execution instructions can be found in \u003ca href=\"./webarena/README.md\" alt=\"readme\"\u003eREADME\u003c/a\u003e.\n\n## MiniWob++\n\nWe have also made modifications to the MiniWob++ environment; see \u003ca href=\"./miniwob++/\" alt=\"miniwob++\"\u003eMiniWob++\u003c/a\u003e. The modifications and execution instructions can be found in \u003ca href=\"./miniwob++/README.md\" alt=\"readme\"\u003eREADME\u003c/a\u003e.\n\n# License\n\nThis repository is licensed under the [Apache-2.0 License](LICENSE). All open-sourced data is for research purposes only.\n\n# Citation\nIf you use this code for your research, please cite our paper.\n\n```\n@inproceedings{lai2024autowebglm,\n    author = {Lai, Hanyu and Liu, Xiao and Iong, Iat Long and Yao, Shuntian and Chen, Yuxuan and Shen, Pengbo and Yu, Hao and Zhang, Hanchen and Zhang, Xiaohan and Dong, Yuxiao and Tang, Jie},\n    title = {AutoWebGLM: A Large Language Model-based Web Navigating Agent},\n    booktitle = {Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},\n    pages = {5295--5306},\n    year = {2024}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthudm%2Fautowebglm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthudm%2Fautowebglm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthudm%2Fautowebglm/lists"}