{"id":22342425,"url":"https://github.com/protosec-research/autogdb","last_synced_at":"2025-07-25T21:41:40.370Z","repository":{"id":215272373,"uuid":"737493662","full_name":"Protosec-Research/AutoGDB","owner":"Protosec-Research","description":"AutoGDB: Automatic Binary-exploitation Tool combining GPT and GDB","archived":false,"fork":false,"pushed_at":"2024-05-29T02:12:18.000Z","size":2283,"stargazers_count":86,"open_issues_count":1,"forks_count":6,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-08-01T15:14:31.109Z","etag":null,"topics":["ai","artificial-intelligence","binary-exploitation","chatgpt","debugging","dynamic-debug","gdb","gpt-4","langchain","openai","pwn","pwndbg","pwntools","reverse-egineering"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Protosec-Research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-31T08:46:21.000Z","updated_at":"2024-08-01T15:14:36.717Z","dependencies_parsed_at":"2024-05-29T04:41:43.775Z","dependency_job_id":"4d1ac6c5-60e4-4648-9e2a-527a8b2f1495","html_url":"https://github.com/Protosec-Research/AutoGDB","commit_stats":null,"previous_names":["retr0reg/autogdb","protosec-research/autogdb"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Protosec-Research%2FAutoGDB","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Protosec-Research%2FAutoGDB/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/Protosec-Research%2FAutoGDB/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Protosec-Research%2FAutoGDB/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Protosec-Research","download_url":"https://codeload.github.com/Protosec-Research/AutoGDB/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228070681,"owners_count":17864664,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","artificial-intelligence","binary-exploitation","chatgpt","debugging","dynamic-debug","gdb","gpt-4","langchain","openai","pwn","pwndbg","pwntools","reverse-egineering"],"created_at":"2024-12-04T08:12:35.618Z","updated_at":"2024-12-04T08:12:36.144Z","avatar_url":"https://github.com/Protosec-Research.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# AutoGDB\n![AUTOGDB](imgs/AUTOGDB.png)\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://opensource.org/licenses/MIT\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-blueviolet.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/retr0reg/AutoGDB/issues\"\u003e\u003cimg src=\"https://img.shields.io/github/issues/retr0reg/AutoGDB?color=blueviolet\"\u003e\u003c/a\u003e\n\u003ca href=\"https://www.python.org/\"\u003e\u003cimg src=\"https://img.shields.io/badge/made%20with-Python3-blueviolet\"\u003e\u003c/a\u003e\n\u003ca href=\"#\"\u003e\u003cimg 
src=\"https://img.shields.io/badge/platform-osx%2Flinux%2Fwindows-blueviolet\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\nAutoGDB focuses on applying the power of ChatGPT to reverse-engineering and binary-exploitation tasks.\nThis project includes:\n* `/plugin` **the plugin that your gdb needs to connect to your server**\n* `/server` **FastAPI backend server**, which relays information between your gdb session and GPT, acting as a bridge between the two\n* `/autogdb` **autogdb pip package**\n\nWatch the AutoGDB introduction: https://www.youtube.com/watch?v=FYpdtcC71gg\n\n# Installing:\nIf you are new to AutoGDB, clone our project first!\n# GDB plugin:\n```shell\ngit clone https://github.com/retr0reg/AutoGDB.git \u0026\u0026 cd AutoGDB\nchmod +x ./install.sh \u0026\u0026 ./install.sh\npython3 -m pip install -r requirements.txt\n```\n`install.sh` adds `plugin/gpt.py` to your `~/.gdbinit` as a gdb plugin. You can also do this manually by editing `~/.gdbinit` so that it sources the plugin:\n\n```shell\necho \"source /YOUR/PATH/TO/autogdb/plugin/gpt.py\" \u003e\u003e ~/.gdbinit\n```\n\nIn your `gdb` or `pwndbg` session, you can run:\n```shell\npwndbg\u003e autogdb \u003cYOUR_SERVER_IP\u003e \u003cYOUR_SERVER_PORT\u003e\n```\n* `YOUR_SERVER_IP` : IP address of the backend server\n* `YOUR_SERVER_PORT` : Port the backend server listens on\n\n## Chat with GDB!\nRun `chat.py` and chat with your own AutoGDB!\n```\npython3 chat.py\n```\n```\nusage: AutoGDB [-h] [--serverless] [--clue CLUE] [--clean-history]\n\n  Enable GPT in your reversing job with GDB.\n  \n  options:\n    -h, --help       show this help message and exit\n    --serverless     Run AutoGDB without bulit-in server\n    --clue CLUE      Possible provided clues or helpful description of this challenge?\n    --clean-history  Clear previous commandline history of AutoGDB.\n```\n![Alt text](imgs/cli_shot.png)\n\nIf you want to contribute to the project, check out our carefully written `autogdb_demo.py`; it explains everything! 
Run it with `python3 autogdb_demo.py`.\n\n**In gdb, first run the program and press Ctrl-C to pause it, then enter the `autogdb` command.**\n\n\n\nHere is an example of an analysis produced by **GPT-4 (OpenAI)**:\n```python\n\u003e Entering new AgentExecutor chain...\nTo determine if the binary has a vulnerability, I need to start by analyzing the binary with GDB to check for any obvious signs of insecure patterns or practices such as unsafe functions, lack of bounds checking, or any other potential security weaknesses.\n\nAction: gdb\nAction Input: info functions\nObservation: \"All defined functions:\\n\\nNon-debugging symbols:\\n0x0000000000000610  _init\\n0x0000000000000640  printf@plt\\n0x0000000000000650  read@plt\\n0x0000000000000660  strcmp@plt\\n0x0000000000000670  setvbuf@plt\\n0x0000000000000680  __cxa_finalize@plt\\n0x0000000000000690  _start\\n0x00000000000006c0  deregister_tm_clones\\n0x0000000000000700  register_tm_clones\\n0x0000000000000750  __do_global_dtors_aux\\n0x0000000000000790  frame_dummy\\n0x000000000000079a  main\\n0x0000000000000830  __libc_csu_init\\n0x00000000000008a0  __libc_csu_fini\\n0x00000000000008a4  _fini\\n\"\nThought:The list of functions provided does not immediately suggest a vulnerability, as these are standard functions that one might expect to find in a binary. However, the presence of functions like `printf` and `read` could potentially be of interest, as they can be involved in vulnerabilities if not used correctly (e.g., format string vulnerabilities or buffer overflows). 
To proceed, I should examine the `main` function to understand how these functions are used within the program.\n\nAction: gdb\nAction Input: disassemble main\nObservation: \"Dump of assembler code for function main:\\n   0x000000000000079a \u003c+0\u003e:\\tpush   rbp\\n   0x000000000000079b \u003c+1\u003e:\\tmov    rbp,rsp\\n   0x000000000000079e \u003c+4\u003e:\\tmov    rax,QWORD PTR [rip+0x20087b]        # 0x201020 \u003cstdout@@GLIBC_2.2.5\u003e\\n   0x00000000000007a5 \u003c+11\u003e:\\tmov    ecx,0x0\\n   0x00000000000007aa \u003c+16\u003e:\\tmov    edx,0x2\\n   0x00000000000007af \u003c+21\u003e:\\tmov    esi,0x0\\n   0x00000000000007b4 \u003c+26\u003e:\\tmov    rdi,rax\\n   0x00000000000007b7 \u003c+29\u003e:\\tcall   0x670 \u003csetvbuf@plt\u003e\\n   0x00000000000007bc \u003c+34\u003e:\\tmov    rax,QWORD PTR [rip+0x20086d]        # 0x201030 \u003cstdin@@GLIBC_2.2.5\u003e\\n   0x00000000000007c3 \u003c+41\u003e:\\tmov    ecx,0x0\\n   0x00000000000007c8 \u003c+46\u003e:\\tmov    edx,0x2\\n   0x00000000000007cd \u003c+51\u003e:\\tmov    esi,0x0\\n   0x00000000000007d2 \u003c+56\u003e:\\tmov    rdi,rax\\n   0x00000000000007d5 \u003c+59\u003e:\\tcall   0x670 \u003csetvbuf@plt\u003e\\n   0x00000000000007da \u003c+64\u003e:\\tmov    edx,0x64\\n   0x00000000000007df \u003c+69\u003e:\\tlea    rsi,[rip+0x20085a]        # 0x201040 \u003cbuf\u003e\\n   0x00000000000007e6 \u003c+76\u003e:\\tmov    edi,0x0\\n   0x00000000000007eb \u003c+81\u003e:\\tmov    eax,0x0\\n   0x00000000000007f0 \u003c+86\u003e:\\tcall   0x650 \u003cread@plt\u003e\\n   0x00000000000007f5 \u003c+91\u003e:\\tlea    rsi,[rip+0xb8]        # 0x8b4\\n   0x00000000000007fc \u003c+98\u003e:\\tlea    rdi,[rip+0x20083d]        # 0x201040 \u003cbuf\u003e\\n   0x0000000000000803 \u003c+105\u003e:\\tcall   0x660 \u003cstrcmp@plt\u003e\\n   0x0000000000000808 \u003c+110\u003e:\\ttest   eax,eax\\n   0x000000000000080a \u003c+112\u003e:\\tjne    0x813 \u003cmain+121\u003e\\n   0x000000000000080c 
\u003c+114\u003e:\\tmov    eax,0x0\\n   0x0000000000000811 \u003c+119\u003e:\\tjmp    0x826 \u003cmain+140\u003e\\n   0x0000000000000813 \u003c+121\u003e:\\tlea    rdi,[rip+0x200826]        # 0x201040 \u003cbuf\u003e\\n   0x000000000000081a \u003c+128\u003e:\\tmov    eax,0x0\\n   0x000000000000081f \u003c+133\u003e:\\tcall   0x640 \u003cprintf@plt\u003e\\n   0x0000000000000824 \u003c+138\u003e:\\tjmp    0x7da \u003cmain+64\u003e\\n   0x0000000000000826 \u003c+140\u003e:\\tpop    rbp\\n   0x0000000000000827 \u003c+141\u003e:\\tret    \\nEnd of assembler dump.\\n\"\nThought:The disassembled main function shows several key operations. The program is setting up buffer handling with `setvbuf` for both `stdin` and `stdout`, which is a normal operation to control buffering. It then reads up to 0x64 (100) bytes from file descriptor 0 (which is `stdin` in Unix-like systems) into a buffer located at `0x201040`. After reading the input, it compares this buffer with a string located at an offset from the current instruction pointer (the address is not shown in the disassembly but is referenced as `0x8b4`). If the `strcmp` result is zero (meaning the strings are equal), it takes one branch; otherwise, it prints the buffer using `printf` and then loops back to read more input.\n\nThe use of `read` with a fixed size is not immediately indicative of a buffer overflow, as we don't know the actual size of the buffer at `0x201040`. 
However, the use of `printf` with the buffer as a format string is potentially dangerous because if the buffer contains format specifiers (e.g., `%s`, `%x`), they could be used to leak memory contents or cause a crash, which indicates a format string vulnerability.\n\nTo confirm the size of the buffer and see if there is a potential for buffer overflow, I need to examine the memory layout of the program, specifically the `.bss` section where uninitialized data like buffers are typically stored.\n\nAction: gdb\nAction Input: info variables\nObservation: \"All defined variables:\\n\\nNon-debugging symbols:\\n0x00000000000008b0  _IO_stdin_used\\n0x00000000000008c0  __GNU_EH_FRAME_HDR\\n0x0000000000000a04  __FRAME_END__\\n0x0000000000200da0  __frame_dummy_init_array_entry\\n0x0000000000200da0  __init_array_start\\n0x0000000000200da8  __do_global_dtors_aux_fini_array_entry\\n0x0000000000200da8  __init_array_end\\n0x0000000000200db0  _DYNAMIC\\n0x0000000000200fa0  _GLOBAL_OFFSET_TABLE_\\n0x0000000000201000  __data_start\\n0x0000000000201000  data_start\\n0x0000000000201008  __dso_handle\\n0x0000000000201010  __TMC_END__\\n0x0000000000201010  __bss_start\\n0x0000000000201010  _edata\\n0x0000000000201020  stdout\\n0x0000000000201020  stdout@@GLIBC_2.2.5\\n0x0000000000201030  stdin\\n0x0000000000201030  stdin@@GLIBC_2.2.5\\n0x0000000000201038  completed\\n0x0000000000201040  buf\\n0x00000000002010a8  _end\\n\"\nThought:The `info variables` command in GDB lists global and static variables, including those in the `.bss` section. The buffer `buf` is located at `0x201040`, and the next variable `completed` is at `0x201038`. This suggests that the buffer `buf` is located just before `completed`. 
The difference in addresses between `buf` and `_end` (which marks the end of the `.bss` section) is `0x10a8 - 0x1040 = 0x68` bytes (104 bytes in decimal).\n\nThis is a critical observation: the `read` function is called with a length of `0x64` (100 bytes), and the buffer `buf` is 104 bytes long. Therefore, there is no direct buffer overflow vulnerability since the `read` call does not exceed the size of the buffer. However, there is very little space for a null terminator, and if the input does not contain a null byte, this could lead to issues with string handling, although this is not confirmed from the current information.\n\nThe potential format string vulnerability identified earlier is still a concern because the `printf` function is called with the buffer as the format string. If an attacker can control the contents of the buffer and include format specifiers, they could exploit this vulnerability.\n\nTo confirm the format string vulnerability, I should check the arguments passed to `printf`. 
If the buffer is directly used as the format string without any additional arguments, this would confirm the vulnerability.\n\nAction: gdb\nAction Input: disassemble main\nObservation: \"Dump of assembler code for function main:\\n   0x000000000000079a \u003c+0\u003e:\\tpush   rbp\\n   0x000000000000079b \u003c+1\u003e:\\tmov    rbp,rsp\\n   0x000000000000079e \u003c+4\u003e:\\tmov    rax,QWORD PTR [rip+0x20087b]        # 0x201020 \u003cstdout@@GLIBC_2.2.5\u003e\\n   0x00000000000007a5 \u003c+11\u003e:\\tmov    ecx,0x0\\n   0x00000000000007aa \u003c+16\u003e:\\tmov    edx,0x2\\n   0x00000000000007af \u003c+21\u003e:\\tmov    esi,0x0\\n   0x00000000000007b4 \u003c+26\u003e:\\tmov    rdi,rax\\n   0x00000000000007b7 \u003c+29\u003e:\\tcall   0x670 \u003csetvbuf@plt\u003e\\n   0x00000000000007bc \u003c+34\u003e:\\tmov    rax,QWORD PTR [rip+0x20086d]        # 0x201030 \u003cstdin@@GLIBC_2.2.5\u003e\\n   0x00000000000007c3 \u003c+41\u003e:\\tmov    ecx,0x0\\n   0x00000000000007c8 \u003c+46\u003e:\\tmov    edx,0x2\\n   0x00000000000007cd \u003c+51\u003e:\\tmov    esi,0x0\\n   0x00000000000007d2 \u003c+56\u003e:\\tmov    rdi,rax\\n   0x00000000000007d5 \u003c+59\u003e:\\tcall   0x670 \u003csetvbuf@plt\u003e\\n   0x00000000000007da \u003c+64\u003e:\\tmov    edx,0x64\\n   0x00000000000007df \u003c+69\u003e:\\tlea    rsi,[rip+0x20085a]        # 0x201040 \u003cbuf\u003e\\n   0x00000000000007e6 \u003c+76\u003e:\\tmov    edi,0x0\\n   0x00000000000007eb \u003c+81\u003e:\\tmov    eax,0x0\\n   0x00000000000007f0 \u003c+86\u003e:\\tcall   0x650 \u003cread@plt\u003e\\n   0x00000000000007f5 \u003c+91\u003e:\\tlea    rsi,[rip+0xb8]        # 0x8b4\\n   0x00000000000007fc \u003c+98\u003e:\\tlea    rdi,[rip+0x20083d]        # 0x201040 \u003cbuf\u003e\\n   0x0000000000000803 \u003c+105\u003e:\\tcall   0x660 \u003cstrcmp@plt\u003e\\n   0x0000000000000808 \u003c+110\u003e:\\ttest   eax,eax\\n   0x000000000000080a \u003c+112\u003e:\\tjne    0x813 \u003cmain+121\u003e\\n   0x000000000000080c 
\u003c+114\u003e:\\tmov    eax,0x0\\n   0x0000000000000811 \u003c+119\u003e:\\tjmp    0x826 \u003cmain+140\u003e\\n   0x0000000000000813 \u003c+121\u003e:\\tlea    rdi,[rip+0x200826]        # 0x201040 \u003cbuf\u003e\\n   0x000000000000081a \u003c+128\u003e:\\tmov    eax,0x0\\n   0x000000000000081f \u003c+133\u003e:\\tcall   0x640 \u003cprintf@plt\u003e\\n   0x0000000000000824 \u003c+138\u003e:\\tjmp    0x7da \u003cmain+64\u003e\\n   0x0000000000000826 \u003c+140\u003e:\\tpop    rbp\\n   0x0000000000000827 \u003c+141\u003e:\\tret    \\nEnd of assembler dump.\\n\"\nThought:The re-examination of the `main` function's disassembly confirms that the `printf` call at address `0x081f` uses the buffer at `0x201040` directly as the format string (`rdi` register holds the first argument to functions in the x86_64 calling convention, which is the format string for `printf`). There are no additional arguments being pushed onto the stack or loaded into registers for this `printf` call, which means that any format specifiers in the buffer will be interpreted by `printf` and could potentially be exploited.\n\nGiven this information, it is clear that the binary has a format string vulnerability because it uses uncontrolled input as the format string for `printf`. 
This could allow an attacker to read from or write to memory locations, depending on the contents of the format string they provide.\n\nThought: I now know the final answer.\nFinal Answer: The binary has a format string vulnerability due to the use of an uncontrolled buffer as the format string argument for `printf`.\n\n\u003e Finished chain.\n```\n# Server-side setup\n**Since version `AutoGDB1.2`, server-side setup is handled within `chat.py`.** If you have problems with the AutoGDB server, you can run `python3 chat.py --serverless` to start in serverless mode, using the IP and port stored in `.server_cache_autogdb.json`.\n\nThe AutoGDB server can also be pulled and run with Docker:\n```shell\ndocker pull retr0regd/autogdb-server:latest\ndocker run -p 5000:5000 retr0regd/autogdb-server:latest\n```\nAfter this, the server will be listening on port `5000` at `127.0.0.1`.\n\nAlternatively, install the requirements with `python3`:\n\n```shell\npython3 -m pip install -r requirements.txt \u0026\u0026 chmod +x ./run.sh\n```\nAfter installing the requirements, start the server with `./run.sh`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprotosec-research%2Fautogdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprotosec-research%2Fautogdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprotosec-research%2Fautogdb/lists"}