{"id":13420521,"url":"https://github.com/Dridi/libvmod-querystring","last_synced_at":"2025-03-15T06:33:28.813Z","repository":{"id":4028318,"uuid":"5128632","full_name":"dridi/libvmod-querystring","owner":"dridi","description":"Query-string module for Varnish Cache","archived":false,"fork":false,"pushed_at":"2024-05-20T21:35:30.000Z","size":312,"stargazers_count":98,"open_issues_count":0,"forks_count":26,"subscribers_count":13,"default_branch":"master","last_synced_at":"2024-05-21T04:11:37.887Z","etag":null,"topics":["c","module","querystrings","url","url-parsing","varnish","vmod"],"latest_commit_sha":null,"homepage":"","language":"M4","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dridi.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGELOG","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2012-07-20T21:28:45.000Z","updated_at":"2024-07-31T00:50:09.174Z","dependencies_parsed_at":"2024-07-31T01:00:12.513Z","dependency_job_id":null,"html_url":"https://github.com/dridi/libvmod-querystring","commit_stats":null,"previous_names":[],"tags_count":17,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dridi%2Flibvmod-querystring","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dridi%2Flibvmod-querystring/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dridi%2Flibvmod-querystring/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dridi%2Flibvmod-querystring/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owne
rs/dridi","download_url":"https://codeload.github.com/dridi/libvmod-querystring/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221553292,"owners_count":16841999,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","module","querystrings","url","url-parsing","varnish","vmod"],"created_at":"2024-07-30T22:01:35.392Z","updated_at":"2024-10-26T16:30:58.413Z","avatar_url":"https://github.com/dridi.png","language":"M4","readme":"================\nvmod-querystring\n================\n\nDescription\n===========\n\nThe purpose of this module is to give you a fine-grained control over a URL's\nquery-string in Varnish Cache. It's possible to remove the query-string, clean\nit, sort its parameters or filter it to only keep a subset of them.\n\nThis can greatly improve your hit ratio and efficiency with Varnish, because\nby default two URLs with the same path but different query-strings are also\ndifferent. This is what the RFCs mandate but probably not what you usually\nwant for your web site or application.\n\nA query-string is just a character string starting after a question mark in a\nURL. But in a web context, it is usually a structured key/values store encoded\nwith the ``application/x-www-form-urlencoded`` media type. 
This module deals\nwith this kind of query-string.\n\nExamples\n========\n\nConsider the default hashing in Varnish::\n\n    sub vcl_hash {\n        hash_data(req.url);\n        if (req.http.host) {\n            hash_data(req.http.host);\n        } else {\n            hash_data(server.ip);\n        }\n        return (lookup);\n    }\n\nClients requesting ``/index.html`` and ``/index.html?`` will most likely get\nthe exact same response with most web servers / frameworks / stacks / wossname\nbut Varnish will see two different URLs and end up with two duplicate objects\nin the cache.\n\nThis problem is hard to solve with Varnish alone because it requires some\nknowledge of the back-end application, but it can usually be mitigated with\na couple of assumptions:\n\n- the application doesn't need query-strings\n- except for POST requests that are not cached\n- and for analytics/tracking purposes\n\nIn this case it can be solved like this::\n\n    import querystring;\n\n    sub vcl_hash {\n        if (req.method == \"GET\" || req.method == \"HEAD\") {\n            hash_data(querystring.remove(req.url));\n        }\n        else {\n            hash_data(req.url);\n        }\n        hash_data(req.http.host);\n        return (lookup);\n    }\n\nThis way Varnish will get the same unique hash for both ``/index.html`` and\n``/index.html?`` but the back-end application will receive the original client\nrequest. Depending on your requirements/goals, you may also take a different\napproach.\n\nSurely enough this module can do more than what a simple regular expression\nsubstitution (``regsub``) could do, right? First, readability is improved. It\nshould be obvious what the previous snippet does with no regex to decipher.\n\nSecond, it makes more complex operations easier to implement. 
For instance,\nyou may want to remove Google Analytics parameters from requests because:\n\n- they could create cache duplicates for every campaign\n- the application does not need them, only marketing folks\n- the user's browser makes AJAX calls to GA regardless\n- they can be delivered to marketing via ``varnishncsa``\n\nIt could be solved like this::\n\n    import std;\n    import querystring;\n\n    sub vcl_init {\n        new ga = querystring.filter();\n        ga.add_regex(\"^utm_.*\");\n    }\n\n    sub vcl_recv {\n        std.log(\"ga:\" + ga.extract(req.url, mode = keep));\n        set req.url = ga.apply(req.url);\n    }\n\nThis is enough to remove all Analytics parameters you may use (``utm_source``,\n``utm_medium``, ``utm_campaign``, etc.) and keep the rest of the query-string,\nunless there are no other parameters, in which case it's simply removed. The\nlog statement allows you to get those analytics parameters (and only them) in\n``varnishncsa`` using the format string ``%{VCL_Log:ga}x``.\n\nAll functions are documented in the manual page ``vmod_querystring(3)``.\n\nInstallation\n============\n\nThe module relies on the GNU Build System, also known as autotools. To install\nit, start by grabbing the latest release [1]_ and follow these steps::\n\n    # Get to the source tree\n    tar -xzf vmod-querystring-${VERSION}.tar.gz\n    cd vmod-querystring-${VERSION}\n\n    # Build and install\n    ./configure\n    make\n    make check # optional\n    sudo make install\n\nYou only need to have Varnish (at least 6.0.6) and its development files\ninstalled on your system. Instead of manually installing the module you can\nbuild packages, see below. The ``configure`` script also needs ``pkg-config``\ninstalled to find Varnish development files.\n\nIf your Varnish installation did not use the default ``/usr`` prefix, you\nwill likely need to at least set the ``pkg-config`` path to find your Varnish\ninstallation. 
For example, add this to your environment before running\n``./configure``::\n\n    export PKG_CONFIG_PATH=${PREFIX}/lib/pkgconfig\n\nOr the approach recommended by autoconf::\n\n    ./configure PKG_CONFIG_PATH=${PREFIX}/lib/pkgconfig ...\n\nThe module is then configured for an installation inside ``${PREFIX}``, unless\nthe ``--prefix`` option was used in the ``configure`` execution. For more\ninformation about what can be configured, run ``./configure --help``.\n\nAlongside the release archive, you will find a PDF export of the module's\nmanual.\n\nRPM Packaging\n=============\n\nInstead of directly installing the package you can build an RPM::\n\n    make rpm\n\nThe resulting packages can be found in the ``rpmbuild`` directory in your\nbuild tree.\n\nIf you need to build an RPM for a different platform you may use ``mock(1)``\nwith the proper ``--root`` option. All you have to do is run ``make mockbuild``\nand set the desired flags in the ``MOCK_OPTS`` variable. For instance, to\nbuild RPMs for CentOS 7::\n\n    make mockbuild MOCK_OPTS='--root epel-7-x86_64'\n\nThe resulting packages can be found in the ``mockbuild`` directory in your\nbuild tree.\n\nDPKG Packaging\n==============\n\nDPKG packaging is also available with ``dpkg-buildpackage(1)``, using the\n``deb`` target::\n\n    make deb\n\nIt is possible to either redefine the ``DPKG_BUILDPACKAGE`` command or simply\nadd options via ``DPKG_BUILDPACKAGE_OPTS``. For example, to select a specific\nprivilege escalation method::\n\n    make deb DPKG_BUILDPACKAGE_OPTS=-rfakeroot\n\nThe resulting packages can be found in the ``dpkgbuild`` directory in your\nbuild tree. By default sources and changes are NOT signed; in order to sign\npackages the ``DPKG_BUILDPACKAGE`` variable MUST be redefined.\n\nIf you need to build a Deb for a specific platform you may use ``pdebuild(1)``\nand ``pbuilder(8)`` to set up the base tarball and then run ``make pdebuild``\nand set the desired flags in the ``PDEBUILD_OPTS`` variable. 
For instance, to\nbuild debs for Debian Sid, assuming your environment is properly configured\nto switch between distributions::\n\n    make pdebuild PDEBUILD_OPTS='-- --distribution sid'\n\nThe resulting packages can be found in the ``pdebuild`` directory in your\nbuild tree.\n\nAs an alternative to ``pdebuild(1)`` you may prefer ``sbuild(1)``.\nSimilarly, you may run ``make sbuild`` and set the desired flags in the\n``SBUILD_OPTS`` variable. For instance, to build debs for Debian Sid, assuming\nyour environment is properly configured to switch between distributions::\n\n    make sbuild SBUILD_OPTS='--dist sid'\n\nThe resulting packages can be found in the ``sbuild`` directory in your\nbuild tree.\n\nHacking\n=======\n\nWhen working on the source code, there are additional dependencies:\n\n- autoconf\n- automake\n- libtool\n- rst2man (python3-docutils)\n- varnish (at least 6.0.6)\n\nYou will notice the lack of a ``configure`` script; it needs to be generated\nwith the various autotools programs. Instead, you can use the ``bootstrap``\nscript that takes care of both generating and running ``configure``. It also\nworks for VPATH_ builds.\n\n.. _VPATH: https://www.gnu.org/software/automake/manual/html_node/VPATH-Builds.html\n\nArguments to the ``bootstrap`` script are passed to the underlying execution\nof the generated ``configure`` script. 
Once ``bootstrap`` is done, you can\nlater run the ``configure`` script directly if you need to reconfigure your\nbuild tree or use more than one VPATH.\n\nSee also\n========\n\nTo learn more about query-strings and HTTP caching, you can have a look at the\nrelevant RFCs:\n\n- `RFC 1866 Section 8.2.1`__: The form-urlencoded Media Type\n- `RFC 3986 Section 3`__: Syntax Components\n- `RFC 7234 Section 2`__: Overview of Cache Operation\n\n__ https://tools.ietf.org/html/rfc1866#section-8.2.1\n__ https://tools.ietf.org/html/rfc3986#section-3\n__ https://tools.ietf.org/html/rfc7234#section-2\n\nThe test suite also shows the differences in cache hits and misses with and\nwithout the use of this module.\n\n.. [1] https://github.com/Dridi/libvmod-querystring/releases/latest\n","funding_links":[],"categories":["TODO scan for Android support in followings"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDridi%2Flibvmod-querystring","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FDridi%2Flibvmod-querystring","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDridi%2Flibvmod-querystring/lists"}