{"id":18828386,"url":"https://github.com/dealfonso/htmlq","last_synced_at":"2026-01-22T17:30:16.033Z","repository":{"id":62569612,"uuid":"405042895","full_name":"dealfonso/htmlq","owner":"dealfonso","description":"command line utility for HTML querying using css selectors","archived":false,"fork":false,"pushed_at":"2021-10-20T10:16:15.000Z","size":19,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-30T21:37:16.773Z","etag":null,"topics":["command-line","command-line-tool","html","html-select","monitoring-tool"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dealfonso.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-09-10T10:30:29.000Z","updated_at":"2022-09-09T08:03:28.000Z","dependencies_parsed_at":"2022-11-03T17:15:33.946Z","dependency_job_id":null,"html_url":"https://github.com/dealfonso/htmlq","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dealfonso%2Fhtmlq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dealfonso%2Fhtmlq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dealfonso%2Fhtmlq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dealfonso%2Fhtmlq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dealfonso","download_url":"https://codeload.github.com/dealfonso/htmlq/tar.gz/refs/heads/main","host":{"name":"GitHub","url":
"https://github.com","kind":"github","repositories_count":239763648,"owners_count":19692812,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["command-line","command-line-tool","html","html-select","monitoring-tool"],"created_at":"2024-11-08T01:24:50.930Z","updated_at":"2026-01-22T17:30:15.994Z","avatar_url":"https://github.com/dealfonso.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# htmlq - command line HTML query \nThis is a simple command line utility to query HTML content as if you were using jQuery selectors. The idea is to be able to use commands like the following:\n\n```console\n$ wget -q -O- www.google.com | htmlq title\n```\n\nand get an output like this one:\n\n```console\n\u003ctitle\u003eGoogle\u003c/title\u003e\n```\n\nThe command is mainly useful for scripting. For example, it is possible to try to get the version of a WordPress installation by checking the `meta name=\"generator\"` tag:\n\n```console\n$ htmlq -u www.wordpress.com 'meta[name=\"generator\"]'\n\u003cmeta content=\"WordPress.com\" name=\"generator\"/\u003e\n```\n\nOr the title of a web page:\n\n```console\n$ htmlq -u www.google.com title\n\u003ctitle\u003eGoogle\u003c/title\u003e\n```\n\nIt is even possible to get the value of a particular attribute in a tag:\n\n```console\n$ htmlq -u www.wordpress.com 'meta[name=\"generator\"]' -a name\ngenerator\n```\n\nOr even do more sophisticated queries that remove internal elements, empty values, etc. 
As an example, the next query gets the title string of the different items in a search of items in ebay:\n\n```\n$ htmlq -u \"https://www.ebay.com/sch/i.html?_nkw=laptop\"  \"li.s-item h3.s-item__title\" -s \"\\n\" --rm span -a . -n\n```\n\n## Installing\n\nInstall using `python-pip`:\n\n```\n$ pip install htmlq\n```\n\nor building from source:\n\n```\n$ pip install bs4 html5lib urllib3 requests pathlib\n...\n$ git clone https://github.com/dealfonso/htmlq.git\n$ cd htmlq\n$ python3 setup.py install\n```\n\n## Use Cases\n\nThe most simple way to use `htmlq` is to get a tag from a web page:\n\n```console\n$ htmlq -u www.github.com title\n\u003ctitle\u003eGitHub: Where the world builds software · GitHub\u003c/title\u003e\n```\n\n_(*) this example gets the title of a web page_\n\n---\n\nBut we may want to get other tag...\n\n```console\n$ htmlq -u www.github.com a\n\u003ca class=\"px-2 py-4 color-bg-info-inverse color-text-white show-on-focus js-skip-to-content\" href=\"#start-of-content\"\u003eSkip to content\u003c/a\u003e\u003ca href=\"https://docs.github.com/articles/supported-browsers\"\u003e\n          Learn more about the browsers we support.\n        \u003c/a\u003e\u003ca aria-label=\"Homepage\" class=\"mr-4\" data-ga-click=\"(Logged out) Header, go to homepage, icon:logo-wordmark\" href=\"https://github.com/\"\u003e\n          \u003csvg aria-hidden=\"true\" class=\"octicon octicon-mark-github color-text-white\" data-view-component=\"true\" height=\"32\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"32\"\u003e\n    \u003cpath d=\"M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 
0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z\" fill-rule=\"evenodd\"\u003e\u003c/path\u003e\n\u003c/svg\u003e\n        \u003c/a\u003e\u003ca class=\"d-inline-block d-lg-none f5 color-text-white no-underline border color-border-tertiary rounded-2 px-2 py-1 mr-3 mr-sm-5\" data-hydro-click='{\"event_type\":\"authentication.click\",\"payload\":{\"location_in_page\":\"site header\",\"repository_id\":null,\"auth_type\":\"SIGN_UP\",\"originating_url\":\"https://github.com/\",\"user_id\":null}}' data-hydro-click-hmac=\"520d87e8f83281e6946b192f0f840552721c7fcba9b9c36d802e898a816314e2\" href=\"/signup?ref_cta=Sign+up\u0026amp;ref_loc=header+logged+out\u0026amp;ref_page=%2F\u0026amp;source=header-home\"\u003e\n                Sign up\n...\n```\n\n_(*) this example gets the \"a\" tags of a web page_\n\n---\n\nWe obtained a lot of information, and that is why we wanted to narrow the query to remove those that we do not need\n\n```console\n$ htmlq -u www.github.com \"a[aria-label]\"\n\u003ca aria-label=\"Homepage\" class=\"mr-4\" data-ga-click=\"(Logged out) Header, go to homepage, icon:logo-wordmark\" href=\"https://github.com/\"\u003e\n          \u003csvg aria-hidden=\"true\" class=\"octicon octicon-mark-github color-text-white\" data-view-component=\"true\" height=\"32\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"32\"\u003e\n    \u003cpath d=\"M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 
8c0-4.42-3.58-8-8-8z\" fill-rule=\"evenodd\"\u003e\u003c/path\u003e\n\u003c/svg\u003e\n        \u003c/a\u003e\u003ca aria-label=\"Go to GitHub homepage\" class=\"color-text-primary\" data-hydro-click='{\"event_type\":\"analytics.event\",\"payload\":{\"category\":\"Footer\",\"action\":\"go to home\",\"label\":\"text:home\",\"originating_url\":\"https://github.com/\",\"user_id\":null}}' data-hydro-click-hmac=\"062d687e04e8668f63bed700cfd9281766aa03a46d1beb6c750d751f692a1442\" href=\"/\"\u003e\n          \u003cimg alt=\"GitHub\" class=\"footer-logo-mktg\" decoding=\"async\" height=\"30\" loading=\"lazy\" src=\"https://github.githubassets.com/images/modules/site/icons/footer/github-logo.svg\" width=\"84\"/\u003e\n        \u003c/a\u003e\n```\n\n_(*) this example gets the link tags of a web page, but those that do have the attribute aria-label set_\n\n---\n\nBut we only want to select the _a_ tag, without some of the inner tags\n\n```console\n$ htmlq -u www.github.com \"a[aria-label]\" --rm svg --rm img\n\u003ca aria-label=\"Homepage\" class=\"mr-4\" data-ga-click=\"(Logged out) Header, go to homepage, icon:logo-wordmark\" href=\"https://github.com/\"\u003e\n\n        \u003c/a\u003e\u003ca aria-label=\"Go to GitHub homepage\" class=\"color-text-primary\" data-hydro-click='{\"event_type\":\"analytics.event\",\"payload\":{\"category\":\"Footer\",\"action\":\"go to home\",\"label\":\"text:home\",\"originating_url\":\"https://github.com/\",\"user_id\":null}}' data-hydro-click-hmac=\"062d687e04e8668f63bed700cfd9281766aa03a46d1beb6c750d751f692a1442\" href=\"/\"\u003e\n\n        \u003c/a\u003e\n```\n\n_(*) the --rm parameter enables to remove inner queries on each entry_\n\n---\n\nBut we only need the value of the label attribute, one on each line:\n\n```console\n$ htmlq -u www.github.com \"a[aria-label]\" --rm svg --rm img -a aria-label -s \"\\n\"\nHomepage\nGo to GitHub homepage\n```\n\n_(*) the -a parameter enables to obtain the values of the attributes for each resulting 
entry, and -s sets the separator between results_\n\n---\n\nFinally, we also want to get the destination URL, with a pretty arrow:\n\n```console\n$ htmlq -u www.github.com \"a[aria-label]\" --rm svg --rm img -a aria-label -a href -s \"\\n\" -S \" -\u003e \"\nHomepage -\u003e https://github.com/\nGo to GitHub homepage -\u003e /\n```\n\n_(*) we may include multiple attributes (using multiple -a entries) and join them with specific connectors using -S_\n\n## Detailed options\n\nThe usage syntax for `htmlq` is as follows:\n\n```console\nhtmlq [-h] [-f FILENAME] [-u URL] [-a ATTRIBUTE] [-r RMQUERY] [-s SEPARATOR] [-S FIELDSEPARATOR] [-n] [-N] [-1] [-U USER_AGENT] query\n```\n\nThere are multiple options and flags for `htmlq`; each of them is explained below.\n\n- __-f | --filename \\\u003cfilename\\\u003e__ reads the content of the file `filename`. It is possible to use the whole path to the file (e.g. `/path/to/my/file`) or use special paths (e.g. `~/myfile`). If neither a filename nor a URL is given, `htmlq` will read from the standard input.\n\n- __-u | --url \\\u003curl\\\u003e__ retrieves the content to be parsed from the URL. It is advisable to include the scheme in the URL (e.g. `https://my.url`). If it is not included, the `https` scheme will be tried first, and if it fails, `http` will be tried. If a file is given on the command line, this parameter will be ignored. If neither a filename nor a URL is given, `htmlq` will read from the standard input.\n\n- __-a | --attr \\\u003cattribute list\\\u003e__ if a query obtains a set of tags as a result, the default behavior (if this parameter is not set) is to output the whole HTML fragments that were obtained. Instead, if an attribute is queried (using _-a_), the output will be the values of each of these attributes for the entry. If an attribute is not present in the HTML node, its output will be _blank_. It is possible to query multiple attributes by including multiple _-a_ entries (e.g. `-a href -a aria-label`). 
There is a special attribute (.) which refers to the text representation of the entry (i.e. `-a .`).\n\n- __-r | --rm \\\u003cquery\\\u003e__ the result of a query to the HTML document may contain child nodes (e.g. \\\u003cul\\\u003e\\\u003cli\\\u003e\\\u003c/li\\\u003e\\\u003c/ul\\\u003e). When querying for `\u003cul\u003e`, the result will be the `\u003cul\u003e` node along with its `\u003cli\u003e` child nodes. Using `-r` it is possible to delete the `\u003cli\u003e` nodes. It is possible to remove multiple child trees by including multiple `--rm` queries.\n\n- __-s | --separator \\\u003cseparator\\\u003e__ this is the string used to join the output of the different entries. It is possible to include escaped strings (e.g. `\\n`) or whole arbitrary strings (e.g. `\\n -\u003e`). The default value is `\\0`.\n\n- __-S | --field-separator \\\u003cseparator\\\u003e__ this is the string used to join the values of the different attributes obtained from an entry using the `-a` parameter. It is possible to include escaped strings (e.g. `\\n`) or whole arbitrary strings (e.g. `-\u003e`). The default value is `,`.\n\n- __-n | --no-empty-lines__ using this flag, `htmlq` will not include empty lines (i.e. lines whose value is _blank_ as a result of the combination of attributes).\n\n- __-N | --no-empty-attr__ using this flag, `htmlq` will not include the value of attributes that are empty (i.e. attributes whose value is _blank_ for an entry). Using this option, the number of resulting attributes may differ from the number of requested attributes (e.g. `-a href -a class -a id` may be converted to `/,mylink` if _class_ is not set for an entry).\n\n- __-1 | --only-first__ if a query obtains multiple results, using this flag, `htmlq` will deal only with the first one (thus ignoring the rest).\n\n- __-U | --user-agent \\\u003cuser agent string\\\u003e__. Using this parameter, it is possible to set an arbitrary user agent string to retrieve the web page. 
You can check your user agent string on this website: https://www.whatsmyua.info\n\n- __query__. This is the query (a CSS selector) to run against the HTML page. It is possible to use queries that retrieve multiple trees. In this case, `htmlq` will consider them as individual entries and will deal with all of them (or only the first if using flag `-1`).\n\n# urlf - format url\n\nThis command is an add-on to _htmlq_: a command line application to deal with URLs and extract information from them. \n\nThe original purpose was to extract the values of variables in URLs, so that their values can be used in scripts. An example:\n\n```console\n$ urlf -v oq \"https://www.google.com/search?q=github\u0026oq=github\u0026sourceid=chrome\u0026ie=UTF-8\"\ngithub\n```\n\n_(*) This example gets the value of var **oq**._\n\nThe application then evolved to enable rewriting URLs from the command line, as in the next example:\n\n```console\n$ urlf \"https://www.google.com/search\\?q=github\u0026oq=github\u0026sourceid=chrome\u0026ie=UTF-8\" -F \"%s://%H?oq=%#oq#\"\nhttps://www.google.com?oq=github\n```\n\n_(*) This example rewrites the URL to build a new one that removes the path and just includes the value of var **oq**._\n\n## Detailed options\n\nThere are multiple options and flags for `urlf`; each of them is explained below.\n\n```console\nusage: urlf [-h] [-U] [-s] [-u] [-w] [-H] [-p] [-P] [-q] [-m] [-f] [-v var name] [-j SEPARATOR] [-F format string] [-V] urls [urls ...]\n```\n\n- __-h, --help__: shows the help\n- __-V, --version__: shows the program's version number and exits\n- __-U, --url__: displays the URL as provided in the input.\n- __-s, --scheme__: shows the scheme provided in the url (e.g. https)\n- __-u, --username__: shows the username used to access the url (i.e. user in https://user:pass@myserver.com)\n- __-w, --password__: shows the password used to access the url (i.e. 
pass in https://user:pass@myserver.com)\n- __-H, --hostname__: shows the hostname in the url (i.e. myserver.com in https://myserver.com/my/path)\n- __-p, --port__: shows the port in the url (i.e. 443 in https://myserver.com:443/my/path)\n- __-P, --path__: shows the path in the url (i.e. my/path in https://myserver.com/my/path)\n- __-q, --query__: shows the query in the url (i.e. q=1\u0026r=2 in https://myserver.com/my/path?q=1\u0026r=2)\n- __-m, --parameters__: shows the parameters in the url (i.e. param in https://myserver.com/my/path;param?q=1\u0026r=2)\n- __-f, --fragment__: shows the fragment in the url (i.e. sec1 in https://myserver.com/my/path#sec1)\n- __-v var name, --var var name__: \n                        shows the value of a var in the query string (this parameter may appear multiple times, to get the values of multiple variables; they will appear in the same order as they appear on the command line)\n- __-j | --join-string \\\u003cSEPARATOR\\\u003e__: \n                        character (or string) used to separate the different fields (default: \u003cblank space\u003e)\n- __-F | --format-string \\\u003cformat string\\\u003e__: \n                        user defined format string to get a custom output of the URL parts. Any arbitrary field or character may appear in this string, and the fields are substituted using the letter in the shorthand flag of each parameter, preceded by the symbol %. E.g. `urlf -H` is the same as `urlf -F \"%H\"`; e.g. `urlf -s -H` is the same as `urlf -F \"%s%H\"`, but you can use `urlf -F \"%s://%H\"` to obtain a better output. In the case of variables, the value is obtained by surrounding the name of the var with the symbol # and prepending the symbol %; e.g. 
`urlf -v q` is the same than `urlf -F \"%#q#\"`.\n\n# A combined example (guessing wordpress version)\nIn case that we wanted to get the version of a wordpress installation, we could check meta tag and get the content:\n\n```\n$ htmlq -u www.grycap.upv.es 'meta[name=\"generator\"]'\n\u003cmeta content=\"WordPress 5.8.1\" name=\"generator\"/\u003e\n$ htmlq -u www.grycap.upv.es 'meta[name=\"generator\"]' -a content\nWordPress 5.8.1\n```\n\nBut many plugins hide the version in the tag, so we can try to guess the version from the links:\n\n```\n$ htmlq -u www.grycap.upv.es 'link[href*=\"?ver=\"]' -s '\\n'\n\u003clink href=\"https://www.grycap.upv.es/wp-includes/css/dist/block-library/style.min.css?ver=5.8.1\" id=\"wp-block-library-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/style.css?ver=5.8.1\" id=\"specia-style-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/css/colors/default.css?ver=5.8.1\" id=\"specia-default-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/css/owl.carousel.css?ver=5.8.1\" id=\"owl-carousel-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/css/bootstrap.css?ver=5.8.1\" id=\"bootstrap-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/css/woo.css?ver=5.8.1\" id=\"woo-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/css/form.css?ver=5.8.1\" id=\"specia-form-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/css/typography.css?ver=5.8.1\" 
id=\"specia-typography-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/css/media-query.css?ver=5.8.1\" id=\"specia-media-query-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/css/widget.css?ver=5.8.1\" id=\"specia-widget-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/css/animate.min.css?ver=5.8.1\" id=\"animate-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/css/text-rotator.css?ver=5.8.1\" id=\"specia-text-rotator-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/css/menus.css?ver=5.8.1\" id=\"specia-menus-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/themes/specia/inc/fonts/font-awesome/css/font-awesome.min.css?ver=5.8.1\" id=\"font-awesome-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n\u003clink href=\"https://www.grycap.upv.es/wp-content/plugins/forget-about-shortcode-buttons/public/css/button-styles.css?ver=2.1.2\" id=\"forget-about-shortcode-buttons-css\" media=\"all\" rel=\"stylesheet\" type=\"text/css\"/\u003e\n```\n\nFrom the links, we see that wordpress includes the version of wordpress in the \"ver\" variable for any link; so we may get the value of such variable using `urlf`:\n\n```\n$ htmlq -u www.grycap.upv.es 'link[href*=\"?ver=\"]' -s '\\n' -a href | ./urlf.py -v ver -\n5.8.1\n5.8.1\n5.8.1\n5.8.1\n5.8.1\n5.8.1\n5.8.1\n5.8.1\n5.8.1\n5.8.1\n5.8.1\n5.8.1\n5.8.1\n5.8.1\n2.1.2\n5.8.1\n```\n\nAnd now, if we get the most used value, it will probably be the one that refers to the wordpress version (because other plugins may also use that 
variable for its purposes):\n\n```\n$ htmlq -u www.grycap.upv.es 'link[href*=\"?ver=\"]' -s '\\n' -a href | ./urlf.py -v ver - | sort | uniq -c | sort -k1 -n -r\n  15 5.8.1\n   1 2.1.2\n```\n\nAnd the first one will be the most voted version.\n\nNow we can compare to the currently available wordpress version:\n\n```\n$ curl -s https://api.wordpress.org/core/version-check/1.7/ | jq \".offers[].version\" | tr -d '\"' | sort -V | tail -n 1\n5.8.1\n```\n\nOur final script would be something like the next one:\n\n```bash\n#!/bin/bash\nMYVERSION=\"$(htmlq -u www.grycap.upv.es 'link[href*=\"?ver=\"]' -s '\\n' -a href | ./urlf.py -v ver - | sort | uniq -c | sort -k1 -n -r | head -n 1 | awk '{print $2}')\"\nCURRENTVERSION=\"$(curl -s https://api.wordpress.org/core/version-check/1.7/ | jq \".offers[].version\" | tr -d '\"' | sort -V | tail -n 1)\"\n\nLATESTVERSION=\"$(echo \"$MYVERSION\n$CURRENTVERSION\" | sort -V -r | head -n 1)\"\nif [ \"$LATESTVERSION\" = \"$MYVERSION\" ]; then\n        echo \"you have the latest version of wordpress ($LATESTVERSION)\"\nelse\n        echo \"you should update your wordpress version\"\n        exit 1\nfi\nexit 0\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdealfonso%2Fhtmlq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdealfonso%2Fhtmlq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdealfonso%2Fhtmlq/lists"}