{"id":16931195,"url":"https://github.com/maxgio92/wgrep","last_synced_at":"2025-10-07T17:32:11.524Z","repository":{"id":246535481,"uuid":"821414066","full_name":"maxgio92/wgrep","owner":"maxgio92","description":"Like grep(1) but for web sites. ","archived":false,"fork":false,"pushed_at":"2024-06-28T16:34:43.000Z","size":71,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-01T14:21:29.682Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/maxgio92.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-28T13:38:27.000Z","updated_at":"2024-07-02T05:39:31.000Z","dependencies_parsed_at":"2024-06-28T15:02:08.729Z","dependency_job_id":"5647cc72-38d4-44e4-b19a-a7c82d464f7a","html_url":"https://github.com/maxgio92/wgrep","commit_stats":null,"previous_names":["maxgio92/wgrep"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/maxgio92/wgrep","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxgio92%2Fwgrep","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxgio92%2Fwgrep/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxgio92%2Fwgrep/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxgio92%2Fwgrep/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/maxgio92","download_url":"https://codeload.github.com/maxgio9
2/wgrep/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxgio92%2Fwgrep/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278813460,"owners_count":26050562,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-07T02:00:06.786Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-13T20:43:27.053Z","updated_at":"2025-10-07T17:32:11.261Z","avatar_url":"https://github.com/maxgio92.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Latest release](https://img.shields.io/github/v/release/maxgio92/wgrep?style=for-the-badge)](https://github.com/maxgio92/wgrep/releases/latest)\n[![License](https://img.shields.io/github/license/maxgio92/wgrep?style=for-the-badge)](COPYING)\n![Go version](https://img.shields.io/github/go-mod/go-version/maxgio92/wgrep?style=for-the-badge)\n\n# ![](./logo.svg) like grep but for web sites\n\n`wgrep` (world wide web grep) searches for patterns in a web site directory hierarchy over HTTP, following hypertext references.\n\nBy default, it searches for patterns inside paragraphs.\n\nThe tool is inspired by GNU `grep(1)` and `wget(1)`.\n\n### Usage\n\n```\nwgrep PATTERN URL [flags]\n```\n\nFor details, please read the CLI [documentation](./docs/wgrep.md).\n\n#### Recursive search\n\n```shell\nwgrep 
--recursive|-r PATTERN URL\n```\n\n#### Case-insensitive patterns\n\n```shell\nwgrep --ignore-case|-i PATTERN URL\n```\n\n#### Include specific locations only\n\nReferenced locations whose host name differs from the one specified in the `URL` argument are skipped by default. It's also possible to include only locations whose HTTP path matches a specific pattern.\n\nSimilarly to how the `--include` flag of `grep` restricts the search to specific locations, it's possible to filter the pages by URL when recursively looking for a pattern.\nThe include location filter pattern supports **regular expressions** in the [Go flavor](https://pkg.go.dev/regexp/syntax).\n\n```shell\nwgrep -r --include \"my-section\\/.+\" PATTERN URL\n```\n\n#### Search on specific HTML elements\n\nBy default, the element filter is set to \"p\", the HTML element that represents standard paragraphs. However, this filter can be customized with the `--element`|`-e` flag:\n\n```shell\nwgrep --element|-e \"article\" PATTERN URL\n```\n\nThe element filter supports [GoQuery](https://github.com/PuerkitoBio/goquery) patterns.\nFor example, this allows selecting elements based on class attributes:\n\n```shell\nwgrep -e \".my-class\" PATTERN URL\n```\n\nFor more information about the selector syntax, please refer to the [GoQuery](https://github.com/PuerkitoBio/goquery) documentation.\n\n### In action\n\n```shell\n$ wgrep --include \"posts\\/\" -ri kubernetes https://blog.maxgio.me\nhttps://blog.maxgio.me/posts/k8s-stride-05-denial-of-service/:\nUsers that are authorized to make patch requests to the Kubernetes API server can send a specially crafted patch of type json-patch (e.g. 
kubectl patch - type json or Content-Type: application/json-patch+json) that consumes excessive resources while processing, causing a denial of service on the API server.\n\nhttps://blog.maxgio.me/posts/stride-threat-modeling-kubernetes-elevation-of-privileges/: Hello everyone, a long time has passed after the 5th part of this journey through STRIDE thread modeling in Kubernetes has been published.\nIf you recall well, STRIDE is a model of threats for identifying security threats, by providing a mnemonic for security threats in six categories:\n\nhttps://blog.maxgio.me/posts/stride-threat-modeling-kubernetes-elevation-of-privileges/:\nIn Kubernetes Role-Based Access Control authorizes or not access to Kubernetes resources through roles, but we also have underlying infrastructure resources, and Kubernetes provides primitives to authorize workload to access operating system resources, like Linux namespaces.\n...\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxgio92%2Fwgrep","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmaxgio92%2Fwgrep","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxgio92%2Fwgrep/lists"}