{"id":21573682,"url":"https://github.com/syss-research/smbcrawler","last_synced_at":"2025-05-16T14:09:17.141Z","repository":{"id":97675973,"uuid":"375467686","full_name":"SySS-Research/smbcrawler","owner":"SySS-Research","description":"smbcrawler is no-nonsense tool that takes credentials and a list of hosts and 'crawls' (or 'spiders') through those shares","archived":false,"fork":false,"pushed_at":"2025-03-24T07:46:43.000Z","size":1156,"stargazers_count":160,"open_issues_count":1,"forks_count":21,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-12T11:55:40.152Z","etag":null,"topics":["pentest","red-team-tools","shares","smb","smbcrawler"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SySS-Research.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-06-09T19:27:08.000Z","updated_at":"2025-04-11T08:57:08.000Z","dependencies_parsed_at":"2024-05-18T16:45:01.981Z","dependency_job_id":"3277a4e7-1dd4-4481-b9f6-1af6bd704b28","html_url":"https://github.com/SySS-Research/smbcrawler","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SySS-Research%2Fsmbcrawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SySS-Research%2Fsmbcrawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SySS-Research%2Fsmbcrawler/releases","manifests_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/repositories/SySS-Research%2Fsmbcrawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SySS-Research","download_url":"https://codeload.github.com/SySS-Research/smbcrawler/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254544158,"owners_count":22088808,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pentest","red-team-tools","shares","smb","smbcrawler"],"created_at":"2024-11-24T12:07:42.901Z","updated_at":"2025-05-16T14:09:17.094Z","avatar_url":"https://github.com/SySS-Research.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"SmbCrawler\n==========\n\nSmbCrawler is a no-nonsense tool that takes credentials and a list of hosts\nand 'crawls' (or 'spiders') through those shares. 
Features:\n\n* takes host names, IP addresses, IP ranges, or an nmap xml file as input\n* checks permissions (check for 'write' permissions is opt-in, because it\n  requires creating an empty directory on the share)\n* crawling depth is customizable\n* outputs results in machine-readable formats or as an interactive HTML report\n* pass-the-hash support\n* auto-download interesting files\n* report potential secrets\n* threaded\n* pausable\n* interactively skip single shares and hosts\n\n\nInstallation\n------------\n\nIf you require instructions on how to install a Python package, I recommend\nyou make sure you [have `pipx`\ninstalled](https://pipx.pypa.io/stable/installation/) and run `pipx install\nsmbcrawler`.\n\nSmbCrawler can automatically convert some binary files like PDF, XLSX, DOCX, ZIP, etc. to plain text using [MarkItDown](https://github.com/microsoft/markitdown).\nBecause this package is pulling a lot of dependencies, it is marked as an extra.\nHowever, it is highly recommended to get the best results. If you want to automatically convert binaries, install SmbCrawler like this:\n\n```console\npipx install 'smbcrawler[binary-conversion]'\n```\n\nAdding shell completion is highly recommended. As a Python app using the\n`click` library, you can add tab completion to bash, zsh and fish using the [usual\nmechanism](https://click.palletsprojects.com/en/8.1.x/shell-completion/#enabling-completion).\n\n\nExample\n-------\n\nRun it like this (10 threads, maximum depth 5):\n\n```\n$ smbcrawler crawl -i hosts.txt -u pen.tester -p iluvb0b -d contoso.local -t 10 -D 5\n```\n\n\nMajor changes in version 1.0\n----------------------------\n\nSmbCrawler has undergone a major overhaul. 
The most significant changes are:\n\n* We cleaned up the CLI and introduced a \"profile\" mechanism to steer the\n  behavior of the crawler\n* The output is now an SQLite database instead of scattered JSON files\n* Permissions are now reported more granularly\n\nThe old CLI arguments regarding \"interesting files\", \"boring shares\" and so\non were clunky and confusing. Instead we now use \"profiles\"; see below for\ndetails.\n\nAlso, I realized I basically reinvented relational databases, except I did so\nvery poorly, so why not use SQLite directly?\nThe SQLite approach enables us to produce a nice interactive HTML report\nwith good performance. You can still export results in various formats if\nyou need to use the data in some tool pipeline.\n\nThe old way SmbCrawler reported permissions sometimes wasn't very useful.\nFor example, it's not uncommon that you have read permissions in the root\ndirectory of the share, but all subdirectories are protected, e.g. for user\nprofiles. SmbCrawler will now report how deep it was able to read the\ndirectory tree of a share and whether it maxed out or could have gone deeper\nif you had supplied a higher value for the maximum depth argument.\n\nIf you prefer the old version, it's still available on PyPI and installable\nwith `pipx install smbcrawler==0.2.0`, for example.\n\nUsage\n-----\n\nDuring run time, you can use the following keys:\n\n* `p`: pause the crawler and skip single hosts or shares\n* `\u003cspace\u003e`: print the current progress\n* `s`: print a more detailed status update\n\nFor more information, run `smbcrawler -h`.\n\n\nNotes\n-----\n\nEven in medium-sized networks, SmbCrawler will find tons of data. The\nchallenge is to reduce false positives.\n\n### Notes on permissions\n\nIt's important to realize that permissions can apply on the service level\nand on the file system level. 
The remote SMB service may allow you to\nauthenticate and your user account may have read permissions in principle,\nbut it could lack these permissions on the file system.\n\nSmbCrawler will report if you have permissions to:\n\n* authenticate against a target as guest and list shares\n* authenticate against a target with the user creds\n* access a share as guest\n* access a share with the user creds\n* create a directory in the share's root directory\n* the deepest directory level of a share that could be accessed (limited by the\n  `--depth` argument)\n\nBecause it is non-trivial to check permissions of SMB shares without\nattempting the action in question, SmbCrawler will attempt to create a\ndirectory on each share. Its name is `smbcrawler_DELETEME_\u003c8 random characters\u003e`\nand will be deleted immediately, but be aware anyway.\n\n\u003e [!WARNING]\n\u003e Sometimes you have the permission to create directories, but not to delete\n\u003e them, so you will leave an empty directory there.\n\n### Profiles\n\nTo decide what to do with certain shares, files or directories, SmbCrawler\nhas a feature called \"profiles\". Take a look at the [default\nprofile](https://github.com/SySS-Research/smbcrawler/blob/main/src/smbcrawler/default_profile.yml).\n\nProfiles are loaded from files with extensions `*.yml` or `*.yaml` from\nthese locations:\n\n* The built-in default profile\n* `$XDG_DATA_HOME/smbcrawler/` (`~/.local/share/smbcrawler` by default)\n* The current working directory\n* The extra directory defined by `--extra-profile-directory`\n* The extra files defined by `--extra-profile-file`\n\nProfiles from each location override previous definitions.\n\nThe `regex` value defines whether a profile matches, and the last matching\nprofile will be used. 
All regular expressions are case-insensitive, mirroring\nthe most common behavior in the Windows world.\n\nSince it can be confusing how profiles from different sources work together,\nmake sure to use the `--dry-run` parameter. It shows you the\neffective configuration and does nothing more.\n\nLet's look at each section, which is always a list of dictionaries. Each of\nthe keys of the dictionary is an arbitrary label and each of the values\nis again a dictionary with different properties.\n\n#### Files\n\n* `comment`: A helpful string describing this profile\n* `regex`: A regular expression that defines which files this profile\n  applies to. The *last* regex that matches is the one that counts.\n* `regex_flags`: An array of flags which will be passed to the regex [`match`\n  function](https://docs.python.org/3/library/re.html#flags)\n* `high_value` (default: `false`): If a file is \"high value\", its presence will be reported,\n  but it will not necessarily be downloaded (think virtual hard drives -\n  important, but too large to download automatically)\n* `download` (default: `true`): If `true`, the first 200 KiB will be\n  downloaded (or the entire file if `high_value=true`) and parsed for\n  secrets\n\n#### Shares and directories\n\n* `comment`, `regex`, `regex_flags`: Same as above\n* `high_value`: Its presence will be reported and the crawl depth changed to\n  infinity\n* `crawl_depth`: Crawl this share or directory up to a different depth than\n  what is defined by the `--depth` argument\n\n#### Secrets\n\n* `comment`, `regex_flags`: Same as above\n* `regex`: A regular expression matching the secret. The secret itself can\n  be a named group with the name `secret`.\n\n\n### Typical workflow\n\nIt makes sense to first run SmbCrawler with crawling depth 0 to get an idea of\nwhat you're dealing with. 
In this first run, you can enable the write check\nwith `-w`:\n\n```\n$ smbcrawler -C permissions_check.crwl crawl -D0 -t10 -w \\\n    -i \u003cINPUT FILE\u003e -u \u003cUSER\u003e -d \u003cDOMAIN\u003e -p \u003cPASSWORD\u003e\n```\n\nAfterwards, you can identify interesting and boring shares for your next run\nor several runs. Some shares like `SYSVOL` and `NETLOGON` appear many times,\nso you should set the crawl depth to zero on your next run and pick one host\nto scan these duplicate shares in a third run. Here is an example:\n\n```\n$ smbcrawler -C dc_only.crwl crawl -D -1 \u003cDC IP\u003e \\\n    -u \u003cUSER\u003e -d \u003cDOMAIN\u003e -p \u003cPASSWORD\u003e\n$ smbcrawler -C full.crwl crawl -D5 -t10 -i \u003cNEW INPUT FILE\u003e \\\n    -u \u003cUSER\u003e -d \u003cDOMAIN\u003e -p \u003cPASSWORD\u003e \\\n    --extra-profile-file skip_sysvol.yml\n```\n\nHere, `skip_sysvol.yml` would be:\n\n```yaml\nshares:\n  sysvol:\n    comment: \"Skip sysvol and netlogon share\"\n    regex: 'SYSVOL|NETLOGON'\n    crawl_depth: 0\n```\n\nFeel free to include other shares here which you may think are not worth\ncrawling.\n\n### Output\n\nThe raw data is contained in an SQLite database and a directory (`output.crwl` and\n`output.crwl.d` by default). The directory contains two more directories: one with\nthe downloaded files unique-ified by the hash content and a directory\nmirroring all shares with symlinks pointing to the content files. The latter\nis good for grepping through all downloaded files.\n\nThe data can be transformed to various formats. You can also simply access\nthe database with `sqlitebrowser`, for example. Some useful views have been\npre-defined. 
Or you can output JSON and use `jq` to mangle the data.\n\nIf you want to display all shares that you were able to read beyond the root\ndirectory in a LaTeX table, for instance, use this query:\n\n```sql\nSELECT target_id || \" \u0026 \" || name || \" \u0026 \" || remark || \" \\\\\"\nFROM share\nWHERE read_level \u003e 0\nORDER BY target_id, name\n```\n\nThere is also an experimental HTML output feature. It may not be entirely\nuseful yet for large amounts of data.\n\n### Help out\n\nIf you notice a lot of false positives or false negatives in the reported\nsecrets, please help out and let me know. Community input is important when\ntrying to improve automatic detection. Best case scenario: provide a pull\nrequest with changes to the default profile file.\n\n\nCredits\n-------\n\nAdrian Vollmer, SySS GmbH\n\n\nLicense\n-------\n\nMIT License; see `LICENSE` for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsyss-research%2Fsmbcrawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsyss-research%2Fsmbcrawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsyss-research%2Fsmbcrawler/lists"}