{"id":44487322,"url":"https://github.com/mutating/dirstree","last_synced_at":"2026-05-31T23:02:06.390Z","repository":{"id":316962551,"uuid":"1065457054","full_name":"mutating/dirstree","owner":"mutating","description":"Another library for iterating through the contents of a directory","archived":false,"fork":false,"pushed_at":"2026-05-29T08:52:36.000Z","size":172,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-29T09:25:47.568Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mutating.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-27T19:10:15.000Z","updated_at":"2026-05-29T08:52:20.000Z","dependencies_parsed_at":"2025-09-27T22:20:32.670Z","dependency_job_id":"742dba49-9b8f-4ad7-b3af-1340f64b3c5a","html_url":"https://github.com/mutating/dirstree","commit_stats":null,"previous_names":["pomponchik/dirstree","mutating/dirstree"],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/mutating/dirstree","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mutating%2Fdirstree","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mutating%2Fdirstree/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mutating%2Fdirstree/releases","manifests_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/mutating%2Fdirstree/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mutating","download_url":"https://codeload.github.com/mutating/dirstree/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mutating%2Fdirstree/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33752286,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-02-13T02:11:32.275Z","updated_at":"2026-05-31T23:02:06.385Z","avatar_url":"https://github.com/mutating.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdetails\u003e\n  \u003csummary\u003eⓘ\u003c/summary\u003e\n\n[![Downloads](https://static.pepy.tech/badge/dirstree/month)](https://pepy.tech/project/dirstree)\n[![Downloads](https://static.pepy.tech/badge/dirstree)](https://pepy.tech/project/dirstree)\n[![Coverage Status](https://coveralls.io/repos/github/mutating/dirstree/badge.svg?branch=main)](https://coveralls.io/github/mutating/dirstree?branch=main)\n[![Lines of 
code](https://sloc.xyz/github/mutating/dirstree/?category=code)](https://github.com/boyter/scc/)\n[![Hits-of-Code](https://hitsofcode.com/github/mutating/dirstree?branch=main)](https://hitsofcode.com/github/mutating/dirstree/view?branch=main)\n[![Test-Package](https://github.com/mutating/dirstree/actions/workflows/tests_and_coverage.yml/badge.svg)](https://github.com/mutating/dirstree/actions/workflows/tests_and_coverage.yml)\n[![Python versions](https://img.shields.io/pypi/pyversions/dirstree.svg)](https://pypi.python.org/pypi/dirstree)\n[![PyPI version](https://badge.fury.io/py/dirstree.svg)](https://badge.fury.io/py/dirstree)\n[![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)\n[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)\n[![DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/mutating/dirstree)\n\n\u003c/details\u003e\n\n![logo](https://raw.githubusercontent.com/mutating/dirstree/develop/docs/assets/logo_1.svg)\n\n\nThere are many libraries for traversing directories. You can also do this using the standard library. 
What makes this library different:\n\n- 💎 Beautiful, laconic syntax.\n- ⚗️ Filtering by file extensions, text patterns in [`.gitignore` format](https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository#_ignoring), and using custom callables.\n- 🐍 Natively works with both [`Path` objects](https://docs.python.org/3/library/pathlib.html#basic-use) from the standard library and strings.\n- ❌ Support for [cancellation tokens](https://github.com/pomponchik/cantok).\n- 👯‍♂️ Combining multiple crawling methods in one object.\n\n\n## Table of contents\n\n- [**Installation**](#installation)\n- [**Basic usage**](#basic-usage)\n- [**Applying a function to each path**](#applying-a-function-to-each-path)\n- [**Filtering**](#filtering)\n- [**Working with Cancellation Tokens**](#working-with-cancellation-tokens)\n- [**Combination**](#combination)\n- [**Transactionality**](#transactionality)\n\n\n## Installation\n\nYou can install [`dirstree`](https://pypi.org/project/dirstree/) with `pip`:\n\n```bash\npip install dirstree\n```\n\nYou can also use [`instld`](https://github.com/pomponchik/instld) to quickly try out this package and others without installing them.\n\n\n## Basic usage\n\nThe library is easy to use:\n\n- Create a crawler object, passing the path to the base directory and, if necessary, additional arguments.\n- Iterate through it.\n\nThe simplest example would look like this:\n\n```python\nfrom dirstree import Crawler\n\ncrawler = Crawler('.')\n\nfor file in crawler:\n    print(file)\n```\n\n\u003e ↑ This recursively prints all files in the current directory, including files in nested directories. 
At each iteration, we get a new [`Path` object](https://docs.python.org/3/library/pathlib.html#basic-use).\n\n\n## Applying a function to each path\n\nIf you just want to run a function for each file the crawler finds, you don't have to write the loop yourself; every crawler has an `apply()` method:\n\n```python\nCrawler('src', exclude=['tests/**']).apply(print)\n```\n\n\u003e ↑ This prints the path of every file under `src`, except for the excluded locations.\n\n\u003e ⓘ All of the crawler's settings are respected, exactly as they would be during normal iteration.\n\n\n## Filtering\n\nBy default, crawlers iterate over files only. If you need every filesystem entity found under the base directory, pass `only_files=False`:\n\n```python\ncrawler = Crawler('.', only_files=False)\n```\n\nWhen iterating through a directory, you may want only files of a certain kind rather than all of them. There are three ways to narrow the selection:\n\n- Yield only files with the specified [extensions](https://en.wikipedia.org/wiki/Filename_extension), such as `.txt`, `.doc`, or `.py`.\n- Skip paths that match a specific text pattern.\n- Use an arbitrary function that decides whether each specific path is needed.\n\n\nEach method is selected by passing its own parameter when creating the crawler object. 
Of course, all the methods can be combined with each other.\n\nTo set the file extensions you are interested in, use the `extensions` parameter:\n\n```python\ncrawler = Crawler('.', extensions=['.txt'])  # Iterate over .txt files only.\n```\n\n\u003e ⓘ The `extensions` parameter is available only in the default file-only mode, so it cannot be combined with `only_files=False`.\n\nIf you only need Python files, you can use a dedicated class that iterates over them without specifying extensions:\n\n```python\nfrom dirstree import PythonCrawler\n\ncrawler = PythonCrawler('.')  # Iterate over .py files only.\n```\n\n\u003e ⓘ `PythonCrawler` is always file-only.\n\nTo specify which files and directories you do NOT want to iterate over, use the `exclude` parameter:\n\n```python\ncrawler = Crawler('.', exclude=['.git', 'venv'])  # Exclude \".git\" and \"venv\" directories.\n```\n\n\u003e ↑ Please note that we use the [`.gitignore` format](https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository#_ignoring) here.\n\nIf you need a universal way to filter out unnecessary paths, pass your function as the `filter` parameter:\n\n```python\ncrawler = Crawler('.', filter=lambda path: len(str(path)) == 7)  # Iterate only over paths that are 7 characters long.\n```\n\n\n## Working with Cancellation Tokens\n\nUsing [cancellation tokens](https://cantok.readthedocs.io/en/latest/the_pattern/) from the [`cantok`](https://github.com/pomponchik/cantok) library, you can set an arbitrary condition under which file traversal stops.\n\n\u003e There are two ways to do this ↓\n\n1. If you use the crawler as a one-time object for a single iteration, set the token when creating it:\n\n```python\nfrom cantok import TimeoutToken\n\nfor path in Crawler('.', token=TimeoutToken(0.0001)): # Limit the iteration time to 0.0001 seconds.\n    print(path)\n```\n\n2. 
If you plan to use the crawler object several times, use the `go()` method for iteration and pass a new token to it every time:\n\n```python\ncrawler = Crawler('.')\n\nfor path in crawler.go(token=TimeoutToken(0.0001)): # Limit the iteration time to 0.0001 seconds.\n    print(path)\n```\n\n\u003e ↑ Follow these rules to avoid accidentally \"baking\" an expired token inside a crawler object.\n\nBy default, cancellation stops iteration silently — the caller cannot tell it apart from natural exhaustion. Pass `raise_on_cancel=...` to make the crawler raise an exception on cancellation instead:\n\n```python\nfor path in Crawler('.', token=TimeoutToken(0.0001), raise_on_cancel=True):\n    print(path)\n```\n\n\u003e ↑ `raise_on_cancel=True` re-raises the native `cantok` exception; `raise_on_cancel=MyError(\"...\")` raises that exact instance; `raise_on_cancel=MyError` instantiates the class with the cantok message and raises that. Default is `False` (silent).\n\n\n## Combination\n\nYou can combine multiple crawler objects into one using the usual addition operator, like this:\n\n```python\nfor path in Crawler('../dirstree') + Crawler('../cantok'):\n    print(path)\n```\n\n\u003e ↑ The paths that you will iterate over will be automatically deduplicated.\n\n\u003e ↑ You can also impose arbitrary restrictions on each of the summed objects, all of them will be taken into account.\n\nYou can also pass multiple paths to a single crawler object:\n\n```python\nfor path in Crawler('../dirstree', '../cantok'):\n    print(path)\n```\n\n\u003e ↑ In this case, there is no deduplication of paths.\n\n\n## Transactionality\n\nIf you plan to modify the directory while iterating over it — for example, deleting or moving files inside an `apply()` callback — pass `freeze=True` to take a snapshot of every matching path up front, then iterate that snapshot instead of the live filesystem:\n\n```python\nCrawler('path/to/directory', freeze=True).apply(lambda p: p.unlink())\n```\n\n\u003e ↑ The 
snapshot is built on the first step of iteration, with every filter and cancellation token already applied. After that, any creation, renaming or deletion happening in the directory does not affect what is yielded — each call to `go()` or `iter()` produces its own fresh snapshot.\n\n\u003e ↑ Without `freeze=True` the order of yielded paths depends on the live state of the filesystem, so mid-iteration mutation may silently skip or duplicate entries.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmutating%2Fdirstree","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmutating%2Fdirstree","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmutating%2Fdirstree/lists"}