https://github.com/moreati/scanwalk
Walk a directory tree using os.scandir(), generating DirEntry objects
- Host: GitHub
- URL: https://github.com/moreati/scanwalk
- Owner: moreati
- License: mit
- Created: 2022-07-31T13:11:53.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2022-11-21T21:27:41.000Z (almost 3 years ago)
- Last Synced: 2025-03-18T07:47:41.814Z (7 months ago)
- Language: Python
- Homepage: https://pypi.org/project/scanwalk/
- Size: 150 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
scanwalk
========

`scanwalk.walk()` walks a directory tree, generating `DirEntry` objects.
It's an alternative to `os.walk()` modelled on `os.scandir()`.

```pycon
>>> import scanwalk
>>> for entry in scanwalk.walk('demo'):
...     print('📁' if entry.is_dir() else '📄', entry.path)
...
📁 demo
📁 demo/dir2
📁 demo/dir1
📁 demo/dir1/dir1.1
📄 demo/dir1/dir1.1/file_a
📄 demo/dir1/file_c
📁 demo/dir1/dir1.2
📄 demo/dir1/dir1.2/file_b
```

A rough equivalent using `os.walk()` would be:

```pycon
>>> import os
>>> for parent, dirnames, filenames in os.walk('demo'):
...     print('📁', parent)
...     for name in filenames:
...         print('📄', os.path.join(parent, name))
...
📁 demo
📁 demo/dir2
📁 demo/dir1
📄 demo/dir1/file_c
📁 demo/dir1/dir1.1
📄 demo/dir1/dir1.1/file_a
📁 demo/dir1/dir1.2
📄 demo/dir1/dir1.2/file_b
```

To skip the contents of a directory, set the `DirEntry.skip` attribute:

```pycon
>>> import scanwalk
>>> for entry in scanwalk.walk('demo'):
...     if entry.name == 'dir1.1':
...         entry.skip = True
...     else:
...         print(entry.path)
...
demo
demo/dir2
demo/dir1
demo/dir1/file_c
demo/dir1/dir1.2
demo/dir1/dir1.2/file_b
```

## Comparison
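For comparison with the `skip` example above, the usual way to prune a subtree with `os.walk()` is to edit `dirnames` in place, which only works in the default top-down mode. The sketch below uses only the standard library; `walk_pruned` and `skip_names` are hypothetical names chosen for illustration, not part of `scanwalk` or `os`.

```python
import os


def walk_pruned(top, skip_names):
    """Yield paths under `top`, never descending into directories whose
    name is in `skip_names`. (Hypothetical helper, for illustration only.)
    """
    # topdown=True is the default; in-place pruning relies on it.
    for parent, dirnames, filenames in os.walk(top):
        yield parent
        for name in filenames:
            yield os.path.join(parent, name)
        # Slice assignment mutates the list os.walk() holds on to.
        # Rebinding `dirnames` to a new list would not affect the traversal.
        dirnames[:] = [d for d in dirnames if d not in skip_names]


if __name__ == "__main__":
    # Assumes a `demo` tree like the one in the examples above.
    for path in walk_pruned("demo", {"dir1.1"}):
        print(path)
```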
| | `os.walk()` | `scanwalk.walk()` |
|-------------|--------------------------------------|----------------------------------------------------|
| Yields | `(dirpath, dirnames, filenames)` | `DirEntry` objects |
| Consumers | Nested `for` loops | Flat `for` loop, list comprehension, or generator expression |
| Grouping | Directories & files separated | Directories & files intermingled |
| Traversal | Depth first, top-down or bottom-up | Semi depth first, directories traversed on arrival |
| Exceptions | `onerror()` callback | `try`/`except` block |
| Allocations | Builds intermediate lists | Direct from `os.scandir()` |
| Maturity | Mature | Alpha |
| Tests | Thorough automated unit tests | None |
| Performance | 1.0x | 1.1 - 1.2x faster |

## Installation

```sh
python -m pip install scanwalk
```

## Requirements
- Python 3.7+
## License
MIT
## Questions and Answers
### What's wrong with `os.walk()`?
`os.walk()` is plenty good enough. Its return type is just awkward to use
inside a list comprehension, a generator expression, or similar.

### Why use `scanwalk`?
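As an illustration of the comprehension point, the sketch below collects every file path under a tree in two ways. It deliberately avoids importing `scanwalk`: `flat_walk()` is a hypothetical, standard-library stand-in that yields `os.DirEntry` objects from `os.scandir()`, and it is not `scanwalk`'s implementation (for one thing, it yields no entry for the top directory itself). The helper names `files_via_os_walk` and `files_via_flat_walk` are likewise invented for this example.

```python
import os


def flat_walk(top):
    """Hypothetical stand-in: recursively yield os.DirEntry objects below `top`."""
    with os.scandir(top) as entries:
        for entry in entries:
            yield entry
            # Descend into real directories only, not symlinks to directories.
            if entry.is_dir(follow_symlinks=False):
                yield from flat_walk(entry.path)


def files_via_os_walk(top):
    # Two `for` clauses are needed to flatten (dirpath, dirnames, filenames),
    # and each path has to be rebuilt with os.path.join().
    return [
        os.path.join(parent, name)
        for parent, _dirnames, filenames in os.walk(top)
        for name in filenames
    ]


def files_via_flat_walk(top):
    # One `for` clause: each item already carries its own path and type.
    return [entry.path for entry in flat_walk(top) if entry.is_file()]
```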
`scanwalk.walk()` ekes out a little more speed (10-20% in an ad hoc benchmark).
It doesn't require nested `for` loops, so code is a bit easier to read and write.
In particular, list comprehensions and generator expressions become simpler.

### Why not use `scanwalk`?
`scanwalk` is still alpha, mostly untested, and almost entirely undocumented.
It only supports newer Pythons, on platforms with a working `os.scandir()`.

`scanwalk.walk()` behaviour differs from `os.walk()`:
- Directories and files are intermingled, rather than separated
- Traversal is always semi depth-first

## Related work
- [`scandir`](https://pypi.org/project/scandir/) - backport of `os.scandir()`
  for Python 2.7 and 3.4

## TODO
- Implement context manager protocol, similar to `os.scandir()`
- Documentation
- Tests
- Continuous Integration
- Coverage
- Code quality checks (MyPy, flake8, etc.)
- `scanwalk.copytree()`?
- `scanwalk.DirEntry.depth`?
- Linux io_uring support?
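
On the context manager item: `os.scandir()` iterators can be used in a `with` statement so that their directory handles are released promptly. One way to get similar behaviour from a generator-based walker today is `contextlib.closing()`. The sketch below shows the idea with a hypothetical, standard-library generator `flat_walk()` and an invented helper `first_match()`. It is not `scanwalk`'s code, and whether it applies to `scanwalk.walk()` depends on that function returning a generator, which this README does not state.

```python
import contextlib
import os


def flat_walk(top):
    """Hypothetical stand-in for a scandir-based walker (not scanwalk's code)."""
    with os.scandir(top) as entries:  # os.scandir() iterators are context managers
        for entry in entries:
            yield entry
            if entry.is_dir(follow_symlinks=False):
                yield from flat_walk(entry.path)


def first_match(top, predicate):
    """Return the path of the first entry satisfying `predicate`, or None."""
    # closing() calls walker.close() on exit. That raises GeneratorExit inside
    # the generator, which unwinds every nested `with os.scandir(...)` block,
    # so no directory handle waits for garbage collection after an early return.
    with contextlib.closing(flat_walk(top)) as walker:
        for entry in walker:
            if predicate(entry):
                return entry.path
    return None
```

A call such as `first_match('demo', lambda e: e.name == 'file_a')` stops at the first hit and closes the walk immediately.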