https://github.com/sp1thas/scrapy-folder-tree
[MIRROR OF https://codeberg.org/sp1thas/scrapy-folder-tree] A scrapy pipeline which stores files using folder trees.
folder-structure folder-tree hacktoberfest python3 scrapy scrapy-extension scrapy-pipeline
Last synced: 4 months ago
- Host: GitHub
- URL: https://github.com/sp1thas/scrapy-folder-tree
- Owner: sp1thas
- License: mit
- Created: 2022-02-01T10:18:44.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2025-01-06T17:13:04.000Z (5 months ago)
- Last Synced: 2025-01-11T21:10:52.020Z (4 months ago)
- Topics: folder-structure, folder-tree, hacktoberfest, python3, scrapy, scrapy-extension, scrapy-pipeline
- Language: Python
- Homepage: https://scrapy-folder-tree.simakis.me/?utm=gh
- Size: 576 KB
- Stars: 9
- Watchers: 3
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: docs/contributing.md
- Funding: .github/FUNDING.yml
- License: LICENSE
# scrapy-folder-tree

This is a Scrapy pipeline that stores downloaded files and images in a configurable folder tree instead of a single flat directory.
## Supported folder structures:
Given this scraped file: `05b40af07cb3284506acbf395452e0e93bfc94c8.jpg`, you can choose the following folder structures:
Using the file name
class: `scrapy_folder_tree.ImagesHashTreePipeline`
```
full
├── 0
. ├── 5
. . ├── b
. . . ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
```

Using the crawling time

class: `scrapy_folder_tree.ImagesTimeTreePipeline`
```
full
├── 0
. ├── 11
. . ├── 48
. . . ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
```

Using the crawling date

class: `scrapy_folder_tree.ImagesDateTreePipeline`
```
full
├── 2022
. ├── 1
. . ├── 24
. . . ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
```

## Installation
```shell
pip install scrapy-folder-tree
```

## Usage
Enable one of the pipelines in your project's settings, for example:
```python
# settings.py
ITEM_PIPELINES = {
    'scrapy_folder_tree.FilesHashTreePipeline': 300,
}
```
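
To make the three layouts above concrete, here is a minimal sketch of how such paths could be derived. It is not the package's actual code: the function names, the `depth` and `root` parameters, and the hour/minute/second reading of the time tree are assumptions made for illustration, chosen so that the output matches the example trees in this README.

```python
from datetime import datetime
from pathlib import PurePosixPath


def hash_tree_path(filename: str, depth: int = 3, root: str = "full") -> str:
    """Nest the file under one folder per leading character of its name."""
    folders = list(filename[:depth])
    return str(PurePosixPath(root, *folders, filename))


def time_tree_path(filename: str, when: datetime, root: str = "full") -> str:
    """Nest the file under hour/minute/second folders (assumed order)."""
    folders = (str(when.hour), str(when.minute), str(when.second))
    return str(PurePosixPath(root, *folders, filename))


def date_tree_path(filename: str, when: datetime, root: str = "full") -> str:
    """Nest the file under year/month/day folders."""
    folders = (str(when.year), str(when.month), str(when.day))
    return str(PurePosixPath(root, *folders, filename))


# Hypothetical crawl timestamp picked to reproduce the example trees.
name = "05b40af07cb3284506acbf395452e0e93bfc94c8.jpg"
crawled_at = datetime(2022, 1, 24, 0, 11, 48)

print(hash_tree_path(name))
print(time_tree_path(name, crawled_at))
print(date_tree_path(name, crawled_at))
```

The hash layout spreads files evenly across folders regardless of when they were fetched, while the time and date layouts group files by crawl moment, which can make it easier to inspect or prune a single run.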