Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sp1thas/scrapy-folder-tree
[MIRROR OF https://codeberg.org/sp1thas/scrapy-folder-tree] A scrapy pipeline which stores files using folder trees.
folder-structure folder-tree hacktoberfest python3 scrapy scrapy-extension scrapy-pipeline
- Host: GitHub
- URL: https://github.com/sp1thas/scrapy-folder-tree
- Owner: sp1thas
- License: mit
- Created: 2022-02-01T10:18:44.000Z (almost 3 years ago)
- Default Branch: master
- Last Pushed: 2025-01-06T17:13:04.000Z (14 days ago)
- Last Synced: 2025-01-11T21:10:52.020Z (9 days ago)
- Topics: folder-structure, folder-tree, hacktoberfest, python3, scrapy, scrapy-extension, scrapy-pipeline
- Language: Python
- Homepage: https://scrapy-folder-tree.simakis.me/?utm=gh
- Size: 576 KB
- Stars: 9
- Watchers: 3
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: docs/contributing.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
# scrapy-folder-tree
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/sp1thas/scrapy-folder-tree/master.svg)](https://results.pre-commit.ci/latest/github/sp1thas/scrapy-folder-tree/master)
[![codecov](https://codecov.io/gh/sp1thas/scrapy-folder-tree/branch/master/graph/badge.svg?token=Y4LGLWOD11)](https://codecov.io/gh/sp1thas/scrapy-folder-tree)
![PyPI](https://img.shields.io/pypi/v/scrapy-folder-tree)
[![GitHub license](https://img.shields.io/github/license/sp1thas/scrapy-folder-tree)](https://github.com/sp1thas/scrapy-folder-tree/blob/master/LICENSE)
![PyPI - Format](https://img.shields.io/pypi/format/scrapy-folder-tree)
![PyPI - Status](https://img.shields.io/pypi/status/scrapy-folder-tree)

This is a Scrapy pipeline that provides an easy way to store files and images using various folder structures.
## Supported folder structures:
Given a scraped file named `05b40af07cb3284506acbf395452e0e93bfc94c8.jpg`, you can choose one of the following folder structures:
Using the file name
class: `scrapy_folder_tree.ImagesHashTreePipeline`
```
full
├── 0
. ├── 5
. . ├── b
. . . ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
```
Using the crawling time
class: `scrapy_folder_tree.ImagesTimeTreePipeline`
```
full
├── 0
. ├── 11
. . ├── 48
. . . ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
```
Using the crawling date
class: `scrapy_folder_tree.ImagesDateTreePipeline`
```
full
├── 2022
. ├── 1
. . ├── 24
. . . ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
```
## Installation
```shell
pip install scrapy-folder-tree
```
## Usage
Use the following settings in your project:
```python
ITEM_PIPELINES = {
    'scrapy_folder_tree.FilesHashTreePipeline': 300,
}
```
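To make the three layouts under "Supported folder structures" concrete, here is a minimal, standalone sketch of how such paths could be derived. It is not the package's actual implementation: the function names, the `depth` and `root` parameters, and the reading of the time tree as hour/minute/second are assumptions inferred from the example trees above.

```python
from datetime import datetime
from pathlib import PurePosixPath


def hash_tree_path(filename: str, depth: int = 3, root: str = "full") -> str:
    """Nest the file under one folder per leading character of its name.

    `depth` is a hypothetical parameter; 3 matches the example tree above.
    """
    folders = list(filename[:depth])
    return str(PurePosixPath(root, *folders, filename))


def time_tree_path(filename: str, when: datetime, root: str = "full") -> str:
    """Nest the file under hour/minute/second of the crawl time (assumed order)."""
    folders = [str(when.hour), str(when.minute), str(when.second)]
    return str(PurePosixPath(root, *folders, filename))


def date_tree_path(filename: str, when: datetime, root: str = "full") -> str:
    """Nest the file under year/month/day of the crawl date."""
    folders = [str(when.year), str(when.month), str(when.day)]
    return str(PurePosixPath(root, *folders, filename))


name = "05b40af07cb3284506acbf395452e0e93bfc94c8.jpg"
# Hypothetical crawl timestamp chosen to match the example trees above.
crawled_at = datetime(2022, 1, 24, 0, 11, 48)

print(hash_tree_path(name))
print(time_tree_path(name, crawled_at))
print(date_tree_path(name, crawled_at))
```

With the sample file name and timestamp, the three calls yield paths that mirror the hash, time, and date trees shown earlier. Splitting files across nested folders like this keeps any single directory from accumulating a very large number of entries.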