Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rudokemper/json-media-scraper
A Python script to scrape all media content from a JSON file
- Host: GitHub
- URL: https://github.com/rudokemper/json-media-scraper
- Owner: rudokemper
- Created: 2023-10-06T20:44:39.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-10-06T20:45:35.000Z (over 1 year ago)
- Last Synced: 2024-12-12T14:39:39.111Z (29 days ago)
- Language: Python
- Size: 1000 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# JSON Media Scraper
This script downloads the media files referenced in a JSON file, using a given base URL, and saves each one to a directory chosen by its type (image, audio, or video). The JSON file is the input and should contain the URLs of the media files to download.
### Usage
To use this script, you need to provide two command-line arguments:
- `-u` or `--url`: The base URL to scrape from.
- `-f` or `--file`: The path to the JSON file to load data from.

Here is an example of how to run the script:
```
python ./main.py -u http://your_url -f ./your_json_file
```
### Directory Structure
The script will automatically create the following directories if they do not exist:
- `./media/images/` for image files
- `./media/audio/` for audio files
- `./media/video/` for video files
### JSON File Structure
The JSON file should contain the URLs of the media files to be downloaded. The script traverses the JSON data and downloads any media files it encounters, identifying each one by the `Content-Type` header of its HTTP response.
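
The behaviour described in this README can be sketched roughly as follows. This is a minimal illustration, not the contents of `main.py`: the helper names (`parse_args`, `iter_strings`, `media_dir`, `main`) are hypothetical, and it assumes that every string value in the JSON is a candidate path resolved against the `-u` base URL, and that the top-level part of `Content-Type` selects the target directory.

```python
import argparse
import json
import os
from http.client import HTTPException
from urllib.parse import urljoin, urlsplit
from urllib.request import urlopen

# Directory names are taken from the README; the mapping keys are an assumption.
MEDIA_DIRS = {
    "image": "./media/images/",
    "audio": "./media/audio/",
    "video": "./media/video/",
}


def parse_args(argv=None):
    """Parse the two documented options, -u/--url and -f/--file."""
    parser = argparse.ArgumentParser(description="Scrape media listed in a JSON file")
    parser.add_argument("-u", "--url", required=True, help="base URL to scrape from")
    parser.add_argument("-f", "--file", required=True, help="path to the JSON file")
    return parser.parse_args(argv)


def iter_strings(node):
    """Yield every string value found in nested dicts and lists."""
    if isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)
    elif isinstance(node, str):
        yield node


def media_dir(content_type):
    """Return the target directory for a Content-Type value, or None."""
    major = content_type.split("/", 1)[0].strip().lower()
    return MEDIA_DIRS.get(major)


def main(argv=None):
    """Hypothetical driver: request each candidate and keep media responses."""
    args = parse_args(argv)
    with open(args.file, encoding="utf-8") as handle:
        data = json.load(handle)
    for value in iter_strings(data):
        # Assumption: relative paths are resolved against the base URL.
        url = urljoin(args.url, value)
        if urlsplit(url).scheme not in ("http", "https"):
            continue
        try:
            with urlopen(url, timeout=30) as response:
                folder = media_dir(response.headers.get("Content-Type", ""))
                if folder is None:
                    continue  # not an image, audio, or video response
                os.makedirs(folder, exist_ok=True)
                name = os.path.basename(urlsplit(url).path) or "download"
                with open(os.path.join(folder, name), "wb") as out:
                    out.write(response.read())
        except (OSError, ValueError, HTTPException):
            continue  # unreachable or malformed candidate; skip it
```

Calling `main(["-u", "http://your_url", "-f", "./your_json_file"])` would mirror the command shown under Usage. Classifying by the response header rather than the file extension means URLs without an extension can still be sorted into the right directory, at the cost of one request per candidate string.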