Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mariot/chan-downloader
CLI to download all images/webms in a 4chan thread
- Host: GitHub
- URL: https://github.com/mariot/chan-downloader
- Owner: mariot
- License: MIT
- Created: 2019-07-08T00:56:09.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-08-22T08:22:22.000Z (over 2 years ago)
- Last Synced: 2024-08-11T12:33:02.860Z (5 months ago)
- Topics: 4chan, 4chan-downloader, crawler, scraper
- Language: Rust
- Size: 45.9 KB
- Stars: 44
- Watchers: 4
- Forks: 5
- Open Issues: 1
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
chan-downloader
===============
Clone of [4chan-downloader](https://github.com/Exceen/4chan-downloader/) written in Rust. CLI to download all images/webms of a 4chan thread.
If you use the reload flag, previously saved images won't be redownloaded.
Best results are obtained with the option `-c 4` (4 concurrent downloads).
```bash
USAGE:
    chan-downloader [FLAGS] [OPTIONS] --thread <thread>

FLAGS:
    -h, --help       Prints help information
    -r, --reload     Reload thread every t minutes to get new images
    -V, --version    Prints version information

OPTIONS:
    -c, --concurrent <concurrent>    Number of concurrent requests (Default is 2)
    -i, --interval <interval>        Time between each reload (in minutes. Default is 5)
    -l, --limit <limit>              Time limit for execution (in minutes. Default is 120)
    -o, --output <output>            Output directory (Default is 'downloads')
    -t, --thread <thread>            URL of the thread
```

chan_downloader
===============
You can also use chan_downloader, the library used by the CLI.

## save_image
Saves the image from the url to the given path. Returns the path on success.
```rust
use reqwest::Client;
use std::env;
use std::fs::remove_file;
let client = Client::new();
let workpath = env::current_dir().unwrap().join("1489266570954.jpg");
let url = "https://i.4cdn.org/wg/1489266570954.jpg";
let answer = chan_downloader::save_image(url, workpath.to_str().unwrap(), &client).unwrap();
assert_eq!(workpath.to_str().unwrap(), answer);
remove_file(answer).unwrap();
```

## get_page_content
Returns the page content from the given url.
```rust
use reqwest::Client;
let client = Client::new();
let url = "https://boards.4chan.org/wg/thread/6872254";
match chan_downloader::get_page_content(url, &client) {
    Ok(page) => println!("Content: {}", page),
    Err(err) => eprintln!("Error: {}", err),
}
```

## get_thread_infos
Returns the board name and thread id.
```rust
let url = "https://boards.4chan.org/wg/thread/6872254";
let (board_name, thread_id) = chan_downloader::get_thread_infos(url);
assert_eq!(board_name, "wg");
assert_eq!(thread_id, "6872254");
```

## get_image_links
Returns the links and the number of links from a page. Note that the links are doubled.
```rust
use reqwest::Client;
let client = Client::new();
let url = "https://boards.4chan.org/wg/thread/6872254";
match chan_downloader::get_page_content(url, &client) {
    Ok(page_string) => {
        let (links_iter, number_of_links) = chan_downloader::get_image_links(page_string.as_str());
        assert_eq!(number_of_links, 4);
        for cap in links_iter.step_by(2) {
            println!("{} and {}", &cap[1], &cap[2]);
        }
    },
    Err(err) => eprintln!("Error: {}", err),
}
```
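Since each link appears twice among the captures, stepping by two visits every image exactly once. A minimal stdlib-only sketch of that pattern (the filenames here are placeholder data standing in for the regex captures returned by `get_image_links`):

```rust
fn main() {
    // Placeholder capture list: every image URL shows up twice,
    // mirroring the doubled links returned for a thread page.
    let captures = vec![
        "1489266570954.jpg", "1489266570954.jpg",
        "1489266570955.jpg", "1489266570955.jpg",
    ];
    // step_by(2) takes indices 0, 2, 4, ... — the first copy of each pair.
    let unique: Vec<&str> = captures.iter().step_by(2).copied().collect();
    assert_eq!(unique, ["1489266570954.jpg", "1489266570955.jpg"]);
    println!("{} unique images from {} captures", unique.len(), captures.len());
}
```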