https://github.com/arghyadipchak/craww
Gemini (protocol) crawler written in Rust
- Host: GitHub
- URL: https://github.com/arghyadipchak/craww
- Owner: arghyadipchak
- Created: 2022-11-18T14:01:53.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-11-03T13:05:16.000Z (2 months ago)
- Last Synced: 2024-11-03T14:18:47.477Z (2 months ago)
- Topics: crawler, gemini, gemini-protocol, rust
- Language: Rust
- Homepage:
- Size: 49.8 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Craww
A Gemini crawler written in Rust. An Information Retrieval project by [Arghyadip](https://github.com/arghyadipchak/) and [Gurdit](https://github.com/16bitmood/) at [CMI](https://www.cmi.ac.in).
## Getting Started
### For Docker (Recommended)
1. Install Docker and the `docker-compose-plugin`
2. Clone the repository
```sh
git clone https://github.com/arghyadipchak/craww
```
3. Create a `config.toml` file (see the example under Configuration)
4. Build and run
```sh
docker compose up
```
### For Non-Docker
1. [Install Rust](https://www.rust-lang.org/tools/install)
2. Clone the repository
```sh
git clone https://github.com/arghyadipchak/craww
```
3. Build Craww
```sh
cargo build --release
```
4. Create a `config.toml` file (example config below)
5. Run Craww
```sh
./target/release/craww
```
Alternatively, build and run Craww in one step (as an unoptimized debug build unless `--release` is passed) with
```sh
cargo run
```
## Configuration
Example config file (`config.toml`)
```toml
root = "gemini.circumlunar.space" # Root seed
timeout = 5 # Connection timeout (in seconds)
database = "store.db" # SQLite file

[cache] # Bloom filter config
expected_web_pages = 100000
false_positive_rate = 0.01
```
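
To get a feel for what the two `[cache]` values imply, here is a minimal sketch of the standard Bloom filter sizing formulas. This is an illustration only: `bloom_params` and `BloomParams` are hypothetical names, and Craww's own Bloom filter may size itself differently.

```rust
/// Bloom filter sizing derived from the two `[cache]` settings.
/// (Hypothetical helper type, not part of Craww.)
#[derive(Debug, PartialEq)]
struct BloomParams {
    bits: u64,   // m: size of the bit array
    hashes: u32, // k: number of hash functions
}

/// Standard sizing formulas for n expected items and target rate p:
///   m = ceil(-n * ln(p) / (ln 2)^2)
///   k = round((m / n) * ln 2), at least 1
fn bloom_params(expected_items: u64, false_positive_rate: f64) -> BloomParams {
    assert!(expected_items > 0, "expected_items must be positive");
    assert!(
        false_positive_rate > 0.0 && false_positive_rate < 1.0,
        "false_positive_rate must be in (0, 1)"
    );
    let n = expected_items as f64;
    let ln2 = std::f64::consts::LN_2;
    let bits = (-n * false_positive_rate.ln() / (ln2 * ln2)).ceil();
    let hashes = ((bits / n) * ln2).round().max(1.0);
    BloomParams {
        bits: bits as u64,
        hashes: hashes as u32,
    }
}

fn main() {
    // Values taken from the example config above.
    let p = bloom_params(100_000, 0.01);
    println!(
        "bits = {} (~{} KiB), hashes = {}",
        p.bits,
        p.bits / 8 / 1024,
        p.hashes
    );
}
```

Under these formulas, the example config works out to roughly 958,500 bits (about 117 KiB) and 7 hash functions. Lowering `false_positive_rate` or raising `expected_web_pages` increases the memory needed.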