Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dotcomboom/Gophew
Gopher crawler and search engine
JSON representation
- Host: GitHub
- URL: https://github.com/dotcomboom/Gophew
- Owner: dotcomboom
- License: unlicense
- Created: 2019-03-03T20:45:42.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-02-14T23:50:32.000Z (over 1 year ago)
- Last Synced: 2024-08-02T05:11:56.347Z (3 months ago)
- Topics: gopher, gopher-crawler, gopher-server, pituophis
- Language: Python
- Size: 25.4 KB
- Stars: 3
- Watchers: 3
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Gophew
Gopher crawler and search-enabled server powered by Pituophis.

## crawler.py
This script creates the searchable database and outputs it as JSON. Edit settings inside the script.

```python
###########################
settings = {
    'limit_host': 'your.live.host', # Host to limit to (for indexing single servers, which is highly recommended)
    'only_record_host': True, # Record only items on the limited host
    'path_must_start_with': '/', # What the path/selector must start with
    'db_filename': 'db.json', # Filename to use for the database
    'delay': 2, # x second delay between grabbing files; please be courteous to servers you don't own!
    'crawl_url': 'gopher://your.live.host/1/', # URL to crawl (after finished updating the index)
    'cooldown': 86400, # Required cooldown in ms before crawling a URL again
    'ignore_types': ['i', '3'] # Types of items that should be ignored and not recorded
}
###########################
```
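
The `delay` and `cooldown` settings exist so the crawler doesn't hammer servers you don't own. A minimal sketch of that pacing logic, assuming Pituophis' top-level `get()` helper and its `Response.binary` attribute, with a placeholder host and selectors (the real crawl loop lives in crawler.py, and `cooldown` is treated as seconds here):

```python
import time

import pituophis

DELAY = 2          # mirrors the 'delay' setting: seconds between requests
COOLDOWN = 86400   # mirrors the 'cooldown' setting (treated as seconds here)

last_crawled = {}  # selector -> time of the last visit

def polite_fetch(selector, host='your.live.host', port=70):
    """Fetch one selector, skipping anything crawled within the cooldown."""
    now = time.time()
    if now - last_crawled.get(selector, 0.0) < COOLDOWN:
        return None  # crawled too recently
    response = pituophis.get(host, port=port, path=selector)
    last_crawled[selector] = now
    time.sleep(DELAY)  # be courteous to servers you don't own!
    return response

for selector in ['/', '/docs']:  # hypothetical selectors
    response = polite_fetch(selector)
    if response is not None:
        print(selector, len(response.binary), 'bytes')
```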

## gophew.py

The frontend Gopher server, which uses [Pituophis](https://github.com/dotcomboom/Pituophis) with an alternate handler.

```python
###########################
settings = {
    # Pituophis server options
    'host': 'your.live.host',
    'port': 70,
    'pub_dir': 'pub/',
    # Gophew
    'index': 'db.json', # Index to use (generated by crawler.py)
    'alternate_titles': True, # Whether to display alternate titles
    'referrers': True, # Whether to display referring URLs
    'search_path': '/search', # What the path must start with in order to do a search (a file shouldn't exist here for the alt handler to go off)
    'typestrings': True, # Allow filtering searches by type, i.e. /search01 for textfiles and directories.
    'root_path': '/', # Path to link to on the results page
    'allow_empty_queries': False, # Whether to allow empty search queries
    # Below lines can be disabled by setting them to None
    'root_text': 'Back to root',
    'new_search_text': 'Try another search',
    'new_search_text_same_filter': 'Try another search with the same criteria',
    'results_caption': 'Results for {} (out of {} items)',
    'types_caption': 'Filtering types: {}',
    'empty_queries_not_allowed_msg': 'Empty search queries are not allowed on this server.'
}
###########################
```
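
Once gophew.py is serving, a search is an ordinary Gopher type-7 request against `search_path`, optionally with a typestring appended. A client-side sketch under the settings above, again assuming Pituophis' `get()` helper; the host and query are placeholders:

```python
import pituophis

# Plain full-text search against the configured 'search_path'.
results = pituophis.get('your.live.host', port=70, path='/search', query='pituophis')
print(results.text())  # the raw Gopher menu of results

# With 'typestrings' enabled, type characters appended to the path narrow
# the search: /search01 matches only text files (0) and menus (1).
filtered = pituophis.get('your.live.host', port=70, path='/search01', query='pituophis')
print(filtered.text())
```

From a regular Gopher client, the same query is just the type-7 search item at `/search`.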