Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/kordless/grub-2.0
Grub is an AI powered Web crawler.
https://github.com/kordless/grub-2.0
Last synced: 3 months ago
JSON representation
Grub is an AI powered Web crawler.
- Host: GitHub
- URL: https://github.com/kordless/grub-2.0
- Owner: kordless
- Created: 2020-10-19T18:41:48.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-12-17T17:02:55.000Z (almost 2 years ago)
- Last Synced: 2024-05-15T04:34:19.099Z (6 months ago)
- Language: Python
- Homepage:
- Size: 8.97 MB
- Stars: 19
- Watchers: 3
- Forks: 1
- Open Issues: 4
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Grub-2.0
Grub is a system for crawling and indexing documents and then running machine learning models to them. This is a work in progress.# Deploy a Solr "Neural" Indexer.
These scripts deploy a single Solr instance (9.1.0) running on Google Cloud.There is also a [Docker version of Solr available](https://hub.docker.com/_/solr), if you don't use Google Cloud.
## Option #1 - Run on Google Cloud
Change into the solr/scripts directory and create a file called secrets.sh:```
$ vi secrets.sh
TOKEN=f00bar
:x
```
Now that's saved, deploy the instance:
```
$ ./deploy-solr.sh
```
> Instance will be running in 2.5 minutes, listening on port 8389.The URL for the admin interface looks like this: http://solr:[email protected]:8389
## Option #2 - Run a Docker Container
You can also run Solr in a Docker container:```
docker run -p 8983:8983 -t solr
```# API
## Fastener
Deploy a controller box for managing spot Solr instances using an API. The scripts should be copied to Google Compute instance templates.```
$ ./deploy-fastener.sh
```Instance will be running and listening on port 80.