{"id":30451909,"url":"https://github.com/varppi/jupitersearch","last_synced_at":"2026-04-18T00:31:40.524Z","repository":{"id":229285081,"uuid":"773081281","full_name":"varppi/JupiterSearch","owner":"varppi","description":"JupiterSearch distributed text search database","archived":false,"fork":false,"pushed_at":"2025-08-09T07:35:06.000Z","size":85,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-23T18:06:58.998Z","etag":null,"topics":["database","distributed-database","text-search"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/varppi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-03-16T17:36:37.000Z","updated_at":"2025-08-09T07:35:10.000Z","dependencies_parsed_at":null,"dependency_job_id":"80490f22-0230-4dc1-837f-022a2b0eb79a","html_url":"https://github.com/varppi/JupiterSearch","commit_stats":null,"previous_names":["r00tendo/jupitersearch","spoofimei/jupitersearch","varppi/jupitersearch"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/varppi/JupiterSearch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varppi%2FJupiterSearch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varppi%2FJupiterSearch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varppi%2FJupiterSearch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varppi%
2FJupiterSearch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/varppi","download_url":"https://codeload.github.com/varppi/JupiterSearch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varppi%2FJupiterSearch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31951255,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-17T17:29:20.459Z","status":"ssl_error","status_checked_at":"2026-04-17T17:28:47.801Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","distributed-database","text-search"],"created_at":"2025-08-23T14:19:04.059Z","updated_at":"2026-04-18T00:31:40.489Z","avatar_url":"https://github.com/varppi.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg width=500 
src=\"https://github.com/Varppi/JupiterSearch/assets/72181445/df7259fc-862f-4c47-848a-b53edf473c31\"\u003e\u003c/img\u003e\n\n![Maintained](https://img.shields.io/badge/Maintained%3F-yes-green.svg?style=for-the-badge)\n![Go](https://img.shields.io/badge/Go-00ADD8?style=for-the-badge\u0026logo=go\u0026logoColor=white)\n![Docker](https://img.shields.io/badge/docker-%230db7ed.svg?style=for-the-badge\u0026logo=docker\u0026logoColor=white)\n![Linux](https://img.shields.io/badge/Linux-FCC624?style=for-the-badge\u0026logo=linux\u0026logoColor=black)\n# JupiterSearch (version 1.3.3)\n\nJupiterSearch is an easy-to-set-up distributed text search database designed for finding unique information or keywords, such as serial numbers, email addresses, and domain names, in huge amounts of unstructured data such as websites, documents, and emails.\n\n**What JupiterSearch offers you:**\n- Easy to set up\n- Suitable for unstructured data like emails/documents/web pages\n- Can handle terabytes of data\n- Client library\n- Trivial horizontal scaling\n\n**What JupiterSearch is NOT good for:**\n- Relational data\n- Extremely sensitive data (because HTTPS is not enabled by default and keys are stored in plaintext in conf files)\n\n\u003cbr\u003e\n\n**Todo (in chronological order from oldest to newest):**\n- [x] Custom tokenization\n- [x] HTTPS\n- [x] Multiple queries\n- [x] Docker\n- [x] Make repo public\n- [x] GitHub wiki\n\nNew improvement ideas are welcome :)\n\u003cbr\u003e\n\n# Getting started\n### Prerequisites\n- Go (preferably the latest version)\n- Linux-based system or any other OS with Docker\n- At least 2 GB of disk space\n\n\u003cbr\u003e\n\n### Installation\nDownload JupiterSearch either by using `git clone` or by downloading and unpacking the zip file on this page.\n\n```sh\ngit clone https://github.com/Varppi/JupiterSearch\n```\n\n\u003cbr\u003e\n\nRun `make install` as root to automatically download the dependencies, compile the programs, and install 
JupiterServer, JupiterNode, and JupiterClient on your system (/usr/local/bin).\n\n```sh\nsudo make install\n```\n\n\u003cbr\u003e\n\n### Docker\nYou can also run JupiterSearch with Docker. To do so, begin by creating a network.\n```sh\ndocker network create --subnet 172.18.0.0/16 JupiterSearch\n```\nThe `--subnet` option above sets the IP range of the newly created network to 172.18.0.0/16.\n\n\u003cbr\u003e\n\nBefore building and running the images, configure the settings to your liking in configs/ (do not change the data directory setting unless you use `-v`).\n\n\u003cbr\u003e\n\nNow build the image(s):\n```\n# JupiterNode:\ndocker build -t jupiternode -f JupiterNode-Dockerfile .\n\n# JupiterServer:\ndocker build -t jupiterserver -f JupiterServer-Dockerfile .\n```\n\nRun the image(s):\n```\n# JupiterServer:\ndocker run --net JupiterSearch --ip 172.18.0.50 jupiterserver\n\n# JupiterNode:\ndocker run --net JupiterSearch -p 9190:9190 --ip 172.18.0.51 jupiternode # Change 9190:9190 to the correct ports if you changed the defaults\n```\n\nIf you want persistent storage, use the `-v` flag to mount a directory from your host system onto the container's data directory.\n```\ndocker run --net JupiterSearch -p 9190:9190 --ip 172.18.0.51 -v pathfromhostsystem:/JupiterSearch/data jupiternode\n```\n\n\u003cbr\u003e\n\n### Config settings (the ones marked with * are required)\n- #### Node\n  - *`datadir`: Path where the database will be stored\n  - *`max_concurrent_ingests`: Number of concurrent store requests that are allowed\n  - *`name`: The name shown as the source of results when you run a query\n    \n- #### Master server\n  - *`client_key` \u003cb\u003e(IMPORTANT)\u003c/b\u003e: This is essentially the password for the whole system. 
Clients authenticate using this.\n  - *`nodes` \u003cb\u003e(IMPORTANT)\u003c/b\u003e: List of nodes separated by a space like this: `nodes=http://127.0.0.1:9192 http://127.0.0.1:9193`\n\n- #### Universal\n  - *`api_listen`: The host and port the REST API will be bound to\n  - *`node_key`: A key that the master server will use to authenticate itself to the node\n  - `tls_cert`: Path to a public certificate (for encrypted REST API traffic)\n  - `tls_private`: Path to a private key (used with `tls_cert`)\n\n\u003cbr\u003e\n\n### Configuring node(s)\nOpen \u003cb\u003e\u003ci\u003e/etc/JupiterSearch/JupiterNode.conf\u003c/i\u003e\u003c/b\u003e with your favorite text editor on the machine you want to use as a node.\n\nWhen you open the file, you will be greeted with these default settings:\n```env\nname=main_node\ndatadir=data\napi_listen=127.0.0.1:9192\nnode_key=JupiterKey\nmax_concurrent_ingests=5\n```\n\nYou can leave most of these at their defaults, but I highly recommend changing `node_key`: if you keep the default and bind JupiterNode to all interfaces, anyone on the network could get access to your node.\n\n#### Making JupiterNode accessible from LAN\nUnless you're planning to use JupiterSearch on a single machine that runs both JupiterServer and JupiterNode, you will want to change `api_listen` to bind to all interfaces or to your specific network adapter:\n```sh\napi_listen=0.0.0.0:9192\n```\n\n\u003cbr\u003e\n\n### Configuring the master server\nOpen \u003cb\u003e\u003ci\u003e/etc/JupiterSearch/JupiterServer.conf\u003c/i\u003e\u003c/b\u003e with your favorite text editor on the machine you want to use as the master server (the one clients can use to store and query data).\n\nThese are the default settings:\n```sh\napi_listen=127.0.0.1:9190\nnode_key=JupiterKey\nclient_key=changeme\nnodes=http://127.0.0.1:9192\n```\n\nChange the `client_key` to something strong and random. Think of it as an API key. 
A client that has it can do everything.\n\nIf you changed `node_key` from the default in the node configs, set the same key as the value of `node_key` in the server config as well.\n\nAdd your nodes to the `nodes` variable, separated by a space character.\n\n\u003cbr\u003e\n\n### Customizing the tokenization\nBy default, JupiterSearch extracts all the words and other information by running this regex against the data: `[\\w+.+_+@]{4,}`. It matches runs of at least four characters made up of word characters, `+`, `.`, `_`, and `@`.\n\nHowever, you can customize this by editing the regex found in \u003cb\u003e\u003ci\u003e/etc/JupiterSearch/tokenization_regex\u003c/i\u003e\u003c/b\u003e.\n\n\n# Usage\nThere are two ways you can run JupiterNode and JupiterServer.\n- As a service\n- Command line\n\nI recommend first running both on the command line with the `--debug` flag to make sure everything is working. After that, it is easier to run them as a service.\n#### Command line\nJupiterServer:\n```\nJupiterServer --start --debug\n```\n\nJupiterNode:\n```\nJupiterNode --start --debug\n```\n\n#### Service\nJupiterServer:\n```\nsystemctl start JupiterServer\n```\n\nJupiterNode:\n```\nsystemctl start JupiterNode\n```\n\nRemember to start JupiterNode first: JupiterServer tries to connect to every node listed in its config file and ignores any node it cannot reach.\n\n\u003cbr\u003e\n\n### JupiterClient\nUnless you want to code a client yourself, using JupiterClient is a solid option for manually operating JupiterSearch.\n\nJupiterClient syntax:\n```sh\nJupiterClient --server \u003cmaster server url\u003e --key \u003cclient_key\u003e \u003carguments\u003e\n```\n\nExample:\n```sh\nJupiterClient --server http://127.0.0.1:9190 --key 3ms9dk2lfhs83bf9s20 --upload movies.json\n```\n\n\u003cbr\u003e\n\n# Making your own client\nIf you don't want to use JupiterClient or want to integrate JupiterSearch into your Go projects, you may be interested in creating your own client. 
Fortunately, this is very easy with the help of the JupiterSearch client library.\n\nRead more here: \u003ca href=\"https://github.com/Varppi/JupiterSearch/wiki/Client-library-usage\"\u003ehttps://github.com/Varppi/JupiterSearch/wiki/Client-library-usage\u003c/a\u003e\n\n# Understanding JupiterSearch\n### JupiterSearch parts\nJupiterSearch consists of three parts:\n- The client\n- The master server\n- The node(s)\n  \n#### Client\nBy client, I mean any program that stores data in or queries data from JupiterSearch. This could be the official client (JupiterClient) or another one that someone built using the client library.\n\n#### Master server\nThe master server is the service clients interact with. It keeps track of all the nodes, removes inactive ones, and makes sure that the data is spread evenly among all the nodes.\n\n#### Node(s)\nA node is the service that actually holds the data and can query it. It receives commands/requests from the master server and responds to them appropriately.\n\n\n\u003cimg width=700 src=\"https://github.com/Varppi/JupiterSearch/assets/72181445/04f567f8-b517-49cd-99db-f19fb3bc54ce\"\u003e\u003c/img\u003e\n\n\u003cbr\u003e\n\n### Data storage\nJupiterSearch uses \u003ca href=\"https://github.com/dgraph-io/badger\"\u003eBadger\u003c/a\u003e as its underlying database.\n\nWhen the master server receives a document to be stored, this is what happens in the backend:\n1. Master server: Checks the database size of every node, picks the node with the smallest database, and forwards the request to it.\n2. Node: Stores the full document in the database with a unique ID.\n3. Node: Converts the document to lowercase, tokenizes it (gets all words from it), and removes duplicates.\n4. 
Node: Loops through all the tokens (words, usernames, emails) and stores each one with the ID of the full document.\n\n\u003cbr\u003e\n\n#\n\n### Contributions are welcome\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvarppi%2Fjupitersearch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvarppi%2Fjupitersearch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvarppi%2Fjupitersearch/lists"}