{"id":13689106,"url":"https://github.com/rtrevinnoc/FUTURE","last_synced_at":"2025-05-01T23:32:15.069Z","repository":{"id":155595088,"uuid":"260459445","full_name":"rtrevinnoc/FUTURE","owner":"rtrevinnoc","description":"A private, free, open-source search engine built on a P2P network","archived":true,"fork":false,"pushed_at":"2023-10-26T21:11:37.000Z","size":14053,"stargazers_count":21,"open_issues_count":1,"forks_count":3,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-08-03T15:15:24.829Z","etag":null,"topics":["css3","flask","flask-application","gensim","glove","glove-embeddings","glove-vectors","hnswlib","html5","javascript","js","json","lmdb","machine-learning","mongodb","natural-language-processing","natural-language-understanding","python","python3","search-engine"],"latest_commit_sha":null,"homepage":"https://wearebuildingthefuture.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rtrevinnoc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"COPYING.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2020-05-01T12:57:23.000Z","updated_at":"2023-10-26T21:12:03.000Z","dependencies_parsed_at":null,"dependency_job_id":"e8a5c37c-4dff-4cfb-b477-98d036e3e138","html_url":"https://github.com/rtrevinnoc/FUTURE","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rtrevinnoc%2FFUTURE","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rtrevinnoc%2FFUTURE/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rtrevinnoc%2FFUTURE/releases","manifests_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rtrevinnoc%2FFUTURE/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rtrevinnoc","download_url":"https://codeload.github.com/rtrevinnoc/FUTURE/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224282123,"owners_count":17285775,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["css3","flask","flask-application","gensim","glove","glove-embeddings","glove-vectors","hnswlib","html5","javascript","js","json","lmdb","machine-learning","mongodb","natural-language-processing","natural-language-understanding","python","python3","search-engine"],"created_at":"2024-08-02T15:01:33.798Z","updated_at":"2024-11-12T13:30:47.493Z","avatar_url":"https://github.com/rtrevinnoc.png","language":"Python","funding_links":["https://www.buymeacoffee.com/searchatfuture","https://www.buymeacoffee.com/searchatfuture'"],"categories":["flask"],"sub_categories":[],"readme":"# NOTICE:\r\nThis repository is now read-only. 
Work has been moved to the Pear project; see [fruit](https://github.com/rtrevinnoc/fruit) for the frontend and [tree](https://github.com/rtrevinnoc/tree) for the backend.\r\n\r\n![Website](https://img.shields.io/website?down_color=red\u0026down_message=offline\u0026up_color=green\u0026up_message=online\u0026url=https%3A%2F%2Fwearebuildingthefuture.com) [![Documentation Status](https://readthedocs.org/projects/wearebuildingthefuture/badge/?version=latest)](https://wearebuildingthefuture.readthedocs.io/en/latest/?badge=latest) ![GitHub](https://img.shields.io/github/license/rtrevinnoc/FUTURE) ![Keybase BTC](https://img.shields.io/keybase/btc/rtrevinnoc)\\\r\n[![Buy Me A Coffee](https://camo.githubusercontent.com/031fc5a134cdca5ae3460822aba371e63f794233/68747470733a2f2f7777772e6275796d6561636f666665652e636f6d2f6173736574732f696d672f637573746f6d5f696d616765732f6f72616e67655f696d672e706e67)](https://www.buymeacoffee.com/searchatfuture)\r\n\r\n# FUTURE\r\n\r\n![Screenshot_20200517_192300](https://user-images.githubusercontent.com/7103315/82164538-bea0e600-9876-11ea-8d42-c8a1b126d8fb.png)\r\n\r\nFUTURE is a completely standalone, open-source search engine focused on privacy and decentralization: any user can self-host their own instance and contribute to a shared index of web pages that is accessible through any instance. Because its own index is still small, it also works as a meta-search engine, mixing its own results with those from public Searx instances so that it can answer any request properly. [Here is a small presentation](https://future-pitch.glitch.me/#0) that shows why FUTURE is different and better, and how it accomplishes that.\r\n\r\nThe decentralization aspect of the search engine is a core feature, as it allows anyone to expand the index and improve the service while also increasing reliability through redundancy. 
Currently the main node is located at https://wearebuildingthefuture.com.\r\n\r\nIf you are planning to host your own instance, we strongly encourage you to consider using [Uberspace](https://uberspace.de/en/), as they offer an excellent service and instances for a fair price.\r\n\r\n\r\n\r\n## HOW DOES IT WORK?\r\n\r\n![Graph](https://cdn.glitch.com/ede86e6d-2c5a-40c6-b1a1-546bb881a618%2Fhow_it_works.png?v=1612302725088)\r\n\r\n\r\n\r\n## DOCUMENTATION\r\n\r\nDocumentation is available online at https://wearebuildingthefuture.readthedocs.io/en/latest/ and in the `docs` directory.\r\n\r\n### QUICKSTART\r\n\r\nAfter cloning the repository, add a `config.py` file, which lets you customize important parts of your instance without modifying the source code directly and struggling with updates. We suggest starting with this configuration template, which is essentially the same as the one used for the main instance:\r\n\r\n```python\r\n#!/usr/bin/env python3\r\n# -*- coding: utf8 -*-\r\nimport secrets\r\nfrom web3 import Web3\r\nfrom tranco import Tranco\r\n\r\nt = Tranco(cache=True, cache_dir='.tranco')\r\n\r\nWTF_CSRF_ENABLED = True\r\nSECRET_KEY = secrets.token_urlsafe(16)\r\nHOST_NAME = \"my_public_future_instance\"         # THE NAMES 'private' and 'wearebuildingthefuture.com' are reserved for private and main nodes, respectively.\r\nSEED_URLS = [\"http://\" + x for x in t.list().top(1000)]\r\nPEER_PORT = 3000\r\nHOME_URL = \"wearebuildingthefuture.com\"\r\nLIMIT_DOMAINS = None\r\nALLOWED_DOMAINS = []\r\nCONCURRENT_REQUESTS = 10\r\nCONCURRENT_REQUESTS_PER_DOMAIN = 2.0\r\nCONCURRENT_ITEMS = 100\r\nREACTOR_THREADPOOL_MAXSIZE = 20\r\nDOWNLOAD_MAXSIZE = 10000000\r\nAUTOTHROTTLE = True\r\nTARGET_CONCURRENCY = 2.0\r\nMAX_DELAY = 30.0\r\nSTART_DELAY = 1.0\r\nDEPTH_PRIORITY = 1\r\nLOG_LEVEL = 'INFO'\r\nCONTACT = \"rtrevinnoc@wearebuildingthefuture.com\"\r\nMAINTAINER = \"Roberto Treviño Cervantes\"\r\nFIRST_NOTICE = \"Written and Maintained By \u003ca 
href='https://keybase.io/rtrevinnoc'\u003eRoberto Treviño\u003c/a\u003e\"\r\nSECOND_NOTICE = \"Proudly Hosted on \u003ca href='https://uberspace.de/en/'\u003eUberspace\u003c/a\u003e\"\r\nDONATE = \"\u003ca href='https://www.buymeacoffee.com/searchatfuture'\u003eDONATE\u003c/a\u003e\"\r\nCOLABORATE = \"\u003ca href='https://github.com/rtrevinnoc/FUTURE'\u003eCOLABORATE\u003c/a\u003e\"\r\nCACHE_TIMEOUT = 15\r\nCACHE_THRESHOLD = 100\r\nCOMPLEMENTARY_VECTOR_CACHE = -1\r\ntry:\r\n\tWEB3API = Web3(Web3.HTTPProvider('http://127.0.0.1:8545'))\r\n\tETHEREUM_ACCOUNT = WEB3API.eth.accounts[0]\r\n\tCONTRACT_CODE = 'future-token/build/contracts/FUTURE.json'\r\n\tCONTRACT_ADDRESS = \"0x2ebDA3D6B2F24aE57164b0384daa9af2C0D17323\"\r\nexcept:\r\n\tpass\r\n```\r\n\r\n**NOTE:** If you want to use a Docker container, simply run the following commands before everything else below (or use the pre-built image from [DockerHub](https://hub.docker.com/repository/docker/rtrevinnoc/future)):\r\n\r\n```bash\r\ndocker build -t future .\r\ndocker run -i -t -p 3000:3000 future bash\r\n```\r\n\r\nAfter you have configured your FUTURE instance, but before you can start the server, you must add a minimum of ~25 URLs to your local index by executing:\r\n\r\n```bash\r\nchmod +x bootstrap.sh\r\n./bootstrap.sh\r\n./build_index.sh\r\n```\r\n\r\nAt any point, you can check how many web pages are in your local index by executing:\r\n\r\n```bash\r\npython3 count_index.py\r\n```\r\n\r\nYou can interrupt the crawler at any time by executing:\r\n\r\n```bash\r\n./save_index.sh\r\n```\r\n\r\nNaturally, you can restart it using `./build_index.sh`. With the index in place, you can start your development server with:\r\n\r\n```bash\r\n./future.py\r\n```\r\n\r\nHowever, if you are planning to contribute to the shared index by making your instance public, it is recommended to use uWSGI. 
We suggest creating a `uwsgi.ini` file from this configuration template, which is the one used on the main instance.\r\n\r\n```ini\r\n[uwsgi]\r\nmodule = future:app\r\npidfile = future.pid\r\nhttp-socket = :3000\r\nchmod-socket = 660\r\nstrict = true\r\nmaster = true\r\nenable-threads = true\r\nvacuum = true                        ; Delete sockets during shutdown\r\nsingle-interpreter = true\r\ndie-on-term = true                   ; Shutdown when receiving SIGTERM (default is respawn)\r\nneed-app = true\r\n\r\ndisable-logging = true               ; Disable built-in logging\r\nlog-4xx = true                       ; but log 4xx's anyway\r\nlog-5xx = true                       ; and 5xx's\r\n\r\ncheaper-algo = busyness\r\nprocesses = 6                        ; Maximum number of workers allowed\r\ncheaper = 1                          ; Minimum number of workers allowed\r\ncheaper-initial = 2                  ; Workers created at startup\r\ncheaper-overload = 1                 ; Length of a cycle in seconds\r\ncheaper-step = 1                     ; How many workers to spawn at a time\r\n\r\ncheaper-busyness-multiplier = 30     ; How many cycles to wait before killing workers\r\ncheaper-busyness-min = 20            ; Below this threshold, kill workers (if stable for multiplier cycles)\r\ncheaper-busyness-max = 70            ; Above this threshold, spawn new workers\r\ncheaper-busyness-backlog-alert = 4   ; Spawn emergency workers if more than this many requests are waiting in the queue\r\ncheaper-busyness-backlog-step = 2    ; How many emergency workers to create if there are too many requests in the queue\r\n```\r\n\r\nFinally, start your public node to contribute to the shared network with the following command:\r\n\r\n```bash\r\nuwsgi uwsgi.ini\r\n```\r\n\r\n\r\n## DEPENDENCIES\r\n\r\nBelow are listed all the projects upon which __FUTURE__ rests.\r\nName | License\r\n---|---\r\n[Flask](https://github.com/pallets/flask)|BSD 
3-Clause\r\n[Werkzeug](https://github.com/pallets/werkzeug)|BSD 3-Clause\r\n[SymSpell](https://github.com/wolfgarbe/SymSpell/)|MIT\r\n[Polyglot](https://github.com/aboSamoor/polyglot/)|GPL v3\r\n[Beautiful Soup](https://code.launchpad.net/beautifulsoup)|BSD 2-Clause\r\n[BSON Python bindings](https://github.com/py-bson/bson)|Apache 2.0\r\n[NumPy](https://github.com/numpy/numpy)|BSD 3-Clause\r\n[GeoPy](https://github.com/geopy/geopy)|MIT\r\n[SciKit Learn](https://github.com/scikit-learn/scikit-learn)|BSD 3-Clause\r\n[Pandas](https://github.com/pandas-dev/pandas)|BSD 3-Clause\r\n[Gensim](https://github.com/RaRe-Technologies/gensim)|LGPL 2.1\r\n[NLTK](https://github.com/nltk/nltk)|Apache 2.0\r\n[Scrapy](https://github.com/scrapy/scrapy)|BSD License\r\n[H5PY](https://github.com/h5py/h5py)|BSD 3-Clause\r\n[LMDB](https://github.com/LMDB/lmdb)|OpenLDAP\r\n[LMDB Python bindings](https://github.com/jnwatson/py-lmdb)|OpenLDAP\r\n[tldextract](https://github.com/john-kurkowski/tldextract)|BSD 3-Clause\r\n[WTForms](https://github.com/wtforms/wtforms)|BSD 3-Clause\r\n[Flask_wtf](https://github.com/lepture/flask-wtf)|BSD 3-Clause\r\n[HNSWLib](https://github.com/nmslib/hnswlib)|Apache 2.0\r\n[JQuery](https://github.com/jquery/jquery)|MIT\r\n[JQuery UI](https://github.com/jquery/jquery-ui)|MIT\r\n[Particles JS](https://github.com/VincentGarreau/particles.js/)|MIT\r\n[Ionicons](https://github.com/ionic-team/ionicons)|MIT\r\n[Source Sans Pro](https://github.com/adobe-fonts/source-sans-pro)|OFL 1.1\r\n[GloVe](https://github.com/stanfordnlp/GloVe)|Apache 2.0\r\n[SPARQLWrapper](https://github.com/RDFLib/sparqlwrapper)|W3C License\r\n[TextScrambler](https://codepen.io/soulwire/pen/mErPAK)|BSD-like
 \r\n\r\n\r\n\r\n### FUTURE on w3m\r\n\r\n[![asciicast](https://asciinema.org/a/331246.svg)](https://asciinema.org/a/331246?autoplay=1)\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frtrevinnoc%2FFUTURE","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frtrevinnoc%2FFUTURE","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frtrevinnoc%2FFUTURE/lists"}