{"id":20028045,"url":"https://github.com/geekalexis/search-engine","last_synced_at":"2026-05-21T05:03:23.719Z","repository":{"id":171478968,"uuid":"622367643","full_name":"GeekAlexis/search-engine","owner":"GeekAlexis","description":"A distributed, RESTful search engine powered by AWS","archived":false,"fork":false,"pushed_at":"2023-09-14T20:05:15.000Z","size":12559,"stargazers_count":2,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-07-13T01:34:48.411Z","etag":null,"topics":["aws","hadoop","search-engine","webapp"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GeekAlexis.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-01T22:34:22.000Z","updated_at":"2024-06-28T17:59:16.000Z","dependencies_parsed_at":null,"dependency_job_id":"2740add6-0ce0-4907-bdd8-cedda11aca60","html_url":"https://github.com/GeekAlexis/search-engine","commit_stats":null,"previous_names":["geekalexis/search-engine"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/GeekAlexis/search-engine","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GeekAlexis%2Fsearch-engine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GeekAlexis%2Fsearch-engine/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GeekAlexis%2Fsearch-engine/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GeekAlexis%2Fsearch-engine/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GeekAlexis","download_url":"https://codeload.github.com/GeekAlexis/search-engine/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GeekAlexis%2Fsearch-engine/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33289546,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-21T02:57:32.698Z","status":"ssl_error","status_checked_at":"2026-05-21T02:57:31.990Z","response_time":62,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","hadoop","search-engine","webapp"],"created_at":"2024-11-13T09:12:57.338Z","updated_at":"2026-05-21T05:03:23.676Z","avatar_url":"https://github.com/GeekAlexis.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Distributed Search Engine\nA search engine partially based on Google, circa 1998.\n\u003cimg src=\"demo.png\" width=\"1000\"/\u003e\n\n- Retrieval and ranking incorporates BM25 and PageRank\n- Distributed crawler/indexer/link analysis for computing document index and metadata\n- A RESTful server that supports server-side caching and concurrent queries\n\n\u003cimg src=\"architecture.png\" width=\"700\"/\u003e\n\nSee our [technical report](report.pdf) for system design, scalability, and more search demos.\n\n## Extra Features\n- Excerpts with highlighted 
hits are loaded dynamically and shown on the result page.\n- Web UI integrates search results from News and Yelp web services.\n- Web UI supports search query autocomplete.\n- Web UI supports loading 10 pages of search results.\n\n## Tech\n\n- [ReactJS](https://reactjs.org/)\n- [MUI](https://mui.com/)\n- [News API](https://newsapi.org/)\n- [Yelp Fusion API](https://www.yelp.com/developers/documentation/v3/get_started)\n- [IP Geolocation API](https://ip-api.com/)\n- [Spark Java](https://sparkjava.com/)\n- [JDBC](https://mvnrepository.com/artifact/org.postgresql/postgresql)\n- [HikariCP](https://github.com/brettwooldridge/HikariCP)\n- [Jsoup](https://mvnrepository.com/artifact/org.jsoup/jsoup)\n- [Guava](https://mvnrepository.com/artifact/com.google.guava/guava)\n- [OpenNLP](https://opennlp.apache.org)\n\n## Quick Start\n\n### Server\n\nCreate `server/src/main/resources/config.properties` containing your API keys and database credentials (not provided).\n\n```ini\ndb.url=jdbc:postgresql://host:port/database\ndb.user=username\ndb.pass=password\nnews.apiKey=abcdefghijk\nyelp.apiKey=abcdefghijk\n```\n\n```sh\ncd server\nmvn clean install\nmvn exec:java\n```\n\n### Client\n\nTo run on a local development machine, Node.js \u003e= 14.0.0 is required.\n\n```sh\ncd client\nnpm i\nnpm start\n```\n\n## Precomputed Components\nSee the READMEs below for implemented features, source files, and instructions for each component.\n\n- [Indexer](indexer/README.md)\n- [PageRank](pagerank/README.md)\n- [Crawler](crawler/README.md)\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeekalexis%2Fsearch-engine","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgeekalexis%2Fsearch-engine","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeekalexis%2Fsearch-engine/lists"}