{"id":19558500,"url":"https://github.com/mach-kernel/suggest","last_synced_at":"2025-11-20T02:02:51.985Z","repository":{"id":146171494,"uuid":"233320888","full_name":"mach-kernel/suggest","owner":"mach-kernel","description":"A stab at implementing a Google-style typeahead suggestion from a dump of previous queries.","archived":false,"fork":false,"pushed_at":"2020-01-13T02:47:54.000Z","size":33,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-26T08:19:08.456Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mach-kernel.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-01-12T01:15:58.000Z","updated_at":"2020-01-13T02:47:56.000Z","dependencies_parsed_at":null,"dependency_job_id":"e75fbc7d-936d-4d2e-98bf-8e081c904bad","html_url":"https://github.com/mach-kernel/suggest","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mach-kernel/suggest","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mach-kernel%2Fsuggest","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mach-kernel%2Fsuggest/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mach-kernel%2Fsuggest/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mach-kernel%2Fsuggest/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/owners/mach-kernel","download_url":"https://codeload.github.com/mach-kernel/suggest/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mach-kernel%2Fsuggest/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":285359022,"owners_count":27158216,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-20T02:00:05.334Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-11T04:47:18.857Z","updated_at":"2025-11-20T02:02:51.970Z","avatar_url":"https://github.com/mach-kernel.png","language":"Java","readme":"# suggest\nA stab at implementing a Google-style typeahead suggestion from a dump of previous queries. 
Written in pure Java 8 with no external dependencies.\n\n![](https://i.imgur.com/mZtpfb9.gif)\n\n## Getting Started\n\n```bash\nwget http://www.cim.mcgill.ca/~dudek/206/Logs/AOL-user-ct-collection/aol-data.tar.gz\ntar -xvf aol-data.tar.gz\ngunzip AOL-user-ct-collection/user-ct-test-collection-01.txt.gz\n# Skip the first line, and take the second tab-separated field (the query)\ntail -n +2 AOL-user-ct-collection/user-ct-test-collection-01.txt | cut -f2 \u003e justquery.txt\ngradle --console plain run\n```\n\n```\nsuggest\u003e loadfile justquery.txt\nsuggest\u003e suggest food\n347 - food network\n218 - foodnetwork.com\n198 - foodnetwork\n115 - foodtv\n69 - foodtv.com\n68 - food network.com\n61 - food jokes\n55 - food tv\n41 - food channel\n39 - food stamps\n```\n\n## Data\n\nTo keep things simple, data loading is limited to newline-delimited text files. Datasets of this kind are surprisingly hard to find. The ones used to test this program are listed below:\n\n- [Web Search Query Logs](https://jeffhuang.com/search_query_logs.html)\n  - The AOL mirror is the only live one\n\n## Methodology\n\n- Google defaults to showing 10 suggestions, so we will do the same.\n- Suggestions are ranked by the frequency of each unique query in the loaded data.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmach-kernel%2Fsuggest","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmach-kernel%2Fsuggest","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmach-kernel%2Fsuggest/lists"}