{"id":13596089,"url":"https://github.com/metarank/metarank","last_synced_at":"2025-05-14T08:05:43.425Z","repository":{"id":37095679,"uuid":"288792333","full_name":"metarank/metarank","owner":"metarank","description":"A low code Machine Learning personalized ranking service for articles, listings, search results, recommendations that boosts user engagement. A friendly Learn-to-Rank engine","archived":false,"fork":false,"pushed_at":"2025-01-20T21:10:30.000Z","size":24761,"stargazers_count":2119,"open_issues_count":135,"forks_count":92,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-04-11T03:38:09.666Z","etag":null,"topics":["automl","data-engineering","data-science","deep-learning","feature-engineering","feature-extraction","kubernetes","machine-learning","neural-networks","personalization","ranking","scala","search"],"latest_commit_sha":null,"homepage":"https://metarank.ai","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/metarank.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-08-19T17:16:08.000Z","updated_at":"2025-04-08T17:51:44.000Z","dependencies_parsed_at":"2022-07-14T06:00:37.937Z","dependency_job_id":"0d4b62f6-cdbc-4532-a0fe-f1f7988c4381","html_url":"https://github.com/metarank/metarank","commit_stats":null,"previous_names":[],"tags_count":50,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/metarank%2Fmetarank","tags_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/repositories/metarank%2Fmetarank/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/metarank%2Fmetarank/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/metarank%2Fmetarank/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/metarank","download_url":"https://codeload.github.com/metarank/metarank/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254101598,"owners_count":22014908,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automl","data-engineering","data-science","deep-learning","feature-engineering","feature-extraction","kubernetes","machine-learning","neural-networks","personalization","ranking","scala","search"],"created_at":"2024-08-01T16:02:07.732Z","updated_at":"2025-05-14T08:05:38.415Z","avatar_url":"https://github.com/metarank.png","language":"Scala","funding_links":[],"categories":["Scala","人工智能","推荐系统算法库与列表","📚 Project Purpose"],"sub_categories":["资源传输下载","Machine Learning (Entry-Level)"],"readme":"\u003ch1 align=\"center\"\u003e\n    \u003ca style=\"text-decoration: none\" href=\"https://www.metarank.ai\"\u003e\n      \u003cimg width=\"120\" src=\"https://raw.githubusercontent.com/metarank/metarank/master/doc/img/logo.svg\" /\u003e\n      \u003cp align=\"center\"\u003eMetarank: real time personalization as a service\u003c/p\u003e\n    \u003c/a\u003e\n\u003c/h1\u003e\n\u003ch2 align=\"center\"\u003e\n  \u003ca href=\"https://docs.metarank.ai\"\u003eDocs\u003c/a\u003e | \u003ca 
href=\"https://metarank.ai\"\u003eWebsite\u003c/a\u003e | \u003ca href=\"https://metarank.ai/slack\"\u003eCommunity Slack\u003c/a\u003e | \u003ca href=\"https://blog.metarank.ai\"\u003eBlog\u003c/a\u003e | \u003ca href=\"https://demo.metarank.ai\"\u003eDemo\u003c/a\u003e\n\u003c/h2\u003e\n\n[![CI Status](https://github.com/metarank/metarank/workflows/Tests/badge.svg)](https://github.com/metarank/metarank/actions)\n[![License: Apache 2](https://img.shields.io/badge/License-Apache2-green.svg)](https://opensource.org/licenses/Apache-2.0)\n![Last commit](https://img.shields.io/github/last-commit/metarank/metarank)\n![Last release](https://img.shields.io/github/release/metarank/metarank)\n[![Join our slack](https://img.shields.io/badge/Slack-join%20the%20community-blue?logo=slack\u0026style=social)](https://metarank.ai/slack)\n\n\n# What is Metarank?\n\n[Metarank](https://metarank.ai) is an open-source ranking service. It can help you to build a personalized semantic/neural search and recommendations.\n\nIf you just want to get started, try:\n* the [quickstart](https://docs.metarank.ai/introduction/quickstart) tutorial of implementing Learning-to-Rank on top of your search engine.\n* a [semantic search guide](TODO) of building an LLM-based neural search.\n* a [collaborative filtering recommendations guide](TODO) to create a \"you may also like\" widget as seen on many e-commerce stores.\n\n## Why Metarank?\n\nWith Metarank, you can make your existing search and recommendations **smarter**:\n* Integrate customer signals like clicks and purchases into the ranking - and optimize for maximal CTR!\n* Track [visitor profile](https://docs.metarank.ai/reference/overview/feature-extractors/user-session) and make search results adapt to user actions with real-time personalization.\n* Use [LLMs in bi- and cross-encoder mode](https://docs.metarank.ai/reference/overview/feature-extractors/text) to make your search understand the true meaning of search queries.\n\nMetarank is 
**fast**:\n* optimized for reranking latency, it can handle even large result sets within 10-20ms. See [benchmarks](https://docs.metarank.ai/introduction/performance).\n* as a stateless cloud-native service (with state managed by Redis), it can scale horizontally and process thousands of RPS. See [Kubernetes deployment guide](https://docs.metarank.ai/reference/deployment-overview/kubernetes) for details.\n\nSave your **development time**:\n* Metarank can compute dozens of typical ranking signals out of the box: CTR, referer, User-Agent, time, etc - you don't need to write custom ad-hoc code for most common ranking factors. See [the full list of supported ranking signals](https://docs.metarank.ai/reference/overview/feature-extractors) in our docs.\n* There are integrations with many possible streaming processing systems to ingest visitor signals: See [data sources](https://docs.metarank.ai/reference/overview/data-sources) for details.\n\n## What can you build with Metarank?\n\nMetarank helps you build advanced ranking systems for search and recommendations:\n* Semantic search: use state-of-the-art LLMs to make your Elasticsearch/OpenSearch understand the meaning of your queries\n* Recommendations: traditional collaborative-filtering and new-age semantic content recommendations.\n* Learning-to-Rank: optimize your existing search\n\n## Content\n\nBlog posts:\n* [Learn-to-Rank with OpenSearch and Metarank](https://opensearch.org/blog/ltr-with-opensearch-and-metarank/)\n* [Hybrid Search and Learning-to-Rank with Metarank](https://www.pinecone.io/learn/metarank/)\n* [Solving a search cold-start problem with aggregated CTR](https://blog.metarank.ai/solving-a-search-cold-start-problem-with-aggregated-ctr-b88c14f4d03c)\n* [Personalized search with Metarank and Elasticsearch](https://blog.metarank.ai/personalized-search-with-metarank-and-elasticsearch-a5a098548da7)\n\nMeetups and conference talks:\n* [Building an open-source online Learn-to-rank 
engine](https://www.youtube.com/watch?v=lbbp4CFWZGk), Haystack EU 23, [slides](https://metarank.github.io/haystack-eu22/#/)\n* [Overcoming position and presentation biases in search and recommender systems](https://www.youtube.com/watch?v=PqbYdDiwKBY), Data Natives Meetup Berlin, [slides](https://metarank.github.io/bias-talk/#/)\n* [Learning-to-rank: Deep, fast, precise - choose any two](https://www.youtube.com/watch?v=oXfFqAKf4Ac), DataTalks meetup, [slides](https://metarank.github.io/datatalks-ltr-talk/#/)\n\n## Main features\n\n* Semantic neural search: [TODO]\n* Recommendations: [trending](configuration/recommendations/trending.md) and [similar-items](configuration/recommendations/similar.md) (MF ALS).\n* Personalization: [secondary reranking](quickstart/quickstart.md) (LambdaMART)\n* AutoML: [automatic feature generation](howto/autofeature.md) and [model re-training](howto/model-retraining.md)\n* A/B testing: [multiple model serving](configuration/overview.md#models)\n\n## Demo\n\nYou can play with Metarank demo on [demo.metarank.ai](https://demo.metarank.ai):\n\n![Demo](doc/img/demo.gif)\n\nThe demo itself and [the data used](https://github.com/metarank/msrd) are open-source and you can grab a copy of training events and config file [in the github repo](https://github.com/metarank/metarank/tree/master/src/test/resources/ranklens).\n\n## Metarank in One Minute\n\nLet us show how you can start personalizing content with LambdaMART-based reranking in just under a minute:\n\n1. Prepare the data: we will get the dataset and config file from the [demo.metarank.ai](https://demo.metarank.ai)\n2. Start Metarank in a standalone mode: it will import the data, train the ML model and start the API.\n3. 
Send a couple of requests to the API.\n\n### Step 1: Prepare data\n\nWe will use the [ranklens dataset](https://github.com/metarank/ranklens), which is used in our [Demo](https://demo.metarank.ai), so just download the data file\n\n```bash\ncurl -O -L https://github.com/metarank/metarank/raw/master/src/test/resources/ranklens/events/events.jsonl.gz\n```\n\n### Step 2: Prepare configuration file\n\nWe will again use the configuration file from our [Demo](https://demo.metarank.ai). It utilizes in-memory store, so no other dependencies are needed.\n\n\n```bash\ncurl -O -L https://raw.githubusercontent.com/metarank/metarank/master/src/test/resources/ranklens/config.yml\n```\n\n### Step 3: Start Metarank!\n\nWith the final step we will use Metarank’s `standalone` mode that combines training and running the API into one command:\n\n```bash\ndocker run -i -t -p 8080:8080 -v $(pwd):/opt/metarank metarank/metarank:latest standalone --config /opt/metarank/config.yml --data /opt/metarank/events.jsonl.gz\n```\n\nYou will see some useful output while Metarank is starting and grinding through the data. Once this is done, you can send requests to `localhost:8080` to get personalized results.\n\nHere we will interact with several movies by clicking on one of them and observing the results. 
\n\n\u003e First, let's see the initial output provided by Metarank before we interact with it\n\n```bash\n# get initial ranking for some items\ncurl http://localhost:8080/rank/xgboost \\\n    -d '{\n    \"event\": \"ranking\",\n    \"id\": \"id1\",\n    \"items\": [\n        {\"id\":\"72998\"}, {\"id\":\"67197\"}, {\"id\":\"77561\"},\n        {\"id\":\"68358\"}, {\"id\":\"79132\"}, {\"id\":\"103228\"}, \n        {\"id\":\"72378\"}, {\"id\":\"85131\"}, {\"id\":\"94864\"}, \n        {\"id\":\"68791\"}, {\"id\":\"93363\"}, {\"id\":\"112623\"}\n    ],\n    \"user\": \"alice\",\n    \"session\": \"alice1\",\n    \"timestamp\": 1661431886711\n}'\n\n# {\"item\":\"72998\",\"score\":0.9602446652021992},{\"item\":\"79132\",\"score\":0.7819134441404151},{\"item\":\"68358\",\"score\":0.33377910321385645},{\"item\":\"112623\",\"score\":0.32591281190727805},{\"item\":\"103228\",\"score\":0.31640256043322723},{\"item\":\"77561\",\"score\":0.3040782705414116},{\"item\":\"94864\",\"score\":0.17659007036183608},{\"item\":\"72378\",\"score\":0.06164568676567339},{\"item\":\"93363\",\"score\":0.058120639770243385},{\"item\":\"68791\",\"score\":0.026919880032451306},{\"item\":\"85131\",\"score\":-0.35794106000271037},{\"item\":\"67197\",\"score\":-0.48735167237049154}\n```\n\n```bash\n# tell Metarank which items were presented to the user and in which order from the previous request\n# optionally, we can include the score calculated by Metarank or your internal retrieval system\ncurl http://localhost:8080/feedback \\\n -d '{\n  \"event\": \"ranking\",\n  \"fields\": [],\n  \"id\": \"test-ranking\",\n  \"items\": [\n    {\"id\":\"72998\",\"score\":0.9602446652021992},{\"id\":\"79132\",\"score\":0.7819134441404151},{\"id\":\"68358\",\"score\":0.33377910321385645},\n    {\"id\":\"112623\",\"score\":0.32591281190727805},{\"id\":\"103228\",\"score\":0.31640256043322723},{\"id\":\"77561\",\"score\":0.3040782705414116},\n    
{\"id\":\"94864\",\"score\":0.17659007036183608},{\"id\":\"72378\",\"score\":0.06164568676567339},{\"id\":\"93363\",\"score\":0.058120639770243385},\n    {\"id\":\"68791\",\"score\":0.026919880032451306},{\"id\":\"85131\",\"score\":-0.35794106000271037},{\"id\":\"67197\",\"score\":-0.48735167237049154}\n  ],\n  \"user\": \"test2\",\n  \"session\": \"test2\",\n  \"timestamp\": 1661431888711\n}'\n```\n\n\u003e Now, let's interact with the item `93363`\n\n```bash\n# click on the item with id 93363\ncurl http://localhost:8080/feedback \\\n -d '{\n  \"event\": \"interaction\",\n  \"type\": \"click\",\n  \"fields\": [],\n  \"id\": \"test-interaction\",\n  \"ranking\": \"test-ranking\",\n  \"item\": \"93363\",\n  \"user\": \"test\",\n  \"session\": \"test\",\n  \"timestamp\": 1661431890711\n}'\n```\n\n\u003e Now Metarank will personalize the items, so their order in the response will be different\n\n```bash\n# personalize the same list of items\n# they will be returned in a different order by Metarank\ncurl http://localhost:8080/rank/xgboost \\\n -d '{\n  \"event\": \"ranking\",\n  \"fields\": [],\n  \"id\": \"test-personalized\",\n  \"items\": [\n    {\"id\":\"72998\"}, {\"id\":\"67197\"}, {\"id\":\"77561\"},\n    {\"id\":\"68358\"}, {\"id\":\"79132\"}, {\"id\":\"103228\"}, \n    {\"id\":\"72378\"}, {\"id\":\"85131\"}, {\"id\":\"94864\"}, \n    {\"id\":\"68791\"}, {\"id\":\"93363\"}, {\"id\":\"112623\"}\n  ],\n  \"user\": \"test\",\n  \"session\": \"test\",\n  \"timestamp\": 1661431892711\n}'\n\n# 
{\"items\":[{\"item\":\"93363\",\"score\":2.2013986484185124},{\"item\":\"72998\",\"score\":1.1542776301073876},{\"item\":\"68358\",\"score\":0.9828904282341605},{\"item\":\"112623\",\"score\":0.9521647429731446},{\"item\":\"79132\",\"score\":0.9258841742518286},{\"item\":\"77561\",\"score\":0.8990921381835769},{\"item\":\"103228\",\"score\":0.8990921381835769},{\"item\":\"94864\",\"score\":0.7131600718467729},{\"item\":\"68791\",\"score\":0.624462038351694},{\"item\":\"72378\",\"score\":0.5269765094008626},{\"item\":\"85131\",\"score\":0.29198666089255343},{\"item\":\"67197\",\"score\":0.16412780810560743}]}\n```\n\n## Useful Links\n\n* [Documentation](https://docs.metarank.ai)\n* [Ranklens Dataset](https://github.com/metarank/ranklens)\n* [Contribution guide](CONTRIBUTING.md)\n* [License](LICENSE)\n\n## What's next? \n\nCheck out the more in-depth [Quickstart](/doc/quickstart/quickstart.md) and the full [Reference](/doc/installation.md). \n\nIf you have any questions, don't hesitate to join our [Slack](https://communityinviter.com/apps/metarank/metarank)!\n\n\nLicense\n=====\nThis project is released under the Apache 2.0 license, as specified in the [License](LICENSE) file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmetarank%2Fmetarank","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmetarank%2Fmetarank","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmetarank%2Fmetarank/lists"}