{"id":13411758,"url":"https://github.com/ankane/disco","last_synced_at":"2025-11-17T14:10:50.591Z","repository":{"id":44411541,"uuid":"221663698","full_name":"ankane/disco","owner":"ankane","description":"Recommendations for Ruby and Rails using collaborative filtering","archived":false,"fork":false,"pushed_at":"2025-10-22T03:28:28.000Z","size":178,"stargazers_count":597,"open_issues_count":1,"forks_count":10,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-11-07T03:14:17.707Z","etag":null,"topics":["recommendation-engine","recommender-system"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ankane.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-14T09:51:36.000Z","updated_at":"2025-11-04T09:51:12.000Z","dependencies_parsed_at":"2023-12-26T17:30:46.230Z","dependency_job_id":"c3e202e9-ff0f-416f-a581-e7b5beceb7a9","html_url":"https://github.com/ankane/disco","commit_stats":{"total_commits":184,"total_committers":3,"mean_commits":"61.333333333333336","dds":"0.19021739130434778","last_synced_commit":"51d6d90d95281587502207ef2f989321cba9dd2b"},"previous_names":[],"tags_count":21,"template":false,"template_full_name":null,"purl":"pkg:github/ankane/disco","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ankane%2Fdisco","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ankane%2Fdisco/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ankane%2Fdisco/releases","man
ifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ankane%2Fdisco/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ankane","download_url":"https://codeload.github.com/ankane/disco/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ankane%2Fdisco/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":283410687,"owners_count":26831444,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-08T02:00:06.281Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["recommendation-engine","recommender-system"],"created_at":"2024-07-30T20:01:16.598Z","updated_at":"2025-11-17T14:10:50.576Z","avatar_url":"https://github.com/ankane.png","language":"Ruby","funding_links":[],"categories":["Ruby"],"sub_categories":[],"readme":"# Disco\n\n:fire: Recommendations for Ruby and Rails using collaborative filtering\n\n- Supports user-based and item-based recommendations\n- Works with explicit and implicit feedback\n- Uses high-performance matrix factorization\n\n[![Build Status](https://github.com/ankane/disco/actions/workflows/build.yml/badge.svg)](https://github.com/ankane/disco/actions)\n\n## Installation\n\nAdd this line to your application’s Gemfile:\n\n```ruby\ngem \"disco\"\n```\n\n## Getting Started\n\nCreate a recommender\n\n```ruby\nrecommender = 
Disco::Recommender.new\n```\n\nIf users rate items directly, this is known as explicit feedback. Fit the recommender with:\n\n```ruby\nrecommender.fit([\n  {user_id: 1, item_id: 1, rating: 5},\n  {user_id: 2, item_id: 1, rating: 3}\n])\n```\n\n\u003e IDs can be integers, strings, or any other data type\n\nIf users don’t rate items directly (for instance, they’re purchasing items or reading posts), this is known as implicit feedback. Leave out the rating.\n\n```ruby\nrecommender.fit([\n  {user_id: 1, item_id: 1},\n  {user_id: 2, item_id: 1}\n])\n```\n\n\u003e Each `user_id`/`item_id` combination should only appear once\n\nGet user-based recommendations - “users like you also liked”\n\n```ruby\nrecommender.user_recs(user_id)\n```\n\nGet item-based recommendations - “users who liked this item also liked”\n\n```ruby\nrecommender.item_recs(item_id)\n```\n\nUse the `count` option to specify the number of recommendations (default is 5)\n\n```ruby\nrecommender.user_recs(user_id, count: 3)\n```\n\nGet predicted ratings for specific users and items\n\n```ruby\nrecommender.predict([{user_id: 1, item_id: 2}, {user_id: 2, item_id: 4}])\n```\n\nGet similar users\n\n```ruby\nrecommender.similar_users(user_id)\n```\n\n## Examples\n\n### MovieLens\n\nLoad the data\n\n```ruby\ndata = Disco.load_movielens\n```\n\nCreate a recommender and get similar movies\n\n```ruby\nrecommender = Disco::Recommender.new(factors: 20)\nrecommender.fit(data)\nrecommender.item_recs(\"Star Wars (1977)\")\n```\n\n### Ahoy\n\n[Ahoy](https://github.com/ankane/ahoy) is a great source for implicit feedback\n\n```ruby\nviews = Ahoy::Event.where(name: \"Viewed post\").group(:user_id).group_prop(:post_id).count\n\ndata =\n  views.map do |(user_id, post_id), _|\n    {\n      user_id: user_id,\n      item_id: post_id\n    }\n  end\n```\n\nCreate a recommender and get recommended posts for a user\n\n```ruby\nrecommender = 
Disco::Recommender.new\nrecommender.fit(data)\nrecommender.user_recs(current_user.id)\n```\n\n## Storing Recommendations\n\nDisco makes it easy to store recommendations in Rails.\n\n```sh\nrails generate disco:recommendation\nrails db:migrate\n```\n\nFor user-based recommendations, use:\n\n```ruby\nclass User \u003c ApplicationRecord\n  has_recommended :products\nend\n```\n\n\u003e Change `:products` to match the model you’re recommending\n\nSave recommendations\n\n```ruby\nUser.find_each do |user|\n  recs = recommender.user_recs(user.id)\n  user.update_recommended_products(recs)\nend\n```\n\nGet recommendations\n\n```ruby\nuser.recommended_products\n```\n\nFor item-based recommendations, use:\n\n```ruby\nclass Product \u003c ApplicationRecord\n  has_recommended :products\nend\n```\n\nSpecify multiple types of recommendations for a model with:\n\n```ruby\nclass User \u003c ApplicationRecord\n  has_recommended :products\n  has_recommended :products_v2, class_name: \"Product\"\nend\n```\n\nAnd use the appropriate methods:\n\n```ruby\nuser.update_recommended_products_v2(recs)\nuser.recommended_products_v2\n```\n\n## Storing Recommenders\n\nIf you’d prefer to perform recommendations on-the-fly, store the recommender\n\n```ruby\njson = recommender.to_json\nFile.write(\"recommender.json\", json)\n```\n\nThe serialized recommender includes user activity from the training data (to avoid recommending previously rated items), so be sure to protect it. You can save it to a file, database, or any other storage system, or use a tool like [Trove](https://github.com/ankane/trove). Also, user and item IDs should be integers or strings for this.\n\nLoad a recommender\n\n```ruby\njson = File.read(\"recommender.json\")\nrecommender = Disco::Recommender.load_json(json)\n```\n\nAlternatively, you can store only the factors and use a library like [Neighbor](https://github.com/ankane/neighbor). 
See the [examples](https://github.com/ankane/neighbor/tree/master/examples/disco).\n\n## Algorithms\n\nDisco uses high-performance matrix factorization.\n\n- For explicit feedback, it uses [stochastic gradient descent](https://www.csie.ntu.edu.tw/~cjlin/papers/libmf/libmf_journal.pdf)\n- For implicit feedback, it uses [coordinate descent](https://www.csie.ntu.edu.tw/~cjlin/papers/one-class-mf/biased-mf-sdm-with-supp.pdf)\n\nSpecify the number of factors and epochs\n\n```ruby\nDisco::Recommender.new(factors: 8, epochs: 20)\n```\n\nIf recommendations look off, try changing `factors`. The default is 8, but 3 could be good for some applications and 300 good for others.\n\n## Validation\n\nPass a validation set with:\n\n```ruby\nrecommender.fit(data, validation_set: validation_set)\n```\n\n## Cold Start\n\nCollaborative filtering suffers from the [cold start problem](https://en.wikipedia.org/wiki/Cold_start_(recommender_systems)). It’s unable to make good recommendations without data on a user or item, which is problematic for new users and items.\n\n```ruby\nrecommender.user_recs(new_user_id) # returns empty array\n```\n\nThere are a number of ways to deal with this, but here are some common ones:\n\n- For user-based recommendations, show new users the most popular items\n- For item-based recommendations, make content-based recommendations with a gem like [tf-idf-similarity](https://github.com/jpmckinney/tf-idf-similarity)\n\nGet top items with:\n\n```ruby\nrecommender = Disco::Recommender.new(top_items: true)\nrecommender.fit(data)\nrecommender.top_items\n```\n\nThis uses [Wilson score](https://www.evanmiller.org/how-not-to-sort-by-average-rating.html) for explicit feedback and item frequency for implicit feedback.\n\n## Data\n\nData can be an array of hashes\n\n```ruby\n[{user_id: 1, item_id: 1, rating: 5}, {user_id: 2, item_id: 1, rating: 3}]\n```\n\nOr a Rover data frame\n\n```ruby\nRover.read_csv(\"ratings.csv\")\n```\n\nOr a Daru data 
frame\n\n```ruby\nDaru::DataFrame.from_csv(\"ratings.csv\")\n```\n\n## Performance\n\nIf you have a large number of users or items, you can use an approximate nearest neighbors library like [Faiss](https://github.com/ankane/faiss) to improve the performance of certain methods.\n\nAdd this line to your application’s Gemfile:\n\n```ruby\ngem \"faiss\"\n```\n\nSpeed up the `user_recs` method with:\n\n```ruby\nrecommender.optimize_user_recs\n```\n\nSpeed up the `item_recs` method with:\n\n```ruby\nrecommender.optimize_item_recs\n```\n\nSpeed up the `similar_users` method with:\n\n```ruby\nrecommender.optimize_similar_users\n```\n\nThis should be called after fitting or loading the recommender.\n\n## Reference\n\nGet ids\n\n```ruby\nrecommender.user_ids\nrecommender.item_ids\n```\n\nGet the global mean\n\n```ruby\nrecommender.global_mean\n```\n\nGet factors\n\n```ruby\nrecommender.user_factors\nrecommender.item_factors\n```\n\nGet factors for specific users and items\n\n```ruby\nrecommender.user_factors(user_id)\nrecommender.item_factors(item_id)\n```\n\n## Credits\n\nThanks to:\n\n- [LIBMF](https://github.com/cjlin1/libmf) for providing high performance matrix factorization\n- [Implicit](https://github.com/benfred/implicit/) for serving as an initial reference for user and item similarity\n- [@dasch](https://github.com/dasch) for the gem name\n\n## History\n\nView the [changelog](https://github.com/ankane/disco/blob/master/CHANGELOG.md)\n\n## Contributing\n\nEveryone is encouraged to help improve this project. 
Here are a few ways you can help:\n\n- [Report bugs](https://github.com/ankane/disco/issues)\n- Fix bugs and [submit pull requests](https://github.com/ankane/disco/pulls)\n- Write, clarify, or fix documentation\n- Suggest or add new features\n\nTo get started with development:\n\n```sh\ngit clone https://github.com/ankane/disco.git\ncd disco\nbundle install\nbundle exec rake test\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fankane%2Fdisco","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fankane%2Fdisco","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fankane%2Fdisco/lists"}