{"id":20511166,"url":"https://github.com/lovoo/cofire","last_synced_at":"2025-08-16T17:04:37.922Z","repository":{"id":33922723,"uuid":"138185289","full_name":"lovoo/cofire","owner":"lovoo","description":"An online Collaborative Filtering system for recommendations","archived":false,"fork":false,"pushed_at":"2018-09-19T10:23:14.000Z","size":34,"stargazers_count":7,"open_issues_count":0,"forks_count":2,"subscribers_count":28,"default_branch":"master","last_synced_at":"2025-04-13T22:47:39.611Z","etag":null,"topics":["lovoo"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lovoo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-06-21T14:57:10.000Z","updated_at":"2021-12-17T07:17:25.000Z","dependencies_parsed_at":"2022-08-08T06:00:15.649Z","dependency_job_id":null,"html_url":"https://github.com/lovoo/cofire","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/lovoo/cofire","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lovoo%2Fcofire","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lovoo%2Fcofire/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lovoo%2Fcofire/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lovoo%2Fcofire/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lovoo","download_url":"https://codeload.github.com/lovoo/cofire/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lovoo%2Fcofire/sbom","s
corecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270742043,"owners_count":24637504,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-16T02:00:11.002Z","response_time":91,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["lovoo"],"created_at":"2024-11-15T20:34:54.680Z","updated_at":"2025-08-16T17:04:37.753Z","avatar_url":"https://github.com/lovoo.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Cofire [![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause) [![GoDoc](https://godoc.org/github.com/lovoo/cofire?status.svg)](https://godoc.org/github.com/lovoo/cofire)\n\nCofire is a stream-based collaborative filtering implementation for recommendation engines that runs solely on [Apache Kafka].\nLeveraging the [Goka] stream processing library, Cofire continuously learns user-product ratings arriving from Kafka event streams and keeps its model up to date in Kafka tables.\nCofire implements streaming matrix factorization employing stochastic gradient descent (SGD).\n\n\u003e We assume you understand the SGD algorithm, so this README focuses on how Cofire works on top of [Goka].\n\u003e See [this blog post](https://ruivieira.github.io/a-streaming-als-implementation.html) for a great introduction to SGD for streams.\n\n## Components\n\nCofire has two 
types of stream processors:\n\n- The *learner* consumes ratings from an input topic and applies SGD to learn the latent features of users and products.\nThis is where the machine learning happens.\nLearners are stateful and store the latent features in Kafka in a log-compacted topic.\n- The *refeeder* reemits already learnt ratings into the learner's input topic after a predefined delay.\nThe refeeder effectively implements training iterations. By default, the number of iterations is configured to be 1, so the refeeder is optional.\n\nBesides these processors, two other components are necessary to get the system running:\n\n- At least one *producer* that writes the ratings into the learner's input topic.\n- At least one *predictor* that performs predictions using the learnt model.\nThe predictor does not need to be co-located with the learner; it can simply keep a local view of the model using Goka.\n\n## Preparation\n\nBefore starting any processors, one has to do the following:\n\n1. Choose a group name, eg, \"cofire-app\".\n2. Create topics for the processors (if they are not auto-created):\n   - `\u003cgroup\u003e-input` as input for the learner, eg, \"cofire-app-input\"\n   - `\u003cgroup\u003e-loop` to loop back messages among learner instances, eg, \"cofire-app-loop\"\n   - `\u003cgroup\u003e-table` to store the learnt model, eg, \"cofire-app-table\"\n   - `\u003cgroup\u003e-update` to overwrite the learner model if desired, eg, \"cofire-app-update\"\n   - `\u003cgroup\u003e-refeed` to send ratings from the learner to the refeeder, eg, \"cofire-app-refeed\"\n3. Ensure all topics have the same number of partitions.\n4. 
Ensure `\u003cgroup\u003e-table` is configured with log compaction.\n\n\nSee the [examples](examples) directory for detailed examples.\n\n## How it works\n\nLearners (like any Goka processor) have processing partitions, whose number matches the number of partitions of the inputs and state.\nThese processing partitions may be distributed over multiple processor instances, ie, multiple program instances running on multiple hosts.\n\nRatings and other messages are assigned to partitions via the key used when emitting the message.\n\n\u003e To simplify the explanation, we refer to keys as if they were active entities.\n\u003e For example, \"we send a message to a key k\" means that we send a message using key k and the learner partition responsible for the key k receives that message;\n\u003e also, \"a key k processes a message\" means the learner partition responsible for key k processes a message.\n\n\n### Producing ratings\n\nA rating is defined as follows.\n\n```\nmessage Rating {\n  string user_id    = 1;\n  string product_id = 2;\n  double score      = 3;\n}\n```\n\nA producer sends ratings to the learner instances via the `\u003cgroup\u003e-input` topic.\nThe key of each rating message is the `user_id`.\n\n\n### Learning\n\nCofire can be configured with these parameters:\n\n```go\ntype Parameters struct {\n  // Rank is the number of latent factors (features).\n  Rank int\n  // Gamma is the learning step of SGD.\n  Gamma float64\n  // Lambda is the regularization parameter of SGD.\n  Lambda float64\n  // Iterations is the number of times the data will be used for training.\n  Iterations int\n}\n```\n\nLearners have state, which is partitioned and stored in Kafka in the `\u003cgroup\u003e-table` topic.\nEach key has an entry in the learner state, defined as follows.\n\n```\nmessage Entry {\n  Features u = 1;\n  Features p = 2;\n}\n```\n\nThe algorithm for one rating has 3 steps:\n1. 
When `user_id` receives a rating via the input topic, it retrieves the U features and sends (rating,U) to the `product_id` via the `\u003cgroup\u003e-loop` topic.\n2. When `product_id` receives (rating,U), it retrieves the P features, applies SGD, and sends (rating,P) back to `user_id`.\n3. When `user_id` receives (rating,P), it retrieves the U features, applies SGD, and sends the rating to the refeeder via `\u003cgroup\u003e-refeed`.\n\nWith that, the iteration for this rating is finished.\n\nIn ASCII art, one iteration looks like this:\n\n```\n    Rating\n      |\n      v\n     USER\n      |\n      * Entry                     PRODUCT\n      |        (Rating, U)           |\n      +-----------------------------\u003e|\n      |                              |\n      |                              * Update P\n      |        (Rating, P)           |\n      |\u003c-----------------------------+\n      |\n      * Update U\n      |\n      v\n   REFEEDER\n```\n\n### Iterating\n\nIf the algorithm is configured to run multiple iterations, the refeeder sends the rating back to the `user_id` so that the learner trains on it again.\nThat happens for the configured number of iterations.\nSince the stream never ends, the refeeder creates iterations by delaying the `\u003cgroup\u003e-refeed` topic by a configurable duration.\nNote that the retention time configured for the topic has to be longer than the delay duration of the refeeder; otherwise ratings will be lost.\n\n\nIn ASCII art, the complete flow is as follows.\nHere we see the three components: producer, learner and refeeder.\nThe **producer** sends a rating to the **learner**.\nThe *user* key receives the rating and sends the rating plus its U vector to the *product* key in the learner (by using Goka's loopback).\nThe product updates P, sends it back to the user, which updates U and sends the rating to the **refeeder**.\nThe refeeder sends the rating back to the learner/user-key after a delay, and the number of remaining iterations is 
decremented.\nThe *user* receives the rating and the next iteration starts.\n\n```\n    PRODUCER             ,..LEARNER..,               REFEEDER\n      |             USER'             'PRODUCT          .\n      |   Rating     .                   .              .\n      +-------------\u003e+                   .              .\n      .              |                   .              .\n      .              |                   .              .\n      .              * Entry             .              .\n      .              |                   .              .\n      .              |    (Rating, U)    .              .\n      .              +------------------\u003e+              .\n      .              .                   |              .\n      .              .                   |              .\n      .              .                   * Update P     .\n      .              .                   |              .\n      .              .    (Rating, P)    |              .\n      .              +\u003c------------------+              .\n      .              |                   .              .\n      .              |                   .              .\n      .              * Update U          .              .\n      .              |                   .              .\n      .              |      (Rating, #iterations)       .\n      .              +---------------------------------\u003e+\n      .              .                   .              |\n      .              .                   .              |\n      .              .                   .              * Delay\n      .              .                   .              |\n      .              .      (Rating, #iterations--)     |\n      .              +\u003c---------------------------------+\n      .              |                   .              .\n      .              |                   .              .\n      .              * Entry             .              .\n      .              |                   .          
    .\n      .              |   next iteration  .              .\n      .              +------------------\u003e+              .\n```\n\n\n### Predicting\n\nEvery update of U or P in a learner produces an update of `\u003cgroup\u003e-table`.\nTo perform predictions, one simply creates a Goka view of `\u003cgroup\u003e-table` and gets the entries for the desired user and product. For example:\n\n```go\n// Create a local view of the learner's group table (errors ignored for brevity).\nview, _ := goka.NewView(brokers, goka.GroupTable(group), new(cofire.EntryCodec))\n\n// Look up the U features of the user and the P features of the product.\nuser, _ := view.Get(\"user\")\nu := user.(*cofire.Entry).U\nproduct, _ := view.Get(\"product\")\np := product.(*cofire.Entry).P\n\n// bias is the global bias; see the next section.\nprediction := u.Predict(p, bias)\n```\n\n### Global bias\n\nThe global bias of SGD is not stored anywhere in the state, only in memory. So, to make predictions, one needs to compute the bias manually.\nHowever, if one is simply creating product recommendations for a user, bias can be set to 0 since that won't affect the sorted order of the scored products.\n\n## How to contribute\n\nContributions are always welcome.\nPlease fork the repo, create a pull request against master, and make sure the tests pass.\nSee the [GitHub Flow] for details.\n\n[Apache Kafka]: https://kafka.apache.org/\n[Goka]: https://github.com/lovoo/goka\n[GoDoc]: https://godoc.org/github.com/lovoo/cofire\n[GitHub Flow]: https://guides.github.com/introduction/flow\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flovoo%2Fcofire","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flovoo%2Fcofire","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flovoo%2Fcofire/lists"}