{"id":27962332,"url":"https://github.com/thoughtworks/simplebandit","last_synced_at":"2026-03-17T09:33:29.480Z","repository":{"id":208313046,"uuid":"720110569","full_name":"thoughtworks/simplebandit","owner":"thoughtworks","description":"lightweight contextual bandit library for ts/js","archived":false,"fork":false,"pushed_at":"2023-12-11T08:03:37.000Z","size":571,"stargazers_count":18,"open_issues_count":2,"forks_count":0,"subscribers_count":12,"default_branch":"main","last_synced_at":"2025-05-07T19:27:26.803Z","etag":null,"topics":["bandits","contextual-bandits","personalization","recommendation-system","recommender","recommender-systems"],"latest_commit_sha":null,"homepage":"https://thoughtworks.github.io/simplebandit/","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thoughtworks.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-11-17T15:47:26.000Z","updated_at":"2025-01-24T08:29:41.000Z","dependencies_parsed_at":"2025-05-08T06:45:10.554Z","dependency_job_id":null,"html_url":"https://github.com/thoughtworks/simplebandit","commit_stats":null,"previous_names":["thoughtworks/simplebandit"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/thoughtworks/simplebandit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thoughtworks%2Fsimplebandit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thoughtworks%2Fsimplebandit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thoughtworks%
2Fsimplebandit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thoughtworks%2Fsimplebandit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thoughtworks","download_url":"https://codeload.github.com/thoughtworks/simplebandit/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thoughtworks%2Fsimplebandit/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30620740,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-17T08:10:05.930Z","status":"ssl_error","status_checked_at":"2026-03-17T08:10:04.972Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bandits","contextual-bandits","personalization","recommendation-system","recommender","recommender-systems"],"created_at":"2025-05-07T19:20:37.715Z","updated_at":"2026-03-17T09:33:29.465Z","avatar_url":"https://github.com/thoughtworks.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"![GitHub Workflow Status (with event)](https://img.shields.io/github/actions/workflow/status/thoughtworks/simplebandit/node.js.yml)\n![NPM](https://img.shields.io/npm/l/simplebandit)\n![npm](https://img.shields.io/npm/v/simplebandit)\n\n\u003cimg src=\"https://github.com/thoughtworks/simplebandit/assets/27999937/fb71b387-689d-4fd6-a80f-8eea37278d2c\" 
width=\"50\" align=\"right\" alt=\"simplebandit-logo-transparent\"/\u003e\n\n# simplebandit\n\nSimplebandit is a fast, lightweight TypeScript/JavaScript library for contextual bandit recommenders, with no external dependencies, transpiling to \u003c700 lines of JavaScript.\n\nIt provides classes and interfaces to create and manage bandit models, generate recommendations, select actions, and update your models. It integrates easily with e.g. React Native to support privacy-sensitive and fully interpretable recommendations right on a user's device.\n\nYou can find the live examples deployed at [https://thoughtworks.github.io/simplebandit/](https://thoughtworks.github.io/simplebandit/).\n\nUnder the hood it's an online logistic regression oracle with softmax exploration.\n\n## Installation\n\nThe project is still in beta, but can be installed with:\n\n```sh\nnpm install simplebandit\n```\n\nAll feedback is welcome.\n\n## Usage\n\n### With actionIds only\n\nIn the simplest case you are simply learning a preference over a list of possible actions, without regard to context or action features. By accepting a recommendation you make the recommended action more likely in future recommendations. By rejecting it, you make it less likely. The bandit learns from your feedback, updates, and adjusts.\n\n```typescript\nimport { SimpleBandit } from \"simplebandit\";\n\nconst bandit = new SimpleBandit({ actions: [\"apple\", \"pear\"] });\n\nconst recommendation1 = bandit.recommend();\nawait bandit.accept(recommendation1);\nconsole.log(recommendation1.actionId);\n\nconst recommendation2 = bandit.recommend();\nawait bandit.reject(recommendation2);\n```\n\n### With action features\n\nBy defining action features we can also learn across actions: e.g. 
by choosing a fruit we make other fruits also more likely for the next recommendation.\n\n```typescript\nconst actions: IAction[] = [\n  { actionId: \"apple\", features: { fruit: 1 } },\n  { actionId: \"pear\", features: { fruit: 1 } },\n  { actionId: \"chocolate\", features: { fruit: 0 } },\n];\n\nconst bandit = new SimpleBandit({ actions: actions });\n```\n\nIf you accept an `apple` recommendation, the probability of both `apple` and `pear` will go up for the next recommendation. (N.B. all feature values should be encoded between -1 and 1.)\n\nThere are a few shorthand ways of defining actions. For actions without features you can simply pass a list of `actionIds` as above: `actions = [\"apple\", \"pear\"]`. For actions with features you can use a list of `IAction` objects or use a hash as a shorthand:\n\n```typescript\nactions = {\n  apple: { fruit: 1 },\n  chocolate: { fruit: 0, treat: 1 },\n};\n```\n\nIf all your features have the value `1`, you can also pass them as a list, so:\n\n```typescript\nactions = {\n  apple: [\"fruit\", \"healthy\"],\n  chocolate: [\"treat\", \"tasty\"],\n};\n```\n\nis equivalent to\n\n```typescript\nactions = {\n  apple: { fruit: 1, healthy: 1 },\n  chocolate: { treat: 1, tasty: 1 },\n};\n```\n\n(but slightly more readable when you have lots of features)\n\n### Adding context\n\nWe can also learn preferences depending on a context, by passing the relevant context as a hash into the `recommend` method. After positive feedback the bandit will learn to give similar recommendations given a similar context. For example: when it is raining, recommend chocolate; when it is not, recommend apples. 
Like feature values, context values should be encoded between `-1` and `1`.\n\n```typescript\nconst recommendation = bandit.recommend({ rain: 1 });\nawait bandit.accept(recommendation);\n```\n\n### Configuring the exploration-exploitation tradeoff with the temperature parameter\n\nYou can adjust how much the bandit exploits (assigning higher probability to higher scoring actions) or explores (assigning more probability to lower scoring actions). In the most extreme case, `temperature=0.0`, you only ever pick the highest scoring action, never randomly exploring.\n\n```typescript\nconst bandit = new SimpleBandit({\n  actions: [\"apple\", \"pear\"],\n  temperature: 0.2,\n});\n```\n\nIt is worthwhile playing around with this parameter for your use case. Too much exploitation (low temperature) might mean you get stuck in a suboptimal solution and do not adjust to changing preferences or circumstances. Too much exploration (high temperature) might mean you are not giving the best recommendations often enough.\n\n### Slates: Getting multiple recommendations\n\nIn order to get multiple recommendations (or a 'slate') instead of just one, call `bandit.slate()`:\n\n```typescript\nconst bandit = new SimpleBandit({\n  actions: [\"apple\", \"pear\", \"banana\"],\n  slateSize: 2,\n});\nlet slate = bandit.slate();\nawait bandit.choose(slate, slate.slateItems[1].actionId);\n// await bandit.reject(slate); // or reject the whole slate instead\n```\n\nYou can pass `slateSize` as a parameter to the bandit, or to the `slate` method itself:\n\n```typescript\nslate = bandit.slate({ rain: 1 }, { slateSize: 3 });\n```\n\nWhen you call `bandit.choose(...)` you generate both an accept training data point for the chosen `actionId`, and rejection training data points for the not-chosen `slateItems`. You can set a lower sample weight on the rejected options with e.g. `slateNegativeSampleWeight=0.5`.\n\n### Serializing and storing bandits\n\nYou can easily serialize/deserialize bandits to/from JSON. So you can store e.g. 
a personalized bandit for each user and load them on demand.\n\n```typescript\nconst bandit2 = SimpleBandit.fromJSON(bandit1.toJSON());\n```\n\n### Retaining training data\n\nThe `accept`, `reject` and `choose` methods also return a `trainingData[]` array.\nYou can store it so that you can re-train the bandit at a later point (perhaps with e.g. a different oracle `learningRate`, or with different initial weights):\n\n```typescript\nconst trainingData = await bandit.accept(recommendation);\nconst bandit2 = new SimpleBandit({ actions: [\"apple\", \"pear\"] });\nawait bandit2.train(trainingData);\n```\n\n## Defining custom oracle\n\nFor more control over the behaviour of your bandit, you can customize the oracle:\n\n```typescript\nconst oracle = new SimpleOracle({\n  actionIds: [\"apple\", \"pear\"], // only encode certain actionIds, ignore others\n  context: [\"rainy\"], // only encode certain context features, ignore others\n  features: [\"fruit\"], // only encode certain action features, ignore others\n  learningRate: 0.1, // how quickly the oracle learns (and forgets)\n  regularizer: 0.0, // L2 (ridge) regularization parameter on the weights\n  actionIdFeatures: true, // learn preference for individual actions, regardless of context\n  actionFeatures: true, // learn preference over action features, regardless of context\n  contextActionIdInteractions: true, // learn interaction between context and actionId preference\n  contextActionFeatureInteractions: true, // learn interaction between context and action features preference\n  useInversePropensityWeighting: true, // oracle uses IPW by default (sample weight = 1/p), but can be switched off\n  laplaceSmoothing: 0.01, // add constant to probability before applying IPW\n  targetLabel: \"click\", // target label for oracle, defaults to 'click', but can also be e.g. 
'rating'\n  oracleWeight: 1.0, // if using multiple oracles, how this one is weighted\n  name: \"click1\", // name is by default equal to targetLabel, but you can give a unique name if needed\n  weights: {}, // initialize oracle with feature weights hash\n});\n\nconst bandit = new SimpleBandit({\n  oracle: oracle,\n  temperature: 0.2,\n});\n```\n\n## Multiple oracles\n\nThe default oracle only optimizes for accepts/clicks, but in many cases you may want to optimize for other objectives or maybe a mixture of different objectives. You can pass a list of oracles that each learn to predict a different `targetLabel`:\n\n```typescript\nconst clickOracle = new SimpleOracle({\n  targetLabel: \"click\", // default\n});\nconst starOracle = new SimpleOracle({\n  targetLabel: \"stars\", // if users leave a star rating after an action\n  oracleWeight: 2.0, // this oracle is weighted twice as heavily as the clickOracle\n  learningRate: 0.5, // can customize settings for each oracle\n});\n\nconst bandit = new SimpleBandit({\n  oracle: [clickOracle, starOracle],\n  actions: actions,\n  temperature: temperature,\n});\n```\n\nThe `accept`, `reject` and `choose` methods still work the same for all oracles with `targetLabel: 'click'`.\n\nFor other `targetLabel` values there is the `feedback` method. 
You need to specify the `label` and the `value` (which should be between `0` and `1`):\n\n```typescript\nrecommendation = bandit.recommend();\nawait bandit.feedback(\n  recommendation,\n  \"stars\", // targetLabel\n  1.0, // value: should be between 0 and 1\n);\n```\n\nFor a slate you also have to specify which action was chosen:\n\n```typescript\nslate = bandit.slate();\nawait bandit.feedback(\n  slate,\n  \"stars\",\n  1.0,\n  slate.slateItems[0].actionId, // if first item was chosen\n);\n```\n\n## Excluding actions\n\nIn some contexts you might want to apply business rules and exclude certain `actionIds`, or only include certain others:\n\n```typescript\nrecommendation = bandit.recommend(context, { exclude: [\"apple\"] });\nslate = bandit.slate(context, { include: [\"banana\", \"pear\"] });\n```\n\n## Examples\n\nThere are several usage examples provided in the `examples/` folder (built with React).\nYou can run the examples with `parcel examples/index.html` or `make example` and then view them at e.g. `http://localhost:1234`.\n\nOr simply visit [https://thoughtworks.github.io/simplebandit/](https://thoughtworks.github.io/simplebandit/).\n\n## Testing\n\n```sh\nnpm run test\n```\n\n## Contributing\n\nPull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthoughtworks%2Fsimplebandit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthoughtworks%2Fsimplebandit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthoughtworks%2Fsimplebandit/lists"}