{"id":15048097,"url":"https://github.com/github/dat-science","last_synced_at":"2025-10-04T07:31:06.710Z","repository":{"id":7077000,"uuid":"8364973","full_name":"github/dat-science","owner":"github","description":"Replaced by https://github.com/github/scientist","archived":true,"fork":false,"pushed_at":"2014-11-17T22:59:10.000Z","size":480,"stargazers_count":582,"open_issues_count":0,"forks_count":15,"subscribers_count":25,"default_branch":"master","last_synced_at":"2025-09-27T03:01:58.516Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/github.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-02-22T20:09:23.000Z","updated_at":"2024-10-22T22:39:51.000Z","dependencies_parsed_at":"2022-07-30T10:38:04.936Z","dependency_job_id":null,"html_url":"https://github.com/github/dat-science","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/github/dat-science","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/github%2Fdat-science","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/github%2Fdat-science/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/github%2Fdat-science/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/github%2Fdat-science/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/github","download_url":"https://codeload.github.com/github/dat-science/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/github%2Fdat-science/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278283485,"owners_count":25961309,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-04T02:00:05.491Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-09-24T21:08:05.182Z","updated_at":"2025-10-04T07:31:06.381Z","avatar_url":"https://github.com/github.png","language":"Ruby","readme":"# Science is happening elsewhere!\n\n*This repository is historical. Up-to-date bits are over in [`github/scientist`](https://github.com/github/scientist).*\n\nA Ruby library for carefully refactoring critical paths. Science isn't a feature\nflipper or an A/B testing tool, it's a pattern that helps measure and validate\nlarge code changes without altering behavior.\n\n## How do I do science?\n\nLet's pretend you're changing the way you handle permissions in a large web app.\nTests can help guide your refactoring, but you really want to compare the\ncurrent and new behaviors live, under load.\n\n```ruby\nrequire \"dat/science\"\n\nclass MyApp::Widget\n  def allows?(user)\n    experiment = Dat::Science::Experiment.new \"widget-permissions\" do |e|\n      e.control   { model.check_user(user).valid? } # old way\n      e.candidate { user.can? 
:read, model } # new way\n    end\n\n    experiment.run\n  end\nend\n```\n\nWrap a `control` block around the code's original behavior, and wrap `candidate`\naround the new behavior. `experiment.run` will always return whatever the\n`control` block returns, but it does a bunch of stuff behind the scenes:\n\n* Decides whether or not to run `candidate`,\n* Runs `candidate` before `control` 50% of the time,\n* Measures the duration of both behaviors,\n* Compares the results of both behaviors,\n* Swallows any exceptions raised by the candidate behavior, and\n* Publishes all this information for tracking and reporting.\n\nIf you'd like a bit less verbosity, the `Dat::Science#science` helper\ninstantiates an experiment and calls `run`:\n\n```ruby\nrequire \"dat/science\"\n\nclass MyApp::Widget\n  include Dat::Science\n\n  def allows?(user)\n    science \"widget-permissions\" do |e|\n      e.control   { model.check_user(user).valid? } # old way\n      e.candidate { user.can? :read, model } # new way\n    end\n  end\nend\n```\n\n## Making science useful\n\nThe examples above will run, but they're not particularly helpful. The\n`candidate` block runs every time, and none of the results get\npublished. Let's fix that by creating an app-specific subclass of\n`Dat::Science::Experiment`. This makes it easy to add custom behavior\nfor enabling/disabling/throttling experiments and publishing results.\n\n```ruby\nrequire \"dat/science\"\n\nmodule MyApp\n  class Experiment \u003c Dat::Science::Experiment\n    def enabled?\n      # See \"Ramping up experiments\" below.\n    end\n\n    def publish(name, payload)\n      # See \"Publishing results\" below.\n    end\n  end\nend\n```\n\nAfter creating a subclass, tell `Dat::Science` to instantiate it any time the\n`science` helper is called:\n\n```ruby\nDat::Science.experiment = MyApp::Experiment\n```\n\n### Controlling comparison\n\nBy default the results of the `candidate` and `control` blocks are compared\nwith `==`. 
Use `comparator` to do something more fancy:\n\n```ruby\nscience \"loose-comparison\" do |e|\n  e.control    { \"vmg\" }\n  e.candidate  { \"VMG\" }\n  e.comparator { |a, b| a.downcase == b.downcase }\nend\n```\n\n### Ramping up experiments\n\nBy default the `candidate` block of an experiment will run 100% of the time.\nThis is often a really bad idea when testing live. `Experiment#enabled?` can be\noverridden to run all candidates, say, 10% of the time:\n\n```ruby\ndef enabled?\n  rand(100) \u003c 10\nend\n```\n\nOr, even better, use a feature flag library like [Flipper][]. Delegating the\ndecision makes it easy to define different rules for each experiment, and can\nhelp keep all your entropy concerns in one place.\n\n[Flipper]: https://github.com/jnunemaker/flipper\n\n```ruby\ndef enabled?\n  MyApp.flipper[name].enabled?\nend\n```\n\n### Publishing results\n\nBy default the results of an experiment are discarded. This isn't very useful.\n`Experiment#publish` can be overridden to publish results via any\ninstrumentation mechanism, which makes it easy to graph durations or\nmatches/mismatches and store results. The only two events published by an\nexperiment are `:match` when the result of the control and candidate behaviors\nare the same, and `:mismatch` when they aren't.\n\n```ruby\ndef publish(event, payload)\n  MyApp.instrument \"science.#{event}\", payload\nend\n```\n\nThe published `payload` is a Symbol-keyed Hash:\n\n```ruby\n{\n  :experiment =\u003e \"widget-permissions\",\n  :first      =\u003e :control,\n  :timestamp  =\u003e \u003ca-Time-instance\u003e,\n\n  :candidate =\u003e {\n    :duration  =\u003e 2.5,\n    :exception =\u003e nil,\n    :value     =\u003e 42\n  },\n\n  :control =\u003e {\n    :duration  =\u003e 25.0,\n    :exception =\u003e nil,\n    :value     =\u003e 24\n  }\n}\n```\n\n`:experiment` is the name of the experiment. 
`:first` is either `:candidate` or\n`:control`, depending on which block was run first during the experiment.\n`:timestamp` is the Time when the experiment started.\n\nThe `:candidate` and `:control` Hashes have the same keys:\n\n* `:duration` is the execution time in ms, expressed as a float.\n* `:exception` is a reference to any raised exception or `nil`.\n* `:value` is the result of the block.\n\n#### Adding context\n\nIt's often useful to add more information to your results, and\n`Experiment#context` makes it easy:\n\n```ruby\nscience \"widget-permissions\" do |e|\n  e.context :user =\u003e user\n\n  e.control   { model.check_user(user).valid? } # old way\n  e.candidate { user.can? :read, model } # new way\nend\n```\n\n`context` takes a Symbol-keyed Hash of additional information to publish and\nmerges it with the default payload.\n\n#### Keeping it clean\n\nSometimes the things you're comparing can be huge, and there's no good way\nto do science against something simpler. Use a `cleaner` to publish a\nsimple version of a big nasty object graph:\n\n```ruby\nscience \"huge-results\" do |e|\n  e.control   { OldAndBusted.huge_results_for query }\n  e.candidate { NewHotness.huge_results_for query }\n  e.cleaner   { |result| result.count }\nend\n```\n\nThe results of the `control` and `candidate` blocks will be run through the\n`cleaner`. You could get the same behavior by calling `count` in the blocks,\nbut the `cleaner` makes it easier to keep things in sync. The original\n`control` result is still returned.\n\n## What do I do with all these results?\n\nOnce you've started an experiment and published some results, you'll want to\nanalyze the mismatches from your experiment.  Check out\n[`dat-analysis`](https://github.com/github/dat-analysis) where you'll find an\nanalysis toolkit to help you understand your experiment results.\n\n## Hacking on science\n\nBe on a Unixy box. Make sure a modern Bundler is available. `script/test` runs\nthe unit tests. 
All development dependencies will be installed automatically if\nthey're not available. Dat science happens primarily on Ruby 1.9.3 and 1.8.7,\nbut science should be universal.\n\n## Maintainers\n\n[@jbarnette](https://github.com/jbarnette) and [@rick](https://github.com/rick)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgithub%2Fdat-science","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgithub%2Fdat-science","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgithub%2Fdat-science/lists"}