https://github.com/github/dat-science
Replaced by https://github.com/github/scientist
- Host: GitHub
- URL: https://github.com/github/dat-science
- Owner: github
- License: mit
- Archived: true
- Created: 2013-02-22T20:09:23.000Z (almost 13 years ago)
- Default Branch: master
- Last Pushed: 2014-11-17T22:59:10.000Z (about 11 years ago)
- Last Synced: 2025-09-27T03:01:58.516Z (3 months ago)
- Language: Ruby
- Homepage:
- Size: 469 KB
- Stars: 582
- Watchers: 25
- Forks: 15
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
# Science is happening elsewhere!
*This repository is historical. Up-to-date bits are over in [`github/scientist`](https://github.com/github/scientist).*
A Ruby library for carefully refactoring critical paths. Science isn't a feature
flipper or an A/B testing tool; it's a pattern that helps measure and validate
large code changes without altering behavior.
## How do I do science?
Let's pretend you're changing the way you handle permissions in a large web app.
Tests can help guide your refactoring, but you really want to compare the
current and new behaviors live, under load.
```ruby
require "dat/science"

class MyApp::Widget
  def allows?(user)
    experiment = Dat::Science::Experiment.new "widget-permissions" do |e|
      e.control   { model.check_user(user).valid? } # old way
      e.candidate { user.can? :read, model }        # new way
    end

    experiment.run
  end
end
```
Wrap a `control` block around the code's original behavior, and wrap `candidate`
around the new behavior. `experiment.run` will always return whatever the
`control` block returns, but it does a bunch of stuff behind the scenes:
* Decides whether or not to run `candidate`,
* Runs `candidate` before `control` 50% of the time,
* Measures the duration of both behaviors,
* Compares the results of both behaviors,
* Swallows any exceptions raised by the candidate behavior, and
* Publishes all this information for tracking and reporting.
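To make the steps above concrete, here is a minimal, self-contained sketch of the pattern. It is not the dat-science implementation: `TinyExperiment`, its in-memory `published` array, and the choices to re-raise control exceptions and to count any exception as a mismatch are all assumptions made for illustration.

```ruby
# Hypothetical stand-in for illustration; not the dat-science implementation.
class TinyExperiment
  attr_reader :published

  def initialize(name)
    @name = name
    @published = [] # stands in for a real instrumentation backend
  end

  def control(&block)
    @control = block
  end

  def candidate(&block)
    @candidate = block
  end

  # Assumption: always on. A real experiment would throttle here.
  def enabled?
    true
  end

  def run
    return @control.call unless enabled?

    # Shuffle so each behavior runs first about half the time.
    order = [:control, :candidate].shuffle
    observations = {}
    order.each { |kind| observations[kind] = observe(kind) }
    control_obs = observations[:control]
    candidate_obs = observations[:candidate]

    # Assumption: an exception on either side counts as a mismatch.
    clean = control_obs[:exception].nil? && candidate_obs[:exception].nil?
    matched = clean && control_obs[:value] == candidate_obs[:value]
    @published << [matched ? :match : :mismatch,
                   { :experiment => @name, :first => order.first,
                     :control => control_obs, :candidate => candidate_obs }]

    # Assumption: control errors still reach the caller; candidate errors never do.
    raise control_obs[:exception] if control_obs[:exception]
    control_obs[:value]
  end

  private

  # Run one behavior, capturing its value, any exception, and duration in ms.
  def observe(kind)
    block = kind == :control ? @control : @candidate
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    value = exception = nil
    begin
      value = block.call
    rescue StandardError => e
      exception = e
    end
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    { :duration => elapsed * 1000.0, :exception => exception, :value => value }
  end
end

experiment = TinyExperiment.new("widget-permissions")
experiment.control   { 1 + 1 }
experiment.candidate { raise "boom" } # swallowed, recorded in the payload
result = experiment.run
```

Even though the candidate raises, `result` is the control's value, and the failure only shows up in what was published.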
If you'd like a bit less verbosity, the `Dat::Science#science` helper
instantiates an experiment and calls `run`:
```ruby
require "dat/science"

class MyApp::Widget
  include Dat::Science

  def allows?(user)
    science "widget-permissions" do |e|
      e.control   { model.check_user(user).valid? } # old way
      e.candidate { user.can? :read, model }        # new way
    end
  end
end
```
## Making science useful
The examples above will run, but they're not particularly helpful. The
`candidate` block runs every time, and none of the results get
published. Let's fix that by creating an app-specific subclass of
`Dat::Science::Experiment`. This makes it easy to add custom behavior
for enabling/disabling/throttling experiments and publishing results.
```ruby
require "dat/science"

module MyApp
  class Experiment < Dat::Science::Experiment
    def enabled?
      # See "Ramping up experiments" below.
    end

    def publish(name, payload)
      # See "Publishing results" below.
    end
  end
end
```
After creating a subclass, tell `Dat::Science` to instantiate it any time the
`science` helper is called:
```ruby
Dat::Science.experiment = MyApp::Experiment
```
### Controlling comparison
By default the results of the `candidate` and `control` blocks are compared
with `==`. Use `comparator` to do something fancier:
```ruby
science "loose-comparison" do |e|
  e.control    { "vmg" }
  e.candidate  { "VMG" }
  e.comparator { |a, b| a.downcase == b.downcase }
end
```
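A comparator is just a two-argument block that returns something truthy when the values should count as equal. As a rough sketch, here are two more comparisons written as plain lambdas so they can be tried without the library; the names `same_members` and `close_enough` and the `1e-6` tolerance are made up for this example.

```ruby
# Order-insensitive comparison for arrays of sortable values.
same_members = ->(a, b) { a.sort == b.sort }

# Float comparison within an arbitrary tolerance.
close_enough = ->(a, b) { (a - b).abs < 1e-6 }

arrays_match = same_members.call([3, 1, 2], [1, 2, 3])
floats_match = close_enough.call(0.1 + 0.2, 0.3)
```

Assuming the library simply calls the block with the two results, a lambda like this could presumably be handed over as `e.comparator(&same_members)`.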
### Ramping up experiments
By default the `candidate` block of an experiment will run 100% of the time.
This is often a really bad idea when testing live. `Experiment#enabled?` can be
overridden to run all candidates, say, 10% of the time:
```ruby
def enabled?
  rand(100) < 10
end
```
Or, even better, use a feature flag library like [Flipper][]. Delegating the
decision makes it easy to define different rules for each experiment, and can
help keep all your entropy concerns in one place.
[Flipper]: https://github.com/jnunemaker/flipper
```ruby
def enabled?
  MyApp.flipper[name].enabled?
end
```
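If a full feature-flag library is more than you need, per-experiment percentages can be kept in a small lookup. The sketch below is one possible shape, not part of dat-science: the `Rollout` class, its default of 0%, and the injectable random source are all assumptions for illustration.

```ruby
# Hypothetical helper, not part of dat-science.
class Rollout
  # percentages maps experiment names to 0..100; unknown names use default.
  def initialize(percentages, default = 0, rng = Random.new)
    @percentages = percentages
    @default = default
    @rng = rng
  end

  # True for roughly the configured percentage of calls.
  def enabled?(name)
    @rng.rand(100) < @percentages.fetch(name, @default)
  end
end

rollout = Rollout.new({ "widget-permissions" => 10, "huge-results" => 100 })

# Inside an Experiment subclass, enabled? could then delegate:
#
#   def enabled?
#     rollout.enabled?(name)
#   end
```

Passing the random source in makes the throttling easy to test with a seeded `Random`.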
### Publishing results
By default the results of an experiment are discarded. This isn't very useful.
`Experiment#publish` can be overridden to publish results via any
instrumentation mechanism, which makes it easy to graph durations or
matches/mismatches and store results. An experiment publishes only two events:
`:match` when the results of the control and candidate behaviors are the same,
and `:mismatch` when they aren't.
```ruby
def publish(event, payload)
  MyApp.instrument "science.#{event}", payload
end
```
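`MyApp.instrument` above is whatever your app already uses. For a feel of what a receiver might do with the two events, here is a small in-memory sketch; `ResultCounter` and its methods are hypothetical names, and it only reads the `:experiment` key of the payload.

```ruby
# Hypothetical in-memory sink for :match / :mismatch events.
class ResultCounter
  attr_reader :counts

  def initialize
    # One tally per experiment name, created on first use.
    @counts = Hash.new { |hash, key| hash[key] = { :match => 0, :mismatch => 0 } }
  end

  # event is expected to be :match or :mismatch.
  def record(event, payload)
    @counts[payload[:experiment]][event] += 1
  end

  # Fraction of recorded runs that mismatched, or 0.0 if none were recorded.
  def mismatch_rate(experiment)
    tally = @counts[experiment]
    total = tally[:match] + tally[:mismatch]
    total.zero? ? 0.0 : tally[:mismatch].to_f / total
  end
end

counter = ResultCounter.new
3.times { counter.record(:match, { :experiment => "widget-permissions" }) }
counter.record(:mismatch, { :experiment => "widget-permissions" })
rate = counter.mismatch_rate("widget-permissions")
```

An overridden `publish(event, payload)` could forward straight to something like `counter.record(event, payload)`.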
The published `payload` is a Symbol-keyed Hash:
```ruby
{
  :experiment => "widget-permissions",
  :first      => :control,
  :timestamp  => <a-Time-instance>,
  :candidate  => {
    :duration  => 2.5,
    :exception => nil,
    :value     => 42
  },
  :control    => {
    :duration  => 25.0,
    :exception => nil,
    :value     => 24
  }
}
```
`:experiment` is the name of the experiment. `:first` is either `:candidate` or
`:control`, depending on which block was run first during the experiment.
`:timestamp` is the Time when the experiment started.
The `:candidate` and `:control` Hashes have the same keys:
* `:duration` is the execution time in ms, expressed as a float.
* `:exception` is a reference to any raised exception or `nil`.
* `:value` is the result of the block.
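As an illustration of reading these keys, the sketch below boils one payload down to a few numbers worth graphing. The `summarize` helper and its output keys are invented for this example; only the payload layout comes from the description above.

```ruby
# Hypothetical helper: reduce one payload to a few graphable numbers.
def summarize(payload)
  control = payload[:control]
  candidate = payload[:candidate]

  # How many times faster the candidate was; nil if it took no measurable time.
  speedup = candidate[:duration] > 0 ? control[:duration] / candidate[:duration] : nil

  {
    :experiment       => payload[:experiment],
    :candidate_raised => !candidate[:exception].nil?,
    :delta_ms         => candidate[:duration] - control[:duration],
    :speedup          => speedup
  }
end

summary = summarize(
  :experiment => "widget-permissions",
  :first      => :control,
  :candidate  => { :duration => 2.5,  :exception => nil, :value => 42 },
  :control    => { :duration => 25.0, :exception => nil, :value => 24 }
)
```

A negative `:delta_ms` means the candidate finished faster than the control.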
#### Adding context
It's often useful to add more information to your results, and
`Experiment#context` makes it easy:
```ruby
science "widget-permissions" do |e|
  e.context :user => user

  e.control   { model.check_user(user).valid? } # old way
  e.candidate { user.can? :read, model }        # new way
end
```
`context` takes a Symbol-keyed Hash of additional information to publish and
merges it with the default payload.
#### Keeping it clean
Sometimes the things you're comparing can be huge, and there's no good way
to do science against something simpler. Use a `cleaner` to publish a
simple version of a big nasty object graph:
```ruby
science "huge-results" do |e|
  e.control   { OldAndBusted.huge_results_for query }
  e.candidate { NewHotness.huge_results_for query }
  e.cleaner   { |result| result.count }
end
```
The results of the `control` and `candidate` blocks will be run through the
`cleaner`. You could get the same behavior by calling `count` in the blocks,
but the `cleaner` makes it easier to keep things in sync. The original
`control` result is still returned.
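The split between what gets published and what gets returned can be sketched without the library. The helper below is hypothetical and shows only the cleaner step; it does not attempt to reproduce how dat-science compares or times the results.

```ruby
# Hypothetical helper showing only the cleaner step; not dat-science code.
def observe_with_cleaner(control, candidate, cleaner)
  control_value = control.call
  candidate_value = candidate.call

  # Only the cleaned values go into what would be published.
  payload = {
    :control   => { :value => cleaner.call(control_value) },
    :candidate => { :value => cleaner.call(candidate_value) }
  }

  # The caller still receives the raw control result.
  [control_value, payload]
end

returned, payload = observe_with_cleaner(
  -> { (1..1000).to_a },
  -> { (1..999).to_a },
  ->(result) { result.count }
)
```

Here `returned` is the full 1000-element array, while `payload` carries only the two counts.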
## What do I do with all these results?
Once you've started an experiment and published some results, you'll want to
analyze the mismatches from your experiment. Check out
[`dat-analysis`](https://github.com/github/dat-analysis) where you'll find an
analysis toolkit to help you understand your experiment results.
## Hacking on science
Be on a Unixy box. Make sure a modern Bundler is available. `script/test` runs
the unit tests. All development dependencies will be installed automatically if
they're not available. Dat science happens primarily on Ruby 1.9.3 and 1.8.7,
but science should be universal.
## Maintainers
[@jbarnette](https://github.com/jbarnette) and [@rick](https://github.com/rick)