{"id":32202374,"url":"https://github.com/vathymut/dsos","last_synced_at":"2025-10-30T19:44:33.088Z","repository":{"id":44956059,"uuid":"430799878","full_name":"vathymut/dsos","owner":"vathymut","description":"Dataset shift with outlier scores","archived":false,"fork":true,"pushed_at":"2023-02-19T07:40:07.000Z","size":1518,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-22T04:04:56.798Z","etag":null,"topics":["data-drift","data-validation","dataset-shifts","drift-detection","machine-learning","mlops","model-monitoring","model-validation","performance-monitoring","r","statistical-process-control","statistical-tests"],"latest_commit_sha":null,"homepage":"https://vathymut.github.io/dsos/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"rbc-research/dsos","license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vathymut.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-11-22T17:17:27.000Z","updated_at":"2024-06-12T16:45:03.000Z","dependencies_parsed_at":"2023-02-09T20:46:27.263Z","dependency_job_id":null,"html_url":"https://github.com/vathymut/dsos","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/vathymut/dsos","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vathymut%2Fdsos","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vathymut%2Fdsos/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vathymut%2Fdsos/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vathymut%2Fdsos/manifests","owner_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vathymut","download_url":"https://codeload.github.com/vathymut/dsos/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vathymut%2Fdsos/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280849659,"owners_count":26401812,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-24T02:00:06.418Z","response_time":73,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-drift","data-validation","dataset-shifts","drift-detection","machine-learning","mlops","model-monitoring","model-validation","performance-monitoring","r","statistical-process-control","statistical-tests"],"created_at":"2025-10-22T04:01:40.068Z","updated_at":"2025-10-30T19:44:33.078Z","avatar_url":"https://github.com/vathymut.png","language":"R","readme":"---\noutput: github_document\n---\n\n# `D-SOS`: Dataset shift with outlier scores\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n\u003c!-- badges: start --\u003e\n[![Lifecycle: maturing](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://www.tidyverse.org/lifecycle)\n[![License: GPL3](https://img.shields.io/badge/License-GPL3-green.svg)](https://www.gnu.org/licenses/gpl-3.0.en.html)\n[![CRAN](https://www.r-pkg.org/badges/version/dsos)](https://cran.r-project.org/package=dsos)\n[![UAI 2022](https://img.shields.io/badge/paper-UAI 2022-yellow)](https://openreview.net/forum?id=S5UG2BLi9xc)\n[![downloads](https://cranlogs.r-pkg.org/badges/dsos)](https://cran.r-project.org/package=dsos)\n[![total-downloads](http://cranlogs.r-pkg.org/badges/grand-total/dsos)](https://cran.r-project.org/package=dsos)\n[![useR! 2022](https://img.shields.io/youtube/views/TALE9JUir8Q?style=social)](https://youtu.be/TALE9JUir8Q?t=26)\n\u003c!-- badges: end --\u003e\n\n## Overview\n\n`dsos` tests for no adverse shift based on outlier scores. Colloquially,\nthese tests check whether the new sample is not substantively worse than the\nold sample, rather than whether the two are equal, as tests of equal\ndistributions do. `dsos` implements a family of two-sample comparisons that\nassume we have both a training set, the reference distribution, and a test set.\n\n## Installation\n\nThe package is under active development.\nFrom GitHub (which includes recent improvements), install with:\n\n```{r github, eval=FALSE}\n# install.packages(\"remotes\")\nremotes::install_github(\"vathymut/dsos\")\n```\n\nThe package is also on [CRAN](https://CRAN.R-project.org), although the\nCRAN release may lag behind GitHub updates. From CRAN, install the\npackage with:\n\n```{r cran, eval=FALSE}\ninstall.packages(\"dsos\")\n```\n\n## Quick Start\n\nSimulate outlier scores to test for no adverse shift when the null (no\nshift) holds. 
First, we use the frequentist permutation test:\n\n```{r null_pt, eval=TRUE}\nlibrary(dsos)\nset.seed(12345)\nn \u003c- 6e2\nos_train \u003c- rnorm(n = n)\nos_test \u003c- rnorm(n = n)\nnull_pt \u003c- pt_from_os(os_train, os_test)\nplot(null_pt)\n```\n\nWe can also use the (faster) asymptotic test:\n\n```{r null_at, eval=FALSE}\nnull_at \u003c- at_from_os(os_train, os_test)\nplot(null_at)\n```\n\nDoing the same exercise the Bayesian way (with Bayes factors):\n\n```{r null_bf, eval=TRUE}\nnull_bf \u003c- bf_from_os(os_train, os_test)\n# plot(null_bf)\nas_pvalue(null_bf$bayes_factor)\n```\n\nIn all cases, we fail to reject the null of no adverse shift. Note how we\ncan convert a Bayes factor into a $p$-value.\n\nWe can repeat this exercise when there is an adverse shift. Again, with\nthe permutation test:\n\n```{r shift_pt, eval=TRUE}\nos_shift \u003c- rnorm(n = n, mean = 0.2)\nshift_pt \u003c- pt_from_os(os_train, os_shift)\nplot(shift_pt)\n```\n\nOnce more, with the asymptotic test:\n\n```{r shift_at, eval=FALSE}\nshift_at \u003c- at_from_os(os_train, os_shift)\nplot(shift_at)\n```\n\nDoing it the Bayesian way (with Bayes factors):\n\n```{r shift_bf, eval=TRUE}\nshift_bf \u003c- bf_from_os(os_train, os_shift)\n# plot(shift_bf)\nas_pvalue(shift_bf$bayes_factor)\n```\n\nWe would reject the null of no adverse shift in all cases: the test set\nis worse off relative to the reference (training) scores.\n\nThe function `bf_compare` is handy: it computes and contrasts Bayes\nfactors for the frequentist and Bayesian approaches.\n\n```{r shift_all, eval=TRUE}\nshift_all \u003c- bf_compare(os_train, os_shift)\nshift_all\n```\n\n## Reference\n\nTo cite this work, please refer to the\n[paper](https://openreview.net/forum?id=S5UG2BLi9xc). Sample BibTeX is below:\n\n```bibtex\n@inproceedings{kamulete2022test,\n  title     = {Test for non-negligible adverse shifts},\n  author    = {Vathy M. 
Kamulete},\n  booktitle = {The 38th Conference on Uncertainty in Artificial Intelligence},\n  year      = {2022},\n  url       = {https://openreview.net/forum?id=S5UG2BLi9xc}\n}\n```\n\nI gave a talk introducing the `dsos` R package at\n[useR! 2022](https://youtu.be/TALE9JUir8Q?t=26) during the\n'Unique Applications and Methods' track. It is a 15-minute crash course,\nfocused on interpretation. I also wrote a \n[blog post](https://vathymut.org/posts/2023-01-03-are-you-ok/)\nto motivate the need for tests of adverse shift.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvathymut%2Fdsos","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvathymut%2Fdsos","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvathymut%2Fdsos/lists"}