{"id":13586788,"url":"https://github.com/business-science/anomalize","last_synced_at":"2025-04-09T10:04:28.782Z","repository":{"id":49503681,"uuid":"125931913","full_name":"business-science/anomalize","owner":"business-science","description":"Tidy anomaly detection","archived":false,"fork":false,"pushed_at":"2023-12-28T15:19:53.000Z","size":43937,"stargazers_count":339,"open_issues_count":38,"forks_count":61,"subscribers_count":23,"default_branch":"master","last_synced_at":"2025-04-02T08:11:08.521Z","etag":null,"topics":["anomaly","anomaly-detection","decomposition","detect-anomalies","iqr","r-package","time-series"],"latest_commit_sha":null,"homepage":"https://business-science.github.io/anomalize/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/business-science.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2018-03-19T23:08:52.000Z","updated_at":"2025-03-22T11:08:45.000Z","dependencies_parsed_at":"2022-09-13T00:13:39.999Z","dependency_job_id":"8c03364f-8dc2-4998-9608-5596bb5bd2b2","html_url":"https://github.com/business-science/anomalize","commit_stats":{"total_commits":89,"total_committers":6,"mean_commits":"14.833333333333334","dds":0.0898876404494382,"last_synced_commit":"afd9be0a82e0d6d1e41346d2f4f3436c421e9e1a"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/business-science%2Fanomalize","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/business-science%2Fanomalize/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/business-science%2Fanomalize/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/business-science%2Fanomalize/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/business-science","download_url":"https://codeload.github.com/business-science/anomalize/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248018060,"owners_count":21034048,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anomaly","anomaly-detection","decomposition","detect-anomalies","iqr","r-package","time-series"],"created_at":"2024-08-01T15:05:48.711Z","updated_at":"2025-04-09T10:04:28.746Z","avatar_url":"https://github.com/business-science.png","language":"R","funding_links":[],"categories":["R","Time Series","Machine Learning"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n# Anomalize is being Superseded by Timetk:\n\n# anomalize \u003cimg src=\"man/figures/anomalize-logo.png\" width=\"147\" height=\"170\" align=\"right\" /\u003e\n\n\u003c!-- badges: start --\u003e\n[![R-CMD-check](https://github.com/business-science/anomalize/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/business-science/anomalize/actions/workflows/R-CMD-check.yaml)\n[![Lifecycle Status](https://img.shields.io/badge/lifecycle-superceded-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html)\n[![Coverage 
status](https://codecov.io/gh/business-science/anomalize/branch/master/graph/badge.svg)](https://app.codecov.io/github/business-science/anomalize?branch=master)\n[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/anomalize)](https://cran.r-project.org/package=anomalize)\n![](http://cranlogs.r-pkg.org/badges/anomalize?color=brightgreen)\n![](http://cranlogs.r-pkg.org/badges/grand-total/anomalize?color=brightgreen)\n\u003c!-- badges: end --\u003e\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n```{r setup, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\",\n  dpi = 200,\n  message = F,\n  warning = F\n)\nlibrary(anomalize)\nlibrary(dplyr) # for pipe \n```\n\n\nThe `anomalize` package functionality has been superseded by `timetk`. We suggest you begin using `timetk::anomalize()` to benefit from enhanced functionality and future improvements. [Learn more about Anomaly Detection with `timetk` here.](https://business-science.github.io/timetk/articles/TK08_Automatic_Anomaly_Detection.html) \n\nThe original `anomalize` package functionality will be maintained for previous code bases that use the legacy functionality. \n\nTo prevent the new `timetk` functionality from conflicting with old `anomalize` code, use these lines:\n\n``` r\nlibrary(anomalize)\n\nanomalize \u003c- anomalize::anomalize\nplot_anomalies \u003c- anomalize::plot_anomalies\n```\n\n\n\u003c!-- # anomalize --\u003e\n\n\n\n\u003e Tidy anomaly detection\n\n`anomalize` enables a tidy workflow for detecting anomalies in data. The main functions are `time_decompose()`, `anomalize()`, and `time_recompose()`. 
When combined, they make it simple to decompose time series, detect anomalies, and create bands separating the \"normal\" data from the anomalous data.\n\n## Anomalize In 2 Minutes (YouTube)\n\n\u003ca href=\"https://www.youtube.com/watch?v=Gk_HwjhlQJs\" target=\"_blank\"\u003e\u003cimg src=\"http://img.youtube.com/vi/Gk_HwjhlQJs/0.jpg\"\nalt=\"Anomalize\" width=\"100%\" height=\"350\"/\u003e\u003c/a\u003e\n\nCheck out our entire [Software Intro Series](https://www.youtube.com/watch?v=Gk_HwjhlQJs\u0026list=PLo32uKohmrXsYNhpdwr15W143rX6uMAze) on YouTube!\n\n## Installation\n\nYou can install the development version with `devtools` or the most recent CRAN version with `install.packages()`:\n\n``` r\n# devtools::install_github(\"business-science/anomalize\")\ninstall.packages(\"anomalize\")\n```\n\n## How It Works\n\n`anomalize` has three main functions:\n\n- `time_decompose()`: Separates the time series into seasonal, trend, and remainder components.\n- `anomalize()`: Applies anomaly detection methods to the remainder component.\n- `time_recompose()`: Calculates limits that separate the \"normal\" data from the anomalies.\n\n## Getting Started\n\nLoad the `anomalize` package. Usually, you will load the tidyverse as well!\n\n```{r, eval = F}\nlibrary(anomalize)\nlibrary(tidyverse)\n# NOTE: timetk now has anomaly detection built in, which \n#  will get the new functionality going forward.\n#  Use this script to prevent overwriting legacy anomalize:\n\nanomalize \u003c- anomalize::anomalize\nplot_anomalies \u003c- anomalize::plot_anomalies\n```\n\n\nNext, let's get some data.  `anomalize` ships with a data set called `tidyverse_cran_downloads` that contains the daily CRAN download counts for 15 \"tidy\" packages from 2017-01-01 to 2018-03-01.\n\nSuppose we want to determine which daily download \"counts\" are anomalous. 
It's as easy as using the three main functions (`time_decompose()`, `anomalize()`, and `time_recompose()`) along with a visualization function, `plot_anomalies()`.\n\n```{r tidyverse_anoms_1, fig.height=8}\ntidyverse_cran_downloads %\u003e%\n    # Data Manipulation / Anomaly Detection\n    time_decompose(count, method = \"stl\") %\u003e%\n    anomalize(remainder, method = \"iqr\") %\u003e%\n    time_recompose() %\u003e%\n    # Anomaly Visualization\n    plot_anomalies(time_recomposed = TRUE, ncol = 3, alpha_dots = 0.25) +\n    ggplot2::labs(title = \"Tidyverse Anomalies\", subtitle = \"STL + IQR Methods\") \n```\n\nCheck out the [`anomalize` Quick Start Guide](https://business-science.github.io/anomalize/articles/anomalize_quick_start_guide.html). \n\n## Reducing Forecast Error by 32%\n\nYes! Anomalize has a new function, `clean_anomalies()`, that can be used to repair time series prior to forecasting. We have a [brand new vignette - Reduce Forecast Error (by 32%) with Cleaned Anomalies](https://business-science.github.io/anomalize/articles/forecasting_with_cleaned_anomalies.html).\n```{r}\ntidyverse_cran_downloads %\u003e%\n    dplyr::filter(package == \"lubridate\") %\u003e%\n    dplyr::ungroup() %\u003e%\n    time_decompose(count) %\u003e%\n    anomalize(remainder) %\u003e%\n  \n    # New function that cleans \u0026 repairs anomalies!\n    clean_anomalies() %\u003e%\n  \n    dplyr::select(date, anomaly, observed, observed_cleaned) %\u003e%\n    dplyr::filter(anomaly == \"Yes\")\n```\n\n\n## But Wait, There's More!\n\nThere are several extra capabilities:\n\n- `plot_anomaly_decomposition()` for visualizing the inner workings of how the algorithm detects anomalies in the \"remainder\". 
\n\n```{r, fig.height=7}\ntidyverse_cran_downloads %\u003e%\n    dplyr::filter(package == \"lubridate\") %\u003e%\n    dplyr::ungroup() %\u003e%\n    time_decompose(count) %\u003e%\n    anomalize(remainder) %\u003e%\n    plot_anomaly_decomposition() +\n    ggplot2::labs(title = \"Decomposition of Anomalized Lubridate Downloads\")\n```\n\nFor more information on the `anomalize` methods and the inner workings, please see [\"Anomalize Methods\" Vignette](https://business-science.github.io/anomalize/articles/anomalize_methods.html). \n\n## References\n\nSeveral other packages were instrumental in developing anomaly detection methods used in `anomalize`:\n\n- Twitter's `AnomalyDetection`, which implements decomposition using median spans and the Generalized Extreme Studentized Deviation (GESD) test for anomalies.\n- `forecast::tsoutliers()` function, which implements the IQR method. \n\n# Interested in Learning Anomaly Detection?\n\nBusiness Science offers two 1-hour courses on Anomaly Detection:\n\n- [Learning Lab 18](https://university.business-science.io/p/learning-labs-pro) - Time Series Anomaly Detection with `anomalize`\n\n- [Learning Lab 17](https://university.business-science.io/p/learning-labs-pro) - Anomaly Detection with `H2O` Machine Learning\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbusiness-science%2Fanomalize","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbusiness-science%2Fanomalize","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbusiness-science%2Fanomalize/lists"}