{"id":13655963,"url":"https://github.com/ethanrosenthal/skits","last_synced_at":"2025-12-30T00:09:14.915Z","repository":{"id":57468089,"uuid":"120064371","full_name":"EthanRosenthal/skits","owner":"EthanRosenthal","description":"scikit-learn-inspired time series","archived":false,"fork":false,"pushed_at":"2024-03-20T15:45:26.000Z","size":348,"stargazers_count":200,"open_issues_count":5,"forks_count":18,"subscribers_count":15,"default_branch":"master","last_synced_at":"2025-04-19T16:25:58.144Z","etag":null,"topics":["machine-learning","time-series"],"latest_commit_sha":null,"homepage":"https://www.ethanrosenthal.com/2018/03/22/time-series-for-scikit-learn-people-part2/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EthanRosenthal.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-02-03T06:02:58.000Z","updated_at":"2025-02-27T15:47:51.000Z","dependencies_parsed_at":"2024-06-19T09:59:56.037Z","dependency_job_id":"1690a380-c56c-4a7e-96fe-1b2a11ca3b58","html_url":"https://github.com/EthanRosenthal/skits","commit_stats":{"total_commits":18,"total_committers":3,"mean_commits":6.0,"dds":0.2777777777777778,"last_synced_commit":"66ba312da560e7c6b1784e7b93c1d28571e3659d"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EthanRosenthal%2Fskits","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EthanRosenthal%2Fskits/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/EthanRosenthal%2Fskits/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EthanRosenthal%2Fskits/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/EthanRosenthal","download_url":"https://codeload.github.com/EthanRosenthal/skits/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250480329,"owners_count":21437524,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","time-series"],"created_at":"2024-08-02T04:00:43.697Z","updated_at":"2025-12-30T00:09:14.876Z","avatar_url":"https://github.com/EthanRosenthal.png","language":"Python","funding_links":[],"categories":["Uncategorized"],"sub_categories":["Uncategorized"],"readme":"# skits\n[![CircleCI](https://circleci.com/gh/EthanRosenthal/skits/tree/master.svg?style=svg)](https://circleci.com/gh/EthanRosenthal/skits/tree/master)\n[![PyPI version](https://badge.fury.io/py/skits.svg)](https://badge.fury.io/py/skits)\n\nA library for\n**S**ci**K**it-learn-**I**nspired **T**ime **S**eries models.\n\nThe primary goal of this library is to allow one to train time series prediction models using a similar API to `scikit-learn`. Consequently, similar to `scikit-learn`, this library consists of `preprocessors`, `feature_extractors`, and `pipelines`. 
\n\n## Installation\n\nInstall with pip:\n\n```commandline\npip install skits\n```\n\n## Preprocessors\n\nThe preprocessors expect to receive time series data and store enough information about the series that they can fully invert a transform. The following example shows how to create a `DifferenceTransformer`, transform data, and then invert it back to its original form. The `DifferenceTransformer` subtracts from each point the point `period` steps before it.\n\n```python\nimport numpy as np\nfrom skits.preprocessing import DifferenceTransformer\n\ny = np.random.random(10)\n# scikit-learn expects 2D design matrices,\n# so we reuse the time series as a single-column X.\nX = y[:, np.newaxis]\n\ndt = DifferenceTransformer(period=2)\n\nXt = dt.fit_transform(X, y)\nX_inv = dt.inverse_transform(Xt)\n\nassert np.allclose(X, X_inv)\n```\n\n## Feature Extractors\n\nAfter all preprocessing transformations are completed, multiple features may be built out of the time series. These can be built via feature extractors, which one should combine together into a large `FeatureUnion`. Current features include autoregressive, seasonal, and integrated features (covering the AR and I of ARIMA models).\n\n## Pipelines\n\nThere are two types of pipelines. The `ForecasterPipeline` is for forecasting time series (duh). Specifically, one should build this pipeline with a regressor as the final step such that one can make appropriate predictions. The functionality is similar to a regular `scikit-learn` pipeline. Differences include the addition of a `forecast()` method along with a `to_scale` keyword argument to `predict()` so that one can make sure that the prediction is on the same scale as the original data.\n\nThese classes are likely subject to change as they are fairly hacky right now. For example, one must transform both `X` and `y` for all transformations before the introduction of a `DifferenceTransformer`. 
While the pipeline handles this, one must prefix all of these transformations with `pre_` in the step names.\n\nAnywho, here's an example:\n\n```python\nimport numpy as np\nfrom sklearn.linear_model import LinearRegression\nfrom sklearn.preprocessing import StandardScaler\nfrom sklearn.pipeline import FeatureUnion\n\nfrom skits.pipeline import ForecasterPipeline\nfrom skits.preprocessing import ReversibleImputer\nfrom skits.feature_extraction import (AutoregressiveTransformer, \n                                      SeasonalTransformer)\n                               \nsteps = [\n    ('pre_scaling', StandardScaler()),\n    ('features', FeatureUnion([\n        ('ar_transformer', AutoregressiveTransformer(num_lags=3)),\n        ('seasonal_transformer', SeasonalTransformer(seasonal_period=20)\n    )])),\n    ('post_features_imputer', ReversibleImputer()),\n    ('regressor', LinearRegression(fit_intercept=False))\n]\n                               \nl = np.linspace(0, 1, 101)\ny = 5*np.sin(2 * np.pi * 5 * l) + np.random.normal(0, 1, size=101)\nX = y[:, np.newaxis]\n\npipeline = ForecasterPipeline(steps)\n\npipeline.fit(X, y)\ny_pred = pipeline.predict(X, to_scale=True, refit=True)\n```\n\nAnd this ends up looking like:\n\n```python\nimport matplotlib.pyplot as plt\n\nplt.plot(y, lw=2)\nplt.plot(y_pred, lw=2)\nplt.legend(['y_true', 'y_pred'], bbox_to_anchor=(1, 1));\n```\n![pred](pred.png)\n\nAnd forecasting looks like\n\n```python\nstart_idx = 70\nplt.plot(y, lw=2);\nplt.plot(pipeline.forecast(y[:, np.newaxis], start_idx=start_idx), lw=2);\nax = plt.gca();\nylim = ax.get_ylim();\nplt.plot((start_idx, start_idx), ylim, lw=4);\nplt.ylim(ylim);\nplt.legend(['y_true', 'y_pred', 'forecast start'], bbox_to_anchor=(1, 
1));\n```\n![forecast](forecast.png)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fethanrosenthal%2Fskits","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fethanrosenthal%2Fskits","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fethanrosenthal%2Fskits/lists"}