{"id":24445158,"url":"https://github.com/autogluon/fev","last_synced_at":"2026-01-16T16:33:09.432Z","repository":{"id":267284673,"uuid":"897142969","full_name":"autogluon/fev","owner":"autogluon","description":"Forecast evaluation library","archived":false,"fork":false,"pushed_at":"2026-01-09T00:45:49.000Z","size":1843,"stargazers_count":136,"open_issues_count":6,"forks_count":13,"subscribers_count":5,"default_branch":"main","last_synced_at":"2026-01-11T12:14:39.620Z","etag":null,"topics":["benchmarking","datasets","forecasting","huggingface-datasets","time-series","time-series-forecasting","timeseries"],"latest_commit_sha":null,"homepage":"https://autogluon.github.io/fev/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/autogluon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-12-02T05:26:37.000Z","updated_at":"2026-01-08T09:42:19.000Z","dependencies_parsed_at":null,"dependency_job_id":"aa0d5272-11f1-4642-b644-d85864ed8e4b","html_url":"https://github.com/autogluon/fev","commit_stats":null,"previous_names":["autogluon/fev"],"tags_count":19,"template":false,"template_full_name":"amazon-archives/__template_Apache-2.0","purl":"pkg:github/autogluon/fev","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autogluon%2Ffev","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autogluon%2Ffev/tags","releases_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/autogluon%2Ffev/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autogluon%2Ffev/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/autogluon","download_url":"https://codeload.github.com/autogluon/fev/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autogluon%2Ffev/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28479921,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T11:59:17.896Z","status":"ssl_error","status_checked_at":"2026-01-16T11:55:55.838Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmarking","datasets","forecasting","huggingface-datasets","time-series","time-series-forecasting","timeseries"],"created_at":"2025-01-20T23:17:06.615Z","updated_at":"2026-01-16T16:33:09.417Z","avatar_url":"https://github.com/autogluon.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# 
fev\n\n[![preprint](https://img.shields.io/static/v1?label=Paper\u0026message=2509.26468\u0026color=B31B1B\u0026logo=arXiv)](https://arxiv.org/abs/2509.26468)\n[![fev-bench](https://img.shields.io/badge/%F0%9F%8F%86%20fev--bench-Leaderboard-0078D4)](https://huggingface.co/spaces/autogluon/fev-leaderboard)\n[![huggingface](https://img.shields.io/badge/%F0%9F%A4%97%20HF-Chronos_Datasets-FFD21E)](https://huggingface.co/datasets/autogluon/chronos_datasets)\n[![huggingface](https://img.shields.io/badge/%F0%9F%A4%97%20HF-fev_Datasets-FFD21E)](https://huggingface.co/datasets/autogluon/fev_datasets)\n[![License: Apache-2.0](https://img.shields.io/badge/License-Apache--2.0-green.svg)](https://opensource.org/licenses/Apache-2.0)\n\n\u003c/div\u003e\n\n`fev` (Forecast EValuation library) is a lightweight package that makes it easy to benchmark time series forecasting models.\n\n- Extensible: Easy to define your own forecasting tasks and benchmarks.\n- Reproducible: Ensures that the results obtained by different users are comparable.\n- Easy to use: Compatible with most popular forecasting libraries.\n- Minimal dependencies: Just a thin wrapper on top of 🤗[`datasets`](https://huggingface.co/docs/datasets/en/index).\n\n### How is `fev` different from other benchmarking tools?\n\nExisting forecasting benchmarks usually fall into one of two categories:\n\n- Standalone datasets without any supporting infrastructure. These provide no guarantees that the results obtained by different users are comparable. For example, changing the start date or duration of the forecast horizon totally changes the meaning of the scores.\n- Bespoke end-to-end systems that combine models, datasets and forecasting tasks. 
Such packages usually come with lots of dependencies and assumptions, which makes extending or integrating these libraries into existing systems difficult.\n\n`fev` aims for the middle ground - it provides the core benchmarking functionality without introducing unnecessary constraints or bloated dependencies. The library supports point \u0026 probabilistic forecasting, different types of covariates, as well as all popular forecasting metrics.\n\n## 📝 Updates\n- **2025-09-16**: The new version `0.6.0` contains major new functionality, [updated documentation](https://autogluon.github.io/fev/latest/), as well as some breaking changes to the `Task` API. Please check the [release notes](https://github.com/autogluon/fev/releases) for more details.\n\n## ⚙️ Installation\n```\npip install fev\n```\n\n## 🚀 Quickstart\n\nCreate a task from a dataset stored on Hugging Face Hub\n```python\nimport fev\n\ntask = fev.Task(\n    dataset_path=\"autogluon/chronos_datasets\",\n    dataset_config=\"m4_hourly\",\n    horizon=24,\n)\n```\nIterate over the rolling evaluation windows:\n```python\nfor window in task.iter_windows():\n    past_data, future_data = window.get_input_data()\n```\n- `past_data` contains the past data before the forecast horizon (item ID, past timestamps, target, all covariates).\n- `future_data` contains future data that is known at prediction time (item ID, future timestamps, and known covariates)\n\nMake predictions\n```python\ndef naive_forecast(y: list, horizon: int) -\u003e dict[str, list[float]]:\n    # Make predictions for a single time series\n    return {\"predictions\": [y[-1] for _ in range(horizon)]}\n\npredictions_per_window = []\nfor window in task.iter_windows():\n    past_data, future_data = window.get_input_data()\n    predictions = [\n        naive_forecast(ts[task.target_column], task.horizon) for ts in past_data\n    ]\n    predictions_per_window.append(predictions)\n```\nGet an evaluation 
summary\n```python\ntask.evaluation_summary(predictions_per_window, model_name=\"naive\")\n# {'model_name': 'naive',\n#  'dataset_path': 'autogluon/chronos_datasets',\n#  'dataset_config': 'm4_hourly',\n#  'horizon': 24,\n#  'num_windows': 1,\n#  'initial_cutoff': -24,\n#  'window_step_size': 24,\n#  'min_context_length': 1,\n#  'max_context_length': None,\n#  'seasonality': 1,\n#  'eval_metric': 'MASE',\n#  'extra_metrics': [],\n#  'quantile_levels': None,\n#  'id_column': 'id',\n#  'timestamp_column': 'timestamp',\n#  'target_column': 'target',\n#  'generate_univariate_targets_from': None,\n#  'past_dynamic_columns': [],\n#  'excluded_columns': [],\n#  'task_name': 'm4_hourly',\n#  'test_error': 3.815112047601983,\n#  'training_time_s': None,\n#  'inference_time_s': None,\n#  'dataset_fingerprint': '19e36bb78b718d8d',\n#  'trained_on_this_dataset': False,\n#  'fev_version': '0.6.0',\n#  'MASE': 3.815112047601983}\n```\nThe evaluation summary contains all information necessary to uniquely identify the forecasting task.\n\nMultiple evaluation summaries produced by different models on different tasks can be aggregated into a single table.\n```python\n# Dataframes, dicts, JSON or CSV files supported\nsummaries = \"https://raw.githubusercontent.com/autogluon/fev/refs/heads/main/benchmarks/example/results/results.csv\"\nfev.leaderboard(summaries)\n# | model_name     |   skill_score |   win_rate | ... |\n# |:---------------|--------------:|-----------:| ... |\n# | auto_theta     |         0.126 |      0.667 | ... |\n# | auto_arima     |         0.113 |      0.667 | ... |\n# | auto_ets       |         0.049 |      0.444 | ... |\n# | seasonal_naive |         0     |      0.222 | ... 
|\n```\n\n## 📚 Documentation\n- Tutorials\n    - [Quickstart](https://autogluon.github.io/fev/latest/tutorials/01-quickstart/): Define a task and evaluate a model.\n    - [Datasets](https://autogluon.github.io/fev/latest/tutorials/02-dataset-format/): Use `fev` with your own datasets.\n    - [Tasks \u0026 benchmarks](https://autogluon.github.io/fev/latest/tutorials/03-tasks-and-benchmarks/): Advanced features for defining tasks and benchmarks.\n    - [Adapters](https://autogluon.github.io/fev/latest/tutorials/04-adapters/): Easily convert data into formats expected by popular time series libraries like [AutoGluon](https://auto.gluon.ai/), [Nixtlaverse](https://nixtlaverse.nixtla.io/), [GluonTS](https://ts.gluon.ai/), [Darts](https://unit8co.github.io/darts/) and more.\n    - [Models](https://autogluon.github.io/fev/latest/tutorials/05-add-your-model/): Evaluate your models and submit results to the leaderboard.\n- [API reference](https://autogluon.github.io/fev/latest/api/task/)\n\nExamples of model implementations compatible with `fev` are available in [`examples/`](./examples/).\n\n\n## 🏅 Leaderboards\nWe host leaderboards obtained using `fev` under https://huggingface.co/spaces/autogluon/fev-bench. This leaderboard includes results for the benchmark from [fev-bench: A Realistic Benchmark for Time Series Forecasting](https://arxiv.org/abs/2509.26468). 
Previous results for Chronos Benchmark II are available in [benchmarks/chronos_zeroshot/](benchmarks/chronos_zeroshot/).\n\n## 📈 Datasets\nRepositories with datasets in a format compatible with `fev`:\n- [`chronos_datasets`](https://huggingface.co/datasets/autogluon/chronos_datasets)\n- [`fev_datasets`](https://huggingface.co/datasets/autogluon/fev_datasets)\n\n## Citation\n\nIf you find this package useful for your research, please consider citing the associated paper(s):\n```\n@article{shchur2025fev,\n  title={{fev-bench}: A Realistic Benchmark for Time Series Forecasting},\n  author={Shchur, Oleksandr and Ansari, Abdul Fatir and Turkmen, Caner and Stella, Lorenzo and Erickson, Nick and Guerron, Pablo and Bohlke-Schneider, Michael and Wang, Yuyang},\n  year={2025},\n  eprint={2509.26468},\n  archivePrefix={arXiv},\n  primaryClass={cs.LG}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautogluon%2Ffev","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fautogluon%2Ffev","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautogluon%2Ffev/lists"}