{"id":13869709,"url":"https://github.com/janosh/tensorboard-reducer","last_synced_at":"2025-04-05T09:04:57.830Z","repository":{"id":41209465,"uuid":"354585417","full_name":"janosh/tensorboard-reducer","owner":"janosh","description":"Reduce multiple PyTorch TensorBoard runs to new event (or CSV) files.","archived":false,"fork":false,"pushed_at":"2025-01-06T17:53:19.000Z","size":1579,"stargazers_count":71,"open_issues_count":2,"forks_count":4,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-29T08:06:54.333Z","etag":null,"topics":["averaging","csv","machine-learning","pytorch","reducer","tensorboard","tensorboard-pytorch","tensorboard-reducer"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/tensorboard-reducer","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/janosh.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"license","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"citation.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-04T15:57:50.000Z","updated_at":"2025-02-01T18:57:53.000Z","dependencies_parsed_at":"2023-02-17T11:50:20.749Z","dependency_job_id":"4bae6dfa-f59a-4127-839b-a6191971c543","html_url":"https://github.com/janosh/tensorboard-reducer","commit_stats":{"total_commits":84,"total_committers":2,"mean_commits":42.0,"dds":0.1428571428571429,"last_synced_commit":"d713353dd3a2bc502e52650060a7e5bd17e42b65"},"previous_names":[],"tags_count":19,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/janosh%2Ftensorboard-reducer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/janosh%2F
tensorboard-reducer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/janosh%2Ftensorboard-reducer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/janosh%2Ftensorboard-reducer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/janosh","download_url":"https://codeload.github.com/janosh/tensorboard-reducer/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247312077,"owners_count":20918344,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["averaging","csv","machine-learning","pytorch","reducer","tensorboard","tensorboard-pytorch","tensorboard-reducer"],"created_at":"2024-08-05T20:01:13.024Z","updated_at":"2025-04-05T09:04:57.763Z","avatar_url":"https://github.com/janosh.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"![TensorBoard Reducer](https://raw.githubusercontent.com/janosh/tensorboard-reducer/main/assets/tensorboard-reducer.svg)\n\n[![Tests](https://github.com/janosh/tensorboard-reducer/actions/workflows/test.yml/badge.svg)](https://github.com/janosh/tensorboard-reducer/actions/workflows/test.yml)\n[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/janosh/tensorboard-reducer/main.svg)](https://results.pre-commit.ci/latest/github/janosh/tensorboard-reducer/main)\n[![Requires Python 
3.8+](https://img.shields.io/badge/Python-3.8+-blue.svg?logo=python\u0026logoColor=white)](https://python.org/downloads)\n[![PyPI](https://img.shields.io/pypi/v/tensorboard-reducer?logo=pypi\u0026logoColor=white)](https://pypi.org/project/tensorboard-reducer?logo=pypi\u0026logoColor=white)\n[![PyPI Downloads](https://img.shields.io/pypi/dm/tensorboard-reducer)](https://pypistats.org/packages/tensorboard-reducer)\n[![DOI](https://zenodo.org/badge/354585417.svg)](https://zenodo.org/badge/latestdoi/354585417)\n\n\u003e This project can ingest both PyTorch and TensorFlow event files but was mostly tested with PyTorch. For a TF-only project, see [`tensorboard-aggregator`](https://github.com/Spenhouet/tensorboard-aggregator).\n\nCompute statistics (`mean`, `std`, `min`, `max`, `median` or any other [`numpy` operation](https://numpy.org/doc/stable/reference/routines.statistics)) of multiple TensorBoard run directories. This can be used e.g. when training model ensembles to reduce noise in loss/accuracy/error curves and establish statistical significance of performance improvements or get a better idea of epistemic uncertainty. Results can be saved to disk either as new TensorBoard runs or CSV/JSON/Excel. 
More file formats are easy to add, PRs welcome.\n\n## Example notebooks\n\n|                                                                                                                              | \u0026emsp;\u0026emsp;\u0026emsp;\u0026emsp;\u0026emsp;\u0026emsp;\u0026emsp;                                                                                                                                                                                                                                                                             |                                                                                                                                                                                                   |\n| ---------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| [**Basic Python API Demo**](https://github.com/janosh/tensorboard-reducer/blob/main/examples/basic_python_api_example.ipynb) | [![Launch Codespace]][codespace url] [![Launch Binder]](https://mybinder.org/v2/gh/janosh/tensorboard-reducer/main?labpath=examples%2Fbasic_python_api_example.ipynb) [![Open in Google Colab]](https://colab.research.google.com/github/janosh/tensorboard-reducer/blob/main/examples/basic_python_api_example.ipynb) | Demonstrates how to work with local TensorBoard event files.                                                                                                              
                        |\n| [**Functorch MLP Ensemble**](https://github.com/janosh/tensorboard-reducer/blob/main/examples/functorch_mlp_ensemble.ipynb)  | [![Launch Codespace]][codespace url] [![Launch Binder]](https://mybinder.org/v2/gh/janosh/tensorboard-reducer/main?labpath=examples%2Ffunctorch_mlp_ensemble.ipynb)   [![Open in Google Colab]](https://colab.research.google.com/github/janosh/tensorboard-reducer/blob/main/examples/functorch_mlp_ensemble.ipynb)   | Shows how to aggregate run metrics with TensorBoard Reducer\u003cbr\u003ewhen training model ensembles using `functorch`.                                                                                   |\n| [**Weights \u0026 Biases Integration**](https://github.com/janosh/tensorboard-reducer/blob/main/examples/wandb_integration.ipynb) | [![Launch Codespace]][codespace url] [![Launch Binder]](https://mybinder.org/v2/gh/janosh/tensorboard-reducer/main?labpath=examples%2Fwandb_integration.ipynb)        [![Open in Google Colab]](https://colab.research.google.com/github/janosh/tensorboard-reducer/blob/main/examples/wandb_integration.ipynb)        | Trains PyTorch CNN ensemble on MNIST, logs results to [WandB](https://wandb.ai), downloads metrics from multiple WandB runs, aggregates using `tb-reducer`, then re-uploads to WandB as new runs. 
|\n\n[Launch Binder]: https://mybinder.org/badge_logo.svg\n[Launch Codespace]: https://img.shields.io/badge/Launch-Codespace-darkblue?logo=github\n[codespace url]: https://github.com/codespaces/new?hide_repo_select=true\u0026ref=main\u0026repo=354585417\n[Open in Google Colab]: https://colab.research.google.com/assets/colab-badge.svg\n\n*The mean of 3 runs shown in pink here is less noisy and better suited for comparisons between models or different training techniques than individual runs.*\n![Mean of 3 TensorBoard logs](https://raw.githubusercontent.com/janosh/tensorboard-reducer/main/assets/3-runs-mean.png)\n\n## Installation\n\n```sh\npip install tensorboard-reducer\n```\n\nExcel support requires installing extra dependencies:\n\n```sh\npip install 'tensorboard-reducer[excel]'\n```\n\n## Usage\n\n### CLI\n\n```sh\ntb-reducer runs/of-your-model* -o output-dir -r mean,std,min,max\n```\n\nAll positional CLI arguments are interpreted as input directories and expected to contain TensorBoard event files. These can be specified individually or with wildcards using shell expansion. You can check you're getting the right input directories by running `echo runs/of-your-model*` before passing them to `tb-reducer`.\n\n**Note**: By default, TensorBoard Reducer expects event files to contain identical tags and equal number of steps for all scalars. If you trained one model for 300 epochs and another for 400 and/or recorded different sets of metrics (tags in TensorBoard lingo) for each of them, see CLI flags `--lax-steps` and `--lax-tags` to disable this safeguard. The corresponding kwargs in the Python API are `strict_tags = True` and `strict_steps = True` on `load_tb_events()`.\n\nIn addition, `tb-reducer` has the following flags:\n\n- **`-o/--outpath`** (required): File path or directory where to write output to disk. If `--outpath` is a directory, output will be saved as TensorBoard runs, one new directory created for each reduction suffixed by the `numpy` operation, e.g. 
`'out/path-mean'`, `'out/path-max'`, etc. If `--outpath` is a file path, it must have a `'.csv'`, `'.json'` or `'.xlsx'` extension (compression is supported via e.g. `.csv.gz`, `.json.bz2`), in which case a single file will be created. CSVs will have a two-level header containing one column for each combination of tag (`loss`, `accuracy`, ...) and reduce operation (`mean`, `std`, ...). Tag names will be in the top-level header, reduce ops in the second level. **Hint**: When saving data as CSV or Excel, use `pandas.read_csv(\"path/to/file.csv\", header=[0, 1], index_col=0)` and `pandas.read_excel(\"path/to/file.xlsx\", header=[0, 1], index_col=0)` to load reduction results into a multi-index dataframe.\n- **`-r/--reduce-ops`** (optional, default: `mean`): Comma-separated names of numpy reduction ops (`mean`, `std`, `min`, `max`, ...). Each reduction is written to a separate `outpath` suffixed by its op name. E.g. if `outpath='reduced-run'`, the mean reduction will be written to `'reduced-run-mean'`.\n- **`-f/--overwrite`** (optional, default: `False`): Whether to overwrite existing output directories/data files (CSV, JSON, Excel). For safety, the overwrite operation will abort with an error if the file/directory to overwrite is not a known data file and does not look like a TensorBoard run directory (i.e. does not start with `'events.out'`).\n- **`--lax-tags`** (optional, default: `False`): Allow different runs to have different sets of tags. In this mode, each tag reduction will run over as many runs as are available for a given tag, even if that's just one. Proceed with caution as not all tags will have the same statistics in downstream analysis.\n- **`--lax-steps`** (optional, default: `False`): Allow tags across different runs to have unequal numbers of steps. 
In this mode, each reduction will only use as many steps as are available in the shortest run (same behavior as `zip(short_list, long_list)` which stops when `short_list` is exhausted).\n- **`--handle-dup-steps`** (optional, default: `None`): How to handle duplicate values recorded for the same tag and step in a single run. One of `'keep-first'`, `'keep-last'`, `'mean'`. `'keep-first/last'` will keep the first/last occurrence of duplicate steps while `'mean'` computes their mean. Default behavior is to raise `ValueError` on duplicate steps.\n- **`--min-runs-per-step`** (optional, default: `None`): Minimum number of runs across which a given step must be recorded to be kept. Steps present across fewer runs are dropped. Only takes effect if `lax_steps` is true. **Warning**: Be aware that with this setting, you'll be reducing a variable number of runs, however many recorded a value for a given step as long as there are at least `--min-runs-per-step`. In other words, the statistics of a reduction will change mid-run. For example, when plotting the mean of an error curve, the sample size of that mean will drop from, say, 10 down to 4 mid-plot if 4 of your models trained for longer than the rest. Keep this in mind when using this flag.\n- **`-v/--version`** (optional): Get the current version.\n\n### Python API\n\nYou can also import `tensorboard_reducer` into a Python script or Jupyter notebook for more complex operations. 
Here's a simple example that uses all of the main functions [`load_tb_events`], [`reduce_events`], [`write_data_file`] and [`write_tb_events`] to get you started:\n\n```py\nfrom glob import glob\n\nimport tensorboard_reducer as tbr\n\ninput_event_dirs = sorted(glob(\"glob_pattern/of_tb_directories_to_reduce*\"))\n# where to write reduced TB events, each reduce operation will be in a separate subdirectory\ntb_events_output_dir = \"path/to/output_dir\"\ncsv_out_path = \"path/to/write/reduced-data-as.csv\"\n# whether to abort or overwrite when csv_out_path already exists\noverwrite = False\nreduce_ops = (\"mean\", \"min\", \"max\", \"median\", \"std\", \"var\")\n\nevents_dict = tbr.load_tb_events(input_event_dirs)\n\n# number of recorded tags. e.g. would be 3 if you recorded loss, MAE and R^2\nn_scalars = len(events_dict)\nn_steps, n_events = list(events_dict.values())[0].shape\n\nprint(\n    f\"Loaded {n_events} TensorBoard runs with {n_scalars} scalars and {n_steps} steps each\"\n)\nprint(\", \".join(events_dict))\n\nreduced_events = tbr.reduce_events(events_dict, reduce_ops)\n\nfor op in reduce_ops:\n    print(f\"Writing '{op}' reduction to '{tb_events_output_dir}-{op}'\")\n\ntbr.write_tb_events(reduced_events, tb_events_output_dir, overwrite)\n\nprint(f\"Writing results to '{csv_out_path}'\")\n\ntbr.write_data_file(reduced_events, csv_out_path, overwrite)\n\nprint(\"Reduction complete\")\n```\n\n[`reduce_events`]: https://github.com/janosh/tensorboard-reducer/blob/6d3468610d2933a23bc355250f9c76e6b6bb0151/tensorboard_reducer/main.py#L12-L14\n[`load_tb_events`]: https://github.com/janosh/tensorboard-reducer/blob/6d3468610d2933a23bc355250f9c76e6b6bb0151/tensorboard_reducer/load.py#L10-L16\n[`write_data_file`]: https://github.com/janosh/tensorboard-reducer/blob/6d3468610d2933a23bc355250f9c76e6b6bb0151/tensorboard_reducer/write.py#L111-L115\n[`write_tb_events`]: 
https://github.com/janosh/tensorboard-reducer/blob/6d3468610d2933a23bc355250f9c76e6b6bb0151/tensorboard_reducer/write.py#L45-L49\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjanosh%2Ftensorboard-reducer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjanosh%2Ftensorboard-reducer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjanosh%2Ftensorboard-reducer/lists"}