{"id":15358080,"url":"https://github.com/antonior92/narx-double-descent","last_synced_at":"2026-03-01T10:33:22.281Z","repository":{"id":129505995,"uuid":"318629318","full_name":"antonior92/narx-double-descent","owner":"antonior92","description":"Explore the double-descent phenomena in the context of system identification. Companion code to the paper (https://arxiv.org/abs/2012.06341):","archived":false,"fork":false,"pushed_at":"2020-12-14T22:25:09.000Z","size":1653,"stargazers_count":8,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-15T07:23:20.671Z","etag":null,"topics":["double-descent","machine-learning","narx","system-identification"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/antonior92.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-12-04T20:35:40.000Z","updated_at":"2025-02-15T05:18:24.000Z","dependencies_parsed_at":"2023-03-13T11:21:53.317Z","dependency_job_id":null,"html_url":"https://github.com/antonior92/narx-double-descent","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/antonior92/narx-double-descent","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonior92%2Fnarx-double-descent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonior92%2Fnarx-double-descent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonior92%2Fnarx
-double-descent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonior92%2Fnarx-double-descent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/antonior92","download_url":"https://codeload.github.com/antonior92/narx-double-descent/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonior92%2Fnarx-double-descent/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29966836,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-01T09:33:09.965Z","status":"ssl_error","status_checked_at":"2026-03-01T09:25:48.915Z","response_time":124,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["double-descent","machine-learning","narx","system-identification"],"created_at":"2024-10-01T12:39:54.095Z","updated_at":"2026-03-01T10:33:22.269Z","avatar_url":"https://github.com/antonior92.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Double-descent in system identification\n\n\nNext, we explore the double-descent phenomena in the context of system identification. 
This is the \ncompanion code to the paper ([https://arxiv.org/abs/2012.06341](https://arxiv.org/abs/2012.06341)):\n```\nBeyond Occam’s Razor in System Identification: Double-Descent Phenomena when Modeling Dynamics.\nAntônio H. Ribeiro, Johannes N. Hendriks, Adrian G. Wills, Thomas B. Schön, 2020.\narXiv: 2012.06341\n```\n\nBibTeX-formatted citation:\n```\n@misc{ribeiro2020occams,\n      title={Beyond Occam's Razor in System Identification: Double-Descent when Modeling Dynamics}, \n      author={Antônio H. Ribeiro and Johannes N. Hendriks and Adrian G. Wills and Thomas B. Schön},\n      year={2020},\n      eprint={2012.06341},\n      archivePrefix={arXiv},\n      primaryClass={cs.LG}\n}\n```\n\n## Requirements:\n\nWe use standard packages: NumPy, SciPy, Pandas, Scikit-learn, PyTorch and Matplotlib.\nThe file `requirements.txt` gives the versions against which the repository was tested,\nbut we believe the code might still be compatible with older versions of these packages.\n\n## Folder structure:\n\n- `models.py`: Contains the implementation of the available models to be\ntested. Options are:\n\n| Models  | Description |\n|:-------:|-----------:| \n| RBFSampler | Approximates feature map of an RBF kernel by Monte Carlo approximation of its Fourier transform (See [here](https://scikit-learn.org/stable/modules/generated/sklearn.kernel_approximation.RBFSampler.html)).|\n| RBFNet | Radial basis function network.  |\n| RandomForest | A random forest regressor (See [here](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html)).|\n| FullyConnectedNet | An n-layer fully connected neural network implemented in PyTorch. |\n| LinearModel | Linear model fitted by least squares|\n\n\n\n- `datasets.py`: Contains implementations of available datasets to be used in the experiments. 
Also contains\nmethods and abstract classes to help in the implementation of new artificially generated datasets.\n\n| Datasets  | Description |\n|:-------:|-----------:| \n| ChenDSet| Nonlinear dataset (generated artificially). See Ref. [1]|\n| Order2LinearDSet | Linear second-order dataset (generated artificially).|\n| CoupledElectricalDrives | Nonlinear dataset collected from a physical system: see [here](https://sites.google.com/view/nonlinear-benchmark/benchmarks/coupled-electric-drives?authuser=0).|\n\nThe module can also be called from the command line as a script:\n```bash\npython datasets.py --dset [DSET]\n```\nwhich will plot the input and output from the dataset, where `[DSET]` is one of the options in the table above.\n\n- `plot_predictions.py`: train/load model, evaluate on dataset, compute metric\nand plot prediction on the specified split. Run it as:\n```\npython plot_predictions.py\n```\nwith the option ``--dset [DSET]`` specifying the dataset (check the table above for available options); \n``--split [SPLIT]`` where `SPLIT` is in `{train, test}` specifies which split is being evaluated.\n``--tp [TP]`` where `TP` is in `{pred, sim}` specifies whether one-step-ahead prediction or free-run simulation\nis being used. Use `--help` to get a full list of options.\n\n- `narx_double_descent.py`: train model and evaluate on dataset for varying number of features. Run it as:\n```\npython narx_double_descent.py\n```\nwith the option ``--dset [DSET]`` specifying the dataset (check the table above for available options); \n``--nonlinear_model [MDL]`` specifying the model (all the options in the table above are available, except for the linear model); \n``--output [FILE]``  specifies where to save the results. By default, results are saved in `performance.csv`.\nUse `--help` to get a full list of options.\n\n- `plot_double_descent.py`: generate plot of performance *vs* model size curve, using the output \nof `narx_double_descent.py`. 
Run it as:\n```\npython plot_double_descent.py [FILE]\n```\nwhere `FILE` is the file generated by `narx_double_descent.py`.\n\n- `generate_repository_figures.sh`: generate all figures that are displayed in this README.md file and place them in `img/`. \n\n- `generate_paper_figures.sh`: generate all the figures used in the paper `Beyond Occam’s Razor in System Identification:\n Double-Descent Phenomena when Modeling Dynamics`. There is some overlap with the figures generated by the previous command.\n However, this command yields the figures with the exact style and size used in the paper.\n \n## Datasets\nIn the paper we focus on two datasets:\n### ChenDSet\n   Data generated by the nonlinear system described in Chen et al. (1990). One example of input and corresponding output generated by such a system is displayed next:\n![input output plot chen data](img/chen/dset.png)\n```bash\n# The above plot can be generated by running:\npython datasets.py --dset ChenDSet\n```\n### CoupledElectricalDrives\n\nNonlinear system described in Wigren and Schoukens (2017). One example of input and corresponding output generated by such a\nsystem is displayed next:\n\n|  PRBS Sequence |  Uniform Sequence| \n|:--------------------:|:---------------:|\n|![CE8](img/ce8/dset_prbs.png) |![CE8](img/ce8/dset_unif.png) |\n\n```bash\n# The above plots can be generated by running:\n python datasets.py --dset CoupledElectricalDrives --sequence 0\n python datasets.py --dset CoupledElectricalDrives --sequence 3\n```\n\n# Experiments\n\nNext we describe some experiments where we observed the double descent phenomenon. The\ntable below lists: the model; whether the model is linear in the parameters or not; the dataset; and where the experiment is referenced in the\npaper \n`Beyond Occam’s Razor in System Identification: Double-Descent Phenomena when Modeling Dynamics`.\n\n\n| | Model |  Lin.-in-the-param.| Dataset | Overp. 
Solution|  Reference in the paper|\n|:-------:|:----------:|:----------:|:----------:|:----------:|:----------:|\n| [\\#1.](#chen-rff-minnorm) | RBFSampler | Yes| ChenDSet |  Minimum-norm |  Fig. 2  |\n| [\\#2.](#chen-rff-ridge) | RBFSampler | Yes| ChenDSet |  Ridge   |   Fig. 3  |\n| [\\#3.](#chen-rff-ensembles) | RBFSampler | Yes| ChenDSet |  Ensembles  |  Fig. 4  |\n| [\\#4.](#chen-rbfnet-ridge) | RBFNet | Yes| ChenDSet |  Ensembles  |   Fig. 5  |\n| [\\#5.](#ce8-rff-ridge) | RBFSampler | Yes | CE8 |   Ensembles  |  Fig. 1  |\n| [\\#6.](#chen-randomforest) | Random Forest | No | ChenDSet |   Ensembles  |  Fig. 6  |\n\nThe command `python narx_double_descent.py` can take more than 30 min in some of the examples below.\nFor convenience, the CSV files that would be generated as output are made available in\nthe folder `results/`. So skip to the plotting command if you want to reuse those pre-computed results,\nor reduce `-n [N]` and `-r [R]` if you want a partial result faster. \n\n\n## \\#1. Random Fourier Features using Minimum Norm Solution \u003cdiv id='chen-rff-minnorm'/\u003e\n\n\nNext we show the double descent both for one-step-ahead error (left) and for free-run simulation \nerror (center), as well as the norm of the parameters (right). The baseline is the performance of a linear model.\n\n| One-step-ahead error | Free-run simulation error | Parameter Norm | \n|:--------------------:|:-------------------------:|:--------------:|\n|![ChenDSet / RBFSampler](img/chen/rbfs_pred.png)|![ChenDSet / RBFSampler](img/chen/rbfs_sim.png) |![ChenDSet / RBFSampler](img/chen/rbfs_norm.png) |\n\n```bash\n# The above plots can be generated by running:\n# 1. 
Generating results\nDSET=\"-d ChenDSet --cutoff_freq 0.7 --hold 1 --num_train_samples 400\"\nMODEL=\"-m RBFSampler  --gamma 0.6\"\npython narx_double_descent.py $DSET $MODEL -n 100 -r 10 -u 3 -o results/chen/rbfsampler.csv  \n# 2. Plotting:\npython plot_double_descent.py results/chen/rbfsampler.csv --tp pred --ymax 1.5 --plot_style ggplot # left plot (\u003c-) \npython plot_double_descent.py results/chen/rbfsampler.csv --tp sim --ymax 4.0 --plot_style ggplot # center plot (\u003c\u003e) \npython plot_double_descent.py results/chen/rbfsampler.csv --tp norm --plot_style ggplot  # right plot (-\u003e) \n```\n\nNext, we show plots of the model's free-run simulation on the test set\nin the classical and interpolation regions. More precisely, we show the best RBFSampler in \neach region from all the runs above. This should help in getting a\nbetter sense of how the model performs at each point of the curve.\n\n| Before interpolation threshold (`# features = 149 `) | After interpolation threshold (`# features = 40000 `)|\n|:--------------------:|:-------------------------:|\n|![ChenDSet / RBFSampler](img/chen/rbfs_before.png)|![ChenDSet / RBFSampler](img/chen/rbfs_interp.png) |\n\n```bash\n# The above plots can be generated by running:\npython plot_predictions.py $DSET $MODEL --n_features 149 --random_state 7  # left plot (\u003c-)\npython plot_predictions.py $DSET $MODEL --n_features 40000 --random_state 7 # right plot (-\u003e)\n```\n\n\n## \\#2. Random Fourier Features and ridge regression \u003cdiv id='chen-rff-ridge'/\u003e\n\n![ChenDSet / RBFSampler](img/chen/rbfs_ridge.png)\n\n```bash\n# The above plots can be generated by running:\n# 1. Generating results\nDSET=\"-d ChenDSet --cutoff_freq 0.7 --hold 1 --num_train_samples 400\"\nMODEL=\"-m RBFSampler  --gamma 0.6\"\nfor RIDGE in 0.01 0.001 0.0001 0.00001 0.000001 0.0000001;\ndo\n    python narx_double_descent.py $DSET $MODEL -n 100 -r 10 -u 3 -o results/chen/rbfsampler_r\"$RIDGE\".csv --ridge $RIDGE\ndone\n# 2. 
Plotting\npython  plot_multiple_dd.py results/chen/rbfsampler.csv  results/chen/rbfsampler_r{0.01,0.001,0.0001,0.00001,0.000001}.csv  \\\n  --labels  \"min-norm\"  \"\\$\\lambda=10^{-2}\\$\" \"\\$\\lambda=10^{-3}\\$\" \"\\$\\lambda=10^{-4}\\$\" \"\\$\\lambda=10^{-5}\\$\" \"\\$\\lambda=10^{-6}\\$\" \\\n  --ymax 1.5 --plot_style ggplot\n```\n\n\n## \\#3. Random Fourier Features and ensembles  \u003cdiv id='chen-rff-ensembles'/\u003e\n \n| One-step-ahead error | Free-run simulation error | Parameter Norm | \n|:--------------------:|:-------------------------:|:--------------:|\n|![ChenDSet / RBFSampler](img/chen/rbfs_ensemble_pred.png)|![ChenDSet / RBFSampler](img/chen/rbfs_ensemble_sim.png) |![ChenDSet / RBFSampler](img/chen/rbfs_ensemble_norm.png) |\n\n```bash\n# The above plots can be generated by running:\n# 1. Generating results\nDSET=\"-d ChenDSet --cutoff_freq 0.7 --hold 1 --num_train_samples 400\"\nMODEL=\"-m RBFSampler  --gamma 0.6  --ridge 0.0000001 --n_ensembles 1000\"\npython narx_double_descent.py $DSET $MODEL -n 100 -r 10 -u 3 -o results/chen/rbfsample_ensemble.csv\n# 2. Plotting\npython plot_double_descent.py results/chen/rbfsample_ensemble.csv --tp pred --ymax 1.5 --plot_style ggplot # left plot (\u003c-) \npython plot_double_descent.py results/chen/rbfsample_ensemble.csv --tp sim --ymax 4.0 --plot_style ggplot # center plot (\u003c\u003e) \npython plot_double_descent.py results/chen/rbfsample_ensemble.csv --tp norm --plot_style ggplot  # right plot (-\u003e) \n```\n\n\n## \\#4. RBFNet \u003cdiv id='chen-rbfnet-ridge'/\u003e\n\n![ChenDSet / RBFSampler](img/chen/rbfnet_pred.png)\n\n```bash\n# The above plots can be generated by:\n# 1. Generating results\nDSET=\"-d ChenDSet --cutoff_freq 0.5 --hold 1 --num_train_samples 400\"\nMODEL=\"-m RBFNet  --gamma 0.25 --spread 5.0  --ridge 0.00000000000001 --n_ensembles 2000\"\npython narx_double_descent.py $DSET $MODEL -n 60 -r 10 -u 3 -o results/chen/rbfnet.csv\n# 2. 
Plotting:\npython plot_double_descent.py results/chen/rbfnet.csv --tp pred --ymax 1.5 --plot_style ggplot\n```\n\n## \\#5. Double-descent experiments with CE8 dataset \u003cdiv id='ce8-rff-ridge'/\u003e\n\nNext we show how different model classes can display double descent\nbehaviour on this dataset.\n\n\nNext we show the double descent both for one-step-ahead error (left) and for free-run simulation \nerror (center), as well as the norm of the parameters (right). The baseline is the performance of a linear model.\n\n| One-step-ahead error | Free-run simulation error | Parameter Norm | \n|:--------------------:|:-------------------------:|:--------------:|\n|![CE8](img/ce8/rbfs_pred.png)|![CE8](img/ce8/rbfs_sim.png) |![CE8](img/ce8/rbfs_norm.png) |\n\n```bash\n# The above plots can be generated by:\n# 1. Generating results\nDSET=\"-d CoupledElectricalDrives --dset_choice unif\"\nMODEL=\"-m RBFSampler --gamma 0.2 --ridge 0.000001 --n_ensembles 2000\"\npython narx_double_descent.py $DSET $MODEL -n 100 -r 10 -l '-2' -u 2 -o results/ce8/rbfsampler.csv  \n# 2. Plotting:\npython plot_double_descent.py results/ce8/rbfsampler.csv --tp pred --ymax 0.2 --plot_style ggplot # left plot (\u003c-) \npython plot_double_descent.py results/ce8/rbfsampler.csv --tp sim --ymax 8.0 --plot_style ggplot  # center plot (\u003c\u003e) \npython plot_double_descent.py results/ce8/rbfsampler.csv --tp norm --plot_style ggplot # right plot (-\u003e) \n```\n\n## \\#6. Nonlinear model -  Random Forest \u003cdiv id='chen-randomforest'/\u003e\n\nNext we show the double descent both for one-step-ahead error (left) and for free-run simulation \nerror (right). The baseline is the performance of a linear model.\n\n| One-step-ahead error | Free-run simulation error |\n|:--------------------:|:-------------------------:|\n|![ChenDSet / Random Forest](img/chen/rf_pred.png)|![ChenDSet / Random Forest](img/chen/rf_sim.png) |\n\n```bash\n# The above plots can be generated by:\n# 1. 
Generating results\nDSET=\"-d ChenDSet --cutoff_freq 0.7 --hold 1 --num_train_samples 3000\"\nMODEL=\"-m RandomForest\"\npython narx_double_descent.py $DSET $MODEL -n 100 -r 10 -o results/chen/randomforest.csv  \n# 2. Plotting:\npython plot_double_descent.py results/chen/randomforest.csv --tp pred --ymax 0.8 --plot_style ggplot # left plot (\u003c-)\npython plot_double_descent.py results/chen/randomforest.csv --tp sim --ymax 2.0 --plot_style ggplot # right plot (-\u003e)\n```\n\nNext, we show plots of the model's free-run simulation on the test set\nin the classical and interpolation regions. More precisely, we show the best Random Forest in \neach region from all the runs above. This should help give a\nbetter sense of how the model performs at each point of the curve.\n\n| Before interpolation threshold (`# features = 252 `) | After interpolation threshold (`# features = 20000 `)|\n|:--------------------:|:-------------------------:|\n|![ChenDSet / Random Forest](img/chen/rf_before.png)|![ChenDSet / Random Forest](img/chen/rf_interp.png) |\n\n```bash\n# The above plots can be generated by running:\npython plot_predictions.py $DSET $MODEL --n_features 600 --random_state 5  # left plot (\u003c-)\npython plot_predictions.py $DSET $MODEL --n_features 200000 --random_state 9  # right plot (-\u003e)\n```\n\n## Additional Datasets and Models\n\nWe focused on giving the commands for reproducing the paper examples.\nThere are, however, some datasets and models that were not explored in the paper and that were made available \nhere (i.e., `FullyConnectedNet`, `Order2LinearDSet`). \n\nThe fully connected neural network model is implemented using [PyTorch](https://pytorch.org/), and it allows \nthe use of the GPU when it is available (make sure to install the PyTorch build with CUDA support if you want to make use of this).\n\n# References\n\n- [1] Chen, S., Billings, S.A., and Grant, P.M. (1990). 
Non-Linear System Identification Using Neural Networks. International Journal of Control, 51(6), 1191–1214. doi:10/cg8bhx.\n- [2] Wigren, T. and Schoukens, M. (2017). Coupled electric drives data set and reference models. Technical Report. Uppsala Universitet, 2017.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantonior92%2Fnarx-double-descent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fantonior92%2Fnarx-double-descent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantonior92%2Fnarx-double-descent/lists"}