{"id":40266141,"url":"https://github.com/hftsoi/symbolfit","last_synced_at":"2026-02-08T19:15:33.897Z","repository":{"id":259551259,"uuid":"865759513","full_name":"hftsoi/symbolfit","owner":"hftsoi","description":"Automatic parametric modeling with symbolic regression","archived":false,"fork":false,"pushed_at":"2025-07-04T14:40:40.000Z","size":91116,"stargazers_count":61,"open_issues_count":36,"forks_count":4,"subscribers_count":3,"default_branch":"main","last_synced_at":"2026-01-20T08:15:59.426Z","etag":null,"topics":["automation","curve-fitting","high-energy-physics","interpolation","lmfit","machine-learning","nonlinear-optimization","parametric-modelling","pysr","python","symbolic-regression","uncertainty-estimations"],"latest_commit_sha":null,"homepage":"https://symbolfit.readthedocs.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hftsoi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-10-01T04:50:31.000Z","updated_at":"2025-12-21T06:27:05.000Z","dependencies_parsed_at":"2024-11-04T18:19:15.918Z","dependency_job_id":"0cfc6431-dfaa-437d-8cbc-ec545bc0d731","html_url":"https://github.com/hftsoi/symbolfit","commit_stats":null,"previous_names":["hftsoi/symbolfit"],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/hftsoi/symbolfit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hftsoi%2Fsymbolfit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/repositories/hftsoi%2Fsymbolfit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hftsoi%2Fsymbolfit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hftsoi%2Fsymbolfit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hftsoi","download_url":"https://codeload.github.com/hftsoi/symbolfit/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hftsoi%2Fsymbolfit/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29240390,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-08T18:06:38.086Z","status":"ssl_error","status_checked_at":"2026-02-08T18:06:09.124Z","response_time":57,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automation","curve-fitting","high-energy-physics","interpolation","lmfit","machine-learning","nonlinear-optimization","parametric-modelling","pysr","python","symbolic-regression","uncertainty-estimations"],"created_at":"2026-01-20T02:46:36.521Z","updated_at":"2026-02-08T19:15:33.861Z","avatar_url":"https://github.com/hftsoi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/hftsoi/symbolfit/main/docs/logo.png\" 
width=\"400\"/\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/hftsoi/symbolfit/main/docs/thumbnail.png\" width=\"1200\"/\u003e\n\u003c/p\u003e\n\n\u003cdiv align=\"center\"\u003e\n  \n![Animation](docs/demo/animation.gif)\n\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n\nDocs | Paper | Slides | Colab |\n|:-:|:-:|:-:|:-:|\n[![Read the Docs](https://img.shields.io/readthedocs/symbolfit?color=blue\u0026style=flat-square)](https://symbolfit.readthedocs.io) | [![arXiv](https://img.shields.io/badge/arXiv-2411.09851-b31b1b.svg?style=flat-square)](https://arxiv.org/abs/2411.09851) \u003cbr /\u003e [![doi](https://img.shields.io/badge/journal-doi-informational?style=flat-square\u0026color=teal)](https://doi.org/10.1007/s41781-025-00140-9) | [![slides](https://img.shields.io/badge/talk-slides-informational?style=flat-square\u0026color=purple)](https://github.com/hftsoi/symbolfit/blob/main/symbolfit.pdf) | [![Open in Colab](https://img.shields.io/badge/Colab-notebook-informational?style=flat-square\u0026color=gold)](https://colab.research.google.com/github/hftsoi/symbolfit/blob/main/colab_demo/symbolfit_colab.ipynb) |\n\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n\nGitHub | pip | conda |\n|:-:|:-:|:-:|\n[![GitHub Created At](https://img.shields.io/github/created-at/hftsoi/symbolfit?color=black\u0026style=flat-square)](https://github.com/hftsoi/symbolfit) \u003cbr /\u003e [![GitHub Release](https://img.shields.io/github/v/release/hftsoi/symbolfit?color=black\u0026style=flat-square)](https://github.com/hftsoi/symbolfit/releases) \u003cbr /\u003e [![GitHub Release Date](https://img.shields.io/github/release-date/hftsoi/symbolfit?color=black\u0026style=flat-square)](https://github.com/hftsoi/symbolfit/releases) | [![PyPI - Version](https://img.shields.io/pypi/v/symbolfit?color=orange\u0026style=flat-square)](https://pypi.org/project/symbolfit) \u003cbr /\u003e [![Pepy Total 
Downloads](https://img.shields.io/pepy/dt/symbolfit?color=orange\u0026style=flat-square)](https://www.pepy.tech/projects/symbolfit) | [![Conda Version](https://img.shields.io/conda/vn/conda-forge/symbolfit.svg?color=green\u0026style=flat-square)](https://anaconda.org/conda-forge/symbolfit) \u003cbr /\u003e [![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/symbolfit.svg?color=green\u0026style=flat-square)](https://anaconda.org/conda-forge/symbolfit) |\n\n\u003c/div\u003e\n\nAn API to automate parametric modeling with symbolic regression, originally developed for data analysis in the experimental high-energy physics community, but also applicable beyond.\n\nSymbolFit takes binned data with measurement/systematic uncertainties (optional) as input, utilizes [PySR](https://github.com/MilesCranmer/PySR) to perform a machine-search for batches of functional forms that model the data, parameterizes these functions, and utilizes [LMFIT](https://github.com/lmfit/lmfit-py) to re-optimize the functions and provide uncertainty estimation, all in one go.\nIt is designed to maximize automation with minimal human input. Each run produces a batch of functions with uncertainty estimation, which are evaluated, saved, and plotted automatically into readable output files, ready for downstream tasks.\n\n- [Installation](#installation)\n- [Getting Started](#getting-started)\n- [Documentation](#documentation)\n- [Citation](#citation)\n\n\u003e **Note:** This API is actively being updated to accommodate more use cases, so any feedback and contributions are very much welcomed and appreciated! 
If you encounter any problems while using it, please don’t hesitate to:\n\u003e - Report bugs or suggest new features at [![Issues](https://img.shields.io/badge/issues-github-informational?style=flat-square)](https://github.com/hftsoi/symbolfit/issues)\n\u003e - Ask for specific help and recommendations for your dataset at [![Discussions](https://img.shields.io/badge/discussions-github-informational?style=flat-square)](https://github.com/hftsoi/symbolfit/discussions)\n\u003e \n\u003e If you prefer not to share your data publicly, please feel free to drop me a message or [![Email](https://img.shields.io/badge/email-ho.fung.tsoi@cern.ch-informational?style=flat-square\u0026color=blue)](mailto:ho.fung.tsoi@cern.ch). We are happy to assist in getting it to work on your data!\n\n## Installation\n**Prerequisite (Julia)**\n\nInstall Julia (backend for PySR):\n```\ncurl -fsSL https://install.julialang.org | sh\n```\nThen check that it is installed properly:\n```\njulia --version\n```\nIf `julia` is not found, it is most likely not yet included in PATH. 
To include, do:\n```\nexport PATH=\"$PATH:/path/to/\u003cJulia directory\u003e/bin\"\n```\nCheck out [here](https://julialang.org/downloads/platform) for platform-specific instructions.\n\nAfterward, it is recommended to start from an empty virtual environment for installing and running SymbolFit.\n\n**Installation via PyPI**\n\nWith python\u003e=3.10 and pip:\n```\npip install symbolfit\n```\n\n\u003cdetails\u003e\n  \u003csummary\u003eInstallation via conda\u003c/summary\u003e\n  \n  (to be updated for \u003e= v0.2.0, please use pip for now)\n  ```\n  conda create --name symbolfit_env python=3.10\n  conda activate symbolfit_env\n  conda install -c conda-forge symbolfit\n  ```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eEditable installation for developers\u003c/summary\u003e\n\n  ```\n  git clone https://github.com/hftsoi/symbolfit.git\n  cd symbolfit\n  pip install -e .\n  ```\n\u003c/details\u003e\n\n## Getting Started\nTo run an example fit, get the example datasets by cloning this repo:\n```\ngit clone https://github.com/hftsoi/symbolfit.git\ncd symbolfit\n```\nThen within a python session (or simply do ```python fit_example.py```):\n```\nfrom symbolfit.symbolfit import *\n\ndataset = importlib.import_module('examples.datasets.toy_dataset_1.dataset')\npysr_config = importlib.import_module('examples.pysr_configs.pysr_config_gauss').pysr_config\n\nmodel = SymbolFit(\n    \tx = dataset.x,\n    \ty = dataset.y,\n    \ty_up = dataset.y_up,\n    \ty_down = dataset.y_down,\n    \tpysr_config = pysr_config,\n    \tmax_complexity = 60,\n    \tinput_rescale = True,\n    \tscale_y_by = 'mean',\n    \tmax_stderr = 20,\n    \tfit_y_unc = True,\n    \trandom_seed = None,\n    \tloss_weights = None\n)\n\nmodel.fit()\n```\nAfter the fit, save results to csv files:\n```\nmodel.save_to_csv(output_dir = 'output_dir/')\n```\nand plot results to pdf files:\n```\nmodel.plot_to_pdf(\n    \toutput_dir = 'output_dir/',\n    \tbin_widths_1d = 
dataset.bin_widths_1d,\n    \tplot_logy = False,\n    \tplot_logx = False,\n        sampling_95quantile = False,\n        #bin_edges_2d = dataset.bin_edges_2d,\n        #plot_logx0 = False,\n        #plot_logx1 = False,\n        #cbar_min = None,\n        #cbar_max = None,\n        #cmap = None,\n        #contour = None,\n        # ^ additional options for 2D plotting\n)\n```\nCandidate functions with full substitutions can be printed promptly:\n```\nmodel.print_candidate(candidate_number = 20)\n```\nWhen preparing for your own data, a graphical illustration of the input data format can be found [here](https://symbolfit.readthedocs.io/demo/input.html).\n\nEach fit will produce a batch of candidate functions and will automatically save all results to six output files:\n1) ```candidates.csv```: saves all candidate functions and evaluations in a csv table.\n2) ```candidates_compact.csv```: saves a reduced version for essential information without intermediate results.\n3) ```candidates.pdf```: plots all candidate functions (1D/2D only for now) with associated uncertainties one by one for fit quality evaluation.\n4) ```candidates_sampling.pdf```: plots all candidate functions (1D only for now) with total uncertainty coverage generated by sampling parameters.\n5) ```candidates_gof.pdf```: plots the goodness-of-fit scores.\n6) ```candidates_correlation.pdf```: plots the correlation matrices for the parameters of the candidate functions.\n\nOutput files from an example fit can be found and downloaded [here](https://github.com/hftsoi/symbolfit/tree/main/docs/demo/output_dir/toy_dataset_1) for illustration.\n\n\u003e **Note:** The function space is usually huge, even when constrained by the pysr config. This means that if you are not satisfied with the results from a fit, you can simply rerun it with the exact same config and obtain a completely different set of candidate functions\u0026mdash;the only difference being the random seed that initiates the seeding functions. 
Therefore, you can rerun the fit as many times as you want until you are satisfied with the results. If you use ```model = SymbolFit(..., random_seed = None, ...)```, nothing needs to be changed\u0026mdash;just rerun the fit. If you set a specific ```random_seed```, change its value before rerunning. However, if you are still not satisfied with the results after many trials, it might indicate an issue with the config. Then you might want to try a different config, tune it, and start new runs.\n\nFor detailed instructions and more demonstrations, please check out the Colab notebook and the documentation.\n\n## Documentation\nThe documentation can be found [here](https://symbolfit.readthedocs.io) for more information and demonstrations.\n\n## Citation\nIf you find this useful in your research, please consider citing SymbolFit:\n```\n@article{Tsoi:2024pbn,\n    author = \"Tsoi, Ho Fung and Rankin, Dylan and Caillol, Cecile and Cranmer, Miles and Dasu, Sridhara and Duarte, Javier and Harris, Philip and Lipeles, Elliot and Loncar, Vladimir\",\n    title = \"{SymbolFit: Automatic Parametric Modeling with Symbolic Regression}\",\n    eprint = \"2411.09851\",\n    archivePrefix = \"arXiv\",\n    primaryClass = \"hep-ex\",\n    doi = \"10.1007/s41781-025-00140-9\",\n    journal = \"Comput. Softw. 
Big Sci.\",\n    volume = \"9\",\n    pages = \"12\",\n    year = \"2025\"\n}\n```\nand PySR:\n```\n@misc{cranmerInterpretableMachineLearning2023,\n      title = {Interpretable {Machine} {Learning} for {Science} with {PySR} and {SymbolicRegression}.jl},\n      url = {http://arxiv.org/abs/2305.01582},\n      doi = {10.48550/arXiv.2305.01582},\n      urldate = {2023-07-17},\n      publisher = {arXiv},\n      author = {Cranmer, Miles},\n      month = may,\n      year = {2023},\n      note = {arXiv:2305.01582 [astro-ph, physics:physics]},\n      keywords = {Astrophysics - Instrumentation and Methods for Astrophysics, Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing, Computer Science - Symbolic Computation, Physics - Data Analysis, Statistics and Probability},\n}\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhftsoi%2Fsymbolfit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhftsoi%2Fsymbolfit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhftsoi%2Fsymbolfit/lists"}