{"id":15033281,"url":"https://github.com/wassimtenachi/physo","last_synced_at":"2025-05-14T10:07:47.674Z","repository":{"id":133781904,"uuid":"594060646","full_name":"WassimTenachi/PhySO","owner":"WassimTenachi","description":"Physical Symbolic Optimization","archived":false,"fork":false,"pushed_at":"2025-03-07T05:37:16.000Z","size":187046,"stargazers_count":1865,"open_issues_count":11,"forks_count":260,"subscribers_count":39,"default_branch":"main","last_synced_at":"2025-04-03T19:58:43.738Z","etag":null,"topics":["deep-learning","equation-discovery","machine-learning","physics","python","reinforcement-learning","symbolic-regression"],"latest_commit_sha":null,"homepage":"https://physo.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/WassimTenachi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-01-27T14:06:05.000Z","updated_at":"2025-03-29T18:45:27.000Z","dependencies_parsed_at":"2023-05-24T10:00:29.137Z","dependency_job_id":"3c9097b0-7519-439d-b9e2-d2fb66121c96","html_url":"https://github.com/WassimTenachi/PhySO","commit_stats":{"total_commits":863,"total_committers":5,"mean_commits":172.6,"dds":"0.011587485515643148","last_synced_commit":"0141582169aadf175b4cb891180cf844c60c1c00"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WassimTenachi%2FPhySO","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WassimTenachi%2FPhySO/tags","releases_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/WassimTenachi%2FPhySO/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WassimTenachi%2FPhySO/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/WassimTenachi","download_url":"https://codeload.github.com/WassimTenachi/PhySO/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248336317,"owners_count":21086768,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","equation-discovery","machine-learning","physics","python","reinforcement-learning","symbolic-regression"],"created_at":"2024-09-24T20:20:35.367Z","updated_at":"2025-04-11T03:33:51.919Z","avatar_url":"https://github.com/WassimTenachi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# $\\Phi$-SO : Physical Symbolic Optimization\n![logo](https://raw.githubusercontent.com/WassimTenachi/PhySO/main/docs/assets/logo_dark.png)\nPhysical symbolic optimization ( $\\Phi$-SO ) - A symbolic optimization package built for physics.\n\n[![GitHub Repo stars](https://img.shields.io/github/stars/WassimTenachi/PhySO?style=social)](https://github.com/WassimTenachi/PhySO)\n[![Documentation Status](https://readthedocs.org/projects/physo/badge/?version=latest)](https://physo.readthedocs.io/en/latest/?badge=latest)\n[![Coverage Status](https://coveralls.io/repos/github/WassimTenachi/PhySO/badge.svg?branch=main)](https://coveralls.io/github/WassimTenachi/PhySO?branch=main)\n[![Twitter 
Follow](https://img.shields.io/twitter/follow/WassimTenachi?style=social)](https://twitter.com/WassimTenachi)\n[![Paper](https://img.shields.io/badge/arXiv-2303.03192-b31b1b)](https://arxiv.org/abs/2303.03192)\n[![Paper](https://img.shields.io/badge/arXiv-2312.01816-b31b1b)](https://arxiv.org/abs/2312.01816)\n\nSource code: [WassimTenachi/PhySO](https://github.com/WassimTenachi/PhySO)\\\nDocumentation: [physo.readthedocs.io](https://physo.readthedocs.io/en/latest/)\n\n## Highlights\n\n$\\Phi$-SO's symbolic regression module uses deep reinforcement learning to infer analytical physical laws that fit data points, searching in the space of functional forms.  \n\n`physo` is able to leverage:\n\n* Physical units constraints, reducing the search space with dimensional analysis ([[Tenachi et al 2023]](https://arxiv.org/abs/2303.03192))\n\n* Class constraints, searching for a single analytical functional form that accurately fits multiple datasets - each governed by its own (possibly) unique set of fitting parameters ([[Tenachi et al 2024]](https://arxiv.org/abs/2312.01816))\n\n$\\Phi$-SO recovering the equation for a damped harmonic oscillator:\n\nhttps://github.com/WassimTenachi/PhySO/assets/63928316/655b0eea-70ba-4975-8a80-00553a6e2786\n\nPerformance on the standard Feynman benchmark from [SRBench](https://github.com/cavalab/srbench/tree/master), comprising 120 expressions from the Feynman Lectures on Physics, compared against popular SR packages.\n\n$\\Phi$-SO achieves state-of-the-art performance in the presence of noise (exceeding 0.1%) and shows robust performance even in the presence of substantial (10%) noise:\n\n![feynman_results](https://github.com/WassimTenachi/PhySO/assets/63928316/bbb051a2-2737-40ca-bfbf-ed185c48aa71)\n\n# Installation\n\nThe package has been tested on:\n- Linux\n- OSX (ARM \u0026 Intel)\n- Windows\n\n## Install procedure\n\n### Virtual environment\n\nTo install the package it is recommended to first create a conda virtual environment:\n```\nconda 
create -n PhySO python=3.8\n```\nAnd activate it:\n```\nconda activate PhySO\n```\n### Downloading\n\n`physo` can be downloaded using git:\n```\ngit clone https://github.com/WassimTenachi/PhySO\n```\n\nOr by directly downloading a zip of the repository: [here](https://github.com/WassimTenachi/PhySO/zipball/master)\n\n### Dependencies\nFrom the repository root:\n\nInstalling dependencies:\n```\nconda install --file requirements.txt\n```\n\nIn order to simplify the installation process, since its first version, `physo` has been updated to have minimal, very standard dependencies.\n\n---\n\n**NOTE** : `physo` supports CUDA, but since the bottleneck of the code is free constant optimization, using CUDA (even on a very high-end GPU) does not improve performance over a CPU and can actually hinder it.\n\n---\n\n### Installing PhySO\n\nInstalling `physo` to the environment (from the repository root):\n```\npython -m pip install -e .\n```\n\n### Testing install\n\nImport test:\n```\npython3\n\u003e\u003e\u003e import physo\n```\nThis should result in `physo` being successfully imported.\n\nUnit tests:\n\nRunning all unit tests except parallel mode ones (from the repository root):\n```\npython -m unittest discover -p \"*UnitTest.py\"\n```\nThis should result in all tests passing successfully.\n\nRunning all unit tests (from the repository root):\n```\npython -m unittest discover -p \"*Test.py\"\n```\nThis should take 5-15 min depending on your system (if you have a lot of CPU cores, it will take longer to make the efficiency curves).\n\n## Uninstalling\nUninstalling the package:\n```\nconda deactivate\nconda env remove -n PhySO\n```\n## Getting started (SR)\n\nIn this tutorial, we show how to use `physo` to perform Symbolic Regression (SR).\nThe reference notebook for this tutorial can be found here: [sr_quick_start.ipynb](https://github.com/WassimTenachi/PhySO/blob/main/demos/sr_quick_start.ipynb).\n\n### Setup\n\nImporting the 
necessary libraries:\n```\n# External packages\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport torch\n```\n\nImporting `physo`:\n```\n# Internal code import\nimport physo\nimport physo.learn.monitoring as monitoring\n```\n\nIt is recommended to fix the seed for reproducibility:\n```\n# Seed\nseed = 0\nnp.random.seed(seed)\ntorch.manual_seed(seed)\n```\n\n### Making synthetic datasets\n\nMaking a toy synthetic dataset:\n```\n# Making toy synthetic data\nz = np.random.uniform(-10, 10, 50)\nv = np.random.uniform(-10, 10, 50)\nX = np.stack((z, v), axis=0)\ny = 1.234*9.807*z + 1.234*v**2\n```\nIt should be noted that the free constants search starts around 1. by default. Therefore, when using default hyperparameters, normalizing the data to an order of magnitude of 1 is strongly recommended.\n\n---\n\n__DA side notes__:  \n$\\Phi$-SO can exploit DA (dimensional analysis) to make SR more efficient.  \nOne can consider the physical units of $X=(z,v)$: $z$ being a length of dimension $L^{1}, T^{0}, M^{0}$, $v$ a velocity of dimension $L^{1}, T^{-1}, M^{0}$, and $y=E$ an energy of dimension $L^{2}, T^{-2}, M^{1}$.\nIf you are not working on a physics problem and all your variables/constants are dimensionless, do not specify any of the `xx_units` arguments (or specify them as `[0,0]` for all variables/constants) and `physo` will perform a dimensionless symbolic regression task.  \n\n---\n\nDatasets plot:\n```\nn_dim = X.shape[0]\nfig, ax = plt.subplots(n_dim, 1, figsize=(10,5))\nfor i in range(n_dim):\n    curr_ax = ax if n_dim==1 else ax[i]\n    curr_ax.plot(X[i], y, 'k.')\n    curr_ax.set_xlabel(\"X[%i]\"%(i))\n    curr_ax.set_ylabel(\"y\")\nplt.show()\n```\n\n### SR configuration\n\nIt should be noted that the SR capabilities of `physo` are heavily dependent on hyperparameters; it is therefore recommended to tune hyperparameters to your own specific problem for doing science.  
\nSummary of currently available hyperparameter preset configurations:\n\n|  Config    |            Recommended use cases                          |    Speed    |   Effectiveness   |                           Notes                                |\n|:----------:|:---------------------------------------------------------:|:-----------:|:-----------------:|:--------------------------------------------------------------:|\n| `config0`  | Demos                                                     |     ★★★     |          ★        | Light and fast config.                                         |\n| `config1`  | SR with DA $^*$ ;  Class SR with DA $^*$                    |       ★     |        ★★★        | Config used for Feynman Benchmark and MW streams Benchmark.    |\n| `config2`  | SR ; Class SR                                             |      ★★     |         ★★        | Config used for Class Benchmark.                               |\n\n$^*$ DA = Dimensional Analysis\n\nUsers are encouraged to edit configurations (they can be found in: [physo/config/](https://github.com/WassimTenachi/PhySO/tree/main/physo/config)).  \nBy default, `config0` is used; however, it is recommended to follow the recommendations above for doing science.\n\n---\n__DA side notes__:   \n1. During the first tens of iterations, the neural network is typically still learning the rules of dimensional analysis, so most candidates are discarded and not learned on. This effectively results in a much smaller batch size (typically 10x smaller), which also makes the evaluation process much less computationally expensive. It is therefore recommended to compensate for this behavior by using a configuration with a higher batch size, which helps provide the neural network with sufficient learning information.  
\n---\n\nLogging and visualisation setup:\n```\nsave_path_training_curves = 'demo_curves.png'\nsave_path_log             = 'demo.log'\n\nrun_logger     = lambda : monitoring.RunLogger(save_path = save_path_log,\n                                                do_save = True)\n\nrun_visualiser = lambda : monitoring.RunVisualiser (epoch_refresh_rate = 1,\n                                           save_path = save_path_training_curves,\n                                           do_show   = False,\n                                           do_prints = True,\n                                           do_save   = True, )\n```\n\n### Running SR\n\nGiven variables data $(x_0,..., x_n)$ (here $(z, v)$ ), the root variable $y$ (here $E$) as well as free and fixed constants, you can run an SR task to recover $f$ via the following command.\n\n---\n\n__DA side notes__:    \nHere we are allowing the use of a fixed constant $1$ of dimension $L^{0}, T^{0}, M^{0}$ (i.e. dimensionless) and free constants $m$ of dimension $L^{0}, T^{0}, M^{1}$ and $g$ of dimension $L^{1}, T^{-2}, M^{0}$.  \nIt should be noted that here the units vectors are of size 3 (e.g. `[1, 0, 0]`) as in this example the variables have units dependent on length, time and mass only.\nHowever, units vectors can be of any size $\\leq 7$ as long as the size is consistent across X, y and constants, allowing the user to express any units (dependent on length, time, mass, temperature, electric current, luminous intensity, or amount of substance). \nIn addition, dimensional analysis can be performed regardless of the order in which units are given, allowing the user to use any convention ([length, mass, time] or [mass, time, length] etc.) as long as it is consistent across X, y and constants.  
\n\n---\n\n```\n# Running SR task\nexpression, logs = physo.SR(X, y,\n                            # Giving names of variables (for display purposes)\n                            X_names = [ \"z\"       , \"v\"        ],\n                            # Associated physical units (ignore or pass zeroes if irrelevant)\n                            X_units = [ [1, 0, 0] , [1, -1, 0] ],\n                            # Giving name of root variable (for display purposes)\n                            y_name  = \"E\",\n                            y_units = [2, -2, 1],\n                            # Fixed constants\n                            fixed_consts       = [ 1.      ],\n                            fixed_consts_units = [ [0,0,0] ],\n                            # Free constants names (for display purposes)\n                            free_consts_names = [ \"m\"       , \"g\"        ],\n                            free_consts_units = [ [0, 0, 1] , [1, -2, 0] ],\n                            # Symbolic operations that can be used to make f\n                            op_names = [\"mul\", \"add\", \"sub\", \"div\", \"inv\", \"n2\", \"sqrt\", \"neg\", \"exp\", \"log\", \"sin\", \"cos\"],\n                            get_run_logger     = run_logger,\n                            get_run_visualiser = run_visualiser,\n                            # Run config\n                            run_config = physo.config.config0.config0,\n                            # Parallel mode (only available when running from python scripts, not notebooks)\n                            parallel_mode = False,\n                            # Number of iterations\n                            epochs = 20\n)\n```\n\n### Inspecting the best expression found\n\n__Getting best expression:__\n\nThe best expression found (in accuracy) is returned in the `expression` variable:\n```\nbest_expr = expression\nprint(best_expr.get_infix_pretty())\n```\n```\n\u003e\u003e\u003e \n                     2           \n    
-g⋅m⋅z + -v⋅v⋅sin (1.0)⋅1.0⋅m\n```\n\nIt can also be loaded later on from log files:\n```\nimport physo\nfrom physo.benchmark.utils import symbolic_utils as su\nimport sympy\n\n# Loading pareto front expressions\npareto_expressions = physo.read_pareto_pkl(\"demo_curves_pareto.pkl\")\n# Most accurate expression is the last in the Pareto front:\nbest_expr = pareto_expressions[-1]\nprint(best_expr.get_infix_pretty())\n```\n\n__Display:__\n\nThe expression can be converted into...  \nA sympy expression:\n\n```\nbest_expr.get_infix_sympy()\n```\n```\n\u003e\u003e\u003e -g*m*z - v*v*sin(1.0)**2*1.0*m\n```\n\nA sympy expression (with evaluated free constants values):\n\n```\nbest_expr.get_infix_sympy(evaluate_consts=True)[0]\n```\n\n```\n\u003e\u003e\u003e 1.74275713004454*v**2*sin(1.0)**2 + 12.1018380702846*z\n```\n\nA latex string:\n\n```\nbest_expr.get_infix_latex()\n```\n\n```\n\u003e\u003e\u003e '\\\\frac{m \\\\left(- 1000000000000000 g z - 708073418273571 v^{2}\\\\right)}{1000000000000000}'\n```\n\nA latex string (with evaluated free constants values):\n```\nsympy.latex(best_expr.get_infix_sympy(evaluate_consts=True))\n```\n\n```\n\u003e\u003e\u003e '\\\\mathtt{\\\\text{[1.74275713004454*v**2*sin(1.0)**2 + 12.1018380702846*z]}}'\n```\n\n__Getting free constant values:__\n\n```\nbest_expr.free_consts\n```\n```\n\u003e\u003e\u003e FreeConstantsTable\n     -\u003e Class consts (['g' 'm']) : (1, 2)\n     -\u003e Spe consts   ([]) : (1, 0, 1)\n```\n\n```\nbest_expr.free_consts.class_values\n```\n```\n\u003e\u003e\u003e tensor([[ 6.9441, -1.7428]], dtype=torch.float64)\n```\n\n### Checking exact symbolic recovery\n\n```\n# To sympy\nbest_expr = best_expr.get_infix_sympy(evaluate_consts=True)\n\nbest_expr = best_expr[0]\n\n# Printing best expression simplified and with rounded constants\nprint(\"best_expr : \", su.clean_sympy_expr(best_expr, round_decimal = 4))\n\n# Target expression was:\ntarget_expr = sympy.parse_expr(\"1.234*9.807*z + 1.234*v**2\")\nprint(\"target_expr 
: \", su.clean_sympy_expr(target_expr, round_decimal = 4))\n\n# Check equivalence\nprint(\"\\nChecking equivalence:\")\nis_equivalent, log = su.compare_expression(\n                        trial_expr  = best_expr,\n                        target_expr = target_expr,\n                        handle_trigo            = True,\n                        prevent_zero_frac       = True,\n                        prevent_inf_equivalence = True,\n                        verbose                 = True,\n)\nprint(\"Is equivalent:\", is_equivalent)\n```\n\n```\n\u003e\u003e\u003e best_expr :  1.234*v**2 + 12.1018*z\n    target_expr :  1.234*v**2 + 12.1018*z\n    \n    Checking equivalence:\n      -\u003e Assessing if 1.234*v**2 + 12.101838*z (target) is equivalent to 1.74275713004454*v**2*sin(1.0)**2 + 12.1018380702846*z (trial)\n       -\u003e Simplified expression : 1.23*v**2 + 12.1*z\n       -\u003e Symbolic error        : 0\n       -\u003e Symbolic fraction     : 1\n       -\u003e Trigo symbolic error        : 0\n       -\u003e Trigo symbolic fraction     : 1\n       -\u003e Equivalent : True\n    Is equivalent: True\n```\n\n# Documentation\n\nFurther documentation can be found at [physo.readthedocs.io](https://physo.readthedocs.io/en/latest/).\n\nQuick start guide for __Symbolic Regression__ : [HERE](https://physo.readthedocs.io/en/latest/r_sr.html#getting-started-sr)  \n\nQuick start guide for __Class Symbolic Regression__ : [HERE](https://physo.readthedocs.io/en/latest/r_class_sr.html#getting-started-class-sr)  \n\n# Citing this work\n \nSymbolic Regression with reinforcement learning \u0026 dimensional analysis\n\n```\n@ARTICLE{PhySO_RL_DA,\n       author = {{Tenachi}, Wassim and {Ibata}, Rodrigo and {Diakogiannis}, Foivos I.},\n        title = \"{Deep Symbolic Regression for Physics Guided by Units Constraints: Toward the Automated Discovery of Physical Laws}\",\n      journal = {ApJ},\n         year = 2023,\n        month = dec,\n       volume = {959},\n       number = 
{2},\n          eid = {99},\n        pages = {99},\n          doi = {10.3847/1538-4357/ad014c},\narchivePrefix = {arXiv},\n       eprint = {2303.03192},\n primaryClass = {astro-ph.IM},\n       adsurl = {https://ui.adsabs.harvard.edu/abs/2023ApJ...959...99T},\n      adsnote = {Provided by the SAO/NASA Astrophysics Data System}\n}\n```\n\nClass Symbolic Regression\n```\n@ARTICLE{PhySO_ClassSR,\n       author = {{Tenachi}, Wassim and {Ibata}, Rodrigo and {Fran{\\c{c}}ois}, Thibaut L. and {Diakogiannis}, Foivos I.},\n        title = \"{Class Symbolic Regression: Gotta Fit 'Em All}\",\n      journal = {arXiv e-prints},\n     keywords = {Computer Science - Machine Learning, Astrophysics - Astrophysics of Galaxies, Astrophysics - Instrumentation and Methods for Astrophysics, Physics - Computational Physics},\n         year = 2023,\n        month = dec,\n          eid = {arXiv:2312.01816},\n        pages = {arXiv:2312.01816},\n          doi = {10.48550/arXiv.2312.01816},\narchivePrefix = {arXiv},\n       eprint = {2312.01816},\n primaryClass = {cs.LG},\n       adsurl = {https://ui.adsabs.harvard.edu/abs/2023arXiv231201816T},\n      adsnote = {Provided by the SAO/NASA Astrophysics Data System}\n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwassimtenachi%2Fphyso","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwassimtenachi%2Fphyso","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwassimtenachi%2Fphyso/lists"}