{"id":24527182,"url":"https://github.com/metexplore/dexom-python","last_synced_at":"2025-08-24T22:17:13.536Z","repository":{"id":51674100,"uuid":"511520615","full_name":"MetExplore/dexom-python","owner":"MetExplore","description":"diversity-based enumeration of optimal context-specific metabolic networks using the cobrapy library","archived":false,"fork":false,"pushed_at":"2025-03-27T09:47:53.000Z","size":2006,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-27T10:30:48.970Z","etag":null,"topics":["cobrapy","constraint-based-modeling","metabolism","omics-data-integration","optimization","python"],"latest_commit_sha":null,"homepage":"https://forgemia.inra.fr/metexplore/cbm/dexom-python","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MetExplore.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-07-07T12:33:59.000Z","updated_at":"2025-03-27T09:47:56.000Z","dependencies_parsed_at":"2024-03-12T17:49:51.075Z","dependency_job_id":"5dd4d501-6f8f-47cb-85b1-eb0495ff61ef","html_url":"https://github.com/MetExplore/dexom-python","commit_stats":null,"previous_names":[],"tags_count":26,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MetExplore%2Fdexom-python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MetExplore%2Fdexom-python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MetExplore%2Fdexom-python/release
s","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MetExplore%2Fdexom-python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MetExplore","download_url":"https://codeload.github.com/MetExplore/dexom-python/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248904970,"owners_count":21180892,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cobrapy","constraint-based-modeling","metabolism","omics-data-integration","optimization","python"],"created_at":"2025-01-22T06:17:03.371Z","updated_at":"2025-08-24T22:17:13.516Z","avatar_url":"https://github.com/MetExplore.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DEXOM in python\n\n\u003ca href = \"https://github.com/MetExplore/dexom-python/blob/master/LICENSE\"\u003e\u003cimg alt=\"GitHub license\" src=\"https://img.shields.io/github/license/maximiliansti/dexom_python\"\u003e\u003c/a\u003e\n\u003ca href=\"https://pypi.org/project/dexom-python/\"\u003e\u003cimg alt = \"PyPI Package\" src = \"https://img.shields.io/pypi/v/dexom-python\"/\u003e\u003c/a\u003e  \n\u003ca href=\"https://archive.softwareheritage.org/browse/origin/?origin_url=https://pypi.org/project/dexom-python/\"\u003e\u003cimg src=\"https://archive.softwareheritage.org/badge/origin/https://pypi.org/project/dexom-python//\"\u003e\u003c/a\u003e\n\u003ca href=\"https://forgemia.inra.fr/metexplore/cbm/dexom-python/-/commits/master\"\u003e\u003cimg alt=\"pipeline status\" 
src=\"https://forgemia.inra.fr/metexplore/cbm/dexom-python/badges/master/pipeline.svg\" /\u003e\u003c/a\u003e\n\nThis is a Python implementation of DEXOM (Diversity-based enumeration of optimal context-specific metabolic networks).  \nThe original project, which was developed in MATLAB, can be found here: https://github.com/MetExplore/dexom  \nThe imat implementation was partially inspired by the driven package for data-driven constraint-based analysis: https://github.com/opencobra/driven\n\nThe package can be installed using pip: `pip install dexom-python`\n\nYou can also clone the git repository with `git clone https://forge.inrae.fr/metexplore/cbm/dexom-python`  \nThen install dependencies with `poetry install` (if poetry is already installed in your Python environment) or `pip install -e .` \n\nAPI documentation is available here: https://dexom-python.readthedocs.io/en/stable/  \nAll of the command-line scripts can be called with the `-h` option to display help messages.\n\n## Requirements\n- Python 3.7 - 3.10\n- CPLEX 12.10 - 22.10\n\n### Installing CPLEX\n\n[Free license (Trial version)](https://www.ibm.com/analytics/cplex-optimizer): this version is limited to 1000 variables and 1000 constraints, and is therefore not usable on larger models.\n\n[Academic license](https://www.ibm.com/academic/technology/data-science): for this, you must sign up using an academic email address.\n - after logging in, you can access the download for \"ILOG CPLEX Optimization Studio\"\n - download version 12.10 or higher of the appropriate installer for your operating system\n - install the solver \n\nYou must then update the environment variable named PYTHONPATH (in Linux) or Path (in Windows) by adding the directory containing the `setup.py` file appropriate for your OS and Python version.   \nAlternatively, run `python \"C:\\Program Files\\IBM\\ILOG\\CPLEX_Studio1210\\python\\setup.py\" install` and/or `pip install cplex==12.10` (with the appropriate CPLEX version number).\n\n## 
Available functions\n\nThese are the functions available for context-specific metabolic subnetwork extraction.\n\n### apply_gpr\nThe `gpr_rules.py` script can be used to transform gene expression data into reaction weights.  \nIt uses the gene identifiers and gene-protein-reaction rules present in the model to connect the genes and reactions.  \nBy default, continuous gene expression values/weights will be transformed into continuous reaction weights.  \nUsing the `--convert` flag will instead create semi-quantitative reaction weights with values in {-1, 0, 1}. The default proportion of these three weights is {25%, 50%, 25%}; it can be adjusted with the `--quantiles` parameter.\n\n### iMAT\n`imat_functions.py` contains a modified version of the iMAT algorithm as defined by [(Shlomi et al. 2008)](https://pubmed.ncbi.nlm.nih.gov/18711341/).  \nThe main inputs of this algorithm are a model file, which must be supplied in a cobrapy-compatible format (SBML, JSON or MAT), and a reaction_weight file in which each reaction is attributed a score.  \nThese reaction weights must be determined prior to launching imat, for example with GPR rules present in the metabolic model.  \n\nThe remaining inputs of imat are:\n- `epsilon`: the activation threshold of reactions with weight \u003e 0\n- `threshold`: the activation threshold for unweighted reactions\n- `full`: a bool parameter for switching between the partial \u0026 full-DEXOM implementation\n\nIn addition, the following solver parameters have been made available through the solver API:\n- `timelimit`: the maximum amount of time allowed for solver optimization (in seconds)\n- `feasibility`: the solver feasibility tolerance\n- `mipgaptol`: the solver MIP gap tolerance\n\nNote: the feasibility tolerance determines the solver's capacity to return correct results.  
\n**It is absolutely necessary** to uphold the following rule: `epsilon \u003e threshold \u003e ub*feasibility` (where `ub` is the maximal upper bound for reaction flux in the model).\n\nBy default, imat uses the `create_new_partial_variables` function. In this version, binary flux indicator variables are created for each reaction with a non-zero weight.  \nIn the full-DEXOM implementation, binary flux indicator variables are created for every reaction in the model. This does not change the result of the imat function, but can be used for the enumeration methods below.\n\nThere is additionally a  `parsimonious_imat` function, which first maximizes the original iMAT objective, then minimizes the absolute sum of all reaction fluxes, thus producing a parsimonious flux distribution.\n\n### enum_functions\n\nFour methods for enumerating context-specific networks are available:\n- `rxn_enum_functions.py` contains reaction-enumeration (function name: `rxn_enum`)\n- `icut_functions.py` contains integer-cut (function name: `icut`)\n- `maxdist_functions.py` contains distance-maximization (function name: `maxdist`)\n- `diversity_enum_functions.py` contains diversity-enumeration  (function name: `diversity_enum`)\n\nAn explanation of these methods can be found in [(Rodriguez-Mier et al. 2021)](https://doi.org/10.1371/journal.pcbi.1008730).  \nEach of these methods can be used on its own. 
The same model and reaction_weights inputs as for the imat function must be provided.\n\nAdditional parameters for all 4 methods are:\n- `prev_sol`: an imat solution used as a starting point (if none is provided, a new one will be computed)  \n- `obj_tol`: the relative tolerance on the imat objective value for the optimality of the solutions  \n\nicut, maxdist, and diversity-enum also have two more parameters:\n- `maxiter`: the maximum number of iterations to run\n- `full`: set to True to use the full-DEXOM implementation  \nAs previously explained, the full-DEXOM implementation defines binary indicator variables for all reactions in the model. Although only the reactions with non-zero weights have an impact on the imat objective function, the distance maximization function which is used in maxdist and diversity-enum can utilize the binary indicators for all reactions. This increases the distance between the solutions and their diversity, but requires significantly more computation time.  \n\nmaxdist and diversity-enum also have one additional parameter:  \n- `icut`: if True, an integer-cut constraint will be applied to prevent the enumeration from producing duplicate solutions\n\n## Parallelized DEXOM for computation clusters\nThe folder `dexom_python/cluster_utils/` contains batch scripts which can be used for running dexom_python functions on a slurm cluster, as well as a snakemake workflow which can be used to launch enumeration functions in multiple jobs.\n\nThe script `cluster_install_dexom_python.sh` contains the necessary commands for cloning the dexom-python git repository, setting up a Python virtual environment and installing all required dependencies.  \nNote that this script will only work if your cluster has a Python module installed at `system/Python-3.7.4` - otherwise you must use a Python version which is installed on your cluster.  \nInstalling the CPLEX solver must be done separately. 
For a brief explanation of how to install the solver on Linux, refer to [this IBM Q\u0026A page](https://www.ibm.com/support/pages/installation-ibm-ilog-cplex-optimization-studio-linux-platforms).\n\nThe snakemake workflow can be launched with the following command (note that you must replace the `\"path/to/solver\"` string with the actual path to your CPLEX solver):  \n```\nsbatch dexom_python/cluster_utils/submit_slurm.sh\n```\nIf you run this command without modifying any parameters, it will execute a short DEXOM pipeline (with reaction-enumeration followed by diversity-enumeration) on a toy model.\n\nThe main parameters of the snakemake workflow can be found in the file `cluster_config.yaml`.  \nHere you can define the inputs \u0026 outputs, as well as the number of parallel batches and iterations per batch.  \nNote that if you want to modify the advanced parameters for DEXOM, such as the solver tolerance and threshold values, you must do so in the `dexom_python/default_parameter_values.py` file.  \n\nThis workflow uses a reaction-weights file as an input.\n\nThe following scripts provide some tools to visualize \u0026 analyze DEXOM results:  \n- `pathway_enrichment.py` can be used to perform a pathway enrichment analysis using a one-sided hypergeometric test  \n- `result_functions.py` contains the `plot_pca` function, which performs Principal Component Analysis on the enumeration solutions\n\n*Some older scripts for running enumeration functions on a slurm cluster can be found in `dexom_python/cluster_utils/legacy`. However, it is strongly recommended to use the snakemake workflow, which is more reliable and can be adapted more easily for different applications.*\n\n\n## Example scripts\n\n### Toy models\nThe `toy_models.py` script contains code for generating some small metabolic models and reaction weights.  \nThe `toy_models/` folder contains some ready-to-use models and reaction weight files.  
\nThe `main.py` script contains a simple example of the DEXOM workflow using one of the toy models.   \nAs mentioned previously, the snakemake workflow in `dexom_python/cluster_utils/` also uses a toy model as an example.\n\n### Recon 2.2\nThe `example_data/` folder contains a modified version of the Recon 2.2 model [(Swainston et al. 2016)](https://doi.org/10.1007/s11306-016-1051-4) as well as some differential gene expression data which can be used to test this implementation.  \nThe folder already contains a reaction-weights file, which was produced with the following command:  \n```\npython dexom_python/gpr_rules.py -m example_data/recon2v2_corrected.json -g example_data/pval_0-01_geneweights.csv -o example_data/pval_0-01_reactionweights\n```\nAlternatively, an example of how this command can be submitted to a slurm cluster is shown in `slurm_example_gpr.sh` (again, you must insert the path to your CPLEX solver in the appropriate location).\n\nIn order to use the snakemake workflow on this example dataset, you must modify some parameters in `cluster_config.yaml`:\n```\nmodel: example_data/recon2v2_corrected.json\nreaction_weights: example_data/pval_0-01_reactionweights.csv\noutput_path: example_data_cluster_output/\n```\nAdditionally, when using continuous reaction-weights, the solver may have difficulty finding solutions if the constraints are too strict. 
To relax the optimality tolerance on the objective value, modify the following parameter in the file `dexom_python/default_parameter_values.py`:\n```\n'obj_tol': 2e-3,\n```\nYou can then once again start the snakemake workflow with the command:\n```\nsbatch dexom_python/cluster_utils/submit_slurm.sh\n```\n\nAfter all jobs are completed, you can analyze the results with the following commands:  \n```\npython dexom_python/pathway_enrichment.py -s example_data_cluster_output/all_unique_solutions.csv -m example_data/recon2v2_corrected.json -o example_data/\npython dexom_python/result_functions.py -s example_data_cluster_output/all_unique_solutions.csv -o example_data/\n```\nThe file `example_data_cluster_output/all_unique_solutions.csv` contains all unique solutions enumerated with DEXOM.  \nThe `.png` files in the `example_data` folder contain boxplots of the pathway enrichment tests as well as a 2D PCA plot of the binary solution vectors.\n\n### Cell-specific reconstruction\n\nAn example of how to use DEXOM-python as a part of a cell-specific network reconstruction pipeline, including a more complete snakemake workflow, can be found here: https://forge.inrae.fr/metexplore/cbm/ocmmed\n\n### Latest version: v2.1.1","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmetexplore%2Fdexom-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmetexplore%2Fdexom-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmetexplore%2Fdexom-python/lists"}