{"id":13675830,"url":"https://github.com/zqfang/GSEApy","last_synced_at":"2025-04-28T23:31:24.891Z","repository":{"id":39121356,"uuid":"49308559","full_name":"zqfang/GSEApy","owner":"zqfang","description":"Gene Set Enrichment Analysis in Python","archived":false,"fork":false,"pushed_at":"2025-04-01T16:56:01.000Z","size":105513,"stargazers_count":614,"open_issues_count":27,"forks_count":125,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-04-04T09:17:57.467Z","etag":null,"topics":["enrichment-analysis","gsea","python3","rust"],"latest_commit_sha":null,"homepage":"http://gseapy.rtfd.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zqfang.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-01-09T03:05:06.000Z","updated_at":"2025-04-04T02:37:16.000Z","dependencies_parsed_at":"2023-02-04T08:15:45.101Z","dependency_job_id":"fd43d0ff-d247-4396-8b8d-08ec419b179f","html_url":"https://github.com/zqfang/GSEApy","commit_stats":{"total_commits":1094,"total_committers":23,"mean_commits":47.56521739130435,"dds":0.6892138939670933,"last_synced_commit":"bc01fd4fc43cde5c293ac3b94b41c46b1ab8ce46"},"previous_names":[],"tags_count":48,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zqfang%2FGSEApy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zqfang%2FGSEApy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zqfang%2FGSEApy/releases","manifests_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/zqfang%2FGSEApy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zqfang","download_url":"https://codeload.github.com/zqfang/GSEApy/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251404487,"owners_count":21584098,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["enrichment-analysis","gsea","python3","rust"],"created_at":"2024-08-02T12:01:04.744Z","updated_at":"2025-04-28T23:31:19.883Z","avatar_url":"https://github.com/zqfang.png","language":"Python","funding_links":[],"categories":["Python","Software packages","زیست شناسی و بیوتکنولوژی"],"sub_categories":["RNA-seq","کار با زمان و تقویم"],"readme":"\nGSEApy\n========\n\nGSEApy: Gene Set Enrichment Analysis in Python.\n------------------------------------------------\n\n.. image:: https://badge.fury.io/py/gseapy.svg\n    :target: https://badge.fury.io/py/gseapy\n\n.. image:: https://img.shields.io/conda/vn/bioconda/GSEApy.svg?style=plastic\n    :target: http://bioconda.github.io\n\n.. image:: https://anaconda.org/bioconda/gseapy/badges/downloads.svg   \n    :target: https://anaconda.org/bioconda/gseapy\n\n.. image:: https://github.com/zqfang/GSEApy/workflows/GSEApy/badge.svg?branch=master\n    :target: https://github.com/zqfang/GSEApy/actions\n    :alt: Action Status\n\n.. image:: http://readthedocs.org/projects/gseapy/badge/?version=master\n    :target: http://gseapy.readthedocs.io/en/master/?badge=master\n    :alt: Documentation Status\n\n.. 
image:: https://img.shields.io/badge/license-MIT-blue.svg\n    :target:  https://img.shields.io/badge/license-MIT-blue.svg\n\n.. image:: https://img.shields.io/pypi/pyversions/gseapy.svg\n    :alt: PyPI - Python Version\n\n\n**Release notes** : https://github.com/zqfang/GSEApy/releases\n\n`Tutorial for scRNA-seq datasets \u003chttps://gseapy.readthedocs.io/en/latest/singlecell_example.html#\u003e`_\n\n`Tutorial for general usage \u003chttps://gseapy.readthedocs.io/en/latest/gseapy_example.html\u003e`_\n\n\nCitation\n------------------------------------\n::\n\n    Zhuoqing Fang, Xinyuan Liu, Gary Peltz, GSEApy: a comprehensive package for performing gene set enrichment analysis in Python, \n    Bioinformatics, 2022;, btac757, https://doi.org/10.1093/bioinformatics/btac757\n\n\n\nGSEApy is a Python/Rust implementation of **GSEA** and a wrapper for **Enrichr**.\n--------------------------------------------------------------------------------------------\n\nGSEApy can be used for **RNA-seq, ChIP-seq, Microarray** data. It can be used for convenient GO enrichment and to produce **publication quality figures** in Python.\n\n\nGSEApy has 7 sub-commands available: ``gsea``, ``prerank``, ``ssgsea``, ``gsva``, ``replot``, ``enrichr``, ``biomart``.\n\n\n:gsea:    The ``gsea`` module produces `GSEA  \u003chttp://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Main_Page\u003e`_ results.  The input requires a txt file (FPKM, Expected Counts, TPM, etc.), a cls file, and a gene_sets file in gmt format.\n:prerank: The ``prerank`` module produces **Prerank tool** results.  The input expects a pre-ranked gene list dataset with correlation values, provided in .rnk format, and a gene_sets file in gmt format.  The ``prerank`` module is an API to the `GSEA` pre-rank tool.\n:ssgsea: The ``ssgsea`` module performs **single sample GSEA (ssGSEA)** analysis.  The input expects a pd.Series (indexed by gene name), or a pd.DataFrame (including a ``GCT`` file) with expression values and a ``GMT`` file. 
For multiple sample input, ssGSEA recognizes gct format, too. The ssGSEA enrichment score for a gene set is described by `D. Barbie et al 2009 \u003chttp://www.nature.com/nature/journal/v462/n7269/abs/nature08460.html\u003e`_.\n:gsva: The ``gsva`` module performs the `GSVA \u003chttps://github.com/rcastelo/GSVA\u003e`_ method by `Hänzelmann et al \u003chttps://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-14-7\u003e`_. The input is the same as for ssgsea.\n:replot: The ``replot`` module reproduces GSEA desktop version results.  The only input for GSEApy is the location of the ``GSEA`` Desktop output results.\n:enrichr: The ``enrichr`` module enables you to perform gene set enrichment analysis using the ``Enrichr`` API. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr . It runs very fast.\n:biomart: The ``biomart`` module helps you convert gene ids using the BioMart API.\n\n\nPlease use 'gseapy COMMAND -h' to see the detailed description of each option of each module.\n\n\nThe full ``GSEA`` is far too extensive to describe here; see\n`GSEA  \u003chttp://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Main_Page\u003e`_ documentation for more information. All file formats for GSEApy are identical to those of the ``GSEA`` desktop version.\n\n\n\nWhy GSEApy\n-----------------------------------------------------\n\nI would like to use Pandas to explore my data, but I did not find a convenient tool to\ndo gene set enrichment analysis in Python. So, here are my reasons:\n\n* **Ability to run inside a Python interactive console without having to switch to R!!!**\n* User friendly for both wet and dry lab users.\n* Produce or reproduce publishable figures.\n* Perform batch jobs easily.\n* Easy to use in a bash shell or your data analysis workflow, e.g. snakemake.\n\n\nGSEApy vs GSEA(Broad) output\n-----------------------------------------------\nUsing the same data, ``GSEApy`` reproduces results similar to those of ``GSEAPreranked``.\n\n\n.. 
image:: docs/Preank.py.vs.broad.jpg\n    :width: 400\n\n\nSee more output here: `Example \u003chttp://gseapy.readthedocs.io/en/master/gseapy_example.html\u003e`_\n\n\nInstallation\n------------\n\n| Install the gseapy package from bioconda or pip.\n\n\n.. code:: shell\n\n   # if you have conda (MacOS_x86-64 and Linux only)\n   $ conda install -c bioconda gseapy\n   # Windows and MacOS_ARM64(M1/2-Chip)\n   $ pip install gseapy\n\n\n| If pip install fails, use\n\n.. code:: shell\n\n   # you need to install rust first to compile the code\n   curl https://sh.rustup.rs -sSf | sh -s -- -y\n   # add the rust toolchain to PATH\n   export PATH=\"$PATH:$HOME/.cargo/bin\"\n   # install (GitHub no longer serves the unauthenticated git:// protocol, so use https)\n   $ pip install git+https://github.com/zqfang/gseapy.git#egg=gseapy\n\n\nDependency\n--------------\n* Python 3.7+\n\nMandatory\n~~~~~~~~~\n\n* build\n    * Rust: For gseapy \u003e 0.11.0, a Rust compiler is needed\n    * setuptools-rust\n* run\n    * Numpy \u003e= 1.13.0\n    * Scipy\n    * Pandas\n    * Matplotlib\n    * Requests\n\nRun GSEApy\n-----------------\n\n\nFor command line usage:\n~~~~~~~~~~~~~~~~~~~~~~~\n\n.. code:: bash\n\n\n  # An example to reproduce figures using replot module.\n  $ gseapy replot -i ./Gsea.reports -o test\n\n\n  # An example to run GSEA using gseapy gsea module\n  $ gseapy gsea -d exptable.txt -c test.cls -g gene_sets.gmt -o test\n\n  # An example to run Prerank using gseapy prerank module\n  $ gseapy prerank -r gsea_data.rnk -g gene_sets.gmt -o test\n\n  # An example to run ssGSEA using gseapy ssgsea module\n  $ gseapy ssgsea -d expression.txt -g gene_sets.gmt -o test\n\n  # An example to run GSVA using gseapy gsva module\n  $ gseapy gsva -d expression.txt -g gene_sets.gmt -o test\n\n  # An example to use enrichr api\n  # see details for -g input -\u003e ``get_library_name`` \n  $ gseapy enrichr -i gene_list.txt -g KEGG_2016 -o test\n\n\n\nRun gseapy inside python console:\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n1. 
Prepare expression.txt, gene_sets.gmt and test.cls as required by GSEA, then you could do this\n\n.. code:: python\n\n    import gseapy\n\n    # run GSEA.\n    gseapy.gsea(data='expression.txt', gene_sets='gene_sets.gmt', cls='test.cls', outdir='test')\n\n    # run prerank\n    gseapy.prerank(rnk='gsea_data.rnk', gene_sets='gene_sets.gmt', outdir='test')\n\n    # run ssGSEA\n    gseapy.ssgsea(data=\"expression.txt\", gene_sets= \"gene_sets.gmt\", outdir='test')\n\n    # run GSVA\n    gseapy.gsva(data=\"expression.txt\", gene_sets= \"gene_sets.gmt\", outdir='test')\n\n    # An example to reproduce figures using replot module.\n    gseapy.replot(indir='./Gsea.reports', outdir='test')\n\n\n2. If you prefer to use a DataFrame, dict, or list in an interactive Python console, you could do this.\n\nSee details here: `Example \u003chttp://gseapy.readthedocs.io/en/master/gseapy_example.html\u003e`_\n\n.. code:: python\n\n\n    import pandas as pd\n    import gseapy\n\n    # assign dataframe, and use enrichr library data set 'KEGG_2016'\n    expression_dataframe = pd.DataFrame()\n\n    sample_names = ['A','A','A','B','B','B'] # exactly two groups; any names you like\n\n    # assign gene_sets parameter with enrichr library name or gmt file on your local computer.\n    gseapy.gsea(data=expression_dataframe, gene_sets='KEGG_2016', cls= sample_names, outdir='test')\n\n    # prerank tool\n    gene_ranked_dataframe = pd.DataFrame()\n    gseapy.prerank(rnk=gene_ranked_dataframe, gene_sets='KEGG_2016', outdir='test')\n\n    # ssGSEA\n    gseapy.ssgsea(data=expression_dataframe, gene_sets='KEGG_2016', outdir='test')\n\n    # gsva\n    gseapy.gsva(data=expression_dataframe, gene_sets='KEGG_2016', outdir='test')\n\n\n3. For ``enrichr`` , you could assign a list, pd.Series, pd.DataFrame object, or a txt file (one gene name per row).\n\n.. 
code:: python\n\n    # assign a list object to enrichr\n    gl = ['SCARA3', 'LOC100044683', 'CMBL', 'CLIC6', 'IL13RA1', 'TACSTD2', 'DKKL1', 'CSF1',\n         'SYNPO2L', 'TINAGL1', 'PTX3', 'BGN', 'HERC1', 'EFNA1', 'CIB2', 'PMP22', 'TMEM173']\n\n    gseapy.enrichr(gene_list=gl, gene_sets='KEGG_2016', outdir='test')\n\n    # or a txt file path.\n    gseapy.enrichr(gene_list='gene_list.txt', gene_sets='KEGG_2016',\n                   outdir='test', cutoff=0.05, format='png' )\n\n\nGSEApy supported gene set libraries:\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nTo see the full list of gseapy supported gene set libraries, please click here: `Library \u003chttp://amp.pharm.mssm.edu/Enrichr/#stats\u003e`_\n\nOr use the ``get_library_name`` function inside a Python console.\n\n.. code:: python\n\n    # see the full list of the latest enrichr library names, which can be passed to the -g parameter:\n    names = gseapy.get_library_name()\n\n    # show top 20 entries.\n    print(names[:20])\n\n\n   ['Genome_Browser_PWMs',\n   'TRANSFAC_and_JASPAR_PWMs',\n   'ChEA_2013',\n   'Drug_Perturbations_from_GEO_2014',\n   'ENCODE_TF_ChIP-seq_2014',\n   'BioCarta_2013',\n   'Reactome_2013',\n   'WikiPathways_2013',\n   'Disease_Signatures_from_GEO_up_2014',\n   'KEGG_2016',\n   'TF-LOF_Expression_from_GEO',\n   'TargetScan_microRNA',\n   'PPI_Hub_Proteins',\n   'GO_Molecular_Function_2015',\n   'GeneSigDB',\n   'Chromosome_Location',\n   'Human_Gene_Atlas',\n   'Mouse_Gene_Atlas',\n   'GO_Cellular_Component_2015',\n   'GO_Biological_Process_2015',\n   'Human_Phenotype_Ontology',]\n\n\n\nDev \n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. 
code:: shell\n\n\n        # test rust extension only \n        cargo test --features=extension-module\n        # test whole package\n        python setup.py test\n\n\n\nBug Report\n~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nIf you would like to report any bugs when using gseapy, don't hesitate to create an issue on GitHub.\n\n\nTo get help with GSEApy\n------------------------------------\n\n1. See `Frequently Asked Questions \u003chttps://gseapy.readthedocs.io/en/latest/faq.html\u003e`_\n\n2. Visit the documentation site at `Examples \u003chttps://gseapy.readthedocs.io/en/latest/gseapy_example.html\u003e`_\n\n3. The GSEApy discussion channel: `Q\u0026A \u003chttps://github.com/zqfang/GSEApy/discussions\u003e`_ \n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzqfang%2FGSEApy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzqfang%2FGSEApy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzqfang%2FGSEApy/lists"}