{"id":23757591,"url":"https://github.com/broadinstitute/g2papi","last_synced_at":"2025-09-05T04:32:24.841Z","repository":{"id":230473995,"uuid":"777994843","full_name":"broadinstitute/g2papi","owner":"broadinstitute","description":"Python Client Library for the G2P Portal API","archived":false,"fork":false,"pushed_at":"2025-08-14T20:06:41.000Z","size":4562,"stargazers_count":12,"open_issues_count":2,"forks_count":2,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-08-14T22:10:39.184Z","etag":null,"topics":["bioinformatics","bioinformatics-tool","bioinformatics-visualization","isoforms","protein-structure","proteins","sequence-alignment","variant-analysis"],"latest_commit_sha":null,"homepage":"https://g2p.broadinstitute.org","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/broadinstitute.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-03-26T22:10:11.000Z","updated_at":"2025-08-14T20:06:44.000Z","dependencies_parsed_at":"2025-05-07T16:35:43.307Z","dependency_job_id":"54932cff-ae9d-4ab8-885a-42c226c52cc2","html_url":"https://github.com/broadinstitute/g2papi","commit_stats":null,"previous_names":["broadinstitute/g2papi"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/broadinstitute/g2papi","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/broadinstitute%2Fg2papi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/broadinstitute%2Fg2papi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/broadinstitute%2Fg2papi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/broadinstitute%2Fg2papi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/broadinstitute","download_url":"https://codeload.github.com/broadinstitute/g2papi/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/broadinstitute%2Fg2papi/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273713294,"owners_count":25154607,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-05T02:00:09.113Z","response_time":402,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","bioinformatics-tool","bioinformatics-visualization","isoforms","protein-structure","proteins","sequence-alignment","variant-analysis"],"created_at":"2024-12-31T19:49:07.538Z","updated_at":"2025-09-05T04:32:19.809Z","avatar_url":"https://github.com/broadinstitute.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# g2papi\n\n`g2papi` is a Python library and command-line tool designed to interact with the G2P API provided by the Broad Institute. 
It allows users to retrieve, for a given gene, mappings between protein isoforms, transcripts, and PDB structures, as well as protein feature tables.\n\nAccess the Swagger definition and endpoints at: https://g2p.broadinstitute.org/api-docs/\n\n## Citation\nIf you use g2papi in your research, please cite:\n\nKwon S, et al. Genomics 2 Proteins portal: A resource and discovery tool for linking genetic screening outputs to protein sequences and structures. doi: https://doi.org/10.1101/2024.01.02.573913.\n\n\n## Installation\n\nFirst, ensure that you have Python and pip installed on your system.\n\n### Installing from PyPI\n\n```\npip install g2papi\n```\n\n### Installing from source\n\nClone this repository to your local machine and navigate into the cloned directory:\n\n```\ngit clone https://github.com/broadinstitute/g2papi.git\ncd g2papi\n```\n\nTo install `g2papi`, run:\n\n```\npip install .\n```\n\nThis will install the `g2papi` package and its dependencies.\n\n## Usage\n\n### As a Python Library\n\nYou can import `g2papi` in your Python scripts to retrieve data from the G2P API directly. 
Here are some examples:\n\nCalling the G2P3D API to get the Gene-Transcript-Protein Isoform-Structure mapping\n\n```python\nimport g2papi\n\n# Get gene-transcript-protein isoform-protein structure map as a pandas dataframe\ngene_transcript_protein_isoform_struct = g2papi.get_gene_transcript_protein_isoform_structure('BRCA1', 'P38398')\nprint(gene_transcript_protein_isoform_struct[['UniProt Isoform','Ensembl Transcript Id', 'RefSeq mRNA Id']].head())\n\n```\n\nOutput:\n\n```\n  UniProt Isoform Ensembl Transcript Id RefSeq mRNA Id\n0     P38398-1(*)    ENST00000357654(*)   NM_001407611\n1     P38398-1(*)    ENST00000357654(*)   NM_001407616\n2     P38398-1(*)    ENST00000357654(*)   NM_001407624\n3     P38398-1(*)    ENST00000357654(*)   NM_001407637\n4     P38398-1(*)    ENST00000357654(*)   NM_001407641\n```\n\nGetting Protein Features\n\n```python\nimport g2papi\n\n# Get protein features as a pandas dataframe\nprotein_features = g2papi.get_protein_features('BRCA1', 'P38398')\nprotein_features.fillna('-', inplace=True)\nprint(protein_features[[\n    'residueId', 'AA',\n    'AlphaFold confidence (pLDDT)', \n    'Active site (UniProt)'\n]].head())\n\n```\n\nOutput:\n\n```\n   residueId AA  AlphaFold confidence (pLDDT) Active site (UniProt)\n0          1  M                            41.59                     -\n1          2  D                            45.81                     -\n2          3  L                            48.11                     -\n3          4  S                            63.99                     -\n4          5  A                            61.73                     -\n```\n\n\n### As a Command-Line Tool\ng2papi can also be used as a command-line tool to retrieve information directly to your terminal or output files.\n\nGetting Gene-Transcript-Protein Isoform-Structure Map with the G2P3D API\n\n```\ng2papi get-gene-transcript-protein-isoform-structure-map --geneName BRCA1 --uniprotId P38398\n```\n\nGetting Protein Features\n\n```\ng2papi 
get-protein-features --geneName BRCA1 --uniprotId P38398\n```\n\nThe commands above print their results to the terminal. To save the output to a file, redirect it:\n\n```\ng2papi get-gene-transcript-protein-isoform-structure-map --geneName BRCA1 --uniprotId P38398 \u003e transcript_map.tsv\ng2papi get-protein-features --geneName BRCA1 --uniprotId P38398 \u003e protein_features.tsv\n```\n\n## System Requirements\nThe package was developed and tested on Python 3.9.12 and is designed to run on any computer that can run Python 3 and has a working internet connection. The library was installed and tested on Ubuntu Linux 20.04 and macOS Ventura 13.5.1.\n\n## Setup time (total time to set up and run: approximately 5 minutes)\nEach of the 3 installation steps runs in less than 5 seconds, and execution takes less than 5 seconds for genes encoding proteins with fewer than 2000 residues.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbroadinstitute%2Fg2papi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbroadinstitute%2Fg2papi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbroadinstitute%2Fg2papi/lists"}