{"id":21962519,"url":"https://github.com/biocpy/summarizedexperiment","last_synced_at":"2025-09-08T06:37:29.512Z","repository":{"id":37858409,"uuid":"503640024","full_name":"BiocPy/SummarizedExperiment","owner":"BiocPy","description":"Container class for genomic experiments","archived":false,"fork":false,"pushed_at":"2025-09-01T16:30:22.000Z","size":4927,"stargazers_count":5,"open_issues_count":5,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-09-01T18:26:05.392Z","etag":null,"topics":["summarizedexperiment"],"latest_commit_sha":null,"homepage":"https://biocpy.github.io/SummarizedExperiment/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BiocPy.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.md","dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2022-06-15T06:18:08.000Z","updated_at":"2025-08-23T05:33:22.000Z","dependencies_parsed_at":"2023-01-31T19:15:37.580Z","dependency_job_id":"284567fb-6ad5-400d-820d-5aa881474708","html_url":"https://github.com/BiocPy/SummarizedExperiment","commit_stats":null,"previous_names":[],"tags_count":42,"template":false,"template_full_name":null,"purl":"pkg:github/BiocPy/SummarizedExperiment","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BiocPy%2FSummarizedExperiment","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BiocPy%2FSummarizedExperiment/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/BiocPy%2FSummarizedExperiment/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BiocPy%2FSummarizedExperiment/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BiocPy","download_url":"https://codeload.github.com/BiocPy/SummarizedExperiment/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BiocPy%2FSummarizedExperiment/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274146363,"owners_count":25230115,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-08T02:00:09.813Z","response_time":121,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["summarizedexperiment"],"created_at":"2024-11-29T10:42:51.146Z","updated_at":"2025-09-08T06:37:29.490Z","avatar_url":"https://github.com/BiocPy.png","language":"Python","readme":"[![Project generated with PyScaffold](https://img.shields.io/badge/-PyScaffold-005CA0?logo=pyscaffold)](https://pyscaffold.org/)\n[![PyPI-Server](https://img.shields.io/pypi/v/SummarizedExperiment.svg)](https://pypi.org/project/SummarizedExperiment/)\n![Unit tests](https://github.com/BiocPy/SummarizedExperiment/actions/workflows/run-tests.yml/badge.svg)\n\n# SummarizedExperiment\n\nThis package provides containers to represent genomic experimental data as 2-dimensional matrices, following Bioconductor's 
[SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html). In these matrices, the rows typically denote features or genomic regions of interest, while columns represent samples or cells.\n\nThe package currently includes representations for both `SummarizedExperiment` and `RangedSummarizedExperiment`. The difference is that a `RangedSummarizedExperiment` object provides an additional slot to store genomic regions for each feature, which is expected to be a `GenomicRanges` object (more [here](https://github.com/BiocPy/GenomicRanges/)).\n\n## Install\n\nTo get started, install the package from [PyPI](https://pypi.org/project/summarizedexperiment/):\n\n```shell\npip install summarizedexperiment\n```\n\n## Usage\n\nA `SummarizedExperiment` contains three key attributes:\n\n- `assays`: A dictionary of matrices with assay names as keys, e.g. counts, logcounts, etc.\n- `row_data`: Feature information, e.g. genes, transcripts, exons, etc.\n- `column_data`: Sample information about the columns of the matrices.\n\nFirst, let's mock feature and sample data:\n\n```python\nfrom random import random\nimport pandas as pd\nimport numpy as np\nfrom biocframe import BiocFrame\n\nnrows = 200\nncols = 6\ncounts = np.random.rand(nrows, ncols)\nrow_data = BiocFrame(\n    {\n        \"seqnames\": [\n            \"chr1\",\n            \"chr2\",\n            \"chr2\",\n            \"chr2\",\n            \"chr1\",\n            \"chr1\",\n            \"chr3\",\n            \"chr3\",\n            \"chr3\",\n            \"chr3\",\n        ]\n        * 20,\n        \"starts\": range(100, 300),\n        \"ends\": range(110, 310),\n        \"strand\": [\"-\", \"+\", \"+\", \"*\", \"*\", \"+\", \"+\", \"+\", \"-\", \"-\"] * 20,\n        \"score\": range(0, 200),\n        \"GC\": [random() for _ in range(10)] * 20,\n    }\n)\n\ncol_data = pd.DataFrame(\n    {\n        \"treatment\": [\"ChIP\", \"Input\"] * 3,\n    }\n)\n```\n\nTo create a 
`SummarizedExperiment`:\n\n```python\nfrom summarizedexperiment import SummarizedExperiment\n\ntse = SummarizedExperiment(\n    assays={\"counts\": counts}, row_data=row_data, column_data=col_data,\n    metadata={\"seq_platform\": \"Illumina NovaSeq 6000\"},\n)\n```\n\n    ## output\n    class: SummarizedExperiment\n    dimensions: (200, 6)\n    assays(1): ['counts']\n    row_data columns(6): ['seqnames', 'starts', 'ends', 'strand', 'score', 'GC']\n    row_names(0):\n    column_data columns(1): ['treatment']\n    column_names(0):\n    metadata(1): seq_platform\n\nTo create a `RangedSummarizedExperiment`:\n\n```python\nfrom summarizedexperiment import RangedSummarizedExperiment\nfrom genomicranges import GenomicRanges\n\ntrse = RangedSummarizedExperiment(\n    assays={\"counts\": counts}, row_data=row_data,\n    row_ranges=GenomicRanges.from_pandas(row_data.to_pandas()), column_data=col_data\n)\n```\n\n    ## output\n    class: RangedSummarizedExperiment\n    dimensions: (200, 6)\n    assays(1): ['counts']\n    row_data columns(6): ['seqnames', 'starts', 'ends', 'strand', 'score', 'GC']\n    row_names(0):\n    column_data columns(1): ['treatment']\n    column_names(0):\n    metadata(0):\n\nFor more examples, check out the [documentation](https://biocpy.github.io/SummarizedExperiment/).\n\n\u003c!-- pyscaffold-notes --\u003e\n\n## Note\n\nThis project has been set up using PyScaffold 4.5. For details and usage\ninformation on PyScaffold see https://pyscaffold.org/.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbiocpy%2Fsummarizedexperiment","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbiocpy%2Fsummarizedexperiment","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbiocpy%2Fsummarizedexperiment/lists"}