{"id":39614708,"url":"https://github.com/noamteyssier/adpbulk","last_synced_at":"2026-01-18T08:15:49.814Z","repository":{"id":41811198,"uuid":"437088194","full_name":"noamteyssier/adpbulk","owner":"noamteyssier","description":"pseudobulking on an AnnData object","archived":false,"fork":false,"pushed_at":"2025-03-14T22:33:21.000Z","size":22,"stargazers_count":34,"open_issues_count":5,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-12-21T23:51:25.036Z","etag":null,"topics":["anndata","differential-expression","pseudobulk","scanpy"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/noamteyssier.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-10T19:11:44.000Z","updated_at":"2025-08-14T16:52:45.000Z","dependencies_parsed_at":"2025-03-14T23:23:44.714Z","dependency_job_id":"776b5832-e3b2-46b8-bd11-9ae746e6ecf0","html_url":"https://github.com/noamteyssier/adpbulk","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/noamteyssier/adpbulk","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noamteyssier%2Fadpbulk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noamteyssier%2Fadpbulk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noamteyssier%2Fadpbulk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noamteyssier%2Fadpbulk/manifests","owner_url":"https://repos.e
cosyste.ms/api/v1/hosts/GitHub/owners/noamteyssier","download_url":"https://codeload.github.com/noamteyssier/adpbulk/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noamteyssier%2Fadpbulk/sbom","scorecard":{"id":691491,"data":{"date":"2025-08-11","repo":{"name":"github.com/noamteyssier/adpbulk","commit":"553e333e10279a2f519c2f4baf196aa25d639bff"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.4,"checks":[{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/main.yml:26: update your workflow using https://app.stepsecurity.io/secureworkflow/noamteyssier/adpbulk/main.yml/main?enable=pin","Warn: pipCommand not pinned by hash: .github/workflows/main.yml:31","Info:   0 out of   1 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of   1 pipCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Code-Review","score":0,"reason":"Found 
0/18 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/main.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file 
detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'main'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection 
settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 11 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-22T02:23:37.441Z","repository_id":41811198,"created_at":"2025-08-22T02:23:37.442Z","updated_at":"2025-08-22T02:23:37.442Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28534141,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-18T00:39:45.795Z","status":"online","status_checked_at":"2026-01-18T02:00:07.578Z","response_time":98,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anndata","differential-expression","pseudobulk","scanpy"],"created_at":"2026-01-18T08:15:48.555Z","updated_at":"2026-01-18T08:15:49.801Z","avatar_url":"https://github.com/noamteyssier.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# adpbulk\n\n# Summary\nPerforms pseudobulking of an `AnnData` object based on columns available in the `.obs` dataframe. 
This was originally intended to pseudobulk single-cell RNA-seq data into higher-order groupings so that existing RNA-seq differential expression tools such as `edgeR` and `DESeq2` can be applied. An example usage would be pseudobulking cells based on their cluster, sample of origin, or CRISPRi guide identity. This is intended to work on both individual categories (e.g. any one of the examples above) and combinations of categories (two of the three, etc.).\n\n# Installation\n## From PyPI\n```bash\npip install adpbulk\n```\n\n## From GitHub\n```bash\ngit clone https://github.com/noamteyssier/adpbulk\ncd adpbulk\npip install .\npytest -v\n```\n\n# Usage\nThis package is intended to be used as a Python module.\n\n## Single Category Pseudo-Bulk\nThe simplest use case is to aggregate on a single category. This will aggregate all the observations belonging to the same class within the category and return a pseudo-bulked matrix with dimensions equal to the number of values within the category. \n```python3\nfrom adpbulk import ADPBulk\n\n# initialize the object\nadpb = ADPBulk(adat, \"category_name\")\n\n# perform the pseudobulking\npseudobulk_matrix = adpb.fit_transform()\n\n# retrieve the sample meta data (useful for easy incorporation with edgeR)\nsample_meta = adpb.get_meta()\n```\n\n## Multiple Category Pseudo-Bulk\nA common use case is to aggregate on multiple categories. This will aggregate all observations belonging to each combination of classes across the categories and return a pseudo-bulked matrix with dimensions equal to the number of nonzero intersections between categories. 
\n```python3\nfrom adpbulk import ADPBulk\n\n# initialize the object\nadpb = ADPBulk(adat, [\"category_a\", \"category_b\"])\n\n# perform the pseudobulking\npseudobulk_matrix = adpb.fit_transform()\n\n# retrieve the sample meta data (useful for easy incorporation with edgeR)\nsample_meta = adpb.get_meta()\n```\n\n## Pseudo-Bulk using raw counts\nSome differential expression software expects the counts to be untransformed counts. SCANPY uses the `.raw` attribute in its `AnnData` objects to store the initial `AnnData` object before transformation. If you'd like to perform the pseudo-bulk aggregation using these raw counts you can provide the `use_raw=True` flag. \n```python3\nfrom adpbulk import ADPBulk\n\n# initialize the object w. aggregation on the `.raw` attribute\nadpb = ADPBulk(adat, [\"category_a\", \"category_b\"], use_raw=True)\n\n# perform the pseudobulking\npseudobulk_matrix = adpb.fit_transform()\n\n# retrieve the sample meta data (useful for easy incorporation with edgeR)\nsample_meta = adpb.get_meta()\n```\n\n## Alternative Aggregation Options\nIt may also be useful to aggregate using an alternative function besides the sum - this option will allow you to choose between sum, mean, and median as an aggregation function.\n```python3\nfrom adpbulk import ADPBulk\n\n# initialize the object w. an alternative aggregation option\n# aggregation options are: sum, mean, and median\n# default aggregation is sum\nadpb = ADPBulk(adat, \"category\", method=\"mean\")\n\n# perform the pseudobulking\npseudobulk_matrix = adpb.fit_transform()\n\n# retrieve the sample meta data (useful for easy incorporation with edgeR)\nsample_meta = adpb.get_meta()\n```\n\n## Alternative Formatting Options\n```python3\nfrom adpbulk import ADPBulk\n\n# initialize the object w. 
alternative name formatting options\nadpb = ADPBulk(adat, [\"category_a\", \"category_b\"], name_delim=\".\", group_delim=\"::\")\n\n# perform the pseudobulking\npseudobulk_matrix = adpb.fit_transform()\n\n# retrieve the sample meta data (useful for easy incorporation with edgeR)\nsample_meta = adpb.get_meta()\n```\n\n## Example `AnnData` Function\nThe following function generates a mock `AnnData` object for testing the module, or for exploring the object if you are unfamiliar with it.\n```python3\nimport numpy as np\nimport pandas as pd\nimport anndata as ad\n\ndef build_adat(SIZE_N=100, SIZE_M=100):\n    \"\"\"\n    creates an anndata for testing\n    \"\"\"\n    # generates random values (mock transformed data)\n    mat = np.random.random((SIZE_N, SIZE_M))\n\n    # generates random values (mock raw count data)\n    raw = np.random.randint(0, 1000, (SIZE_N, SIZE_M))\n\n    # creates the observations and categories\n    obs = pd.DataFrame({\n        \"cell\": [f\"b{idx}\" for idx in np.arange(SIZE_N)],\n        \"cA\": np.random.choice(np.random.choice(5)+1, SIZE_N),\n        \"cB\": np.random.choice(np.random.choice(5)+1, SIZE_N),\n        \"cC\": np.random.choice(np.random.choice(5)+1, SIZE_N),\n        \"cD\": np.random.choice(np.random.choice(5)+1, SIZE_N),\n        }).set_index(\"cell\")\n\n    # creates the variables (genes) and categories\n    var = pd.DataFrame({\n        \"symbol\": [f\"g{idx}\" for idx in np.arange(SIZE_M)],\n        \"cA\": np.random.choice(np.random.choice(5)+1, SIZE_M),\n        \"cB\": np.random.choice(np.random.choice(5)+1, SIZE_M),\n        \"cC\": np.random.choice(np.random.choice(5)+1, SIZE_M),\n        \"cD\": np.random.choice(np.random.choice(5)+1, SIZE_M),\n        }).set_index(\"symbol\")\n\n    # Creates the `AnnData` object\n    adat = ad.AnnData(\n            X=mat,\n            obs=obs,\n            var=var)\n\n    # Creates an `AnnData` object to simulate the `.raw` attribute\n    adat_raw = ad.AnnData(\n            X=raw,\n            obs=obs,\n         
   var=var)\n\n    # Sets the `.raw` attribute\n    adat.raw = adat_raw\n\n    return adat\n\nadat = build_adat()\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnoamteyssier%2Fadpbulk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnoamteyssier%2Fadpbulk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnoamteyssier%2Fadpbulk/lists"}