{"id":13407947,"url":"https://github.com/walaj/bxtools","last_synced_at":"2026-01-21T22:43:27.525Z","repository":{"id":73879722,"uuid":"68660022","full_name":"walaj/bxtools","owner":"walaj","description":"Tools for analyzing 10X Genomics data","archived":false,"fork":false,"pushed_at":"2019-02-06T13:12:47.000Z","size":147,"stargazers_count":42,"open_issues_count":6,"forks_count":10,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-07-31T20:28:39.388Z","etag":null,"topics":["genomics","sequencing","tenxgenomics"],"latest_commit_sha":null,"homepage":"","language":"Makefile","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/walaj.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-09-20T01:06:03.000Z","updated_at":"2024-07-08T21:24:29.000Z","dependencies_parsed_at":null,"dependency_job_id":"6420e4c0-635e-4b04-a0a9-915f3e93a91f","html_url":"https://github.com/walaj/bxtools","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/walaj%2Fbxtools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/walaj%2Fbxtools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/walaj%2Fbxtools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/walaj%2Fbxtools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/walaj","download_url":"https://codeload.github.com/walaj/bxtools/tar.gz/refs/heads/master","host":{
"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243578294,"owners_count":20313795,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["genomics","sequencing","tenxgenomics"],"created_at":"2024-07-30T20:00:49.719Z","updated_at":"2026-01-21T22:43:27.485Z","avatar_url":"https://github.com/walaj.png","language":"Makefile","readme":"[![Build Status](https://travis-ci.org/walaj/bxtools.svg?branch=master)](https://travis-ci.org/walaj/bxtools)\n\n## *bxtools* - Tools for analyzing 10X genomics data\n\n**License:** [MIT][license]\n\n## Note: *bxtools* is an emerging project. If you find an operation that you need that may be in the scope of *bxtools*, please submit an issue report or pull request with the suggested functionality. We are looking for community suggestions for what we might include.\n\nTable of contents\n=================\n\n  * [Installation](#installation)\n  * [Description](#description)\n  * [Components](#components)\n    * [Split](#split)\n    * [Stats](#stats)\n    * [Tile](#tile)\n    * [Relabel](#relabel)\n    * [Mol](#mol)\n    * [Convert](#convert)\n  * [Example Recipes](#examples-recipes)\n  * [Attributions](#attributions)\n\nInstallation\n------------\n\n```\ngit clone --recursive https://github.com/walaj/bxtools\ncd bxtools\n./configure\nmake \nmake install\n```\n\nDescription\n-----------\n*bxtools* is a set of light-weight command line tools for analyzing 10X genomics data. 
It is built to \nhandle low-level operations in a 10X-specific way by accounting for the BX tag in 10X data.\n\nComponents\n----------\n\n#### Split\n\nSplit a BAM file by the BX tag.\n\n```\n## split a BAM into individual BAMs (called test.\u003cbx\u003e.bam). Don't output tags with \u003c 10 reads\nbxtools split $bam -a test -m 10 \u003e counts.tsv\n\n## split a portion of a BAM\nsamtools view -h $bam 1:1,000,000-2,000,000 | bxtools split - -a test \u003e counts.tsv\n\n## just get the BX counts and sort by prevalence\nbxtools split $bam -x | sort -n -k 2,2 \u003e counts.tsv\n```\n\n#### Stats\n\nCollect BX-level statistics from a 10X BAM.\n\n```\nbxtools stats $bam \u003e stats.tsv\n## output columns: BX, read count, median insert size, median mapq, median AS.\n```\n\nTo summarize based on another tag, use `-t`. For example: `bxtools stats -t MI $bam`\n\n\n#### Tile\n\nCollect BX-level read counts on a tiled genome.\n```\n## default is 1kb tiles, across entire genome\nbxtools tile $bam \u003e counts.bed\n\n## input bed to check (e.g. chr1 only)\nsamtools view -h $bam 1:1-250,000,000 | bxtools tile - -b chr1.tiles.bed \u003e chr1.tiles.counts.bed\n```\n\n#### Relabel\nMove the BX barcodes from the ``BX`` tag (e.g. ``BX:ACTTACCGA``) to the read name (e.g. ``qname_ACTTACCGA``).\n```\nVERBOSE=-v ## print progress\nbxtools relabel $bam $VERBOSE \u003e relabeled.bam\n```\n\n#### Mol\nGet the minimum molecular footprint on the genome as a BED file for each MI tag. The \nminimum footprint spans from the minimum start position to the maximum end position of \nall reads sharing an MI tag. An error is reported if the same MI tag is detected on multiple chromosomes.\n\nThe output BED format is chr, start, end, MI, BX, read_count.\n```\nbxtools mol $bam \u003e mol_footprint.bed\n```\n\n#### Convert\nSwitch the alignment chromosome with the BX tag. This is a hack to allow a 10X BAM to be sorted and indexed by BX tag, rather than coordinate. 
\nUseful for rapid lookup of all reads from a particular BX. Note that this replaces \"-\" with \"_\" in the BX tag to make queries possible with ``samtools view``.\nConversion also requires two passes over the BAM: the first collects all of the unique BX tags to build the new BAM header, and the second makes the switches.\nThis means that streaming from ``stdin`` is not available.\n\n```\nbxtools convert $bam | samtools sort - -o bx_sorted.bam\nsamtools index bx_sorted.bam\nsamtools view bx_sorted.bam AGTCCAAGTCGGAAGT_1\n```\n\nExample recipes\n---------------\n#### Get BX-level coverage in 2kb bins across the genome, ignoring low-frequency tags\n\n```\n## make a list of bad tags (freq \u003c 100)\nsamtools view -h $bam 1:1-10,000,000 | bxtools split - -x | awk '$2 \u003c 100' | cut -f1 \u003e excluded_list.txt\n\n## get the coverage, while excluding bad tags (grep: -F literal, -f file, -v exclude)\nsamtools view -h $bam 1:1-10,000,000 | grep -v -F -f excluded_list.txt | bxtools tile - -w 2000 \u003e bxcov.bed\n```\n\nAttributions\n------------\n\nThis project is developed and maintained by Jeremiah Wala (jwala@broadinstitute.org)\n\nAnalysis suggestions and 10X support\n* Tushar Kamath - MD-PhD Student, Harvard Medical School\n* Gavin Ha - Postdoctoral Fellow, Broad Institute\n* Srinivas Viswanathan - Oncology Fellow, Dana Farber Cancer Institute\n* Chris Whelan - Computational Biologist, Broad Institute\n* Cheng-Zhong Zhang - Assistant Professor, Dana Farber Cancer Institute\n* Marcin Imielinski - Assistant Professor, Weill Cornell Medical College\n* Rameen Beroukhim - Assistant Professor, Dana Farber Cancer Institute\n* Matthew Meyerson - Professor, Dana Farber Cancer Institute\n\n[license]: 
https://github.com/walaj/bxtools/blob/master/LICENSE\n","funding_links":[],"categories":["Tools"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwalaj%2Fbxtools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwalaj%2Fbxtools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwalaj%2Fbxtools/lists"}