{"id":13407924,"url":"https://github.com/kehrlab/bcctools","last_synced_at":"2025-03-14T12:31:51.044Z","repository":{"id":103059606,"uuid":"147702447","full_name":"kehrlab/bcctools","owner":"kehrlab","description":"Correcting barcodes in 10X linked-read sequencing data.","archived":false,"fork":false,"pushed_at":"2024-04-12T10:50:51.000Z","size":57,"stargazers_count":4,"open_issues_count":1,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-07-31T20:28:38.837Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kehrlab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-09-06T16:31:02.000Z","updated_at":"2024-04-08T06:34:45.000Z","dependencies_parsed_at":null,"dependency_job_id":"16650881-cfbd-443b-b18e-a579b277fc81","html_url":"https://github.com/kehrlab/bcctools","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kehrlab%2Fbcctools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kehrlab%2Fbcctools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kehrlab%2Fbcctools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kehrlab%2Fbcctools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kehrlab","download_url":"https://codeload.github.com/kehrlab/bcctools/tar.gz/refs/heads/master
","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243578276,"owners_count":20313794,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-30T20:00:49.513Z","updated_at":"2025-03-14T12:31:50.610Z","avatar_url":"https://github.com/kehrlab.png","language":"C++","readme":"bcctools\n=======\n\nA toolbox for correcting barcodes in 10X linked-read sequencing data.\n\n\n\n\nPrerequisites\n-------------\n\n* GCC version \u003e= 4.9 (supports C++14)\n* SeqAn core library, version 2.3.1? (https://github.com/seqan/seqan)\n* SDSL - Succinct Data Structure Library (https://github.com/simongog/sdsl-lite)\n* kseq.h from HTSlib (https://github.com/samtools/htslib)\n\n\n\n\nInstallation\n------------\n\n1. Download the SeqAn core library. You do not need to follow the SeqAn install instructions. You only need the directory .../include/seqan with all its content (the SeqAn core library).\n2. Download and install the SDSL.\n3. Download HTSlib or just put the kseq.h header file into a folder named htslib.\n4. Edit lines 14-17 in the Makefile to point to the directories of SeqAn, SDSL and HTSlib.\n5. 
Run 'make' in the bcctools directory.\n\nIf everything is set up correctly, this will create the binary 'bcctools'.\n\n\n\n\nUsage\n-----\n\nThe only input needed for barcode correction is a pair of barcoded FASTQ files generated on the 10X Chromium platform.\nOptionally, you can specify a barcode whitelist file.\n\nThe program consists of several commands, which are listed when running\n\n    ./bcctools --help\n\nFor a short description of each command and an overview of arguments and options, you can run\n\n    ./bcctools \u003cCOMMAND\u003e --help\n\nIf you need the output to be sorted and/or converted to SAM, BAM, or (gzipped) FASTQ format, you can run the provided bash script. For a short description of options and arguments of this script, run\n\n    ./scripts/run_bcctools -h\n\n### The whitelist command\n\n    ./bcctools whitelist [OPTIONS] \u003cFASTQ 1 file\u003e\n\nCreates a barcode whitelist based on barcode occurrence in the data.\nCreating a whitelist from your data (rather than using the 10X whitelist) is recommended because it reduces the number of alternatives during correction and prevents false corrections.\n\n### The index command\n\n    ./bcctools index [OPTIONS] \u003cwhitelist file\u003e\n\nCreates a barcode index from the given barcode whitelist and writes it to disk. This command is optional because the index can be created on the fly in the 'correct' command.\n\n### The correct command\n\n    ./bcctools correct [OPTIONS] \u003cwhitelist file\u003e \u003cFASTQ 1 file\u003e \u003cFASTQ 2 file\u003e\n\nCorrects barcodes of the given barcoded read pair data using the specified barcode whitelist. A barcode index is computed on the fly unless index files are present for the specified barcode whitelist. 
The output is a tab-separated file holding one read pair per line, as described below.\n\n### The stats command\n\n    ./bcctools stats [OPTIONS] \u003cCorrected (gzipped) FASTQ 1 file\u003e\n    ./bcctools stats [OPTIONS] \u003cCorrected SAM/BAM file\u003e\n    ./bcctools stats [OPTIONS] \u003cCorrected TSV file\u003e\n\nComputes the number of read pairs with whitelisted, corrected, and unrecognized barcodes and a barcode occurrence histogram, and counts quality values of corrected barcode positions.\n\n\n\n\nExample\n-------\n\n    mkdir bcctools_example \u0026\u0026 cd bcctools_example/\n    ln -s /path/to/first.fq.gz\n    ln -s /path/to/second.fq.gz\n\n    ./bcctools whitelist -o whitelist.txt first.fq.gz\n    ./bcctools correct whitelist.txt first.fq.gz second.fq.gz \u003e corrected.tsv\n\nUsing the bash script to create a BAM file sorted by the corrected barcode sequence:\n\n    ./scripts/run_bcctools -f bam first.fq.gz second.fq.gz\n\n\nOutput format\n-------------\n\nThe correct command writes a simple tab-separated format in which each read pair and its barcode information are given on a single line.\nThe fields are as follows:\n\nField | Description\n--- | ---\nREAD NAME | The read or query name taken from the FASTQ file and cropped at the first whitespace. \nCORRECTED BARCODE | A comma-separated list of possible barcode corrections. If the raw barcode is whitelisted, the value of this field is identical to the RAW BARCODE field. 
An asterisk '*' indicates that the barcode is not whitelisted and correction was unsuccessful.\nRAW BARCODE | The first 16 base pairs of the first read in the read pair.\n7-MER SPACER | The seven base pairs following the first 16 base pairs of the first read in the read pair.\nTRIMMED FIRST READ | The remaining base pairs of the first read in the read pair after trimming the barcode and 7-mer spacer sequence.\nSECOND READ | The second read sequence.\nBARCODE QUALITY STRING | The first 16 values of the quality string of the first read in the read pair.\n7-MER SPACER QUALITY STRING | The seven values following the first 16 values of the quality string of the first read in the read pair.\nTRIMMED FIRST READ QUALITY STRING | The remaining quality string after trimming the barcode and 7-mer spacer quality values.\nSECOND READ QUALITY STRING | The quality string of the second read in the read pair.\n\n\nContact\n-------\n\nFor questions and comments, contact birte.kehr [at] ukr.de or create an issue.\n","funding_links":[],"categories":["Tools"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkehrlab%2Fbcctools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkehrlab%2Fbcctools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkehrlab%2Fbcctools/lists"}