Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/m-osky/vcfiles
VCF: Variant Call Format files are difficult to understand at first, I already had to dive into them in order to learn how to edit them, so you can use my scripts. Some of these functions already exist in vcftools, but it seems it stopped being updated. so you can use these scripts to do various tasks with them
https://github.com/m-osky/vcfiles
genomics vcf vcf-files vcftools
Last synced: 18 days ago
JSON representation
VCF: Variant Call Format files are difficult to understand at first, I already had to dive into them in order to learn how to edit them, so you can use my scripts. Some of these functions already exist in vcftools, but it seems it stopped being updated. so you can use these scripts to do various tasks with them
- Host: GitHub
- URL: https://github.com/m-osky/vcfiles
- Owner: M-Osky
- Created: 2021-09-02T08:51:04.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2021-09-02T09:43:40.000Z (over 3 years ago)
- Last Synced: 2023-12-04T14:57:58.663Z (about 1 year ago)
- Topics: genomics, vcf, vcf-files, vcftools
- Language: Perl
- Homepage:
- Size: 32.2 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# VCFiles
Variant Call Format files are difficult to understand at first, I already had to dive into them in order to learn how to edit them, so you can use my scripts. Some of these functions already exist in vcftools, but it seems it stopped being updated.
so you can use these scripts to do various tasks with themAll scripts have their own HELP information, just call them with the usual flags: -h --help
# vcf_pop_subseter
outputs a new vcf file with only samples from the specified populations# vcf_loci_subseter
outputs a new vcf file with only the specified loci or with a number n of random loci# vcf_locinames
extracts the locus IDs from a vcf file# vcf_random_resampler
outputs a new vcf file with a number n of samples per population, useful if there are differences in your sample size per population, samples per population are pulled randomly.# coverage_vcfilter
this will analyse mean coverage/depth (DP) per locus.
Contrary to vcftools you can choose how to handle missing depth information: genotype with missing DP can ither be ignored when computing the average or otherwise they can be considered as DP=0. Caution: vcftools seem to use both this criteria, for the table generated it ignores missing, but when filtering out loci according to coverage it seems to consider missing as zero.# vcf_joiner
joins multiple vcf files in one file. Careful if they have different loci.