Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/etowahadams/mafs2vcf
mafs2vcf is a Python command line interface for converting ANGSD generated .mafs files to a variant call format file (.vcf).
- Host: GitHub
- URL: https://github.com/etowahadams/mafs2vcf
- Owner: etowahadams
- Created: 2020-09-20T05:10:08.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2020-10-11T12:56:52.000Z (about 4 years ago)
- Last Synced: 2024-10-18T17:26:29.745Z (3 months ago)
- Language: Python
- Homepage:
- Size: 25.4 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# mafs2vcf
mafs2vcf is a Python command line interface for converting
[ANGSD generated .mafs files](http://www.popgen.dk/angsd/index.php/Allele_Frequencies) to a
[variant call format](https://en.wikipedia.org/wiki/Variant_Call_Format) file (.vcf).

## Installation
```
$ git clone https://github.com/etowahs/mafs2vcf.git
$ cd mafs2vcf
$ pip3 install .
$ mafs2vcf --help
```

## Usage
To convert a mafs file to vcf, mafs2vcf requires a mafs file for the target species and one for the divergent species. If you
also have a mafs file for an ancestral species, it can be included as an optional argument with the flag `-a`.
```
$ mafs2vcf --t data/target.mafs --d data/divergent.mafs --a data/ancestral.mafs
```
The resulting vcf file will have four individuals: two for the target (SAMP1, SAMP2), one for the divergent (DIV1),
and one for the ancestral (ANC1). The target mafs is split into two individuals because this is useful for downstream processing
in the [massprf-pipeline](https://github.com/sjgaughran/massprf-pipeline).
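
As an illustration of the sample layout described above, the following minimal sketch builds the column header line of such a vcf file. The nine fixed columns come from the VCF format itself; the function name `vcf_column_header` is hypothetical, and dropping ANC1 when no ancestral file is given is an assumption, not confirmed behavior of mafs2vcf.

```python
# The nine fixed columns that precede the sample columns in a VCF header line.
VCF_FIXED_COLUMNS = [
    "#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT",
]


def vcf_column_header(include_ancestral: bool = True) -> str:
    """Return a tab-separated VCF column header with the README's sample names.

    Hypothetical helper for illustration only; omitting ANC1 when there is
    no ancestral mafs file is an assumption about the tool's output.
    """
    samples = ["SAMP1", "SAMP2", "DIV1"]  # two target samples, one divergent
    if include_ancestral:
        samples.append("ANC1")  # ancestral sample
    return "\t".join(VCF_FIXED_COLUMNS + samples)


print(vcf_column_header())
```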
If the knownEM value of a site (ANGSD's EM-based allele frequency estimate) is under 0.99, we assume that the site is polymorphic and the sample gets a 0/1 in the vcf file.
If the knownEM is >= 0.99, we assume that the site is fixed and the sample gets a 1/1 in the vcf file.
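
The threshold rule can be sketched in a few lines of Python. This is a minimal illustration and not the actual mafs2vcf source: the function names and the tiny in-memory example are hypothetical, and the sketch assumes a tab-separated mafs file with `chromo`, `position`, and `knownEM` columns.

```python
import csv
import io

FIXED_THRESHOLD = 0.99  # cutoff stated in the README


def genotype_from_knownem(known_em: float, threshold: float = FIXED_THRESHOLD) -> str:
    """Map a knownEM frequency to a genotype string (hypothetical helper)."""
    # At or above the threshold the site is treated as fixed, otherwise polymorphic.
    return "1/1" if known_em >= threshold else "0/1"


def genotypes_from_mafs(handle):
    """Read tab-separated mafs rows and return (chromo, position, genotype) tuples."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        (row["chromo"], int(row["position"]), genotype_from_knownem(float(row["knownEM"])))
        for row in reader
    ]


# Made-up example data, assuming the column layout described above.
EXAMPLE_MAFS = (
    "chromo\tposition\tmajor\tminor\tknownEM\tnInd\n"
    "chr1\t101\tA\tG\t0.250000\t10\n"
    "chr1\t202\tC\tT\t0.995000\t10\n"
)

print(genotypes_from_mafs(io.StringIO(EXAMPLE_MAFS)))
# [('chr1', 101, '0/1'), ('chr1', 202, '1/1')]
```

ANGSD usually writes gzip-compressed output, so a real file would likely be opened with `gzip.open(path, "rt")` before being passed to `genotypes_from_mafs`.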