https://github.com/clauswilke/omega_mutsel
Code and data for Spielman and Wilke, The relationship between dN/dS and scaled selection coefficients, Mol. Biol. Evol. 2015.
- Host: GitHub
- URL: https://github.com/clauswilke/omega_mutsel
- Owner: clauswilke
- Created: 2014-06-07T01:16:25.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2019-02-22T14:20:13.000Z (almost 6 years ago)
- Last Synced: 2024-10-14T17:29:10.355Z (3 months ago)
- Language: TeX
- Homepage:
- Size: 38.4 MB
- Stars: 1
- Watchers: 2
- Forks: 3
- Open Issues: 0
Omega_MutSel
============

Repository for "The relationship between dN/dS and scaled selection coefficients", Stephanie J. Spielman and Claus O. Wilke.

All code written by SJS (contact at [email protected]).

## Description of Contents ##

- - - -

__datasets/__
Contains tab-delimited summary files for simulated datasets. All simulated alignments available from TBD (for now, email [email protected]).
* [no_synsel.txt](./datasets/no_synsel.txt)
    * simulations with symmetric mutation rates in which all synonymous codons have the same fitness ([Figure 1A](./Manuscript/figures/MainText/dnds_variance.pdf), [Figure 2B](./Manuscript/figures/MainText/regression_convergence.pdf))
* [synsel.txt](./datasets/synsel.txt)
    * simulations with symmetric mutation rates in which synonymous codons have different fitness ([Figure 1A](./Manuscript/figures/MainText/dnds_variance.pdf), [Figure 2B](./Manuscript/figures/MainText/regression_convergence.pdf))
* [conv.txt](./datasets/conv.txt)
    * simulations to demonstrate convergence of omega to dN/dS ([Figure 2C](./Manuscript/figures/MainText/regression_convergence.pdf), [regression_convergence_raw.pdf](./Manuscript/figures/MainText/regression_convergence_raw.pdf))
* [np.txt](./datasets/np.txt)
    * simulations which use experimental NP amino acid fitness data ([Bloom 2014](http://mbe.oxfordjournals.org/content/31/8/1956)) in combination with NP mutation rates ([Bloom 2014](http://mbe.oxfordjournals.org/content/31/8/1956)) ([Figure 3](./Manuscript/figures/MainText/nyp_bias_r2.pdf), [Tables 1, S1, S2, S3](./Manuscript/figures/latex_tables.txt), and [Figure S1](./Manuscript/figures/SI/nyp_regression.pdf))
* [yeast.txt](./datasets/yeast.txt)
    * simulations which use experimental NP amino acid fitness data ([Bloom 2014](http://mbe.oxfordjournals.org/content/31/8/1956)) in combination with yeast mutation rates ([Zhu 2014](http://www.pnas.org/content/111/22/E2310)) ([Figure 3](./Manuscript/figures/MainText/nyp_bias_r2.pdf), [Tables 1, S1, S2, S3](./Manuscript/figures/latex_tables.txt), and [Figure S1](./Manuscript/figures/SI/nyp_regression.pdf))
* [polio.txt](./datasets/polio.txt)
    * simulations which use experimental NP amino acid fitness data ([Bloom 2014](http://mbe.oxfordjournals.org/content/31/8/1956)) in combination with polio virus mutation rates ([Acevedo 2014](http://www.nature.com/nature/journal/v505/n7485/full/nature12861.html)) ([Figure 3](./Manuscript/figures/MainText/nyp_bias_r2.pdf), [Tables 1, S1, S2, S3](./Manuscript/figures/latex_tables.txt), and [Figure S1](./Manuscript/figures/SI/nyp_regression.pdf))

__scripts/__
Contains scripts used in analysis. [NOTE: all simulated alignments were created using a custom sequence simulation library, [pyvolve](https://github.com/sjspielman/pyvolve). See within for details.]
* experimental_data/
    * [nucleoprotein_amino_preferences.txt](./scripts/experimental_data/nucleoprotein_amino_preferences.txt)
        * This file corresponds exactly to supplementary_file_1.xls from [Bloom 2014](http://mbe.oxfordjournals.org/content/31/8/1956). It gives amino acid preference/fitness data for each of the 498 positions in NP. Each row is a position, and each column is the preference for one amino acid (in alphabetical order).
    * [np_codon_eqfreqs.txt](./scripts/experimental_data/np_codon_eqfreqs.txt)
        * Contains codon equilibrium frequencies computed from NP preference data and NP mutation rates. Each row is a position, and values are in alphabetical codon order (first column is AAA, second column is AAC, etc.). Generated by [prefs_to_freqs.py](./scripts/np_scripts/prefs_to_freqs.py).
    * [yeast_codon_eqfreqs.txt](./scripts/experimental_data/yeast_codon_eqfreqs.txt)
        * Contains codon equilibrium frequencies computed from NP preference data and yeast mutation rates. Each row is a position, and values are in alphabetical codon order (first column is AAA, second column is AAC, etc.). Generated by [prefs_to_freqs.py](./scripts/np_scripts/prefs_to_freqs.py).
    * [polio_codon_eqfreqs.txt](./scripts/experimental_data/polio_codon_eqfreqs.txt)
        * Contains codon equilibrium frequencies computed from NP preference data and polio mutation rates. Each row is a position, and values are in alphabetical codon order (first column is AAA, second column is AAC, etc.). Generated by [prefs_to_freqs.py](./scripts/np_scripts/prefs_to_freqs.py).
* np_scripts/
    * [prefs_to_freqs.py](./scripts/np_scripts/prefs_to_freqs.py)
        * Computes equilibrium codon frequencies for a variety of frequency parameterizations from experimental NP amino acid fitness data in combination with either NP, yeast, or polio mutation rates. See script for full description.
    * [globalDNDS_raw_exp.bf](./hyphy_files/globalDNDS_raw_exp.bf)
        * Template batchfile used by [prefs_to_freqs.py](./scripts/np_scripts/prefs_to_freqs.py) to create [globalDNDS_{np/yeast/polio}.bf](https://github.com/clauswilke/Omega_MutSel/tree/master/hyphy_files) files.
* simulation_scripts/
    Scripts in this directory were created to run specifically on The University of Texas at Austin's Center for Computational Biology and Bioinformatics cluster, [Phylocluster](http://ccbb.biosci.utexas.edu/resources.html). All files with the extension ".qsub" are job submission scripts corresponding to a particular Python script, such that *xyz*.qsub goes with run_*xyz*.py.
    * [run_sim_nyp.py](./scripts/simulation_scripts/run_sim_nyp.py)
        * simulate alignments which use NP amino acid fitness data and either NP, yeast, or polio mutation rates
    * [run_nyp.py](./scripts/simulation_scripts/run_nyp.py)
        * infer dN/dS and omega for the NP, yeast, or polio datasets
    * [run_siminf.py](./scripts/simulation_scripts/run_siminf.py)
        * simulate alignments and subsequently infer dN/dS and omega for the "synonymous selection" and "no synonymous selection" sets
    * [run_convergence.py](./scripts/simulation_scripts/run_convergence.py)
        * simulate alignments and infer dN/dS and omega to demonstrate omega convergence with data sets of increasing size
    * [functions_omega_mutsel.py](./scripts/simulation_scripts/functions_omega_mutsel.py)
        * Contains functions used by scripts in this directory.

__hyphy_files/__
Contains files used in HYPHY inference.
* [globalDNDS_fequal.bf](./hyphy_files/globalDNDS_fequal.bf)
    * hyphy batchfile to infer omega according to the GY94 M0 model with the Fequal (1/61 for all codons) frequency parameterization. Used to determine omega for [no_synsel.txt](./datasets/no_synsel.txt), [synsel.txt](./datasets/synsel.txt), and [conv.txt](./datasets/conv.txt).
* [globalDNDS_np.bf](./hyphy_files/globalDNDS_np.bf)
    * hyphy batchfile to infer omega for simulations with experimental NP amino acid fitness data and NP mutation rates, according to a variety of frequency parameterizations. Generated by [prefs_to_freqs.py](./scripts/np_scripts/prefs_to_freqs.py).
* [globalDNDS_yeast.bf](./hyphy_files/globalDNDS_yeast.bf)
    * hyphy batchfile to infer omega for simulations with experimental NP amino acid fitness data and yeast mutation rates, according to a variety of frequency parameterizations. Generated by [prefs_to_freqs.py](./scripts/np_scripts/prefs_to_freqs.py).
* [globalDNDS_polio.bf](./hyphy_files/globalDNDS_polio.bf)
    * hyphy batchfile to infer omega for simulations with experimental NP amino acid fitness data and polio mutation rates, according to a variety of frequency parameterizations. Generated by [prefs_to_freqs.py](./scripts/np_scripts/prefs_to_freqs.py).
* [CF3x4.bf](./hyphy_files/CF3x4.bf)
    * hyphy batchfile used in conjunction with [globalDNDS_{np/yeast/polio}.bf](https://github.com/clauswilke/Omega_MutSel/tree/master/hyphy_files) files to compute CF3x4 equilibrium codon frequencies.
* [GY94.mdl](./hyphy_files/GY94.mdl)
    * contains the standard GY94 rate matrix
* [MG_np.mdl](./hyphy_files/MG_np.mdl)
    * contains MG1 and MG3 matrices for NP mutation rates
* [MG_yeast.mdl](./hyphy_files/MG_yeast.mdl)
    * contains MG1 and MG3 matrices for yeast mutation rates
* [MG_polio.mdl](./hyphy_files/MG_polio.mdl)
    * contains MG1 and MG3 matrices for polio mutation rates
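
__Example: codon ordering and Fequal frequencies__

The codon conventions mentioned above (alphabetical codon order starting with AAA, AAC, and the Fequal parameterization that assigns 1/61 to every codon) can be illustrated with a short Python sketch. This is not code from the repository. The function names are hypothetical, and the sketch assumes the standard genetic code with stop codons excluded, one site per row, and whitespace-delimited values in the `*_codon_eqfreqs.txt` files.

```python
from itertools import product

# Stop codons of the standard genetic code. Assumption: they are excluded,
# which is consistent with the 61-codon Fequal parameterization.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def sense_codons():
    """Return the 61 sense codons in alphabetical order (AAA, AAC, ...)."""
    all_codons = ("".join(c) for c in product("ACGT", repeat=3))
    return [c for c in all_codons if c not in STOP_CODONS]


def fequal_frequencies():
    """Fequal parameterization: every sense codon gets frequency 1/61."""
    codons = sense_codons()
    return {codon: 1.0 / len(codons) for codon in codons}


def label_frequency_row(line):
    """Pair one row of codon frequencies with codon names.

    Hypothetical helper: assumes whitespace-delimited values, one site per
    row, and one value per sense codon in alphabetical order.
    """
    values = [float(x) for x in line.split()]
    codons = sense_codons()
    if len(values) != len(codons):
        raise ValueError(f"expected {len(codons)} values, got {len(values)}")
    return dict(zip(codons, values))
```

If the files turn out to use a different delimiter or include a header row, `label_frequency_row` would need adjusting; the length check is there to surface such a mismatch early rather than silently mislabel columns.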