https://github.com/pwilmart/yeast_carbonsources

Yeast TMT data - 3 different carbon sources (from Gygi lab) analyzed with PAW pipeline and MaxQuant
https://github.com/pwilmart/yeast_carbonsources

jupyter-notebook paw-pipeline proteomics r statistical-testing tmt tmt-data-analyses

Last synced: 4 months ago
JSON representation

Yeast TMT data - 3 different carbon sources (from Gygi lab) analyzed with PAW pipeline and MaxQuant

Host: GitHub
URL: https://github.com/pwilmart/yeast_carbonsources
Owner: pwilmart
License: mit
Created: 2019-01-30T01:43:26.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2023-03-27T17:34:34.000Z (over 2 years ago)
Last Synced: 2025-01-12T18:33:12.608Z (6 months ago)
Topics: jupyter-notebook, paw-pipeline, proteomics, r, statistical-testing, tmt, tmt-data-analyses
Language: Jupyter Notebook
Homepage:
Size: 40.9 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

# Yeast_CarbonSources
Yeast grown with galactose, glucose, or raffinose carbon sources from the Gygi lab labeled with 10-plex TMT reagents, in triplicate.

These are analyses of a public dataset (PRIDE [PXD002875](https://www.ebi.ac.uk/pride/archive/projects/PXD002875)) from Paulo, O'Connell, Gaun, and Gygi:

> Paulo, J.A., O’Connell, J.D., Gaun, A. and Gygi, S.P., 2015. Proteome-wide quantitative multiplexed profiling of protein expression: carbon-source dependency in Saccharomyces cerevisiae. Molecular biology of the cell, 26(22), pp.4063-4074.

There were 24 RAW files of yeast grown in three different carbon sources. It was a 3x3 (9-plex) TMT experiment done with the SPS MS3 (MultiNotch) method on a Thermo Fusion instrument.

Analyses:
- [PAW pipeline](https://github.com/pwilmart/PAW_pipeline.git)
- [CarbonSources_part-1](https://pwilmart.github.io/TMT_analysis_examples/CarbonSources_part-1.html)
- CarbonSources_part-2
- MaxQuant
- [CarbonSources_MQ](https://pwilmart.github.io/TMT_analysis_examples/CarbonSources_MQ.html)

---
## `PAW` folder contents

**File types:**
- `*.ipynb` - Jupyter notebooks

- `*.r` - code cells from notebooks

- `*.html` - notebooks rendered in html

- `results_files` folder:
- `*.log` - console output log files from pipeline steps
- `*.txt` - tab-delimited text results_files
- protein summaries
- peptide summaries
- `*.xlsx` - Excel files
- `R-input.txt` - prepped table of TMT data for importing into r
- `CarbonSources_results.txt` - statistical testing results from r

---

## `MQ` folder contents

**Files:**
- CarbonSources_MQ.ipynb - Jupyter notebook for statistical analysis
- CarbonSources_MQ.html - html rendering of notebook
- CarbonSources_MQ.r - code cells from notebook
- CarbonSources_results.txt - statistical testing results
- parameters.txt - summary of MQ parameter settings
- proteinGroups.txt - main protein-level results file from MQ
- proteinGroups.xlsx - Modified Excel file (for table prepping)
- R-input.txt - prepped table for import into r
- summary.txt - summary file from MQ (LC run stats)

---

## R input table prep

Basic steps were similar for both pipelines:
- flag proteins to exclude
- common contaminants
- decoys
- proteins with no reporter ion signals
- sort excluded proteins to bottom of table
- make new tab
- add column of protein accessions
- add columns of the TMT channels
- export the new tab contents to text files
- table should be well-formed (single header line) and rectangular
- read text file into R

Statistical test results in R are collected into a data frame in the same order as the imported proteins. At the end of the notebook, the results file is saved as a text file for adding back to the main protein results spreadsheet file. The accessions are also included to make sure that the rows are correctly aligned.

Eventually, there needs to be a coherent, comprehensive summary file that contains the proteomics results, the statistical testing results, and any other information to aid biological interpretation (rich annotations, etc.). This will be needed for publication and is a nice thing to include in data repositories. An Excel file is a good format for this since adding descriptive text and formatting are easy. A basic Excel sheet can be easily distributed in Supplemental files and opened in Open Office applications.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/pwilmart/yeast_carbonsources

Awesome Lists containing this project

README