https://github.com/mschubert/pepitope



pepitope: extract, qc and screen *pep*tide ep*itope*s
=====================================================

Peptide-TCR co-culture screens support the development of personalized
immunotherapy by revealing the specific reactivity of patient-derived T cell
receptors to patient-specific (neo-)antigens. Here, we provide a software tool
to assist in creating patient-specific antigen libraries and in analyzing
co-culture screening data.

This R package is used to:

* [**Extract peptides**](#generating-minigene-constructs) with flanking regions around mutations
* Run [**quality control**](#performing-quality-control-on-construct-library-sequencing) on sequencing of these libraries
* Perform [**differential abundance**](#differential-abundance-testing-of-co-culture-screens) testing of co-culture screens

Installation
------------

The package is currently only available on GitHub. Use the `remotes` package to
install it:

```r
# run this in R >= 4.5.0
# note that we need Rust, Cargo, and cmake to compile the dependencies
if (!requireNamespace("remotes", quietly=TRUE))
    install.packages("remotes")
remotes::install_github("mschubert/pepitope", dependencies=TRUE, timeout=300)
```
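
Since the dependencies need Rust, Cargo, and cmake to compile, it can help to confirm that these tools are visible from R before starting the install. The following is a minimal base R sketch and not part of the package. It assumes the executables are named `rustc`, `cargo` and `cmake` on your system.

```r
# Check that the build tools are on the PATH as seen by R.
# Assumption: the executables are called rustc, cargo and cmake.
tools = c("rustc", "cargo", "cmake")

# Sys.which() returns an empty string for tools it cannot find
found = nzchar(Sys.which(tools))
names(found) = tools
print(found)

if (!all(found))
    message("Missing build tools: ", paste(tools[!found], collapse=", "))
```

If a tool is reported as missing, install it with your system package manager and restart R so that the updated `PATH` is picked up.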

Usage
-----

### Generating minigene constructs

Here we have sequenced the DNA (and optionally RNA) of a patient and
identified the variants in a `.vcf` file. We now want to extract the
reference and mutated alternative sequences including their flanking
regions into a summary report. The steps are:

* Load a genome and annotation, usually GRCh38 and Ensembl
* Load a VCF variants file as a `VRanges` object and annotate the protein-coding mutations
* Optionally, load a fusion VCF and annotate the fusions
* Subset the peptide context around each mutation
* Make a report of variants, coding changes, and tiled peptides

More information can be found in the [*Minigene report* vignette 🔗](https://mschubert.github.io/pepitope/articles/minigene.html).

Code example

```r
library(pepitope)

# genome and annotation
ens106 = AnnotationHub::AnnotationHub()[["AH100643"]]
asm = BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38
seqlevelsStyle(ens106) = "UCSC"

# read variants from VCF file, apply filters and annotate
variant_vcf_file = system.file("my_variants.vcf", package="pepitope")
vr = readVcfAsVRanges(variant_vcf_file) |>
    filter_variants(min_cov=2, min_af=0.05, pass=TRUE)
ann = annotate_coding(vr, ens106, asm)
subs = ann |>
    # filter_expressed(rna_sample, min_reads=1, min_tpm=0) |>
    subset_context(15)

# read fusion variants, apply filters and annotate
fusion_vcf_file = system.file("my_fusions.vcf", package="pepitope")
vr2 = readVcfAsVRanges(fusion_vcf_file) |>
    filter_fusions(min_reads=2, min_split_reads=1, min_tools=1)
seqlevelsStyle(vr2) = "UCSC"
fus = annotate_fusions(vr2, ens106, asm) |>
    subset_context_fusion(15)

# create construct tables and make a report
tiled = make_peptides(subs, fus) |>
    pep_tile() |>
    remove_cutsite(BbsI="GAAGAC")

report = make_report(ann, subs, fus, tiled)
writexl::write_xlsx(report, "my_variants.xlsx")
```
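
To make the ideas of "peptide context" and "tiling" more concrete, here is a toy sketch in base R. It is not how the package implements these steps: `peptide_context()` and `tile_windows()` are hypothetical helpers, the protein sequence and the mutation are made up, and the sketch works on amino acids only.

```r
# Hypothetical helper: keep `ctx` residues on each side of position `pos`
peptide_context = function(protein, pos, ctx=15) {
    first = max(1, pos - ctx)
    last = min(nchar(protein), pos + ctx)
    substr(protein, first, last)
}

# Hypothetical helper: cut a sequence into overlapping windows of equal width
tile_windows = function(x, width, step) {
    n = nchar(x)
    if (n <= width)
        return(x)
    starts = seq(1, n - width + 1, by=step)
    # add a final window if the regular steps do not reach the end
    if (tail(starts, 1) != n - width + 1)
        starts = c(starts, n - width + 1)
    substring(x, starts, starts + width - 1)
}

# made-up 60 residue protein with a made-up substitution at position 30
ref = paste(rep("ACDEFGHIKLMNPQRSTVWY", 3), collapse="")
alt = ref
substr(alt, 30, 30) = "W"

ref_ctx = peptide_context(ref, 30, ctx=15)
alt_ctx = peptide_context(alt, 30, ctx=15)

# windows of 15 residues that overlap by 7
tiles = tile_windows(alt_ctx, width=15, step=8)
print(tiles)
```

With 15 residues of context on each side, the mutated residue sits in the middle of a 31 residue window, so each tile sees a different part of its neighborhood.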

### Creating construct library (wetlab)

We want to express the sequences (minigenes) including their flanking regions
(context) in target cells that will be used in a co-culture screen with T-cells.
For this, we first need to add a barcode to each construct and then order
them as gene blocks and transduce them into the target cells. The steps are:

* Add barcodes in the annotation sheets as `barcode` or `barcode_{1,2}` columns
* Order these constructs as gene blocks and clone them into expression vectors
* Transduce target cells with this peptide construct library

Code example (runnable without external data)

```r
# creating barcoded constructs
lib = "https://raw.githubusercontent.com/hawkjo/freebarcodes/master/barcodes/barcodes12-1.txt"
valid_barcodes = readr::read_tsv(lib, col_names=FALSE)$X1
all_constructs = example_peptides(valid_barcodes)
plot_barcode_overlap(all_constructs, valid_barcodes)
```

Code example (loading from .xlsx)

```r
# this file is manually created from the output of step 1
fname = "my_combined_barcoded_file.xlsx"
sheets = readxl::excel_sheets(fname)
all_constructs = sapply(sheets, readxl::read_xlsx, path=fname, simplify=FALSE)
plot_barcode_overlap(all_constructs, valid_barcodes)
```
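
The kind of consistency check that matters at this stage can be sketched in base R: is every assigned barcode part of the valid set, and is any barcode used more than once? This sketch does not reproduce what `plot_barcode_overlap` does. The sheet names, peptide IDs and barcodes below are made up, and the `pep_id` column name is an assumption.

```r
# Toy inputs: barcodes, sheet names and peptide IDs are made up
valid = c("AACCGGTTAACC", "ACGTACGTACGT", "TTGGCCAATTGG", "GATCGATCGATC")
constructs = list(
    patient1 = data.frame(pep_id=c("p1_ref", "p1_alt"),
                          barcode=c("AACCGGTTAACC", "ACGTACGTACGT")),
    patient2 = data.frame(pep_id=c("p2_ref", "p2_alt"),
                          barcode=c("ACGTACGTACGT", "CCCCCCCCCCCC"))
)

# one long table with the sheet each barcode was used in
used = do.call(rbind, lapply(names(constructs), function(n)
    data.frame(sheet=n, barcode=constructs[[n]]$barcode)))

# barcodes that are not part of the valid set
not_valid = setdiff(used$barcode, valid)

# barcodes that were assigned more than once
shared = unique(used$barcode[duplicated(used$barcode)])

print(not_valid)
print(used[used$barcode %in% shared, ])
```

Whether a shared barcode is a problem depends on the design: it is harmless if the two libraries are never pooled, but ambiguous if they are sequenced together.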

### Performing quality control on construct library sequencing

In each step of generating the target cells expressing the reference and mutated
versions of each peptide, we want to make sure our library is well-represented.
For this, we check that all expected constructs are present and that their
abundances are similar enough. The steps are:

* Check the quality of the construct libraries
* Check the quality of the transduced target cells
* Check the quality of the co-culture screens

More information can be found in the [*Quality Control* vignette 🔗](https://mschubert.github.io/pepitope/articles/qc.html).

Code example

```r
# demultiplexing and counting example data
sample_sheet = system.file("my_samples.tsv", package="pepitope")
fastq_file = example_fastq(sample_sheet, all_constructs)
temp_dir = demux_fq(fastq_file, sample_sheet, read_structures="7B+T")
dset = count_bc(temp_dir, all_constructs, valid_barcodes)

# quality control plots
plot_reads(dset)
plot_distr(dset)
```
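
What "well-represented" means can be illustrated with a few summary numbers computed from barcode counts. This is a base R sketch on made-up counts, not the output of `count_bc`. The 10% of median cutoff and the 90th/10th percentile ratio are arbitrary choices for illustration.

```r
# Toy barcode counts for one sample; the values are made up
counts = c(bc01=520, bc02=480, bc03=610, bc04=0,
           bc05=35, bc06=450, bc07=505, bc08=540)

# fraction of expected constructs that were seen at all
detected = mean(counts > 0)

# spread of the detected constructs: ratio of 90th to 10th percentile
q = quantile(counts[counts > 0], c(0.1, 0.9))
skew = unname(q[2] / q[1])

# constructs with less than 10% of the median count (arbitrary cutoff)
low = names(counts)[counts < 0.1 * median(counts)]

print(detected)
print(skew)
print(low)
```

A library with all constructs detected and a small spread between the most and least abundant ones is easier to interpret in the later screen, because depletion is then not confounded with low starting abundance.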

### Differential abundance testing of co-culture screens

Finally, we co-culture the target cells expressing our peptide libraries with
T-cells expressing a variety of TCRs to find the reactive ones. Reactive
peptides are visible because their abundance decreases more than that of the
reference peptides, compared to a mock-transduced population that was cultured
the same way. The steps are:

* Calculate the differential abundance of peptide barcodes
* Plot the results to identify peptides recognized by T-cells

More information can be found in the [*Co-culture screen* vignette 🔗](https://mschubert.github.io/pepitope/articles/screen.html).

Code example

```r
# perform abundance testing and plot results
res = screen_calc(dset, list(c("Sample", "Mock")))
plot_screen(res$`Sample vs Mock`)
```
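
The reasoning behind the comparison can be sketched with a plain log fold change on made-up counts. This is not the statistical model used by `screen_calc`, which is not described here. The peptide names, the counts, the pseudocount of 1 and the cutoff of -1 are all assumptions for illustration.

```r
# Toy barcode counts; peptide names and values are made up
counts = cbind(
    Mock   = c(pepA_ref=500, pepA_alt=500, pepB_ref=400, pepB_alt=420),
    Sample = c(pepA_ref=480, pepA_alt=120, pepB_ref=410, pepB_alt=400)
)

# normalize each column to counts per million to correct for read depth
cpm = t(t(counts) / colSums(counts)) * 1e6

# log2 fold change of Sample over Mock, with a pseudocount of 1
lfc = log2((cpm[, "Sample"] + 1) / (cpm[, "Mock"] + 1))

# peptides that lose more than half of their relative abundance
depleted = names(lfc)[lfc < -1]

print(round(lfc, 2))
print(depleted)
```

Here only `pepA_alt` drops strongly while its reference `pepA_ref` does not, which is the pattern expected for a mutated peptide that is recognized by a T-cell. A real analysis needs replicates and a proper count model to separate such effects from noise.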