https://github.com/k96nb01/immunogenetr_package
An R library for HLA informatics
hla hla-typing informatics r tidyverse
- Host: GitHub
- URL: https://github.com/k96nb01/immunogenetr_package
- Owner: k96nb01
- Created: 2024-06-13T15:12:14.000Z (about 2 years ago)
- Default Branch: master
- Last Pushed: 2026-05-26T05:57:54.000Z (27 days ago)
- Last Synced: 2026-05-26T06:31:46.607Z (27 days ago)
- Topics: hla, hla-typing, informatics, r, tidyverse
- Language: R
- Homepage: https://k96nb01.github.io/immunogenetr_package/
- Size: 9.72 MB
- Stars: 6
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
Awesome Lists containing this project
- awesome-vdj - **immunogenetr**
README
---
output:
  github_document:
    # Disable pandoc's 'superscript' extension so '^' characters inside
    # GL-string cells stay as literal '^' in the rendered output instead
    # of being wrapped in HTML tags. Same reason we disable
    # 'subscript' (paranoia; GL strings don't use '~' but could in theory).
    md_extensions: -superscript-subscript
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = TRUE,
  message = FALSE, # suppress library() startup messages
  warning = FALSE # suppress dplyr summarise/grouping messages
)
# kable_hla(): wrap knitr::kable, rendering each character cell as a
# code span (backtick-wrapped). HLA GL strings contain '*' and '^', both
# of which trigger markdown formatting downstream: GitHub Flavored Markdown
# turns '*...*' into italics, and pandoc's superscript extension (enabled
# by default in pkgdown's render pipeline) turns '^...^' into <sup>...</sup>.
# Wrapping cell values in backticks bypasses both because content inside
# a code span is treated as literal text by every markdown renderer.
# This survives the github_document -> README.md -> pkgdown round-trip,
# which '\\*' / '\\^' escaping does not (pandoc drops the redundant '\\^'
# escape when superscript is disabled, leaving a bare '^' that pkgdown
# then re-interprets).
#
# Uses an explicit column loop rather than `df[] <- lapply(df, ...)` so the
# function is robust to tibble-specific [<- methods.
kable_hla <- function(df) {
  for (nm in names(df)) {
    if (is.character(df[[nm]])) {
      df[[nm]] <- paste0("`", df[[nm]], "`")
    }
  }
  knitr::kable(df)
}
```
# immunogenetr 
[Codecov test coverage](https://app.codecov.io/gh/k96nb01/immunogenetr_package)
[R-CMD-check](https://github.com/k96nb01/immunogenetr_package/actions/workflows/R-CMD-check.yaml)
immunogenetr is a comprehensive toolkit for clinical HLA informatics. It is built on tidyverse principles and uses genotype list strings (GL strings; see https://glstring.org/) to store and work with HLA genotype data.
Specific functionalities of this library include:
- **Coercion of HLA data** in tabular format to and from GL string.
- **Calculation of matching and mismatching** in all directions, with multiple output formats.
- **Automatic formatting of HLA data** for searching within a GL string.
- **Truncation of molecular HLA data** to a specific number of fields.
- **Reading HLA genotypes in HML files** and extracting the GL string.
## Table of Contents
- [Installation](#installation)
- [Usage](#usage)
- [Citation](#citation)
- [License](#license)
- [Disclaimer](#disclaimer)
## Installation
You can install immunogenetr from CRAN with the following line of code:
```{r install, eval = FALSE}
install.packages("immunogenetr")
```
## Usage
To demonstrate some of the functionality of `immunogenetr`, we will use a built-in dataset to calculate match grades for a putative recipient/donor pair.
```{r load-package}
library(immunogenetr)
library(tidyverse)
# The "HLA_typing_1" dataset is installed with immunogenetr, and contains high resolution typing at all classical
# HLA loci for ten individuals.
kable_hla(HLA_typing_1)
```
immunogenetr uses genotype list strings (GL strings) for most functions, including the matching and mismatching functions. To easily convert the genotypes found in "HLA_typing_1" to GL strings we can use the `HLA_columns_to_GLstring` function:
```{r build-gl-string}
HLA_typing_1_GLstring <- HLA_typing_1 %>%
  mutate(GL_string = HLA_columns_to_GLstring(., HLA_typing_columns = A1:DPB1_2), .after = patient) %>%
  # Note the syntax for the `HLA_columns_to_GLstring` arguments - when this function is used inside
  # of a `mutate` function to make a new column in a data frame, "." is used in the first argument
  # to tell the function to use the working data frame as the source of the HLA typing columns.
  select(patient, GL_string)
kable_hla(HLA_typing_1_GLstring)
```
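To make the structure of a GL string concrete, here is a minimal base R sketch that splits one apart. The GL string is a hypothetical two-locus example written by hand, not a row from "HLA_typing_1", and the code is only an illustration of the GL string delimiters (`^` between loci, `+` between the two gene copies at a locus); it is not how immunogenetr parses GL strings internally.

```r
# A hypothetical two-locus GL string, written by hand for illustration.
gl_string <- "HLA-A*02:01+HLA-A*30:01^HLA-B*07:02+HLA-B*08:01"

# "^" separates loci. fixed = TRUE treats the delimiter as a literal
# character rather than a regular expression.
loci <- strsplit(gl_string, "^", fixed = TRUE)[[1]]

# "+" separates the two gene copies typed at each locus.
alleles <- strsplit(loci, "+", fixed = TRUE)

# Name each list element by its locus (the text before the "*").
names(alleles) <- sub("\\*.*$", "", loci)

alleles[["HLA-A"]]
```

The result is a named list with one character vector of alleles per locus. Real GL strings can also carry `/`, `~`, and `|` delimiters for ambiguity and phasing, which this sketch does not handle.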
The "HLA_typing_1_GLstring" data frame now contains one row per individual, each with a single GL string holding that individual's full HLA genotype. Let's select one individual to act as a recipient, and one to act as a donor.
```{r recip-donor}
# Select one case each for recipient and donor.
HLA_typing_1_GLstring_recipient <- HLA_typing_1_GLstring %>%
  filter(patient == 7) %>%
  rename(GL_string_recipient = GL_string, case = patient)
HLA_typing_1_GLstring_donor <- HLA_typing_1_GLstring %>%
  filter(patient == 9) %>%
  rename(GL_string_donor = GL_string) %>%
  select(-patient)
# Combine the tables so recipient and donor are on the same row.
HLA_typing_1_recip_donor <- bind_cols(
  HLA_typing_1_GLstring_recipient,
  HLA_typing_1_GLstring_donor
)
kable_hla(HLA_typing_1_recip_donor)
```
We now have a data frame with a recipient and donor HLA genotype on one row. Let's try out some of the mismatching functions on this data.
```{r mismatches-logical}
HLA_typing_1_recip_donor_mismatches <- HLA_typing_1_recip_donor %>%
  mutate(A_MM_GvH = HLA_mismatch_logical(
    GL_string_recipient,
    GL_string_donor,
    "HLA-A",
    direction = "GvH"),
    .after = case) %>%
  mutate(A_MM_HvG = HLA_mismatch_logical(
    GL_string_recipient,
    GL_string_donor,
    "HLA-A",
    direction = "HvG"),
    .after = A_MM_GvH)
kable_hla(HLA_typing_1_recip_donor_mismatches)
```
The `HLA_mismatch_logical` function determines whether there are any mismatches at a particular locus. At the HLA-A locus there are no mismatches in the graft-versus-host (GvH) direction, but there are mismatches in the host-versus-graft (HvG) direction. We can use the `HLA_mismatched_alleles` function to tell us what those mismatches are:
```{r mismatched-alleles}
HLA_typing_1_recip_donor_mismatched_alleles <- HLA_typing_1_recip_donor %>%
  mutate(A_HvG_MMs = HLA_mismatched_alleles(
    GL_string_recipient,
    GL_string_donor,
    "HLA-A",
    direction = "HvG"),
    .after = case)
kable_hla(HLA_typing_1_recip_donor_mismatched_alleles)
```
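The two mismatch directions can be pictured as set differences: host-versus-graft mismatches are donor alleles the recipient lacks, and graft-versus-host mismatches are recipient alleles the donor lacks. The base R sketch below illustrates that idea only. The allele vectors are hypothetical, the `mismatched()` helper is made up for this example and is not an immunogenetr function, and the sketch ignores ambiguity, null alleles, and the other cases the package functions are designed to handle.

```r
# Hypothetical HLA-A typings for one recipient/donor pair
# (not taken from HLA_typing_1).
recipient <- c("HLA-A*01:01", "HLA-A*01:01")
donor     <- c("HLA-A*01:01", "HLA-A*24:02")

# Illustrative helper (not an immunogenetr function): alleles carried by
# `target` that are absent from `responder`.
mismatched <- function(responder, target) {
  setdiff(target, responder)
}

# HvG: the recipient's immune system responds to donor alleles it lacks.
hvg <- mismatched(responder = recipient, target = donor)

# GvH: the donor's immune cells respond to recipient alleles they lack.
gvh <- mismatched(responder = donor, target = recipient)

hvg          # "HLA-A*24:02"
length(gvh)  # 0
```

A homozygous recipient paired with a heterozygous donor who shares one allele gives a mismatch in the HvG direction only, which is the same pattern of asymmetry seen in the tables above.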
The `HLA_mismatched_alleles` function reported that the "HLA-A*30:01" allele was mismatched in the HvG direction. Sometimes, however, we simply want to know how many mismatches are at a particular locus. We can do that with the `HLA_mismatch_number` function:
```{r mismatch-number}
# Determine the number of bidirectional mismatches at several loci.
HLA_typing_1_recip_donor_MM_number <- HLA_typing_1_recip_donor %>%
  mutate(ABCDRB1_MM = HLA_mismatch_number(
    GL_string_recipient,
    GL_string_donor,
    c("HLA-A", "HLA-B", "HLA-C", "HLA-DRB1"),
    direction = "bidirectional"),
    .after = case)
kable_hla(HLA_typing_1_recip_donor_MM_number)
```
We might want to calculate an HLA match summary for stem cell transplantation. We can use the `HLA_match_summary_HCT` function for this:
```{r match-summary-hct}
# The match_grade argument of "Xof8" will return the number of matches at the HLA-A, B, C, and DRB1 loci.
HLA_typing_1_recip_donor_8of8_matching <- HLA_typing_1_recip_donor %>%
  mutate(ABCDRB1_matching = HLA_match_summary_HCT(
    GL_string_recipient,
    GL_string_donor,
    direction = "bidirectional",
    match_grade = "Xof8"),
    .after = case)
kable_hla(HLA_typing_1_recip_donor_8of8_matching)
```
Clearly, this recipient and donor are not a great match. Let's see how we could use this workflow to find the best-matched donor from several options. To do this, we'll choose a case from "HLA_typing_1" and compare it to all the cases in that data set:
```{r donor-comparison}
# Select one case to be the recipient.
HLA_typing_1_GLstring_candidate <- HLA_typing_1_GLstring %>%
  filter(patient == 3) %>%
  select(GL_string) %>%
  rename(GL_string_recip = GL_string)
# Join the recipient to the 10-donor list and perform matching
HLA_typing_1_GLstring_donors <- HLA_typing_1_GLstring %>%
  rename(GL_string_donor = GL_string, donor = patient) %>%
  cross_join(HLA_typing_1_GLstring_candidate) %>%
  mutate(ABCDRB1_matching = HLA_match_summary_HCT(
    GL_string_recip,
    GL_string_donor,
    direction = "bidirectional",
    match_grade = "Xof8"),
    .after = donor) %>%
  arrange(desc(ABCDRB1_matching))
kable_hla(HLA_typing_1_GLstring_donors)
```
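We can see that donor 3 is the only donor with an 8/8 match for the recipient. This is expected, because case 3 was also used as the recipient and is therefore being compared with itself.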
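Once each donor has a match score, shortlisting is ordinary data-frame filtering. The base R sketch below uses a small hypothetical table in place of the real output. The scores are invented, the threshold of 6 is arbitrary, and the sketch assumes the `ABCDRB1_matching` column is numeric.

```r
# Hypothetical match scores; in the workflow above these would come from
# the ABCDRB1_matching column.
donors <- data.frame(
  donor = 1:5,
  ABCDRB1_matching = c(4, 6, 8, 5, 6)
)

# Keep every donor tied for the highest score.
best <- donors[donors$ABCDRB1_matching == max(donors$ABCDRB1_matching), ]

# Keep donors at or above an arbitrary threshold of 6 matches, best first.
acceptable <- donors[donors$ABCDRB1_matching >= 6, ]
acceptable <- acceptable[order(-acceptable$ABCDRB1_matching), ]

best$donor
acceptable$donor
```

Filtering on the maximum rather than taking the first sorted row keeps tied donors together, which matters when several donors share the top score.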
## Citation
If you use immunogenetr in your research, please cite:
Coskun B, Brown NK. Immunogenetr: A comprehensive toolkit for clinical HLA informatics. *Human Immunology*. 2026;87(1):111619. doi:[10.1016/j.humimm.2025.111619](https://doi.org/10.1016/j.humimm.2025.111619)
You can also get the citation from R with `citation("immunogenetr")`.
## License
This project is licensed under the GNU General Public License v3.0.
## Disclaimer
This library is intended for research use. Any application making use of this package in a clinical setting will need to be independently validated according to local regulations.