Prediction of HLA genotypes from single-cell transcriptome data.

Publication Title

Front Immunol


washington; seattle; renton; swedish; isb; system; genomics; HLA genotype; HLA typing algorithm; allele specific expression; human leukocyte antigen (HLA); major histocompatibility (MHC); next-generation sequencing data (NGS); single-cell sequencing (scRNA-seq); Humans; Transcriptome; Sequence Analysis, DNA; HLA Antigens; Histocompatibility Antigens Class I; Genotype; Histocompatibility Antigens Class II


The human leukocyte antigen (HLA) locus plays a central role in adaptive immune function and has significant clinical implications for tissue transplant compatibility and allelic disease associations. Studies using bulk-cell RNA sequencing have demonstrated that HLA transcription may be regulated in an allele-specific manner, and single-cell RNA sequencing (scRNA-seq) has the potential to better characterize these expression patterns. However, because HLA loci are extensively polymorphic, quantifying their allele-specific expression (ASE) requires a sample-specific reference genotype. While genotype prediction from bulk RNA sequencing is well described, the feasibility of predicting HLA genotypes directly from single-cell data is unknown. Here we evaluate and expand upon several computational HLA genotyping tools by comparing predictions from human single-cell data to gold-standard molecular genotyping. The highest 2-field accuracy averaged across all loci was 76%, achieved by arcasHLA, and increased to 86% using a composite model of multiple genotyping tools. We also developed a highly accurate model (AUC 0.93) for predicting
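To make the "2-field accuracy" metric concrete, the following is a minimal sketch of how predicted alleles might be scored against a gold-standard genotype at 2-field resolution (for example, `A*02:01:01:01` truncated to `A*02:01`). The abstract does not describe the exact scoring procedure, so the function names (`to_two_field`, `two_field_accuracy`), the input layout, and the order-independent matching rule are illustrative assumptions, not the authors' method.

```python
from collections import Counter


def to_two_field(allele: str) -> str:
    """Truncate an HLA allele name to its first two fields, e.g. 'A*02:01:01:01' -> 'A*02:01'."""
    gene, _, fields = allele.partition("*")
    return gene + "*" + ":".join(fields.split(":")[:2])


def two_field_accuracy(predicted: dict, truth: dict) -> float:
    """Fraction of gold-standard alleles recovered at 2-field resolution.

    Assumed (hypothetical) input layout: each dict maps a locus name to the
    alleles called at that locus. Allele order within a locus is ignored.
    """
    matched = 0
    total = 0
    for locus, true_alleles in truth.items():
        true_counts = Counter(to_two_field(a) for a in true_alleles)
        pred_counts = Counter(to_two_field(a) for a in predicted.get(locus, ()))
        # Multiset intersection counts each true allele copy at most once.
        matched += sum((true_counts & pred_counts).values())
        total += sum(true_counts.values())
    return matched / total if total else 0.0


# Illustrative, made-up genotypes (not data from the study).
truth = {"A": ("A*02:01:01:01", "A*24:02"), "B": ("B*07:02", "B*07:02")}
predicted = {"A": ("A*24:02:01", "A*02:05"), "B": ("B*07:02:01", "B*08:01")}
accuracy = two_field_accuracy(predicted, truth)
```

Using a multiset intersection means a homozygous gold-standard call is only fully credited when the prediction is also homozygous for the same 2-field allele.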


Institute for Systems Biology