Characterizing T-cell receptor repertoires using high-throughput sequencing (TCR-seq)

Recognition of antigens by T cells relies on a diverse repertoire of T-cell receptor (TCR) chains. In the era of whole-genome sequencing, the expressed repertoire of lymphocyte receptors remains a “terra incognita” of unmapped genetic information that varies widely between individuals and changes dynamically throughout life. Global characterization of these repertoires using recent advances in high-throughput sequencing is expected to revolutionize our understanding of the adaptive immune system and its related pathologies, ranging from immunodeficiencies to cancer and autoimmunity. We developed a methodology, TCR-seq, that uses high-throughput sequencing to map TCR repertoires at high resolution. We apply TCR-seq to study various immune perturbations, including repertoire changes in response to infection or vaccination and in autoimmunity.
Although the TCR repertoire is produced by a random process, studies over the years have identified biases in the repertoire, such as the unequal usage of V and J gene segments. However, the mechanisms that govern such biases remain poorly understood. Using TCR-seq, we found that chromatin conformation at the DJβ genomic locus explains more than 80% of the biases in Jβ usage that we measured. Remarkably, chromatin conformation also explains Jβ usage biases measured previously in human T cells (Ndifon, Gal, et al. 2012; see Figure 2).
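As an illustration of what a usage bias and a "fraction explained" can mean in practice, the following minimal sketch computes J-segment usage frequencies from a list of per-sequence J-gene assignments and then the squared Pearson correlation (R²) between measured usage and a predictor. It is not the analysis used in this work: the function names, the toy gene calls, and the choice of R² as the measure are all assumptions made for the example.

```python
from collections import Counter


def gene_usage(j_calls):
    """Return the fraction of sequences assigned to each J gene segment."""
    counts = Counter(j_calls)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


def r_squared(x, y):
    """Squared Pearson correlation between x and y.

    For a simple linear fit, this equals the fraction of the variance
    in y that is explained by x.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical J-gene assignments for four sequences (toy data, not real measurements).
toy_j_calls = ["TRBJ1-1", "TRBJ2-7", "TRBJ2-7", "TRBJ2-1"]
usage = gene_usage(toy_j_calls)

# Hypothetical per-segment predictor values, listed in the same gene order as `genes`.
genes = sorted(usage)
measured = [usage[g] for g in genes]
predicted = [0.2, 0.3, 0.5]  # placeholder numbers, chosen only for illustration
explained = r_squared(predicted, measured)
```

Here `explained` plays the role of the "fraction of bias explained"; with real data, the predictor would come from a model of the locus rather than from placeholder values.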
We demonstrate that, as a consequence of these structural and other biases, the TCR repertoire, despite its random and highly diverse nature, contains a surprisingly large number of public sequences that are shared among individuals. We derive a necessary mathematical condition for this finding, which indicates that the TCR repertoire contains a “core” set of receptor sequences that are highly shared among individuals. Together, our results provide evidence for an expanded role of chromatin structure in VDJ rearrangement, from control of gene accessibility to precise determination of gene usage.
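To make the notion of public sequences concrete, the sketch below counts, for each receptor sequence, the number of individuals in whose repertoire it appears, and reports those found in at least a chosen number of individuals. The function names, the `min_individuals` threshold, and the toy sequences are assumptions made for illustration; this is not the procedure or the data used in this work.

```python
from collections import Counter


def sharing_counts(repertoires):
    """Map each sequence to the number of individuals in which it occurs."""
    counts = Counter()
    for sequences in repertoires.values():
        # Use a set so that each individual contributes at most once per sequence.
        counts.update(set(sequences))
    return counts


def public_sequences(repertoires, min_individuals=2):
    """Return the sequences found in at least `min_individuals` repertoires."""
    counts = sharing_counts(repertoires)
    return {seq for seq, n in counts.items() if n >= min_individuals}


# Toy repertoires with made-up amino-acid strings (hypothetical, not real data).
toy_repertoires = {
    "individual_A": ["CASSLGQYF", "CASSPTGYF", "CASSLGQYF"],
    "individual_B": ["CASSLGQYF", "CASSQDRYF"],
    "individual_C": ["CASSLGQYF", "CASSPTGYF"],
}

shared_by_two = public_sequences(toy_repertoires, min_individuals=2)
shared_by_all = public_sequences(toy_repertoires, min_individuals=len(toy_repertoires))
```

Raising `min_individuals` toward the number of individuals sampled narrows the result toward the most widely shared sequences, which is one simple way to look for a "core" set.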