SCEPTR (Simple Contrastive Embedding of the Primary sequence of T cell Receptors) is a small, fast, and accurate TCR representation model that can be used for alignment-free TCR analysis, including for TCR-pMHC interaction prediction and TCR clustering (metaclonotype discovery). Our preprint demonstrates that SCEPTR can be used for few-shot TCR specificity prediction with improved accuracy over previous methods.
SCEPTR is a BERT-like transformer-based neural network implemented in PyTorch. The default model provides best-in-class performance with only 153,108 parameters (typical protein language models have tens or hundreds of millions), so SCEPTR runs fast, even on a CPU! And if your computer does have a CUDA-enabled GPU, the sceptr package will automatically detect and use it, giving you blazingly fast performance without the hassle.
sceptr's API exposes three intuitive functions: calc_vector_representations, calc_cdist_matrix, and calc_pdist_vector. They are all you need to make full use of the SCEPTR models.
What's even better is that they are fully compliant with pyrepseq's tcr_metric API, so sceptr will fit snugly into the rest of your repertoire analysis workflow.
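To make the difference between the two distance functions concrete, here is a minimal sketch of the output shapes their names suggest. It does not call sceptr. The helpers toy_cdist_matrix and toy_pdist_vector and the two-dimensional points are hypothetical stand-ins for illustration only. The sketch assumes Euclidean distance between vector representations and the SciPy-style condensed ordering that the pdist name implies; check the sceptr documentation for the exact behaviour.

```python
import math
from itertools import combinations

# Hypothetical stand-ins for per-TCR representation vectors.
# Real SCEPTR vectors are higher-dimensional and come from the model.
anchors = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
comparisons = [(0.0, 0.0), (0.0, 1.0)]


def toy_cdist_matrix(a, b):
    """Cross distances: entry [i][j] is the distance from a[i] to b[j]."""
    return [[math.dist(x, y) for y in b] for x in a]


def toy_pdist_vector(a):
    """Pairwise distances within one set, in condensed order
    (0,1), (0,2), ..., (1,2), ... so n items give n*(n-1)/2 values."""
    return [math.dist(x, y) for x, y in combinations(a, 2)]


cdist = toy_cdist_matrix(anchors, comparisons)
pdist = toy_pdist_vector(anchors)

print(len(cdist), len(cdist[0]))  # 3 2 (one row per anchor, one column per comparison)
print(pdist)                      # [5.0, 10.0, 5.0]
```

Under these assumptions, a cdist-style call compares two different sets of TCRs and returns a full matrix, while a pdist-style call compares a set against itself and returns each unordered pair once.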
pip install sceptr
Please cite our preprint.
@misc{nagano2024contrastive,
title={Contrastive learning of T cell receptor representations},
author={Yuta Nagano and Andrew Pyo and Martina Milighetti and James Henderson and John Shawe-Taylor and Benny Chain and Andreas Tiffeau-Mayer},
year={2024},
eprint={2406.06397},
archivePrefix={arXiv},
primaryClass={q-bio.BM}
}