Description
To understand the mechanisms governing the formation and action of T cell receptor (TCR) repertoires, there is a great need for a unified way of comparing repertoires obtained from different sources. Differences between repertoires can be understood in terms of the relative abundance and similarity of T cell receptors, specifically the complementarity determining region 3 (CDR3) of the TCR β chain. These measures can then be employed in hierarchical clustering, diversity estimates and comparisons of specific sequences between repertoires. However, to summarise this high-dimensional data, a suitable TCR similarity metric has to be defined. The current state of the art in measuring TCR similarity relies on comparisons of primary sequence information, using e.g. sequence overlap or conservation of short stretches of amino acids (AA). The research will focus on CDR3 β, since it has been shown to make the most contact with the peptide presented on the MHC and is therefore believed to be responsible for most of the specificity towards an antigen. The overall approach to designing the similarity metric in this thesis is:
i. 3D crystallographic data of TCRs that bind to the same epitope exist in public databases and will be used as the starting point of the project. First, 3D similarity (RMSD or TM-score) of canonical structures will be calculated in LYRA from the primary sequences of these TCRs. Next, the results will be compared to 3D comparisons of the X-ray crystallographic structures of the same TCRs. The 3D similarity of canonical structures will provide a basis for the metric, and further parameters might be included, as described above. Because these TCRs bind the same epitope, this step makes it possible to evaluate how much 3D conformation contributes to binding specificity and whether this effect can be captured using predicted canonical structures.
ii. The metric developed in i. will be applied to primary sequence data from VDJdb with annotated antigen specificity, to validate and further improve the metric. This will show whether CDR3s classified as similar by the metric bind to the same antigens, or at least to similar antigens. Some of the sequences obtained from VDJdb will be used exclusively for testing purposes.
iii. Different features will be included in the metric iteratively, with evaluation as described in ii., until satisfactory results are reached and a final version of the metric is obtained.
iv. Finally, the performance of the metric will be compared to the state of the art in the field, e.g. TCRdist, alignment score metrics, etc.
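As an illustration of the RMSD comparison mentioned in step i., the following is a minimal sketch of RMSD after optimal superposition (the Kabsch algorithm) using NumPy. It assumes the two loops have already been reduced to equal-length arrays of matched backbone coordinates (for example, one C-alpha atom per residue); CDR3 loops of different lengths would first need a residue correspondence. The function name `kabsch_rmsd` is hypothetical and is not part of LYRA or any other tool named in this description.

```python
import numpy as np


def kabsch_rmsd(coords_a, coords_b):
    """RMSD between two matched (N, 3) coordinate sets after optimal superposition.

    Assumption: row i of coords_a corresponds to row i of coords_b.
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("both inputs must have the same shape (N, 3)")

    # Remove translation by centring both point sets on their centroids.
    a_centred = a - a.mean(axis=0)
    b_centred = b - b.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    h = a_centred.T @ b_centred
    u, _, vt = np.linalg.svd(h)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate a onto b and measure the remaining deviation.
    a_rotated = a_centred @ rotation.T
    return float(np.sqrt(np.mean(np.sum((a_rotated - b_centred) ** 2, axis=1))))


if __name__ == "__main__":
    # Toy coordinates, not real TCR data: the second set is the first one
    # rotated about the z axis and shifted, so the RMSD should be close to zero.
    loop = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0],
                     [0.0, 1.5, 1.0], [-1.0, 0.5, 2.0]])
    angle = np.pi / 3
    rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                      [np.sin(angle), np.cos(angle), 0.0],
                      [0.0, 0.0, 1.0]])
    moved = loop @ rot_z.T + np.array([4.0, -2.0, 1.0])
    print(kabsch_rmsd(loop, moved))
```

A lower value indicates more similar loop conformations. TM-score, the alternative named in step i., normalises differently and is less sensitive to loop length, so the two measures are not interchangeable.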
|Period||2020 → …|
|Examination held at||Bioinformatics|
|Degree of Recognition||International|
- T-cell Antigen Receptor
- T-cell Receptor (TCR) clustering
- CDR3 similarity
- TCR similarity