Whole exome sequencing (WES) is increasingly used in clinical practice, especially in pediatrics, to identify the cause of disease in patients with a suspected genetic disorder, and in large studies that aim to identify new disease genes.
Thanks to technological advances, falling costs and a shorter time to diagnosis, massively parallel sequencing has revolutionized the study of genetic diseases.
But how is whole exome sequencing applied in clinical practice today? In how many cases does it lead to a definitive diagnosis?
What is the mutation detection rate of whole exome sequencing?
“Mutation detection rate” indicates the percentage of cases in which WES identifies the causative mutation of the disease and confirms the patient's diagnosis.
Although WES is an increasingly used tool in genetic testing, to date there are no general, universally accepted data on its success rate. For example, in neurodevelopmental disorders the diagnostic yield is estimated to be between 30% and 45%. Mahler and colleagues achieved a diagnostic yield of 42% in an interdisciplinary study of 50 children; Zhai and colleagues found the cause of the disease in 35.13% of patients in a cohort of 74 probands aged between 8 and 29 years; Dong et al. reached a diagnosis in 36.4% of cases in a cohort of 1090 patients diagnosed with a developmental disorder.
In epilepsy, the reported diagnostic yield of WES ranges between 26.7% and 42%: Sun and colleagues identified the variant responsible for the phenotype in 39.7% of cases in a cohort of 73 children diagnosed with epileptic encephalopathy. However, there are exceptions: in a cohort of 102 pediatric cases diagnosed with epilepsy and intellectual disability, Yang and colleagues found the cause of the disease by whole exome sequencing in only 17 cases (16.7%).
In a study by Zhang and colleagues of 1360 pediatric patients without a specific clinical suspicion, the diagnostic yield from exome sequencing analysis alone was approximately 37%.
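The percentages quoted above are ratios of diagnosed cases to cohort size. As a minimal sketch of that arithmetic, the following Python snippet recomputes the figure reported by Yang and colleagues (17 diagnosed cases out of 102). The function name `diagnostic_yield` is a hypothetical helper chosen for illustration and does not come from any of the cited studies.

```python
def diagnostic_yield(diagnosed: int, cohort_size: int) -> float:
    """Return the diagnostic yield as a percentage of the cohort."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    if not 0 <= diagnosed <= cohort_size:
        raise ValueError("diagnosed must be between 0 and cohort_size")
    return 100.0 * diagnosed / cohort_size


# Yang and colleagues: 17 diagnosed cases in a cohort of 102
print(f"{diagnostic_yield(17, 102):.1f}%")  # prints 16.7%
```

Because the denominator is the whole cohort, the same number of diagnosed cases gives a very different yield depending on how the cohort was selected.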
What is the cause of this variability?
One of the main factors influencing the mutation detection rate in all studies is the choice of the reference cohort. The number of patients, inclusion criteria, age, sex, geographic origin and phenotypic presentation all affect the diagnostic yield. Other factors are the type of disease investigated (in some disorders, the mutation is not found in most patients) and the technology used (different enrichment kits, variant filtering, variant calling quality, and so on).
Another very important aspect is variant interpretation and classification: generally, the diagnostic yield is calculated from variants classified as pathogenic or likely pathogenic according to the ACMG criteria. However, laboratory-specific internal criteria, the expertise of the data scientist and the amount of information available on each variant can greatly change how a variant is interpreted.
How can the mutation detection rate be increased?
- Sequencing of trios vs. singletons
One of the simplest ways to increase the diagnostic yield of whole exome sequencing is to sequence the proband's parents in addition to the proband, in what is called trio-WES. According to Zhang and colleagues, sequencing trios increased the diagnostic yield by 6.29% compared to sequencing singletons. This is because trio analysis makes it possible to discard non-relevant variants by applying inheritance models and to quickly identify de novo variants.
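To illustrate why parental data helps, here is a minimal sketch of a de novo filter: it keeps only the proband's variants that are absent from both parents. The `Variant` type, the function `candidate_de_novo` and the example coordinates are all hypothetical and invented for illustration. This is not the method used in the cited studies, and real pipelines also take genotype quality and parental coverage into account.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (illustration only)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def candidate_de_novo(
    proband: Iterable[Variant],
    mother: Iterable[Variant],
    father: Iterable[Variant],
) -> List[Variant]:
    """Return proband variants that were seen in neither parent."""
    parental = set(mother) | set(father)
    return sorted(v for v in set(proband) if v not in parental)


# Invented example data, not taken from any real patient
proband = [Variant("1", 100, "A", "G"), Variant("2", 200, "C", "T")]
mother = [Variant("1", 100, "A", "G")]
father = []

for variant in candidate_de_novo(proband, mother, father):
    print(variant)
```

With a singleton, both proband variants would have to be assessed; with the trio, the inherited one is set aside and only the variant absent from both parents remains as a de novo candidate.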
- CNV analysis
It has become increasingly evident that not only single nucleotide variants (SNVs) but also copy number variants (CNVs) can play a role in the pathogenesis of a disease. Moreover, in some diseases, such as psychomotor developmental disorders, CNVs are so common that guidelines indicate chromosomal microarray (CMA) as the first-tier diagnostic test.
In addition to classic molecular methods for detecting CNVs, such as CMA or MLPA, algorithmic CNV analysis based on WES data is increasingly gaining ground, but it still has numerous limitations.
Even so, CNV analysis increased the diagnostic yield by 18.92% in the study by Zhai and colleagues, by 6.8% in the work of Sun et al., and by 6.99% in the study of Zhang et al.
- Detailed clinical description
A detailed clinical description of the patient helps in interpreting the WES data and allows the data scientist to distinguish one disease from another. By integrating genetic data with in-depth clinical information, it is also possible to distinguish between diseases with largely overlapping clinical pictures. According to the study by Zhang et al., using a greater number of HPO terms to describe the patient’s phenotype increased the diagnostic yield by about 10%.
What if all of this isn’t enough?
If, despite all these measures, the analysis of the proband does not provide conclusive results, it is always possible to proceed with whole genome sequencing (WGS). Although about 80-85% of mutations fall in the coding regions of genes (those sequenced with WES), a minority fall in intronic, regulatory and UTR regions, which are sequenced only by WGS.
For example, according to the study by Ma and colleagues conducted on 52 patients with congenital cataracts, WGS increased the diagnostic yield by about 10%. The main advantages of WGS, in addition to sequencing the regions excluded from WES, are the ability to identify small deletions/duplications that escape CMA analysis and more effective sequencing of GC-rich and repeated regions.
Yang M, Xu B, Wang J, Zhang Z, Xie H, Wang H, Hu T, Liu S. Genetic diagnoses in pediatric patients with epilepsy and comorbid intellectual disability. Epilepsy Res. 2021 Feb;170:106552. doi: 10.1016/j.eplepsyres.2021.106552. Epub 2021 Jan 7. PMID: 33486335.
Mahler EA, Johannsen J, Tsiakas K, Kloth K, Lüttgen S, Mühlhausen C, Alhaddad B, Haack TB, Strom TM, Kortüm F, Meitinger T, Muntau AC, Santer R, Kubisch C, Lessel D, Denecke J, Hempel M. Exome Sequencing in Children. Dtsch Arztebl Int. 2019 Mar 22;116(12):197-204. doi: 10.3238/arztebl.2019.0197. PMID: 31056085.
Zhai Y, Zhang Z, Shi P, Martin DM, Kong X. Incorporation of exome-based CNV analysis makes trio-WES a more powerful tool for clinical diagnosis in neurodevelopmental disorders: A retrospective study. Hum Mutat. 2021 May 20. doi: 10.1002/humu.24222. Epub ahead of print. PMID: 34015165.
Dong X, Liu B, Yang L, Wang H, Wu B, Liu R, Chen H, Chen X, Yu S, Chen B, Wang S, Xu X, Zhou W, Lu Y. Clinical exome sequencing as the first-tier test for diagnosing developmental disorders covering both CNV and SNV: a Chinese cohort. J Med Genet. 2020 Aug;57(8):558-566. doi: 10.1136/jmedgenet-2019-106377. Epub 2020 Jan 31. PMID: 32005694.
Sun D, Liu Y, Cai W, Ma J, Ni K, Chen M, Wang C, Liu Y, Zhu Y, Liu Z, Zhu F. Detection of Disease-Causing SNVs/Indels and CNVs in Single Test Based on Whole Exome Sequencing: A Retrospective Case Study in Epileptic Encephalopathies. Front Pediatr. 2021 May 13;9:635703. doi: 10.3389/fped.2021.635703. PMID: 34055682.
Zhang Q, Qin Z, Yi S, Wei H, Zhou XZ, Su J. Clinical application of whole-exome sequencing: A retrospective, single-center study. Exp Ther Med. 2021 Jul;22(1):753. doi: 10.3892/etm.2021.10185. Epub 2021 May 12. PMID: 34035850.
Ma A, Grigg JR, Flaherty M, Smith J, Minoche AE, Cowley MJ, Nash BM, Ho G, Gayagay T, Lai T, Farnsworth E, Hackett EL, Slater K, Wong K, Holman KJ, Jenkins G, Cheng A, Martin F, Brown NJ, Leighton SE, Amor DJ, Goel H, Dinger ME, Bennetts B, Jamieson RV. Genome sequencing in congenital cataracts improves diagnostic yield. Hum Mutat. 2021 Jun 8. doi: 10.1002/humu.24240. Epub ahead of print. PMID: 34101287.