Point mutations: an overview
Mutations are heritable changes in the DNA sequence. They vary in size and may affect a single gene (genic mutations), the structure of one or more chromosomes (chromosomal aberrations), or the number of chromosomes (aneuploidies). Genic mutations involve one or a few nucleotides, so they can consist of point mutations (also called SNVs, single nucleotide variants) or small deletions or insertions.
Genic mutations can be classified based on their effect on the protein structure:
- Synonymous: these mutations do not change the encoded amino acid, so they are generally not expected to alter the final protein structure;
- Missense: these mutations change the encoded amino acid, hence they influence the final protein structure. Their phenotypic effect can be benign, uncertain or pathogenic;
- Nonsense: these mutations create a premature stop codon, leading to a truncated protein (small insertions or deletions that shift the reading frame often have the same outcome). Very often (but not always) these mutations are pathogenic.
There are also two further classes of genic mutations, which are infrequent but do occur:
- Start-loss: these mutations affect the initiation codon, i.e. the codon for the very first amino acid of the protein (a methionine). Their effect on the final protein structure, and therefore on the individual's clinical picture, is far from easy to deduce.
- Stop-loss: even rarer than start-loss, these mutations affect the stop codon, so translation is expected to continue beyond the normal end of the protein. For these mutations too, the effect is not easy to predict.
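The categories listed above can be illustrated with a minimal Python sketch that labels a single-codon substitution. It assumes the standard genetic code and that codon 1 of the coding sequence is the canonical ATG; the function name `classify_snv` and its parameters are hypothetical, chosen only for this illustration, and the sketch ignores insertions, deletions and splicing.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_snv(ref_codon, alt_codon, codon_number):
    """Label a codon substitution by its predicted effect on the protein.

    codon_number is 1-based; codon 1 is assumed to be the ATG start codon.
    """
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]

    # A change to the initiation codon is not treated as a missense change.
    if codon_number == 1 and ref_codon == "ATG" and alt_codon != "ATG":
        return "start-loss"
    # The stop codon ('*') is replaced by a sense codon.
    if ref_aa == "*" and alt_aa != "*":
        return "stop-loss"
    # A sense codon is replaced by a premature stop codon.
    if ref_aa != "*" and alt_aa == "*":
        return "nonsense"
    if ref_aa == alt_aa:
        return "synonymous"
    return "missense"


print(classify_snv("ATG", "GTG", 1))   # start codon changed
print(classify_snv("ATG", "GTG", 50))  # same change at an internal methionine
```

Note that the same ATG to GTG change gets a different label depending on whether it falls on codon 1 or on an internal methionine, which is why the position has to be passed in explicitly.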
Whole exome sequencing and whole genome sequencing in rare disorders have greatly helped to identify and understand a growing number of start-loss and stop-loss mutations. Because of their relatively higher frequency and impact, this post focuses on start-loss mutations only.
The nomenclature of start-loss mutations: an effort to find the right one
Even in peer-reviewed literature, start-loss mutations are often described as standard missense variants, e.g. as "p.Met1Val" or "p.Met1Cys". However, this nomenclature is incorrect. Why? Because, when the start codon is lost, translation of the mRNA does not begin at that codon, and no other amino acid is incorporated in place of the initial methionine.
So, what's the right nomenclature for start-loss mutations? According to the HGVS (Human Genome Variation Society) guidelines, we must consider two different possibilities: (a) the effect on the protein is known and demonstrated by functional studies, or (b) the effect on the protein is only predicted. So:
- If the mutation causes the absence of the protein, the correct nomenclature is "p.0" (if proven), or "p.0?" if it is a prediction.
- If the effect of the mutation is unknown, the correct nomenclature is "p.?"
- If functional studies have demonstrated that a new, substitute start codon has been activated, precise rules must be followed, depending on whether the latter is upstream or downstream of the canonical start codon.
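The first two rules above can be captured in a small helper. This is only a sketch under stated assumptions: the function name `start_loss_protein_notation` and its parameters are hypothetical, and the third case (an activated substitute start codon) is deliberately not implemented, because its description depends on HGVS rules that are not spelled out in this post.

```python
from typing import Optional


def start_loss_protein_notation(protein_absent: Optional[bool],
                                experimentally_proven: bool = False) -> str:
    """Return the protein-level description for a start-loss variant.

    protein_absent: True if the variant leads to no protein,
                    None if the effect is unknown.
    experimentally_proven: True if functional studies support the claim.
    """
    if protein_absent is None:
        return "p.?"      # effect unknown
    if protein_absent:
        # "p.0" when demonstrated, "p.0?" when it is only a prediction
        return "p.0" if experimentally_proven else "p.0?"
    # Protein still produced (e.g. from a substitute start codon):
    # this needs the dedicated HGVS rules, which this sketch does not cover.
    raise ValueError("substitute start codon: consult the HGVS guidelines")


print(start_loss_protein_notation(True, experimentally_proven=True))
print(start_loss_protein_notation(True))
print(start_loss_protein_notation(None))
```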
Start-loss: is it pathogenic?
Instinctively, we'd think start-loss mutations are always pathogenic. In practice, evaluating the clinical impact of start-loss mutations is very complex and must take alternative possibilities into account.
First, it has been shown several times that cells can use non-canonical or non-AUG translation initiation sites to ensure protein translation. These sites are generally not as efficient as the canonical ones, but they can still allow a minimal production of the protein, which may be sufficient to avoid the disease.
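As a rough illustration of this first point, the sketch below scans a coding sequence for in-frame codons downstream of a lost start codon that could, in principle, act as substitute initiation sites. The function name, the toy sequence and the choice of CTG, GTG and TTG as near-cognate codons are assumptions made for this example. The output is only a list of candidates: whether any of them is actually used by the cell has to be shown experimentally.

```python
# Near-cognate codons (DNA alphabet) included here as an assumption.
NEAR_COGNATE = {"CTG", "GTG", "TTG"}


def downstream_in_frame_starts(cds, include_near_cognate=False):
    """List (codon_number, codon) for candidate start codons after codon 1.

    cds is a coding sequence whose first three bases are the (mutated)
    canonical start codon; codon numbers are 1-based.
    """
    cds = cds.upper()
    hits = []
    # Step through the reading frame, skipping codon 1 itself.
    for pos in range(3, len(cds) - 2, 3):
        codon = cds[pos:pos + 3]
        if codon == "ATG" or (include_near_cognate and codon in NEAR_COGNATE):
            hits.append((pos // 3 + 1, codon))
    return hits


# Toy sequence (hypothetical): the start codon ATG has been mutated to GTG.
toy_cds = "GTGGCTCTGAAAATGGCATAA"
print(downstream_in_frame_starts(toy_cds))
print(downstream_in_frame_starts(toy_cds, include_near_cognate=True))
```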
Secondly, we must recall the existence of different protein isoforms, some of which have a different start codon. Sometimes, these isoforms are interchangeable and can make up for the lack of one of the others. Finally, a variant falling on the start codon of one isoform can have a different effect (e.g. synonymous, missense, intronic ...) on other isoforms. In this case, there are additional parameters to account for: the level of expression and the tissue localization of the mutated isoform, which may determine the pathogenicity of that start-loss variant.
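The isoform argument can be made concrete with a toy example. The sketch below assumes, for simplicity, that two isoforms share the same transcript coordinates and reading frame and differ only in where their coding sequence starts; real isoforms usually also differ in exon structure, which this sketch ignores. The function name, the isoform names and the coordinates are all hypothetical.

```python
def codon_hit_per_isoform(variant_pos, cds_starts):
    """Report which codon a variant hits in each isoform.

    variant_pos: 0-based position of the variant on the shared coordinate.
    cds_starts: {isoform_name: 0-based position of the A of its ATG}.
    """
    result = {}
    for name, start in cds_starts.items():
        offset = variant_pos - start
        if offset < 0:
            result[name] = "upstream of CDS (5' UTR)"
        elif offset < 3:
            result[name] = "start codon (possible start-loss)"
        else:
            result[name] = f"codon {offset // 3 + 1}"
    return result


# Toy coordinates: the short isoform starts at an internal ATG of the long one.
effects = codon_hit_per_isoform(30, {"long": 0, "short": 30})
for isoform, effect in effects.items():
    print(isoform, "->", effect)
```

In this toy case the same nucleotide change hits an internal codon of the long isoform but the start codon of the short one, so the variant would need a separate consequence and interpretation for each isoform.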
To further complicate the interpretation of these mutations, functional in vitro studies are often necessary to demonstrate the actual biological effect of a start-loss mutation, which makes a priori predictions difficult.
Start-loss mutations: how do we interpret them?
There is no simple answer to this question.
Undoubtedly, collecting as much information as possible is crucial to get a global picture of the situation. We should always keep in mind:
(1) the interpretation guidelines for loss-of-function variants by ACMG;
(2) the frequency of the variant;
(3) the inheritance pattern;
(4) if known, the typical pathogenicity mechanism of the mutations in the gene of interest (e.g. haploinsufficiency, dominant-negative effect, or gain-of-function);
(5) the phenotypic overlap between the gene-associated phenotype and the patient's clinical features.
If possible, carrying out functional studies to check the real impact of the start-loss variant on the mRNA is advisable (although we know that, especially in routine diagnostics, this isn't easy to organize and achieve).
Pay attention to the phenotype!
Although start-loss mutations may cause severe phenotypes, cells can rely on several "tricks" to overcome them, so the resulting phenotype may differ from the classic one.
For example, Kuper and colleagues reported some patients with a start-loss mutation in the CLN3 gene. Those patients' phenotype was less severe than the phenotype caused by other mutations in the same gene. Mutations in CLN3 cause juvenile neuronal ceroid lipofuscinosis, which is characterized by progressive visual impairment, seizures and progressive dementia. The patients reported by Kuper had the same course of the retinal phenotype, but the onset of the neurological phenotype was delayed and showed a slower progression. The authors stated that the difference could be due to the fact that the start-loss mutation guarantees a residual production of the protein, through one of the above-mentioned mechanisms.
Another interesting example has been reported by Perenthaler et al. They described patients with a homozygous start-loss mutation in the UGP2 gene, causing a severe form of epileptic encephalopathy. Usually, homozygous loss-of-function mutations in UGP2 are not viable. However, UGP2 has two isoforms: one is longer and the other one is shorter (and expressed mainly in the brain). The mutation identified by Perenthaler et al. led to the loss of the start codon in the shorter isoform only, whereas it simply resulted in a tolerated missense mutation in the longer isoform. This example is very instructive and shows how start-loss mutations affecting only some tissue-specific isoforms can cause different or intermediate phenotypes.
Na CH, Barbhuiya MA, Kim MS, Verbruggen S, Eacker SM, Pletnikova O, Troncoso JC, Halushka MK, Menschaert G, Overall CM, Pandey A. Discovery of noncanonical translation initiation sites through mass spectrometric analysis of protein N termini. Genome Res. 2018 Jan;28(1):25-36. doi: 10.1101/gr.226050.117. Epub 2017 Nov 21. PMID: 29162641; PMCID: PMC5749180.
Abou Tayoun AN, Pesaran T, DiStefano MT, Oza A, Rehm HL, Biesecker LG, Harrison SM; ClinGen Sequence Variant Interpretation Working Group (ClinGen SVI). Recommendations for interpreting the loss of function PVS1 ACMG/AMP variant criterion. Hum Mutat. 2018 Nov;39(11):1517-1524. doi: 10.1002/humu.23626. Epub 2018 Sep 7. PMID: 30192042; PMCID: PMC6185798.
Kuper WFE, van Alfen C, van Eck L, de Man SA, Willemsen MH, van Gassen KLI, Losekoot M, van Hasselt PM. The c.1A>C start codon mutation in CLN3 is associated with a protracted disease course. JIMD Rep. 2020 Feb 7;52(1):23-27. doi: 10.1002/jmd2.12097. PMID: 32154056; PMCID: PMC7052694.
Perenthaler E, et al. Loss of UGP2 in brain leads to a severe epileptic encephalopathy, emphasizing that bi-allelic isoform-specific start-loss mutations of essential genes can cause genetic diseases. Acta Neuropathol. 2020 Mar;139(3):415-442. doi: 10.1007/s00401-019-02109-6. Epub 2019 Dec 9. PMID: 31820119; PMCID: PMC7035241.