7 - Mutations and DNA Repair
The function of DNA is to store genetic information. The genetic information stored in DNA is transcribed to produce RNA molecules, and in some cases, the RNA molecules are then translated to produce proteins that control the phenotype of the cell. To store genetic information effectively, it is important that the sequence of DNA remains unchanged from generation to generation. However, DNA replication can make mistakes; recall that in bacteria, one mistake is made for every 10 to 100 million nucleotides synthesized. Further, environmental agents, such as ionizing radiation, ultraviolet light, and the polycyclic aromatic hydrocarbons found in cigarette smoke, can damage the DNA. DNA damage is often detrimental because it can impair protein activity and negatively affect phenotype. As a result, it may seem that it would be beneficial to fix every replication error and every instance of DNA damage, but remember that changes in the DNA sequence can be advantageous in some cases. Alterations in the DNA sequence create genetic diversity, and evolution requires genetic diversity in the population so that natural selection can favor the allele combinations that are best adapted to a particular environment. For this reason, a low level of mutation is advantageous for evolution.
In this section, we will define mutation, discuss various ways that mutations can be classified, learn the mechanisms used by the cell to repair damaged DNA, and discuss situations in which defective DNA repair leads to disease in humans.
What is a mutation?
A mutation is a change in the genetic material that is inherited by daughter cells after cell division. In most cases, an incorrect nucleotide is replaced with the correct one before the cell divides; therefore, based on the definition above, this type of change would not be considered a mutation. Only nucleotide changes that remain after cell division are mutations.
When environmental agents (ionizing radiation, ultraviolet light, polycyclic aromatic hydrocarbons) damage the DNA double-helix, induced mutations occur. In other cases, DNA polymerases inadvertently incorporate an incorrect nucleotide into the daughter DNA strand during DNA replication. This replication error is called a spontaneous mutation.
- What is the definition of mutation?
- What is the difference between an induced and a spontaneous mutation?
Germline and somatic mutations
What determines whether a mutation is inherited by offspring? Gametes arise from specialized cells called germline cells. The non-gamete cells within the body (muscle cells, neurons, etc.) are called somatic cells. Therefore, when a mutation arises in a germline cell (germline mutation), the mutation can be transmitted via the gametes to the next generation. A mutation generated in a somatic cell (somatic mutation) of an individual is not transmitted to the next generation.
The timing of the mutation can also influence its transmission to offspring. If a mutation arises in a cell during early embryonic development, then chances are good that the mutation will be found in both the germline and somatic cells. Since this mutation is found in germline cells, the mutation will be transmitted to the next generation. However, when a mutation arises late in development or in the adult organism, the mutation may only be found in one type of somatic cell and will be absent from the germline cells. As a result, the mutation will not be inherited by the offspring.
The timing of the mutation can also affect the severity of the phenotype. If a somatic mutation arises early in the development of a particular organ, it is likely that the majority of cells that make up the organ will contain that mutation. When an organ is made up of mostly mutant cells, the phenotype of the organ is impacted in a negative way. Alternatively, if only a subset of cells in the tissue contain the mutation, the phenotypic effect can be comparatively mild.
- What is the difference between a germline and a somatic mutation?
- How likely is it that a germline mutation is transmitted from parents to offspring? How likely is it that a somatic mutation is transmitted from parents to offspring?
- If a mutation arises early in development, is it more or less likely to be transmitted to the germline?
Sometimes a mutation alters a single nucleotide in the DNA molecule. When only one nucleotide has been altered, the mutation is a point mutation. When a purine nitrogenous base is changed to another purine (e.g., adenine for guanine) or a pyrimidine nitrogenous base is exchanged for another pyrimidine (e.g., cytosine for thymine), the point mutation is a transition mutation. Alternatively, when a purine is exchanged for a pyrimidine (or vice versa), we call the point mutation a transversion mutation. Transition mutations are more common than transversions because transitions do not alter the width of the DNA double helix.
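The transition/transversion distinction described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_point_mutation` and the set names are hypothetical, and the sketch assumes DNA bases written as the single letters A, C, G, and T.

```python
# Purines and pyrimidines found in DNA (single-letter codes).
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_point_mutation(original: str, mutant: str) -> str:
    """Classify a single-base DNA substitution as a transition or a transversion."""
    original, mutant = original.upper(), mutant.upper()
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutant not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if original == mutant:
        raise ValueError("no substitution: the two bases are identical")
    # A transition keeps the base class (purine -> purine, pyrimidine -> pyrimidine).
    same_class = {original, mutant} <= PURINES or {original, mutant} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_point_mutation("G", "A"))  # purine exchanged for purine
print(classify_point_mutation("C", "T"))  # pyrimidine exchanged for pyrimidine
print(classify_point_mutation("A", "C"))  # purine exchanged for pyrimidine
```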
Silent, Missense, and Nonsense mutations
When the mutation occurs in the intergenic or the repetitive DNA sequences within chromosomes, the mutation often has no or mild phenotypic consequences. Even though these mutations outside of the structural genes do not alter the cell's phenotype, these mutations are useful as they define the DNA fingerprint of an individual. Each person contains a unique collection of mutations within the intergenic and repetitive DNA sequences (see Part 1).
We will focus our attention on mutations that occur within structural genes. Remember that structural genes are transcribed into a messenger RNA (mRNA) molecule. The mRNA is then translated when the ribosome reads three nucleotide-long codons to generate the amino acid sequence of the encoded protein.
When one nucleotide has been exchanged for another nucleotide, but there is no change in the amino acid sequence of the protein, we refer to the mutation as a silent mutation (see Figure 7.1). Silent mutations occur because the genetic code is degenerate, that is, more than one codon specifies the same amino acid.
WILD-TYPE 5’ AUG . UUC . GUG . CAC . UUA . AUC . UAG 3’
MET . PHE . VAL . HIS . LEU . ILE . STOP
SILENT 5’ AUG . UUU . GUG . CAC . UUA . AUC . UAG 3’
MET . PHE . VAL . HIS . LEU . ILE . STOP
MISSENSE 5’ AUG . UUC . GUG . AAC . UUA . AUC . UAG 3’
MET . PHE . VAL . ASN . LEU . ILE . STOP
NONSENSE 5’ AUG . UUC . GUG . CAC . UAA . AUC . UAG 3’
MET . PHE . VAL . HIS . STOP
FIGURE 7.1 Base Substitutions Can Affect Gene Structure and Function. The sequence of the wild-type RNA is indicated, with the amino acid sequence of the translated protein shown below it. For each mutation, the altered RNA sequence is shown with the resulting amino acid sequence of the protein below it.
When a point mutation substitutes an amino acid with a different amino acid, the mutation is a missense mutation. Sickle cell anemia is an autosomal recessive disease caused by a missense mutation in the beta hemoglobin gene. In sickle cell anemia, the missense mutation replaces the amino acid glutamic acid at the sixth amino acid position in the beta-globin protein with the amino acid valine.
Sometimes a point mutation results in the formation of a premature stop codon (nonsense mutation). Nonsense mutations are usually more severe in phenotype than missense mutations because nonsense mutations cause the encoded protein to be shorter in length than normal. These shorter proteins are often nonfunctional.
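The three outcomes shown in Figure 7.1 can be reproduced with a short Python sketch. The codon table below is deliberately partial (it holds only the codons that appear in the figure), and the names `translate` and `classify_substitution` are hypothetical helpers introduced for illustration. The sketch assumes a single base substitution and does not handle mutations that destroy the normal stop codon.

```python
# Partial codon table: only the codons used in Figure 7.1.
CODON_TABLE = {
    "AUG": "MET", "UUC": "PHE", "UUU": "PHE", "GUG": "VAL",
    "CAC": "HIS", "AAC": "ASN", "UUA": "LEU", "AUC": "ILE",
    "UAG": "STOP", "UAA": "STOP",
}


def translate(mrna: str) -> list[str]:
    """Read the mRNA one codon at a time and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        protein.append(amino_acid)
        if amino_acid == "STOP":
            break
    return protein


def classify_substitution(wild_type: str, mutant: str) -> str:
    """Compare the translated products of a wild-type and a mutant mRNA."""
    wt_protein, mut_protein = translate(wild_type), translate(mutant)
    if wt_protein == mut_protein:
        return "silent"
    if len(mut_protein) < len(wt_protein):
        return "nonsense"  # a premature stop codon shortened the protein
    return "missense"


WILD_TYPE = "AUGUUCGUGCACUUAAUCUAG"
print(classify_substitution(WILD_TYPE, "AUGUUUGUGCACUUAAUCUAG"))  # UUC -> UUU
print(classify_substitution(WILD_TYPE, "AUGUUCGUGAACUUAAUCUAG"))  # CAC -> AAC
print(classify_substitution(WILD_TYPE, "AUGUUCGUGCACUAAAUCUAG"))  # UUA -> UAA
```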
In some cases, nucleotides are inserted into the sequence of a structural gene (i.e., an insertion mutation) or nucleotides are deleted from the sequence of the structural gene. Because the ribosome reads the encoded mRNA one codon at a time, some insertions or deletions will change the reading frame of the mRNA molecule. When the reading frame changes, the entire amino acid sequence of the encoded protein changes from the point of the insertion/deletion (indel) site onward. A mutation that changes the reading frame is called a frameshift mutation (see Figure 7.2).
WILD-TYPE 5’ AUG . UUC . GUG . CAC . UUA . AUC . UAG 3’
MET . PHE . VAL . HIS . LEU . ILE . STOP
FRAMESHIFT 5’ AUG . UUC . GCU . GCA . CUU . AAU . CUA . G 3’
MET . PHE . ALA . ALA . LEU . ASN . LEU ….
FIGURE 7.2 Frameshift Mutations Can Affect Gene Structure and Function. By inserting a single cytosine in the third codon, a shift in the reading frame occurs that changes the amino acid sequence of the protein and eliminates the stop codon.
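The frameshift in Figure 7.2 can be reproduced with a small Python sketch. The codon table is again partial (only the codons needed for this figure), and `translate`, `insert_base`, and `changes_reading_frame` are hypothetical helper names. The rule in `changes_reading_frame` assumes that the reading frame shifts whenever the number of inserted or deleted nucleotides is not a multiple of three.

```python
# Partial codon table: only the codons used in Figure 7.2.
CODON_TABLE = {
    "AUG": "MET", "UUC": "PHE", "GUG": "VAL", "CAC": "HIS",
    "UUA": "LEU", "AUC": "ILE", "UAG": "STOP",
    "GCU": "ALA", "GCA": "ALA", "CUU": "LEU", "AAU": "ASN", "CUA": "LEU",
}


def translate(mrna: str) -> list[str]:
    """Translate complete codons only; stop at the first stop codon, if any."""
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        protein.append(amino_acid)
        if amino_acid == "STOP":
            break
    return protein


def insert_base(mrna: str, position: int, base: str) -> str:
    """Insert one base so that it ends up at the given 0-based position."""
    return mrna[:position] + base + mrna[position:]


def changes_reading_frame(indel_length: int) -> bool:
    """An indel shifts the frame unless its length is a multiple of three."""
    return indel_length % 3 != 0


WILD_TYPE = "AUGUUCGUGCACUUAAUCUAG"
frameshift = insert_base(WILD_TYPE, 7, "C")  # cytosine inserted in the third codon
print(translate(WILD_TYPE))
print(translate(frameshift))
```

In this sketch the mutant product never reaches a stop codon within the sequence shown, which matches the figure's note that the original stop codon is eliminated.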
- What is a transition mutation?
- What is a transversion mutation?
- What is the difference between a silent, missense, and a nonsense mutation?
- How do frameshift mutations affect the amino acid sequence of a protein?
B. DNA Repair Systems
Changes to the nucleotide sequence within DNA can occur at any time, and most of these changes are recognized by cellular DNA repair systems immediately. We learned in Part 5 that the DNA double helix always has a purine (adenine or guanine) paired with a pyrimidine (cytosine or thymine). Purine-pyrimidine base pairing results in a DNA molecule that is two nanometers (nm) wide. Repair enzymes run along the length of double-stranded DNA as it is replicating, ensuring that the replicated DNA is 2 nm wide. If two purines are paired, a bulge in the DNA double helix occurs, and if two pyrimidines are paired, the double helix narrows. Variations in the width of the DNA double helix signal that damage has occurred and that repair is necessary.
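The width check described above can be modeled with a minimal Python sketch. The names `width_signal` and `scan_for_distortions` are hypothetical, and the model is intentionally simple: it looks only at whether each pair is purine-purine, pyrimidine-pyrimidine, or purine-pyrimidine. It assumes the two strands are written so that position i of one strand pairs with position i of the other, and it does not detect purine-pyrimidine mismatches such as A paired with C.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def width_signal(base1: str, base2: str) -> str:
    """Report how a base pair would affect the width of the double helix."""
    pair = {base1.upper(), base2.upper()}
    if pair <= PURINES:
        return "bulge"       # two purines are too wide
    if pair <= PYRIMIDINES:
        return "narrowing"   # two pyrimidines are too narrow
    return "normal"          # purine paired with pyrimidine


def scan_for_distortions(strand: str, partner: str) -> list[tuple[int, str]]:
    """Return (position, signal) for every pair that distorts the helix width."""
    distortions = []
    for i, (a, b) in enumerate(zip(strand, partner)):
        signal = width_signal(a, b)
        if signal != "normal":
            distortions.append((i, signal))
    return distortions


print(scan_for_distortions("ATGCA", "TAGGT"))  # G paired with G at position 2
print(scan_for_distortions("ATC", "TTG"))      # T paired with T at position 1
```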
There is more than one DNA repair system because there are different types of mutations to correct. Even though each DNA repair system is different, there are some features in common among them. First, there is an enzyme that recognizes an incorrect base pair. Next, a nucleotide within this incorrect base pair and often a few extra nucleotides are removed. Finally, the correct nucleotide is incorporated by DNA replication.
There are six DNA repair systems:
- Proofreading by DNA polymerase
- Mismatch repair
- Base excision repair (BER)
- Nucleotide excision repair (NER)
- Homology directed repair (HDR)
- Non-homologous end joining (NHEJ)
You will only need to know about the first four DNA repair systems. The homology directed repair (HDR) and non-homologous end joining (NHEJ) content will be added at a future date.
The most common repair system involves the DNA replication machinery itself. When a DNA polymerase synthesizes the daughter DNA strand, the DNA polymerase occasionally inserts an incorrect base. Recall that DNA polymerases possess 3’–5’ exonuclease activity; the DNA polymerase can reverse directions, remove the incorrect base in the 3' to 5' direction, and then replace the incorrect base with the correct one. In most cases, proofreading is the first line of defense against potential mutations.
If an incorrect base pair is formed during DNA replication and the mistake is not removed by proofreading, the mismatch repair system is activated to fix the mistake. Mismatch repair recognizes which of the two nucleotides in the base pair is incorrect and then removes the incorrect nucleotide. For example, suppose that cytosine (in the daughter DNA strand) has been paired accidentally with adenine (in the parental DNA strand) during DNA replication. When mismatch repair recognizes this mistake, it faces a dilemma: which of the two bases is incorrect? If mismatch repair chose the nitrogenous base to correct at random, then 50% of the time it would remove the adenine in the parental DNA strand instead of the cytosine in the daughter DNA strand.
In mismatch repair, the cell must first distinguish the parental (template) and daughter DNA strands. In the bacterium E. coli, the parental DNA strand is methylated by DNA adenine methyltransferase (Dam), while newly synthesized daughter DNA strands are unmethylated. The proteins in the mismatch repair system recognize the methylated parental strand, and then mismatch repair removes the mismatched nucleotide within the unmethylated daughter strand.
There are four major proteins that make up the mismatch repair system in the bacterium E. coli: MutS, MutL, MutH, and MutU. These four proteins function as follows:
- MutS slides along the DNA double helix and finds the mismatch.
- MutL combines with MutS, forming a MutS/MutL complex.
- MutH binds to a nearby DNA sequence containing a methylated adenine on the parental DNA strand. In essence, MutH is the protein that determines which strand is the parental DNA strand.
- The MutS/MutL complex binds to MutH producing a loop in the DNA. MutL functions as the bridge subunit that connects MutS and MutH.
- MutH is an endonuclease that makes a single-stranded DNA break in the backbone of the unmethylated (daughter) DNA strand. This break occurs immediately 5’ of the G in the 5’-GATC-3’ sequence in the daughter DNA strand.
- MutU binds to the MutH cut site in the daughter strand. MutU has helicase activity and functions to separate the daughter DNA strand from the parental DNA strand at the MutH cut site.
- An exonuclease called ExoI degrades the unwound daughter DNA strand in the 3’ to 5’ direction, starting at the cut site produced by MutH and continuing past the mismatch. Degradation of the daughter DNA strand removes the mismatched nucleotide.
- The gap in the daughter DNA strand is filled in by a DNA polymerase.
- The final covalent bond in the newly synthesized daughter DNA strand is produced by DNA ligase.
- What is the purpose of mismatch repair?
- Explain how mismatch repair works.
- Describe the functions of MutS, MutL, MutH, MutU, ExoI, DNA polymerase, and DNA ligase proteins in mismatch repair.
- Which one of the four Mut proteins is responsible for identifying the mismatched base?
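The logic of the mismatch repair steps described in this section can be sketched in Python. This is a simplified model under stated assumptions, not a simulation of the real proteins: the function names are hypothetical, both strands are written so that position i of one strand pairs with position i of the other, methylation is reduced to a single true/false flag per strand, and `nick_index` stands in for the position of the MutH cut.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def identify_strands(strand_a, strand_b):
    """Each strand is (sequence, is_methylated). Return (parental, daughter)."""
    (seq_a, meth_a), (seq_b, meth_b) = strand_a, strand_b
    if meth_a == meth_b:
        raise ValueError("exactly one strand must be methylated to tell them apart")
    # The methylated strand is the parental (template) strand.
    return (seq_a, seq_b) if meth_a else (seq_b, seq_a)


def find_mismatches(parental: str, daughter: str) -> list[int]:
    """MutS-like step: list positions where the bases are not complementary."""
    return [i for i, (p, d) in enumerate(zip(parental, daughter))
            if COMPLEMENT[p] != d]


def mismatch_repair(parental: str, daughter: str, nick_index: int) -> str:
    """Remove the daughter strand from the nick through the mismatch, then refill."""
    mismatches = find_mismatches(parental, daughter)
    if not mismatches:
        return daughter
    start = min(nick_index, min(mismatches))
    end = max(nick_index, max(mismatches))
    # Exonuclease-like step removes daughter[start:end + 1]; the polymerase-like
    # step resynthesizes it from the parental template; ligase joins the ends.
    patch = "".join(COMPLEMENT[base] for base in parental[start:end + 1])
    return daughter[:start] + patch + daughter[end + 1:]


# Cytosine in the daughter strand is paired with adenine in the parental strand.
parental, daughter = identify_strands(("ATGCCAACGT", False), ("TACGATTGCA", True))
print(find_mismatches(parental, daughter))
print(mismatch_repair(parental, daughter, nick_index=1))
```

Passing the methylation flag into `identify_strands` is what prevents the 50% error rate discussed earlier: the correction is always made on the unmethylated daughter strand.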
Base Excision Repair (BER)
Base excision repair (BER) corrects abnormal bases that are sometimes formed in DNA. For example, occasionally cytosine in DNA can be spontaneously converted to uracil. The conversion of cytosine to uracil is a transition mutation that does not distort the width of the DNA double helix.
BER in the bacterium E. coli can remove uracil from DNA as follows:
- The enzyme uracil DNA glycosylase (UNG) recognizes uracil in the DNA. UNG cleaves the covalent bond that links uracil to deoxyribose (i.e., breaks the covalent bond attached to the 1’ carbon of deoxyribose). The uracil base is released from the rest of the nucleotide, creating an abasic site in the DNA. Note that the nucleotide at the abasic site still contains deoxyribose and the phosphate group; the nucleotide is just missing the uracil nitrogenous base.
- At the abasic site, the DNA backbone is still intact in both strands. This abasic site is recognized by another enzyme, apurinic/apyrimidinic endonuclease 1 (APE1), which makes a nick in the DNA backbone at the 5’ end of the abasic site (i.e., cleaves the phosphodiester bond).
- At this point, DNA polymerase I uses its 5’-3’ exonuclease activity to remove the deoxyribose sugar and phosphate group at the abasic site. DNA polymerase I then uses its DNA synthesis activity to replace the removed nucleotide with the correct nucleotide.
- DNA ligase seals the gap in the newly synthesized DNA.
- What is an abasic site?
- Explain the function of UNG, APE1, DNA polymerase I, and DNA ligase in base excision repair.
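The base excision repair steps described in this section can be sketched as a pair of Python functions. The names `remove_uracil` and `base_excision_repair` are hypothetical, the underscore character is an arbitrary marker chosen here to represent an abasic site, and both strands are assumed to be written so that position i of one strand pairs with position i of the other.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
ABASIC = "_"  # marker for a nucleotide that has lost its base


def remove_uracil(strand: str) -> str:
    """Glycosylase-like step: release uracil, leaving an abasic site behind."""
    return strand.replace("U", ABASIC)


def base_excision_repair(damaged: str, template: str) -> str:
    """Replace each abasic site with the base complementary to the template."""
    strand = list(remove_uracil(damaged))
    for i, base in enumerate(strand):
        if base == ABASIC:
            # Endonuclease-like step nicks the backbone here; the polymerase-like
            # step removes the sugar-phosphate and inserts the correct nucleotide;
            # ligase then seals the backbone.
            strand[i] = COMPLEMENT[template[i]]
    return "".join(strand)


template = "GCATGGTA"
damaged = "CGTAUCAT"  # a cytosine opposite guanine has been converted to uracil
print(remove_uracil(damaged))
print(base_excision_repair(damaged, template))
```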
Nucleotide Excision Repair (NER)
Ultraviolet (UV) light exposure can lead to the formation of a pyrimidine dimer mutation in DNA that distorts the structure of the DNA double-helix. A pyrimidine dimer occurs when UV light forms additional covalent bonds between two adjacent pyrimidines in the same DNA strand. When this type of mutation occurs, the hydrogen bonds between the two adjacent pyrimidines and the other DNA strand are broken. During DNA replication or transcription, when the DNA or RNA polymerase encounters a pyrimidine dimer, the polymerases either stop DNA replication or transcription altogether, or incorrect nitrogenous bases are placed in the synthesized DNA or RNA opposite the pyrimidine dimer.
In E. coli, pyrimidine dimers can be repaired with the nucleotide excision repair (NER) system. NER involves four proteins: UvrA, UvrB, UvrC, and UvrD. These four proteins remove a segment of DNA including the pyrimidine dimer, then replace the removed nucleotides via DNA replication. NER occurs as follows:
- A complex consisting of two UvrA proteins and one UvrB protein scans the double stranded DNA in search of a pyrimidine dimer.
- Once a pyrimidine dimer is identified, the complex pauses over the dimer. The UvrA proteins are released, while UvrC attaches to UvrB at the dimer site.
- UvrC is an endonuclease that cuts the damaged DNA strand on each side of the pyrimidine dimer.
- UvrD, which is a DNA helicase, separates the two DNA strands, releasing the short segment of damaged DNA, including the pyrimidine dimer itself. UvrB, UvrC, and UvrD are released.
- A DNA polymerase fills in the gap, using the other DNA strand as a template.
- DNA ligase seals the gap in the newly synthesized DNA.
- What nucleotides can be involved in a pyrimidine dimer?
- What are the functions of the UvrA, UvrB, UvrC, UvrD, DNA polymerase, and DNA ligase proteins in NER?
- Which Uvr protein unwinds and removes the damaged DNA?
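The cut, remove, and refill logic of nucleotide excision repair described in this section can be sketched in Python. Several things here are assumptions made only for illustration: the function names are hypothetical, the pyrimidine dimer is marked by writing its two bases in lowercase, the number of nucleotides removed on each side (`flank`) is an arbitrary value, and both strands are written so that position i of one strand pairs with position i of the other.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_dimer(strand: str):
    """Scanning step: return the index of the first lowercase (dimerized) pair."""
    for i in range(len(strand) - 1):
        if strand[i].islower() and strand[i + 1].islower():
            return i
    return None


def nucleotide_excision_repair(damaged: str, template: str, flank: int = 3) -> str:
    """Cut on each side of the dimer, remove the segment, and refill the gap."""
    start = find_dimer(damaged)
    if start is None:
        return damaged
    cut_5 = max(0, start - flank)                  # cut on one side of the dimer
    cut_3 = min(len(damaged), start + 2 + flank)   # cut on the other side
    # The helicase-like step releases damaged[cut_5:cut_3]; the polymerase-like
    # step fills the gap using the other strand as the template; ligase seals it.
    patch = "".join(COMPLEMENT[base] for base in template[cut_5:cut_3])
    return damaged[:cut_5] + patch + damaged[cut_3:]


template = "CGTAAAGCCTAG"
damaged = "GCATttCGGATC"  # "tt" marks a thymine dimer
print(find_dimer(damaged))
print(nucleotide_excision_repair(damaged, template))
```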
Homology Directed Repair (HDR) (future content)
Nonhomologous End Joining (NHEJ) (future content)
Human Genetic Diseases Associated with Faulty DNA Repair
The repair mechanisms in the bacterium E. coli described above have counterparts in human cells. Although the proteins are not identical, much of the repair process is similar. Mutations in the genes that give rise to mismatch repair enzymes have been noted in human cells. When both copies of the gene are inactivated, then cancer can occur. Specifically, a type of inherited cancer called hereditary nonpolyposis colorectal cancer occurs because of mutations in the genes that produce human mismatch repair proteins. Mutations in the base excision repair system in humans have also been implicated in colon cancer.
Epithelial cells in the skin are exposed to UV light from the sun. As a result, the nucleotide excision repair system plays an important role in repairing pyrimidine dimers that form in epithelial cells. Xeroderma pigmentosum (XP), an autosomal recessive disease, occurs when both copies of one of the seven human genes involved in nucleotide excision repair are inactivated by mutation. Xeroderma pigmentosum patients are sensitive to sunlight (due to the inability to repair DNA damage caused by UV light) and are predisposed to developing skin cancer.
- What inheritance pattern is most often associated with diseases caused by defects in the DNA repair systems? Why do you think this is so?
C. Trinucleotide Repeat Expansions and Disease
In humans, there are DNA sequences in which a series of three nucleotides is repeated consecutively (trinucleotide repeat). In most cases, a parent transmits these repeats to their offspring without any change in the repeat number. In a few human genetic diseases, however, this repeat can expand over the course of generations due to slippage of the DNA polymerases during replication. Once the expansion reaches a critical size, it can alter the function of the encoded protein, causing disease.
Huntington’s disease (HD) is an example of a neurodegenerative disease in humans caused by an expansion of a 5'-CAG-3' repeat within the coding region of the HTT gene. Although we do not know the exact function of the encoded HTT protein in the brain, it is believed that HTT plays an important role in the function of neurons.
During protein synthesis, 5'-CAG-3' encodes the amino acid glutamine. In unaffected people, there can be between 10 and 35 repeats of the 5'-CAG-3' sequence in the HTT gene. These repeats lead to a string of glutamine amino acids within the protein. So long as the number of 5'-CAG-3' repeats does not exceed 35, the encoded HTT protein functions normally. However, in some families, the number of repeats can extend beyond 35, ranging from 36 to 120 repeats. The longer string of glutamine amino acids causes the HTT protein to degrade into smaller, toxic fragments. As these toxic protein fragments accumulate, neurons die prematurely. Over time, brain activity is altered, leading to symptoms of Huntington’s disease that include uncontrolled body movements, emotional problems, and decreased ability to learn and to make decisions. The number of 5'-CAG-3' repeats in the HTT gene correlates with the severity of the disease; a patient with more 5'-CAG-3' repeats typically has more severe symptoms than a person with fewer repeats.
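The repeat threshold described above can be illustrated with a short Python sketch that counts consecutive CAG repeats in a DNA sequence. The function names and the example sequences are made up for illustration (they are not real HTT sequences), and the classification uses only the numbers given in this section: up to 35 repeats in unaffected people and 36 or more in affected families.

```python
import re


def longest_cag_run(dna: str) -> int:
    """Return the largest number of consecutive CAG repeats in the sequence."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat_count(repeats: int) -> str:
    """Apply the 35-repeat threshold described in the text."""
    if repeats > 35:
        return "expanded (range associated with Huntington's disease)"
    return "not expanded"


# Hypothetical sequences: a start codon, a CAG tract, and a few trailing bases.
short_tract = "ATG" + "CAG" * 20 + "CCG"
long_tract = "ATG" + "CAG" * 40 + "CCG"

for sequence in (short_tract, long_tract):
    count = longest_cag_run(sequence)
    print(count, classify_repeat_count(count))
```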
- Why do trinucleotide expansions above a certain threshold cause human disease?
Fill in the blank:
- A bulge in the DNA double strand occurs when two _______________________ form a base pair.
- Exposure to radiation and chemicals is responsible for causing _______________ mutations.
- A single base substitution in the coding region of a gene that exchanges an amino acid codon for a stop codon is called a ______________________ mutation.
- A deletion of four nucleotides from the coding region of a gene would result in a _________________________ mutation.
- In each repair system, ____________________________ forms the final covalent bond in the damaged DNA strand after the correct nucleotide has been added.
- If a mutation occurs in the DNA such that guanine is across from uracil, then the most likely repair system to recognize and correct this mutation would be _______________________________ .
- In mismatch repair, ______________________________ recognizes and binds to the methylated adenine on the parental DNA strand.
- ________________________________ is a helicase used in nucleotide excision repair.
- __________________________________ is an example of an enzyme that has 3’ – 5’ exonuclease activity.
- Xeroderma pigmentosum is caused by a defect in the ____________________________________________ repair system.
- CAG encodes the amino acid ____________________________ and when the number of CAG repeats exceeds _____________, neurological symptoms appear consistent with Huntington’s disease.