
7 - Mutations and DNA Repair

The function of DNA is to store genetic information.  The genetic information stored in DNA is transcribed to produce RNA molecules, and in some cases, the RNA molecules are then translated to produce proteins that affect the phenotype of the cell. To store genetic information effectively, it is important that the sequence of DNA remains unchanged from generation to generation. However, DNA replication can make mistakes; recall that roughly one mistake is made for every 10 to 100 million nucleotides synthesized in bacteria. Further, environmental agents, such as ultraviolet light, can damage the DNA.  These mistakes are often detrimental; the damage can impair protein activity and negatively affect phenotype. It may therefore seem beneficial to fix every mistake in DNA, but remember that mutations can in some cases be advantageous.  Mutations create genetic diversity, and evolution requires genetic diversity in a population so that natural selection can favor the allele combinations that are best adapted to a particular environment. For this reason, a low level of mutation is advantageous for evolution.

In this section, we will define mutation, discuss various ways that mutations can be classified, learn the mechanisms used by the cell to repair damaged DNA, and discuss situations in which defective DNA repair leads to disease in humans.

A. Mutations

What is a mutation?

A mutation is a heritable change in the genetic material. In most cases, an incorrect nucleotide is replaced with the correct one before the cell divides; therefore, based on the definition above, this type of change would not be considered a mutation. Only changes that remain after cell division are mutations.

When environmental agents (ionizing radiation, ultraviolet light, chemical agents) interact with the DNA double-helix, induced mutations occur. In other cases, DNA polymerases inadvertently incorporate an incorrect nucleotide into the daughter DNA strand during DNA replication. This replication error is called a spontaneous mutation.

Key Questions

  • What is the definition of mutation?
  • What is the difference between an induced and a spontaneous mutation?

Germline and somatic mutations

Gametes (i.e., the haploid sperm and egg cells) arise from specialized cells called germline cells. The other cells (muscle cells, neurons, epithelial cells, etc.) of the body are called somatic cells. Therefore, when a mutation arises in a germline cell (germline mutation), the mutation is transmitted via the gametes to the offspring. A mutation found in a somatic cell (somatic mutation) of the adult is not transmitted to the gametes and remains in the parent.

The timing of the mutation can also influence its transmission. If a mutation arises in a cell during early embryonic development, then chances are good that the mutation will be found in both the germline and somatic cells. However, when a mutation arises late in development or in the adult organism, the mutation may only be found in one type of somatic cell and may be absent from the germline cells. Mutations in somatic cells, as mentioned above, are not transmitted from parent to offspring.

The timing of the mutation can also affect the severity of the phenotype. If a somatic mutation arises early in the development of a particular organ, it is likely that all cells that make up the organ will contain that mutation. When an organ is made up of mostly mutant cells, the phenotype of the organ is impacted in a negative way. Alternatively, if only a subset of cells in the tissue contain the mutation, the phenotypic effect is relatively mild.

Key Questions

  • What is the difference between a germline and a somatic mutation?
  • How likely is a germline mutation to be transmitted from parents to offspring? How likely is a somatic mutation to be transmitted from parents to offspring?
  • If a mutation arises early in development, is it more or less likely to be transmitted to the germline?

Point Mutations

Sometimes a mutation alters a single base in the DNA molecule. When only one nucleotide has been altered, we refer to the mutation as a point mutation. When a purine is changed to another purine (e.g. adenine for guanine) or a pyrimidine is changed for another pyrimidine (e.g. cytosine for thymine), the point mutation is a transition mutation. Alternatively, when a purine is exchanged for a pyrimidine (or vice versa), we call the point mutation a transversion mutation. Transition mutations are more common than transversions because transitions do not alter the width of the DNA double helix.
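
The purine/pyrimidine rule above can be written as a short check. The following Python sketch is illustrative only; the function name `classify_substitution` is hypothetical and is not part of the original text.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def classify_substitution(original: str, mutant: str) -> str:
    """Label a single-base substitution as a transition or a transversion."""
    original, mutant = original.upper(), mutant.upper()
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutant not in valid:
        raise ValueError("bases must be A, G, C, T, or U")
    if original == mutant:
        return "no change"
    # Same chemical class (purine/purine or pyrimidine/pyrimidine) is a transition.
    same_class = (original in PURINES) == (mutant in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # purine replaced by purine
print(classify_substitution("C", "T"))  # pyrimidine replaced by pyrimidine
print(classify_substitution("A", "C"))  # purine replaced by pyrimidine
```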

Silent, Missense, and Nonsense mutations

When a mutation occurs in the intergenic or the repetitive DNA sequences within chromosomes, the mutation rarely has phenotypic consequences. Even though mutations outside of structural genes usually do not alter phenotype, these mutations are useful because they define the DNA fingerprint of an individual.  Each person contains a unique collection of mutations within the intergenic and repetitive DNA sequences.

We will focus our attention on mutations that occur within structural genes. Remember that structural genes contain the nucleotides that are transcribed into a messenger RNA (mRNA) molecule.  The mRNA is then translated when the ribosome reads groups of three nucleotides (codons) to generate the amino acid sequence of the protein.

When one nucleotide has been exchanged for another nucleotide, but there is no change in the amino acid sequence of the protein, we refer to the mutation as a silent mutation (see Figure 7.1). Silent mutations occur because the genetic code is degenerate, that is, more than one codon specifies the same amino acid.

WILD-TYPE    5’  AUG . UUC . GUG . CAC . UUA . AUC . UAG   3’
                 MET . PHE . VAL . HIS . LEU . ILE . STOP

SILENT       5’  AUG . UUU . GUG . CAC . UUA . AUC . UAG   3’
                 MET . PHE . VAL . HIS . LEU . ILE . STOP

MISSENSE     5’  AUG . UUC . GUG . AAC . UUA . AUC . UAG   3’
                 MET . PHE . VAL . ASN . LEU . ILE . STOP

NONSENSE     5’  AUG . UUC . GUG . CAC . UAA . AUC . UAG   3’
                 MET . PHE . VAL . HIS . STOP

FIGURE 7.1 Base Substitutions Can Affect Gene Structure and Function. The sequence is shown for the wild-type RNA, with the amino acid sequence of the protein shown below it. In each case of a mutation, the new amino acid sequence is shown.

When a point mutation results in the exchange of one amino acid for a different amino acid in a protein, we refer to the mutation as a missense mutation.  Sickle cell anemia is an autosomal recessive disease caused by a missense mutation in the beta-globin gene.  The missense mutation replaces glutamic acid with valine at the sixth amino acid position in the beta-globin protein.

Sometimes a point mutation changes a codon that specifies an amino acid into a stop codon (nonsense mutation).  Nonsense mutations are usually more severe in phenotype than missense mutations because nonsense mutations cause the encoded protein to be shorter in length than normal.  These shorter proteins are often nonfunctional.
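
The three categories can be told apart by comparing the affected codon before and after the substitution. The Python sketch below is a minimal illustration, assuming an in-frame mRNA sequence and the standard genetic code; the names `CODON_TABLE` and `classify_point_mutation` are hypothetical helpers introduced here. The sequences are taken from Figure 7.1.

```python
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(wild_type: str, mutant: str) -> str:
    """Classify a single-base substitution in an in-frame mRNA coding sequence."""
    if len(wild_type) != len(mutant):
        raise ValueError("a point mutation does not change sequence length")
    changed = [i for i, (a, b) in enumerate(zip(wild_type, mutant)) if a != b]
    if len(changed) != 1:
        raise ValueError("expected exactly one substituted base")
    start = changed[0] - changed[0] % 3  # first base of the affected codon
    wt_aa = CODON_TABLE[wild_type[start:start + 3]]
    mut_aa = CODON_TABLE[mutant[start:start + 3]]
    if wt_aa == mut_aa:
        return "silent"
    if mut_aa == "*":
        return "nonsense"
    if wt_aa == "*":
        return "stop codon lost (not covered in the text)"
    return "missense"


WILD_TYPE = "AUGUUCGUGCACUUAAUCUAG"  # wild-type mRNA from Figure 7.1
print(classify_point_mutation(WILD_TYPE, "AUGUUUGUGCACUUAAUCUAG"))  # UUC to UUU
print(classify_point_mutation(WILD_TYPE, "AUGUUCGUGAACUUAAUCUAG"))  # CAC to AAC
print(classify_point_mutation(WILD_TYPE, "AUGUUCGUGCACUAAAUCUAG"))  # UUA to UAA
```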

Frameshift Mutations

In some cases, a single nucleotide is deleted from the sequence of a structural gene or an additional nucleotide is added. Because the ribosome reads the mRNA produced from the gene one codon at a time, a change that deletes or inserts a nucleotide will change the reading frame of the mRNA molecule.  When the reading frame changes, the entire amino acid sequence of the encoded protein changes from the point of the mutation onward. This type of mutation is called a frameshift mutation (see Figure 7.2).

WILD-TYPE    5’  AUG . UUC . GUG . CAC . UUA . AUC . UAG         3’
                 MET . PHE . VAL . HIS . LEU . ILE . STOP

FRAMESHIFT   5’  AUG . UUC . GCU . GCA . CUU . AAU . CUA . G     3’
                 MET . PHE . ALA . ALA . LEU . ASN . LEU ….

FIGURE 7.2 Frameshift Mutations Can Affect Gene Structure and Function. By inserting a single cytosine in the third codon, a shift in the reading frame occurs that changes the amino acid sequence of the protein and eliminates the stop codon.
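
The effect of the insertion in Figure 7.2 can be reproduced by translating the mRNA before and after the extra cytosine is added. This Python sketch is illustrative only and assumes the standard genetic code; `translate` and `CODON_TABLE` are hypothetical helper names, and proteins are written with one-letter amino acid codes, with `*` marking a stop codon.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Read complete codons from the first base; stop after the first stop codon."""
    protein = ""
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        protein += amino_acid
        if amino_acid == "*":
            break
    return protein


wild_type = "AUGUUCGUGCACUUAAUCUAG"
# Insert one cytosine after the first base of the third codon (GUG becomes GCU G...).
frameshift = wild_type[:7] + "C" + wild_type[7:]

print(translate(wild_type))   # M F V H L I followed by a stop
print(translate(frameshift))  # every codon after the insertion is read differently
```

In this example the frameshifted message has no in-frame stop codon, so the translated string does not end with `*`.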

Key Questions

  • What is a transition mutation?
  • What is a transversion mutation?
  • What is the difference between a silent, missense, and a nonsense mutation?
  • How do frameshift mutations affect the amino acid sequence of a protein?

B. DNA Repair Systems

Changes to the nucleotide sequence within DNA can occur at any time, and most of these changes are recognized immediately by cellular DNA repair systems.  We learned in Part 5 that the DNA double-helix always has a purine (adenine or guanine) paired with a pyrimidine (cytosine or thymine). Purine-pyrimidine base pairing results in a DNA molecule that is two nanometers (nm) wide. Repair enzymes run along the length of double-stranded DNA as it is replicating, ensuring that the replicated DNA is 2 nm wide.  If two purines are paired, a bulge in the DNA double-helix occurs, and if two pyrimidines are paired, the double helix narrows.  Variations in the width of the DNA double-helix signal that damage has occurred and that repair is necessary.

There is more than one DNA repair system because there are different types of mutations to correct. Even though each DNA repair system is different, there are some features in common among them. First, there is an enzyme that recognizes an incorrect base pair. Next, a nucleotide within this incorrect base pair and often a few extra nucleotides are removed. Finally, the correct nucleotide is incorporated by DNA replication.

There are six DNA repair systems:

  1. Proofreading
  2. Mismatch repair
  3. Base excision repair (BER)
  4. Nucleotide excision repair (NER)
  5. Homology directed repair (HDR)
  6. Non-homologous end joining (NHEJ)


Proofreading

The most common repair system involves the DNA replication machinery itself. As DNA polymerase synthesizes the daughter DNA strand, the polymerase occasionally inserts an incorrect base. Recall that most prokaryotic and eukaryotic DNA polymerases possess 3’–5’ exonuclease activity; the polymerase can back up, remove the incorrect base, and replace it with the correct base. In most cases, proofreading is the first line of defense in recognizing and replacing incorrect nucleotides.

Mismatch Repair

If an incorrect base pair is formed during DNA replication and the mistake is not removed by proofreading, the mismatch repair system is activated to fix the mistake.  Mismatch repair recognizes which of the two nucleotides in the base pair is incorrect and then removes the incorrect nucleotide. For example, suppose that cytosine has been paired accidentally with adenine during replication. When mismatch repair recognizes this mistake, it faces a dilemma: which of the two bases is incorrect? If mismatch repair randomly chose one of the two bases to correct, then 50% of the time it would remove the wrong base.

In mismatch repair, the cell must first identify which DNA strand contains the error. In the bacterium E. coli, the parental DNA strand is methylated (at adenine nucleotides) by DNA adenine methyltransferase (Dam), while newly synthesized DNA strands are unmethylated. The proteins in the mismatch repair system recognize the methylated parental strand, and then mismatch repair removes the mismatched nucleotide within the unmethylated daughter strand. 

There are four major proteins that make up the mismatch repair system in the bacterium E. coli: MutS, MutL, MutH, and MutU. These proteins function as follows:

  1. MutS slides along the DNA double helix and finds the mismatch.
  2. MutL combines with MutS, forming a MutS/MutL complex.
  3. MutH binds to a nearby DNA sequence containing a methylated adenine on the parental DNA strand.  In essence, MutH is the protein that determines which strand is the parental DNA strand.
  4. The MutS/MutL complex binds to MutH producing a loop in the DNA.  MutL functions as the bridge subunit that connects MutS and MutH.
  5. MutH is an endonuclease that makes a single-stranded DNA break in the backbone of the unmethylated (daughter) DNA strand.  This break occurs on the 5’ side of the G in the 5’-GATC-3’ sequence in the daughter DNA strand.
  6. MutU binds to the MutH cut site in the daughter strand.  MutU has helicase activity and functions to separate the daughter DNA strand from the parental DNA strand at the MutH cut site. 
  7. An exonuclease called ExoI degrades the daughter DNA strand in the 3’ to 5’ direction starting at the cut site produced by MutH.  Degradation of the daughter DNA strand removes the mismatched nucleotide.
  8. The gap in the daughter DNA strand is filled in by a DNA polymerase.
  9. The final covalent bond in the newly synthesized daughter DNA strand is produced by DNA ligase.
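
The logic of strand discrimination can be sketched in a few lines: the methylated parental strand is trusted as the template, so any mismatched position is corrected in the daughter strand. The Python below is a simplified illustration, not a model of the Mut proteins themselves; the function names are hypothetical, and the two strands are assumed to be written so that position i of one pairs with position i of the other.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(parental: str, daughter: str) -> list[int]:
    """Return positions where the daughter base does not pair with the parental base."""
    if len(parental) != len(daughter):
        raise ValueError("strands must be the same length")
    return [
        i for i, (p, d) in enumerate(zip(parental, daughter))
        if COMPLEMENT[p] != d
    ]


def repair_daughter(parental: str, daughter: str) -> str:
    """Correct the daughter strand, using the parental strand as the template."""
    repaired = list(daughter)
    for i in find_mismatches(parental, daughter):
        repaired[i] = COMPLEMENT[parental[i]]
    return "".join(repaired)


parental = "CTAGGATTC"  # methylated template strand, written 3' to 5'
daughter = "GATCCCAAG"  # new strand, written 5' to 3'; a C is paired with an A

print(find_mismatches(parental, daughter))
print(repair_daughter(parental, daughter))
```

Swapping the roles of the two strands in `repair_daughter` would instead change the parental adenine to match the incorrect cytosine, which is the outcome that methylation-based strand recognition prevents.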

Key Questions

  • Explain how mismatch repair works.
  • Describe the functions of MutS, MutL, MutH, MutU, exonuclease, DNA polymerase, and DNA ligase proteins in mismatch repair.
  • Which one of the four Mut proteins is responsible for identifying the mismatched base?

Base Excision Repair (BER)

Base excision repair (BER) corrects abnormal bases that are sometimes formed in DNA.  For example, occasionally cytosine can be spontaneously converted to uracil.  The conversion of cytosine to uracil is a transition mutation that does not produce a distortion in the width of the DNA double helix.

Base excision repair in the bacterium E. coli can remove uracil from DNA as follows:

  1. The enzyme uracil DNA glycosylase (UNG) recognizes uracil in the DNA. UNG cleaves the covalent bond that links uracil to deoxyribose (i.e., breaks the covalent bond attached to the 1’ carbon of deoxyribose).   Uracil is released, creating an abasic site in the DNA.  Note that the nucleotide at the abasic site still contains deoxyribose and the phosphate group; the nucleotide is just missing the nitrogenous base.
  2. At the abasic site, the DNA backbone is still intact in both strands. This abasic site is recognized by another enzyme, apurinic/apyrimidinic endonuclease 1 (APE1), which makes a nick in the DNA backbone at the 5’ end of the abasic site (i.e., cleaves the phosphodiester bond).
  3. At this point, DNA polymerase I uses its 5’-3’ exonuclease activity to remove the deoxyribose sugar and phosphate group at the abasic site. DNA polymerase I then fills in the correct nucleotide.
  4. DNA ligase seals the remaining nick in the DNA backbone.
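
The steps above can be sketched as a two-stage string operation: uracil is removed to leave an abasic site, and the site is then refilled using the opposite strand as the template. This Python sketch is a simplified illustration; the function name `base_excision_repair` and the use of `_` to mark an abasic site are assumptions made for the example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_excision_repair(damaged: str, template: str) -> tuple[str, str]:
    """Return (strand with abasic sites, repaired strand).

    The strands are written so that position i of one pairs with position i
    of the other.
    """
    if len(damaged) != len(template):
        raise ValueError("strands must be the same length")
    # Step 1: the glycosylase removes uracil, leaving an abasic site ("_").
    abasic = ["_" if base == "U" else base for base in damaged]
    # Steps 2-4: the site is cut out, refilled from the template, and sealed.
    repaired = [
        COMPLEMENT[template[i]] if base == "_" else base
        for i, base in enumerate(abasic)
    ]
    return "".join(abasic), "".join(repaired)


damaged = "ATGUCA"   # a cytosine has been converted to uracil
template = "TACGGT"  # opposite strand; the G still marks where C belongs

abasic_strand, repaired_strand = base_excision_repair(damaged, template)
print(abasic_strand)
print(repaired_strand)
```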

Key Questions

  • What is an abasic site?
  • Explain the function of UNG, APE1, DNA polymerase I, and DNA ligase in base excision repair.

Nucleotide Excision Repair (NER)

Ultraviolet (UV) light exposure can introduce a specific type of mutation (pyrimidine dimer) that distorts the structure of the DNA double-helix.  A pyrimidine dimer occurs when UV light forms a covalent linkage between two adjacent pyrimidines in the same DNA strand. When this type of mutation occurs, the hydrogen bonds between the two adjacent pyrimidines and the other DNA strand are broken. During DNA replication or transcription, when the DNA or RNA polymerase encounters a pyrimidine dimer, the polymerases either stop replication/transcription, or an incorrect base is placed in the synthesized DNA/RNA.
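
Because a pyrimidine dimer requires two adjacent pyrimidines in the same strand, candidate sites can be found by scanning a sequence for neighboring C and T bases. The following Python sketch is illustrative only; `potential_dimer_sites` is a hypothetical helper name, and the example sequence is made up for demonstration.

```python
PYRIMIDINES = {"C", "T"}


def potential_dimer_sites(strand: str) -> list[tuple[int, str]]:
    """Return (position, base pair of neighbors) for each adjacent pyrimidine pair."""
    strand = strand.upper()
    return [
        (i, strand[i:i + 2])
        for i in range(len(strand) - 1)
        if strand[i] in PYRIMIDINES and strand[i + 1] in PYRIMIDINES
    ]


# Positions are zero-based offsets into the strand.
print(potential_dimer_sites("AGTTACGCCA"))
```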

In E. coli, pyrimidine dimers can be repaired with the nucleotide excision repair (NER) system.  NER involves four proteins: UvrA, UvrB, UvrC, and UvrD. These four proteins remove a segment of DNA including the pyrimidine dimer, then replace the removed nucleotides via DNA replication.  NER occurs as follows:

  1. A complex consisting of two UvrA proteins and one UvrB protein scans the double stranded DNA in search of a pyrimidine dimer.
  2. Once a pyrimidine dimer is identified, the complex pauses over the dimer.  The UvrA proteins are released, while UvrC attaches to UvrB at the dimer site.
  3. UvrC is an endonuclease that cuts the damaged DNA strand on each side of the pyrimidine dimer.
  4. UvrD, which is a DNA helicase, separates the two DNA strands, releasing the short segment of damaged DNA, including the pyrimidine dimer itself. UvrB, UvrC, and UvrD are released.
  5. A DNA polymerase fills in the gap, using the parental DNA strand as a template.
  6. DNA ligase seals the remaining nick in the DNA backbone.

Key Questions

  • What nucleotides can be involved in a pyrimidine dimer?
  • What are the functions of the UvrA, UvrB, UvrC, UvrD, DNA polymerase, and DNA ligase proteins in NER?
  • Which Uvr protein unwinds and removes the damaged DNA?

Homology Directed Repair (HDR) (future content)

Non-homologous End Joining (NHEJ) (future content)

Human Genetic Diseases Associated with Faulty DNA Repair

The repair mechanisms in the bacterium E. coli described above have counterparts in human cells.  Although the proteins are not identical, much of the repair process is similar. Mutations in the genes that give rise to mismatch repair enzymes have been noted in human cells. When both copies of the gene are inactivated, then cancer can occur. Specifically, a type of inherited cancer called hereditary nonpolyposis colorectal cancer occurs because of mutations in the genes that produce human mismatch repair proteins. Mutations in the base excision repair system in humans have also been implicated in colon cancer.

Epithelial cells in the skin are exposed to UV light from the sun. As a result, the nucleotide excision repair system plays an important role in repairing pyrimidine dimers in epithelial cells. Xeroderma pigmentosum (XP), an autosomal recessive disease, occurs when one of the seven human genes involved in nucleotide excision repair is inactivated by mutation. Xeroderma pigmentosum patients are sensitive to sunlight (due to the inability to repair DNA damage caused by UV light) and are predisposed to developing skin cancer.

Key Questions

  • What inheritance pattern is most often associated with diseases caused by defects in the DNA repair systems? Why do you think this is so?

C. Trinucleotide Repeat Expansions and Disease

In humans, there are DNA sequences in which a series of three nucleotides is repeated consecutively (trinucleotide repeat). In most cases, a parent transmits these repeats to their offspring without any change in the repeat number. In a few human genetic diseases, however, this repeat can expand over the course of generations due to slippage of the DNA polymerases during replication. Once the expansion reaches a critical size, it can alter the function of the encoded protein, causing disease.

Huntington’s disease (HD) is an example of a neurodegenerative disease in humans caused by an expansion of a CAG repeat within the coding region of the HTT gene. Although we do not know the exact function of the encoded HTT protein in the brain, it is believed that HTT plays an important role in the function of neurons.

During protein synthesis, CAG encodes the amino acid glutamine. In unaffected people, there can be between 10 and 35 repeats of the CAG sequence in the HTT gene. These repeats lead to a string of glutamine amino acids within the protein. So long as the number of CAG repeats is 35 or fewer, the encoded HTT protein functions normally. However, in some families, the number of repeats can extend beyond 35, ranging from 36 to 120 repeats. The longer string of glutamine amino acids causes the HTT protein to degrade into smaller, toxic fragments. As these toxic protein fragments accumulate, neurons die prematurely. Over time, brain activity is altered, leading to symptoms of Huntington’s disease that include uncontrolled body movements, emotional problems, and decreased ability to learn and to make decisions. The number of CAG repeats in the HTT gene correlates with the severity of the disease; a patient with more CAG repeats typically has more severe symptoms than a person with fewer repeats.
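
The repeat threshold described above can be expressed as a simple count. The Python sketch below is illustrative only: the function names are hypothetical, the example sequences are made up, and the cutoff of 35 repeats is taken from this section rather than from any clinical guideline.

```python
import re

REPEAT_THRESHOLD = 35  # from the text: more than 35 CAG repeats is associated with disease


def longest_cag_run(dna: str) -> int:
    """Return the number of repeats in the longest uninterrupted run of CAG."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_htt_repeats(repeat_count: int) -> str:
    """Compare a CAG repeat count with the threshold given in the text."""
    if repeat_count > REPEAT_THRESHOLD:
        return "expanded"
    return "unaffected range"


typical = "GG" + "CAG" * 20 + "TT"   # made-up sequence with 20 repeats
expanded = "GG" + "CAG" * 40 + "TT"  # made-up sequence with 40 repeats

for sequence in (typical, expanded):
    count = longest_cag_run(sequence)
    print(count, classify_htt_repeats(count))
```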

Key Questions

  • Why do trinucleotide expansions above a certain threshold cause human disease?

Review Questions

Fill in the blank:

  1. A bulge in the DNA double strand width occurs when two _______________________ form a base pair.
  2. Exposure to radiation and chemicals is responsible for causing _______________ mutations.
  3. A single base substitution in the coding region of a gene that changes an amino acid codon into a stop codon is called a ______________________ mutation.
  4. A deletion of four nucleotides from the coding region of a gene would result in a _________________________ mutation.
  5. In each repair system, ____________________________ forms the final covalent bond in the damaged DNA strand after the correct nucleotide has been added.
  6. If a mutation occurs in the DNA such that guanine is across from uracil, then the most likely repair system to recognize and correct this mutation would be _______________________________ .
  7. In mismatch repair, ______________________________ recognizes and binds to the methylated adenine on the parental DNA strand.
  8. ________________________________ is a helicase used in nucleotide excision repair.
  9. __________________________________ is an example of an enzyme that has 3’ – 5’ exonuclease activity.
  10. Xeroderma pigmentosum is caused by a defect in the ____________________________________________ repair system.
  11. CAG encodes the amino acid ____________________________ and when the number of CAG repeats exceeds _____________ then neurological symptoms appear consistent with Huntington’s disease.
