7 - Mutations and DNA Repair

The function of DNA is to store genetic information. To store genetic information effectively, it is important that the DNA sequence remains unchanged from generation to generation. However, sometimes mistakes occur during DNA replication; recall one mistake is made every 10 to 100 million nucleotides incorporated into daughter DNA strands in bacteria. Further, environmental agents, such as ionizing radiation, ultraviolet light, and a myriad of chemicals damage DNA. DNA damage is often detrimental; damage to structural genes worsens protein activity and negatively affects phenotype. As a result, it may seem that it would be beneficial to fix all replication errors or DNA damage; however, changes in the DNA sequence can be advantageous in rare, but important, cases. Alterations in the DNA sequence can create new structural gene variants called alleles in a population. Evolution requires this diversity of alleles; natural selection chooses which allele combinations survive and reproduce in a particular environment. For this reason, a low level of mutation is required for evolutionary change. In this section, we will define mutation, discuss various ways mutations can be classified, learn the mechanisms used by a bacterial cell to repair damaged DNA, and discuss situations in which defective DNA repair leads to disease in humans.

A. Mutations

What is a mutation?

A mutation is a change in the DNA sequence that is inherited by daughter cells after cell division. When environmental agents, such as ionizing radiation and ultraviolet light, damage the DNA double-helix, induced mutations occur. In other cases, DNA polymerases inadvertently incorporate an incorrect nucleotide into the daughter DNA strand during DNA replication. This DNA replication error is called a spontaneous mutation.

Key Questions

What is the definition of mutation?
What is the difference between an induced and a spontaneous mutation?

Germline and somatic mutations

What determines if a change in the DNA sequence is inherited? Gametes, such as sperm and egg cells, arise from specialized cells called germline cells. In contrast, the non-gamete cells within the body, such as muscle cells, neurons, and epithelial cells, are considered somatic cells. Therefore, when a mutation arises in a germline cell (germline mutation), the mutation is transmitted via the gametes to the individual's offspring. In contrast, a mutation generated in a somatic cell (somatic mutation) is not transmitted to the individual's offspring. Even though somatic mutations are not inherited by the individual's offspring, alterations in the genomes of somatic cells are still considered mutations. When a somatic cell divides by mitosis, the mutation passes to the daughter cells.

The timing of the mutation can also influence transmission to offspring. If a mutation arises in a cell during embryonic development, then chances are reasonable that the mutation will be found in both the germline and somatic cells of the adult. Since this mutation is now found in germline cells, the mutation will then be transmitted to the individual's offspring. In contrast, when a mutation arises late in development or in the adult organism, the mutation may only be found in one type of somatic cell and will likely be absent from the germline cells. As a result, this somatic mutation will not be transmitted to the individual's offspring.

The timing of the mutation can also affect the severity of the phenotype. If a somatic mutation arises early in the development of a particular organ, it is likely that the majority of cells that make up the organ will contain the mutation. When an organ is made up of mostly mutant cells, the phenotype of the organ may be negatively impacted. Alternatively, if only a subset of cells in the organ contains the mutation, the negative phenotypic effect is comparatively mild.

Key Questions

What is the difference between a germline and a somatic mutation?
How likely is a germline mutation to be inherited by offspring? How likely is a somatic mutation to be passed from parents to offspring?
If a mutation arises early in development, is it more or less likely to be transmitted to the germline?

Point Mutations

Sometimes a mutation alters a single nitrogenous base in the DNA molecule. When only one nitrogenous base has been altered, the mutation is classified as a point mutation. When a purine is changed to another purine (e.g., adenine to guanine), or a pyrimidine is exchanged for another pyrimidine (e.g., cytosine to thymine), the point mutation is a transition. Alternatively, when a purine is exchanged for a pyrimidine (or vice versa), the point mutation is a transversion.
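As a rough illustration of these definitions, the short Python sketch below classifies a single-base substitution in DNA as a transition or a transversion. The function name `classify_substitution` is hypothetical and was chosen only for this example.

```python
# Purines and pyrimidines found in DNA.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(original: str, mutant: str) -> str:
    """Classify a DNA point mutation as a transition or a transversion."""
    original, mutant = original.upper(), mutant.upper()
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutant not in valid:
        raise ValueError("bases must be one of A, G, C, or T")
    if original == mutant:
        raise ValueError("no substitution: the two bases are identical")
    # Transition: both bases belong to the same chemical class.
    same_class = {original, mutant} <= PURINES or {original, mutant} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # purine to purine
print(classify_substitution("C", "T"))  # pyrimidine to pyrimidine
print(classify_substitution("A", "C"))  # purine to pyrimidine
```

The first two calls report a transition and the third reports a transversion.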

Silent, Missense, and Nonsense Mutations

When a mutation occurs in the intergenic or repetitive DNA sequences within chromosomes, the mutation often has mild or no phenotypic consequences. Even though these mutations outside of the structural genes typically do not alter the cell's phenotype, they are useful in forensics because they define the DNA fingerprint of an individual (see Chapter 1). We will focus our attention on mutations that occur within structural genes. Recall that structural genes are transcribed to produce messenger RNA (mRNA) molecules. The mRNA is then translated when the ribosome reads codons, each three nitrogenous bases long, to generate the amino acid sequence of the encoded protein (see Chapter 11). When one nitrogenous base has been exchanged for another nitrogenous base, but there is no change in the amino acid sequence of the encoded protein, we refer to the mutation as a silent mutation (see Figure 7.1). Silent mutations occur because the genetic code is degenerate; that is, more than one triplet codon can encode the same amino acid (see Chapter 11).

WILD-TYPE 5’ AUG . UUC . GUG . CAC . UUA . AUC . UAG 3’

MET . PHE . VAL . HIS . LEU . ILE . STOP

SILENT 5’ AUG . UUU . GUG . CAC . UUA . AUC . UAG 3’

MET . PHE . VAL . HIS . LEU . ILE . STOP

MISSENSE 5’ AUG . UUC . GUG . AAC . UUA . AUC . UAG 3’

MET . PHE . VAL . ASN . LEU . ILE . STOP

NONSENSE 5’ AUG . UUC . GUG . CAC . UAA . AUC . UAG 3’

MET . PHE . VAL . HIS . STOP

FIGURE 7.1 Base Substitutions Can Affect Gene Structure and Function. The sequence of the wild-type RNA is indicated at the top, with the amino acid sequence of the translated protein shown below it. In each mutation, the effect on the amino acid sequence of the encoded protein is indicated in bold. Silent mutations do not change the encoded amino acid. Missense mutations change a single amino acid in the encoded protein. Nonsense mutations change a codon that encodes an amino acid into a stop codon.

When a point mutation substitutes a single amino acid in the encoded protein, the mutation is a missense mutation. Although missense mutations seem relatively minor, there are examples of missense mutations that cause disease. For example, sickle cell anemia is an autosomal recessive disease caused by a missense mutation in the structural gene that makes the beta globin protein (i.e., one of the subunits of the oxygen transport protein hemoglobin). In sickle cell anemia, the missense mutation in the beta globin gene mRNA replaces the amino acid glutamic acid with the amino acid valine.

Sometimes a point mutation results in the formation of a premature stop codon, terminating the translation process. This type of point mutation is a nonsense mutation. Nonsense mutations are usually more severe in phenotype than missense mutations because nonsense mutations cause the encoded protein to be shorter than normal. These shorter proteins are often nonfunctional.
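The three categories in Figure 7.1 can be checked with a small Python sketch. It builds the standard genetic code, translates each mRNA from Figure 7.1, and labels the point mutation. The names `CODON_TABLE`, `translate`, and `classify_point_mutation` are hypothetical. The sketch uses one-letter amino acid codes with `*` for a stop codon, rather than the three-letter codes in the figure, and it assumes the altered codon originally encodes an amino acid.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna: str) -> str:
    """Translate an mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        protein.append(amino_acid)
        if amino_acid == "*":
            break
    return "".join(protein)


def classify_point_mutation(wild_type: str, mutant: str) -> str:
    """Label a single-base substitution as silent, missense, or nonsense."""
    if len(wild_type) != len(mutant):
        raise ValueError("a point mutation does not change the sequence length")
    changed = [i for i, (a, b) in enumerate(zip(wild_type, mutant)) if a != b]
    if len(changed) != 1:
        raise ValueError("expected exactly one changed base")
    start = changed[0] - changed[0] % 3  # first base of the affected codon
    before = CODON_TABLE[wild_type[start:start + 3]]
    after = CODON_TABLE[mutant[start:start + 3]]
    if before == "*":
        raise ValueError("this sketch only handles codons that encode an amino acid")
    if before == after:
        return "silent"
    return "nonsense" if after == "*" else "missense"


WILD_TYPE = "AUGUUCGUGCACUUAAUCUAG"
for label, mrna in [("SILENT", "AUGUUUGUGCACUUAAUCUAG"),
                    ("MISSENSE", "AUGUUCGUGAACUUAAUCUAG"),
                    ("NONSENSE", "AUGUUCGUGCACUAAAUCUAG")]:
    print(label, translate(mrna), classify_point_mutation(WILD_TYPE, mrna))
```

Each of the three mutant sequences from Figure 7.1 receives the label that the figure assigns to it.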

Frameshift Mutations

In some cases, nitrogenous bases are either inserted (i.e., an insertion mutation) or deleted from the sequence of the structural gene. Because the ribosome reads the encoded mRNA one codon at a time during translation (see Chapter 11), insertions or deletions change the reading frame of the mRNA molecule. When the reading frame changes, the entire amino acid sequence of the encoded protein changes from the insertion/deletion (indel) site onward. A mutation that changes the reading frame is called a frameshift mutation (see Figure 7.2).

WILD-TYPE 5’ AUG . UUC . GUG . CAC . UUA . AUC . UAG 3’

MET . PHE . VAL . HIS . LEU . ILE . STOP

FRAMESHIFT 5’ AUG . UUC . GCU . GCA . CUU . AAU . CUA . G 3’

MET . PHE . ALA . ALA . LEU . ASN . LEU ….

FIGURE 7.2 Frameshift Mutations Affect Gene Structure and Function. By inserting a single cytosine base in the third codon (underlined), a reading frame shift occurs that changes the amino acid sequence of the protein. Note that this frameshift mutation eliminates the stop codon.
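The reading frame shift in Figure 7.2 can be reproduced with a minimal Python sketch. It inserts a single cytosine into the wild-type mRNA and translates both versions. The helper names `translate` and `insert_base` are hypothetical, and amino acids are written with one-letter codes (`*` marks a stop codon) rather than the three-letter codes used in the figure.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna: str) -> str:
    """Translate complete codons only, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        protein.append(amino_acid)
        if amino_acid == "*":
            break
    return "".join(protein)


def insert_base(mrna: str, position: int, base: str) -> str:
    """Return a copy of the mRNA with one base inserted before `position`."""
    return mrna[:position] + base + mrna[position:]


wild_type = "AUGUUCGUGCACUUAAUCUAG"
# Insert a C after the first base of the third codon, as in Figure 7.2.
frameshift = insert_base(wild_type, 7, "C")

print(translate(wild_type))   # ends with '*', the stop codon
print(translate(frameshift))  # every codon after the insertion is read differently
```

In this sketch the frameshifted translation contains no `*`, which matches the note in Figure 7.2 that the stop codon is eliminated.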

Key Questions

What are transition and transversion mutations?
What is the difference between a silent, missense, and a nonsense mutation?
How do frameshift mutations affect the amino acid sequence of the encoded protein?

B. DNA Repair Systems

Most changes to the DNA sequence are recognized by cellular DNA repair systems immediately. Some of these DNA repair systems recognize changes in the width of the DNA double-helix. Recall that the DNA double-helix always pairs a purine (adenine or guanine) with a pyrimidine (cytosine or thymine). Purine-pyrimidine base pairing results in a DNA molecule that is 2 nanometers (nm) wide (see Chapter 5). Many DNA repair enzymes scan the DNA double-helix during replication, ensuring that the daughter DNA molecules are 2 nm wide. If two purines are paired accidentally, a bulge in the DNA double-helix occurs. Similarly, if two pyrimidines are paired, the double-helix narrows. Variations in the width of the DNA double-helix signal that damage has occurred, and DNA repair is activated.
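The width rule described above can be written as a short Python sketch. The function name `helix_width_signal` and the three labels it returns are hypothetical. The sketch models only the width of a base pair, not whether the two bases are the correct hydrogen bonding partners.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def helix_width_signal(base1: str, base2: str) -> str:
    """Report how a base pair would affect the width of the double-helix."""
    pair = (base1.upper(), base2.upper())
    purine_count = sum(base in PURINES for base in pair)
    pyrimidine_count = sum(base in PYRIMIDINES for base in pair)
    if purine_count + pyrimidine_count != 2:
        raise ValueError("bases must be one of A, G, C, or T")
    if purine_count == 2:
        return "bulge"          # two purines are wider than 2 nm
    if pyrimidine_count == 2:
        return "narrowing"      # two pyrimidines are narrower than 2 nm
    return "normal width"       # one purine paired with one pyrimidine


for pair in [("A", "T"), ("A", "G"), ("C", "T")]:
    print(pair, helix_width_signal(*pair))
```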

There are multiple DNA repair systems because there are different types of mutations to correct. Even though each DNA repair system has different components and recognizes a different defect in the DNA, there are some features common among all DNA repair systems. First, each pathway contains an enzyme that recognizes the DNA damage. Next, the DNA damage and often a few extra nucleotides are removed from one of the two DNA strands. Finally, the excised nucleotides are replaced by a DNA polymerase and the final phosphodiester bond in the damaged DNA strand is formed by DNA ligase.

There are six DNA repair systems:

Proofreading
Mismatch repair
Base excision repair (BER)
Nucleotide excision repair (NER)
Homology directed repair (HDR)
Non-homologous end joining (NHEJ)

You will only need to know about the first four DNA repair systems in our current BIO375 class. The homology directed repair (HDR) and non-homologous end joining (NHEJ) content will be added to BIO375 in the future.

Proofreading

The first-line DNA repair system involves the DNA polymerases we discussed in Chapter 6. When a DNA polymerase synthesizes the daughter DNA strand, the DNA polymerase occasionally inserts an incorrect nucleotide. Recall that DNA polymerases possess 3’ to 5’ exonuclease activity, meaning the DNA polymerase reverses directions and removes the incorrect nucleotide in the 3' to 5' direction. The DNA polymerase then moves 5' to 3' again, replacing the excised nucleotide.

Mismatch Repair

If an incorrect base pair is formed during DNA replication and the mistake is not removed by proofreading, the mismatch repair system is activated to fix the mistake. Mismatch repair recognizes which of the two nitrogenous bases in the base pair is correct and then removes the incorrect nitrogenous base. For example, suppose that cytosine in the daughter DNA strand has been paired accidentally with adenine in the parental DNA strand during DNA replication. When mismatch repair recognizes this mistake, it now faces a dilemma: which of the two nitrogenous bases is incorrect? If mismatch repair randomly chooses the nitrogenous base to correct, then 50% of the time, it will remove the adenine in the parental DNA strand, instead of the cytosine in the daughter DNA strand.

In mismatch repair, the cell must first distinguish the parental and daughter DNA strands. Recall that in the bacterium E. coli, the parental DNA strand is methylated by DNA adenine methyltransferase (Dam), while newly synthesized daughter DNA strands are unmethylated. The proteins in the mismatch repair system recognize the methylated parental DNA strand, and then mismatch repair removes the mismatched nitrogenous base within the unmethylated daughter DNA strand.

In addition to Dam, there are seven other proteins that make up the mismatch repair system in the bacterium E. coli: MutS, MutL, MutH, MutU, ExoI, DNA polymerase, and DNA ligase. These seven proteins function as follows:

The MutS protein slides along the DNA double helix and identifies the mismatched base pair.
The MutH protein binds to a nearby DNA sequence containing a methylated adenine in the parental DNA strand. MutH is the protein that distinguishes the parental from the daughter DNA strand.
The MutL protein binds to both MutS and MutH forming a loop in the DNA.
MutH is an endonuclease that makes a single-stranded DNA break in the backbone of the unmethylated daughter DNA strand. This break occurs immediately 5’ to the G in the 5’-GATC-3’ sequence recognized by MutH in the daughter DNA strand.
The MutU protein binds to the MutH cut site in the daughter DNA strand. MutU has DNA helicase activity and functions to separate the daughter DNA strand from the parental DNA strand at the MutH cut site.
The ExoI protein is the exonuclease that digests the damaged daughter DNA strand in the 3’ to 5’ direction starting at the cut site produced by MutH. Digestion of the damaged daughter DNA strand continues until ExoI removes the mismatched nucleotide.
The gap in the daughter DNA strand is filled in by a DNA polymerase (i.e., either the DNA polymerase III holoenzyme or DNA polymerase I).
The final phosphodiester bond in the newly synthesized daughter DNA strand is formed by DNA ligase.
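The strand-discrimination logic of the steps above can be sketched in Python. This is a deliberately simplified model, not a description of the enzymes themselves. The two strands are written position by position so that each parental base sits across from its daughter partner. The function `find_mismatches` stands in for MutS, and `repair_daughter` stands in for the remaining proteins. Both function names are hypothetical, and the sketch rewrites only the mismatched base instead of excising the stretch of daughter DNA between the MutH cut site and the mismatch.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(parental: str, daughter: str) -> list[int]:
    """Return positions where the daughter base does not pair with the parental base."""
    if len(parental) != len(daughter):
        raise ValueError("strands must be the same length")
    return [i for i, (p, d) in enumerate(zip(parental, daughter))
            if COMPLEMENT[p] != d]


def repair_daughter(parental: str, daughter: str) -> str:
    """Correct the daughter strand, always trusting the methylated parental strand."""
    repaired = list(daughter)
    for i in find_mismatches(parental, daughter):
        # The parental base is kept; only the daughter base is replaced.
        repaired[i] = COMPLEMENT[parental[i]]
    return "".join(repaired)


parental = "GATCAGT"   # methylated strand, treated as correct
daughter = "CTAGCCA"   # cytosine mispaired with adenine at index 4

print(find_mismatches(parental, daughter))
print(repair_daughter(parental, daughter))
```

Because the parental strand is never edited, the adenine-cytosine mismatch from the example above is always resolved by replacing the cytosine, which avoids the 50% error rate of a random choice.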

Key Questions

What is the purpose of mismatch repair?
Describe the functions of the Dam, MutS, MutL, MutH, MutU, ExoI, DNA polymerase, and DNA ligase proteins in mismatch repair.

Base Excision Repair (BER)

Base excision repair (BER) corrects abnormal nitrogenous bases that sometimes form in DNA. For example, occasionally cytosine in DNA is spontaneously converted into uracil. BER in the bacterium E. coli repairs this transition mutation as follows:

The enzyme uracil DNA glycosylase (UNG) recognizes uracil in the DNA. UNG cleaves the covalent bond that links uracil to deoxyribose (i.e., breaks the covalent bond attached to the 1’ carbon of deoxyribose). The uracil base is released from the rest of the nucleotide, creating an abasic site in the DNA. Note that the nucleotide at the abasic site still contains deoxyribose and the phosphate group; the nucleotide is just missing the uracil nitrogenous base.
The abasic site is detected by apurinic/apyrimidinic endonuclease 1 (APE1), which cleaves the phosphodiester bond at the 5’ end of the abasic site. This nick in the DNA backbone generates the free 3'-OH group that is used as the binding site for DNA polymerase I.
DNA polymerase I uses its 5’ to 3’ exonuclease activity to remove the nucleotide (i.e., the deoxyribose sugar and phosphate group) at the abasic site. DNA polymerase I then uses its DNA synthesis activity (5' to 3' polymerase activity) to replace the removed nucleotide with the correct nucleotide.
DNA ligase forms the final phosphodiester bond in the DNA strand.

Note that there are similar base excision repair pathways to repair other unconventional nitrogenous bases that form in DNA. For example, base excision repair removes the nitrogenous base hypoxanthine (H) that forms spontaneously from adenine. Instead of UNG, another glycosylase releases hypoxanthine; the other enzymes in BER work the same as described above.
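The base excision repair steps can be sketched in Python as a two-stage process. This is a simplified model under stated assumptions: the damaged strand and the opposite (template) strand are written position by position, an abasic site is marked with the placeholder character `_`, and the function names `remove_uracil`, `fill_abasic_sites`, and `base_excision_repair` are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
ABASIC = "_"  # placeholder for a nucleotide that has lost its nitrogenous base


def remove_uracil(strand: str) -> str:
    """Glycosylase step: release each uracil, leaving an abasic site."""
    return strand.replace("U", ABASIC)


def fill_abasic_sites(strand: str, template: str) -> str:
    """Endonuclease, polymerase, and ligase steps: replace each abasic
    nucleotide with the nucleotide that pairs with the opposite strand."""
    if len(strand) != len(template):
        raise ValueError("strands must be the same length")
    return "".join(COMPLEMENT[t] if s == ABASIC else s
                   for s, t in zip(strand, template))


def base_excision_repair(strand: str, template: str) -> str:
    return fill_abasic_sites(remove_uracil(strand), template)


damaged = "ATGUCA"    # the U arose from a cytosine
template = "TACGGT"   # opposite strand; the G across from U is intact

print(remove_uracil(damaged))
print(base_excision_repair(damaged, template))
```

In the example, the guanine across from the uracil directs the insertion of a cytosine, which restores the original base pair.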

FIGURE 7.4 Base Excision Repair (BER).

Key Questions

What is an abasic site?
Explain the functions of UNG, APE1, DNA polymerase I, and DNA ligase in base excision repair.

Nucleotide Excision Repair (NER)

Ultraviolet (UV) light exposure often leads to the formation of a pyrimidine dimer in DNA that distorts the structure of the DNA double-helix. A pyrimidine dimer occurs when UV light causes additional covalent bonds to form between two adjacent pyrimidines in the same DNA strand. When this type of alteration occurs, the hydrogen bonds between the two damaged pyrimidines and the nitrogenous bases in the other DNA strand are broken. When DNA polymerases encounter a pyrimidine dimer in the parental DNA strand, the DNA polymerases either stop replication altogether, or incorrect nitrogenous bases are incorporated into the daughter DNA strands.

In E. coli, pyrimidine dimers are repaired with the nucleotide excision repair (NER) system. NER involves six proteins: UvrA, UvrB, UvrC, UvrD, DNA polymerase, and DNA ligase. These six proteins remove a segment of DNA including the pyrimidine dimer and then replace the removed nucleotides by DNA synthesis. NER occurs as follows:

A complex consisting of two UvrA proteins and one UvrB protein scans the double stranded DNA in search of a pyrimidine dimer. Pyrimidine dimers often distort the width of the DNA double-helix, alerting the UvrA/UvrB protein complex.
Once a pyrimidine dimer is identified, the UvrA/UvrB complex pauses over the dimer. The UvrA proteins are released, while UvrC attaches to UvrB at the pyrimidine dimer site.
UvrC is an endonuclease that cuts the damaged DNA strand on each side of the pyrimidine dimer.
UvrD, which is a DNA helicase, separates the two DNA strands, releasing the short segment of damaged DNA, including the pyrimidine dimer itself. The UvrB, UvrC, and UvrD proteins are released.
A DNA polymerase (i.e., either the DNA polymerase III holoenzyme or DNA polymerase I) fills in the sequence gap, using the other DNA strand as a template.
DNA ligase forms the final covalent bond in the newly synthesized DNA.
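The excise-and-resynthesize logic of the steps above can be sketched in Python. This is a simplified model under stated assumptions: the cross-linked pyrimidines are marked with lowercase letters, the two strands are written position by position, and the `flank` parameter (the number of extra nucleotides removed on each side of the dimer) is a hypothetical value chosen for illustration, not a measured property of UvrC. The function names `find_dimer` and `nucleotide_excision_repair` are also hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_dimer(strand: str):
    """Return (start, end) of the first run of lowercase (cross-linked) bases,
    or None if the strand has no dimer. Stands in for the UvrA/UvrB scan."""
    start = next((i for i, base in enumerate(strand) if base.islower()), None)
    if start is None:
        return None
    end = start
    while end < len(strand) and strand[end].islower():
        end += 1
    return start, end


def nucleotide_excision_repair(damaged: str, template: str, flank: int = 4) -> str:
    """Cut on each side of the dimer, discard the segment, and refill the gap
    using the opposite strand as the template."""
    site = find_dimer(damaged)
    if site is None:
        return damaged
    start, end = site
    cut_left = max(0, start - flank)            # cut on one side of the dimer
    cut_right = min(len(damaged), end + flank)  # cut on the other side
    # Gap filling: synthesize the complement of the undamaged strand.
    patch = "".join(COMPLEMENT[base] for base in template[cut_left:cut_right])
    return damaged[:cut_left] + patch + damaged[cut_right:]


damaged = "GCATGCttAGCCGTA"   # 'tt' marks a thymine dimer
template = "CGTACGAATCGGCAT"  # undamaged opposite strand

print(find_dimer(damaged))
print(nucleotide_excision_repair(damaged, template))
```

The repaired strand contains two ordinary thymines where the dimer was, because the replacement segment is copied from the undamaged strand.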

FIGURE 7.5 Nucleotide Excision Repair (NER).

Key Questions

Which nitrogenous bases can potentially form pyrimidine dimers?
What are the functions of the UvrA, UvrB, UvrC, UvrD, DNA polymerase, and DNA ligase proteins in NER?
Which Uvr protein is the helicase that unwinds and releases the damaged DNA?

Homology Directed Repair (HDR) (future content)

Nonhomologous End Joining (NHEJ) (future content)

Human Genetic Diseases Associated with Faulty DNA Repair

The DNA repair mechanisms in the bacterium E. coli described above have counterparts in eukaryotic cells. Although the proteins are not identical, each repair process is similar. Mutations in the genes that give rise to mismatch repair enzymes have been noted in human cells. When the genes involved in DNA repair are damaged by mutations, cancer often occurs. For example, a type of inherited cancer called hereditary nonpolyposis colorectal cancer occurs because of mutations in the genes that produce human mismatch repair proteins. Mutations in the base excision repair system in humans have also been implicated in colon cancer. Epithelial cells in the skin are constantly exposed to UV light from the sun. As a result, the nucleotide excision repair system plays an important role in repairing pyrimidine dimers that form in epithelial cells. Xeroderma pigmentosum (XP), an autosomal recessive disease, occurs when one of the genes involved in nucleotide excision repair is damaged by mutation. Xeroderma pigmentosum patients are sensitive to sunlight, due to the inability to repair DNA damage caused by UV light, and are predisposed to forming skin cancer.

Insert Xeroderma pigmentosum image here.

Key Questions

What inheritance pattern is most often associated with diseases caused by defects in the DNA repair systems? Why do you think this is so?

C. Trinucleotide Repeat Expansions and Disease

In humans, certain structural genes contain repeats of a particular three nucleotide sequence (trinucleotide repeat). In most cases, a parent transmits these sequence repeats to their offspring without any change in the repeat number. However, in a few human genetic diseases, this repeat can expand over the course of generations due to a defect in DNA replication. This type of mutation is called a trinucleotide repeat expansion. Once the expansion exceeds a threshold size, the function of the encoded protein is altered, causing disease.

Huntington’s disease (HD) is an example of a neurodegenerative disease in humans caused by a trinucleotide repeat expansion of 5'-CAG-3' within the structural gene HTT. Although the exact function of the encoded HTT protein is unknown, it is believed that the HTT protein plays an important role in the function of neurons. During protein synthesis, the 5'-CAG-3' codon encodes the amino acid glutamine. In unaffected people, there can be between 10 and 35 repeats of the 5'-CAG-3' codon in the HTT gene. These repeats lead to a string of 10-35 glutamine amino acids within the encoded protein. As long as the number of 5'-CAG-3' repeats does not exceed 35, the encoded HTT protein functions normally. However, in some families, the number of repeats can extend beyond 35, ranging from 36 to 120 repeats. This longer string of glutamine amino acids causes the HTT protein to degrade into small, toxic fragments. As these toxic protein fragments accumulate, neurons die prematurely. Over time, brain activity is altered, leading to symptoms of Huntington’s disease, such as uncontrolled body movements, emotional problems, and decreased ability to learn and to make decisions. The number of 5'-CAG-3' repeats in the HTT gene correlates with the severity of the disease; a patient with more 5'-CAG-3' codon repeats typically has more severe symptoms than a person with fewer repeats.
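The repeat threshold described above can be illustrated with a short Python sketch. It counts the longest uninterrupted run of CAG in a DNA sequence and compares the count with the threshold of 35 repeats given in the text. The function names `longest_cag_run` and `classify_htt_allele`, the returned labels, and the example sequences are all hypothetical. The sketch is a simple text search that ignores the reading frame, so it is an illustration rather than a diagnostic method.

```python
import re


def longest_cag_run(dna: str) -> int:
    """Return the largest number of consecutive CAG repeats in the sequence."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_htt_allele(repeat_count: int, threshold: int = 35) -> str:
    """Compare a repeat count with the threshold stated in the text."""
    if repeat_count > threshold:
        return "expanded (associated with Huntington's disease)"
    return "typical range"


# Made-up sequences: a start codon, a run of CAG repeats, and one more codon.
typical = "ATG" + "CAG" * 20 + "CCG"
expanded = "ATG" + "CAG" * 40 + "CCG"

for name, sequence in [("typical", typical), ("expanded", expanded)]:
    count = longest_cag_run(sequence)
    print(name, count, classify_htt_allele(count))
```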

Insert a Huntington's disease figure here.

Key Questions

Why do more than 35 repeats of 5'-CAG-3' lead to Huntington's disease?

Review Questions

Fill in the blank:

A bulge in the DNA double-helix occurs when two _______________________ form a base pair.
Exposure to radiation and chemicals is responsible for causing _______________ mutations.
A single nucleotide substitution in a structural gene that exchanges an amino acid for a stop codon is called a ______________________ mutation.
A deletion of four nucleotides from the coding region of a gene would result in a _________________________ mutation.
In each DNA repair system, ____________________________ forms the final phosphodiester bond in the DNA strand after the DNA damage has been repaired.
If a mutation occurs in the DNA such that guanine is paired with uracil, then the most likely DNA repair system to recognize and correct this mutation would be _______________________________ .
In mismatch repair, ______________________________ recognizes and binds to the methylated adenine on the parental DNA strand.
________________________________ is the DNA helicase used in nucleotide excision repair.
__________________________________ is an example of an enzyme that has 3’ to 5’ exonuclease activity.
Xeroderma pigmentosum is caused by a defect in the ____________________________________________ repair system.
5'-CAG-3' encodes the amino acid ____________________________ and when the number of 5'-CAG-3' repeats exceeds _____________, neurological symptoms appear consistent with Huntington’s disease.