9 - Transcription

When a gene is activated, the gene is transcribed, producing an RNA intermediate. Structural genes are genes that are transcribed to produce messenger RNA (mRNA) molecules. The mRNA molecule is then translated to make a protein product. Nonstructural genes are also transcribed to produce RNA molecules; however, the RNA molecule is not translated and instead functions directly in the cell. These functional RNA molecules, called noncoding RNAs (ncRNAs), include transfer RNA molecules (tRNAs), ribosomal RNA molecules (rRNAs), and the Xist and Tsix RNA molecules discussed in Part 2.

Key Questions

  • What is the difference between a structural and a nonstructural gene?

A. Transcription in Bacteria

Expression of Structural Genes

What factors determine whether a bacterial structural gene is expressed, that is, whether the gene is activated to make an mRNA molecule? Gene expression requires interactions between transcription factor proteins and specific DNA sequences near the gene.

The DNA sequences that regulate the expression of a particular structural gene include (see figure 9.1):

Transcription produces an RNA molecule that is complementary to the template or antisense strand of DNA. The other DNA strand, the one that forms hydrogen bonds with the template DNA strand, is called the coding or sense DNA strand. The coding DNA strand is identical in sequence to the RNA transcript, except that the RNA molecule contains uracil (U) instead of thymine (T).
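The relationship between the template strand, the coding strand, and the RNA transcript can be illustrated with a short Python sketch. This is only a minimal illustration: the function names and the nine-nucleotide example sequence are hypothetical and do not come from a real gene.

```python
# Base-pairing rules used when RNA is synthesized from a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base-pairing rules between the two DNA strands.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def coding_strand(template_3_to_5: str) -> str:
    """Return the coding strand (5'->3') that pairs with the template."""
    return "".join(DNA_COMPLEMENT[base] for base in template_3_to_5)


template = "TACGGCATT"            # hypothetical template strand, written 3'->5'
rna = transcribe(template)        # complementary to the template
coding = coding_strand(template)  # same as the RNA, but with T instead of U
```

Replacing every T in `coding` with U gives the same sequence as `rna`, which is the point made in the paragraph above.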

Figure 9.1 DNA sequences that control transcription.  --- Image created by SL

Key Questions

  • What are the functions of the three DNA sequences that regulate transcription?
  • What is the difference between the template and the coding DNA strands?

Transcription Stages

Transcription of structural genes in the bacterium E. coli has the following three stages (see figure 9.2):

  1. Initiation. During the initiation stage of transcription in bacteria, a transcription factor protein called sigma (σ) factor guides the RNA polymerase to the promoter.
  2. Elongation. During elongation, the RNA polymerase bound to the promoter acts as a DNA helicase, separating the two DNA strands, forming an open complex. RNA polymerase then reads the template DNA strand while synthesizing a complementary mRNA transcript.
  3. Termination. The termination stage of transcription involves the release of the RNA polymerase and the mRNA molecule from the DNA.

Figure 9.2 Transcription Overview --- Image created by SL.

Key Questions

  • What is happening during the three stages of transcription?
  • What is the function of sigma factor?

Promoter Structure in Bacteria

The bacterial promoter is located upstream (typically drawn to the left) of the structural gene to be transcribed and serves as a docking site for the sigma (σ) factor protein and later RNA polymerase. DNA sequence elements within the promoter are numbered relative to the +1 site, the first nucleotide in the template DNA strand that is transcribed (see figure 9.3). Important DNA sequences within the bacterial promoter include the following:

Both the -35 and -10 sequences described above (5’-TTGACA-3’ and 5’-TATAAT-3’) are consensus sequences, meaning that they are the “average” sequences found when the DNA sequences of many E. coli promoters are compared. Some bacterial promoters are strong promoters, whereas others are weak promoters. The difference between strong and weak promoters largely depends on how closely the promoter DNA sequence in question matches the -35 and -10 consensus sequences. Strong promoters initiate transcription frequently, while weak promoters initiate transcription less frequently.
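One rough way to picture "how closely a promoter matches the consensus" is to count matching positions. The Python sketch below does only that. The function names and the two example promoters are hypothetical, and a simple match count is an assumption made for illustration, not a quantitative predictor of promoter strength.

```python
CONSENSUS_35 = "TTGACA"  # -35 consensus sequence from the text
CONSENSUS_10 = "TATAAT"  # -10 consensus sequence from the text


def count_matches(observed: str, consensus: str) -> int:
    """Count positions where the observed sequence equals the consensus."""
    if len(observed) != len(consensus):
        raise ValueError("sequences must have the same length")
    return sum(o == c for o, c in zip(observed, consensus))


def consensus_score(minus_35: str, minus_10: str) -> int:
    """Total number of matches (0 to 12) to the two consensus sequences."""
    return (count_matches(minus_35, CONSENSUS_35)
            + count_matches(minus_10, CONSENSUS_10))


# Hypothetical promoters, chosen only to illustrate the comparison.
strong_like = consensus_score("TTGACA", "TATAAT")  # perfect match
weak_like = consensus_score("TTCACG", "TAGGAT")    # several mismatches
```

Under this simplified view, the promoter with the higher score would be expected to behave more like a strong promoter.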

Figure 9.3 Bacterial Promoter --- Image created by SL

Key Questions

  • What is the function of the -35 sequence?
  • What are the two functions of the -10 sequence?
  • What is the function of the +1 site?
  • What is the difference between a strong and a weak promoter?

Bacterial RNA Polymerase

In the bacterium E. coli, the RNA polymerase core enzyme is composed of five protein subunits (α1, α2, β, β’, and ω) (see figure 9.4). The two α subunits and the ω subunit function to assemble the enzyme and bind to the DNA sequence to be transcribed. The RNA molecule is synthesized between the β and β’ subunits.

The RNA polymerase core enzyme (α1, α2, β, β’, and ω subunits) associates with the sigma (σ) factor protein to form the RNA polymerase holoenzyme. E. coli makes at least eight different types of sigma factor proteins, depending on the environmental conditions encountered by the cell. For example, the main sigma factor in E. coli is called the housekeeping sigma factor or σ70 protein. The σ70 protein functions to guide the RNA polymerase core enzyme to the promoters of structural genes required for the viability of the E. coli cell in a typical environment (e.g., body temperature with plenty of carbon and nitrogen sources). In addition to the σ70 protein, there are specialized sigma factor proteins that guide the RNA polymerase core enzyme to survival genes when an E. coli cell encounters stressful environments. These specialized sigma factors include a nitrogen starvation sigma factor (σ54), a carbon starvation sigma factor (σ38), and a heat shock sigma factor (σ32). Because the sigma (σ) factor proteins regulate transcription, they are examples of transcription factor proteins.

Figure 9.4 RNA Polymerase Holoenzyme Subunits --- Image created by SL

Key Questions

  • Why does E. coli make several different types of sigma factor proteins?
  • What is the difference between the RNA polymerase core enzyme and the RNA polymerase holoenzyme?

Transcription Initiation in Bacteria

Transcription initiation in bacteria (E. coli) occurs as follows:

  1. The RNA polymerase holoenzyme recognizes the promoter via sigma (σ) factor binding to the -35 and -10 DNA sequences. At this stage, the RNA polymerase holoenzyme:DNA complex is called a closed complex because the two DNA strands are still hydrogen bonded together.
  2. The AT hydrogen bonds within the -10 sequence are broken forming an open complex. The RNA polymerase core enzyme is the DNA helicase that separates the two DNA strands at the -10 sequence.
  3. A short RNA molecule is synthesized beginning at the +1 site; however, the RNA polymerase core enzyme is still attached to σ factor. Sigma (σ) factor is still bound to the -10 and -35 DNA sequences.
  4. The sigma (σ) factor protein is released, freeing the RNA polymerase core enzyme.
  5. Once the sigma (σ) factor protein is released, transcription transitions to the elongation phase as the RNA polymerase core enzyme incorporates additional nucleotides at the 3’ end of the RNA transcript.

Key Questions

  • Describe the initiation phase of transcription in bacteria.

Elongation in Bacteria

The elongation phase of transcription in bacteria involves RNA synthesis by the RNA polymerase core enzyme (see figure 9.5). The E. coli RNA polymerase core enzyme has the following features:

Figure 9.5 Transcription Elongation --- Image used from OpenStax (access for free at https://openstax.org/books/biology-2e/pages/1-introduction)

Key Questions

  • What are the similarities and differences between the RNA polymerase core enzyme and the DNA polymerases discussed in Part 6?
  • Which protein functions as the DNA helicase for transcription?
  • What molecules provide the energy for transcription?

Transcription of Multiple Genes

Not all genes use the same DNA strand as the template strand. In figure 9.6, genes A and B use the bottom DNA strand as the template strand for RNA synthesis, because the promoter is located to the left of the gene.  Alternatively, gene C uses the top DNA strand as the template strand, as the promoter is located to the right of the gene. Genes A and B are transcribed left to right, while gene C is transcribed right to left.  
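The effect of gene orientation on the transcript can be sketched in Python. In this minimal illustration, the DNA is given as the top strand written 5’ to 3’ (left to right), and the direction labels, function names, and example sequence are all hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna_5_to_3: str) -> str:
    """Return the opposite strand, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna_5_to_3))


def transcript(top_strand_5_to_3: str, direction: str) -> str:
    """Return the RNA (5'->3') for a gene transcribed in the given direction."""
    if direction == "left_to_right":
        # Bottom strand is the template, so the top strand is the coding strand.
        coding = top_strand_5_to_3
    elif direction == "right_to_left":
        # Top strand is the template, so the bottom strand is the coding strand.
        coding = reverse_complement(top_strand_5_to_3)
    else:
        raise ValueError("direction must be 'left_to_right' or 'right_to_left'")
    # The RNA matches the coding strand, with U in place of T.
    return coding.replace("T", "U")


top = "ATGCCGTAA"                             # hypothetical top strand, 5'->3'
rna_like_gene_a = transcript(top, "left_to_right")
rna_like_gene_c = transcript(top, "right_to_left")
```

The same stretch of double-stranded DNA therefore yields two different RNA sequences depending on which strand serves as the template.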

Figure 9.6 Transcription of Multiple Genes --- Image created by SL

Rho (ρ)-Dependent Termination

While the RNA polymerase core enzyme is synthesizing an mRNA molecule, an RNA-DNA double helix molecule is formed within the enzyme. Transcriptional termination involves weakening the hydrogen bonds within this RNA-DNA double helix, resulting in dissociation of the RNA (and the RNA polymerase core enzyme) from the DNA.

Transcriptional termination can occur in two different ways in the bacterium E. coli:

The rho (ρ)-dependent mechanism of termination requires binding between the rho (ρ) protein, a helicase that breaks the hydrogen bonds within an RNA-DNA double helix, and an RNA sequence near the 3’ end of the mRNA transcript called the rho utilization site (rut) (see figure 9.7). The ρ-dependent mechanism of transcription termination also requires the formation of a secondary structure within the RNA transcript called a stem-loop or hairpin loop. The stem-loop is formed when guanine (G) and cytosine (C) bases are produced in the mRNA as the RNA polymerase core enzyme reads the terminator DNA sequence. The stem-loop, composed of hydrogen bonds between these G and C nucleotides within the same mRNA molecule, slows the RNA polymerase core enzyme during transcription. The rho (ρ) protein then catches up with the RNA polymerase, separates the RNA from the template DNA strand, and releases the RNA transcript and the RNA polymerase core enzyme from the DNA. Transcription is terminated.

Key Questions

  • What three components are involved in rho (ρ)-dependent termination?
  • What are the functions of each of these components in rho (ρ)-dependent termination?

Rho (ρ)-Independent Termination

The rho (ρ)-independent termination mechanism does not require rho (ρ) protein or the rut RNA sequence (see figure 9.7). In rho (ρ)-independent termination of transcription, a stem-loop structure is formed in the newly synthesized RNA that slows the RNA polymerase core enzyme. This pausing of the RNA polymerase is aided by the NusA protein. While the RNA polymerase slows down, a uracil-rich region is synthesized in the RNA because the RNA polymerase core enzyme is copying an adenine-rich region in the template DNA strand. Recall that each uracil base in the mRNA forms two hydrogen bonds with each adenine base in the template DNA strand. This weak base pairing between U and A bases tends to break spontaneously, releasing the mRNA and RNA polymerase, terminating transcription.
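The two RNA features described above, a stem-loop followed by a uracil-rich region, can be checked for in a toy Python sketch. The stem length, loop length, and number of uracils used here are arbitrary assumptions chosen for illustration, as are the function names and the example transcript. Real terminators vary and are not identified this simply.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def can_form_stem(left_arm: str, right_arm: str) -> bool:
    """True if the left arm base-pairs with the right arm read in reverse."""
    if len(left_arm) != len(right_arm):
        return False
    return all(RNA_PAIR[l] == r for l, r in zip(left_arm, reversed(right_arm)))


def looks_like_intrinsic_terminator(rna: str, stem_len: int = 6,
                                    loop_len: int = 4, min_u: int = 4) -> bool:
    """Check whether the 3' end of the RNA is stem, loop, stem, then a U run."""
    needed = 2 * stem_len + loop_len + min_u
    if len(rna) < needed:
        return False
    tail = rna[-needed:]
    left = tail[:stem_len]
    right = tail[stem_len + loop_len:2 * stem_len + loop_len]
    u_tract = tail[2 * stem_len + loop_len:]
    return can_form_stem(left, right) and u_tract == "U" * min_u


# Hypothetical transcript ends: GC-rich stem, short loop, then four uracils.
with_terminator = "AUG" + "GCCGCC" + "GAAA" + "GGCGGC" + "UUUU"
without_u_tract = "AUG" + "GCCGCC" + "GAAA" + "GGCGGC" + "GCGC"
has_terminator = looks_like_intrinsic_terminator(with_terminator)
lacks_terminator = looks_like_intrinsic_terminator(without_u_tract)
```

In this sketch the stem is built from G and C bases, mirroring the stem-loop composition described earlier, and the check fails when the uracil-rich region is missing.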

The mechanism that is used for transcription termination depends on the gene. About 50% of E. coli genes use the rho (ρ)-dependent mechanism; the other 50% use the rho (ρ)-independent mechanism. An individual gene does not use both termination mechanisms.

Figure 9.7 Transcription Termination in Bacteria --- Image created by SL

Key Questions

  • What three components are involved in rho (ρ)-independent termination?
  • What are the functions of each of these components in rho (ρ)-independent termination?

B. Transcription in Eukaryotes

Transcription is important to a eukaryotic cell, as the activation of a structural gene allows eukaryotic cells to adapt to environmental changes (e.g., the presence of a hormone in the blood can activate transcription; see Part 14). Moreover, many eukaryotic organisms are multicellular, so genes need to be transcribed at the right time during development and in the correct cell type. For example, genes involved in building the central nervous system should be transcribed during embryonic development. Genes that encode proteins involved in muscle contraction should be transcribed in muscle cells and not transcribed in other cell types, such as white blood cells. These phenotypic differences are due to transcription, as all cell types (neurons, muscle cells, white blood cells) in the body contain an identical collection of genes.

DNA Sequences Control Eukaryotic Transcription

The transcription of eukaryotic genes is controlled by several types of DNA sequence elements, including the following (see figure 9.8):

For a eukaryotic gene to be transcribed, the TATA box and the +1 site must be present. However, if these two sequences are the only DNA sequences present upstream of a gene, the gene is transcribed at a low, yet constant rate, the so-called basal level of transcription.

The DNA sequences that influence transcription of an adjacent gene are called cis-acting DNA elements. Cis-acting DNA elements include the core promoter, enhancer, and silencer sequences. The transcription factor proteins that bind to these cis-acting DNA elements are called trans-acting factor proteins. Trans-acting factor proteins, also called transcription factor proteins, include activator proteins, repressor proteins, and the general transcription factor (GTF) proteins (see below).

Figure 9.8 Eukaryotic Core Promoter --- Image created by SL

Key Questions

  • What are the names of the two sequence features within the core promoter?
  • What are the two functions of the TATA box?
  • What are the names and functions of the two regulatory DNA sequences that influence the transcription of eukaryotic genes?
  • What are the names of the proteins that bind to these two regulatory DNA sequences?

RNA Polymerases in Eukaryotes

In eukaryotes, there are three types of RNA polymerases that handle transcription:

Key Questions

  • What types of genes do the three eukaryotic RNA polymerases transcribe?

Initiation in Eukaryotes

Both basal (constant, low level) transcription and regulated (above or below the basal level) transcription of structural genes in eukaryotes require the following proteins (see figure 9.9):

The association of RNA polymerase II with the six GTF proteins listed above forms a preinitiation complex. The preinitiation complex is also called the basal transcription apparatus.


Figure 9.9 Transcription Initiation in Eukaryotes --- Image used from OpenStax (access for free at https://openstax.org/books/biology-2e/pages/1-introduction)

Key Questions

  • Which GTF binds to the core promoter?
  • Which GTF acts as a bridge to connect the GTF bound to the core promoter to the GTF bound to RNA polymerase II?
  • Which GTF is the DNA helicase that separates the two DNA strands?
  • Which GTF activates RNA polymerase II?

General and Regulatory Transcription Factors

Transcription factor proteins influence the ability of RNA polymerase II to bind to a eukaryotic core promoter. A huge number of eukaryotic genes encode transcription factor proteins; it is estimated that as many as 1000 human genes encode proteins that regulate transcription!  There are two categories of transcription factor proteins:

The DNA binding sites (core promoter, enhancer, and silencer sequences) for these transcription factor proteins tend to be near the genes they control. As a result, the DNA sequences are called cis-acting DNA elements.  However, these cis-acting DNA elements do not need to be immediately adjacent to the core promoter. Some enhancers and silencers can be within the gene they control or can be thousands of base pairs away.  The transcription factor proteins (GTFs, activators, and repressors) that bind to the cis-acting DNA elements are trans-acting factor proteins.

Since transcriptional control requires input from a myriad of both DNA sequences and proteins, some component in the cell needs to interpret the various activation and repression signals to provide an overall signal to RNA polymerase II. A large multi-subunit mediator protein complex regulates the interaction between RNA polymerase II and the activator and repressor proteins. Mediator thus serves as a link between transcription factors that bind to enhancer and silencer DNA sequences and RNA polymerase II, thereby determining the overall rate of transcription.

Figure 9.10 Regulatory Transcription Factors and Mediator.  Mediator (light blue) interprets the activation signals from activator proteins (orange) bound to enhancer DNA sequences (green) and the repression signals from repressor proteins (yellow) bound to silencer DNA sequences (magenta).  Mediator then communicates an overall transcription signal (an activation signal in this case) to the general transcription factor proteins (purple) and RNA polymerase II (pink).  RNA polymerase II is positioned on the +1 site (not shown) and transcribes the gene towards the right. --- Image created by SL

Key Questions

  • What are three examples of cis-acting DNA elements?
  • What are three examples of trans-acting factor proteins?
  • What is the function of the mediator protein complex?

Transcription Elongation in Eukaryotes

The elongation step in eukaryotic transcription is virtually identical to the transcription elongation step in prokaryotes. RNA polymerase II in eukaryotes has the same functional capabilities as the RNA polymerase core enzyme from E. coli.

Key Questions

  • What are the names of the two proteins that act as DNA helicases in eukaryotic transcription?

Transcription Termination in Eukaryotes

Transcriptional termination in eukaryotes occurs during the process of 3’ end polyadenylation, a modification to the 3’ ends of eukaryotic mRNAs. We will cover 3’ end polyadenylation in more detail in Part 10. In short, an endonuclease called cleavage and polyadenylation specificity factor (CPSF) binds to a polyadenylation signal sequence (5’-AAUAAA-3’) near the 3’ end of the mRNA. CPSF then cuts the mRNA approximately 20 nucleotides downstream (towards the 3’ end of the mRNA) from the polyadenylation signal sequence. Cleavage of the mRNA by CPSF releases the mRNA from RNA polymerase II.
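The cleavage step can be pictured with a short Python sketch. It is a simplification under stated assumptions: the cut is placed exactly 20 nucleotides after the end of the first AAUAAA found, whereas the text gives 20 only as an approximate distance. The function name and the example RNA are hypothetical.

```python
from typing import Optional, Tuple

POLY_A_SIGNAL = "AAUAAA"  # polyadenylation signal sequence from the text
CLEAVAGE_OFFSET = 20      # approximate distance from the text, fixed here


def cleave_at_poly_a_signal(pre_mrna: str) -> Optional[Tuple[str, str]]:
    """Return (released mRNA, downstream RNA), or None if no cut is possible."""
    start = pre_mrna.find(POLY_A_SIGNAL)
    if start == -1:
        return None  # no polyadenylation signal in this RNA
    # Assumption: measure the offset from the end of the signal sequence.
    cut = start + len(POLY_A_SIGNAL) + CLEAVAGE_OFFSET
    if cut > len(pre_mrna):
        return None  # RNA ends before the cleavage position
    return pre_mrna[:cut], pre_mrna[cut:]


# Hypothetical RNA: filler, the signal, 20 nucleotides, then extra RNA.
pre_mrna = "G" * 10 + POLY_A_SIGNAL + "C" * 20 + "GUGUGU"
released, downstream = cleave_at_poly_a_signal(pre_mrna)
```

Here `released` stands for the mRNA freed from RNA polymerase II, and `downstream` stands for the RNA that is still attached to the polymerase after the cut.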

After CPSF releases the mRNA from RNA polymerase II, there are two potential ways that RNA polymerase II can be released from the DNA, thereby terminating transcription:


Figure 9.11 Transcription Termination in Eukaryotes A) Torpedo Model B) Allosteric Model --- Images created by SL

Key Questions

  • What is the difference between the torpedo and the allosteric models of transcription termination?

Review Questions

Fill in the blank:

  1. When structural genes are expressed, they produce ___________________RNA molecules; when nonstructural genes are expressed, they produce _________________RNA molecules.
  2. _________________ is a GTF protein that has both DNA helicase and kinase activity.
  3. The _____________ protein binds to the -10 and -35 sequences.
  4. The RNA polymerase holoenzyme consists of the _____________________________ protein subunits and the _________________________ factor protein.
  5. The TATA box (-25 sequence) is the binding site for the _________________ protein.
  6. The ____________ protein binds to the rut sequence found in 50% of bacterial mRNA molecules.
  7. RNA polymerase _____ is responsible for transcribing eukaryotic structural genes.
  8. Phosphorylation of _________________________ helps to activate transcription in eukaryotes.
  9. A(n) __________________ protein binds to an enhancer sequence in the DNA to activate transcription above the basal level, while a(n) ______________ protein binds to a silencer sequence in the DNA to decrease transcription below the basal level.
  10. The ____________ protein causes the RNA polymerase core enzyme to pause at the stem loop in the rho (ρ)-independent mechanism.
  11. DNA replication requires the use of DNA helicase to unwind double-stranded DNA, while transcription in bacteria uses the _________________________ to unwind double-stranded DNA.

This content is provided to you freely by BYU-I Books.

Access it online or download it at https://books.byui.edu/genetics_and_molecul/20___transcription.