When a gene is activated, the gene is transcribed, producing an RNA intermediate. Structural genes are genes that are transcribed to produce messenger RNA (mRNA) molecules. The mRNA molecule is then translated to make a protein product. Nonstructural genes are also transcribed to produce RNA molecules; however, the RNA molecule is not translated and instead functions directly in the cell. These functional RNA molecules, called noncoding RNAs (ncRNAs), include transfer RNA molecules (tRNAs), ribosomal RNA molecules (rRNAs), and the Xist and Tsix RNA molecules discussed in Part 2.
What factors determine whether a bacterial structural gene is expressed, that is, whether the gene is activated to make an mRNA molecule? Gene expression requires interaction between transcription factor proteins and specific DNA sequences near the gene.
The DNA sequences that regulate the expression of a particular structural gene include (see figure 9.1):
Transcription produces an RNA molecule that is complementary to the template or antisense strand of DNA. The other DNA strand, the one that forms hydrogen bonds with the template DNA strand, is called the coding or sense DNA strand. The coding DNA strand is identical in sequence to the RNA transcript, except that the RNA molecule contains uracil (U) instead of thymine (T).
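The relationship between the template strand, the coding strand, and the RNA transcript can be sketched in a few lines of Python. This is only an illustration of the base-pairing rules, not a model of the enzyme; the function names and the example sequence are hypothetical.

```python
# Base-pairing rules: DNA template base -> RNA base, and DNA base -> DNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


def coding_strand(template_3to5: str) -> str:
    """Return the coding strand (5'->3') that pairs with the template."""
    return "".join(DNA_COMPLEMENT[base] for base in template_3to5)


template = "TACGGCATT"  # hypothetical template strand, written 3'->5'
rna = transcribe(template)
coding = coding_strand(template)

print(rna)     # AUGCCGUAA
print(coding)  # ATGCCGTAA
# The coding strand matches the transcript once T is replaced by U.
print(coding.replace("T", "U") == rna)  # True
```

Writing the template 3' to 5' keeps each RNA base directly under the template base it pairs with, so no reversal step is needed.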
Transcription of structural genes in the bacterium E. coli has the following three stages (see figure 9.2):
The bacterial promoter is located upstream (typically drawn to the left) of the structural gene to be transcribed and serves as a docking site for the sigma (σ) factor protein and later RNA polymerase. DNA sequence elements within the promoter are numbered relative to the +1 site, the first nucleotide in the template DNA strand that is transcribed (see figure 9.3). Important DNA sequences within the bacterial promoter include the following:
Both the -35 and -10 sequences described above (5’-TTGACA-3’ and 5’-TATAAT-3’) are consensus sequences, meaning that they are the “average” sequences found when the DNA sequences of many E. coli promoters are compared. Some bacterial promoters are strong promoters, whereas others are weak promoters. The difference between strong and weak promoters largely depends on how closely the promoter DNA sequence in question matches the -35 and -10 consensus sequences. Strong promoters initiate transcription frequently, while weak promoters initiate transcription less frequently.
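The idea of matching a promoter against the consensus can be made concrete by counting agreeing positions. The sketch below is a crude proxy for promoter strength, assuming the two elements have already been located; the function names and the two example promoters are hypothetical, and the score ignores other influences on strength, such as the spacing between the elements.

```python
CONSENSUS_35 = "TTGACA"  # -35 consensus sequence from the text
CONSENSUS_10 = "TATAAT"  # -10 consensus sequence from the text


def matches(seq: str, consensus: str) -> int:
    """Count the positions at which seq agrees with the consensus."""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must be the same length")
    return sum(a == b for a, b in zip(seq, consensus))


def consensus_score(box_35: str, box_10: str) -> int:
    """Total matches (0 to 12) across the -35 and -10 elements."""
    return matches(box_35, CONSENSUS_35) + matches(box_10, CONSENSUS_10)


# Hypothetical promoters, each given as (-35 element, -10 element).
strong_like = ("TTGACA", "TATAAT")
weak_like = ("TTTACA", "TATGTT")

print(consensus_score(*strong_like))  # 12
print(consensus_score(*weak_like))    # 9
```

Under this simple scheme, a promoter scoring closer to 12 would be expected to initiate transcription more frequently than one with a lower score.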
In the bacterium E. coli, the RNA polymerase core enzyme is composed of five protein subunits (α1, α2, β, β’, and ω) (see figure 9.4). The two α subunits and the ω subunit function to assemble the enzyme and bind to the DNA sequence to be transcribed. The RNA molecule is synthesized between the β and β’ subunits.
The RNA polymerase core enzyme (α1, α2, β, β’, and ω subunits) associates with the sigma (σ) factor protein to form the RNA polymerase holoenzyme. E. coli makes at least eight different types of sigma factor proteins, depending on the environmental conditions encountered by the cell. For example, the main sigma factor in E. coli is called the housekeeping sigma factor or σ70 protein. The σ70 protein functions to guide the RNA polymerase core enzyme to the promoters of structural genes required for the viability of the E. coli cell in a typical environment (e.g., body temperature with plenty of carbon and nitrogen sources). In addition to the σ70 protein, there are specialized sigma factor proteins that guide the RNA polymerase core enzyme to survival genes when an E. coli cell encounters stressful environments. These specialized sigma factors include a nitrogen starvation sigma factor (σ54), a carbon starvation sigma factor (σ38), and a heat shock sigma factor (σ32). Because the sigma (σ) factor proteins regulate transcription, they are examples of transcription factor proteins.
Transcription initiation in bacteria (E. coli) occurs as follows:
The elongation phase of transcription in bacteria involves RNA synthesis by the RNA polymerase core enzyme (see figure 9.5). The E. coli RNA polymerase core enzyme has the following features:
Not all genes use the same DNA strand as the template strand. In figure 9.6, genes A and B use the bottom DNA strand as the template strand for RNA synthesis, because the promoter is located to the left of the gene. Alternatively, gene C uses the top DNA strand as the template strand, as the promoter is located to the right of the gene. Genes A and B are transcribed left to right, while gene C is transcribed right to left.
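The effect of promoter position on template choice can be sketched as follows. The code assumes the top strand is written 5' to 3' from left to right, as in a typical diagram; the function names and the example sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the opposite DNA strand, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def transcript(top_5to3: str, direction: str) -> str:
    """RNA (5'->3') for a region whose top strand is written 5'->3'.

    direction="right": promoter on the left, bottom strand is the template.
    direction="left":  promoter on the right, top strand is the template.
    """
    if direction == "right":
        coding = top_5to3                      # top strand is the coding strand
    elif direction == "left":
        coding = reverse_complement(top_5to3)  # bottom strand is the coding strand
    else:
        raise ValueError("direction must be 'right' or 'left'")
    return coding.replace("T", "U")


top = "ATGGCA"  # hypothetical top-strand sequence
print(transcript(top, "right"))  # AUGGCA
print(transcript(top, "left"))   # UGCCAU
```

The same stretch of DNA yields two different transcripts depending on which strand serves as the template, which is why the position of the promoter matters.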
While the RNA polymerase core enzyme is synthesizing an mRNA molecule, an RNA-DNA double helix molecule is formed within the enzyme. Transcriptional termination involves weakening the hydrogen bonds within this RNA-DNA double helix, resulting in dissociation of the RNA (and the RNA polymerase core enzyme) from the DNA.
Transcriptional termination can occur in two different ways in the bacterium E. coli:
The rho (ρ)-dependent mechanism of termination requires binding between the rho (ρ) protein, a helicase that breaks the hydrogen bonds within an RNA-DNA double helix, and an RNA sequence near the 3’ end of the mRNA transcript called the rho utilization site (rut) (see figure 9.7). The rho (ρ)-dependent mechanism of transcription termination also requires the formation of a secondary structure within the RNA transcript called a stem-loop or hairpin loop. The stem-loop is formed when guanine (G) and cytosine (C) bases are produced in the mRNA as the RNA polymerase core enzyme reads the terminator DNA sequence. The stem-loop, composed of hydrogen bonds between these G and C nucleotides within the same mRNA molecule, slows the RNA polymerase core enzyme during transcription. The rho (ρ) protein then catches up with the RNA polymerase, separates the RNA from the template DNA strand, and releases the RNA transcript and the RNA polymerase core enzyme from the DNA. Transcription is terminated.
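A stem-loop can form only when two stretches of the same RNA molecule are complementary to each other when aligned in opposite directions. The check below is a minimal sketch of that condition; it ignores loop size and folding energetics, and the function name and the example arms are hypothetical.

```python
RNA_PAIR = {"G": "C", "C": "G", "A": "U", "U": "A"}


def can_form_stem(arm_5prime: str, arm_3prime: str) -> bool:
    """True if the two arms (both written 5'->3') pair fully when antiparallel."""
    if len(arm_5prime) != len(arm_3prime):
        return False
    # The first base of the 5' arm pairs with the last base of the 3' arm.
    return all(
        RNA_PAIR[a] == b for a, b in zip(arm_5prime, reversed(arm_3prime))
    )


# Hypothetical G/C-rich arms separated by a short loop in the transcript:
# 5'-GCCGCC-UUAA-GGCGGC-3'
print(can_form_stem("GCCGCC", "GGCGGC"))  # True
print(can_form_stem("GCCGCC", "AAAAAA"))  # False
```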
The rho (ρ)-independent termination mechanism does not require rho (ρ) protein or the rut RNA sequence (see figure 9.7). In rho (ρ)-independent termination of transcription, a stem-loop structure is formed in the newly synthesized RNA that slows the RNA polymerase core enzyme. This pausing of the RNA polymerase is aided by the NusA protein. While the RNA polymerase slows down, a uracil-rich region is synthesized in the RNA because the RNA polymerase core enzyme is copying an adenine-rich region in the template DNA strand. Recall that each uracil base in the mRNA forms two hydrogen bonds with each adenine base in the template DNA strand. This weak base pairing between U and A bases tends to break spontaneously, releasing the mRNA and RNA polymerase, terminating transcription.
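The weakness of a uracil-rich RNA-DNA hybrid can be illustrated by counting hydrogen bonds. The sketch uses two bonds per A-U (or A-T) pair, as stated above, and three bonds per G-C pair; the function name and the example sequences are hypothetical, and bond counts are only a rough stand-in for duplex stability.

```python
# Hydrogen bonds formed by each RNA base with its partner on the DNA template.
H_BONDS = {"A": 2, "U": 2, "G": 3, "C": 3}


def hybrid_h_bonds(rna: str) -> int:
    """Total hydrogen bonds holding an RNA segment to its DNA template."""
    return sum(H_BONDS[base] for base in rna)


print(hybrid_h_bonds("UUUUUUUU"))  # 16
print(hybrid_h_bonds("GCGCGCGC"))  # 24
```

An eight-nucleotide uracil run is held by a third fewer hydrogen bonds than a G/C-rich segment of the same length, consistent with the transcript separating more readily at that point.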
The mechanism that is used for transcription termination depends on the gene. About 50% of E. coli genes use the rho (ρ)-dependent mechanism, and the other 50% use the rho (ρ)-independent mechanism. An individual gene does not use both termination mechanisms.
Transcription is important to a eukaryotic cell, as the activation of a structural gene allows eukaryotic cells to adapt to environmental changes (e.g., the presence of a hormone in the blood can activate transcription; see Part 14). Moreover, many eukaryotic organisms are multicellular, so genes need to be transcribed at the right time during development and in the correct cell type. For example, genes involved in building the central nervous system should be transcribed during embryonic development. Genes that encode proteins involved in muscle contraction should be transcribed in muscle cells and not transcribed in other cell types, such as white blood cells. These phenotypic differences are due to transcription, as all cell types (neurons, muscle cells, white blood cells) in the body contain an identical collection of genes.
The transcription of eukaryotic genes is controlled by several types of DNA sequence elements, including the following (see figure 9.8):
For a eukaryotic gene to be transcribed, the TATA box and the +1 site must be present. However, if these two sequences are the only DNA sequences present upstream of a gene, the gene is transcribed at a low, yet constant rate, the so-called basal level of transcription.
The DNA sequences that influence transcription of an adjacent gene are called cis-acting DNA elements. Cis-acting DNA elements include the core promoter, enhancer, and silencer sequences. The transcription factor proteins that bind to these cis-acting DNA elements are called trans-acting factor proteins. Trans-acting factor proteins, also called transcription factor proteins, include activator proteins, repressor proteins, and the general transcription factor (GTF) proteins (see below).
In eukaryotes, there are three types of RNA polymerases that handle transcription:
Both basal (constant, low level) transcription and regulated (above or below the basal level) transcription of structural genes in eukaryotes require the following proteins (see figure 9.9):
The association of RNA polymerase II with the six GTF proteins listed above forms a preinitiation complex. The preinitiation complex is also called the basal transcription apparatus.
Transcription factor proteins influence the ability of RNA polymerase II to bind to a eukaryotic core promoter. A huge number of eukaryotic genes encode transcription factor proteins; it is estimated that as many as 1000 human genes encode proteins that regulate transcription! There are two categories of transcription factor proteins:
The DNA binding sites (core promoter, enhancer, and silencer sequences) for these transcription factor proteins tend to be near the genes they control. As a result, the DNA sequences are called cis-acting DNA elements. However, these cis-acting DNA elements do not need to be immediately adjacent to the core promoter. Some enhancers and silencers can be within the gene they control or can be thousands of base pairs away. The transcription factor proteins (GTFs, activators, and repressors) that bind to the cis-acting DNA elements are trans-acting factor proteins.
Since transcriptional control requires input from a myriad of both DNA sequences and proteins, some component in the cell needs to interpret the various activation and repression signals to provide an overall signal to RNA polymerase II. A large multi-subunit mediator protein complex regulates the interaction between RNA polymerase II and the activator and repressor proteins. Mediator thus serves as a link between transcription factors that bind to enhancer and silencer DNA sequences and RNA polymerase II, thereby determining the overall rate of transcription.
The elongation step in eukaryotic transcription is virtually identical to the transcription elongation step in prokaryotes. RNA polymerase II in eukaryotes has the same functional capabilities as the RNA polymerase core enzyme from E. coli.
Transcriptional termination in eukaryotes occurs during the process of 3' end polyadenylation, a modification to the 3’ ends of eukaryotic mRNAs. We will cover 3' end polyadenylation in more detail in Part 10. In short, an endonuclease called cleavage and polyadenylation specificity factor (CPSF) binds to a polyadenylation signal sequence (5'-AAUAAA-3') near the 3' end of the mRNA. CPSF then cuts the mRNA approximately 20 nucleotides downstream (towards the 3’ end of the mRNA) from the polyadenylation signal sequence. Cleavage of the mRNA by CPSF releases the mRNA from RNA polymerase II.
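The search for the polyadenylation signal and the position of the cut can be sketched as a simple string operation. The code assumes the 20 nucleotides are counted from the end of the signal and treats "approximately 20" as exactly 20 by default; both are simplifications, and the function names and the example transcript are hypothetical.

```python
from typing import Optional, Tuple

POLYA_SIGNAL = "AAUAAA"  # polyadenylation signal sequence from the text


def cleavage_position(mrna: str, offset: int = 20) -> Optional[int]:
    """Index at which the transcript would be cut, or None if it cannot be.

    The cut is placed `offset` nucleotides after the end of the first
    AAUAAA signal (an assumption made for this sketch).
    """
    start = mrna.find(POLYA_SIGNAL)
    if start == -1:
        return None  # no polyadenylation signal present
    cut = start + len(POLYA_SIGNAL) + offset
    return cut if cut <= len(mrna) else None


def cleave(mrna: str, offset: int = 20) -> Optional[Tuple[str, str]]:
    """Split the transcript into (released mRNA, downstream RNA)."""
    cut = cleavage_position(mrna, offset)
    if cut is None:
        return None
    return mrna[:cut], mrna[cut:]


# Hypothetical transcript: 4 nt, the signal, 20 nt of spacer, then 6 nt.
example = "GGCC" + POLYA_SIGNAL + "C" * 20 + "GUGUGU"
print(cleavage_position(example))  # 30
print(cleave(example)[1])          # GUGUGU
```

The first element returned by `cleave` corresponds to the mRNA released from RNA polymerase II; the second is the downstream RNA still attached to the enzyme.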
After CPSF releases the mRNA from RNA polymerase II, there are two potential ways that RNA polymerase II can be released from the DNA, thereby terminating transcription: