Thursday, June 08, 2006

RNAi fundamentals   posted by Coffee Mug @ 6/08/2006 12:39:00 PM

Can you understand this? Let me teach you a lesson, yo / The pre-existence of the mathematical biochemical equations / the manifestations of God: earth, air, fire, and water which are in it's basic formation / solid, liquid, and gases that caused the land masses, and the space catalyst and all matter that exists and is dense... - RZA

I plan on writing about RNA interference (RNAi) a lot as there are new discoveries in this area all the time and I'm trying to keep up to date. In the interest of not having to re-introduce the basics every time, I am creating this post so I can link to it from now on when I write about any of the details. I will focus entirely on regulation at the translation (protein synthesis) level, since I have only recently discovered the transcription-level literature and I haven't been able to digest it at all. I don't study RNAi directly in my research. That is to warn you that my knowledge of it is probably more incomplete than I may let on, and I may be misunderstanding certain aspects of it with no negative feedback sources around. In spite of this, I find the whole scene fascinating, and I think it is really an exciting time to understand molecular biology and watch this entirely new regulatory system, with myriad potential applications as a research and therapeutic tool, unfold.

Small regulatory RNAs come in two major types: small interfering RNAs (siRNAs) and microRNAs (miRNAs). In their mature, effective state they range from 21 to 30 nucleotides long. The function of these small RNAs is to specifically inhibit translation of target messenger RNAs (mRNAs). Messenger RNAs are the normal RNA that you think about in the central dogma of genetics (DNA -> RNA -> protein). Translation is the process by which ribosomes and associated factors gather around an mRNA and translate from the triplet code of nucleotides to an amino acid chain (protein). Just a reminder in case you don't think about this every day.

I say that these small RNAs work specifically because they don't globally inhibit translation. A given small RNA will have a sequence that is complementary to (a base-pairing mirror image of) the sequence of some particular mRNA. The small RNA will guide silencing machinery to its target mRNA through this sequence complementarity, so that no protein is made from this mRNA. This is RNA silencing or RNA interference. You can distinguish siRNAs from miRNAs by their biogenesis pathways. The siRNA pathway is rather short relative to the miRNA pathway. In the end, for both pathways, you have a small RNA template loaded into RNA silencing machinery. Then the function of small RNAs can diverge again based on their mechanism of silencing, which is determined by the structure created when the template binds to its target RNA.
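To make the complementarity idea concrete, here is a minimal Python sketch. It assumes strict Watson-Crick pairing (no G-U wobble) and a perfect match along the whole small RNA, which is a simplification. The sequences are made up for illustration, and `reverse_complement` and `find_target_sites` are hypothetical helper names, not part of any real library.

```python
# Watson-Crick pairing rules for RNA. G-U wobble pairs are ignored in this sketch.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the strand that base-pairs with `rna`, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def find_target_sites(guide, mrna):
    """Return 0-based start positions in `mrna` that pair perfectly with `guide`."""
    site = reverse_complement(guide)
    return [i for i in range(len(mrna) - len(site) + 1) if mrna.startswith(site, i)]


# Toy sequences, invented for illustration only.
guide = "AUGGCUA"
mrna = "GGG" + reverse_complement(guide) + "CCC"
print(reverse_complement(guide))       # UAGCCAU
print(find_target_sites(guide, mrna))  # [3]
```

Because the two strands pair antiparallel, the site on the mRNA is the reversed complement of the small RNA rather than a letter-for-letter complement read in the same direction.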


Small interfering RNAs are derived from larger chunks of double-stranded RNA (dsRNA). The siRNA pathway seems to function as a viral defense or maybe a genome integrity defense mechanism. Rotaviruses, for instance, are dsRNA viruses, and many other RNA viruses pass through dsRNA intermediates when they replicate. Cells also encounter dsRNA in the form of transposons (jumping genes). Transposons are a lot like viruses that just happen to be hanging out in our own genome. Alu elements, which have been used in human ancestry studies, are a form of transposon. Transposons get transcribed from one section of the genome and then seem to sort of randomly insert into other parts. This can cause problems if they drop into the middle of an important gene, so cells generally would like to keep transposon activity at a minimum. Another source of dsRNA is pesky experimenters, who synthesize it themselves and inject it into cells. One thing to note is that you don't need two RNA strands to get dsRNA. A single strand can form a "hairpin" or "stem-loop" structure in which it folds back and base-pairs with itself, leaving a little loop at one end and single-strand tails at the other end. Thus, researchers sometimes design single strands that will fold into short hairpin RNAs (shRNAs) that are good for RNA silencing.

Long dsRNAs are converted to small RNAs by a protein called Dicer. Recent work has shown that Dicer acts as a molecular ruler, measuring out chunks of dsRNA in the proper ~25 nucleotide range and chopping them off. Dicer contains a conserved dsRNA-binding domain called PAZ that is actually found in proteins in later steps of the RNAi process too. The proposed mechanism relies on the ~65 angstrom (corresponding to the length of about 25 nucleotides) distance between a portion of the PAZ domain and the active site of RNA cleavage where two RNase III domains line up in the 3D structure and cut across the dsRNA. The main thing here is that Dicer is the protein that produces siRNAs from long dsRNA precursors.
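The ruler idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it uses the two numbers quoted above (~65 angstroms, ~25 nucleotides), tracks a single strand, and ignores the staggered cuts and overhangs that real Dicer products have. The function names are hypothetical.

```python
RULER_ANGSTROMS = 65.0  # PAZ-to-active-site distance quoted in the text
PRODUCT_LENGTH = 25     # approximate product size quoted in the text


def angstroms_per_nucleotide(ruler=RULER_ANGSTROMS, length=PRODUCT_LENGTH):
    """Spacing along the helix implied by the two numbers above."""
    return ruler / length


def dice(strand, length=PRODUCT_LENGTH):
    """Measure `length` nucleotides from one end, cut, and repeat.

    Returns the full-size pieces plus whatever is left that is too short to measure.
    """
    pieces = []
    while len(strand) >= length:
        pieces.append(strand[:length])
        strand = strand[length:]
    return pieces, strand


pieces, leftover = dice("ACGU" * 15)  # a made-up 60-nucleotide strand
print(angstroms_per_nucleotide())               # 2.6
print([len(p) for p in pieces], len(leftover))  # [25, 25] 10
```

The point of the sketch is that the product size falls out of a fixed physical distance, not out of anything in the sequence being cut.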

The final step in microRNA production is the same. A pre-miRNA is cleaved into a properly sized mature miRNA by Dicer. However, the miRNAs are purposefully endogenously produced to play a regulatory role in several cellular processes. They are conserved across higher eukaryotes, and are estimated to regulate some 30% of human genes. The precursors to miRNAs are transcribed from our DNA by RNA polymerase II just like plain old mRNAs. The initial transcript is called a pri-miRNA. These can be thousands of bases long. Pri-miRNAs are processed by a pair of proteins, one of which is called Drosha. Part of the difficulty in reading RNAi work is that it is being done across several model systems, and people may name a newly-discovered protein one thing in flies and a different thing in human cells. For instance, Drosha's partner-in-crime is called Pasha in invertebrates and DGCR8 in vertebrates. The whole pri-miRNA processing complex is referred to as the Microprocessor.

The Microprocessor recognizes stem-loop structures that arise in the transcribed pri-miRNA and cuts them out to produce ~65 nucleotide-long pre-miRNAs that consist of ~22 base-pairs, a loop at one end, and a little single-stranded overhang at the other end. This overhang is on the 3' end of the pre-miRNA sequence and is important for further miRNA processing. The 'R' in RNA is for Ribose, a sugar with 5 carbons. Because nucleotides polymerize by linking the third (3') carbon to the fifth (5') carbon, the nucleotides at either end of a string will have the 3' or 5' carbon unattached, and we designate that end accordingly. I think of the 5' end as the start of the strand and place it on the left in my head because that's how my molecular genetics prof drew it. Then you read left to right to get to the 3' end. The third and fourth nucleotides from the 3' end of a pre-miRNA base-pair with the first two nucleotides on the 5' end, leaving a two-nucleotide, 3' overhang.

The reason I'm making a big deal about the 3' overhang is that the concept will pop up again later in the process, and also the 3' overhang is recognized by exportin 5, the protein responsible for transporting pre-miRNAs out of the nucleus to the cytoplasm. Exportin 5 also requires the pre-miRNA to have a dsRNA stem that is longer than 16 base-pairs. The loop part on the other end of the stem from the 3' overhang doesn't seem to be important. Once the pre-miRNA is out in the cytoplasm it can be recognized by Dicer and cut down to the small, double-stranded mature miRNA size.
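Here is a toy Python model of the stem requirement, assuming a hairpin with a two-nucleotide 3' overhang and strict Watson-Crick pairing. It checks only the "stem larger than 16 base-pairs" rule described above and says nothing about how exportin 5 actually recognizes its cargo. The hairpins are invented, and `make_hairpin`, `stem_length`, `passes_stem_rule`, and the `min_loop` cutoff are assumptions made for illustration.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the strand that base-pairs with `rna`, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def make_hairpin(arm, loop, overhang="UU"):
    """Build a toy stem-loop: 5' arm, loop, matching 3' arm, then a 3' overhang."""
    return arm + loop + reverse_complement(arm) + overhang


def stem_length(hairpin, overhang=2, min_loop=3):
    """Count base pairs from the open end of the stem inward, skipping the 3' overhang."""
    five = 0
    three = len(hairpin) - 1 - overhang
    pairs = 0
    # Stop when the bases no longer pair or too few bases remain to form a loop.
    while three - five > min_loop and PAIR.get(hairpin[five]) == hairpin[three]:
        pairs += 1
        five += 1
        three -= 1
    return pairs


def passes_stem_rule(hairpin, min_stem=16):
    """Apply only the 'stem larger than 16 base-pairs' requirement from the text."""
    return stem_length(hairpin) > min_stem


long_hairpin = make_hairpin("GCAU" * 5, "GAAA")    # 20-base-pair stem
short_hairpin = make_hairpin("GCAUGCAUGC", "GAAA")  # 10-base-pair stem
print(stem_length(long_hairpin), passes_stem_rule(long_hairpin))    # 20 True
print(stem_length(short_hairpin), passes_stem_rule(short_hairpin))  # 10 False
```

Note that the loop sequence never enters the decision, which matches the observation that the loop doesn't seem to matter for export.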

The final step in miRNA and siRNA maturation occurs as the little double-stranded chunks are being loaded into the RNA-induced silencing complex (RISC). One of the strands in the duplex will have slightly less thermodynamic stability in base-pairing near its 5' end. This strand will serve as the template for RNA silencing while the other strand is degraded.
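A rough Python sketch of this asymmetry rule follows. It swaps real thermodynamics for a crude proxy: counting hydrogen bonds (three for a G-C pair, two for an A-U pair) over the first few base pairs at each strand's 5' end. Actual stability estimates use nearest-neighbor energy tables, so treat this as a cartoon. The four-pair window, the duplex, and the function names are all assumptions for illustration.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Hydrogen bonds in the Watson-Crick pair that each base forms.
HYDROGEN_BONDS = {"G": 3, "C": 3, "A": 2, "U": 2}


def reverse_complement(rna):
    """Return the strand that base-pairs with `rna`, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def five_prime_score(strand, window=4):
    """Crude stability proxy: hydrogen bonds in the first `window` base pairs."""
    return sum(HYDROGEN_BONDS[base] for base in strand[:window])


def pick_guide(strand_a, strand_b, window=4):
    """Return the strand whose 5' end is less tightly paired, or None on a tie."""
    a = five_prime_score(strand_a, window)
    b = five_prime_score(strand_b, window)
    if a == b:
        return None
    return strand_a if a < b else strand_b


core = "AUAUGGCAGUCAGCAGCGC"                # made-up 19-nucleotide paired region
strand_a = core + "UU"                      # 2-nucleotide 3' overhang
strand_b = reverse_complement(core) + "UU"  # partner strand, also with an overhang
print(five_prime_score(strand_a), five_prime_score(strand_b))  # 8 12
print(pick_guide(strand_a, strand_b) == strand_a)              # True
```

Here the A-U rich 5' end of `strand_a` is the looser one, so `strand_a` is kept as the template and `strand_b` would be the strand that gets degraded.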


The RNA-induced silencing complex (RISC) is the conglomeration of proteins that actually carries out the dirty work of RNA interference. Dicer cuts the dsRNAs down to size, but then remains bound to them and ends up associating with RISC through an interaction with another protein called R2D2 (at least that's what it's called in flies). Dicer and R2D2 in complex are in part responsible for loading the siRNAs into RISC. There are associated proteins found in some purifications of RISC but not others: certain RNA-unwinding proteins and even the fly version of fragile X mental retardation protein (Fmr1).

Argonaute is the protein at the catalytic center of RISC and has been found in every preparation of RISC so far. RISC is usually detected by its endonuclease activity. Endonucleolytic cleavage is the simplest mechanism of RNA silencing. The target RNA is simply lined up with the template and then chopped into two pieces by a protein that until recently wasn't identified, but was referred to as Slicer. In recent years it has become apparent that Slicer is Argonaute. Argonaute activity relies on two important domains: a PAZ domain similar to the one that is present in Dicer and a PIWI domain. The PAZ domain plays the same role in Dicer and Argonaute. It binds to the 3' overhang of the template strand and positions it in relation to an RNA slicing domain.

The actual slicing is performed by the PIWI domain. Slicing is very specific, cutting in between the two nucleotides on the target strand that lie across from the 10th and 11th nucleotides of the template. People in the field characterize the mechanism of cleavage by way of analogy to an enzyme called RNase H, but this may not help in this case. The RNA backbone is built by connecting sugars through phosphate groups. The PIWI domain acts on the target strand and separates the phosphate from the 3' carbon. This reaction is dependent on divalent metal ions. This may be more than you want to know, but that's the price you pay for reading this far.
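The position rule can be written down as a short Python sketch. It assumes the template pairs with the target along its full length and that we already know where the target site starts. The sequences are invented, and `slice_target` is a hypothetical name.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the strand that base-pairs with `rna`, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def slice_target(guide, mrna, site_start):
    """Cut `mrna` between the bases paired with guide nucleotides 10 and 11.

    `site_start` is the 0-based index in `mrna` where the fully paired site begins.
    """
    # Guide nucleotide k (counting from 1 at its 5' end) pairs with
    # mrna[site_start + len(guide) - k], because the strands run antiparallel.
    cut = site_start + len(guide) - 10
    return mrna[:cut], mrna[cut:]


guide = "UACGAUCGGAUCCAGUUAGCA"  # made-up 21-nucleotide template
mrna = "AAAAA" + reverse_complement(guide) + "AAAAA"
five_prime_piece, three_prime_piece = slice_target(guide, mrna, site_start=5)
print(len(five_prime_piece), len(three_prime_piece))  # 16 15
```

The last base of the 5' piece is the one that faced template nucleotide 11, and the first base of the 3' piece is the one that faced nucleotide 10, so the cut falls between them.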

All that is to introduce you to RISC. siRNAs and miRNAs are incorporated into RISC to do their job. This is often referred to as 'programming' the RISC. In most cases, siRNAs guide this cleavage reaction of their target sequence. MicroRNAs, on the other hand, often don't lead to cleavage, but instead reduce the level of their target RNA in a slicer-independent manner. The major distinction determining whether RNA is reduced by slicer or not appears to be the degree of base-pairing complementarity between template and target. If there are bumps in the base-pairing (i.e. the sequences aren't exact mirror images), especially at the site where slicing would occur, then the cleavage reaction doesn't take place. It just happens that siRNAs tend to have perfect base-pairing and miRNAs tend not to, but the issue doesn't appear to be route of biogenesis. A miRNA that base-pairs perfectly will probably guide cleavage.
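As a sketch of that decision, the Python below lists where a template fails to pair with its target site and predicts cleavage only when the pairing is perfect. That all-or-nothing cutoff is an assumption made for illustration (the real tolerance for mismatches is fuzzier), G-U wobble is ignored, and the sequences and function names are invented.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the strand that base-pairs with `rna`, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def mismatched_positions(guide, site):
    """Return 1-based guide positions that fail to pair with `site`.

    `site` is the target region written 5' to 3', the same length as `guide`.
    """
    # Antiparallel pairing: guide position 1 faces the last base of the site.
    facing = reversed(site)
    return [k for k, (g, t) in enumerate(zip(guide, facing), start=1) if PAIR[g] != t]


def predicted_mechanism(guide, site):
    """Perfect pairing predicts slicing; any mismatch predicts the other routes."""
    if not mismatched_positions(guide, site):
        return "cleavage by Argonaute"
    return "slicer-independent silencing"


guide = "UACGAUCGGAUCCAGUUAGCA"                # made-up 21-nucleotide template
perfect_site = reverse_complement(guide)
bumpy_site = perfect_site[:11] + "G" + perfect_site[12:]  # spoil the pair at position 10
print(predicted_mechanism(guide, perfect_site))  # cleavage by Argonaute
print(mismatched_positions(guide, bumpy_site))   # [10]
print(predicted_mechanism(guide, bumpy_site))    # slicer-independent silencing
```

Nothing in the function asks whether the template started life as an siRNA or a miRNA, which is the point: only the pairing decides.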

If RNAs aren't being chopped up by RISC, what is happening to them? This is where all the recent buzz about P-bodies comes in. P-bodies (processing bodies) are cytosolic sites of mRNA degradation, a conglomeration of proteins that carry out the orderly sequential steps of deadenylation, decapping, and exonucleolytic degradation. These exo- and endo- nuclease terms may be unfamiliar. All they indicate is whether the RNA is being chopped up from the ends of the strand or from within the strand, respectively. Some of the proteins involved are the decapping enzymes Dcp1 and Dcp2 and the exonuclease Xrn1. They are all found in these little foci in the cytoplasm called P-bodies, and it turns out that Argonaute proteins co-localize with P-bodies in a small RNA-dependent fashion. So even if Argonaute can't directly catalyze target RNA destruction it can drag it to the processing center where it will be degraded via another pathway.

Finally, miRNAs don't even have to cause RNA degradation to regulate translation. The target RNA can remain intact but inaccessible to the translation initiation machinery. This process again may involve P-bodies, as ribosomes and initiation factors are excluded from P-bodies. I think this avenue could lead to the most sensitive regulation since you wouldn't need to transcribe a whole new mRNA if you wanted to turn translation of that particular protein back on. You could just release it from translational repression. An issue that remains to be resolved is whether translational repression leads to P-body targeting and degradation or vice versa.


siRNAs and miRNAs are distinguished by their route of biogenesis. miRNAs are endogenous regulatory molecules and must be processed through a series of steps culminating in Dicer action, whereas siRNAs usually start out as foreign dsRNAs and are processed directly by Dicer. Dicer chauffeurs the diced product into RISC, which is the effector of RNA interference/silencing. RISC can silence RNAs through three mechanisms: 1) if the template and target have perfect complementarity, the target is cleaved directly by Argonaute, 2) if there are mismatches in the base-pairing, the target RNA can be localized to P-bodies for degradation, or 3) the target can be sequestered away from the translation machinery, also in P-bodies.

Recommended Reading

If you're feeling ambitious after reading this, Nature Genetics has a "MicroRNA Revolution" supplement this month offered for free.

Zamore PD. Haley B. 2005. Ribo-gnome: The big world of small RNAs. Science. 309:1519-1524.

Sontheimer EJ. 2005. Assembly and function of RNA silencing complexes. Nat. Rev. Mol. Cell Biol. 6:127-138.

Valencia-Sanchez MA. Liu J. Hannon GJ. Parker R. 2006. Control of translation and mRNA degradation by miRNAs and siRNAs. Genes & Dev. 20:515-524.

MacRae IJ. Zhou K. Li F. Repic A. Brooks AN. Cande WZ. Adams PD. Doudna JA. 2006. Structural Basis for Double-Stranded RNA Processing by Dicer. Science. 311:195-198.

Zeng Y. Cullen BR. 2004. Structural requirements for pre-microRNA binding and nuclear export by Exportin 5. Nuc. Acids. Res. 32:4776-4785.

Seitz H. Zamore PD. 2006. Rethinking the microprocessor. Cell. 125:827-829.