User:ElNando888/Blog/SCOR

Today I received a private message from (a Spiral group member):

(...) The technical aspects of RNA are still above my understanding.. tried to look at the SCOR website but obviously don't know how to make use of it.. :/

Hmm, "make use of it"... Well, I'm not sure yet what it can really be useful for, but I started trying to apply some things, like in the example of the hairpin issue in CL17.

Other than that, I've simply tried to "explore" the motifs, trying to understand better what kinds of shapes and/or interactions actually exist, especially those that are not covered by the nearest-neighbor model(s) that we normally use.

I'll give an example. When I discovered this database, I was fascinated by the multiplicity of the 3-dimensional motifs that can be found in RNAs. Specifically, one appeared so mysterious to me that I wanted to know more about it, namely the "dinucleotide platform". What the bloody heck is that ??

I'm not sure I could give you a direct link to the motifs, so please follow this path:

Go to SCOR
click "Structural Classification"
expand (click the "+" sign in front of) "Internal Loops"
expand "Loops with dinucleotide platforms"
expand "Loops with simple dinucleotide platforms"
expand "In 16S rRNA"

Ok, there's a very short sequence, UA, and it seems to be a "conserved" motif, since it is found in both Escherichia coli and Thermus thermophilus (fascinating to see how scientists "love" those organisms :p). The one for E. coli could be a small sequence, as the numbers are low, so let's try to see what's in this PDB accession 1BGZ.

For those who want to follow the story, and don't have Chimera yet, check out my doc Quickstart with Chimera 1.7.

Everyone ready ? Ok, open Chimera and fetch the 1BGZ structure, process it like I do in the quickstart (additionally, you will need to use the Model Panel to ungroup the models, and then make all but the first invisible), and observe carefully the "thing".

Wow! Do you see what I'm seeing? In particular, the relative spatial positions of U5 and A6?

Intriguing, but what does EteRNA think of this sequence? Before we can answer that question, we need to find out what the native structure is... How do we get the native secondary structure? Well, to be perfectly honest, that's quite a hike:

on SCOR follow the 1BGZ link, you land here -> http://scor.berkeley.edu/pdbInfo.jsp?pdb=1bgz&idTopClass=structure
follow PDB summary, you land here -> http://www.rcsb.org/pdb/explore.do?structureId=1bgz
bottom of that page, follow the NDB link, you land here -> http://ndbserver.rutgers.edu/servlet/IDSearch.NDBSearch1?id=1BGZ
on that page, follow Base Pair Parameters, you land here -> http://ndbserver.rutgers.edu/atlas/nmr/structures/id/1bgz/1BGZ-bpp.html
and then you have to "decode" the list and input the dot-bracket manually in the puzzle editor
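That last "decoding" step can be sketched in a few lines of Python. This is only a minimal sketch, assuming you have already read the paired positions off the base-pair table as 1-based (i, j) numbers. The function name `pairs_to_dot_bracket` and the pair list are my own illustration: the list is simply the one implied by the native structure quoted further down in this post, not something copied from the NDB page.

```python
def pairs_to_dot_bracket(length, pairs):
    """Build a dot-bracket string from 1-based (i, j) base-pair positions.

    Only nested pairs are supported (no pseudoknots), and each position
    may take part in at most one pair, so extra contacts such as base
    triples or platforms have to be left out of the list.
    """
    structure = ["."] * length
    for i, j in pairs:
        if i > j:
            i, j = j, i  # accept pairs given in either order
        if structure[i - 1] != "." or structure[j - 1] != ".":
            raise ValueError(f"position used twice in pair ({i}, {j})")
        structure[i - 1] = "("
        structure[j - 1] = ")"
    return "".join(structure)


# Assumed pair list for the 23-nt 1BGZ hairpin (illustration only).
NATIVE_PAIRS = [(1, 23), (2, 22), (3, 21), (4, 20),
                (7, 19), (8, 18), (9, 16), (10, 15)]

native = pairs_to_dot_bracket(23, NATIVE_PAIRS)
print(native)
```

The printed string is what goes into the puzzle editor as the target structure.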

Out of breath ? Don't worry, you get used to it after a while :P

Update: Ok, if you guys prefer to trust me on this, this is the whole data:

Sequence	`GGGAUACUGCUUCGGUAAGUCCC`
MFE structure	`(((((..(((....))).)))))`
Native structure	`((((..((((....)).))))))`

Just copy/paste these in the puzzle maker.
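If you want to see exactly where the two folds disagree before pasting, here is a small Python sketch that parses both dot-bracket strings and lists the pairs that are unique to each. The helper name `dot_bracket_to_pairs` is mine, and the script only compares the strings given above; it does not compute any energies.

```python
def dot_bracket_to_pairs(structure):
    """Return the set of 1-based (i, j) pairs encoded by a dot-bracket string."""
    stack = []
    pairs = set()
    for pos, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.add((stack.pop(), pos))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


SEQUENCE = "GGGAUACUGCUUCGGUAAGUCCC"
MFE = "(((((..(((....))).)))))"
NATIVE = "((((..((((....)).))))))"

mfe_pairs = dot_bracket_to_pairs(MFE)
native_pairs = dot_bracket_to_pairs(NATIVE)

only_mfe = sorted(mfe_pairs - native_pairs)
only_native = sorted(native_pairs - mfe_pairs)

for label, pairs in (("MFE only", only_mfe), ("Native only", only_native)):
    for i, j in pairs:
        # Label each pair with its bases, e.g. U5-G19.
        print(f"{label}: {SEQUENCE[i - 1]}{i}-{SEQUENCE[j - 1]}{j}")
```

With these inputs, the MFE structure has U5-G19 and U8-A17 where the native structure has C7-G19 and U8-A18; the other six pairs are shared.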

Now, let's see what it looks like in EteRNA.

Don't you feel that it's totally outrageous that Mother Nature prefers a fold that is 2.7 kcal/mol worse than our beautifully dynamic-programmatically calculated Minimum Free Energy structure? Well, maybe there are logical reasons for this fact.

(to be continued...)

(Last time on EteRNA... and now, the conclusion)

Well, I think there are two ways to approach the problem. Either the model used by EteRNA is overestimating the stability of the MFE, or it's underestimating the stability of the native fold. I believe the answers are more likely to be found in the second option.

One of the first features of the 3D structure that caught my attention, was that U5 and A6 are co-planar. And very close to each other, of course. And you know what happens when nucleobases are co-planar and close ? Of course, you do know, they form hydrogen bonds. In other words, they pair.

Now if you think about it for a minute, where did we find this sequence and 3D structure in the first place? In a database, classified as a "dinucleotide platform". What does that mean? "Dinucleotide" is clear and simple, we're talking about two bases. But if it's just a normal pair, canonical or not, why would they invent this "platform" thing? Because it's a very unusual pairing, one that occurs between two neighboring bases on the same strand.

So, coming back to the 2D native structure, EteRNA (and probably all other models) is missing a couple of hydrogen bonds from a "weird" pair, which should make a small contribution to stability. But that's not all of it, I think. Observing the right-most 3D shot above, the following idea occurred to me.

It may seem a little far-fetched, but to me, this could suggest that the 2-0 bulge that we can see in the native structure might actually work nearly as well as a 1-0 bulge.

Let me show you another section of the 3D rendering.

This is the area around the 1-0 bulge at A17. On the left, the "normal" view, which I find quite difficult to "decode". Don't ask me how I got the picture on the right, but after toying for a while with some Chimera menu options, I managed to make the backbone invisible, leaving only the bases. And I believe the result is quite eloquent.

Observe the placement of A17. Yes, that's an even weirder combo: A18 actually pairs with both U8 and A17. That's what the scientist dudes call a "base triple".

Also, observe the alignment of the vertical axis as a whole. See a bend anywhere? Well, to me, just like earlier, this suggests that, because A17 is in the same plane as the U8-A18 pair, this 1-0 bulge may very well work just like a... 0-0 bulge.

Now, what does EteRNA think of the native structure without A6 and without A17 ?
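To build the input for that experiment without counting characters by hand, here is a minimal Python sketch that deletes the two bulged adenines from both the sequence and the native dot-bracket. The function name `drop_unpaired` is made up for this post, and it only edits strings; what EteRNA's energy model makes of the result is a separate question.

```python
def drop_unpaired(sequence, structure, positions):
    """Remove 1-based positions from a sequence and its dot-bracket string.

    Refuses to remove a paired base, since that would unbalance the brackets.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure lengths differ")
    to_drop = set(positions)
    for pos in to_drop:
        if structure[pos - 1] != ".":
            raise ValueError(f"position {pos} is paired, cannot drop it")
    keep = [k for k in range(len(sequence)) if (k + 1) not in to_drop]
    new_sequence = "".join(sequence[k] for k in keep)
    new_structure = "".join(structure[k] for k in keep)
    return new_sequence, new_structure


SEQUENCE = "GGGAUACUGCUUCGGUAAGUCCC"
NATIVE = "((((..((((....)).))))))"

# A6 sits in the 2-0 bulge, A17 is the 1-0 bulge.
reduced_sequence, reduced_structure = drop_unpaired(SEQUENCE, NATIVE, [6, 17])
print(reduced_sequence)
print(reduced_structure)
```

The two printed lines are the 21-nt sequence and its target structure, ready for the puzzle maker.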

Ok, remember that the only facts on this page are the 3D renderings (source PDB) and the native secondary structure (source NDB); everything else is the insanity of my uneducated and dysfunctional brain. But really, if you guys have a better explanation for why the native structure is preferred over the MFE, I'm all ears :)

----

Beyond these analyses and theories, one could ask: assuming that the above ideas are correct, what are the implications for EteRNA lab experiments? Would the SHAPE data for this sequence be predictable? If so, what would it look like?

If I ever find the time and motivation, it will be the subject of a new blog entry.