User talk:Brourd
== Targets ==
I've been thinking about targets and their design rules. Regarding the evaluation of sequences in the lab, I think it would be good to have the binding site "flipping" as much as possible.
State | NAGGAUAU | AGAAGGN |
FMN bound | xxxx.xxx | xxxxxxx |
FMN not bound | xx...xxx | xx.xxxx |
no FMN | ?????... | ?.....? |
The idea is to be able to get a rough idea of how much of the bound shape occurred when no FMN had been added, as well as get an estimate how well the switch occurred when FMN was present.
Thoughts?
-- ElNando888 (talk) 02:03, 22 February 2014 (UTC)
----
Indeed, observing the SHAPE signals for these specific residues will probably be one of the more specific ways to determine if the binding site formed.
I think the forum post I linked about the repeatability of riboswitch scoring noted that, in several of the cloud labs, the binding site residues in the first state were protected from the SHAPE probe when they should have been exposed. One explanation I gave was that the FMN-ready state was forming before FMN was introduced into the solution.
--Brourd (talk) 16:24, 22 February 2014 (UTC)
----
This is something I don't think was ever tested in EteRNA:
An inverted binding site, with AGAAGGN residing 5' of NAGGAUAU.
So far, I see no logical reasons why it wouldn't work just as well. Or if it doesn't, I'd like to know why... Thoughts?
And while we're on this topic, the scientific papers I've read do not indicate a preference for the closing base pair. This closing UA pair could apparently be mutated to any canonical or GU wobble pair. Do we want to test that as well?
-- ElNando888 (talk) 12:22, 22 February 2014 (UTC)
----
So, part of the reason I am currently running http://eterna.cmu.edu/web/lab/3376174/ is this very theory. While I am not entirely convinced that this inverted binding site works, I am certainly willing to have it tested if the SHAPE signals for the inverted loop are similar to those of the normal FMN aptamer when FMN is not present.
As for testing alternative closing base pairs, that could very well be done, and if I get word back on an experimental protocol that could potentially allow us to test the riboswitch constructs in solution with FMN, it would certainly be easy to do.
--Brourd (talk) 16:24, 22 February 2014 (UTC)
----
I've been looking at the puzzles I generated in my SCLT-NG series. So far, the one I'd rather have as a first target for the upcoming lab would be one of these:
- SCLT-NG 10: looks fairly simple, yet seems like a possibly very interesting candidate, 5 clear switching segments... could also be shortened a bit (the non-switching neck area doesn't really need to stay like that)
- SCLT-NG 11: a few switching segments, a long sliding stack... the near lack of switching bases in the 53-73 area is a bit unsatisfactory though.
- SCLT-NG 13: looks very good
- SCLT-NG 15: looks acceptable to me, 4 switching segments
- SCLT-NG 21: so far my favorite, large spread of switching bases in all areas, numerous switching groups, a multiloop in the unbound structure, looks great to me
- SCLT-NG 35: also good IMO, but the unpaired area 68-86 in the unbound structure looks a little scary...
- SCLT-NG 39: if we're tempted by the feeling of "Mission Impossible", I'd say, that's the one to go with :D
I neglected to mention this one, but maybe I shouldn't:
- SCLT-NG 4: essentially, 2 very large switching segments
This one is a little "special", as it is a sort of personal "subproject" in the field of riboswitches. Maybe we will have the opportunity to talk about it later, and even to test the idea. At this point, it seems less important than other goals we have in mind.
Nevertheless, this particular puzzle seems to present interesting qualities. Large switching areas that would be easy to score, even just visually, great simplicity...
-- ElNando888 (talk) 06:32, 10 March 2014 (UTC)
----
My suggestion would be to ask the players which switch they would like to solve, since any and all of these may make excellent switch mimic pilot candidates.
In addition, I have been thinking of expanding the number of mimic sequences to test from 4 to 6 or 7.
--Brourd (talk) 23:11, 10 March 2014 (UTC)
----
First off, and for the other readers: Brourd and I have been discussing the choice of the first target over chat. After he raised an issue about SCLT-NG 21, he came up with a few proposals himself.
Also, I've written a script (which is unfortunately difficult to publish for others to use, I'll come back to this later) that allows me to get a "feeling" of what the effects of mimics would possibly be.
As I tried to predict general features for Brourd's DYN - 2, it became rather clear that the currently considered mimics are mostly inadequate for this puzzle. The reason for that somewhat surprising fact is that the locked bases of the binding site are all unpaired in the unbound state, and can easily bind with other bases facing them in these loops if one attempts to mutate them. The consequence is that the effect of the mimic is greatly reduced, since mutations tend to boost both shapes (unbound and bound).
Which means:
- in general, we will probably need very different sets of mimics for different targets
- we could end up needing to code something that gives us good mimic candidates for a given target, rather than relying on intuition like we're doing now
A more immediate concern is the pilots we're starting next week. If we go for DYN - 2, then we will probably require a very different set of mimics than the one that has accumulated (15 already) in Brourd's pilot lab.
We have alternatives though. SCLT-NG 10 for instance, seems to fit the current set of mimics very well. Although I found a potential issue in that puzzle, I believe it can be dealt with successfully, so I would still consider it a good candidate. And I'm also working on fixing SCLT-NG 21, which I hope, will still match that same mimics set.
-- ElNando888 (talk) 16:19, 12 March 2014 (UTC)
----
I'd say, let's just go with DYN-2, and these mimics.
Supposedly good:
- CUGAAC / GACGG
- CGUUAC / GGAGG
Supposedly average:
- CGGAUA / GACGG
- AGGAUA / UAUUG
Supposedly quite bad:
- ACGAUA / GACGG
-- ElNando888 (talk) 11:01, 13 March 2014 (UTC)
----
== Background ==
Hi Brourd,
I think it would be good to present the motivations behind the whole project. Since I'm not entirely sure I'm getting 100% of it myself, I'm going to write down what I understood, and I'd appreciate your correcting me when I'm wrong.
I believe both the scientists and the players would love to try designing switches again. There are issues though, related to the new pipeline, Cloud Lab.
- it doesn't make sense to run switch-related sequences along with mono-state ones. It would represent a costly waste of resources
- making a round with exclusively switching sequences is basically twice the work for the lab, and probably also twice the price
So, what can we do? Some time ago, an idea emerged that could be used in conjunction with Cloud Lab as it is now, without disturbing the cycle too much. Essentially, the idea consists of submitting the same sequence twice, once with the FMN binding site, and once with that area mutated to something else, which we will dub a "mimic". Assuming the mutation would have the same effect as a real FMN binding with the aptamer (in other words, providing a bonus of about -4.9 kcal/mol), this system would allow us to screen for good switch candidates. And once we have accumulated enough of those, a dedicated lab run could be done, with the real FMN addition this time.
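To get a feel for what a bonus of about -4.9 kcal/mol would do, here is a minimal two-state sketch. It assumes the molecule only ever sits in the unbound or the bound shape, a temperature of 37 °C, and made-up free energies; the function name and the example numbers are illustrative and not taken from the Cloud Lab pipeline.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)
T_KELVIN = 310.15   # 37 degrees C, an assumed folding temperature


def bound_fraction(dg_unbound, dg_bound, bonus=0.0):
    """Two-state Boltzmann fraction of the bound shape.

    dg_unbound, dg_bound: free energies (kcal/mol) of the two shapes.
    bonus: extra stabilisation (kcal/mol, negative) applied to the bound
    shape, standing in for FMN binding or for a mimic.
    """
    rt = R_KCAL * T_KELVIN
    # Energy gap between the bound shape (with bonus) and the unbound shape.
    gap = (dg_bound + bonus) - dg_unbound
    return 1.0 / (1.0 + math.exp(gap / rt))


# Hypothetical design: the bound shape is 2.5 kcal/mol less stable than
# the unbound one before any bonus is applied.
without_bonus = bound_fraction(-20.0, -17.5)
with_bonus = bound_fraction(-20.0, -17.5, bonus=-4.9)
print(f"bound fraction without bonus: {without_bonus:.3f}")
print(f"bound fraction with -4.9 bonus: {with_bonus:.3f}")
```

In this toy case the bonus moves the molecule from mostly unbound to mostly bound, which is the behaviour a good mimic would have to reproduce.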
If I understand correctly our purpose in this project, our goals are:
- first, to create good mimics
- second, to start generating good switches
About the first step, I may have missed something, but I couldn't find any documents, neither on EteRNA, nor on RMDB, related to tests with mimics that Das Lab would have already made. The only thing I heard (from you Brourd) is that they tested this:
Which, I would argue, doesn't look very good to me, but let's leave this topic for later.
And once we have (or at least, we think we have) good enough mimics, the second step sounds almost "small and easy": design targets, and sequences for those. And voila :)
-- ElNando888 (talk) 15:07, 20 February 2014 (UTC)
So, the question of where the FMN mimic sequence originally started...
The FMN Mimic was actually run for 2 designs of every lab in Round 70 of the cloud lab, aka, the first and only round of FMN switches in the cloud lab so far.
As you may recall, I actually wrote something about it, as well as about the repeatability of the riboswitch scoring, a very, very long time ago: https://getsatisfaction.com/eternagame/topics/fmn_mimics_and_the_repeatability_of_switch_results
In addition, Dr. Rhiju Das wrote a little blog post (http://eterna.cmu.edu/web/blog/2891462/) about how his Ph.D. students participated in the first round of the EteRNA switches, and if I recall correctly, these switches were made and scored with the FMN mimic.
All the chemical mapping data for the FMN switches and mimics are available in round 70 (http://rmdb.stanford.edu/repository/detail/ETERNA_R70_0000) and round 71 (http://rmdb.stanford.edu/repository/detail/ETERNA_R71_0000) on the RMDB. I'm sure you can figure out what to do from here!
So, back to the motivation and goals for the project, and why they were not included in the original post.
It was 1 am :P (I'll work on including that)
However, your observation is correct. The goal of the FMN mimic is essentially to create a protocol in which a multiple-state RNA system based on the binding of a ligand can be tested in the absence of said ligand (other multistate systems may be a tad more difficult). This protocol could potentially extend to other aptamers as well, which would be a future goal. In addition to this, the project has two additional main goals, as well as a few secondary goals:
- The characterization and creation of successful riboswitch constructs.
- The characterization and implementation of successful riboswitch design rules.
- Potential reworking of riboswitch scoring based on the SHAPE chemical mapping protocol.
- A comparison of riboswitches with canonical base pairs versus those with noncanonical base pairs.
- Implement a pipeline for the testing of riboswitches using the Das Lab's high throughput chemical mapping protocol.
- Determine if current automated algorithms can design successful riboswitches (NUPACK, ViennaUCT, any other publicly available multistate design algorithms)
- (A VERY minor/secondary goal) The creation of an automated algorithm to design riboswitches, coding both the rules for constructs and sequences into it.
-- Brourd (talk) 22:51, 20 February 2014 (UTC)
----
Thanks Brourd,
The data I can find in round 70 on RMDB indicates that only and all EteRNA players' designs were tested 4 times, with and without FMN, with 2 different chemical probes, 1M7 (SHAPE) and DMS. I see no traces of mimics in the dataset.
Round 71 (Cloud Lab round 3) was a repeat of Cloud Lab 1, so completely different sequences and constructs. Though, this batch does include the students switch constructs Rhiju is talking about. But those sequences weren't tested against the real FMN...
So, I still don't see any data that could speak about the effectiveness of the method for screening good FMN switches.
-- ElNando888 (talk) 03:32, 21 February 2014 (UTC)
----
So, in round 70, look at annotation data 3713 to annotation data 3872. The MAPseq ID is the design ID followed by -a, to indicate that it is a mimic. (a for alternate, maybe?)
Example
ANNOTATION_DATA:3868 modifier:DMS MAPseq:design_name:JG #1 MAPseq:project_name:Top Notch by jmf028 MAPseq:ID:2426211-a
sequence:GGAAAUUUAAGCACAGAGGGCCUAUCUCGAAACGAGAAGGUCCUCACCAUCAAAAGAUGGAAGUGCAAGUUUACAUUCGUGUAAACAAAAGAAACAACAACAACAAC
structure:..........((((.((((((((.(((((...))))))))))))).(((((....)))))..))))..(((((((....))))))).....................
signal_to_noise:weak:0.325 MAPseq:tag:FAM-RTB003 chemical:FMN:200uM
The FMN mimic is in bold in the sequence.
*minor note* the Das lab never actually published the results in EteRNA, so maybe they thought the results were a bust? lol
Which is good news, since that means that this project's work-to-chance-of-failure ratio has just increased, what fun! :)
---Brourd (talk) 04:22, 21 February 2014 (UTC)
----
Ok, I think I finally found it, thanks for the hints.
The mimicking sequences are located in http://rmdb.stanford.edu/site_media/rdat_files/ETERNA_R70_0000/ETERNA_R70_0000.rdat and the annotation data span from 3713 to 3872, just as you indicated. Those 160 data points are for 40 sequences (2 designs were selected from each of the 20 labs), and those sequences underwent the same protocol as the others: tested with and without FMN, probed with 1M7 and DMS. Here is a note related to what I was saying earlier about waste of resources: these mimicking sequences didn't need to be tested with FMN, so 80 slots were "wasted"...
Unfortunately, the data associated with the mimics is only in RMDB, not in EteRNA, and it's a little hard to compare "by hand". Locating which annotations should be compared to which other one, is already some work in itself. Let's take an example:
Data points 3713 to 3716 are for the same mimicking sequence, the one we want to use is 3713 (modifier: 1M7, FMN: 0uM), the field MAPseq:ID:2426173-a gives us the base one (2426173, without dash a), the data for that sequence is located at data points 1-4, and there, we want to use the number 2 (modifier: 1M7, FMN: 200uM)
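One way to make this pairing systematic is sketched below. It assumes the annotations have already been read into dictionaries; the keys `index`, `modifier`, `fmn_uM` and `mapseq_id` are hypothetical names chosen for this example and are not an RDAT parser API. The toy records follow the example above; the conditions of data points 3, 4 and 3714 to 3716 are assumed for illustration.

```python
def pair_mimics_with_bases(records, modifier="1M7"):
    """Return (mimic_index, base_index) pairs of data points to compare.

    A mimic record (ID ending in "-a", probed without FMN) is paired with
    the record of its base design probed with 200 uM FMN.
    """
    # Base designs probed with the chosen modifier in the presence of FMN.
    base_with_fmn = {}
    for rec in records:
        is_mimic = rec["mapseq_id"].endswith("-a")
        if not is_mimic and rec["modifier"] == modifier and rec["fmn_uM"] == 200:
            base_with_fmn[rec["mapseq_id"]] = rec["index"]

    pairs = []
    for rec in records:
        is_mimic = rec["mapseq_id"].endswith("-a")
        if is_mimic and rec["modifier"] == modifier and rec["fmn_uM"] == 0:
            base_id = rec["mapseq_id"][:-2]  # strip the "-a" suffix
            if base_id in base_with_fmn:
                pairs.append((rec["index"], base_with_fmn[base_id]))
    return pairs


records = [
    {"index": 1, "modifier": "1M7", "fmn_uM": 0, "mapseq_id": "2426173"},
    {"index": 2, "modifier": "1M7", "fmn_uM": 200, "mapseq_id": "2426173"},
    {"index": 3, "modifier": "DMS", "fmn_uM": 0, "mapseq_id": "2426173"},
    {"index": 4, "modifier": "DMS", "fmn_uM": 200, "mapseq_id": "2426173"},
    {"index": 3713, "modifier": "1M7", "fmn_uM": 0, "mapseq_id": "2426173-a"},
    {"index": 3714, "modifier": "1M7", "fmn_uM": 200, "mapseq_id": "2426173-a"},
    {"index": 3715, "modifier": "DMS", "fmn_uM": 0, "mapseq_id": "2426173-a"},
    {"index": 3716, "modifier": "DMS", "fmn_uM": 200, "mapseq_id": "2426173-a"},
]
print(pair_mimics_with_bases(records))  # [(3713, 2)]
```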
I'm trying to figure out how to be systematic with this, and haven't come up with much yet. But while doing this, I already found one comparison that seems to relate to what I was saying about false positives. Consider JMF's LaJ Solve http://eterna.cmu.edu/game/solution/2426174/2426905/seeresult/ From a visual inspection, it seems clear to me that the sequence folded into the unbound shape, and that the addition of FMN resulted in simply nothing: the molecule just didn't budge. Now, comparing data sets 3793 and 1606 should convince you that the mimic was strong enough to actually change things...
-- ElNando888 (talk) 08:55, 21 February 2014 (UTC)
----
Well, who ever said the job of finding a mimic would be easy? :)
The potential for false positives can exist in three different contexts.
- Mutations to the binding site affect the fold of the secondary structure globally, preventing or allowing for the presence of suboptimal structures that differ from the WT sequence.
- The free energy contribution of the mimic is not equivalent to the free energy contribution of FMN at a 200uM concentration.
- Tertiary structure differs significantly between the way the mimic folds, and the way FMN folds, potentially altering the global structure (and resulting SHAPE signals)
With these factors in mind, our goals for this project remain the same
- Development of a sequence that can potentially mimic the binding of a molecule in a multiple state RNA system.
- Development of structures and sequences that allow for the successful design of riboswitches.
- Development of a protocol to implement these features.
We could also add a new goal as well, if you wish
- Determine if the use of a mimic is possible, and if it is not, write a detailed proposal for the Das lab that explains why they need to implement a pipeline for riboswitches in their high-throughput synthesis protocol.
---Brourd (talk) 15:36, 21 February 2014 (UTC)
----
Ok, I started this manually:
https://docs.google.com/spreadsheet/ccc?key=0AsEEBMO3fRaUdDJpbFF2azdMM3g2Y3ZnY0lEY21KNEE#gid=0
Probably gonna take me a while to put them all in there, but I think it will be worth the effort.
And agreed on all you just said.
-- ElNando888 (talk) 17:24, 21 February 2014 (UTC)
----
I started a google doc collecting the data. It contains three rows for each design:
- The original design without FMN
- The original design with 200uM FMN
- The mimic design
The first sheet "Annotated," is just all of the original data. In the second sheet, "Reactivity Difference," I calculated the differences between:
- The original design without FMN and the original design with 200uM FMN
- The original design without FMN and the mimic
I started to add some graphs of the calculated differences. After generating a few of these, I thought it might be helpful to note where the designs were supposed to switch, so I attempted to do this on the first two. I defined the FMN-bound state as the mimic state, because it was easy for me to do, and I don't know the reactivity pattern for the FMN-bound section of RNA. If you think adding this to the graph is useful (or have an idea of a better way to visualize it), let me know.
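The two difference rows described above can be sketched as follows. The sign convention (condition minus the no-FMN baseline) and the reactivity values are assumptions made for this example; the spreadsheet may use the opposite sign.

```python
def reactivity_difference(baseline, condition):
    """Per-base reactivity change: condition minus baseline."""
    if len(baseline) != len(condition):
        raise ValueError("reactivity profiles must have the same length")
    return [c - b for b, c in zip(baseline, condition)]


# Made-up reactivities for a four-base stretch of one design.
no_fmn = [0.9, 0.8, 0.1, 0.2]     # original design without FMN
with_fmn = [0.2, 0.3, 0.7, 0.2]   # original design with 200uM FMN
mimic = [0.1, 0.4, 0.9, 0.3]      # mimic design

fmn_effect = reactivity_difference(no_fmn, with_fmn)
mimic_effect = reactivity_difference(no_fmn, mimic)

for position, (f, m) in enumerate(zip(fmn_effect, mimic_effect), start=1):
    print(f"base {position}: FMN {f:+.2f}, mimic {m:+.2f}")
```

If the mimic behaves like bound FMN, the two columns should rise and fall at the same positions.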
The doc is here:
https://docs.google.com/spreadsheet/ccc?key=0AppiCUq-Rq1tdHl5MUpRSmlscVVzMFJCVGNXb3hjbFE#gid=2
Meechl (talk) 03:25, 23 February 2014 (UTC)
----
Hi Meechl, and this is great news :)
You will need to fix the sharing of the document though, as of now, it seems to be private and only you can view the content.
-- ElNando888 (talk) 13:30, 23 February 2014 (UTC)
----
Nice Work Meechl!
--Brourd (talk) 18:52, 23 February 2014 (UTC)
----
So, I was working on adding the switch trends to the graphs when I noticed the data wasn't making sense... then I noticed I had mislabeled the rows on the sheet with the graphs. D:
I fixed the problem in this NEW document. It uses the new version of spreadsheets and has been working faster than the old page. I'll delete the sheet with the graphs on the original document eventually to avoid confusion. The sharing on the new document should be set so that anyone can view and comment. I can give you the ability to edit too if you want.
https://docs.google.com/spreadsheets/d/1K2Zp-75Im-34U78f0-zv-HG5kro1RsKDEzm6SLp8S7E/edit?usp=sharing
Meechl (talk) 03:20, 24 February 2014 (UTC)
----
== Recommended readings ==
To be honest, I'm not sure what to recommend here. Though I want to mention a publication that I find quite enlightening:
Giulio Quarta, Ken Sin and Tamar Schlick
Dynamic Energy Landscapes of Riboswitches Help Interpret Conformational Rearrangements and Function
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3280964/
PLoS Comput Biol. 2012 February; doi: 10.1371/journal.pcbi.1002368 (http://dx.doi.org/10.1371%2Fjournal.pcbi.1002368)
Granted, it's... quite a lot to gobble for the occasional RNA-toying amateur...
But if you manage to get past the biotechnical mumbo-jumbo, it gives a lot of insights about what's going on in cells, and how those riboswitches appear to work... I find the classification between kinetically-driven and thermodynamically-driven switches especially interesting. If I understand our pipeline correctly, we're probably trying to create switches of the latter class. But I wonder if applying their methods, which would in theory ensure the creation of functional switches in vivo, would not actually be the smart thing for us to do, since we would acquire knowledge and expertise that would be applicable not only in vitro.
Also, the computational methods are actually accessible (not a 5-minute job, but still), which means that I could contemplate the idea of applying these methods (with some adaptations) in my bot for instance. A long term plan though.
-- ElNando888 (talk) 18:35, 21 February 2014 (UTC)
----
Another suggestion would be to check out this page:
http://2011.igem.org/Team:Peking_R/Project/RNAToolkit2
This is not a scientific paper, and it's not even about FMN, but it illustrates how some scientists go about using what they call "1nt slipping mechanism" with a good number of non-canonical pairs to engineer a switch. I believe something similar could very well be used with FMN as well.
-- ElNando888 (talk) 00:24, 23 February 2014 (UTC)
----
I'll be adding a few pointers here, as I find them:
- http://www.nottingham.ac.uk/ncmh/BGER/pdf/volume_24/15-Smolke-BGER.pdf
- http://intl-cshperspectives.cshlp.org/content/4/2/a003566.full
-- ElNando888 (talk) 08:13, 24 February 2014 (UTC)
----
Brourd and I already talked a few times about this, but I think it may be important to mention it here, so that the other members of the group are informed.
There is a potential issue with FMN that we seem to ignore at EteRNA. Some 17 years ago, a German team of scientists described a phenomenon called "photocleavage", which occurs when RNAs (with GU pairs inside stacks) are in the presence of FMN and... light. And by light, I mean it doesn't need to be a strong source: a 20W halogen lamp at about 30 cm is enough to cause the reaction.
If I'm not mistaken, their first paper was http://pubs.acs.org/doi/abs/10.1021/ja962918p but as you can see, it is paywalled. Still, the supplementary information located at http://cdn-pubs.acs.org/doi/suppl/10.1021/ja962918p/suppl_file/ja1137.pdf provides quite a number of very interesting results.
Later, they published http://nar.oxfordjournals.org/content/25/20/4018.long which goes into much more detail about this reaction.
What does that mean for EteRNA, and of course, for us? To be honest, I have no idea, but considering the potential for disturbances that this reaction could cause, I would be glad if someone would ask the scientists what is done to take this factor into account, or to prevent it. Maybe the next lab meeting would be a good opportunity for that. Because if they're not doing anything about it, we'd probably better avoid GU pairs in our designs...
-- ElNando888 (talk) 18:36, 27 February 2014 (UTC)
----
Something about SHAPE and scoring:
Maybe this would be the proper way to deal with the issue of scoring switches: instead of messing around with whether a base participates in a closing pair or mismatch, first process the SHAPE data according to the methods used in the paper above, and then use a simple (paired/unpaired) scheme for scoring the switches.
-- ElNando888 (talk) 04:41, 30 March 2014 (UTC)
----
== Riboswitch Testing via RNA Arrays ==
I am currently working on bringing a new method for riboswitch analysis into this project, using a method that Dr. Johan Andreasson describes in the January 9th Monthly EteRNA Meeting with the Das Lab.
If we are successful with this, we should have the ability to independently assess the effectiveness of riboswitches, ligand mimics, potential mutations to the structure or sequence of the FMN binding site, and the ability for SHAPE to be used in the determination of successful riboswitches.
--Brourd (talk) 18:52, 23 February 2014 (UTC)
----
== Questions about the project ==
Switch work in the lab
Maybe a dumb question:
"making a round with exclusively switching sequences is basically twice the work for the lab..."
Why is that - do both states have to be made, then run for SHAPE verification, or do they need a chemical trigger to switch, or why is this exactly twice the work?
Salish99 (talk) 19:59, 24 February 2014 (UTC)
----
Answer: In order for the Das lab to probe a sequence under two conditions (with FMN and without FMN, in the case of the riboswitch labs), they need two different sets of sequences. This means that they need to run the experimental protocol on both sets, meaning double the work, double the concentration, and double the tracking. Essentially, it is not quite as easy a task for their experimental team as a single set of sequences is.
--Brourd (talk) 22:05, 24 February 2014 (UTC)
----
== Switch scoring in the context of mimics ==
The scoring model I envision for the newly proposed lab would be as follows:
Nomenclature:
- shape_U = SHAPE value in the unbound structure
- shape_B = same for the bound structure
- T = threshold (standardized to 0.5 now)
There are 21 switching bases in this puzzle.
For the bases going from unpaired to paired (example base 6)
- 2 points, if shape_U > T > shape_B
- 1 point, if shape_U > shape_B
For the bases going from paired to unpaired (example base 24), the reverse
- 2 points, if shape_U < T < shape_B
- 1 point, if shape_U < shape_B
Finally, because we don't know yet how the mimic will behave SHAPE-wise, we have to exclude those bases from the scoring, but I would argue that I see at least 2 bases where we could do something. So, for bases 17 & 60
- 1 point, if shape_U > T
The rationale behind this last scoring rule is that if the molecule tends to form the bound structure even in the absence of ligand, then these bases would tend to show protection. So, rewarding the fact that they are generally reactive in the absence of FMN seems to be a good thing to do.
Total 44 possible points, which would be scaled up to 100.
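A minimal sketch of this scoring model, written only to make the rules above concrete. The function names and the `direction` labels are invented for this example, and bases that earn neither 2 nor 1 point are assumed to score 0.

```python
T = 0.5  # threshold from the nomenclature above


def score_switching_base(shape_u, shape_b, direction, threshold=T):
    """Score one switching base (0, 1 or 2 points).

    direction is "to_paired" for a base that is unpaired in the unbound
    structure and paired in the bound one, "to_unpaired" for the reverse.
    """
    if direction == "to_unpaired":
        # Mirror image of the first case: swap the two roles.
        shape_u, shape_b = shape_b, shape_u
    elif direction != "to_paired":
        raise ValueError(f"unknown direction: {direction}")
    if shape_u > threshold > shape_b:
        return 2
    if shape_u > shape_b:
        return 1
    return 0


def lab_score(switching, site_shape_u, threshold=T):
    """Total points scaled to 100.

    switching: list of (shape_u, shape_b, direction) tuples.
    site_shape_u: unbound SHAPE values of the scored binding-site bases
    (bases 17 and 60 in this puzzle).
    """
    points = sum(score_switching_base(u, b, d, threshold) for u, b, d in switching)
    points += sum(1 for u in site_shape_u if u > threshold)
    max_points = 2 * len(switching) + len(site_shape_u)
    if max_points == 0:
        return 0.0
    return 100.0 * points / max_points


# 21 perfectly switching bases plus two reactive binding-site bases:
# 21 * 2 + 2 = 44 points, scaled to 100.
perfect = [(0.9, 0.1, "to_paired")] * 21
print(lab_score(perfect, [0.8, 0.7]))  # 100.0
```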
Thoughts?
-- ElNando888 (talk) 11:57, 9 March 2014 (UTC)
----
Trying to see if I'm getting correctly the things Brourd and I talked about online.
- ShU = SHAPE value in the unbound structure
- ShB = same for the bound structure
- ThP = threshold under which a SHAPE value is considered protected (0.25?)
- ThR = threshold over which a SHAPE value is regarded as reactive (0.5?)
Unbound → / Bound ↓ | Paired (not closing) | Paired (closing) | Unpaired (mismatch) | Unpaired (not mismatch) |
Paired (not closing) | if (ShB < ThP) then MinReward; if (ShU < ThP) then MinReward | if (ShB < ThP) then MinReward; if (ShU < ThR) then MinReward | if (ShU > ThR) && (ShB < ThP) then MaxReward | if (ShU > ThR) && (ShB < ThP) then MaxReward |
Paired (closing) | if (ShB < ThR) then MinReward; if (ShU < ThP) then MinReward | if (ShB < ThR) then MinReward; if (ShU < ThR) then MinReward | if (ShU > ThR) && (ShB < ThP) then MaxReward | if (ShU > ThR) && (ShB < ThP) then MaxReward else if (ShU - ShB > ThR - ThP) then ReducedReward else if (ShU > ThR) then MinReward |
Unpaired (mismatch) | if (ShB > ThR) && (ShU < ThP) then MaxReward | if (ShB > ThR) && (ShU < ThP) then MaxReward | if (ShB > ThP) then MinReward; if (ShU > ThP) then MinReward | if (ShB > ThP) then MinReward; if (ShU > ThR) then MinReward |
Unpaired (not mismatch) | if (ShB > ThR) && (ShU < ThP) then MaxReward | if (ShB > ThR) && (ShU < ThP) then MaxReward else if (ShB - ShU > ThR - ThP) then ReducedReward else if (ShB > ThR) then MinReward | if (ShB > ThR) then MinReward; if (ShU > ThP) then MinReward | if (ShB > ThR) then MinReward; if (ShU > ThR) then MinReward |
I'd propose to make:
- ReducedReward = 3 x MinReward
- MaxReward = 5 x MinReward
This first table would apply to all bases, excluding the binding site. Another table would be needed for the bases in the binding site itself.
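The table can be condensed into a short function, sketched below. It relies on two assumptions: the two independent conditions in a non-switching cell are additive (0, 1 or 2 times MinReward), and MinReward is set to 1. The state labels and the function name are invented for this example; the thresholds use the tentative 0.25 and 0.5 values.

```python
MIN_REWARD = 1
REDUCED_REWARD = 3 * MIN_REWARD
MAX_REWARD = 5 * MIN_REWARD

PAIRED = {"paired", "paired_closing"}
UNPAIRED = {"unpaired_mismatch", "unpaired"}


def score_base(unbound, bound, sh_u, sh_b, th_p=0.25, th_r=0.5):
    """Score one base outside the binding site.

    unbound, bound: state of the base in each target structure, one of
    "paired", "paired_closing", "unpaired_mismatch", "unpaired".
    sh_u, sh_b: SHAPE values measured for the unbound and bound states.
    """
    for state in (unbound, bound):
        if state not in PAIRED | UNPAIRED:
            raise ValueError(f"unknown state: {state}")
    # Threshold each state is checked against in the non-switching cells.
    limit = {"paired": th_p, "paired_closing": th_r,
             "unpaired_mismatch": th_p, "unpaired": th_r}

    if (unbound in PAIRED) == (bound in PAIRED):
        # Non-switching base: each structure is checked independently.
        if bound in PAIRED:
            hits = (sh_b < limit[bound]) + (sh_u < limit[unbound])
        else:
            hits = (sh_b > limit[bound]) + (sh_u > limit[unbound])
        return hits * MIN_REWARD

    # Switching base: find which measurement should be open and which closed.
    if unbound in PAIRED:
        open_val, closed_val, open_state, closed_state = sh_b, sh_u, bound, unbound
    else:
        open_val, closed_val, open_state, closed_state = sh_u, sh_b, unbound, bound
    if open_val > th_r and closed_val < th_p:
        return MAX_REWARD
    # Only the closing-pair / plain-unpaired cells carry the fallback chain.
    if closed_state == "paired_closing" and open_state == "unpaired":
        if open_val - closed_val > th_r - th_p:
            return REDUCED_REWARD
        if open_val > th_r:
            return MIN_REWARD
    return 0


print(score_base("unpaired", "paired", 0.8, 0.1))          # clean switch
print(score_base("unpaired", "paired_closing", 0.6, 0.3))  # tolerated closing pair
```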
Open points/questions:
- what would you suggest as formulas in the empty cells above?
- how do we deal with 1 unbound + 4 bound mimics?
-- ElNando888 (talk) 07:04, 10 March 2014 (UTC)
----
So then, the threshold under which a residue is considered protected is 0.5.
The threshold over which a residue is considered reactive is 0.25.
As for the empty cells of the scoring protocol...
The SHAPE signal of both mismatches and closing base pairs can vary wildly, based on the sequence. Therefore, we will need special rules that exempt them from our normal scoring terms, to prevent the potential loss of successful switch designs.
What these special rules are, I don't know (sorry!), however, the continued observance of multistate RNA sequences should allow us to expand upon our knowledge of where SHAPE signals are likely to vary.
--Brourd (talk) 23:11, 10 March 2014 (UTC)
----
I guess we can boil it down this way:
SHAPE value | below 0.25 | 0.25 to 0.5 | above 0.5 |
 | Protected | Neither Protected, nor Reactive | Reactive |
 | | Both Protected and Reactive | |
The upper definition could be seen as the "strict" one, the bottom one as "relaxed"... Maybe we can simply apply "strict" to the unequivocal cases (no closing, no mismatch), and the "relaxed" definition for the less clear-cut cases... ?
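The two definitions can be written as one small classifier, sketched here with the 0.25 and 0.5 thresholds. The function name is invented, and the handling of values exactly on a threshold (strict inequalities) is an assumption.

```python
def classify(shape, relaxed=False, low=0.25, high=0.5):
    """Return the set of labels that apply to one SHAPE value.

    Strict: protected below `low`, reactive above `high`.
    Relaxed: protected below `high`, reactive above `low`, so values in
    between carry both labels.
    """
    if relaxed:
        protected, reactive = shape < high, shape > low
    else:
        protected, reactive = shape < low, shape > high
    labels = set()
    if protected:
        labels.add("protected")
    if reactive:
        labels.add("reactive")
    return labels


for value in (0.1, 0.3, 0.9):
    print(value, sorted(classify(value)), sorted(classify(value, relaxed=True)))
```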
-- ElNando888 (talk) 04:59, 11 March 2014 (UTC)
----
I looked over the proposed scoring mechanism, and I have a few questions:
- Is the plan to score all bases, or just to score the ones that switch? The first post on scoring gave me the impression that only the switching bases would be scored, but the table looks like it scores all bases.
- Does every scored base get Max, Min, Reduced, or no award? By the table, it looks like the paired-paired and unpaired-unpaired bases get Min or no award.
- I want to make sure I'm clear on the definition of mismatch. Does that just mean two unpaired bases right next to a stack?
- Do you think it'd be worthwhile to attempt to score some old switches this way, to see how this metric compares? I would be interested in trying to do this.
Thanks!
Meechl (talk) 16:22, 13 March 2014 (UTC)
----
Hi Meechl,
I took the liberty of numbering your questions, it'll be easier to follow I think.
- Yes, we would like to score all bases, but I'm trying to put the emphasis on the switching ones.
- Indeed, the plan is to give non-switching bases a modest score (if we put the unit as 1, then such bases would get between 0 and 2 points), while switching bases would get 5 (Max), 3 (reduced), 1 (as a tolerance for closing/mismatch) or 0 (failed)
- Yes, but I would think that it should apply only as long as they possibly influence free energy. So, not the ones in 0-X bulges.
- It would be awesome and very welcome, thanks :)
This said, this is only a first draft I came up with, for the sake of trying something. We also got some time before we have anything new to score with this model, so take your time, and if you feel like it, enjoy experimenting :)
-- ElNando888 (talk) 17:33, 13 March 2014 (UTC)
----
== Current round ==
So, here is what I came up with for our first target, the Bistable 3 design :
-2.51 mimics for Bistable 3 (VRNA 1.8.5) | https://gist.github.com/ElNando888/2857e1e5b75097201b38 |
-4.86 mimics for Bistable 3 (VRNA 1.8.5) | https://gist.github.com/ElNando888/df5d302c45e0eb8f802c |
-2.51 mimics for Bistable 3 (VRNA 2.1.1) | https://gist.github.com/ElNando888/3082998f661e7ee04d20 |
-4.86 mimics for Bistable 3 (VRNA 2.1.1) | https://gist.github.com/ElNando888/2030f8b1180dc922943a |
I hope I didn't screw up somewhere, so I'd be thankful for an independent verification. It is doubtful that you guys can check out what VRNA 2.x predicts, but using the switchmaker in EteRNA, it should be possible for anyone to pick one of the proposed mimics in the lists above, compute the cumulative effect on the sequence and verify that it is indeed either -2.5 or -4.9
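For anyone checking by hand, the arithmetic might look like the sketch below. It assumes that the "cumulative effect" is the energy change of the bound shape minus the energy change of the unbound shape when the binding site is replaced by a mimic; that reading, the function names, the 0.1 kcal/mol tolerance and the example energies are all assumptions, and the energies themselves would have to be read off the switchmaker or another folding tool.

```python
def mimic_bonus(e_unbound_orig, e_bound_orig, e_unbound_mimic, e_bound_mimic):
    """Net stabilisation of the bound shape caused by the mimic (kcal/mol).

    Each argument is the free energy of the given shape for the original
    sequence or for the sequence carrying the mimic.
    """
    boost_bound = e_bound_mimic - e_bound_orig
    boost_unbound = e_unbound_mimic - e_unbound_orig
    return boost_bound - boost_unbound


def matches_target(bonus, targets=(-2.51, -4.86), tolerance=0.1):
    """True if the bonus is within `tolerance` of one of the target values."""
    return any(abs(bonus - target) <= tolerance for target in targets)


# Made-up energies: the mimic stabilises the bound shape by 5.4 kcal/mol
# and the unbound shape by 0.5 kcal/mol.
bonus = mimic_bonus(-30.0, -28.0, -30.5, -33.4)
print(f"net bonus: {bonus:.2f} kcal/mol, matches a target: {matches_target(bonus)}")
```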
In terms of freedom of choice, I hope this will be enough to pick from...
-- ElNando888 (talk) 17:55, 21 April 2014 (UTC)
== Thermodynamic Mimics: Round 1 results summary. ==
Experimental riboswitch design has been stagnant in Eterna, the cloud biochemistry game, due to the prohibitive cost of the multiple chemical mapping experiments required, in addition to a very low overall success rate for designs. Recently, Dr. Johan Andreasson, in collaboration with the Das Lab at Stanford University, proposed a method for riboswitch analysis via RNA arrays and fluorescent markers, which is currently an active project that players can take part in. This represents one of many advances for Eterna in riboswitch design. However, Dr. Andreasson’s protocol imposes multiple requirements on the riboswitches to be analyzed, and it lacks structure mapping measurements at single-nucleotide resolution.
In order to combat the prohibitive costs, as well as gain experimental information about the switching potential of an RNA, the Das Lab created a 6x5 sequence they referred to as a mimic. This mimic is inserted into the 6x5 internal loop of the FMN-binding aptamer, creating Watson-Crick base pairs that should simulate the binding of FMN to an RNA sequence. Unfortunately, this mimic protocol was not implemented for the design and synthesis of riboswitches in Eterna. However, using experimental data and a reformed mimic protocol, we aim to show that this technique can be used for the successful simulation of riboswitches in vitro.
As an overview of the experiment we ran, there were eight targets:
- Hand and Finger Remade - OFF State: The original Hand and Finger FMN riboswitch was remade to help combat several issues. This target is the OFF state of the RNA.
- Hand and Finger Remade - Mimic: The original Hand and Finger FMN riboswitch was remade to help combat several issues. This target is the ON state of the structure, but with the FMN 6x5 aptamer unlocked so that players can mutate the loop into one of the many mimics they generated.
- Simple RNA Switch Remade - OFF State: The original Simple RNA Switch was remade to help combat several issues. This target is the OFF state of the RNA.
- Simple RNA Switch Remade - Mimic: The original Simple RNA Switch was remade to help combat several issues. This target is the ON state of the structure, but with the FMN 6x5 aptamer unlocked so that players can mutate the loop into one of the mimics they generated.
- Hair Trigger - Sub 2 (Stable) by Brourd: -2.51 kcal/mol Mimic Bonus: This Hair Trigger sequence was synthesized in the presence of FMN during the cloud lab switches run during round 70 of Eterna. All mimics were pre-generated to have a free energy bonus close to -2.51 kcal/mol, according to either the Vienna 1.8.5 algorithm or the Vienna 2.1.1 algorithm.
- Hair Trigger - Sub 2 (Stable) by Brourd: -4.86 kcal/mol Mimic Bonus: This Hair Trigger sequence was synthesized in the presence of FMN during the cloud lab switches run during round 70 of Eterna. All mimics were pre-generated to have a free energy bonus close to -4.86 kcal/mol, according to either the Vienna 1.8.5 algorithm or the Vienna 2.1.1 algorithm.
- Bistable 3 - Mod of Eli by mat747: -2.51 kcal/mol Mimic Bonus: This Bistable 3 sequence was synthesized in the presence of FMN during the cloud lab switches run during round 70 of Eterna. All mimics were pre-generated to have a free energy bonus close to -2.51 kcal/mol, according to either the Vienna 1.8.5 algorithm or the Vienna 2.1.1 algorithm.
- Bistable 3 - Mod of Eli by mat747: -4.86 kcal/mol Mimic Bonus: This Bistable 3 sequence was synthesized in the presence of FMN during the cloud lab switches run during round 70 of Eterna. All mimics were pre-generated to have a free energy bonus close to -4.86 kcal/mol, according to either the Vienna 1.8.5 algorithm or the Vienna 2.1.1 algorithm.
The hypothesis for this round was that procedurally generated FMN mimics with specific free energy values can emulate the free energy bonus that a specific concentration of FMN is modeled to have when in solution with the RNA.
Specifically, we used the Hair Trigger and Bistable 3 sequences for this.
The Hair Trigger ensemble shifted at a concentration of 200 µM FMN, which corresponds to a predicted free energy bonus of -2.51 kcal/mol. Therefore, both the -2.51 kcal/mol mimics and the -4.86 kcal/mol mimics should indicate a shift in the ensemble.
The Bistable 3 ensemble did not shift at a concentration of 200 µM FMN, which means that it should not switch with the -2.51 kcal/mol mimics. In addition, the ViennaRNA 2.1.1 model predicted that the RNA needed a bonus of at least -5.8 kcal/mol to shift in the ensemble, so it was hypothesized that the RNA would not switch successfully with the -4.86 kcal/mol mimics either.
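As a rough illustration of the Bistable 3 reasoning above, the sketch below treats the RNA as a simple two-state system. This is an assumption made for illustration only, not the ViennaRNA ensemble calculation used in the project. The function name `fraction_on`, the `gap` parameter (free energy of the ON state minus the OFF state without ligand), and the 37 °C temperature are all hypothetical choices; only the 5.8, -2.51 and -4.86 kcal/mol values come from the text.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def fraction_on(gap, bonus, temp_c=37.0):
    """Two-state estimate of the ON-state population.

    gap   : assumed free energy of ON minus OFF without ligand, in kcal/mol
            (positive means the OFF state is favored)
    bonus : free energy bonus applied to the ON state, in kcal/mol (negative)
    """
    rt = R_KCAL * (temp_c + 273.15)
    delta = gap + bonus  # remaining penalty for the ON state after the bonus
    return 1.0 / (1.0 + math.exp(delta / rt))

# Bistable 3: the model predicted that about -5.8 kcal/mol was needed to shift.
for bonus in (-2.51, -4.86):
    print(bonus, fraction_on(5.8, bonus))
```

Under this simplified picture, both bonuses leave the ON state in the minority when the gap is 5.8 kcal/mol, and the two states are equally populated only when the bonus equals the gap.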
Results
The results indicate, with little doubt, a correlation between the success of the RNA switch and the free energy bonus. The following graphic has four histograms of the results. The X axis shows the percentage of switching nucleotides that had a shift in reactivity, and the Y axis shows the number of designs that fall into each range.
For each mimic RNA, if the switching residue is paired in the OFF state and unpaired in the ON state, then the residue gets a full point if the SHAPE reactivity is greater than 0.5.
For each mimic RNA, if the switching residue is paired in the ON state and unpaired in the OFF state, then the residue gets a full point if the SHAPE reactivity is less than 0.5.
For each target, a single residue was removed from the overall switch score because data from the control designs point towards consistent reactivity between the ON and OFF states for that residue.
For now, the mimic sequence in the 6x5 loop is ignored in scoring, due to the randomness of the sequences and the many structures they may fold into at equilibrium. We are currently looking into ways to improve this.
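The scoring rules above can be sketched as follows. This is a minimal reading of the description, not the project's actual scoring code: the function names, the dot-bracket helper, and the toy structures and reactivities are hypothetical, and only the full-point rule with the 0.5 cutoff is modeled.

```python
def paired_mask(dot_bracket):
    """True where a residue is paired in a dot-bracket structure string."""
    return [c != "." for c in dot_bracket]

def switch_score(on_struct, off_struct, reactivity, excluded=(), threshold=0.5):
    """Percentage of switching residues whose SHAPE reactivity in the mimic
    data agrees with the ON-state pairing.

    excluded : indices left out of the score, e.g. the mimic loop residues
               and the single residue flagged by the control designs
    """
    points = 0
    counted = 0
    rows = zip(paired_mask(on_struct), paired_mask(off_struct), reactivity)
    for i, (on, off, r) in enumerate(rows):
        if on == off or i in excluded:
            continue  # not a switching residue, or deliberately ignored
        counted += 1
        if not on and r > threshold:
            points += 1  # unpaired in ON, paired in OFF: expect high reactivity
        elif on and r < threshold:
            points += 1  # paired in ON, unpaired in OFF: expect low reactivity
    return 100.0 * points / counted if counted else 0.0

# Toy example with made-up values (not experimental data).
on_state = "((....))"
off_state = "..((..))"
shape = [0.1, 0.7, 0.9, 0.8, 1.0, 1.0, 0.1, 0.1]
print(switch_score(on_state, off_state, shape))
```

Passing the excluded residues as indices keeps the two exceptions (the control-flagged residue and the mimic loop) separate from the pairing logic.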
The histograms shown above group all designs based on the percentage of residues that shift reactivity correctly. Based on this simple analysis, we can make some important observations.
First, none of the Bistable 3 -2.51 kcal/mol mimics had more than 60% of their residues shift in SHAPE reactivity, and the majority of mimic designs had less than 20% of their residues shift reactivity.
Second, 15 out of 33 of the Hair Trigger -2.51 kcal/mol mimics (about 45%) had 60% or more of their residues shift reactivity. We can most likely infer that the mimics had close to a 50% success rate for modeling the presence of FMN with Hair Trigger at -2.51 kcal/mol, which is still not the near 100% success rate we are aiming for.
Third, neither the Hair Trigger nor the Bistable 3 sequence has been chemically probed at the concentration of FMN necessary to achieve the -4.86 kcal/mol bonus. Even so, the results from both provide some very important details. The most important is that the Bistable 3 -4.86 mimics had a switch success rate of less than 50%. Additionally, the distribution of designs seems to indicate that, unlike the Hair Trigger -2.51 mimics, a mimic was either highly successful or a total failure. As for Hair Trigger -4.86, almost every mimic had more than 60% of its residues shift reactivity, for an overall success rate of 84%.
In conclusion, the procedurally generated thermodynamic mimics were successful overall, but were not a 100% simulation of +FMN conditions. However, enough positive evidence has been provided for us to continue with the project and refine the basic score terms of the fitness algorithm used to generate the mimic sequences. Below are screenshot comparisons of the +FMN data and the mimics for Hair Trigger -2.51 kcal/mol and Bistable 3 -2.51 kcal/mol.
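The grouping and success-rate figures discussed above can be reproduced with a few lines. This is a sketch under assumptions: the 20-point bin width and the function names are hypothetical, the 60% cutoff follows the observations above, and the score list is made up purely to show the calculation.

```python
from collections import Counter

def bin_scores(scores, width=20):
    """Count designs per score bin; keys are the lower edge of each bin."""
    counts = Counter()
    for s in scores:
        # A score of exactly 100 is placed in the top bin.
        lower = min(int(s // width) * width, 100 - width)
        counts[lower] += 1
    return dict(sorted(counts.items()))

def success_rate(scores, cutoff=60.0):
    """Fraction of designs with at least `cutoff` percent of residues shifting."""
    return sum(s >= cutoff for s in scores) / len(scores)

# Made-up switch scores, for illustration only.
example_scores = [10, 18, 54, 60, 75, 100]
print(bin_scores(example_scores))
print(success_rate(example_scores))

# The Hair Trigger -2.51 figure quoted above: 15 of 33 designs.
print(round(100 * 15 / 33))
```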
Hair Trigger
+FMN
Mimics
It is quite obvious from the data we gained that the ViennaRNA thermodynamic model and nearest neighbor parameters greatly overestimate the stability of those mimics that have no G-C pairs. Even the thermodynamic mimic with a lone G-C base pair did better than those mimics where the thermodynamic contribution was solely due to A-U and G-U pairs.
Bistable 3
+FMN
Mimics
Most of the Bistable 3 mimics failed to switch, but once again they were not a 100% emulation of +FMN conditions. For example, 5 of the 35 synthesized mimics had 40% or more of their residues shift reactivity. An interesting point of comparison here is the pair of mimics scoring 18% and 54%. The main thermodynamic contributions in the two sequences are identical, yet there is a gap in score that translates to several residues shifting in reactivity. In addition, all of the mimics shown here contain G-C pairs as part of the generated sequence, and many still failed.
Methods
All mimics for these two RNA sequences were procedurally generated using a Monte Carlo algorithm that randomly searches through the 4,194,304 (4^11) different sequences that can potentially form the 6x5 internal loop of the FMN aptamer. By combining the free energy deltas for the ON and OFF state structures of each generated sequence, an approximate free energy bonus is calculated. This is one of the features used to determine the "fitness" of each sequence, in addition to structure distance and whether or not the mimic misfolds with the RNA sequence. Mimics that scored higher than 0.9 were deemed good enough for use, and those were what players were advised to use.
A list of 80 pre-generated mimics was provided to players with each target. 40 of these mimics were generated using the ViennaRNA 1.8.5 parameters, and 40 were generated using the ViennaRNA 2.1.1 parameters. Mimics were mixed to gain unbiased information on both models.
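The search described above can be sketched as a simple random walk over the 11 loop nucleotides. This is an illustration under stated assumptions, not the generator that was actually used: the accept-if-not-worse rule, the function names, and the step count are hypothetical, and `toy_fitness` is a stand-in that has nothing to do with the real fitness terms (free energy bonus, structure distance, misfolding), which would require a folding package such as ViennaRNA.

```python
import random

BASES = "ACGU"
LOOP_LENGTH = 11  # 6 + 5 nucleotides, so 4 ** 11 == 4,194,304 candidates

def search_mimics(fitness, steps=2000, cutoff=0.9, seed=0):
    """Random-walk search over loop sequences.

    fitness : callable mapping an 11-nt string to a score in [0, 1]
              (placeholder for the project's real fitness function)
    Returns the set of visited sequences scoring above `cutoff`.
    """
    rng = random.Random(seed)
    current = "".join(rng.choice(BASES) for _ in range(LOOP_LENGTH))
    current_score = fitness(current)
    accepted = set()
    if current_score > cutoff:
        accepted.add(current)
    for _ in range(steps):
        # Propose a single-nucleotide change at a random position.
        pos = rng.randrange(LOOP_LENGTH)
        candidate = current[:pos] + rng.choice(BASES) + current[pos + 1:]
        score = fitness(candidate)
        if score >= current_score:  # assumed rule: never accept a worse score
            current, current_score = candidate, score
        if score > cutoff:
            accepted.add(candidate)
    return accepted

def toy_fitness(seq):
    """Stand-in score: fraction of G/C bases. NOT the real fitness."""
    return sum(b in "GC" for b in seq) / len(seq)

mimics = search_mimics(toy_fitness)
print(len(mimics))
```

Keeping the fitness as a callable means the same loop could be reused with scores built on either ViennaRNA 1.8.5 or 2.1.1 energies.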
Discussion
The chemical mapping data from the first round of the thermodynamic mimics provide clear evidence that +FMN conditions were simulated to a certain extent. However, various factors, including noncanonical loop interactions and the base pair content of a mimic, will have a significant effect on the stability of the 6x5 motif and on the frequency of the ON and OFF states in the ensemble. Custom scoring terms are being implemented in the Mimic Fitness Algorithm, along with the partition function of the RNA ensemble. With these additional scoring terms, we aim to better model thermodynamic mimics in round 2 of the project.
A small note: the two novel targets in the project, Hand and Finger Remade and Simple RNA Switch Remade, both came out with mostly terrible results. High reactivity errors in Hand and Finger Remade caused a lack of the reliable, single-nucleotide resolution data that is necessary to adequately score riboswitch sequences. As for Simple RNA Switch Remade, it is theorized that a suboptimal structure present in the ensemble caused issues with an uncomplicated switch score.
To do list
- Work on better novel targets for switch design.
- Update fitness algorithm with custom score terms, and partition function.
- Get Round 2 going asap.
- Determine if there are better ways to score switches. After this, post a short summary of the results for the H&F Remade and SRNAS Remade.
- Find out how difficult it would be to include a basic score function for the mimics used in the 6x5 internal loop.