2015.10.23 Dev Chat

On Jen Pearl's DPAT Pairing Probabilities Script, Switch Lab Scoring Artifacts, Newest Lab Guide & Feedback, Wuami's Evaluation of Eternabot vs. Players Lab Solutions, & Potential for Designing Shape Structure First Using Algorithmic Nucleotide Selection


Nando: so, what's on the agenda today Elves? [3:00 PM]

LFP6: I'm working on rewriting some parts of my lab data dump script. I've spent a few hours on transitioning my storage of the data from going directly into a string to being stored inside an object first. It's only a little beneficial, but I'm learning a lot from it and it's good to 'do it right'. [3:00 PM]

Jennifer Pearl: nice [3:00 PM]

Elves: that's an excellent question, I believe the floor is open for a dev chat, if there are any announcements or questions from dev side [3:01 PM]

Jennifer Pearl: I have noticed a correlation between what Johan described in the monthly meeting and the pairing probabilities [3:01 PM]

Nando: well, let's see if the 'star' has something up his sleeve [3:02 PM]

Elves: heheh [3:02 PM]

Nando: hi rhiju [3:02 PM]

rhiju: hi nando, hi everyone. i got nuthin [3:02 PM]

LFP6: Oh, hey Rhiju :) [3:02 PM]

Elves: lol hi Rhiju [3:02 PM]

Jennifer Pearl: there are 2 groups of high variations of probabilities in the same score ranges as the two large groups described by Johan [3:02 PM]

Jennifer Pearl: hi Rhiju [3:02 PM]

macclark52: Hi, you all.  [3:03 PM]

Elves: hi mac & welcome :) [3:03 PM]

rhiju: jen can you remind us what johan's two large groups were? [3:03 PM]

Jennifer Pearl: when looking at the number of designs that scored within specific ranges there were two groups with a large number of designs [3:05 PM]

Jennifer Pearl: at around 30 I want to say and maybe 60 and then as the scores go up the distribution gets very small [3:05 PM]

Jennifer Pearl: if i remember right Exclusion NG 3 or Same State NG 3 had nothing over about 70 or 80 [3:06 PM]

rhiju: i think that's a sort of artifact of the scoring -- there are two subscores that contribute 30 and 30. the hardest one (the switch subscore) contributes 40 [3:06 PM]

rhiju: so there's a pileup at 30 and 60. [3:06 PM]
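
A minimal sketch of why a 30/30/40 split would pile designs up at 30 and 60. It assumes, purely for illustration, that each subscore is awarded all-or-nothing; the real subscores may be graded. The names `subscore_a`, `subscore_b`, `possible_totals` and `totals_without_switch` are hypothetical and not part of any Eterna code.

```python
from itertools import product

# Weights as described in the chat: two 30-point subscores and a
# 40-point switch subscore. The key names are hypothetical labels.
WEIGHTS = {"subscore_a": 30, "subscore_b": 30, "switch": 40}


def possible_totals(weights):
    """Return every total reachable if each subscore is all-or-nothing."""
    totals = set()
    for passed in product([False, True], repeat=len(weights)):
        totals.add(sum(w for w, ok in zip(weights.values(), passed) if ok))
    return sorted(totals)


def totals_without_switch(weights):
    """Totals reachable when the (hardest) switch subscore is always missed."""
    rest = {name: w for name, w in weights.items() if name != "switch"}
    return possible_totals(rest)


print(possible_totals(WEIGHTS))        # all reachable totals
print(totals_without_switch(WEIGHTS))  # totals for designs that never switch
```

Under that assumption, a design that misses the switch subscore can only land on 0, 30 or 60, which matches the two piles described above.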

Jennifer Pearl: ok [3:06 PM]

rhiju: it would be really interesting if you saw features that predicted if a design ended up in those 30 or 60 piles [3:06 PM]

Jennifer Pearl: I will look at that [3:07 PM]

rhiju: you mentioned something above about pairing probabilities? [3:07 PM]

Jennifer Pearl: I think I might be able to  [3:07 PM]

Jennifer Pearl: the pairing probability ranges for the two groups of scores are very large [3:08 PM]

Jennifer Pearl: I don't have my personal computer with me right now though so can't post a picture [3:08 PM]

rhiju: @wuami do you want to describe the hypothesis that you and johan have been testing? and our new puzzling results [3:09 PM]

rhiju: let me go find @wuami [3:10 PM]

Nando: in the meantime, can I ask you guys a question: how are you feeling about the newest lab? problems? complaints? [3:11 PM]

Nando: ... not a sound, excellent! :D [3:12 PM]

Jennifer Pearl: i found a pic [3:13 PM]

Elves: that's a good question lol. Was just looking at the active lab. it does seem slower to load for me lately, but that could just be me [3:13 PM]

Elves: like sometimes i just get bubbles [3:13 PM]

jnicol: There's a great writeup here :) [3:10 PM]

jnicol: http://eternawiki.org/wiki/index.php5/User:ElNando888/Blog/Designing_for_the_A_B_lab [3:10 PM]

Jennifer Pearl: I started to work on it and was able to get a couple of constraints [3:13 PM]

Elves: and reloading the page brings the elements in [3:13 PM]

Jennifer Pearl: i haven't spent more than a few hours on it [3:13 PM]

Elves: thanks for the writeup :) :) :)  [3:14 PM]

Nando: thx, jandersonlee has done some very nice followup work [3:14 PM]

Elves: nice! :)  [3:15 PM]

Elves: time is still the main hurdle for me, to really sit down and solve. I think it reflects the progress we've made in increasing complexity. alas, time is still a hurdle for me. [3:17 PM]

rhiju: +1 on jandersonlee's followup work -- he wrote a script to simulate all player designs in nupack and tag which ones might work best and worst. [3:17 PM]

Elves: wow!! [3:17 PM]

Jennifer Pearl: nice [3:17 PM]

Elves: that is super cool [3:18 PM]

Jennifer Pearl: looking at nando's writeup, DPAT can automate picking out the pairing probabilities if it comes to that [3:19 PM]

wuami: hi all! [3:20 PM]

Elves: hi wuami! [3:20 PM]

LFP6: heya :D [3:20 PM]

Jennifer Pearl: hi wuami [3:20 PM]

Nando: hi michelle [3:21 PM]

LFP6: Btw Jen, as a forewarning I'm changing around the 'column' orders in the output of my script [3:21 PM]

Jennifer Pearl: ok. let me know when you change it so I can modify DPAT [3:22 PM]

LFP6: Design is the primary key, so I'm putting the design ID and name before lab data as opposed to how I did before [3:22 PM]
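
A rough sketch of the two ideas LFP6 describes: holding each design in an object before serializing it, and putting the design ID and name ahead of the lab data columns. This is not LFP6's script. The class `DesignRecord`, the function `dump`, the tab-separated output and the column names are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class DesignRecord:
    """One design, keyed by its ID (hypothetical field names)."""
    design_id: int
    name: str
    lab_data: dict = field(default_factory=dict)

    def to_row(self, lab_columns):
        # Key columns first, then lab data in a fixed, caller-chosen order.
        return [self.design_id, self.name] + [
            self.lab_data.get(col, "") for col in lab_columns
        ]


def dump(records, lab_columns):
    """Serialize records to tab-separated text, header row included."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["design_id", "name"] + list(lab_columns))
    for rec in records:
        writer.writerow(rec.to_row(lab_columns))
    return buf.getvalue()


print(dump([DesignRecord(1, "demo design", {"score": 90})], ["score", "round"]))
```

Keeping the records as objects until the last step means a column reorder only touches `to_row` and the header line, which is the kind of change a downstream consumer such as DPAT would need to be told about.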

Jennifer Pearl: ok that works [3:22 PM]

LFP6: Yep, I'll send you a PM when I upload the newer version (due to that change, I'll probably upload a new script as opposed to editing the current one) [3:22 PM]

Jennifer Pearl: I found the picture of the groups I was talking about earlier [3:24 PM]

Jennifer Pearl: https://dl.dropboxusercontent.com/u/87351147/groups.JPG [3:24 PM]

Elves: ohhh  [3:25 PM]

rhiju: @wuami do you want to describe the hypothesis that you and johan have been testing? and our new puzzling results  [3:25 PM]

wuami: ah sure [3:26 PM]

LFP6: John, if you're around and could hop on IRC for a sec, I'd appreciate it [3:26 PM]

wuami: we were trying to test if eternabot does well on the same puzzles that you all did well on in the labs, as measured by how long it takes for eternabot to get to a solution [3:27 PM]

rhiju: we had a specific hypothesis, based on interaction with players: [3:28 PM]

rhiju: some puzzles with prescribed structures of state 1 and state 2 would be easier to solve than others [3:28 PM]

wuami: instead we found that exclusion 3 was super hard for eternabot, but you all did super well on that one [3:28 PM]

Elves: interesting results! [3:29 PM]

rhiju: its pretty surprising (perhaps even scary) [3:29 PM]

Jennifer Pearl: interesting [3:29 PM]

rhiju: we were hoping that players could help us define structures for state 1 and state 2 that would help guarantee success in synthesized designs [3:29 PM]

rhiju: in fact, we thought that maybe players could define the structures and we (well, michelle & eternabot) could provide automated ways to fill in the nucleotides [3:30 PM]

Jennifer Pearl: @Rhiju is that for a specific design based on previous runs or from the start [3:30 PM]

Elves: ahh so purely the shape, not the sequence? [3:30 PM]

rhiju: its for the FMN exclusion puzzles -- I think there were 5 of those, plus one called 'Brourds mod of exclusion 4' [3:30 PM]

Jennifer Pearl: if it is for previous runs, DPAT can do that right now [3:31 PM]

Jennifer Pearl: mostly for just the specific elements [3:31 PM]

rhiju: @elves, yes we want rules that will let us look at the shapes of the two states and decide (1) will solutions be more likely to 'work' and (2) will eternabot be able to find solutions rapidly.  [3:31 PM]

Omei: FWIW, I did a cluster analysis on the last round of Exclusion 3, and there were only two substantially different designs that had at least one submission with a score of 90 or more; all the rest were minor variations on these two. [3:31 PM]

wuami: exclusion1-6 + brourd's mod so 7 total [3:31 PM]

Elves: @Omei thank you I was wondering how many were mods, that is very helpful [3:32 PM]

rhiju: @omei OK that's illuminating. and possibly an explanation. did those two designs 'take over' the rounds only in the last round, or earlier? [3:32 PM]

Omei: I haven't looked. [3:33 PM]

wuami: @omei, have you done a similar analysis for other exclusion puzzles? [3:33 PM]

rhiju: yea, it's difficult now to visualize what is happening from round to round. [3:33 PM]

Omei: @wuami, no, I just did that one puzzle and one round. [3:34 PM]

Elves: @rhiju it's interesting, since I think about the sequence and shape as the same thing, since the sequence determines the shape.  [3:35 PM]

Omei: It would be straightforward, though, to combine all the Exclusion 3 puzzles into one spreadsheet and then do a cluster analysis on the group [3:35 PM]
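
For readers unfamiliar with the idea, here is one simple way such a grouping could be sketched: designs whose sequences differ in only a few positions are linked into the same cluster (single-linkage on Hamming distance). This is an illustrative stand-in, not Omei's actual method, and the function names, the threshold and the example sequences are all made up.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def cluster_sequences(seqs, max_dist):
    """Group sequences so that any two within max_dist share a cluster."""
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if hamming(seqs[i], seqs[j]) <= max_dist:
                parent[find(i)] = find(j)

    groups = {}
    for i, seq in enumerate(seqs):
        groups.setdefault(find(i), []).append(seq)
    # Largest clusters first.
    return sorted(groups.values(), key=len, reverse=True)


# Made-up sequences: three near-identical "mods" and one unrelated design.
example = ["GGGAAACCC", "GGGAAACCU", "GGGAAAUCU", "AUAUAUAUA"]
for group in cluster_sequences(example, max_dist=1):
    print(len(group), group)
```

With all rounds of a puzzle combined into one list, the cluster sizes give a quick read on how many submissions are minor variations of the same design.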

Jennifer Pearl: LFP6's script will do that now [3:35 PM]

Omei: If someone wants to produce that spreadsheet, I'll run the analysis [3:35 PM]

Jennifer Pearl: put it all in one spreadsheet [3:35 PM]

Jennifer Pearl: give me a couple minutes and i will have it for you [3:36 PM]

Elves: so it's an interesting approach to think of the shape first, since using algorithmic methods to determine best functioning sequence would offload a huge amount of time from players [3:36 PM]

Omei: Super!   [3:36 PM]

Jennifer Pearl: generating it now [3:37 PM]

Elves: @wuami & rhiju when you say that exclusion 3 was hard for eternabot and players did well, how is that measured? [3:37 PM]

Elves: @Omei & wuami it would be cool to see that analysis on eternabot's solution set too [3:38 PM]

wuami: takes eternabot a long time to get a solution [3:38 PM]

Jennifer Pearl: I will work on tweaking DPAT to give actual structures instead of just stacks [3:39 PM]

Elves: to see how much variation there is between its suggested sequences [3:39 PM]

wuami: and for players, the eterna scores were fairly high for exclusion 3 [3:39 PM]

wuami: compared to the others [3:39 PM]

Omei: I can say that all Eternabot's solutions fell in the "miscellaneous" cluster.  But I'm not sure that means anything. [3:39 PM]

Elves: if we can normalize to exclude mods from both sets, and just compare the scores of the relatively unique sequences by players with same from eternabot, then what does it say? [3:40 PM]

Omei: ... other than that Eternabot wasn't modding player designs :-) [3:40 PM]

Elves: ahh thanks Omei [3:40 PM]

Elves: yes hehehe [3:40 PM]

rhiju: @omei, do you have a table or image of those clusters somewhere? [3:40 PM]

Omei: I have a fusion table with cluster assignments. [3:40 PM]

Omei: I also have a dendrogram of the clustering, but you can't read any design names from it. [3:41 PM]

rhiju: can you post in https://getsatisfaction.com/eternagame/topics/lab-submission-evolutionary-tree ? Maybe @rbierman and @wuami can follow up a bit. [3:41 PM]

Omei: Sure [3:41 PM]

rhiju: @omei, OMG you have to post that dendrogram [3:41 PM]

Elves: @Jen it would also be cool to run same through DPAT and compare eternabot stack prevalence in highest scoring designs to players [3:41 PM]

Elves: that might help narrow down mod overlap [3:42 PM]

Jennifer Pearl: @Omei https://dl.dropboxusercontent.com/u/87351147/Exclusion%203_all%20rounds_10-23-15.txt [3:43 PM]

Jennifer Pearl: @Elves i may have missed something but do you want me to run Exclusion 3 through DPAT or something else [3:45 PM]

Elves: exclusion 3 eternabot solutions through DPAT, to see if there are any stacks that score high, as compared to player submissions for exclusion 3 through DPAT, same [3:46 PM]

Jennifer Pearl: I could run Exclusion 3 through DPAT and see what stacks appear in low scoring designs and high scoring designs to see what stacks are in common [3:46 PM]

Omei: Got it, Jennifer.  Looks like I can work with that. [3:46 PM]

Jennifer Pearl: the macro language was written to do that [3:46 PM]

Elves: that way if multiple designs use same stack clusters, modding overlap will be muted and can just see which stacks signify high performance, not necessarily total # of designs that score high [3:47 PM]
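
The comparison being discussed can be sketched roughly as follows. This is not DPAT or its macro language. The function `split_stacks_by_score`, the stack labels, the scores and the cutoff are hypothetical. Using sets means a stack shared by many mods of one design is counted once, which is one way to mute modding overlap.

```python
def split_stacks_by_score(designs, high_cutoff):
    """Partition stacks by whether they occur in high or low scoring designs.

    designs: iterable of (stacks, score) pairs, where stacks is a set of
    hashable stack labels. The label format here is made up.
    """
    high, low = set(), set()
    for stacks, score in designs:
        if score >= high_cutoff:
            high.update(stacks)
        else:
            low.update(stacks)
    return {
        "high_only": high - low,  # seen only in high scoring designs
        "low_only": low - high,   # seen only in low scoring designs
        "common": high & low,     # seen in both, so not discriminating
    }


# Made-up example data.
example = [
    ({"GC/CG", "AU/UA"}, 92),
    ({"GC/CG", "GU/UG"}, 35),
    ({"AU/UA"}, 88),
]
print(split_stacks_by_score(example, high_cutoff=80))
```

Running the same function over player submissions and over Eternabot solutions and then intersecting the `high_only` sets would show any crossover of stacks between the two.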

Elves: yep exactly [3:47 PM]

Elves: and then we can also see if crossover of stacks between eternabot & player designs ( sounds like from Omei's analysis more variation in eternabot, even if lower scoring ) [3:48 PM]

Jennifer Pearl: It can pick out stacks that are in specific score ranges and can check for density of designs with values for that stack, and if outliers are allowed, and the range and percentage of the entire design that the outliers make up [3:48 PM]

Elves: it would also resolve a question i have about the evolutionary tree in general, which is that some spontaneous mutations produce seemingly linear or related mods, but they may not actually be aware of or... [3:50 PM]

...linked to each other. so this would help identify pure crossover regardless of connection in design

Elves: when you design a piece of RNA, the patterns can be reused in other designs, like in loops or stacks or different parts of the structure [3:52 PM]

Elves: sometimes this is done on purpose like when someone makes a modification of another design by copying the whole thing and changing part of it [3:53 PM]

Elves: sometimes it happens by accident [3:53 PM]

Elves: either way, it helps to know how to analyze the results saying 100 designs "scored well" if we know how much overlap there is in the patterns of those designs [3:53 PM]

Elves: so like if players design 100 sequences that score well, but only 2 of the designs are unique, then it helps us know how to compare that to say eternabot designing 100 designs, and getting a certain range of... [3:55 PM]

...scores, and level of uniqueness

Elves: Jen writes a program that helps analyze what patterns score well ( stacks = branches or legs of RNA sequences ) [3:56 PM]

Elves: Omei analyzed uniqueness of patterns related to their scores [3:57 PM]

Jennifer Pearl: back [3:58 PM]

Elves: Wuami is looking at the eternabot scores and figuring out how it is doing solving labs compared to players [3:58 PM]

Elves: welcome back Jen :) [3:58 PM]

Jennifer Pearl: thanks [3:58 PM]

Elves: I am curious about this approach where we could focus on designing shapes that get populated with algorithmically chosen sequences [4:00 PM]

rhiju: hey everyone thanks for coming and participating -- see you in a couple weeks. [4:00 PM]

Omei: Hm.  There is one issue with the list you gave me, Jennifer.  It doesn't have cluster counts, which means I can't filter out designs with very low cluster sizes, which I did in my previous analysis.  I'll go ahead and run the analysis, but it will have more "noise" than I would want. [4:01 PM]

Jennifer Pearl: Bye Rhiju  [4:01 PM]

Elves: bye Rhiju! thanks! [4:01 PM]

Jennifer Pearl: @LFP6 can you add cluster size to the script [4:02 PM]

AndrewKae: It was fun listening, thanks all who contributed [4:01 PM]

Elves: thanks Andrew :) [4:02 PM]

Omei: I think he got the data from the Eterna servers, and that isn't being recorded there. [4:02 PM]

Jennifer Pearl: oh yeah. John is working on adding things to the API I think, but not sure if that will be added [4:03 PM]

jnicol: number of clusters is in the database and available [4:04 PM]

Jennifer Pearl: @john so LFP6 can tweak the script to get it? [4:07 PM]

jnicol: I am talking with him about that, but yes, soon [4:05 PM]

Jennifer Pearl: nice [4:08 PM]