Week 8: Data Selection and Writing
Rohan V -
Hi all, it’s Rohan!
With the functional part of my project (the new change point detection algorithm) finished, I’ve turned to finalizing my results and drafting my research paper and presentation. A large part of the remaining work lies in deciding how best to represent my data. Since the output of this research project is an algorithm, on its own it is just a thousand lines of code. The data of interest is generated when this change point detection (CPD) algorithm is applied to molecular datasets. As a quick refresher, each dataset that I run the algorithm on consists of thousands of (x, y) pairs, where x represents distance and y represents conductance. My algorithm searches this data for points of significant structural change – for instance, where the slope of the data changes sharply.
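To make the idea of a "slope change" concrete, here is a minimal sketch in Python. It is not the algorithm from this project. It is a simple stand-in that assumes a trace contains exactly one change point, tries every possible split, fits a straight line to each side, and keeps the split with the smallest total squared error. The function names (`fit_line_sse`, `find_slope_change`), the `min_points` parameter, and the synthetic distance/conductance trace are all made up for illustration.

```python
import numpy as np

def fit_line_sse(x, y):
    """Sum of squared residuals of the least-squares line through (x, y)."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    return float(np.sum(residuals ** 2))

def find_slope_change(x, y, min_points=5):
    """Return the index that best splits the trace into two straight segments.

    Assumes a single change point; min_points (hypothetical) keeps each
    segment long enough for a meaningful line fit.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_index, best_cost = None, np.inf
    for i in range(min_points, len(x) - min_points + 1):
        # Cost of explaining the left and right parts with separate lines.
        cost = fit_line_sse(x[:i], y[:i]) + fit_line_sse(x[i:], y[i:])
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index

# Synthetic trace (not real lab data): a gentle slope that steepens at x = 0.6.
rng = np.random.default_rng(0)
distance = np.linspace(0.0, 1.0, 200)
conductance = np.where(distance < 0.6,
                       -0.5 * distance,
                       -0.3 - 4.0 * (distance - 0.6))
conductance = conductance + rng.normal(scale=0.01, size=distance.size)

change_index = find_slope_change(distance, conductance)
print("Estimated change point at distance:", distance[change_index])
```

A brute-force search like this is only practical for one change point on a short trace. Real traces can contain several change points and much more noise, which is where a dedicated CPD algorithm comes in.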
In the lab, our main goal is to find molecules that can conduct electricity at very high levels but also remain stable for a long time. As a result, we test molecules with a wide range of characteristics: some are quite short – in fact, one of the molecules we’ve analyzed is only six atoms long, with some linker chains on the ends – and others go on for dozens of atoms. Naturally, each of these molecules exhibits a different conductance pattern. Because we have so many molecular datasets to choose from, my challenge lies in deciding which molecules to showcase and which to leave out. As of now, I’m leaning towards choosing a prototypical (representative) molecule from each category, in the hope that these representatives accurately capture the broader trends we see in our data from other molecules.
As always, let me know if you have any questions, comments, or suggestions for how I can better represent my results. See you next week!