Week 5 - One Step Backwards, Two Steps Forwards [Re-running STAR Alignment and HTSeq]

Arnab M -

Hi guys, it’s Arnab, and this is my fifth weekly update on my senior research project: Exploring the Genomic Effects of PNPLA7 Mutations on Cerebral Palsy through RNA Sequencing.

From the start of this project I expected setbacks, trials, and tribulations; however, no obstacle should ever separate you from your goals, especially when fighting for a cause as noble as helping children battling neurological movement disorders.

Previously, I anticipated running DESeq2 once our HTSeq-Counts outputs had finished processing, but when I opened the TSV file (a tab-separated text file containing the HTSeq-Counts output) the count columns contained nothing but zeroes. Baffled, I discussed with my advisor what the root of the problem might be. We traced back through our HTSeq-Counts code but found nothing wrong. It wasn’t until we went back to the STAR Alignment step that we discovered the problem: I had mistakenly included only half of our data files, because I had typed every input path by hand in one tediously long and inefficient line of code. With this discovery, I quickly wrote a for-loop in Shell Script so I wouldn’t have to list every single file path individually, which is what led to my error in the first place. The for-loop ran without issues, and it let us feed the new alignment outputs into a test run of HTSeq-Counts. When that test was successful, we let the HTSeq-Counts code run on every single data file (as shown in the attached image). To give you some insight into how long some of this code takes, our single HTSeq-Counts test run was timed at 1 hour and 2 minutes, meaning processing the entire dataset will take around 31 hours (about 30 runs of that length back to back).
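For anyone curious, a loop like the one described above can be sketched as follows. This is a minimal sketch, not the exact script from the lab: the folder names, the `_R1`/`_R2` file suffixes, the thread count, and the helper names `run` and `align_all` are all assumptions made for illustration. It assumes paired-end, gzipped FASTQ files and uses standard STAR options. By default it only prints the commands (a dry run) so they can be checked before anything is actually launched.

```shell
#!/bin/sh
# Sketch only: folder names, the _R1/_R2 suffixes, and the thread count
# are assumptions, not the values used in the actual project.
set -eu

# Print the command in dry-run mode (the default); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# align_all FASTQ_DIR GENOME_DIR OUT_DIR  (hypothetical helper)
# Runs STAR once per paired-end sample found in FASTQ_DIR.
align_all() {
    fastq_dir=$1
    genome_dir=$2
    out_dir=$3
    n_samples=0

    run mkdir -p "$out_dir"

    # The glob finds every forward-read file, so no path is typed by hand.
    for r1 in "$fastq_dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        sample=$(basename "$r1" _R1.fastq.gz)
        r2="$fastq_dir/${sample}_R2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "WARNING: no mate file for $sample, skipping" >&2
            continue
        fi
        run STAR --runThreadN 8 \
            --genomeDir "$genome_dir" \
            --readFilesIn "$r1" "$r2" \
            --readFilesCommand zcat \
            --outSAMtype BAM SortedByCoordinate \
            --outFileNamePrefix "$out_dir/${sample}_"
        n_samples=$((n_samples + 1))
    done

    # Compare this count with the number of samples you expect.
    echo "Samples processed: $n_samples"
}
```

After these definitions, a call such as `align_all fastq star_index aligned` (hypothetical folder names) prints one STAR command per sample, and setting `DRY_RUN=0` would run them for real. The final count is a quick guard against the kind of mistake described above, where only half the files made it into the run.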

As for some insight into my lab dynamics with the PhDs and postdocs: in a recent lab meeting, we discussed PacBio’s HiFi sequencing machine and the future of DNA and RNA sequencing. It was a very insightful and memorable experience that deepened my relationship with these incredible researchers. It also reminded me how big and rapidly evolving the field of bioinformatics is, and why I wanted to join it in the first place. See you next week!

Comments:

    camillebennett
    Hi Arnab, great problem solving! I'd love to hear more about the future of sequencing and its applications. It sounds like this field has a huge impact beyond your specific research.
