Week 10 - A Supervised Approach to Training an Artificial Intelligence to Extract Relevant Genomic Data from Literature

Adam B -

Hello,

This past week, I ran my AI model through its testing phase, processing a bulk set of ten research papers to gauge how reliably it could extract and summarize key experimental details. While it was not as effective as I had hoped, I have ideas for how to keep improving the model. In this post, I will catalog some of the errors I encountered and what I did to fix them.

The Errors

First, the AI repeatedly misidentified article titles. This might sound minor, but correct titles are critical for referencing and attributing results accurately. Second, it occasionally misread the condition of the experimental group. This happened mostly in studies with multiple experimental groups, where the groups received either different amounts of the same treatment or different treatments altogether, and the model confused which subset of data belonged to which treatment or control. Finally, there was a recurring issue with the direction of gene regulation. Getting the direction right (upregulated versus downregulated) is essential for identifying meaningful changes in biological markers.

The Solutions

In response, I updated the prompt for the model I call through the API with more precise instructions, emphasizing how to parse the structure of each paper and how to correctly interpret language around gene expression. I also made my instructions more explicit about group identities, which I hope will eliminate confusion over experimental design. Beyond these updates, I returned to verifying output quality by re-reading the papers the AI extracted from and cross-checking its results one by one.
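The one-by-one cross-checking described above could be made more systematic with a small script. What follows is only an illustrative sketch, not the code used in this project. It assumes each extracted finding is stored as a dictionary, and the field names (`title`, `experimental_group`, `regulation_direction`) are hypothetical, chosen to mirror the three error types listed earlier. The example records are made up.

```python
from collections import Counter

# Hypothetical field names mirroring the three error types in this post.
FIELDS = ("title", "experimental_group", "regulation_direction")


def normalize(value):
    """Lowercase and collapse whitespace so formatting differences are not counted as errors."""
    return " ".join(str(value).lower().split())


def cross_check(extracted, verified):
    """Compare AI-extracted records against hand-verified ones, field by field.

    Both arguments are lists of dicts in the same order (one dict per finding).
    Returns a Counter of mismatches per field and a list of individual mismatches.
    """
    if len(extracted) != len(verified):
        raise ValueError("record lists must be the same length")
    counts = Counter()
    mismatches = []
    for index, (ai, truth) in enumerate(zip(extracted, verified)):
        for field in FIELDS:
            if normalize(ai.get(field, "")) != normalize(truth.get(field, "")):
                counts[field] += 1
                mismatches.append((index, field, ai.get(field), truth.get(field)))
    return counts, mismatches


# Made-up records for illustration only.
verified = [
    {"title": "Example Study A", "experimental_group": "high dose", "regulation_direction": "up"},
    {"title": "Example Study B", "experimental_group": "control", "regulation_direction": "down"},
]
extracted = [
    {"title": "example study a", "experimental_group": "low dose", "regulation_direction": "up"},
    {"title": "Example Study B", "experimental_group": "control", "regulation_direction": "up"},
]

counts, mismatches = cross_check(extracted, verified)
for field in FIELDS:
    print(f"{field}: {counts[field]} mismatch(es)")
```

Tallying mismatches per field would show whether a prompt change reduced one kind of error (for example, regulation direction) while introducing another, which is hard to see when checking papers by eye.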

Overall, this experience has reinforced the importance of thorough testing.

Thank you for following along.
Adam

Comments:

    rohan_va
    Hi Adam, it's fascinating to see how your AI model has progressed from the start to finish of this project! What would you say has been the most crucial advancement in the model's functionality thus far?
    kira_a
    Hi Adam, I am fascinated by your project and the implications it can have for the field of psychiatry and its patients! I believe that the ability to recognize a problem or roadblock and address it, as you detailed in this blog, is crucial to good research. Have there been times during this project when "fixing" a problem was more difficult than in other instances, and what do you do in that situation?
    aashi_h
    Hi Adam, your project sounds amazing! What do you think has been the most exciting part of the project so far?
    adam_b
    Hi Rohan, that's a great question! The biggest advancement for this model has to be its ability to process papers in bulk with the API!
    adam_b
    Hi Kira, that's a great question! One of the most interesting challenges with computational projects is that often when you fix an error, a few more appear. Engineering my prompt for this AI showed this pattern more than any other part of the project. I think the best way to address that kind of problem is to be organized, cataloging each change made so that it's easier to diagnose where a new problem arises later on.
    adam_b
    Hi Aashi, that's a great question! The most exciting part of this project has been the novelty of working with an AI in psychiatry.
