Week 8 - A Supervised Approach To Training An Artificial Intelligence To Extract Relevant Genomic Data From Literature

Adam B -

Hello

This week, I began validating the AI by processing 20 research papers by hand. My manual results serve as a control against which to compare the AI’s output. While comparing the two, I identified several notable concerns.

Upregulation or Downregulation:

One of the most important aspects of the AI’s output is whether a given gene is upregulated or downregulated. The latest concern is that some genes are being misclassified, for two reasons. First, the criteria are unclear: the instructions do not say how to choose between a simple fold change (FC), which measures the multiplicative change in genetic material found in a sample, and LogFC, which is the logarithm of that value. Second, the AI defaults to downregulation when it faces an ambiguous gene case. Both issues appear to be repairable by improving the instructions on how threshold-based designations should be handled and on how ambiguous gene output should be reported.
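To make the threshold idea concrete, here is a minimal sketch of one way such a rule could be written down. It is not the instruction set given to the AI. The function name `classify_regulation`, the base-2 logarithm, and the cutoff of 1.0 (a twofold change) are all assumptions chosen for illustration. The point it shows is that an unclear value is labeled "ambiguous" instead of defaulting to downregulation.

```python
import math


def classify_regulation(value: float, scale: str = "fold",
                        log2_threshold: float = 1.0) -> str:
    """Return 'up', 'down', or 'ambiguous' for one reported measurement.

    Assumptions (not from the original post): logs are base 2, and a gene
    counts as regulated only if |log2 FC| >= log2_threshold.
    """
    if scale == "fold":
        # A plain fold change is treated as a ratio, so it must be positive.
        # Some papers report "negative fold changes"; those are not guessed at.
        if value <= 0:
            return "ambiguous"
        log2_fc = math.log2(value)
    elif scale == "log2":
        log2_fc = value
    else:
        # Unknown scale: refuse to guess a direction.
        return "ambiguous"

    if log2_fc >= log2_threshold:
        return "up"
    if log2_fc <= -log2_threshold:
        return "down"
    # Below the cutoff in either direction: report it, do not pick a side.
    return "ambiguous"


print(classify_regulation(4.0, "fold"))   # fold change of 4
print(classify_regulation(-2.0, "log2"))  # LogFC of -2
print(classify_regulation(1.2, "fold"))   # too small to call
```

Putting both scales on one log scale before applying the cutoff means the same threshold is used no matter which form a paper reports.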

Table Fables:

Another critical issue arises from the way the AI extracts information from tables embedded in the articles. Some data tables list experimental conditions and statistical measures in dense, varied formats, and the AI’s current instructions appear insufficient for recognizing the correct data boundaries. Some extractions contain incorrect gene-condition pairings, while others are too jumbled to interpret. The fix should be to give the model a systematic procedure for processing tables, so that it does not have to devise its own approach each time it encounters one.
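As an illustration of what a systematic procedure could look like, the sketch below turns a simple table into one record per gene-condition pair. It assumes the simplest layout only: a header row, the gene name in the first column, and one column per condition. The function name `table_to_records`, the gene names, and the condition names are all hypothetical. Real tables in papers are often messier, so this is a starting point and not a complete method.

```python
from typing import Dict, List, Tuple


def table_to_records(
    header: List[str], rows: List[List[str]]
) -> Tuple[List[Dict[str, str]], List[List[str]]]:
    """Flatten a table into explicit (gene, condition, value) records.

    Assumed layout (hypothetical): the first column holds the gene name and
    every other column is one experimental condition named in the header.
    Rows that do not match the header are set aside instead of being guessed.
    """
    conditions = [name.strip() for name in header[1:]]
    records: List[Dict[str, str]] = []
    rejected: List[List[str]] = []

    for row in rows:
        # A row with the wrong number of cells, or no gene name, cannot be
        # paired with conditions safely, so it is kept for manual review.
        if len(row) != len(header) or not row[0].strip():
            rejected.append(row)
            continue
        gene = row[0].strip()
        for condition, cell in zip(conditions, row[1:]):
            records.append(
                {"gene": gene, "condition": condition, "value": cell.strip()}
            )

    return records, rejected


# Made-up example table: GENE_A and GENE_B are placeholders, not real data.
header = ["Gene", "Heat", "Cold"]
rows = [["GENE_A", "2.1", "-0.4"], ["GENE_B", "0.3"]]
records, rejected = table_to_records(header, rows)
print(records)
print(rejected)
```

Because each value is stored together with its gene and its condition, a mismatched pairing shows up as a rejected row for review and does not silently end up in the output.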

Next week, the validation process will continue alongside improvements to the AI model.

Adam


Comments:


    rohan_va
    Hi Adam, I'm thrilled to see that you're validating the AI's accuracy by comparing its work to your own. It's quite similar to some of the work in my lab, where we validate machine learning algorithms by cross-checking them with our own results. One complication I've found when doing this sort of checking is that when I fix one aspect of the AI, another starts to fall apart. Do you anticipate this issue happening in your case?
    camille_bennett
    Sounds like you are getting some great experience in testing an AI model. Is there a specific threshold for accuracy that you hope to reach?
    adam_b
    Hi Rohan, that's a great question! Absolutely. In fact, I more or less count on it happening, which is what makes testing an AI such a lengthy process.
    adam_b
    Hi Ms. Bennett, that's a great question! Yes, I am hoping to reach or surpass the accuracy of a human reader attempting the same task of extracting genes from scientific papers.
