Adam B's Senior Project Blog
Project Title: A Supervised Approach to Training an Artificial Intelligence to Extract Relevant Genomic Data from Literature
BASIS Advisor: Amy Anderson
Internship Location: U of A Phoenix Biomedical Campus
Onsite Mentor: Dr. Alexander Niculescu
Project Abstract
Psychiatry, the branch of medicine focused on diagnosing, treating, and preventing mental, emotional, and behavioral disorders, is unique in that it grapples with subjectivity in diagnosis, often relying on observational data, self-reported symptoms, and the individual deduction of a psychiatrist. One of psychiatry’s most pressing challenges is to find objective bases for diagnoses, similar to those in other fields of medicine. Currently, the psychiatric literature, while containing vast amounts of data, is difficult to analyze through traditional means. It is here that we hope to modernize the approach to analysis through the use of artificial intelligence (AI), a powerful tool for extremely complex data processing. This project, based in Dr. Niculescu’s lab at U of A’s Biomedical Sciences Partnership Building, attempts to develop an AI model capable of consolidating and analyzing vast amounts of psychiatric literature to identify correlations between biomarkers—quantifiable measurements of gene expressions, proteins, or other biological diagnostics—and psychological diseases. Our method involves the use of a specialized generative pre-trained transformer (GPT) model, due to the ease of both providing specific instructions for the model and fine-tuning it through corrections based on comparisons with manual analyses. The ultimate goal is to standardize the biological foundations of psychiatric conditions, enhancing diagnostic precision and facilitating further research into treatments.
Week The Last -A Supervised Approach to Training an Artificial Intelligence to Extract Relevant Genomic Data from Literature
Hello, This week was spent on further debugging. This included addressing errors related to indexing the correct formats of data and being sure to include additional metadata to clarify genetic entries as they are placed in the databases. Additionally, I spent time on my senior project, finalizing my presentation and working on a demo version...
Week 10 -A Supervised Approach to Training an Artificial Intelligence to Extract Relevant Genomic Data from Literature
Hello, This past week, I ran my AI model through the testing phase: processing a bulk set of ten research papers to gauge how reliably it could extract and summarize key experimental details. While it was not as effective as I had hoped, I have ideas on how to continue improving the model. This week,...
Week 9-A Supervised Approach To Training An Artificial Intelligence To Extract Relevant Genomic Data From Literature
Hello everyone, This week, I continued working on the API for my AI model and managed to implement a solution that resolved the problems described last week. I would like to spend this week’s blog discussing this process. Previously, the API’s workflow for a given paper began by calling the AI with a...
Week 8-A Supervised Approach To Training An Artificial Intelligence To Extract Relevant Genomic Data From Literature
Hello, This week, I began validating the AI by processing 20 research papers by hand. This serves as a control against which to compare the AI’s output. During this comparison, I identified several notable concerns. Upregulation or Downregulation: One of the most important aspects of the AI’s output is whether a given gene...
Week 6-A Supervised Approach to Training an Artificial Intelligence to Extract Relevant Genomic Data from Literature
Hello, This week’s focus remained on implementing my paper processor in API format. Since the AI was originally largely web-based, I am finding many discrepancies between its old version and its new API. Here, I will discuss those and how they have been addressed throughout the week. Information Restrictiveness: One...
Week 6-A Supervised Approach To Training An Artificial Intelligence To Extract Relevant Genomic Data From Literature
Hello, This past week, I continued working on the API for automating the AI extraction process. This process involved many bouts of testing, refining, and retesting. So, this week’s blog will highlight some of the more notable challenges faced in improving the API. “Not-Specified”: Since I am no longer using a conventional AI interface (the usual chatbot...
Week 5-A Supervised Approach To Training An Artificial Intelligence To Extract Relevant Genomic Data From Literature
Hello, This past week, I’ve been working on finalizing the API that will automate the analysis my AI model performs. This is essential for working through the sheer volume of psychiatric literature within a reasonable span of time. Major Developments: The API has improved dramatically from when I first began last week. It can now process...
Week 4-A Supervised Approach to Training an Artificial Intelligence to Extract Relevant Genomic Data from Literature
Hello everyone! After last week’s focus on refining instructions and leaning more on Grok 3 for its nuanced outputs, I’ve spent these past few days working through the remaining papers for the testing phase of my AI model. Our testing set consists of a variety of psychiatric papers, assessing everything from rat models to human...
Week 3-A Supervised Approach to Training an Artificial Intelligence to Extract Relevant Genomic Data from Literature
Hello everyone! This past week brought me deeper into both model comparisons and streamlining workflows for data extraction. This week, I used two different AI models: ChatGPT o3 and a second, newer, and less well-known AI, Grok 3. Grok is an interesting AI made by xAI because it was trained on typical public documents...
Week 2 Introduction—A Supervised Approach to Training an Artificial Intelligence to Extract Relevant Genomic Data from Literature
Hi everyone, it’s Adam! In the last post, I mentioned I would describe some of the challenges that appear when training the AI model: 1) Inconsistent Paper Structure: The first major problem I’ve found deals with the diverse formats of research papers. Some papers include lengthy tables that run across pages, others unconventional table structures,...
Week 1—A Supervised Approach to Training an Artificial Intelligence to Extract Relevant Genomic Data from Literature
Training the AI Model: How the Analysis Works Hi everyone, it’s Adam! In my previous discussion, I explained the use of AI in psychiatry and its potential to extract meaningful biological markers from research. Now, I’ll explain how the process of analysis works to train the model to accomplish just that. Data Input and Instruction...
Introduction—A Supervised Approach to Training an Artificial Intelligence to Extract Relevant Genomic Data from Literature
Psychiatry is one of the most fascinating yet frustrating fields of medicine. Unlike other specialties (take oncology, the study and treatment of cancer, which relies on measurable biological markers and lab tests), psychiatry still depends heavily on observed symptoms, patient reporting, and the subjective observations of a doctor. There are no simple blood tests or...