Rare Disease Datasets (continued)

Heet D -

For this blog post, I thought I would take a moment to explain how my project has changed, and what is next.

With this project I hope to contribute to research on zero-shot and few-shot learning for rare disease diagnosis. This is a growing area of research with a lot of potential as a diagnostic aid. The first step was establishing a state-of-the-art (SOTA) baseline for image classification on the NIH Chest X-ray dataset using the CLIP model. I have now done this for both the zero-shot and few-shot settings, and in both my results outperform the SOTA results cited in the literature. This gives me a solid baseline on common diseases against which I can compare my rare disease results.
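For readers new to the idea, here is a minimal sketch of how CLIP-style zero-shot classification works in general: an image embedding is compared against the embeddings of one text prompt per class, and the closest prompt wins. This is not the code used in this project. The prompt wording, the three-dimensional toy vectors (standing in for real encoder outputs), the function names, and the temperature value are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_predict(image_emb, text_embs, temperature=0.01):
    """Score one image embedding against one prompt embedding per class.

    Returns the best-matching prompt and a softmax distribution over prompts.
    The temperature is an assumed value, not one taken from the post.
    """
    labels = list(text_embs)
    logits = [cosine(image_emb, text_embs[name]) / temperature for name in labels]
    # Subtract the max logit before exponentiating for numerical stability.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = {name: e / total for name, e in zip(labels, exps)}
    best = max(probs, key=probs.get)
    return best, probs


# Hypothetical toy embeddings; a real pipeline would get these from
# the CLIP text and image encoders.
text_embs = {
    "a chest X-ray showing pneumonia": [0.9, 0.1, 0.0],
    "a chest X-ray showing cardiomegaly": [0.1, 0.9, 0.0],
    "a chest X-ray with no finding": [0.0, 0.1, 0.9],
}
image_emb = [0.8, 0.2, 0.1]

label, probs = zero_shot_predict(image_emb, text_embs)
print(label)
```

No class-specific training happens here, which is what makes the setting "zero-shot": adding a new disease only requires writing a new prompt.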

Earlier, in the blog post where I presented results, I also tested a fine-tuned case in which I trained CLIP on the entire dataset instead of using zero-shot or few-shot learning. Initially I kept this as an entirely separate test case, but as I read more of the literature on this topic, I realized that I could significantly improve my results by implementing what is called domain adaptation, which combines aspects of fine-tuning with zero-shot and few-shot learning. Domain adaptation essentially means preparing CLIP for zero-shot and few-shot classification without training the model specifically for it. This might sound unclear at the moment, but it will make a lot more sense in my next post, where I will present my results.

Overall, I am ready for the rare disease part of my project. As mentioned in my last blog post, I have been searching for a dataset. Over the past week I considered many options and decided to filter a larger dataset to obtain the rare disease images. I initially considered using a synthetic dataset, but since it would not have a ground truth for evaluation, it is not a good choice for my needs. I am currently looking at the CheXpert dataset from Stanford and will likely proceed with it. Choosing the dataset is one of the most important parts of my project, which is why I decided to take some extra time to make sure it is suitable. I spent this week modifying my code for domain adaptation and finalizing my dataset.
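As a rough illustration of what "filtering a larger dataset" can look like, the sketch below counts how often each label is positive in a label table and keeps only the images that carry at least one low-prevalence label. It is not the filtering code used in this project. The column names (`path`, `label_a`, ...), the toy rows, and the 20% prevalence cutoff are all hypothetical; a real label file would have its own schema and its own definition of "rare".

```python
import csv
import io

# Hypothetical toy label table: one row per image, one 0/1 column per label.
TOY_CSV = """path,label_a,label_b,label_c
img_001.jpg,1,0,0
img_002.jpg,1,0,0
img_003.jpg,1,1,0
img_004.jpg,1,0,0
img_005.jpg,0,0,1
img_006.jpg,1,0,0
"""


def find_rare_labels(rows, label_cols, max_prevalence):
    """Return labels that occur at least once but in at most
    `max_prevalence` (a fraction) of all rows."""
    n = len(rows)
    if n == 0:
        return []
    rare = []
    for col in label_cols:
        positives = sum(1 for row in rows if row[col] == "1")
        if 0 < positives / n <= max_prevalence:
            rare.append(col)
    return rare


def filter_rows(rows, rare_labels):
    """Keep only rows that are positive for at least one rare label."""
    return [row for row in rows if any(row[c] == "1" for c in rare_labels)]


rows = list(csv.DictReader(io.StringIO(TOY_CSV)))
label_cols = [c for c in rows[0] if c != "path"]

rare_labels = find_rare_labels(rows, label_cols, max_prevalence=0.2)
rare_rows = filter_rows(rows, rare_labels)

print(rare_labels)
print([row["path"] for row in rare_rows])
```

Because the kept images come from a labeled dataset, each one still has a ground-truth label to evaluate against, which is the property a synthetic dataset would lack.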

This coming week I will focus on pre-processing and training, and I should soon have results to share with you all, along with a fuller explanation of domain adaptation. Thank you!

Comments:

    elena_c
    Hi Heet, how did your understanding of domain adaptation evolve as you researched more, and how do you think it will impact the final performance of your model?
    heet_d
    Hey Elena, over the past week, as I modified my code for domain adaptation, I learned a lot about how data flows through the model's layers and is ultimately used to make predictions. Domain adaptation also lets CLIP work better with the specific style and properties of medical images, which other images may not share. It's like tailoring the model to the task I need, so I expect my results to benefit from it.
