Publication: Improving Data-Scarce Medical Diagnosis by Healthy Image Pre-Training
Abstract
Accurate medical image classification using computer vision remains a challenge in clinical radiology, particularly in low-data settings where labelled examples are scarce or expensive to obtain. This thesis evaluates 20 separate model configurations across 3 binary chest X-ray classification tasks to determine the impact of pre-training, base architecture, and fine-tuning strategy on diagnostic accuracy. The configurations vary in base model, either convolutional (ResNet-18, ResNet-50) or transformer (ViT, DINOv2); in whether the model is pre-trained on a large corpus of healthy chest X-ray images; and in fine-tuning method (transfer learning, full fine-tuning, Low-Rank Adaptation, Chain of Low-Rank Adaptation, and Weight-Decomposed Low-Rank Adaptation).
The findings show that domain-specific pre-training significantly boosts the downstream performance of CNNs and of models trained on small datasets (< 2,500 diseased images), by an average of 2.55% and 3.9% respectively. Convolutional architectures consistently outperformed the top transformer-based models, by an average of 9.5%. Almost all pre-trained ResNet models matched or exceeded benchmark standards for public datasets, achieving up to 96.1% accuracy on the largest dataset (8,716 labelled examples) and up to 79.4% average accuracy across the 3 tasks, which have a mean dataset size of 5,209 labelled images. This work therefore shows that both CNN-based models and pre-training on a large, related corpus of healthy images improve downstream classification accuracy in computer vision-based medical diagnostics. Both should be adopted into routine use in critical fields such as radiology, where high-quality data is scarce and accuracy is paramount.
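For readers unfamiliar with Low-Rank Adaptation, one of the fine-tuning methods named in the abstract, the sketch below illustrates the general idea: the base weight matrix W stays frozen and only two small matrices A and B are trained, so the layer computes W x + (alpha / rank) * B (A x). This is the standard textbook formulation, not the thesis's implementation. The layer size (768 x 768), rank (8), and all function names are assumptions chosen purely for illustration.

```python
# Minimal pure-Python sketch of Low-Rank Adaptation (LoRA).
# All dimensions, ranks, and names here are illustrative assumptions,
# not values taken from the thesis.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha, rank):
    """Frozen base output W x plus the scaled low-rank update B (A x)."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]

def full_finetune_params(d_out, d_in):
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_out * d_in

def lora_params(d_out, d_in, rank):
    """Trainable weights in A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

# Hypothetical 768 x 768 projection layer adapted with rank 8.
full = full_finetune_params(768, 768)
lora = lora_params(768, 768, 8)
print(full, lora, f"{lora / full:.2%}")  # 589824 12288 2.08%

# With B initialised to zeros, the adapted layer starts out identical
# to the frozen base layer.
W = [[1, 2, 3], [4, 5, 6]]
A = [[0.5, -0.5, 1.0]]   # rank 1, d_in 3
B = [[0.0], [0.0]]       # d_out 2, rank 1
x = [1, 1, 1]
print(lora_forward(W, A, B, x, alpha=2.0, rank=1))  # [6.0, 15.0]
```

Under these assumed sizes, the low-rank update trains about 2% of the weights that full fine-tuning of the same layer would, which is the kind of saving that makes such methods a natural candidate when labelled data is scarce.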