*WINNER* Estimating Uncertainty in Deep Image Classification
The application of deep learning to medical diagnosis has been an active area of research in recent years. Convolutional neural networks (CNNs) have made the computational processing of medical images practical. CNNs automatically extract important features from images; these features are then processed by an artificial neural network to produce a final classification or regression result. Despite their high performance, CNNs are “black boxes,” so deploying them in a clinical setting requires a measure of the uncertainty of their predictions. A theoretically grounded approach to determining uncertainty is to keep dropout layers active during both training and testing and to compute the standard deviation of a sample of predictions. A naïve approach to uncertainty quantification is to use the absolute difference between the predicted probabilities of the two classes from a standard CNN. Our first model is a VGG-16 CNN fine-tuned from ImageNet weights. Our second model is a VGG-16 model augmented to measure Bayesian uncertainty. We evaluate our models on the task of determining the presence or absence of tumor-infiltrating lymphocytes (TILs), immune cells that have infiltrated a cancerous tumor, in a dataset of 85,000 images. We evaluate the effectiveness of the uncertainty measures with two experiments. The first experiment measures classification accuracy after removing the most uncertain images from evaluation. The second experiment replaces the predicted labels of the most uncertain images with the correct labels to simulate a clinical setting in which a pathologist reviews them. Our analysis shows that the largest increase in accuracy occurs when uncertainty is measured from the same model that performs the predictions.
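The two uncertainty measures and the two experiments described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, `toy_predict_stochastic` is a stand-in for a VGG-16 with dropout left active at test time, and converting the naïve probability difference into an uncertainty score as one minus the absolute difference is our assumption.

```python
import numpy as np

def mc_dropout_uncertainty(predict_stochastic, x, n_samples=50):
    """Run repeated stochastic forward passes; return their mean and std."""
    samples = np.stack([predict_stochastic(x) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.std(axis=0)

def naive_uncertainty(p_positive):
    """Assumed naive score: 1 - |p(positive) - p(negative)| for binary output."""
    p_positive = np.asarray(p_positive, dtype=float)
    return 1.0 - np.abs(p_positive - (1.0 - p_positive))

def accuracy_after_removal(y_true, y_pred, uncertainty, fraction):
    """Experiment 1: drop the most uncertain fraction, score the remainder."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n_drop = int(round(fraction * len(y_true)))
    keep = np.argsort(uncertainty, kind="stable")[: len(y_true) - n_drop]
    return float(np.mean(y_true[keep] == y_pred[keep]))

def accuracy_after_relabel(y_true, y_pred, uncertainty, fraction):
    """Experiment 2: give the most uncertain fraction their correct labels."""
    y_true = np.asarray(y_true)
    y_fixed = np.array(y_pred, copy=True)
    n_fix = int(round(fraction * len(y_true)))
    if n_fix > 0:
        fix = np.argsort(uncertainty, kind="stable")[-n_fix:]
        y_fixed[fix] = y_true[fix]
    return float(np.mean(y_fixed == y_true))

# Hypothetical stand-in for a CNN whose dropout stays active at test time:
# a logistic model with inverted dropout applied to its inputs.
rng = np.random.default_rng(0)
TOY_WEIGHTS = np.array([1.5, -2.0, 0.5])

def toy_predict_stochastic(x, rate=0.5):
    mask = rng.random(x.shape) >= rate
    logits = (x * mask / (1.0 - rate)) @ TOY_WEIGHTS
    return 1.0 / (1.0 + np.exp(-logits))

if __name__ == "__main__":
    x = rng.normal(size=(200, 3))
    # Synthetic ground truth from the noise-free model (illustration only).
    y_true = (x @ TOY_WEIGHTS > 0).astype(int)
    p_mean, p_std = mc_dropout_uncertainty(toy_predict_stochastic, x)
    y_pred = (p_mean > 0.5).astype(int)
    for frac in (0.0, 0.1, 0.2):
        print(frac,
              accuracy_after_removal(y_true, y_pred, p_std, frac),
              accuracy_after_relabel(y_true, y_pred, p_std, frac))
```

With a real network, `predict_stochastic` would be any callable that returns positive-class probabilities with dropout enabled. The `fraction` argument is assumed to be strictly less than 1, so that at least one image remains to be scored in the removal experiment.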