COMPARISON OF DEEP LEARNING MODELS AND DROPOUT RATES ON COLORECTAL CANCER HISTOLOGY


Irmakçı İ., Kuyucuoğlu F.

EGE 13th INTERNATIONAL CONFERENCE ON APPLIED SCIENCES, İzmir, Türkiye, 13-15 June 2025, pp. 961-969 (Full-Text Paper)

  • Publication Type: Conference Paper / Full-Text Paper
  • City of Publication: İzmir
  • Country of Publication: Türkiye
  • Page Numbers: pp. 961-969
  • Affiliated with Manisa Celal Bayar University: No

Abstract

This study investigates the effects of dropout regularization on the performance of deep learning
models in the context of medical image classification using a constrained data setting. Specifically,
we examine three widely used convolutional neural network (CNN) architectures—MobileNetV3
Large, DenseNet121, and ConvNeXt Tiny—on a subset of the PathMNIST dataset consisting of
histological images. Recognizing that training deep learning models on limited datasets often leads
to overfitting and poor generalization, we apply dropout at three levels (0.0, 0.2, 0.4) to assess its
role in mitigating these effects. The study employs only 20% of the total available data for each of
the training, validation, and testing stages to emulate a real-world scenario where labeled data is
scarce. Each model is evaluated in terms of test accuracy, area under the receiver operating
characteristic curve (AUC), and macro F1-score. Results reveal that MobileNetV3 Large with no
dropout achieves the highest test accuracy (0.8684), while ConvNeXt Tiny with 0.2 dropout yields
the highest AUC (0.987), indicating its superior capability to generalize across class boundaries.
The macro F1-score shows competitive results across all models, with MobileNetV3 Large
reaching 0.819 in its best configuration. These findings emphasize the necessity of tailoring both
architecture and regularization techniques to the available dataset size and complexity of the
classification task. Visual plots of accuracy, AUC, F1-score, precision, recall, and learning curves further support our conclusions and can guide practitioners in selecting appropriate models in data-constrained environments.
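
The regularization mechanism the study varies can be sketched as "inverted" dropout: during training, each activation is zeroed with probability p and the survivors are rescaled by 1/(1 − p) so the expected activation is unchanged, making the layer an identity at inference time. The following is a minimal NumPy illustration of this idea at the three rates compared in the paper (0.0, 0.2, 0.4); it is a conceptual sketch, not the authors' implementation, and the array shapes are arbitrary.

```python
import numpy as np

def dropout(x, p, rng, train=True):
    """Inverted dropout: zero each activation with probability p and
    scale the survivors by 1/(1 - p), so E[output] == E[input].
    At inference (train=False) the layer is the identity."""
    if not train or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p  # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)

rng = np.random.default_rng(0)
acts = np.ones((4, 8))  # toy activation map standing in for a CNN feature layer
for p in (0.0, 0.2, 0.4):  # the three dropout rates compared in the study
    out = dropout(acts, p, rng)
    print(f"p={p}: mean activation after dropout = {out.mean():.3f}")
```

Because of the 1/(1 − p) rescaling, the post-dropout mean stays close to the input mean for every rate, which is why the rate can be swapped (0.0 vs. 0.2 vs. 0.4) without retuning the rest of the network.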