Swin Transformer for Skin Cancer Diagnosis
Kunwar Ranjeet, Subham Saha and M. Ratnakumari
Abstract
Skin is the largest organ in the human body and serves critical physiological and protective functions. Skin cancer is among the most prevalent and most rapidly increasing types of cancer worldwide, and timely, accurate diagnosis is essential for effective treatment. However, traditional diagnostic methods often rely heavily on expert interpretation, specialized equipment, and time-consuming procedures, all of which can delay early detection and treatment. To overcome these limitations, this study presents a deep learning-based skin lesion classification model built on the Swin Transformer architecture. This state-of-the-art vision model combines a hierarchical structure with a shifted-window self-attention mechanism to extract both local and global features from dermatoscopic images. The model is trained on the HAM10000 dataset, a comprehensive collection of labeled skin lesion images, ensuring diversity and robustness in learning, and its performance is assessed using standard classification metrics. The Swin Transformer-based model classifies skin lesions with high accuracy, precision, recall, and F1-score. These results indicate the model's potential to support dermatological diagnostics with minimal delay and reduced dependence on human interpretation. By integrating advanced visual feature extraction with robust classification capabilities, the proposed approach offers a promising route to earlier detection of skin cancer and improved diagnostic accessibility and accuracy across healthcare systems worldwide.
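The window mechanics behind the Swin Transformer can be illustrated with a small, self-contained sketch. This is not the paper's implementation: it only shows, on a toy token grid, how a feature map is split into non-overlapping windows (within which self-attention would be computed) and how the cyclic shift between successive layers lets information cross window boundaries.

```python
# Hedged sketch (assumed toy example, not the study's code): Swin-style
# window partitioning and cyclic shifting on a small 2-D token grid.
# Integers stand in for feature tokens; real Swin applies self-attention
# inside each window after partitioning.

def window_partition(grid, window):
    """Split an H x W grid (list of lists) into non-overlapping
    window x window blocks, scanning top-to-bottom, left-to-right."""
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0
    blocks = []
    for top in range(0, h, window):
        for left in range(0, w, window):
            blocks.append([row[left:left + window]
                           for row in grid[top:top + window]])
    return blocks

def cyclic_shift(grid, offset):
    """Roll the grid by `offset` rows and columns, as Swin does between
    layers so the next partition straddles the previous window borders."""
    h, w = len(grid), len(grid[0])
    rows = [grid[(i + offset) % h] for i in range(h)]
    return [[row[(j + offset) % w] for j in range(w)] for row in rows]

# Example: a 4x4 token grid partitioned into 2x2 windows.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
windows = window_partition(grid, 2)                     # 4 windows of 4 tokens
shifted_windows = window_partition(cyclic_shift(grid, 1), 2)
```

Partitioning the shifted grid groups tokens that sat in different windows before the shift, which is what allows local window attention to propagate information globally across layers.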
