Interciencia Journal

Paper Details


DOI LINK: https://doi.org/10.82044/BGd9P
Paper ID: BGd9P
Volume: 50
Issue: 3
Title: TVITA: A Lightweight Transfer Learning-Based Vision Transformer for Accurate Plant Disease Identification
Abstract: Timely and accurate identification of plant diseases is essential for increasing plant productivity. Widely used state-of-the-art CNN-based models still struggle with leaf images against complex backgrounds because they lack a global receptive field and a self-attention mechanism. This study proposes a lightweight transfer learning-based vision transformer architecture (TVITA) for the automatic identification of plant leaf diseases without using any convolution. Two popular PlantVillage datasets, the original dataset (OD) with 55,448 images and the augmented dataset (AD) with 61,486 images, were each split for model training (70%), validation (20%), and testing (10%). The proposed TVITA model, which applies transfer learning by fine-tuning the pre-trained VITSO model on the augmented dataset, outperformed both the VITSO and VITSA models, which were trained from scratch on the OD and AD, respectively. TVITA achieved a recognition accuracy of 97.85%, precision of 96.58%, recall of 97.50%, F1 score of 96.95%, and AUC of 98.76% on the OD testing set, and 97.50%, 97.00%, 97.07%, 96.99%, and 98.74%, respectively, on the AD testing set. The results indicate that the proposed TVITA model achieves state-of-the-art accuracy, greater robustness, and lower computational cost compared with popular state-of-the-art CNN-based architectures. This study highlights the efficacy of transfer learning in enhancing the performance of ViT models for plant disease identification, suggesting potential applications in other domains requiring high classification accuracy.
Keywords: plant disease identification; convolutional neural networks (CNN); vision transformer; transfer learning; plant leaf images; precision agriculture
Authors: Mingzhuo Hao, Shouke Wei, Yuwen Huang, Qinghe Zheng