|
Abstract: Timely and accurate identification of plant diseases is essential for increasing plant productivity.
Widely used state-of-the-art CNN-based models still struggle with leaf
images against complex backgrounds because they lack a global receptive field and a self-attention
mechanism. This study proposes a lightweight transfer learning-based vision transformer
architecture (TVITA) for the automatic identification of plant leaf diseases without using any
convolution. Two popular PlantVillage datasets—the original dataset (OD) with 55,448 images
and the augmented dataset (AD) with 61,486 images—were used for model training (70%),
validation (20%), and testing (10%). The proposed TVITA model, leveraging transfer learning by
fine-tuning the pre-trained VITSO model on the augmented dataset, outperformed both the VITSO
and VITSA models, which were trained from scratch on the OD and AD, respectively. TVITA
achieved a recognition accuracy of 97.85%, precision of 96.58%, recall of 97.50%, F1 score of
96.95%, and AUC of 98.76% on the OD testing set, and 97.50%, 97.00%, 97.07%, 96.99%, and
98.74% on the AD testing set. The results indicate that the proposed TVITA model achieves
state-of-the-art accuracy, greater robustness, and lower computational cost compared with popular
state-of-the-art CNN-based architectures. This study highlights the efficacy of transfer learning in
enhancing ViT models' performance for plant disease identification, suggesting potential
applications in other domains requiring high classification accuracy.
|
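
As an illustration of the 70%/20%/10% training/validation/testing partition described in the abstract, the following minimal sketch splits a set of image indices in those proportions. The function name `split_indices`, the fixed random seed, and the use of a plain (non-stratified) shuffle are assumptions made for illustration; the abstract does not state how the partition was actually drawn.

```python
import random


def split_indices(n_items, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test lists.

    Hypothetical helper: a simple random split, not necessarily the
    procedure used in the study (which may, for example, be stratified).
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # reproducible shuffle

    n_train = int(n_items * fractions[0])
    n_val = int(n_items * fractions[1])

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test split
    return train, val, test


# Example: the augmented dataset (AD) size quoted in the abstract.
train, val, test = split_indices(61486)
print(len(train), len(val), len(test))
```

Assigning the remainder to the test split guarantees that every image lands in exactly one partition even when the fractions do not divide the dataset size evenly.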