546_ieee pdf eXpress compitability
Shamim Ripon*
Computer Science and Engineering
East West University
Dhaka, Bangladesh
[email protected]
Abstract— Plants are crucial to society, the environment, and the economy but are susceptible to diseases that threaten agricultural productivity and food security. Early detection of plant diseases is essential to minimize crop losses. Traditional methods are often time-consuming and error-prone, particularly in the early stages. Recently, Artificial Intelligence (AI) techniques, particularly machine learning and transfer learning, have become pivotal for early leaf disease diagnosis. However, these techniques require further improvement. This paper proposes a Hybrid Transfer Learning (HTL) model that combines Transfer Learning (TL) for feature extraction with Machine Learning (ML) for classification to enhance accuracy and efficiency. Using 18,160 images from the PlantVillage dataset, covering nine tomato disease classes, various image enhancement techniques were applied to improve image quality and disease detection performance. The experimental results show that the proposed method achieves an accuracy of 99.56%, surpassing that of the current state-of-the-art methods. This approach offers a practical solution for farmers, enabling faster and more accurate disease identification and better crop management.

Keywords—plant disease, transfer learning, machine learning, agricultural efficiency, tomato leaf classification, leaf health assurance, farming yield, food security

I. INTRODUCTION

Agriculture is a vital part of the global economy, which contributes to rural development and food security. Tomatoes are of critical importance worldwide, with a market value of $230 billion and annual production of approximately 182 million metric tons [1]. They are rich in essential vitamins, are extensively utilized in sauces, salads, soups, and juices, and play a crucial role in global nutrition. Tomato cultivation faces challenges, particularly leaf diseases, which significantly affect crop yields [2], production costs, and fruit quality. These diseases result in substantial economic losses, especially in major producing countries, such as Bangladesh, China, the United States, and Spain. Farmers, who typically earn between $2,000 and $4,000 annually, often experience reduced earnings owing to disease management costs [3]. The prevalence of these diseases continually threatens production rates; hence, early detection is needed to contain their spread and protect crop health.

Conventional disease detection is carried out through manual diagnosis by experts and chemical testing. This is time-consuming, reliant on manual input, and prone to error [4]. ML models have improved the performance of early detection; however, limitations remain. Extensive feature engineering is required, and generalization across diverse datasets may be problematic [5]. Pre-trained deep models, such as EfficientNet and DenseNet, are used to extract high-quality features from images, taking advantage of their ability to capture complex patterns even from a limited amount of data. The ML models then use these features to improve classification performance. This decoupling lets HTL avoid the issues related to overfitting while also reducing the computational complexity of end-to-end deep learning models [6]. Furthermore, machine learning classifiers offer improved interpretability and generalization, especially in cases where deep learning models could struggle with imbalanced classes.

This study aims not only to improve the accuracy of disease detection in tomato leaves but also to reduce the computational demands and improve generalization on imbalanced datasets. We propose the DenseNet201 + SVM model, which combines the hierarchical feature reuse of DenseNet201 with the high-dimensional generalization capabilities of SVM. This model increases accuracy while optimizing computational efficiency. The design accommodates variability in environmental conditions and resource limitations, making it appropriate for agricultural applications. We used the PlantVillage dataset, containing ten tomato leaf classes. The methodology commences with image preprocessing techniques, including resizing, contrast adjustment, histogram equalization, median filtering, and data augmentation encompassing rotations, shifts, and flips to enhance image quality and diversity. Pre-trained models, namely EfficientNetB3, XceptionNet, InceptionResNetV2, and DenseNet201, were employed for feature extraction, and the extracted features were subsequently input into an SVM classifier for robust classification. The proposed hybrid models demonstrated superior classification accuracy and computational efficiency compared to the other models evaluated. This study makes several key contributions:
• Development of an HTL framework that enhances the precision and generalization of classification, even when dealing with imbalanced datasets.
• Implementation of optimized image preprocessing and augmentation techniques to improve robustness and feature extraction under real-world conditions.
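Three of the preprocessing steps named in the introduction (histogram equalization, median filtering, and flip augmentation) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper does not give kernel sizes, color handling, or library choices, so the 3x3 median window, the edge-replicated borders, the grayscale uint8 input, and the function names `equalize_hist`, `median_filter3`, and `augment_flips` are all hypothetical choices made for this example.

```python
import numpy as np


def equalize_hist(img: np.ndarray) -> np.ndarray:
    """Histogram-equalize a 2-D uint8 image (grayscale input is an assumption)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]      # CDF value at the darkest occupied level
    total = img.size
    if total == cdf_min:           # constant image: nothing to stretch
        return img.copy()
    # Classic mapping: rescale the CDF so the occupied levels span 0..255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]


def median_filter3(img: np.ndarray) -> np.ndarray:
    """Apply a 3x3 median filter; borders use edge replication (assumed)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    # Stack the nine shifted views of the image, then take the per-pixel median.
    windows = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)],
        axis=0,
    )
    return np.median(windows, axis=0).astype(img.dtype)


def augment_flips(img: np.ndarray) -> list:
    """Return the image together with its horizontal and vertical flips."""
    return [img, np.fliplr(img), np.flipud(img)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic low-contrast 64x64 patch standing in for a leaf image.
    leaf = rng.integers(100, 140, size=(64, 64), dtype=np.uint8)
    leaf[10, 10] = 255             # a single salt-noise pixel
    cleaned = median_filter3(leaf)         # suppress impulse noise first
    enhanced = equalize_hist(cleaned)      # then stretch the contrast
    batch = augment_flips(enhanced)        # three variants per input image
    print(enhanced.min(), enhanced.max(), len(batch))
```

Filtering before equalization is a design choice of this sketch: an isolated noisy pixel would otherwise occupy the top of the intensity range and reduce the contrast stretch applied to the rest of the image. Resizing, rotations, and shifts are omitted to keep the example short.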
Fig. 8. Confusion matrix of the highest-performing classifier

The learning curve in Fig. 9 illustrates the progress of the model over 20 epochs, with consistent accuracy improvements and a steady reduction in loss. Both the training and validation loss curves decreased uniformly, with the validation loss stabilizing, indicating effective learning and generalization. The accuracy curves for training and validation rose sharply, converging near maximum values by the end, both exceeding 99%, signifying strong model performance. The close convergence of these curves suggests good generalization without overfitting.

C. State-of-the-art Comparison

The proposed DenseNet201 + SVM model achieved an accuracy of 99.56%, outperforming MobileNetV2 (92.5%) and VGGNet (98%). Its hybrid architecture strikes a balance between computational efficiency and accuracy, setting it apart from existing approaches. Advanced preprocessing techniques bolster its robustness against noise and environmental variations, making it highly suitable for practical agricultural applications. Table IV compares its accuracy with that of existing methods. Our approach achieved an accuracy of 99.56% across 10 classes, using the DenseNet201 + SVM model and a dataset comprising 18,160 samples. Our model handles a large dataset well, comparable in scale to those in [7], [10], and [14]. It achieved the highest accuracy recorded in the comparison, despite the substantial volume of data involved.

The proposed model outperformed MobileNetV2 [10], with an accuracy of 92.5%; VGGNet [9], with an accuracy of 98% and limited to two classes; and the CNN-RNN model [8], which used only 2,000 samples. In addition, it surpassed DenseNet121 [11], which covered six disease classes with an accuracy of 98.62%. The enhanced performance of our model is attributed to the systematic combination of DenseNet201 and SVM. All these studies utilized the PlantVillage tomato disease dataset. Our approach not only surpasses these methods in terms of accuracy, but also demonstrates superior performance with a larger dataset, emphasizing its effectiveness in classifying tomato leaf diseases.

TABLE IV. COMPARISON WITH THE STATE-OF-THE-ART

Reference        Data instances  Model                Classes  Accuracy  Year
[7]              17,500          CNN                  10       91.2%     2020
[8]              2,000           CNN-RNN model        10       98.25%    2022
[14]             9,027           EfficientNetB7 + LR  4        98.7%     2022
[9]              2,475           VGGNet               2        98%       2023
[10]             17,500          MobileNetV2          10       92.5%     2024
[11]             6,000           DenseNet121          6        98.62%    2024
Proposed method  18,160          DenseNet201 + SVM    10       99.56%    2024

V. DISCUSSION

Our study shows that combining transfer learning (TL) with machine learning (ML) outperforms TL models alone. The SVM approach effectively addresses overfitting in high-dimensional spaces, thereby enhancing accuracy and generalization across various diseases. The DenseNet201 + SVM model distinguishes overlapping disease symptoms, which has significant implications for agriculture: it enables timely actions to reduce crop losses and costs, thereby enhancing food security and sustainability. The versatility of such hybrid models across different crops and diseases makes them valuable for smart farming and automated disease control. However, this study had several limitations. The PlantVillage dataset may not fully represent the diverse range of global tomato leaf diseases, considering environmental and genetic variations. The controlled conditions of the dataset might not accurately reflect real-life factors such as noise, lighting, and background differences, potentially impacting real-world effectiveness. Additionally, we did not assess the adaptability of the model to various weather conditions, soil types, or farming methods. Using DenseNet201 with an SVM could be resource-intensive, limiting its widespread adoption. Future work could involve extending hybrid transfer learning to more crops and diseases, exploring enhancements such as model quantization and low-rank matrix decomposition, and using edge processing or cloud computing for real-time disease diagnosis in limited-capacity environments.

VI. CONCLUSIONS

This work presents a hybrid transfer learning approach for enhancing the accuracy of tomato leaf disease detection. The DenseNet201 + Support Vector Machine model showed the best performance, with an accuracy of 99.56%, and is scalable for practical applications. It outperforms traditional transfer-learning models by handling the high-dimensional feature space efficiently and reducing overfitting on imbalanced datasets. The model generalizes very well across a wide range of disease categories with the help of advanced image preprocessing and data augmentation techniques, making it feasible for early agricultural disease diagnosis. Reliance on the PlantVillage dataset limits its applicability in diverse environments. Future research should use diverse datasets, optimize model efficiency, and extend its use to other crops and conditions. With further improvement, the HTL model represents another step forward in AI-driven agricultural solutions that support sustainable farming and global food security. Future efforts will focus on improving computational efficiency and extending the model to a greater number of crops and conditions.

REFERENCES

[1] C. C. Patel, N. M. Gohel, C. B. Dhobi, A. K. Maru, R. G. Parmar, and D. M. Korat, “Plant Health Management”.
[2] K. B. Vegesana, “A Framework and System for a Multi-Model Decision Aid for Sustainable Farming Practices,” Computational Modeling & Simulation Engineering Theses & Dissertations, Jan. 2015, doi: 10.25777/n7kx-f612.
[3] E. C. D. Todd, “Economic Loss from Foodborne Disease and Non-Illness Related Recalls Because of Mishandling by Food Processors,” J Food Prot, vol. 48, no. 7, pp. 621–633, Jul. 1985, doi: 10.4315/0362-028X-48.7.621.
[4] K. Weiss, T. M. Khoshgoftaar, and D. D. Wang, “A survey of transfer learning,” J Big Data, vol. 3, no. 1, pp. 1–40, Dec. 2016, doi: 10.1186/S40537-016-0043-6.
[5] V. Bolón-Canedo, N. Sánchez-Maroño, and A. Alonso-Betanzos, “Recent advances and emerging challenges of feature selection in the context of big data,” Knowl Based Syst, vol. 86, pp. 33–45, Sep. 2015, doi: 10.1016/J.KNOSYS.2015.05.014.
[6] M. R. Ahmed et al., “Towards Automated Detection of Tomato Leaf Diseases,” Proceedings - 6th International Conference on Electrical Engineering and Information and Communication Technology, ICEEICT 2024, pp. 387–392, 2024, doi: 10.1109/ICEEICT62016.2024.10534559.
[7] M. Agarwal, A. Singh, S. Arjaria, A. Sinha, and S. Gupta, “ToLeD: Tomato Leaf Disease Detection using Convolution Neural Network,” Procedia Comput Sci, vol. 167, pp. 293–301, 2020, doi: 10.1016/J.PROCS.2020.03.225.
[8] H. E. David, K. Ramalakshmi, R. Venkatesan, and G. Hemalatha, “Tomato leaf disease detection using hybrid CNN-RNN model,” Advances in Parallel Computing, vol. 38, pp. 593–597, 2021, doi: 10.3233/APC210108.
[9] P. Kumar Das, “Leaf Disease Classification in Bell Pepper Plant using VGGNet,” Journal of Innovative Image Processing, vol. 5, no. 1, pp. 36–46, Mar. 2023, doi: 10.36548/JIIP.2023.1.003.
[10] A. Abdullah, G. A. Amran, S. M. A. Tahmid, A. Alabrah, A. A. AL-Bakhrani, and A. Ali, “A Deep-Learning-Based Model for the Detection of Diseased Tomato Leaves,” Agronomy, vol. 14, no. 7, p. 1593, Jul. 2024, doi: 10.3390/AGRONOMY14071593.
[11] M. M. Billah, A. Sultana, R. Sad Aftab, M. M. Ahmed, and M. Shorif Uddin, “Leaf disease detection using convolutional neural networks: a proposed model using tomato plant leaves,” Neural Comput Appl, pp. 1–11, Aug. 2024, doi: 10.1007/S00521-024-10283-2.
[12] Z. Ullah, N. Alsubaie, M. Jamjoom, S. H. Alajmani, and F. Saleem, “EffiMob-Net: A Deep Learning-Based Hybrid Model for Detection and Identification of Tomato Diseases Using Leaf Images,” Agriculture (Switzerland), vol. 13, no. 3, p. 737, Mar. 2023, doi: 10.3390/AGRICULTURE13030737.
[13] B. Khan et al., “Bayesian optimized multimodal deep hybrid learning approach for tomato leaf disease classification,” Scientific Reports, vol. 14, no. 1, pp. 1–30, Sep. 2024, doi: 10.1038/s41598-024-72237-x.
[14] P. Kaur et al., “Recognition of Leaf Disease Using Hybrid Convolutional Neural Network by Applying Feature Reduction,” Sensors, vol. 22, no. 2, p. 575, Jan. 2022, doi: 10.3390/S22020575.
[15] “PlantVillage-Dataset/raw/color at master · spMohanty/PlantVillage-Dataset · GitHub.” Accessed: Sep. 01, 2024. [Online]. Available: https://github.com/spMohanty/PlantVillage-Dataset/tree/master/raw/color
[16] B. R. Jana, H. Thotakura, A. Baliyan, M. Sankararao, R. G. Deshmukh, and S. R. Karanam, “Pixel density based trimmed median filter for removal of noise from surface image,” Applied Nanoscience (Switzerland), vol. 13, no. 2, pp. 1017–1028, Feb. 2023, doi: 10.1007/S13204-021-01950-0.