Identification of Selected Ethiopian Traditional Medicinal Plants Using Digital Image Processing and Deep Learning Techniques

Habtamu Alebel; Million Meshesha; Wabi Jifara; Girma Asefa

Insights of Herbal Medicine(IHM)

ISSN: 2834-7749 | DOI: 10.33140/IHM

Research Article - (2026) Volume 5, Issue 1

Habtamu Alebel¹*, Million Meshesha², Wabi Jifara³ and Girma Asefa⁴

¹School of Rural Development and Agricultural Innovation, College of Agriculture and Environmental Sciences, Haramaya University, Ethiopia
²School of Information Sciences, Addis Ababa University, Ethiopia
³School of Information Science, College of Computing and Informatics, Haramaya University, Ethiopia
⁴School of Natural Resource Management and Environmental Sciences, College of Agriculture and Environmental Sciences, Haramaya University, Ethiopia

*Corresponding Author: Habtamu Alebel, School of Rural Development and Agricultural Innovation, Ethiopia

Received Date: Feb 13, 2026 / Accepted Date: Mar 23, 2026 / Published Date: May 18, 2026

Copyright: ©2026 Habtamu Alebel, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Citation: Alebel, H., Meshesha, M., Jifara, W., Asefa, G. (2026). Identification of Selected Ethiopian Traditional Medicinal Plants Using Digital Image Processing and Deep Learning Techniques. Insights Herbal Med, 5(1), 01-07.

Abstract

Ethiopia is rich in biodiversity, and its vast landscape offers a variety of medicinal plants with significant cultural and therapeutic importance. This study was conducted to identify medicinal plants using image processing and deep learning techniques. It explores both pre-trained models (VGG-16 and ResNet-50) and a sequential CNN built from scratch. Experimental results show that the Medicinal Plant Identification (MPI) model constructed from scratch performs better than the pre-trained VGG-16 and ResNet-50 models, with a test accuracy of 99.90% and an F-score of 99.00%. The proposed model achieves high accuracy in image classification by using multiple convolutional layers that capture fine-grained patterns in images.

Keywords

Deep Learning, Digital Image Preprocessing, MPI, CNN, ResNet-50, VGG-16

Introduction

Ethiopia is rich in biodiversity, and its vast landscape offers a variety of medicinal plants with significant cultural and therapeutic importance. These plants are used in traditional medicine, and knowledge of many of them has been passed down through generations as part of the Ethiopian people’s cultural heritage [1]. Up to 80% of Ethiopians rely on traditional remedies as their main source of healthcare because these remedies are culturally embedded, easily available, and reasonably priced. Furthermore, those seeking curative procedures continue to rely on indigenous medicine as their main source of healthcare, while Western medicine has shifted its focus to preventive measures [2].

Traditional methods of medicinal plant identification often require expert knowledge, fieldwork, or the use of reference books, which makes them time-consuming and prone to human error [3]. To identify a medicinal plant, a skilled expert examines all of the plant’s characteristics, including leaves, flowers, seeds, roots, and stems. Even so, experts find it challenging to distinguish between plant species with similar leaf structures [4]. The study was planned to collect a broader range of plant images from different regions of Ethiopia to capture variations in plant appearance due to environmental conditions, but this may not have been fully achieved because of time and budget constraints.

Several researchers have used deep learning to distinguish medicinal plants from other plant species. Deep learning is inspired by the structure and functioning of the human brain, with artificial neural networks designed to recognize patterns, make decisions, and learn from large amounts of data. Deep learning models are well suited to tasks like image recognition, speech recognition, and natural language processing because they can automatically extract features from the data [5]. The need to identify medicinal plants among thousands of plant species is increasing alongside advancements in computer vision and machine learning [6]. The purpose of this work was to use deep learning and digital image processing to investigate and build a convolutional deep learning model that can recognize a subset of Ethiopian medicinal plants.

This study introduces a novel preprocessing approach based on Gaussian filtering, which enhances image segmentation by effectively reducing image noise while preserving critical image features. Building on this, we propose MPI, a custom deep learning model designed from scratch to address domain-specific challenges. The MPI model outperforms state-of-the-art pre-trained architectures such as ResNet-50, VGG-16, and MobileNet in image segmentation accuracy and generalization ability. Comprehensive benchmarking demonstrates significant improvements in both performance and computational efficiency, filling a critical gap in previous works that struggled to adapt to noisy or complex datasets.

Methodology

Data Preparation

The collected leaf images were pre-processed before being used to train and test the model. The images first passed through noise removal and enhancement to increase image quality. Following noise removal, normalization, and resizing, the images were segmented using three segmentation techniques (edge-based, K-means, and threshold) to identify the region of interest. Feature extraction was then performed using shape-, color-, and texture-based features, followed by feature selection, to identify and classify medicinal plants and develop a model. Once the image dataset was prepared, a percentage split was applied: 80% of the total dataset was used for training, and the remaining 20% was divided equally, with 10% for testing and 10% for validation.
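As a rough illustration of the percentage split described above, the following Python sketch shuffles a list of sample indices and divides it 80/10/10. The function name `split_indices`, the fixed seed, and the use of plain index lists are assumptions made for illustration; the article does not state how the split was implemented.

```python
import random


def split_indices(n_samples, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle sample indices and split them into train/validation/test lists.

    Whatever is left after the training and validation shares goes to the
    test set, so the three parts cover every sample exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for a reproducible split

    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 800 100 100
```

A stratified variant, applied separately to each species, would keep class proportions equal across the three subsets; whether the authors did this is not stated.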

The Proposed Architecture

The medicinal plant identification model was developed in this study by combining image processing methods with convolutional neural network techniques, namely a custom CNN, VGG-16, and ResNet-50. The proposed architecture is presented in Figure 1. Medicinal plant images are first preprocessed, which includes resizing, normalizing, and filtering them. In the second stage, the leaf images are segmented to find regions of interest. Next, the data are split, and image augmentation is applied to the training set, which is used to build the classification model with a softmax classifier. After the model is created, it is optimized by tuning hyperparameters on the validation dataset. The model’s predictive performance is then assessed using unseen plant images from the test set.

Figure 1: Overall Architecture of the Proposed Models

Image Acquisition

A total of 10,372 medicinal plant leaf images were collected for this study from real-world settings. Of these, 8,200 images were used for training, 1,086 for validation, and 1,086 for testing. The images were taken against different backgrounds and stored in JPG format. They were captured at a resolution of 2448 × 3264 pixels. We gathered 103 leaf images for each medicinal plant species.

Image Pre-Processing

In this study the researcher follows basic image pre-processing techniques: image normalization, resizing, noise removal with filtering, and image segmentation (threshold segmentation, K-means-based segmentation, and edge-based segmentation).

Image Resizing

Image resizing is the process of changing the dimensions of an image. This can involve reducing or enlarging the image’s width and height, either while maintaining its aspect ratio (the proportional relationship between width and height) or adjusting the two freely [7]. When resizing, interpolation methods such as nearest neighbor, bilinear, or cubic interpolation are used to maintain the quality and clarity of the image. However, resizing an image too much (especially enlarging it beyond its original size) can result in a loss of sharpness or pixelation. In this study, each original image is resized to 128 × 128 pixels.
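To make the resizing step concrete, here is a minimal, dependency-free sketch of nearest-neighbor resizing on a grayscale image stored as a list of rows. The function name `resize_nearest` and the choice of nearest-neighbor sampling are assumptions for illustration; the article does not say which interpolation method or library was used.

```python
def resize_nearest(image, new_height, new_width):
    """Resize a 2-D grayscale image (list of rows) by nearest-neighbor sampling."""
    old_height = len(image)
    old_width = len(image[0])
    resized = []
    for row in range(new_height):
        # Map each output coordinate back to a source pixel by integer scaling.
        src_row = min(old_height - 1, row * old_height // new_height)
        new_row = []
        for col in range(new_width):
            src_col = min(old_width - 1, col * old_width // new_width)
            new_row.append(image[src_row][src_col])
        resized.append(new_row)
    return resized


small = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
print(resize_nearest(small, 2, 2))  # [[0, 2], [8, 10]]

# The same function produces the 128 x 128 input size used in the study.
thumb = resize_nearest(small, 128, 128)
print(len(thumb), len(thumb[0]))  # 128 128
```

Note that resizing a 2448 × 3264 image directly to 128 × 128 changes the aspect ratio from 3:4 to 1:1, so the leaf is stretched unless the image is cropped or padded first.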

Image Noise Removal with Filtering

Noise removal is an essential operation in image processing and should be performed before any other processing of the image. Images may be corrupted by noise during acquisition and transmission [8]. There are different types of image noise, such as impulse noise, additive white Gaussian noise, shot noise, quantization noise, and film grain. Two or more of these can combine to form mixed noise. In this study, different filtering methods, namely 2D convolution filtering (filter2D), bilateral filtering, box filtering, and Gaussian filtering, are used to reduce the effect of noise on the images.
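The Gaussian filtering mentioned above can be sketched in a few lines of plain Python. The kernel size of 5, the value of `sigma`, and the border handling (clamping coordinates at the image edge) are illustrative assumptions, since the article does not report these settings; the helper names are hypothetical.

```python
import math


def gaussian_kernel(size=5, sigma=1.0):
    """Build a normalized size x size Gaussian kernel (size must be odd)."""
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


def gaussian_blur(image, size=5, sigma=1.0):
    """Blur a 2-D grayscale image; out-of-range coordinates are clamped."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    height, width = len(image), len(image[0])
    blurred = []
    for r in range(height):
        out_row = []
        for c in range(width):
            acc = 0.0
            for kr in range(size):
                for kc in range(size):
                    rr = min(max(r + kr - half, 0), height - 1)
                    cc = min(max(c + kc - half, 0), width - 1)
                    acc += kernel[kr][kc] * image[rr][cc]
            out_row.append(acc)
        blurred.append(out_row)
    return blurred


# A single bright "noise" pixel is spread out and strongly attenuated.
noisy = [[0.0] * 5 for _ in range(5)]
noisy[2][2] = 255.0
smoothed = gaussian_blur(noisy)
print(smoothed[2][2] < noisy[2][2])  # True
```

Because the kernel weights sum to one, flat regions keep their intensity while isolated spikes are averaged into their neighborhood.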

Image Segmentation

Image segmentation is a computer vision technique that partitions an image into multiple segments to simplify analysis by identifying objects, boundaries, or regions of interest. It is commonly performed using deep learning models such as Convolutional Neural Networks (CNNs) and U-Net [9]. Segmentation can be categorized into semantic segmentation (classifying each pixel into a category), instance segmentation (distinguishing different objects of the same class), and panoptic segmentation (combining both). Techniques such as thresholding, edge detection, region-based methods, and deep learning-based approaches help achieve accurate segmentation. In this study, edge-based, K-means, and threshold segmentation are applied.
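Of the three techniques applied in the study, threshold segmentation is the simplest to illustrate. The sketch below builds a binary mask from a grayscale image and uses it to isolate the region of interest. Using the mean intensity as the default threshold, and treating darker pixels as the leaf, are assumptions for this example; the article does not state how the threshold was chosen.

```python
def threshold_segment(gray, threshold=None):
    """Return a binary mask (1 = foreground) from a 2-D grayscale image.

    If no threshold is given, the mean intensity is used as a simple default.
    Pixels darker than the threshold are treated as the leaf, which assumes a
    dark leaf on a lighter background.
    """
    pixels = [p for row in gray for p in row]
    if threshold is None:
        threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def apply_mask(gray, mask, background=0):
    """Keep foreground pixels and replace everything else with `background`."""
    return [
        [p if m else background for p, m in zip(row, mask_row)]
        for row, mask_row in zip(gray, mask)
    ]


# Toy image: a dark 2 x 2 "leaf" on a bright background.
gray = [
    [200, 200, 200, 200],
    [200, 60, 70, 200],
    [200, 65, 55, 200],
    [200, 200, 200, 200],
]
mask = threshold_segment(gray)
for row in mask:
    print(row)
# [0, 0, 0, 0]
# [0, 1, 1, 0]
# [0, 1, 1, 0]
# [0, 0, 0, 0]
```

With real photographs taken against varying backgrounds, a fixed or mean-based threshold may not separate the leaf cleanly, which is one reason to compare it against K-means and edge-based segmentation.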

Feature Extraction

For feature extraction, the researcher used 5 × 5 and 3 × 3 filter sizes at a single layer. The ReLU activation function was used in the activation layers throughout the model; it returns zero if the input value is negative and otherwise returns the value itself. Max pooling was applied in the subsampling layers, which down-sample the feature maps produced by the convolutional layers to reduce their dimensionality and decrease processing time. Pooling progressively reduces the width and height of the input volume and requires two parameters: the pool size and the stride. Since the input image is relatively large (128 × 128 pixels), a pool size of 2 × 2 was applied after the consecutive convolution layers. A stride of two was used for each pooling operation; the stride determines the number of pixels skipped between pooling windows. Finally, the high-level features learned by the convolutional layers were flattened and passed to a dense (fully connected) layer, which combines all the features.
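The ReLU and max-pooling operations described above can be written out directly. This is a minimal pure-Python sketch on a single 2-D feature map; the function names are hypothetical, and a real model would apply the same operations per channel through a deep learning framework.

```python
def relu(feature_map):
    """Replace negative activations with zero and leave the rest unchanged."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, pool_size=2, stride=2):
    """Down-sample a 2-D feature map by taking the maximum of each window."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, height - pool_size + 1, stride):
        out_row = []
        for c in range(0, width - pool_size + 1, stride):
            window = [
                feature_map[r + dr][c + dc]
                for dr in range(pool_size)
                for dc in range(pool_size)
            ]
            out_row.append(max(window))
        pooled.append(out_row)
    return pooled


def pooled_size(size, pool_size=2, stride=2):
    """Output width or height of a pooling layer without padding."""
    return (size - pool_size) // stride + 1


fm = [
    [-1, 2, 0, -3],
    [4, -5, 1, 1],
    [0, 0, -2, -2],
    [3, -1, -4, 5],
]
print(max_pool2d(relu(fm)))  # [[4, 1], [3, 5]]

# Each 2 x 2, stride-2 pooling layer halves the spatial size of a 128 x 128 input.
print(pooled_size(128), pooled_size(64), pooled_size(32))  # 64 32 16
```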

Classification Algorithm

The selected classification algorithm works by first extracting features (such as edges, textures, shapes, or pixel intensities) from the input medicinal plant leaf images and then using those features to learn patterns associated with the different image categories during training. Convolutional neural networks are typically used because they automatically learn hierarchical features from raw image pixels. During training, the algorithm adjusts its internal parameters to minimize the difference between its predicted labels and the actual labels. Once trained, the model can take a new plant leaf image, process it through its learned layers, and output the most probable class label (e.g., “Abalo,” “Grawa,” or “Limich”) together with its description, based on the visual patterns.
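The final prediction step can be sketched as a softmax over the network's output scores followed by picking the most probable class. The scores below are made-up values for illustration, and cross-entropy is shown as a common choice of training loss; the article does not name the loss function it used.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_index):
    """Loss for one sample: small when the true class gets high probability."""
    return -math.log(probs[true_index])


def predict(logits, class_names):
    """Return the most probable class label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


class_names = ["Abalo", "Grawa", "Limich"]
logits = [2.0, 0.5, -1.0]  # hypothetical output scores for one leaf image

label, confidence = predict(logits, class_names)
print(label)  # Abalo

# The loss is lower when the true class is the one the model favors.
probs = softmax(logits)
print(cross_entropy(probs, 0) < cross_entropy(probs, 1))  # True
```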

Experiment and Result Discussion

In this study, a CNN deep learning model was used to classify medicinal plants. We created a CNN model from scratch and compared its performance with pre-trained deep learning models, namely VGG-16 and ResNet-50. Below we present the experiments conducted using segmentation and filtering methods. The proposed MPI model was evaluated by splitting the whole image dataset (10,372 images) into 80% for training, 10% for validation, and 10% for testing. The Gaussian blur filtering algorithm and different segmentation algorithms (threshold, k-means, and edge-based) were applied in the experiments with the selected models: VGG-16, ResNet-50, and the proposed MPI model.

Segmentation	Epochs	Batch size	Filter	Optimizer	Training accuracy	Validation accuracy	Testing accuracy
Threshold	20	32	Gaussian blur	Adam	98.91%	99.03%	99.40%
Threshold	30	64	Gaussian blur	Adam	99.84%	99.46%	99.70%
Threshold	50	128	Gaussian blur	Adam	99.96%	99.56%	99.90%

Table 1: Experimental Result of MPI Model with Threshold Segmentation
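To put the three epoch and batch-size settings in perspective, the short calculation below estimates how many weight updates each one performs. It assumes the 8,200 training images reported in the article are each seen once per epoch, with no change in set size from augmentation; that assumption and the helper name `training_steps` are not from the article.

```python
import math

TRAIN_IMAGES = 8200  # training-set size reported in the article

# (epochs, batch size) pairs taken from the experiment tables
configs = [(20, 32), (30, 64), (50, 128)]


def training_steps(n_images, epochs, batch_size):
    """Return (steps per epoch, total weight updates) for one configuration."""
    steps_per_epoch = math.ceil(n_images / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


for epochs, batch_size in configs:
    per_epoch, total = training_steps(TRAIN_IMAGES, epochs, batch_size)
    print(f"epochs={epochs:>2} batch={batch_size:>3} "
          f"steps/epoch={per_epoch:>3} total updates={total}")
# epochs=20 batch= 32 steps/epoch=257 total updates=5140
# epochs=30 batch= 64 steps/epoch=129 total updates=3870
# epochs=50 batch=128 steps/epoch= 65 total updates=3250
```

Under these assumptions the 50-epoch, batch-128 setting performs the fewest weight updates of the three, even though it runs the most epochs.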

As shown in Table 1, the MPI model with threshold segmentation achieves its highest test accuracy, 99.90%, with 50 epochs and a batch size of 128. As shown in Figure 2, the training and validation accuracy curves, as well as the training and validation loss curves, stay close to each other. This further shows that the MPI model built with threshold segmentation and Gaussian blur filtering does not exhibit over-fitting.

Figure 2: Training Accuracy and Loss Plot of MPI Model with Threshold Segmentation

Experiments with VGG-16 Model Using Different Segmentation Algorithms

Different experiments were conducted for transfer learning with the VGG-16 pre-trained model using different image segmentation algorithms together with Gaussian noise filtering. In these experiments, VGG-16 with k-means segmentation produced good results when trained with a batch size of 64 for 30 epochs, using Gaussian blur filtering and the Adam optimizer. This configuration reached training, validation, and testing accuracies of 98.15%, 97.50%, and 98.54%, respectively, which is better than the other combinations. With k-means segmentation, the testing accuracy of VGG-16 at 20 epochs with batch size 32 and at 50 epochs with batch size 128 was lower than at 30 epochs with batch size 64.

Segmentation	Epochs	Batch size	Filter	Optimizer	Training accuracy (%)	Validation accuracy (%)	Testing accuracy (%)
k-means	20	32	Gaussian blur	Adam	97.22	97.93	97.35
k-means	30	64	Gaussian blur	Adam	98.15	97.50	98.54
k-means	50	128	Gaussian blur	Adam	96.41	95.72	95.82

Table 2: Results of VGG-16 Model Using K-Means Segmentation

Experiments with ResNet-50 Model Using Different Segmentation Algorithms

Different experiments were conducted for transfer learning with the ResNet-50 pre-trained model using different image segmentation algorithms together with Gaussian noise filtering. In these experiments, combinations of filtering, epoch count, and batch size were evaluated with each segmentation algorithm on the ResNet-50 model.

The ResNet-50 model achieves its best results with threshold segmentation. Table 3 below indicates that ResNet-50 with 50 epochs, a batch size of 128, and threshold segmentation provides the best testing performance, with a training accuracy of 98.36%, a validation accuracy of 97.25%, and a testing accuracy of 99.50%.

Segmentation	Epochs	Batch size	Filter	Optimizer	Training accuracy	Validation accuracy	Testing accuracy
Threshold	20	32	Gaussian blur	Adam	98.19%	97.68%	98.80%
Threshold	30	64	Gaussian blur	Adam	98.38%	97.63%	98.90%
Threshold	50	128	Gaussian blur	Adam	98.36%	97.25%	99.50%

Table 3: ResNet-50 Training, Validation and Testing Accuracy with Threshold Segmentation

Discussion of Results

In this study, the three selected models (VGG-16, ResNet-50, and MPI) gave different results because they have distinct architectures, depths, and ways of handling features during training. VGG-16 is a deep, sequential model with uniform convolution layers, making it effective for extracting hierarchical features but computationally heavy.

ResNet-50 introduces residual connections, allowing very deep networks to train without vanishing gradients, which often results in better performance on complex datasets. The MPI model, created from scratch, is shallower and simpler, yet it has the capacity to capture intricate patterns just as effectively. These architectural differences lead to variations in learning, feature representation, and ultimately, model accuracy and generalization.

The main goal of this work was to create a model for identifying and classifying Ethiopian medicinal plants. The study addresses the research questions outlined at the start of the work as follows:

• Employing various image preprocessing techniques to enhance the quality and suitability of datasets for precise medicinal plant identification.

After applying image preprocessing techniques such as denoising, normalization, resizing, edge-based segmentation, and k-means segmentation, along with data augmentation and dropout, the model trained with Gaussian blur filtering and threshold segmentation proved to be the most suitable identifier for the medicinal plant species classes. This is because Gaussian blur moves a kernel, a small matrix of values, over the image and convolves it with the image to compute a weighted average of pixel intensities. The Gaussian kernel holds the weights assigned to each pixel in the neighborhood of a target pixel.
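For reference, the kernel weights described above follow the standard two-dimensional Gaussian function, and the blurred pixel is the normalized weighted sum over a window. The symbols $\sigma$ (standard deviation), $k$ (kernel half-width), $I$ (input image), and $I'$ (filtered image) are notation introduced here, not taken from the article:

```latex
G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right),
\qquad
I'(r, c) = \frac{\sum_{i=-k}^{k} \sum_{j=-k}^{k} G(i, j)\, I(r + i,\, c + j)}
                {\sum_{i=-k}^{k} \sum_{j=-k}^{k} G(i, j)}
```

Pixels near the window center receive the largest weights, so the filter suppresses isolated noise while disturbing local structure less than a uniform average would.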

• Optimize deep learning models for accurate medicinal plant identification and classification.

When the proposed MPI model is compared with VGG-16, it improves testing performance by 10.13%. The reason is that VGG-16 uses a single 3 × 3 filter size throughout, so some important or descriptive features may be missed; it is also computationally heavy and memory-intensive, which leads to longer training times. The next comparison was with ResNet-50. Although ResNet-50 performs better than VGG-16 owing to its residual connections, the MPI model still improves classification accuracy by 5.40, because ResNet-50 likewise relies on a single filter size and does not extract all the necessary features. In the proposed MPI model, the researcher used 5 × 5 filters to extract image features. On this basis, the researcher concludes that the MPI model constructed from scratch outperforms the pre-trained VGG-16 and ResNet-50 models for this problem.

• Assessing the performance of the developed model and its potential impact on improving identification and classification of medicinal plants.

In this research, the researcher evaluated the performance of the classifier model using a confusion matrix, with precision, recall, and F1-measure as the main evaluation metrics. The model, built from scratch with filtering and segmentation, achieved a testing accuracy of 99.90% and showed promising results for classifying the medicinal plant species compared to transfer learning. The MPI model was confused on only one plant species: it misclassified a single image out of 1,002 test images, and the remaining 1,001 images were classified correctly.
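The metrics named above can all be derived from a confusion matrix. The sketch below uses a small made-up three-class matrix to show the calculation; the matrix values and function names are illustrative assumptions, not the study's data. The final line checks the reported figure of one error in 1,002 test images.

```python
def per_class_metrics(confusion):
    """Compute precision, recall and F1 per class from a square confusion matrix.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp
        fn = sum(confusion[k]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


def accuracy(confusion):
    """Fraction of all samples that lie on the diagonal of the matrix."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical matrix: one class-1 image is mistaken for class 0.
toy = [
    [10, 0, 0],
    [1, 9, 0],
    [0, 0, 10],
]
for name, m in zip(["class 0", "class 1", "class 2"], per_class_metrics(toy)):
    print(name, {key: round(value, 3) for key, value in m.items()})

# One misclassified image out of 1,002, as reported in the text:
print(f"{1001 / 1002 * 100:.2f}%")  # 99.90%
```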

Finally, when the researcher tested the proposed MPI model with unseen medicinal plant leaf images, the model predicted the correct classes, along with their descriptions, with very few errors. Figure 3 shows some of the predicted medicinal plants with their descriptions.

Figure 3: Medicinal Plant Predicted Results of the Proposed MPI Model

Conclusion and Recommendations

Medicinal plant identification is a critical task with applications in healthcare, biodiversity conservation, and ethnobotany [10]. This study addresses the challenge of identifying such plants accurately, which is also important for pharmacology. The problem stems from the difficulty of distinguishing plants with similar morphological features, which often requires expert knowledge [3]. The researcher developed the model using an experimental research design, training and testing it on images of medicinal plants.

A custom deep learning MPI model was developed from scratch, explicitly tailored to the nuances of medicinal plant identification, and benchmarked against pre-trained models. The developed MPI model uses a CNN with a softmax classifier to classify images. The researcher compared its performance with that of the pre-trained VGG-16 and ResNet-50 models. The models were evaluated using a confusion matrix, accuracy, recall, precision, F1-score, and support. The testing accuracies of the VGG-16, MPI, and ResNet-50 models are 97.63%, 99.90%, and 99.50%, respectively. Therefore, the researcher concludes that the proposed MPI model with Gaussian filtering and threshold segmentation performs better than the other models and image processing techniques. For future work, we recommend adding a user interface and conducting user acceptance testing to increase the model’s acceptability among experts and users.
