Biometric Analysis of Leaf Venation Density Based on Digital Image

Leaf venation types differ in their density characteristics. These differences reflect the environment in which plants grow, including habitat, vegetation, physiology, and climate. This research aims to measure leaf venation density, analyze leaf venation features, and identify plants based on venation type. The stages of the research are leaf image data collection, segmentation, vein detection, feature extraction, feature selection, classification, evaluation, and analysis. The results indicate that the parallelodromous, acrodromous, and pinnate venation types have fairly good venation density. Feature selection with the Boruta algorithm yielded the 19 most important features representing the leaf venation type. This is supported by the average classification accuracy obtained with SVM, which reached 77.57%.


Introduction
Plant morphology is the study of the physical form and body structure of plants [1]. It is useful for identifying plants visually, so that the vast diversity of plants can be identified, classified, and named by group. Plant morphology not only describes the shape and structure of the plant body but also helps determine the function of each part in the life of the plant, and from there the origin and composition of the body that is formed [2]. Morphological information is needed to understand life cycles, geographic distribution, ecology, evolution, and conservation, and to define species [3].
The part of a plant that differs most characteristically from one plant to another, and that is most often used to identify species, is the leaf [4]. Leaves have several main features that distinguish each type of plant, including shape, vein structure (venation) [5] [6], and texture [6]. Among these features, leaf venation has a unique diversity that can describe plant characteristics in more detail, although some plant species do not exhibit clear venation patterns [7]. The leaf venation network provides an integrative link between plant shape, function, and climate, including temperature, precipitation, and water availability [8].
Leaf venation plays a very important role in plant growth. A growing plant loses water through transpiration, and transpiration requires a water supply that is delivered by the leaf venation. If the growth rate and water loss rate of a plant depend on the environment, then only a few types of leaf venation with a certain density can survive in each environment [8]. Zalenski observed that the venation density of plants in dry habitats was higher than that of plants in mesic habitats (habitats with adequate moisture or water levels) [9]. Therefore, measuring leaf venation density is required to understand the relationship between physiology and the climate in which a plant grows. Density is one of the main characteristics of leaf venation because it is directly related to vein function [10].
This study analyzes leaf venation density using plant biometric features. Obtaining the leaf venation density features requires extracting the leaf venation features. Leaf venation feature extraction produces several features: straightness, different angle, length ratio, scale projection, skeleton length, number of segments, total skeleton length, projected leaf area, number of branching points, and number of ending points. For some of the extracted features, the mean, variance, and standard deviation were calculated [11]. Feature selection was then performed on all of the features obtained to find the most important ones. The selected features were used to classify the leaves based on venation type. The classification technique used is the Support Vector Machine (SVM). In many cases, such as pattern recognition and regression estimation, SVM performance (i.e., the error rate on test data) is significantly better than that of other methods [12].
The benefit of this research is that, in addition to leaf texture and shape features, the leaf venation features obtained can be used as an additional identifier for plants. The results are also expected to facilitate the work of botanists in identifying plants and, with further research, in predicting the environment in which a plant grows.

Research Method

Leaf Image Dataset
The leaf image dataset was obtained from the Computational Intelligence Laboratory, Department of Computer Science, Bogor Agricultural University. A total of 271 leaf images were collected. The dataset covers 53 plant species, grouped according to leaf venation type as shown in Figure 1.

Methodology
The general stages of the process in this study are shown in Figure 2. Each stage is explained in the subsequent sections.

Segmentation
Segmentation was done using the Hessian matrix [13] to obtain leaf venation images and thresholding to obtain leaf shape images. The segmentation results are binary images. Figure 3 illustrates the segmentation process.
The basic idea of branch-point and end-point determination is that a pixel with only one or two neighboring pixels is an end-point (illustrated in Figure 5a), while a pixel with three or more neighboring pixels is a branch-point (illustrated in Figure 5b).
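The neighbor-counting rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a one-pixel-wide binary skeleton stored as a list of lists, counts neighbors in the 8-neighborhood, and uses the hypothetical parameters `end_max` and `branch_min` as thresholds. The default `end_max=1` follows the common skeleton convention; setting `end_max=2` reproduces the one-or-two-neighbor rule stated in the text.

```python
def classify_skeleton_points(skeleton, end_max=1, branch_min=3):
    """Return (end_points, branch_points) as lists of (row, col) tuples.

    skeleton: list of lists of 0/1 values (assumed one pixel wide).
    end_max, branch_min: hypothetical threshold parameters.
    """
    rows, cols = len(skeleton), len(skeleton[0])
    end_points, branch_points = [], []
    for r in range(rows):
        for c in range(cols):
            if not skeleton[r][c]:
                continue
            # Count foreground pixels among the 8 surrounding pixels.
            neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and skeleton[rr][cc]:
                        neighbours += 1
            if 1 <= neighbours <= end_max:
                end_points.append((r, c))
            elif neighbours >= branch_min:
                branch_points.append((r, c))
    return end_points, branch_points


# A small Y-shaped skeleton: two arms meeting a vertical stem.
skeleton = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
end_points, branch_points = classify_skeleton_points(skeleton)
print(end_points, branch_points)
```

With 8-neighborhood counting, pixels directly adjacent to a junction in a thicker or T-shaped skeleton can also reach three neighbors, so a practical pipeline would usually thin the skeleton first or merge clustered branch-points.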

Straightness
Straightness measures how straight a segment is. Based on Figure 6, the straightness value can be calculated using Equation 1.
$$\text{straightness}_j = \frac{l_j}{d_j} \tag{1}$$
where $l_j$ is the length of segment $j$, represented by the number of connected pixels between its two end points, and $d_j$ is the distance between the pixel at coordinates $(x_{s_j}, y_{s_j})$ and the pixel at coordinates $(x_{e_j}, y_{e_j})$. The coordinate values are determined in the segment extraction process. This value indicates the degree of straightness of the leaf venation. The value of $d_j$ can be calculated using Equation 2.
$$d_j = \sqrt{(x_e - x_s)^2 + (y_e - y_s)^2} \tag{2}$$
where $(x_s, y_s)$ are the abscissa and ordinate of the initial pixel and $(x_e, y_e)$ are the abscissa and ordinate of the final pixel.
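As a small worked illustration of the straightness measure, the sketch below assumes that straightness is the ratio of the pixel length of a segment to the Euclidean distance between its initial and final pixels. The function name `straightness` and the sample coordinates are hypothetical.

```python
import math


def straightness(pixel_length, start, end):
    """Ratio of segment pixel length to end-to-end Euclidean distance.

    pixel_length: number of pixels along the segment (l_j).
    start, end: (x, y) coordinates of the initial and final pixels.
    Assumes the ratio form l_j / d_j.
    """
    d = math.hypot(end[0] - start[0], end[1] - start[1])
    if d == 0:
        raise ValueError("initial and final pixels must differ")
    return pixel_length / d


# A straight segment: 10 pixels spanning a distance of 10.
straight = straightness(10, (0, 0), (6, 8))
# A curved segment: 15 pixels spanning the same distance.
curved = straightness(15, (0, 0), (6, 8))
print(straight, curved)
```

Under this assumed form, a perfectly straight segment gives a value close to 1, and the value grows as the vein bends between the same two end points.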

Different Angle
The different angle measures the angular difference between segments that coincide. Based on Figure 6, the different angle values can be calculated using Equation 3.
The angle ($\theta$) of each segment, formed from the center of the segment, is calculated with Equation 4.
$$\theta = \cos^{-1}\left(\frac{x_e - x_s}{\sqrt{(x_e - x_s)^2 + (y_e - y_s)^2}}\right) \tag{4}$$
The resulting angle has the range $0 \le \theta \le \pi$. To produce the $\theta_i$, $\theta_j$, and $\theta_k$ values, the angle should have the range $0 < \theta \le \pi$. Angle values in that range are obtained with Equation 5.
$$\theta_i = \begin{cases} \theta_i, & (y_e - y_s) \ge 0 \\ \text{value from Equation 6}, & (y_e - y_s) < 0 \end{cases} \tag{5}$$
If $(y_e - y_s) < 0$, the value of $\theta_i$ is determined using Equation 6. Once the angle values have been converted to the range $0 < \theta \le \pi$, the different angle between coinciding segments can be determined using Equation 3.
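The per-segment angle can be illustrated with a short sketch. It assumes the arccosine form, that is, the angle between the segment vector and the positive x axis, and covers only that step; the later adjustment for negative $(y_e - y_s)$ is not reproduced here. The function name `segment_angle` is hypothetical.

```python
import math


def segment_angle(start, end):
    """Angle in [0, pi] between the segment vector and the positive x axis."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("start and end must differ")
    # Clamp to guard against tiny floating-point overshoot outside [-1, 1].
    cosine = max(-1.0, min(1.0, dx / length))
    return math.acos(cosine)


right = segment_angle((0, 0), (1, 0))
up = segment_angle((0, 0), (0, 1))
down = segment_angle((0, 0), (0, -1))
print(right, up, down)
```

Because the arccosine depends only on the x component, the upward and downward segments in this example receive the same angle, which is why a separate case on the sign of $(y_e - y_s)$ is needed before angles are compared.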

Length Ratio
The length ratio is measured by comparing the length of each segment to the maximum segment length in a leaf venation image. Based on Figure 6, the length ratio can be calculated using Equation 7.
$$R_i = \frac{l_i}{\max(\vec{l})} \tag{7}$$
where $R_i$ is the length ratio of segment $i$, $l_i$ is the length of segment $i$, and $\vec{l}$ is the vector of segment lengths in the leaf image.
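A minimal sketch of the length ratio, assuming the segment lengths of one leaf image are already available as a list (the sample lengths are hypothetical):

```python
def length_ratios(lengths):
    """Divide every segment length by the longest segment in the image."""
    longest = max(lengths)
    return [length / longest for length in lengths]


ratios = length_ratios([12, 30, 15])
print(ratios)
```

The longest segment always receives a ratio of 1, and all other segments fall in the interval (0, 1].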

Scale Projection
Scale projection measures the projection length between segments that coincide. In Figure 6, the projection length between segments $i$ and $j$ is determined using Equation 8, which is expanded in Equation 9. The expansion involves the dot product of the two segment vectors:
$$(x_{e_i} - x_{s_i})(x_{e_j} - x_{s_j}) + (y_{e_i} - y_{s_i})(y_{e_j} - y_{s_j})$$
Skeleton length, straightness, different angle, length ratio, and scale projection each take more than one value per leaf. In this research, these five features were therefore summarized using a statistical approach, namely the mean, variance, and standard deviation, following Plotze and Bruno [11]. In addition to these features, other features were obtained, such as the total skeleton length, the number of segments, and the features used to measure leaf venation density.
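The statistical summary of a multi-valued feature can be sketched as follows. The text does not state whether population or sample variance is used, so this sketch assumes the population form; the function name `summarize` and the sample values are hypothetical.

```python
import statistics


def summarize(values):
    """Collapse the per-segment values of one feature into three statistics."""
    return {
        "mean": statistics.mean(values),
        # Population variance and standard deviation (an assumption).
        "variance": statistics.pvariance(values),
        "std": statistics.pstdev(values),
    }


# Hypothetical per-segment values of one feature for a single leaf.
summary = summarize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(summary)
```

Applying this summary to each of the five multi-valued features gives 15 of the per-leaf features used later for feature selection and classification.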

The Leaf Venation Density Measurement
Leaf venation density measurement consists of three measures: leaf venation density, branch point density, and end point density. These can be calculated using Equations 10-12 [14].
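The three density measures can be illustrated with a short sketch. Equations 10-12 are not reproduced in the text, so the definitions below are an assumption: each quantity (total skeleton length, number of branching points, number of ending points) is divided by the projected leaf area, which is a common way to define such densities. The function name and the sample numbers are hypothetical.

```python
def venation_densities(total_skeleton_length, n_branch_points,
                       n_end_points, projected_leaf_area):
    """Return three density measures normalized by projected leaf area.

    Assumed definitions, not taken from the source equations.
    """
    if projected_leaf_area <= 0:
        raise ValueError("projected leaf area must be positive")
    return {
        "vein_density": total_skeleton_length / projected_leaf_area,
        "branch_point_density": n_branch_points / projected_leaf_area,
        "end_point_density": n_end_points / projected_leaf_area,
    }


# Hypothetical leaf: 1500 px of skeleton, 40 branch points, 55 end points,
# and a projected area of 50000 px^2.
densities = venation_densities(1500, 40, 55, 50000)
print(densities)
```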

Feature Selection of the Leaf Venation
Feature selection used the Boruta algorithm. Boruta is a feature selection algorithm that works as a wrapper around Random Forest [15]. Boruta iteratively compares the importance of the original attributes with that of shadow attributes (i.e., randomized copies of all attributes). Attributes with lower importance than the shadow attributes are marked as Rejected and removed from the system. Attributes with higher importance than the shadow attributes are marked as Confirmed. The shadow attributes are recreated in each iteration. The algorithm stops when only Confirmed attributes remain or when it reaches the specified iteration limit.
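The shadow-attribute comparison can be sketched in simplified form. This is not the Boruta implementation: the real algorithm obtains importances from a Random Forest and decides Confirmed or Rejected with a statistical test over many iterations. The sketch only shows how shadow attributes are built by shuffling each column and how one round compares real importances with the best shadow importance. The function names and the importance scores are hypothetical.

```python
import random


def make_shadows(columns, rng):
    """Return a shuffled copy of every feature column (the shadow attributes)."""
    shadows = []
    for column in columns:
        shadow = list(column)   # copy, so the original column is untouched
        rng.shuffle(shadow)     # break any relation with the class label
        shadows.append(shadow)
    return shadows


def boruta_round(real_importance, shadow_importance):
    """Flag each real feature that beats the best shadow feature this round."""
    threshold = max(shadow_importance)
    return [importance > threshold for importance in real_importance]


rng = random.Random(0)
columns = [[1, 2, 3, 4], [5, 6, 7, 8]]
shadows = make_shadows(columns, rng)

# Hypothetical importance scores for three real and three shadow features.
hits = boruta_round([0.30, 0.02, 0.15], [0.04, 0.05, 0.03])
print(hits)
```

Because the shadows are reshuffled in every iteration, a feature has to beat random noise repeatedly before it can be treated as relevant.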

SVM Classification
The model was developed using an SVM classifier with the Gaussian RBF kernel function. The Gaussian RBF kernel requires the parameters C and γ. The C values tested were in the range [2^0, 2^…], and the γ values tested were in the range […]. To find the best parameter values, each C value was combined with each γ value, and 4-fold cross-validation was applied to each combination. The best combination of C and γ is the one with the highest accuracy.
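The parameter search described above can be sketched without committing to a particular SVM library. In the sketch below, `evaluate` is a hypothetical callback that would train an RBF-kernel SVM on the training indices and return its accuracy on the held-out fold; here it is replaced by a toy scoring function so the example runs on its own. The exponent ranges are placeholders, since the ranges actually tested are not recoverable from the text.

```python
from itertools import product


def k_fold_indices(n_samples, k=4):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def grid_search(c_values, gamma_values, n_samples, evaluate, k=4):
    """Return (mean accuracy, C, gamma) of the best-scoring combination."""
    folds = k_fold_indices(n_samples, k)
    best = None
    for c, gamma in product(c_values, gamma_values):
        scores = []
        for i, test_idx in enumerate(folds):
            # All folds except fold i form the training set.
            train_idx = [j for f, fold in enumerate(folds) if f != i
                         for j in fold]
            scores.append(evaluate(c, gamma, train_idx, test_idx))
        mean_score = sum(scores) / k
        if best is None or mean_score > best[0]:
            best = (mean_score, c, gamma)
    return best


def toy_evaluate(c, gamma, train_idx, test_idx):
    """Stand-in for training and scoring an SVM; peaks at C=4, gamma=0.5."""
    return 1.0 - abs(c - 4) / 10 - abs(gamma - 0.5)


c_values = [2 ** e for e in range(0, 4)]         # placeholder exponents
gamma_values = [2.0 ** e for e in range(-3, 1)]  # placeholder exponents
best = grid_search(c_values, gamma_values, 203, toy_evaluate)
print(best)
```

In practice the samples would be shuffled, and ideally stratified by venation type, before being split into folds.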

Results and Analysis
The collected leaf image data consist of five venation types: acrodromous (60 leaf images), actinodromous (32), campylodromous (32), parallelodromous (11), and pinnate (136). Segmentation and vein detection were then applied to these images. Feature extraction of the leaf venation followed, yielding 23 features, including the mean, variance, and standard deviation of straightness, different angle, length ratio, scale projection, and skeleton length, as well as the number of segments, total skeleton length, projected leaf area, number of branching points, and number of ending points.

Analysis of Leaf Venation Density
Density analysis of the leaf venation was performed to examine the distribution of density and its clustering by venation type. Leaf venation density measurement consists of three measures: leaf venation density, branch point density, and end point density. The measured leaf venation density levels of the five venation types are shown in Figure 7.
According to Figure 7, of the five venation types, the parallelodromous and acrodromous types have fairly good (consistent) density. The pinnate type is relatively good, which is attributed to its having the most varied data. The actinodromous and campylodromous types have less good density because their values are spread across the whole range from 0.0 to 1.0. Values near 1.0 indicate high density, while values near 0.0 indicate low density. Examples of vein detection images for each venation type with high and low venation density levels are shown in Table 1.

The Result of Feature Selection Using Boruta Algorithm
Feature selection with the Boruta algorithm reduced the 23 leaf venation features to the 19 that are most relevant to the venation type. The selection is based on the mean, median, minimum, and maximum importance values of each leaf venation feature. The results of feature selection using the Boruta algorithm are shown in Table 2.

Classification Results using SVM
The initial stage of SVM classification is parameter selection for the RBF kernel. The dataset was divided into training and testing data in proportions of 75% and 25%, giving 203 training samples and 68 testing samples. Two datasets were tested: the dataset with all features (23 features) and the dataset resulting from feature selection (19 features). For each dataset, the best combination of the C and γ parameters selected for classification modeling is shown in Table 3.
After the best parameters were obtained, the next step was to select the classification model. Of the four models built using the best pair of C and γ values, the model with the best accuracy was selected. This was the model from the 3rd fold, with 85.29% accuracy for the dataset with Boruta feature selection and 83.82% for the dataset with all features.
The selected classification model was then evaluated by testing the accuracy for each class. The test set contains 68 samples in total: 15 acrodromous, 8 actinodromous, 8 campylodromous, 3 parallelodromous, and 34 pinnate. Based on Table 5, the average accuracy over the leaf venation types is 77.57% for the dataset with Boruta feature selection and 76.82% for the dataset with all features. This shows that the dataset with Boruta feature selection has the higher accuracy. Although the original features were reduced to 19, the information contained in the data is maintained.
In addition to accuracy, precision and recall were calculated to measure the performance of the classification model. On average, the dataset with all features has a precision of 89.33% and a recall of 76.82%, while the dataset with Boruta feature selection has a precision of 88.76% and a recall of 77.57%. The difference in precision and recall between the two datasets is small, which suggests that the classification models built from the two datasets perform about equally well. Feature selection with the Boruta algorithm can therefore be considered effective, because it gives slightly better accuracy with fewer features. In addition, using feature selection for SVM classification can reduce computing time.
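The per-class evaluation can be illustrated with a short sketch that derives precision and recall from a confusion matrix and averages them over classes. The confusion matrix below is hypothetical and is not the data behind Table 5. The sketch assumes macro averaging, in which per-class recall plays the role of per-class accuracy; this would be consistent with the reported average recall coinciding with the average accuracy for both datasets.

```python
def per_class_metrics(confusion):
    """Return (precisions, recalls), one value per class.

    confusion[i][j] = number of class-i samples predicted as class j.
    """
    n = len(confusion)
    precisions, recalls = [], []
    for k in range(n):
        true_positive = confusion[k][k]
        actual = sum(confusion[k])                         # row total
        predicted = sum(confusion[i][k] for i in range(n))  # column total
        recalls.append(true_positive / actual if actual else 0.0)
        precisions.append(true_positive / predicted if predicted else 0.0)
    return precisions, recalls


def macro_average(values):
    """Unweighted mean over classes, so small classes count equally."""
    return sum(values) / len(values)


# Hypothetical 3-class confusion matrix (rows: actual, columns: predicted).
confusion = [
    [8, 2, 0],
    [1, 6, 1],
    [0, 0, 10],
]
precisions, recalls = per_class_metrics(confusion)
print(macro_average(precisions), macro_average(recalls))
```

With macro averaging, a small class such as parallelodromous (3 test samples) influences the average as much as pinnate (34 test samples), which can pull the average well below the overall accuracy on the test set.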

Conclusion
Of the five leaf venation types, the acrodromous and parallelodromous types have good venation density, whereas the density of the pinnate type is relatively good compared with the actinodromous and campylodromous types, whose venation density is the most diverse. Of the extracted leaf venation features, 19 carry the most important information about the venation type. Classification using the Support Vector Machine (SVM) for leaf identification based on venation type achieved its highest average accuracy, 77.57%, when using the features selected with the Boruta algorithm.

Figure 4. A segment with two pairs of point coordinates

Figure 5. (a) Illustration of end-point determination, (b) illustration of branch-point determination

Figure 7. Value of leaf venation density of 5 types of venation

TELKOMNIKA, ISSN: 1693-6930
Biometric Analysis of Leaf Venation Density Based on Digital Image (Agus Ambarwari)

Table 1. Comparison of Leaf Venation Density on Each Type of Venation

Table 2. The Result of Feature Selection Using Boruta Algorithm


Table 4. Details of Accuracy (in %) for Each Model Formed from the Best C and γ Pair

Table 5. The Percentage Values of Precision, Recall, and Accuracy from Each Dataset