Identifying Citronella Plants From UAV Imagery Using Support Vector Machine

High-resolution imagery taken from an Unmanned Aerial Vehicle (UAV) is now often used as an alternative to satellite imagery for monitoring agronomic plants. This paper presents a method to identify Citronella among other plants based on UAV imagery. The method uses a Support Vector Machine (SVM) to classify Citronella among other plants based on extracted texture features. The implementation was evaluated using two groups of datasets: 1) Citronella, Kaffir Lime, other green plants, vacant soil, and buildings, and 2) Citronella and paddy rice plants. The evaluation results show that the proposed method identifies Citronella in the first group with an accuracy of 94.23% and a Kappa value of 88.48%, and in the second group with an accuracy of 100% and a Kappa value of 100%.


Introduction
Citronella is one of the essential oil plants that is already well developed. Citronella oil is obtained by distilling the leaves. In world trade, Citronella oil from Indonesia is famously known as "Citronella Oil of Java." The plants can be cultivated easily and can grow on marginal and ex-mining land. However, as with other essential oil plants, it is not easy to identify the location and validate the area of Citronella crops because of poor data collection management. One cause of this difficulty is that the planting sites are usually far away on hills, in valleys, or in ex-mining areas. Remote sensing using satellite imagery or aerial photographs is often used to gather such data efficiently. Information obtained from the images can be processed for various purposes, such as distinguishing essential oil plants from other objects or plants, calculating planted area, and mapping land.
Field observation shows that identifying citronella plants from satellite data or aerial photographs raises some problems. First, citronella is sometimes planted in small areas, so identification requires imagery with an appropriate resolution. Second, citronella plants are low plants with small canopies and have the same color as other plants, so proper techniques are required in the feature extraction process. Therefore, the purposes of this research are: 1) to determine the optimal features and window size for identifying citronella plants, 2) to classify UAV images using SVM to recognize citronella plants among other objects and plants, and 3) to develop a simulation program to identify citronella plants. Accuracy measured on the test results is used to assess how well the proposed method performs. The identified citronella crops and their extent can be used to monitor the availability of raw materials for citronella-based production.
Satellite imagery has been broadly used for various activities. Examples include denoising satellite images using the Discrete Cosine Transform [1] and mapping agricultural land using Landsat ETM+ satellite imagery [2], Landsat 8 OLI multispectral imagery [3], [4], [5], and MODIS imagery [6]. The results showed that medium-resolution Landsat and MODIS imagery could produce high accuracy when applied to large areas. In the field, however, essential oil plants, especially Citronella, often occupy small areas located on unproductive land. Medium-resolution satellite data is therefore less suitable for detecting Citronella [7]. With 30-meter-square pixels or coarser, it is still hard to distinguish the plants in an area. Moreover, if more than one type of plant grows in a field smaller than 30 meters square, the information stored in each pixel is a mixture of several objects. This situation gives less accurate results [8]. Therefore, many studies use high-resolution imagery such as IKONOS [9], [10] or IKONOS and WorldView-2 [11] to get better accuracy. However, high-resolution satellite data incurs a high cost and is less practical for institutions with limited funds.
Aerial photography using UAVs has recently emerged and produces very high-resolution imagery (centimeter scale). Another advantage is the lower cost of producing images. With very high resolution, identification can be done accurately [12] [13] and plants in an area can be distinguished [14] [15]. Therefore, this research uses UAV imagery to identify Citronella plants. However, images captured using UAVs depend strongly on conditions such as weather, sunlight exposure, and the altitude at which the image is captured. These conditions affect the brightness and resolution of the picture. The UAV imagery therefore needs to be pre-processed to calibrate brightness and equalize resolution. Furthermore, features must be extracted to obtain proper identifiers as input to the classification process.
Based on visual observation, Citronella plants can be distinguished from buildings and vacant soil using color features. However, color is less useful for distinguishing essential oil plants from other plants such as rice, corn, and trees. Further observation found that some of these plants have different canopy shapes and plant densities. Therefore, texture features are more useful for feature extraction. This research uses first-order statistical features and lacunarity features.
The first-order statistical features are calculated through a histogram-based approach, so that the feature values reflect the particular characteristics of an object. Statistical features have been found relevant for the clinical staging of cervical cancers [16] and have been used as features on the discrete wavelet transform for classifying microscopic images of hardwood species [17]. The lacunarity feature is a fractal measure that quantifies the non-homogeneity of an image based on its density information. Lacunarity was successfully used to detect the structure and irregularities in images of fried batters [18] and to describe soil structure [19].
Support Vector Machine (SVM) classification is then used to obtain optimal recognition results from the selected features. SVM finds the best hyperplane for separating two classes. When the classes cannot be separated linearly, a kernel technique can be used. SVM has been used effectively on several imagery problems, such as image denoising [20], extraction of rectangular and circular buildings from high-resolution optical spaceborne images [21], and cloud image detection on large data volumes [22]. The study in [23] classified tree species at the individual tree crown level using SVM with hyperspectral imagery and laser scanning data from two boreal forests. SVM was also used in [24] to localize subcellular proteins in fluorescence microscopy images. SVM with a kernel function has also been used to classify parasites and detect thrips [25], to classify tumors in Computed Tomography (CT) images [26], and to recognize facial expressions [27].

Research Method
Figure 1 shows the flow of the identification process for Citronella plants. Distinguishing Citronella plants from other objects or plants requires several stages: feature extraction, training with SVM, and classification testing. The input is UAV imagery read in a set window size. The identification process starts by extracting features from the input image to obtain particular information from each image pixel. The features extracted include the average pixel value, standard deviation, 1st-order texture, and lacunarity. The features obtained are then used as inputs to SVM training. Training is done to find the best SVM parameters by adjusting parameter values such as window size, sigma, lambda, learning rate and maximum number of iterations. The optimal parameters from training are then used in testing to obtain the class label of each test sample.

Image Data
The image data used in this study were captured by UAV around the Citronella demonstration garden of Institut Atsiri in Kesamben, Blitar, and over paddy fields in Kepanjen, Malang. The captured images have a resolution of 11 cm and are stored as TIF files to preserve their geographic coordinates. An example image captured by UAV in Kesamben is shown in Figure 2, in which Citronella is the object labeled (6). In the testing process, each object sample is a 30x30-pixel patch. Apart from kaffir lime, 100 samples of each object were used in the training process, and 25 samples of each object were used in the testing process. Table 1 details the data used in this study.

Image Feature Extraction
Image texture consists of pixels or groups of pixels (texels) that are related to one another and that characterize the surface or structure of an object or image region (Selverajah, 2011). The features used in this research are 1st-order statistical features and lacunarity. The 1st-order statistical texture is calculated from the original pixel values of the image, regardless of the relationships between neighboring pixels, and is obtained through a histogram-based approach. Features derived from this technique include the mean pixel intensity, standard deviation, energy, entropy, and skewness.

a. Histogram
An image's histogram represents how often each pixel intensity value appears in the image; a larger value means that more pixels have that intensity (Kadir, 2013). The histogram contains the 1st-order statistical information of an image or image region.

b. Mean of Pixel Intensity
The first feature of the 1st-order statistical texture is the mean of pixel intensity. This feature gives the average brightness of the object. The mean pixel intensity is calculated using equation 1.
$$\mu = \sum_{i=0}^{L-1} i \, p(i) \qquad (1)$$
where i is the gray level of the image, p(i) is the probability of occurrence of i, and L is the number of gray levels, so that L - 1 is the maximum gray level in the image.

c. Standard Deviation of Pixel Intensity
The next feature is the standard deviation, which measures the contrast of the object. The standard deviation is calculated using equation 2.
$$\sigma = \sqrt{\sum_{i=0}^{L-1} (i - \mu)^2 \, p(i)} \qquad (2)$$
where \mu is the mean pixel intensity from equation 1.

d. Energy
The energy feature represents how the pixel intensities are distributed over the gray level range. An image has the maximum energy value of 1 if all its pixels share a single gray level. Images with few gray levels have a greater energy value than those with many gray levels. Therefore this feature is also often referred to as homogeneity. Energy is calculated using equation 3.
$$\text{Energy} = \sum_{i=0}^{L-1} p(i)^2 \qquad (3)$$

e. Entropy
The entropy feature represents how complex an image is: the higher the entropy, the more complex the image. Entropy tends to be the opposite of energy. Entropy also indicates the amount of information contained in the data distribution. Entropy is calculated using equation 4.
$$\text{Entropy} = -\sum_{i=0}^{L-1} p(i) \log_2 p(i) \qquad (4)$$

f. Skewness
The skewness feature expresses the asymmetry of the distribution around the mean intensity. A negative value means that the brightness distribution leans to the left of the mean intensity, while a positive value means that it leans to the right. Skewness is calculated using equation 5. To normalize the skewness, the value is divided by (L-1).
$$\text{Skewness} = \sum_{i=0}^{L-1} (i - \mu)^3 \, p(i) \qquad (5)$$
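As an illustration of the histogram-based features above, the following is a minimal Python sketch, not the authors' C# implementation. The function name `first_order_features`, the default of 256 gray levels, and the use of NumPy are assumptions made for this example; the formulas are the standard first-order definitions.

```python
import numpy as np


def first_order_features(patch, levels=256):
    """Histogram-based first-order texture features of one grayscale patch.

    Assumption: `patch` holds integer gray levels in [0, levels - 1].
    """
    values = np.asarray(patch).astype(np.int64).ravel()
    # p[i] is the relative frequency (probability) of gray level i.
    counts = np.bincount(values, minlength=levels).astype(float)
    p = counts / counts.sum()
    i = np.arange(len(p))

    mean = np.sum(i * p)
    std = np.sqrt(np.sum((i - mean) ** 2 * p))
    energy = np.sum(p ** 2)
    nonzero = p[p > 0]  # skip empty bins so that log2(0) is never evaluated
    entropy = -np.sum(nonzero * np.log2(nonzero))
    # Third central moment, divided by (levels - 1) as the text describes.
    skewness = np.sum((i - mean) ** 3 * p) / (levels - 1)
    return {
        "mean": mean,
        "std": std,
        "energy": energy,
        "entropy": entropy,
        "skewness": skewness,
    }
```

A patch with a single gray level gives an energy of 1 and an entropy of 0, which matches the behavior described for these two features.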

g. Lacunarity
The lacunarity feature is a fractal measure of the non-homogeneity of an image. Lacunarity provides information on the density of an image and is expressed as the ratio of the variance to the average value of a function. Lacunarity is defined by equations 6, 7 and 8 (Petrou, 2006).
In these equations, P is the pixel value at a given position, M is the image length, N is the image width, and L is the lacunarity value.
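For reference, lacunarity measures attributed to Petrou are commonly written in the form below, using the symbols defined above. Whether these three expressions correspond one-to-one to equations 6, 7 and 8 is an assumption; the shorthand \bar{P} for the mean pixel value and the exponent p are introduced here for readability.

```latex
\bar{P} = \frac{1}{MN} \sum_{m=1}^{M} \sum_{n=1}^{N} P_{mn}

L_s = \frac{\frac{1}{MN} \sum_{m=1}^{M} \sum_{n=1}^{N} P_{mn}^{2}}{\bar{P}^{2}} - 1

L_a = \frac{1}{MN} \sum_{m=1}^{M} \sum_{n=1}^{N} \left| \frac{P_{mn}}{\bar{P}} - 1 \right|

L_p = \left( \frac{1}{MN} \sum_{m=1}^{M} \sum_{n=1}^{N} \left( \frac{P_{mn}}{\bar{P}} - 1 \right)^{p} \right)^{1/p}
```

The first expression equals the variance of the pixel values divided by the square of their mean, which is consistent with the description of lacunarity as a ratio of variance to average value.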

Feature Data Extraction
Feature data extraction aims to obtain a feature vector for each region of the image read through a moving window. These feature vectors are used in the training and classification performed with the sequential SVM. The first stage is to read the image using an n x n moving window from left to right and from top to bottom so that all parts of the image are read. At each position of the moving window, the features used in training and classification are calculated. The window can be shifted with different overlap values. An example of moving-window movement with an overlap of 100% is shown in Figure 3.
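The moving-window scan described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the names `sliding_windows` and `window_features` are hypothetical, the shift between windows is expressed as a `step` in pixels rather than an overlap percentage, and only the mean and standard deviation are computed per window.

```python
import numpy as np


def sliding_windows(image, n, step):
    """Yield (row, col, window) for each n x n window, left to right, top to bottom."""
    rows, cols = image.shape
    for r in range(0, rows - n + 1, step):
        for c in range(0, cols - n + 1, step):
            yield r, c, image[r:r + n, c:c + n]


def window_features(image, n, step):
    """Return one (row, col, mean, std) tuple per window position."""
    image = np.asarray(image, dtype=float)
    features = []
    for r, c, window in sliding_windows(image, n, step):
        features.append((r, c, window.mean(), window.std()))
    return features
```

A `step` equal to `n` gives non-overlapping windows, while a smaller `step` makes neighboring windows overlap. Windows that would extend past the image border are skipped in this sketch.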

Support Vector Machine (SVM)
After the feature data are gathered from the feature extraction process, the next step is to identify the objects contained in the image. This research uses SVM, a method for classification and regression. SVM is a supervised method and therefore requires a training process. The basic concept of SVM is to find the best separating hyperplane between two classes of data. The hyperplane obtained in training is then used to classify data in the final classification step. The hyperplane is a line in a two-dimensional space and a flat plane in a higher-dimensional space.
The optimal hyperplane dividing a set of data into two classes is defined as the hyperplane that separates the data vectors without class error and maximizes the distance between the hyperplane and the closest vectors (the maximal margin). When the data vectors cannot be separated linearly in the input space, kernels can be used to map the data from the input space into a higher-dimensional vector space. Commonly used kernel functions include the linear, Gaussian (RBF), exponential, polynomial, hybrid, and sigmoidal kernels.
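As an example of one kernel from this list, the Gaussian (RBF) kernel can be sketched in Python as below. The text does not state which kernel form was implemented, so the common form K(x, z) = exp(-||x - z||^2 / (2 sigma^2)) and the function name `rbf_kernel` are assumptions.

```python
import numpy as np


def rbf_kernel(X, Z, sigma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Z."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    Z = np.atleast_2d(np.asarray(Z, dtype=float))
    # Squared Euclidean distance between every row of X and every row of Z.
    sq_dist = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Z ** 2, axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative rounding errors
    return np.exp(-sq_dist / (2.0 * sigma ** 2))
```

The kernel value is 1 for identical vectors and decays toward 0 as the vectors move apart, with sigma controlling how quickly it decays.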
After mapping to the new vector space, training can proceed as in linear classification. This research uses the sequential learning method (Vijayakumar, 1999). The sequential SVM training algorithm is performed in the following steps:
1. Initialize \alpha_i = 0.
2. Calculate the Hessian matrix for i, j = 1, ..., l, defined by equation 9:
$$D_{ij} = y_i y_j \left( K(x_i, x_j) + \lambda^2 \right) \qquad (9)$$
where D_{ij} is the Hessian matrix, y is the data class, x is the data, K is the kernel function, and \lambda is a scalar variable.
3. For every vector, i = 1 to l, calculate:
a. $$E_i = \sum_{j=1}^{l} \alpha_j D_{ij} \qquad (10)$$
b. $$\delta\alpha_i = \min\{\max[\gamma (1 - E_i), -\alpha_i], C - \alpha_i\} \qquad (11)$$
c. $$\alpha_i = \alpha_i + \delta\alpha_i \qquad (12)$$
where E is the error value, \delta\alpha_i is a single variable holding the update to \alpha_i, \gamma is the learning rate, and C is the constant that bounds \alpha_i (the penalty on the slack variables).
4. If convergence or the maximum number of iterations has been reached, stop; otherwise go back to step 3.
This process yields \alpha and the support vectors, which are the vectors having \alpha > 0. The bias value used for classification can be calculated using equation 13 (Vijayakumar, 1999).
Once the bias is obtained, data can be classified using equation 14. If f(x) equals 1, the data is assigned to the positive class; if f(x) equals -1, it is assigned to the negative class.
$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{l} \alpha_i y_i K(x_i, x) + b \right) \qquad (14)$$
where b is the bias.
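The sequential training steps described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' program: a Gaussian kernel is assumed, the function names and the tolerance `tol` are hypothetical, and the bias is folded into the lambda squared term of the augmented kernel instead of being computed separately through equation 13.

```python
import numpy as np


def rbf(X, Z, sigma):
    """Gaussian kernel matrix between the rows of X and the rows of Z (assumed kernel)."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    Z = np.atleast_2d(np.asarray(Z, dtype=float))
    sq = np.sum(X ** 2, axis=1)[:, None] + np.sum(Z ** 2, axis=1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def train_sequential_svm(X, y, sigma=0.4, lam=0.001, gamma=0.01, C=1.0,
                         max_iter=100, tol=1e-6):
    """Return the multipliers alpha found by sequential (per-sample) updates."""
    y = np.asarray(y, dtype=float)
    # Hessian matrix with the lambda**2 augmentation.
    D = np.outer(y, y) * (rbf(X, X, sigma) + lam ** 2)
    alpha = np.zeros(len(y))
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(len(y)):
            E_i = alpha @ D[i]
            # Clip the step so that alpha[i] stays inside [0, C].
            delta = min(max(gamma * (1.0 - E_i), -alpha[i]), C - alpha[i])
            alpha[i] += delta
            largest_change = max(largest_change, abs(delta))
        if largest_change < tol:  # assumed convergence criterion
            break
    return alpha


def predict(X_train, y, alpha, X_new, sigma=0.4, lam=0.001):
    """Classify rows of X_new as +1 or -1 using the augmented kernel."""
    y = np.asarray(y, dtype=float)
    K = rbf(X_train, X_new, sigma) + lam ** 2
    return np.sign((alpha * y) @ K)
```

The default values for sigma, lambda, learning rate, and maximum iterations follow the parameter values reported in the results; the default C is a placeholder, since the text does not report the value used.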

Results and Analysis
In this research, we developed a simulation program in the C# programming language to test the proposed method. The test was done on two groups of datasets:
1. The first group consists of Citronella and non-Citronella data (kaffir lime, green plants, buildings and vacant soil).
2. The second group consists of Citronella and paddy.
The first group contains 25 test samples for each object, for a total of 125 test samples. In the second group, the paddy test data were taken at plant ages of 1 month (phase 1), 2 months (phase 2) and 3 months (phase 3), with 25 samples for each phase, together with 50 Citronella test samples.
Training is performed before testing on both datasets. The purpose of training is to obtain the optimal parameters of the proposed method: the window size for the UAV imagery, the features used, and the SVM parameters lambda, learning rate, and sigma. Among window sizes ranging from 5 to 29, the best size found is 29, while the optimal feature is the first-order texture or lacunarity. For lambda, experiments ranged from 0.00001 to 0.4 and the optimal value obtained is 0.001; the best learning rate is 0.01 and the best sigma value is 0.4. These parameter values are then used in the testing process. Overall accuracy and the Kappa value are used to assess the performance of the proposed method.
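The two performance measures named above can be computed from predicted and true labels as in the following Python sketch. It uses the standard definitions of overall accuracy and Cohen's Kappa; the function name `accuracy_and_kappa` is hypothetical, and the example labels in the usage note are illustrative, not data from this study.

```python
import numpy as np


def accuracy_and_kappa(y_true, y_pred):
    """Return (overall accuracy, Cohen's Kappa) for two label sequences."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    labels = np.unique(np.concatenate([y_true, y_pred]))
    # Confusion matrix: rows are true classes, columns are predicted classes.
    cm = np.zeros((len(labels), len(labels)))
    np.add.at(cm, (np.searchsorted(labels, y_true), np.searchsorted(labels, y_pred)), 1)
    n = cm.sum()
    observed = np.trace(cm) / n  # overall accuracy
    # Agreement expected by chance, from the row and column totals.
    expected = np.sum(cm.sum(axis=1) * cm.sum(axis=0)) / n ** 2
    kappa = 1.0 if expected == 1.0 else (observed - expected) / (1.0 - expected)
    return observed, kappa
```

For example, with true labels `[1, 1, 0, 0]` and predictions `[1, 0, 0, 0]`, three of four samples agree, the chance agreement is 0.5, and the function returns an accuracy of 0.75 and a Kappa of 0.5. Kappa is lower than accuracy because it discounts the agreement that would occur by chance.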

The Result of the First Dataset Testing
The test was performed using the best parameters obtained from the training process, namely a window size of 29 pixels, a sigma value of 0.4, a lambda of 0.001, a learning rate of 0.01 and a maximum of 100 iterations. The tested features include the pixel average, the pixel standard deviation, the 1st-order texture (histogram, mean pixel intensity, standard deviation of pixel intensity), and lacunarity. Feature test results using the optimal SVM parameter values are shown in Table 2. The results on this data show that the highest accuracy obtained is 94.23% with a kappa value of 88.48%. This result suggests that texture features can be used to identify Citronella among other objects such as kaffir lime, other green plants, buildings and vacant soil. Further analysis shows that obtaining optimal results does not require all the features (excluding the mean), because using only the 1st-order statistical texture feature or only lacunarity produces equally good accuracy. The mean feature is not effective enough because it gives low accuracy.

The Result of the Second Dataset Testing
As with the first dataset, the second dataset test also uses the best SVM parameters, i.e., a window size of 29, a sigma value of 0.4, a lambda of 0.001, a learning rate of 0.01 and a maximum of 100 iterations. Feature test results using the optimal SVM parameters are shown in Table 3.
The evaluation using Citronella and paddy data revealed that Citronella plants could be distinguished very well from paddy plants. The result shows an accuracy of 100% and a kappa value of 100%. Citronella is well identified against paddy in each phase. Although the forms of Citronella and paddy plants look almost the same visually, they have different textures. This difference is seen more clearly when the Citronella plants are older than three months, when the leaves curve downward. Test results also show that distinguishing citronella from paddy does not require all the features. Table 3 indicates that using only the 1st-order statistical texture feature or lacunarity produces optimal accuracy and kappa values. The mean feature is again not effective enough because it gives low accuracy and kappa values.
The maximum accuracy in this second test is likely because the two sets of images were captured in different areas and at different times, so they very likely differ in lighting and resolution. Further testing is therefore required using Citronella and paddy images captured in adjacent areas at the same time.

Conclusion
Given the increasing need for Citronella plants and the limited land under cultivation, an effort is required to find out how much area is planted with this crop. In this research, UAV imagery was used to classify Citronella plants among other objects, especially paddy plants, which have almost the same shape. The proposed method distinguishes objects through texture features. The evaluation results show that the effective features are the 1st-order statistical texture or lacunarity, with 94.23% accuracy and an 88.48% kappa value on the data containing multiple objects. The evaluation also showed that the proposed method distinguishes Citronella from paddy with 100% accuracy and a 100% kappa value. These results indicate that the method is quite effective for identifying Citronella plants. However, further testing is needed, primarily on Citronella and paddy data, to obtain more conclusive results.