Implementation of Gray Level Co-occurrence Matrix on the Leaves of Rice Crops

Received 08 May 2020, Revised 17 June 2020, Accepted 25 July 2020.
Rice is one of the most important cultivated plants for human survival. The success of the rice harvest affects farmers' income. However, farmers often suffer losses as a result of disease in rice. Infected rice plants show symptoms in the form of patches with characteristic patterns and colors on parts of the plant such as the stems, leaves, and roots. Symptoms that emerge on the leaves are the easiest to identify because the leaves have a wider cross-section than other parts of the rice plant. Therefore, in this study, the leaf was used as the parameter for the initial step of disease detection in rice. This research aimed to identify diseases in rice plants using the Gray Level Co-occurrence Matrix (GLCM) method, which is a feature extraction method. Disease detection on rice leaves began by retrieving the original image; the image was then segmented and converted to grayscale. After that, feature extraction was carried out using the GLCM features: Entropy, Eccentricity, Contrast, Energy, Correlation, and Homogeneity. The results showed 90% accuracy using GLCM extraction. Recognizing the diseases that emerge on rice leaves can help to identify the type of disease infecting the rice plants.


INTRODUCTION
For rice farmers, planting is their livelihood. The success of the rice harvest affects farmers' income because most farmers depend on the rice harvest for their living. Paddy has the scientific name Oryza sativa and belongs to the grass family, Poaceae. Paddy produces rice, which is the staple food of most of our nation, so rice is one of the fields of agriculture that affects daily life. However, farmers often suffer losses [1] [2], and the main cause of these losses is rice disease. Although farmers have received some training and knowledge on how to care for paddy plants and recognize their diseases, mistakes still occasionally occur in determining the disease [3] [4]. These errors occur because the human ability to recognize a disease visually is limited [5]. In addition, the characteristics of one paddy disease are almost identical to those of another. This issue is consistent with previous research. Dadi Rosadi and Asril conducted research titled "Rice Crop Diagnosis System Using Forward Chaining Method," which was built using the Borland Delphi 7 programming language [6]. Meanwhile, Sri Wulandari et al. conducted research titled "System for Diagnosis of Pests and Diseases of Rice Crops Using Bayes Method" [7].
One of the factors in the decline of rice production is disease that affects rice crops, especially on the leaves. The types of disease on rice leaves are Blast, Leaf Blight, Tungro, and Leaf Burn [5]. Symptoms that arise on rice leaves are the easiest to identify because the leaves have a wider cross-section than other parts of the rice plant, so discoloration and spot shapes are visible [4]. Therefore, rice leaves can be used as the first step of disease detection in rice. To address these problems, the researchers created a disease recognition system for rice plants using leaf parameters. In this research, the GLCM (Gray-Level Co-occurrence Matrix) method was applied to extract rice leaf characteristics using Matlab. This research develops an image-based implementation of the GLCM method that extracts six features (contrast, eccentricity, energy, homogeneity, entropy, and correlation) at angles of 0°, 45°, 90°, and 135°, and detects four types of rice disease: Blast, Leaf Blight, Leaf Burn, and Tungro.

RESEARCH METHOD
2.1. System Design
First, the user opens the application built in Matlab [8] [9]. The steps used in this study are 1) image acquisition, 2) image improvement (preprocessing), 3) feature extraction, and 4) object identification [10], as shown in Figure 1. In image acquisition, the object is captured as a digital image in JPG format [11]. Image improvement changes the pixel intensities by grayscaling and resizes the picture. The features of the improved image are then extracted with GLCM, and finally the object is identified by calculating the closest distance (Euclidean distance) [12]. The value of each of the object's characteristics is calculated: Contrast, Eccentricity, Correlation, Entropy, Homogeneity, and Energy [13] [14] [15]. Euclidean distance is the measure most commonly used for image similarity and is usually calculated from raw data rather than standardized data.
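The identification step by closest Euclidean distance can be sketched as follows. This is a minimal illustration in Python (the study itself used Matlab), not the authors' implementation. The function names, the three-element feature vectors, and all numbers are hypothetical and chosen only to show the mechanics.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_nearest(features, references):
    """Return (label, distance) of the stored vector closest to `features`.

    `references` maps a disease label to a list of stored feature vectors.
    """
    best_label, best_dist = None, float("inf")
    for label, vectors in references.items():
        for vec in vectors:
            d = euclidean_distance(features, vec)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label, best_dist


# Hypothetical training vectors (made-up values, not taken from the paper).
references = {
    "Blast": [[0.30, 0.80, 0.10]],
    "Tungro": [[0.90, 0.20, 0.60]],
}
label, dist = classify_nearest([0.35, 0.75, 0.15], references)
print(label, round(dist, 4))
```

Because the distance is computed on raw feature values, a feature with a large numeric range will dominate the result, which is worth keeping in mind when features such as contrast and energy are mixed.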

Fig. 1. System Design
To configure GLCM, the gray-level co-occurrence matrix (Figure 2) counts how often a pixel with intensity (gray level) n occurs in a given spatial relation to a pixel with value m. By default, the spatial relationship is defined as the pixel of interest and the pixel immediately to its right (horizontally adjacent). Each element (n, m) of the co-occurrence matrix is simply the number of times that a pixel with value n occurs in the specified spatial relationship to a pixel with value m in the input image [16] [17]. Calculating the co-occurrence matrix for the full dynamic range of the image requires heavy processing, which is not desirable. The number of intensity values in the gray image is therefore reduced from 256 to 8, since the number of gray levels determines the size of the co-occurrence matrix [17] [18].
Texture applications fall into two categories. The first is segmentation, where texture is used to separate one object from another. The second is texture classification, which uses texture features to classify objects [19]. It can be seen that the pairs of values from column 1 and column 2, and so on, are counted into the co-occurrence matrix according to their rows and columns.
Some GLCM features are described as follows. Energy measures texture uniformity and takes a high value when pixel values are similar to each other. Conversely, a small value signifies that the normalized GLCM values are heterogeneous. The maximum energy value is 1, which means that the pixel distribution is constant or periodic (not random). The equation for energy is

$$\text{Energy} = \sum_{i} \sum_{j} p(i, j)^2 \qquad (1)$$

where $i$ indicates the row, $j$ indicates the column, and $p(i, j)$ denotes the value in row $i$ and column $j$ of the co-occurrence matrix.
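The construction of the co-occurrence matrix and the energy feature can be sketched as follows. This is a minimal pure-Python illustration, assuming a horizontal offset of one pixel to the right and a non-symmetric count. The function names and the 4x4 sample image with four gray levels are hypothetical and are not taken from the study.

```python
def quantize(image, levels=8, max_value=255):
    """Scale intensities in 0..max_value onto integer levels 0..levels-1."""
    return [[v * levels // (max_value + 1) for v in row] for row in image]


def glcm(image, levels, dr=0, dc=1):
    """Count how often level n has level m at offset (dr, dc).

    The default offset (0, 1) is the pixel immediately to the right.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def normalize(counts):
    """Turn pair counts into probabilities p(i, j) that sum to 1."""
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs found for this offset")
    return [[v / total for v in row] for row in counts]


def energy(p):
    """Sum of squared entries of the normalized matrix."""
    return sum(v * v for row in p for v in row)


# Hypothetical 4x4 image that is already quantized to levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(image, levels=4)
p = normalize(counts)
print(counts)
print(round(energy(p), 4))
```

A constant image puts all pairs into a single cell, so its energy is exactly 1, which matches the stated maximum.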
Entropy shows the amount of image information that is needed for image compression. Entropy measures the loss of information or message in a transmitted signal and also measures the image information. The entropy can be written as

$$\text{Entropy} = -\sum_{i} \sum_{j} p(i, j) \log p(i, j) \qquad (2)$$

Contrast shows the spatial frequency of the image and is the difference moment of the GLCM. The difference in question is the difference between the high and low pixel values. Contrast is 0 if neighboring pixel values are equal. The contrast equation is written as

$$\text{Contrast} = \sum_{i} \sum_{j} (i - j)^2 \, p(i, j) \qquad (3)$$

Homogeneity is also known as the Inverse Difference Moment. Homogeneity measures how homogeneous the image is. It is sensitive to GLCM entries produced by identical or uniform pixels, for which it takes a high value. Opposite to contrast, energy takes a large value when the pixel values are similar.
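The three features above can be sketched on a normalized co-occurrence matrix as follows. This is a hedged illustration, not the authors' code. The logarithm base (2) is an assumption, and the homogeneity formula uses the common Inverse Difference Moment form with 1 + (i - j)^2 in the denominator, which is also an assumption because the text does not give that equation. The sample matrix is hypothetical.

```python
import math


def entropy(p):
    """Entropy of a normalized GLCM; zero entries contribute nothing."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)


def contrast(p):
    """Weighted sum of squared gray-level differences (i - j)^2."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p)
               for j, v in enumerate(row))


def homogeneity(p):
    """Inverse Difference Moment: entries near the diagonal weigh most."""
    return sum(v / (1 + (i - j) ** 2)
               for i, row in enumerate(p)
               for j, v in enumerate(row))


# Hypothetical normalized 2x2 matrix with all four pairs equally likely.
p = [[0.25, 0.25],
     [0.25, 0.25]]
print(entropy(p), contrast(p), homogeneity(p))
```

For a matrix concentrated on a single diagonal cell, contrast and entropy are both 0 and homogeneity is 1, which is consistent with the description of uniform images.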
Correlation measures the linear dependency of the gray levels of neighboring pixels. Digital Image Correlation, a separate technique, is an optical method that employs tracking and image registration techniques for accurate 2D and 3D measurements of changes in images. It is often used to measure deformation, displacement, strain, and optical flow, and it is widely applied in many areas of science and engineering. One widespread application is measuring the motion of an optical mouse [20] [15].
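The GLCM correlation feature can be sketched as follows. This is a minimal illustration that assumes the usual definition, the covariance of the row and column gray levels divided by the product of their standard deviations. The function name and sample matrices are hypothetical.

```python
import math


def correlation(p):
    """Linear dependency of gray levels in a normalized GLCM.

    Returns NaN when either marginal has zero variance (for example a
    constant image), because the ratio is undefined in that case.
    """
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    if var_i == 0 or var_j == 0:
        return float("nan")
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return cov / math.sqrt(var_i * var_j)


# Hypothetical matrices: neighbors always equal, and neighbors always different.
same = [[0.5, 0.0], [0.0, 0.5]]
opposite = [[0.0, 0.5], [0.5, 0.0]]
print(correlation(same), correlation(opposite))
```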
The six image features were formulated and extracted using Matlab to calculate the GLCM, since images cannot be provided directly as inputs to an FPGA implementation. The image feature extraction method used in this paper is shown in Figure 3 [19].

Training Process
Digital image acquisition aims to determine the data needed and to choose the digital image recording method. In this stage, the researchers obtained digital images by searching for rice leaf image data through the Google search engine.
The preprocessing phase aims to simplify image identification. This stage consists of resizing the original image to 688x800 pixels, changing the background color of the image during segmentation, and converting the RGB image to grayscale, LAB, and binary images to extract the shape values from the image [12] [8].
At the labeling stage, each image in the training data is labeled. Labeling aims to separate the data by the labels that will be used in segmentation and classification.
At the GLCM stage, texture analysis is performed using the GLCM features. This process quantizes image characteristics into a group of corresponding characteristic values. Texture analysis is generally used as an intermediate step for image classification and interpretation. The extracted features are Entropy, Energy, Eccentricity, Contrast, Correlation, and Homogeneity [15].
At the training stage, training is carried out using a set of training data containing the features used to differentiate one object from another. The characteristics used are texture analysis with GLCM and leaf shape recognition. The training process maps the training data to the training target through an algorithm formulation, image identification using GLCM features, and image classification using k-means clustering segmentation.
The formula used for testing the results of this system is demonstrated by
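The testing formula is not reproduced in the text at this point. Since the study reports its result as an accuracy percentage, one plausible form, given here only as an assumption, is the standard accuracy ratio, where $N_{\text{correct}}$ and $N_{\text{test}}$ are hypothetical symbols for the number of correctly identified test images and the total number of test images:

```latex
\text{Accuracy} = \frac{N_{\text{correct}}}{N_{\text{test}}} \times 100\%
```

Under this assumed form, 18 correct identifications out of 20 test images would give 90%. The count of 18 is only an illustrative value and is not reported in the source.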

Disease in Rice Plants
Rice leaves infected with disease differ in texture pattern, color, and leaf shape from one disease to another. The types of disease discussed in this study are:
a. Blast (Figure 4) appears on the leaves as brown spots with a gray-white center. Over time, the neck of the panicle begins to rot or break, so that grain filling is interrupted and many grains are hollow.

Fig. 4. Blast Disease
b. Leaf blight: infected leaves turn gray-green, fold, and roll. In severe cases, the disease causes leaves to curl, wither, and die, as shown in Figure 5.

RESULT AND DISCUSSION
The feature extraction flowchart is shown in Figure 8. Feature extraction is the process of obtaining and transforming the main characteristics contained in the image: contrast, eccentricity, energy, homogeneity, entropy, and correlation at angles of 0°, 45°, 90°, and 135°. Once all values are obtained, they are averaged. The extraction flow is as follows: the grayscale step produces a quantized grayscale matrix, which is then used to calculate 5 statistical values of the co-occurrence matrix.
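Averaging a feature over the four angles can be sketched as follows. This is a minimal pure-Python illustration, assuming a pixel distance of 1 and the offset convention in which 0° is the right-hand neighbor and 45°, 90°, and 135° point to the row above. The names, the offset table, and the 4x4 sample image are hypothetical. Contrast is used as the example feature.

```python
# Assumed (row, column) offsets for each angle at distance 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm_normalized(image, levels, dr, dc):
    """Normalized co-occurrence matrix for one offset (dr, dc)."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("no pixel pairs found for this offset")
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Weighted sum of squared gray-level differences."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p)
               for j, v in enumerate(row))


def feature_over_angles(image, levels, feature):
    """Return the per-angle feature values and their mean."""
    per_angle = {
        angle: feature(glcm_normalized(image, levels, dr, dc))
        for angle, (dr, dc) in OFFSETS.items()
    }
    return per_angle, sum(per_angle.values()) / len(per_angle)


# Hypothetical 4x4 image already quantized to levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
per_angle, mean_contrast = feature_over_angles(image, 4, contrast)
print(per_angle)
print(mean_contrast)
```

Averaging over the four directions makes the resulting feature less dependent on the orientation of the leaf in the image.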
The feature extraction values to be computed are Contrast, Eccentricity, Homogeneity, Energy, Entropy, and Correlation. With a 3x3 pixel size, the image has 16 gray levels in the range 0-15. The training process used the GLCM method to calculate each extracted feature for each disease, thereby generating the feature extraction values: Entropy, Contrast, Eccentricity, Correlation, Energy, and Homogeneity. Four types of disease were detected using 20 training images for each type of disease, producing the training data shown in Table 1.
The stages of the GLCM training implementation are as follows. After the Matlab program is run, it offers 3 process buttons: select image, process segmentation, and feature extraction. The first process is "select image": the leaf image to be extracted is selected from an existing folder on the computer. To speed up the search over image input parameters, the initial input image is resized from its original size to 32x32 pixels in RGB color. The Matlab application is used to detect diseases of rice leaves. Its controls are Select Image (Pilih Gambar), GLCM Extraction (Ekstraksi GLCM), Segmentation (Segmentasi), Feature Extraction (Ekstraksi Fitur), Result (Hasil), and Reset, as shown in Figure 9.
The "Segmentation" button shows the segmentation result for the original image. At this stage, the conversion to grayscale is carried out. The segmented image is a binary image in which the desired object is white (1), while the background to be eliminated is black (0). The first step of the segmentation process is to take the original image and filter it with λ = 4 and θ = 45. After that, a thresholding operation with a threshold value of 1000 is applied to the magnitude image. Once the segmented binary image is obtained, the segmentation result is visualized against the original image.
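The thresholding step of the segmentation can be sketched as follows. This is a minimal pure-Python illustration that starts from an already computed magnitude image. The filtering step with λ and θ is not reproduced here. The strict "greater than" comparison, the function names, and the sample values are assumptions.

```python
def threshold_mask(magnitude, threshold=1000):
    """Binary mask: 1 (white, object) above the threshold, else 0 (black)."""
    return [[1 if v > threshold else 0 for v in row] for row in magnitude]


def apply_mask(image, mask, background=0):
    """Keep pixels where the mask is 1 and replace the rest with background."""
    return [
        [px if m else background for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Hypothetical filter magnitudes and gray image of the same size.
magnitude = [[500, 1500], [1000, 2000]]
gray = [[10, 20], [30, 40]]
mask = threshold_mask(magnitude)
segmented = apply_mask(gray, mask)
print(mask)
print(segmented)
```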
In Matlab, the conversion to grayscale can be done with the command dst = rgb2gray(src); where "dst" is the output variable holding the grayscale image and "src" is the input RGB image variable. The result is shown in Figure 10.
After the input image has been loaded from the folder on the computer and has passed through the segmentation and resizing stages, the next step is to convert the RGB image to grayscale. This process simplifies each pixel, which initially has three values (R, G, and B), into a single gray value. The equation used for the grayscaling process is

$$\text{Gray} = 0.12 \, R + 0.72 \, G + 0.07 \, B$$

Next, pressing the "GLCM Extraction" button displays the extraction result for each feature, as shown in Figure 11.
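The weighted grayscale conversion described above can be sketched as follows. This is a minimal pure-Python illustration in which the default weights are the ones stated in the text (0.12, 0.72, 0.07). These sum to 0.91, so pure white maps to about 232 rather than 255. The commonly quoted luminosity weights are 0.21, 0.72, and 0.07, and Matlab's rgb2gray documents 0.2989, 0.5870, and 0.1140. The function names and sample pixels are hypothetical.

```python
def to_gray(pixel, weights=(0.12, 0.72, 0.07)):
    """Weighted sum of the R, G, and B values of one pixel."""
    r, g, b = pixel
    wr, wg, wb = weights
    return wr * r + wg * g + wb * b


def rgb_to_gray(image, weights=(0.12, 0.72, 0.07)):
    """Convert a 2D list of (R, G, B) tuples to rounded gray values."""
    return [[round(to_gray(px, weights)) for px in row] for row in image]


# Hypothetical 1x2 RGB image: one white pixel and one black pixel.
rgb = [[(255, 255, 255), (0, 0, 0)]]
print(rgb_to_gray(rgb))
print(rgb_to_gray(rgb, weights=(0.21, 0.72, 0.07)))
```

Passing the weights as a parameter makes it easy to compare the coefficients given in the text with the alternatives mentioned above.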

CONCLUSION
Based on the results of the research and system testing of the feature extraction process using the GLCM method, the authors conclude that disease detection on the leaves of rice plants, using a total sample of 40 images with 40 training images and 20 test images, achieved an accuracy of 90%. The accuracy falls short of 100% because of several factors that affect the extraction result, one of which is error in the detection process caused by poor color detection in the images. The advantage of this research over previous research is its use of the GLCM method, which is an accurate method for extracting texture features. The authors suggest that further research be carried out using other methods to extract the characteristics of rice leaves.