Image Analysis using Color Co-occurrence Matrix Textural Features for Predicting Nitrogen Content in Spinach

This study aimed to determine the nitrogen content of spinach leaves using computer imaging technology. Color Co-occurrence Matrix (CCM) texture analysis was applied to recognize the pattern of nitrogen content in spinach leaves. The texture analysis consisted of 40 CCM textural features constructed from RGB and grey colors. From the 40 textural features, the best feature subset was selected using a feature selection method. Feature selection can increase the accuracy of image analysis when an ANN model is used to predict the nitrogen content of spinach leaves. The combination of ANN with Ant Colony Optimization gave the best model, with a validation mean square error of 0.0000083 and R² = 0.99 on the testing-set data, using 10 CCM textural features as the ANN inputs. The computer vision method developed with the ANN model can be used as a non-invasive sensing device to predict the nitrogen content of spinach and to guide farmers in the accurate application of their nitrogen fertilization strategies using low-cost computer imaging technology.


Introduction
Leaf nitrogen content is an important parameter that affects photosynthesis and net primary production. The development of an accurate system for predicting plant nitrogen status is critical for effective nitrogen application management in precision farming. The methods that can be used for non-destructive measurement of leaf nitrogen content are spectrometric measurements using spectral indices [1] and digital image analysis, where leaves or other plant parts are analyzed using image analysis algorithms. Spectrometers for field measurements are hard to come by and too expensive for real implementation in the field. Digital cameras, in contrast, are now widely available, and computer vision technology has been widely developed in agriculture.
Digital image processing has been used with success in crop management and nitrogen stress detection [2,3,4]. Jia [5] and Pagola [6] successfully applied digital cameras to assess the nitrogen status of winter wheat and barley. Baresel [7] conducted a study on the use of digital consumer cameras for the non-destructive detection of N nutrient status. The results showed similar measurement accuracy between image analysis and spectrometry. Borhan [8] developed a computer-imaging system to predict chlorophyll content in potato leaves. Li [9] tested the performance of spectral indices to predict nitrogen status in real time in the field and applied it to an accurate nitrogen fertilization system. Among the research that has been done, none has designed a machine vision system that can be used in real time in the field to measure the nitrogen content of spinach leaves (Amaranthus sp.). A low-cost machine vision system is needed by spinach farmers to predict fertilizer needs easily and accurately. Digital images are generally stored in 24-bit RGB form, and RGB color can be converted to grey color, which many researchers have used in analyzing texture [10,11]. In this research, 40 textural features were built from RGB and grey Color Co-occurrence Matrices (CCM). CCM texture analysis has never been used to predict the nitrogen content in spinach leaves. It is therefore necessary to examine RGB and grey CCM texture analysis to determine its effectiveness in predicting nitrogen content in spinach leaves.
The selection of appropriate techniques for obtaining a strong prediction model of leaf nitrogen content is an important task. Compared with statistical methods, Artificial Neural Networks (ANN) have a greater capacity to analyze data, especially when the features are complex and the data do not all follow the same distribution pattern [12]. Moghaddam [13] developed an ANN model to estimate nitrogen in sugar beet leaves with reasonable accuracy using conventional digital cameras. The results showed that the ANN model had a higher accuracy than the linear regression model. In the area of leaf nitrogen content assessment for precision farming, ANN has been successfully applied to predict leaf nitrogen content in various plants [14,15,16,17,18]. ANN performance can be enhanced by selecting appropriate input parameters, since proper input parameters result in a better ANN model. This selection of input parameters is called feature selection. Features that perform poorly individually can be highly relevant and show high performance when combined with other features [19]. Many studies have shown the benefit of using bio-inspired algorithms for feature selection [20,21,22,23]. Hybrid methods that combine ANN for modeling with bio-inspired optimization for selecting the ANN input parameters have never been applied to model the nitrogen content of spinach leaves. The resulting model can be used to develop a low-cost nitrogen content detection tool for spinach plants.
The aim of this work was to develop a suitable computer-imaging method for non-invasive sensing to estimate nitrogen content in spinach leaves, which is easy to apply and use. RGB and grey CCM textural features were used as the image parameters, ANN was used for modeling, and three bio-inspired feature selection methods were compared: Ant Colony Optimization, Discrete Particle Swarm Optimization, and Genetic Algorithm. Vakilian and Massah [24] developed a farmer-assistant robot for nitrogen fertilizing management, but it is costly and so far only applicable to cucumber plants. Du [25] developed a nitrogen content analysis tool using hyperspectral LiDAR, but this tool is also costly and cannot be applied in real time. No research has yet developed a nitrogen content measurement tool for spinach using a low-cost conventional digital camera. Chlingaryan [26] reviewed the use of machine learning in estimating yield and nitrogen status in plants, but there has never been a specific study analyzing the use of machine learning to predict the nitrogen content in spinach leaves. The novelty of this research lies in the use of machine vision to predict the nitrogen content in spinach leaves with ANN as the machine learning method. Sharif [27] developed feature selection algorithms to select the best textural features for detecting disease in citrus. The feature selection method used there was the filter method. According to Galiano [28], however, the wrapper method is more effective than the filter method. This study presents the use of bio-inspired feature selection (wrapper method) to select relevant CCM textural features, combined with ANN, to predict the nitrogen content in spinach leaves.

Research Method

Materials and Equipment
Spinach leaves were collected from an experimental field of the Department of Agricultural Engineering, Universitas Brawijaya, Indonesia. A total of 300 samples of spinach leaves were randomly selected from spinach plants grown in an experimental plot with various nitrogen contents. As a means of manipulating their physiological status, spinach plants in polybags were divided into treatment groups receiving different doses of a single urea fertilizer as the source of nitrogen. The doses given to the plants were 20%, 40%, 60%, 80% and 100%. The fertilizer was given three weeks after planting with a feeding once every week until the plant was four weeks old. Fertilization was done by dissolving the fertilizer in 40 ml of water per percent and then spraying it on each polybag. Images of the spinach leaves were taken using a digital camera (Nikon Coolpix A10, 16 megapixels, Japan) placed in a black box with a white surface background and constant fluorescent lighting distributed evenly over the spinach leaf, which lay directly under the vertically mounted camera as shown in Figure 1.

Model of Study
The first process was image acquisition, in which the images of spinach leaves (300 images in total) were captured using a digital camera placed 300 mm from, and perpendicular to, the sample surface. Images were captured at the maximum resolution (4608 x 3456 pixels). Uniform lighting conditions were one of the important factors [29]. Imaging was done under controlled and well-distributed light conditions without any shadow. Light was provided by two 22 W lamps (EFD25N/22, National Corporation, Japan). Light intensity over the spinach leaf surface was uniform at 300 lux in the centre of the region during image acquisition. Image processing software was developed using Visual Basic 6.0 to separate the object from the background. Image analysis was performed using software specially developed for this purpose in Visual Basic 6.0 to extract image features. The image features, which consisted of RGB and grey CCM textural features, were extracted from each image. Analysis of nitrogen content was done by the Kjeldahl method [30].
Modelling was done using a Back Propagation Neural Network (BPNN) to describe the relationship between CCM textural features and the nitrogen content of spinach leaves. Supervised prediction was used to predict the nitrogen content of spinach leaves from these image features. The present work assessed nitrogen in spinach leaves by comparing two analytical methods, i.e. image analysis (non-destructive sensing) and conventional nitrogen measurement (destructive sensing). Relevant textural features were selected using wrapper methods [31,32]. The wrapper methods consisted of: 1. Neural-Ant Colony Optimization (N-ACO); 2. Neural-Discrete Particle Swarm Optimization (N-DPSO); 3. Neural-Genetic Algorithms (N-GA). Multi Objectives Optimization (MOO) concerns optimization problems with multiple objectives [33]. The fitness of a feature subset x was calculated from two terms: MSE(x), the Mean Square Error on the validation-set data of the BPNN using only the expression values of the selected image features in subset x, and IF(x), the number of selected image features in x. Here ft was the total number of image features, and weight1 and weight2 were two priority weights corresponding to the importance of the accuracy and of the number of selected image features, respectively, where weight1 = 0.6 and weight2 = 0.4. In this study, the accuracy was more important than the number of selected image features in a feature subset.
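The fitness described above can be illustrated with a minimal sketch. It assumes that the two objectives are combined as a weighted sum, with the feature count normalized by the total number of features; this form is an assumption that is consistent with the definitions of MSE(x), IF(x), ft, weight1 and weight2, and the function name `moo_fitness` is hypothetical.

```python
def moo_fitness(mse_validation, n_selected, n_total=40,
                weight1=0.6, weight2=0.4):
    """Weighted-sum fitness of a feature subset (lower is better).

    Assumed form: weight1 * MSE(x) + weight2 * IF(x) / ft.
    mse_validation -- validation-set MSE of the BPNN for subset x
    n_selected     -- IF(x), number of selected image features
    n_total        -- ft, total number of image features
    """
    if not 0 < n_selected <= n_total:
        raise ValueError("n_selected must be between 1 and n_total")
    return weight1 * mse_validation + weight2 * (n_selected / n_total)


# Example: a 10-feature subset with a validation MSE of 0.001
print(moo_fitness(0.001, 10))
```

With weight1 larger than weight2, a reduction in validation error is rewarded more strongly than the removal of a feature, which matches the stated priority on accuracy.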

Texture Analysis
Texture analysis can be considered one of the applicable techniques for extracting image features. The CCM procedure consists of three primary mathematical processes: 1) conversion of images from the RGB color space to a grey color representation [34]; 2) development of Spatial Grey-Level Dependence Matrices (SGDMs) [35], resulting in one CCM for each color space (RGB and grey), where the CCM is calculated based on normalized values; and 3) determination of ten Haralick textural features [36].
Based on the results of preliminary observations with various combinations of angle (θ = 0, θ = 45, θ = 90, θ = 135) and distance (d = 1, d = 2, d = 3), it was shown that the combination of angle θ = 0 and distance d = 1 performed better than the other combinations of θ and d in identifying total nitrogen content in spinach leaves. Therefore, in this study, CCM textural features were extracted at those values of θ and d. A total of 40 CCM textural features were extracted (10 textural features each for R, G, B, and grey).
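As an illustration of the co-occurrence step at θ = 0 and d = 1, the following sketch builds a normalized co-occurrence matrix for one channel and derives four textural features from it. It is a simplified sketch, not the software used in this study: the symmetric counting, the natural logarithm in the entropy, and the function names are assumptions, and only some of the ten features are shown.

```python
import numpy as np


def cooccurrence_matrix(channel, levels=256, symmetric=True):
    """Normalized co-occurrence matrix for theta = 0 degrees, d = 1.

    channel is a 2-D array of integer intensities in [0, levels).
    Each pixel is paired with its right-hand neighbour.
    """
    channel = np.asarray(channel, dtype=np.intp)
    left = channel[:, :-1].ravel()
    right = channel[:, 1:].ravel()
    counts = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(counts, (left, right), 1.0)   # accumulate pair counts
    if symmetric:                           # assumption: count both directions
        counts = counts + counts.T
    return counts / counts.sum()


def texture_features(p):
    """A few Haralick-style features from a normalized matrix p."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "contrast": float(np.sum((i - j) ** 2 * p)),
        "entropy": float(-np.sum(nonzero * np.log(nonzero))),
        "inverse_difference_moment": float(np.sum(p / (1.0 + (i - j) ** 2))),
        "maximum_probability": float(p.max()),
    }


# Tiny 2-level example; a real image channel would use levels=256
p = cooccurrence_matrix([[0, 0], [1, 1]], levels=2)
print(texture_features(p))
```

Applying the same two functions to the R, G, B and grey channels separately would give one feature set per channel, in line with the 10 features per channel described above.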

Neural-Ant Colony Optimization (N-ACO)
Ant Colony Optimization (ACO) [37] is inspired by the foraging behaviour of real ants. The steps of N-ACO were as follows:
1. Set the initial parameters, i.e. the ant population (a1, a2, a3, …, ana) in which na = 60; global iterations = 500; the heuristic (ηιψ), which was defined as the inverse of the BPNN MSE between two features (ι, ψ); the intensity of the pheromone trail level (τ = 100); the number of best selected ants (k = 4); the pheromone constant (α = 1); the heuristic constant (β = 1); and the evaporation rate of pheromone ρ ∈ [0, 1].
2. Generate ants for solution generation. For ant a, the probability pιψ of moving from state ι to state ψ depended on the combination of two values, i.e. the heuristic η of the move and the pheromone trail level τ of the move. The probability pιψ was equal to 0 for all infeasible moves; otherwise it was computed from τ and η, where α and β were user-defined parameters (0 < α < 1; 0 < β < 1). The parameters α and β controlled the relative importance of the trail and the attractiveness, respectively.
If an ant was not able to decrease the MSE in ten successive steps, it had to finish its work and exit. Each ant consisted of a feature subset with the selected features as the ant path (e.g. a1: 0,1,1,0,0,0,0,1,0,0,1,0,…, m), where m was the total number of features (40). Each ant in the population represented a candidate solution to the feature subset selection problem. A value of 0 indicated that the corresponding feature was not selected and was not added as an input of the BPNN, while a value of 1 meant that the feature was selected and was added as an input of the BPNN.
3. Evaluation of ants (a). The ants (feature subsets) were evaluated using the BPNN. Each ant solution ($T_{ant}$) was calculated according to the objective function of the evolved subset of features. The BPNN inputs were the feature subsets.
4. Update the global best solution ($T_{best}$) with the current ant solution ($T_{ant}$). The objective function was MOO.
5. Pheromone updating. An iteration was defined here as the interval (t, t+1) in which each ant moves once. An epoch was defined as every n iterations, when each ant had completed a tour. After each epoch the pheromone trail intensity was updated using ∆τιψ, which represented the sum of the contributions of all best k ants that used move (ι, ψ) to construct their solution between time t and t+1. The feature subsets of the best k ants were used for this update. In the first iteration, each ant randomly chose a feature subset of m features. Only the best k subsets, k < na, were used to update the pheromone trail and influence the feature subsets of the next iteration.
6. Generation of new ants. In this step the previous ants were removed and new ants were generated.
7. Stopping criterion. The algorithm stopped with the total-best solution $T_{TB}$. The search terminated when the global iteration limit had been reached.
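For orientation, the transition probability in step 2 and the pheromone update in step 5 can be written in the form commonly used in the ACO literature. The expressions below are a sketch under that assumption: they use the symbols defined in the steps above (τ, η, α, β, ρ, ∆τ), but the exact form used in this study, including the treatment of ρ as an evaporation factor and the index ω over feasible moves, is assumed rather than taken from the text.

```latex
% Assumed standard ACO transition rule for a feasible move (\iota, \psi)
p_{\iota\psi} =
  \frac{\tau_{\iota\psi}^{\alpha}\,\eta_{\iota\psi}^{\beta}}
       {\sum_{\omega\,\in\,\text{feasible}} \tau_{\iota\omega}^{\alpha}\,\eta_{\iota\omega}^{\beta}}

% Assumed pheromone update after each epoch, with evaporation rate \rho
\tau_{\iota\psi}(t+1) = (1-\rho)\,\tau_{\iota\psi}(t) + \Delta\tau_{\iota\psi}
```

In this form, a larger α gives more weight to the accumulated trail and a larger β gives more weight to the heuristic, while ∆τιψ collects the deposits of the best k ants only.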

Neural-Discrete Particle Swarm Optimization (N-DPSO)
Pan [38] presented a Discrete Particle Swarm Optimization (DPSO) algorithm to tackle binary/discrete spaces. The steps of N-DPSO were as follows:
1. Generate a population of particles, $pso^n = [pso_1^n, pso_2^n, \ldots, pso_{np}^n]$, where np was the number of particles (np = 60) and n was the global iteration, with a maximum of 500. Each particle defined a feature subset as a binary vector of dimension 40. Each particle in the swarm population had the following attributes: a current position represented as $pso_i^n = [pso_{i1}^n, pso_{i2}^n, \ldots, pso_{im}^n]$, e.g. $pso_1^n$ = 0,1,1,0,0,0,0,1,0,0,1,0,…, m, where m was the total number of features (40); a current personal best position represented as $p_i^n = [p_{i1}^n, p_{i2}^n, \ldots, p_{im}^n]$; and a current global best position represented as $g_i^n = [g_{i1}^n, g_{i2}^n, \ldots, g_{im}^n]$.
2. Evaluate the particles of the population. Each particle ($pso_i^n$) was evaluated according to the objective function using the BPNN on the evolved subset of features.
3. Find the personal best position. The personal best position of each particle was updated.
4. Find the global best position. The feature subset ($pso^n$) was evaluated. The fitness function of $pso^n$ was defined as the MOO. The global best position was then updated.
The update of the particles of the population consisted of three components. One component was the social part of the particle, representing the collaboration among particles. CR represented the crossover operator between $b_i^n$ and $g^n$ with the probability $c_2 \in [0, 1]$. Here crossover was performed as two-point crossover. The two crossover points (point1 and point2) were selected randomly, where point1 < point2, point1 > 1, and point2 < m.
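The two-point crossover and the mutation operator mentioned above can be sketched on binary feature vectors as follows. This is an illustrative sketch only: the function names are hypothetical, the mapping of the conditions point1 > 1 and point2 < m onto 0-based list indices is an assumption, and mutation is shown as an independent bit flip per gene.

```python
import random


def two_point_crossover(parent_a, parent_b, rng=random):
    """Copy parent_a and replace the segment between two random cut
    points with the corresponding genes of parent_b."""
    m = len(parent_a)
    if m != len(parent_b) or m < 4:
        raise ValueError("parents must have equal length of at least 4")
    # Cut points satisfy 1 <= point1 < point2 <= m - 1 (0-based slices),
    # so the first and last genes always come from parent_a.
    point1, point2 = sorted(rng.sample(range(1, m), 2))
    return parent_a[:point1] + parent_b[point1:point2] + parent_a[point2:]


def flip_mutation(bits, rate, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


rng = random.Random(0)                      # fixed seed for repeatability
current = [0] * 40                          # e.g. a particle position
best = [1] * 40                             # e.g. a best position
child = two_point_crossover(current, best, rng)
print(sum(child), "genes inherited from the best position")
```

Because the operators work directly on 0/1 vectors, every offspring remains a valid feature subset of the same length.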

Neural-Genetic Algorithms (N-GA)
N-GA [39] was a combination of Genetic Algorithms (GA) and the ANN method for predicting nitrogen content from the selected CCM textural features. The steps were as follows:
1. Generate a random population in which the individuals (number of individuals ni = 60), characterized by chromosomes, represented a set of possible solutions (e.g. ga1: 0,1,1,0,0,0,0,1,0,0,1,0,…, m), where m was the total number of features (40). Each chromosome contained 40 genes, one gene for each feature, each of which can take one of two values (0 or 1).
2. Compute the fitness function, which reflected the degree of goodness of the individuals for the problem, and evaluate the fitness of all individuals of the population using MOO.
3. Select the fittest individuals to be parents for reproducing offspring using the roulette wheel selection strategy.
4. Create offspring with two-point crossover (crossover rate ∈ [0, 1]) and mutation operators (mutation rate = 0.1) by changing the selected individuals during the mating periods. The two crossover points (point1 and point2) were selected randomly, where point1 < point2, point1 > 1, and point2 < m.
5. Replace the parents with good offspring to compose the subsequent generation according to the probability of the best chromosome, which was set to 0.2.
6. Stopping criterion. The search terminated when the iteration count had reached 500. The optimal individual was defined as the best individual in the last population.
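Roulette wheel selection in step 3 can be sketched as below. Since the MOO fitness is minimized, the sketch assumes that the selection weight of an individual is the inverse of its fitness; this weighting, the small constant that avoids division by zero, and the function name are assumptions for illustration.

```python
import random


def roulette_wheel_select(population, fitness_values, rng=random):
    """Pick one individual with probability proportional to 1 / fitness
    (assumed weighting for a fitness that is minimized)."""
    if len(population) != len(fitness_values) or not population:
        raise ValueError("population and fitness_values must match")
    weights = [1.0 / (f + 1e-12) for f in fitness_values]
    return rng.choices(population, weights=weights, k=1)[0]


rng = random.Random(1)                      # fixed seed for repeatability
population = [[0, 1, 1, 0], [1, 1, 1, 1], [1, 0, 0, 0]]
fitness_values = [0.12, 0.41, 0.10]         # illustrative values only
parent = roulette_wheel_select(population, fitness_values, rng)
print(parent)
```

Individuals with a lower fitness value occupy a larger share of the wheel, but weaker individuals can still be chosen, which keeps some diversity in the parent pool.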

Results and Analysis
The main assumption underlying this study was that changes in the external textural appearance of spinach leaves caused by nitrogen content can be detected by visible light imaging techniques. Bangun [40] demonstrated the effectiveness of using external appearance through image analysis for measuring biological features (identifying Jabon's leaf pathogen), while Ahmad [41] demonstrated the effectiveness of using image analysis for predicting the ripeness of melon. Figure 2 shows some examples of spinach leaves with various nitrogen contents. The color characteristics of the leaves were almost the same. Therefore, textural features could be a more suitable basis for analysing the nitrogen content in spinach leaves.
Figure 3 shows the ANN modeling results for predicting nitrogen content in spinach leaves using CCM textural features without feature selection. With all 40 CCM textural features used as ANN inputs, the best result had an average training MSE of 0.000185 and a validation MSE of 0.00505. However, from the 40 CCM textural features extracted from RGB and grey color, it was important to select only a relevant feature subset to increase the accuracy of the ANN model. In this study three feature selection methods were compared.

Neural-Ant Colony Optimization
N-ACO is a feature selection method that combines ANN for modelling with ACO as the optimization algorithm. Figure 4 shows the best result of N-ACO from the sensitivity analysis, which had been done in preliminary research using various learning rates α ∈ [0, 1] and momentum values m ∈ [0, 1] of the BPNN and various pheromone evaporation rates ρ ∈ [0, 1] of ACO. The fitness function was MOO, minimizing the validation MSE and the number of features in the subset. The result showed that N-ACO worked effectively to optimize the fitness function. The fitness improved throughout the iterations. This is in line with the research conducted by Zhang and Li [42], which stated that ACO has a fast convergence speed and achieves the optimal condition within 50 to 100 generations.
The best performance of N-ACO was shown by a feature subset of 10 CCM textural features, i.e. green sum mean, grey contrast, red sum mean, red entropy, red maximum probability, red inverse difference moment, blue contrast, green maximum probability, green homogeneity, and red cluster tendency. With the selected feature subset, the training MSE was 0.000094 and the validation MSE was 0.0000083. Figure 5 indicates that the learning process in the BPNN was effective, since the resulting error decreased as the iterations increased. Figure 6 shows the result on the testing data, i.e. the relationship between actual and predicted data obtained from N-ACO. The results also showed that N-ACO had a high accuracy, with R² = 0.998.
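The two accuracy measures reported here, MSE and R², can be computed from actual and predicted values as in the following sketch. It assumes that R² is the coefficient of determination; if R² instead denotes the squared correlation coefficient between actual and predicted values, the result can differ slightly. The sample values are illustrative only and are not data from this study.

```python
import numpy as np


def mean_square_error(actual, predicted):
    """Mean of the squared differences between actual and predicted."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Illustrative normalized nitrogen values, not measurements
actual = [0.20, 0.35, 0.50, 0.65, 0.80]
predicted = [0.21, 0.34, 0.52, 0.64, 0.79]
print(mean_square_error(actual, predicted), r_squared(actual, predicted))
```

Both measures would be evaluated on data not used for training, so that they reflect the predictive ability of the model rather than its fit to the training set.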

Neural-Discrete Particle Swarm Optimization
N-DPSO is a feature selection method that combines ANN for modelling with DPSO as the optimization algorithm. In Particle Swarm Optimization (PSO), the swarm size significantly affects the performance of the optimization. A large swarm size may increase the time complexity of PSO, while a small swarm size can trap particles in local optima. Alfarisy [43] observed that the best swarm size should be above 50. Therefore, in this study the swarm size was 60. Figure 7 shows the best result of N-DPSO from the sensitivity analysis, which had been done in preliminary research using various learning rates α ∈ [0, 1] and momentum values m ∈ [0, 1] of the BPNN and various crossover rates in [0.1, 0.9] and mutation probabilities in [0.1, 0.9] of DPSO. Figure 7 indicates that the optimization process in N-DPSO was effective, because the fitness value decreased as the epochs increased.
The best performance of N-DPSO was shown by a feature subset of 9 CCM textural features, i.e. green maximum probability, red sum mean, green sum mean, blue contrast, grey correlation, grey contrast, green homogeneity, blue entropy, and red maximum probability. With the selected feature subset, the training MSE was 0.00018 and the validation MSE was 0.000011. The BPNN learning process was effective, as shown in Figure 8. Figure 9 shows the result on the testing-set data, i.e. the relationship between actual and predicted data obtained from N-DPSO, with a high accuracy of R² = 0.997, slightly below that of N-ACO.

Neural-Genetic Algorithms
N-GA is a feature selection method that can be categorized as a wrapper method. Figure 10 shows the best result of N-GA from the sensitivity analysis, which had been done in preliminary research using various learning rates α ∈ [0, 1] and momentum values m ∈ [0, 1] of the BPNN and various crossover rates in [0.1, 0.9] and mutation rates in [0.1, 0.9] of GA. The fitness function used in N-GA was the same as in N-ACO and N-DPSO. The result showed that N-GA worked effectively to optimize the fitness function, since the fitness value decreased with the iterations.
The best performance of N-GA was shown by a feature subset of 8 CCM textural features, i.e. blue contrast, green maximum probability, red sum mean, grey correlation, grey contrast, red inverse difference moment, blue sum mean, and grey cluster tendency. With the selected feature subset, the training MSE was 0.00039 and the validation MSE was 0.000038. The learning process of the BPNN was effective, since the MSE decreased throughout the iterations as shown in Figure 11. Figure 12 shows the relationship between actual and predicted data obtained from N-GA. The results also showed that N-GA had a high accuracy, with R² = 0.993, slightly below N-DPSO and N-ACO.
Based on statistical analysis, there was a significant difference between the ANN models before and after feature selection. Feature selection methods can increase the performance of image analysis [44]. The smallest feature subset was reached by N-GA with a total of 8 features, followed by N-DPSO and N-ACO with 9 and 10 features, respectively. The best prediction performance was obtained by N-ACO with a training MSE of 0.000094 and a validation MSE of 0.0000083, followed by N-DPSO (training MSE of 0.00018 and validation MSE of 0.000011) and N-GA (training MSE of 0.00039 and validation MSE of 0.000038). N-ACO was superior to the other wrapper methods because only N-ACO involved heuristic information in selecting the feature subset. This heuristic information greatly helped N-ACO in finding the best feature subsets. Research conducted by Zhou and Hu [45] also showed the superiority of ACO in solving an image clustering optimization problem, in which ACO had the fastest optimization speed with high optimization quality. The research on MOO problems by Li and Tian [46] showed that ACO had better performance than evolutionary algorithms such as GA and PSO. ACO can converge on the fitness function of MOO more quickly and accurately, and can also maintain the distribution of the better solutions. Baizal [47] compared single-objective and multi-objective optimization for generating travel itineraries. The result showed that ACO can solve both single- and multi-objective optimization problems with no significant difference. The feature subset obtained from N-ACO can be applied to develop an ANN structure that models the relationship between the CCM textural feature subset and nitrogen content in spinach leaves. The resulting ANN structure can be seen in Figure 13, using the weight values in Table 1 and Table 2.
Table 1 shows the optimum weights from the input layer to the hidden layer obtained from the BPNN model. Table 2 shows the optimum weights from the hidden layer to the output layer obtained from the BPNN model. The BPNN model was tested successfully to describe the relationship between nitrogen content in spinach leaves and CCM textural features. This indicates that CCM textural features can be good indicators for predicting nitrogen content in spinach leaves. It is in line with the research conducted by Li [48], which showed that textural features gave good performance and promoted the practicality of machine vision technology for grading Dendrobium Officinale. The combination of image analysis and ANN gave good results in pattern recognition [49]. Furthermore, the research conducted by Liping [50], which examined the use of image analysis to model aeration classification in sewage treatment, concluded that image analysis combined with the ANN method is an effective means of predicting biological measurements. The ANN model showed a highly accurate result.

Conclusion
A study was conducted to evaluate the effectiveness of computer imaging technology as a non-invasive sensing method for predicting leaf nitrogen content. In this study, textural features from RGB and grey color spaces were used to model nitrogen content in spinach leaves. A Back-Propagation Neural Network (BPNN) was tested successfully to describe the relationship between RGB and grey color co-occurrence matrix (CCM) textural features and nitrogen content in spinach leaves. Several feature selection methods showed good performance in selecting the best feature subset. Based on the validation results, the ANN model improved significantly after feature selection compared with the model without it. Neural-Ant Colony Optimization had the best performance in selecting relevant CCM textural features. The artificial neural network model obtained from this study can be applied for real-time monitoring of nitrogen content in spinach leaves to optimize fertilization strategies.

Figure 1. Design of image acquisition system.


ISSN: 1693-6930 TELKOMNIKA Vol. 16, No. 6, December 2018: 2712-2724
For j = 1 to k
the velocity of the particle. $F_\rho$ represented the mutation operator with the mutation strength of ρ and the mutation probability of w (w = 0.5). The second component was the part of the particle representing the private thinking of the particle itself. CR represented the crossover operator between $a_i^n$ and $p_i^{n-1}$ with the probability $c_1 \in [0, 1]$.

Color Co-occurrence Matrix Textural Features... (Yusuf Hendrawan)
6. Stopping criterion. The search terminated when the global iteration n had been reached.

Figure 4. Performance of N-ACO for feature selection.

Figure 5. BPNN learning iteration process using the input of 10 features selected by N-ACO.

Figure 6. Testing result between actual and predicted data using N-ACO.

Figure 7. Performance of N-DPSO for feature selection.

Figure 8. BPNN learning iteration process using the input of 9 features selected by N-DPSO.

Figure 10. Performance of N-GA for feature selection.

Figure 11. BPNN learning iteration process using the input of 8 features selected by N-GA feature selection.

Figure 12. Testing result between actual and predicted data using N-GA feature selection.

Figure 13. ANN structure with the input: (a) normalized green sum mean; (b) normalized grey contrast; (c) normalized red sum mean; (d) normalized red entropy; (e) normalized red maximum probability; (f) normalized red inverse difference moment; (g) normalized blue contrast; (h) normalized green maximum probability; (i) normalized green homogeneity; (j) normalized red cluster tendency; the ANN output: (y) normalized nitrogen content.