An improvement of Gram-negative bacteria identification using convolutional neural network with fine tuning

ABSTRACT


INTRODUCTION
Pneumonia, also known as wet lung, is an infection that inflames the air sacs in one or both lungs. In pneumonia patients, the clusters of small air sacs at the end of the respiratory tract (alveoli) become infected and fill with fluid or pus. As a result, sufferers experience shortness of breath, cough with phlegm, fever, or chills [1]. Pneumonia is a respiratory disease that attacks the lower part of the lung and is mostly caused by bacteria. Gram-negative pneumonia is a lung infection caused by Gram-negative bacteria [2]. The word Gram comes from the name of the inventor of the staining method, Hans Christian Gram [3].
In 2014 the Indonesian Health Research and Development Agency conducted a national survey of causes of death called the survey registration sample (SRS). According to Widowati (2015), the data collected in Indonesia covered 41,590 deaths throughout 2014. The data show that the ten leading causes of death include ischemic heart disease, complications of diabetes mellitus, respiratory tuberculosis, hypertension with complications, respiratory tract infections, especially chronic obstructive pulmonary disease (COPD), liver disease, traffic accidents, pneumonia, and diarrhea or gastroenteritis, some of which originate from infection [4]. The results of primary health research in 2018 show an increase in the prevalence of pneumonia from 1.6% to 2% [5]. As described by Luna et al. (2001), doctors determine the cause of pneumonia in several steps. First, X-rays show which parts of the lungs are affected by the disease. Second, blood tests or sputum tests reveal the bacteria or viruses that cause the disorder. Third, blood oxygen levels are examined. If there are severe symptoms, the doctor will request a CT scan and a lung fluid culture [6].
However, identifying the bacteria still relies on visual observation by medical analysts. Based on this, the contribution of this research is to replace visual observation with image processing based on machine learning. The gap in previous research is that optimal accuracy has not yet been obtained at the bacterial identification stage. This research uses primary patient data from Soetomo Hospital because not enough secondary data are available on the internet. The convolutional neural network approach uses three stages: dropout, data augmentation, and finding the right fine-tuning. The technical novelty is to optimize the number of parameters and improve the accuracy of the CNN. In addition, this research uses lightweight software based on TensorFlow and Keras in Python, with support from a graphics processing unit.

THE PROPOSED METHOD AND ALGORITHM
This section describes the biological procedures used to obtain the photographs and the methods used to process them.

Gram staining
Gram-negative bacteria are bacteria that cannot retain the crystal violet dye during Gram staining, so they appear red when observed under a microscope [7]. The distinction between Gram-negative and Gram-positive bacteria is based on differences in cell wall structure and can be determined using the Gram staining procedure [8].

Extraction of shape features in image processing
Digital image processing, as described by Cromey (2013), is a field of research that studies how an image is acquired, processed, and analyzed so that it yields information that humans can understand [9]. The perimeter is the boundary of an object, calculated from the number of pixels around the object. It is calculated using a ratio between the circumference (P) and the length (Lp) and width (Wp).
The area is obtained by counting the number of pixels inside the object in the sample image. The metric is a form factor that measures how close the object is to a circle (roundness). Slimness is the ratio of length to width.
In the metric, A is the area of the object and p is its circumference [10]. Eccentricity describes how elongated the ellipse fitted to the object is, based on the distance from the center of the ellipse to its focus. Eccentricity values range between 0 and 1 [11]. The calculation is illustrated in Figure 1 and given by $e = c/a$ (3), where e is the eccentricity, c is the distance from the center to the focus with $c^2 = a^2 - b^2$, a is the major axis, and b is the minor axis, both measured from the center. If the eccentricity approaches 1, the object is oval or elongated, tending toward a line; if the object is round, the eccentricity is close to 0 [12].
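The shape features above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the metric is assumed to follow the common roundness definition $4\pi A / p^2$, which the text does not spell out. The eccentricity follows equation (3).

```python
import math


def metric(area, perimeter):
    """Roundness form factor. ASSUMPTION: the common definition 4*pi*A / p**2,
    which equals 1.0 for a perfect circle and decreases for elongated shapes."""
    return 4.0 * math.pi * area / perimeter ** 2


def slimness(length, width):
    """Ratio of object length to object width."""
    return length / width


def eccentricity(a, b):
    """Equation (3): e = c / a with c**2 = a**2 - b**2.
    a and b are the major and minor axes measured from the center (a >= b > 0)."""
    c = math.sqrt(a * a - b * b)
    return c / a


# A circle of radius 10: roundness close to 1, eccentricity 0.
r = 10.0
circle_metric = metric(math.pi * r ** 2, 2.0 * math.pi * r)
circle_ecc = eccentricity(r, r)

# A rod-like object (a = 5, b = 1): eccentricity close to 1.
rod_ecc = eccentricity(5.0, 1.0)
rod_slimness = slimness(10.0, 2.0)
```

Under these assumptions, round cocci-like objects would cluster near an eccentricity of 0 and rod-shaped bacilli near 1, which is what makes the feature useful for separating shapes.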

Convolutional neural network VGG-16
A CNN is a development of the multi-layer perceptron designed to process two-dimensional data. It consists of neurons that have weights, biases, and activation functions, as shown in Figure 2 [13].
In the fully connected part of the VGG16 architecture, the hidden layers contain 3x4096 neurons [14]. During convolution, the kernel slides across the input image matrix. The number of pixels by which the kernel shifts at each step is the skipping factor [15]. The output value of the mapping process is shown in (4). The purpose of the pooling layer is to give the feature map a lower resolution [16]. The new max-pooling feature map is obtained by applying a window function to the input map and keeping the maximum value in each window; the quantities involved are the value of the pooling map, the window function, and the value of the input map. The fully connected layer connects every neuron in one layer to every neuron in the next layer [17]. In general, it works in the same way as a traditional multi-layer perceptron (MLP) neural network [18]. The matrix that has been normalized and passed to the fully connected layer is then used to classify images [19]. Before that, the feature map is flattened, that is, reshaped into a vector, so that it can be used as input to the fully connected layer [20].
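The pooling and flattening steps described above can be illustrated with a small pure-Python sketch. The function name `max_pool2d`, the 2x2 window, and the stride of 2 are assumptions made for illustration; the source does not state the window size used.

```python
def max_pool2d(feature_map, window=2, stride=2):
    """Slide a window over a 2D feature map and keep the maximum in each window.
    `stride` plays the role of the skipping factor described in the text."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, rows - window + 1, stride):
        pooled_row = []
        for left in range(0, cols - window + 1, stride):
            values = [
                feature_map[top + i][left + j]
                for i in range(window)
                for j in range(window)
            ]
            pooled_row.append(max(values))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]

pooled = max_pool2d(feature_map)          # 4x4 map reduced to 2x2
flat = [v for row in pooled for v in row]  # flattened vector for the dense layer
```

Here the 4x4 input becomes a 2x2 map, `[[6, 2], [2, 9]]`, and the flattened vector `[6, 2, 2, 9]` is what a fully connected layer would receive.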

Dropouts
Dropout is a technique for preventing overfitting that also accelerates the learning process. It refers to temporarily removing neurons, in hidden or visible layers, from the network [21]. The neurons to be removed are selected at random. Each neuron is assigned a probability p between 0 and 1 [22].
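As a minimal sketch of the idea, the function below applies dropout to a list of activations. It assumes that p is the probability of dropping a neuron (the text does not say whether p is the drop or the keep probability) and uses the common "inverted dropout" rescaling so that the expected activation is unchanged; both choices are assumptions, and the function name is hypothetical.

```python
import random


def dropout(activations, drop_prob, rng):
    """Zero each activation with probability `drop_prob` (ASSUMED meaning of p)
    and rescale the survivors by 1 / (1 - drop_prob) (inverted dropout)."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_prob = 1.0 - drop_prob
    return [
        a / keep_prob if rng.random() >= drop_prob else 0.0
        for a in activations
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 1000
dropped = dropout(activations, drop_prob=0.5, rng=rng)
zero_fraction = dropped.count(0.0) / len(dropped)  # roughly one half
```

Dropout is applied only during training; at inference time all neurons are kept.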

Data augmentation
Data augmentation creates modified copies of the images in the training dataset. In the augmentation process, the data are altered by translation, viewpoint, size, lighting, or a combination of these [23]. The data augmentation process is shown in Figure 3.

Fine-tuning
Fine-tuning means taking all the weights from a previously trained neural network and using them as the initialization for a new model that is trained on data from the same domain, with the learning rate regulated accordingly [24]. The fine-tuning process is shown in Figure 4. The difference from feature extraction is that, in feature extraction, only the weights of the newly added last layer change during the training phase [25].

RESEARCH METHODS
This research uses cross-sectional data, meaning that the data were collected at a particular time and place. The research begins by obtaining samples from Dr. Soetomo Hospital, Surabaya. Data collection took one year and covered 50 patients with Gram-negative pneumonia. The samples are primary data obtained after passing an ethics review. Data preprocessing was carried out first, following the procedures in force in the microbiology laboratory. First, it was confirmed that each sample was a Gram-negative pathogenic bacterium. Samples were separated from the original specimens and cultured on MacConkey medium for a maximum of 12 hours. A day later, the sample was Gram stained and heated until it was ready to be observed on a glass slide. Observation under a fluorescence microscope requires immersion oil as an intermediary medium.
Inspecting the sample images requires knowledge of lens use and of the rules for viewing sputum. For each patient, 30 photos were taken in the same standardized way. Pictures taken with the CX 23 Optilab view camera have dimensions of 2560x2048 pixels, a resolution of 96 dpi, and a depth of 24 bits. In the image processing stage, preprocessing splits each image into pieces of 256x256 pixels, 96 dpi, with a depth of 24 bits. There are three main folders: the training data folder, the validation folder, and the testing data folder. Each folder contains two classes of 420 different images each, so a total of 2,520 photos is needed. The software used is Anaconda 4.7.5 with Python 3.6.8, supported by the Python machine learning library Keras 2.2.4 and the artificial intelligence framework TensorFlow 1.14.0. The hardware is an Intel Core i7 laptop with 8 GB RAM and an NVIDIA GeForce GTX 1050 GPU with 4 GB of memory. Image identification is carried out with a convolutional neural network in three phases. First, dropout is used for regularization. Second, data augmentation is configured to add data variation. Third, fine-tuning is used to improve accuracy. The research method is shown in Figure 5.
After the images are prepared, the primary process is image processing with augmentation and fitting. Table 1 is used for feature selection. In convolutional neural networks, image pre-processing and segmentation do not require special treatment of the image objects, because this processing is already performed in the hidden layers. Using a larger number of samples affects the resulting accuracy. In this research, a total of 2,520 different photos is needed, contained in three data folders for each class of bacteria. The three stages were then run, and trials were carried out considering the number of iterations, learning rate, computational time, loss value, and accuracy, in the following stages.
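The image sizes and dataset counts quoted above can be checked with a short sketch. It assumes that each 2560x2048 photo is cut into non-overlapping 256x256 tiles, which the text does not state explicitly; the function name and the split names are hypothetical.

```python
def tile_boxes(width, height, tile):
    """Return (left, top, right, bottom) boxes for non-overlapping square tiles.
    ASSUMPTION: tiles do not overlap and partial tiles at the edges are dropped."""
    boxes = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            boxes.append((left, top, left + tile, top + tile))
    return boxes


# One 2560x2048 photo yields 10 columns x 8 rows of 256x256 tiles.
boxes = tile_boxes(2560, 2048, 256)
tiles_per_photo = len(boxes)

# Dataset size as described: three folders, two classes each, 420 images per class.
splits = ("train", "validation", "test")  # hypothetical folder names
classes_per_split = 2
images_per_class = 420
total_images = len(splits) * classes_per_split * images_per_class
```

The total reproduces the 2,520 images reported in the text; each box could be passed to an image library's crop function to produce the actual tiles.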

Use of dropouts for regularization
Dropout is a neural network regularization technique in which several randomly selected neurons are not used during the training process. The dropout configuration includes a dense layer, ReLU and sigmoid activation functions, epochs = 30, batch size = 20, and learning rate = $2 \times 10^{-5}$. At the 30th of 30 iterations, which ran for 3 seconds at 1 ms/step, the training loss was 0.1213 with 96% testing accuracy, the validation loss was 0.3325, and the validation accuracy was 83%.

Use of data augmentation
The technique takes images from the dataset and applies random transformations, namely rotation and shift, to produce additional training data. The aim is to prevent overfitting. The augmentation configuration includes epochs = 30, steps per epoch = 100, batch size = 20, validation_steps = 50, and learning rate = $2 \times 10^{-5}$. The 30th of 30 iterations took 121 seconds, with a loss of 0.1890, an accuracy of 92.95%, a validation_loss of 0.2368, and a validation_accuracy of 89.90%. Adding a dense layer to run a fully connected neural network increases the number of parameters needed.
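The random shift and rotation described above can be sketched on a tiny image represented as a nested list. This is an illustration only: the helper names are hypothetical, rotation is restricted to multiples of 90 degrees for simplicity, and the shift and rotation ranges actually used in the study are not given in the text.

```python
import random


def shift_image(image, dx, dy, fill=0):
    """Translate the image by (dx, dy) pixels; uncovered pixels get `fill`."""
    rows, cols = len(image), len(image[0])
    shifted = [[fill] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                shifted[ny][nx] = image[y][x]
    return shifted


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image, rng, max_shift=1):
    """Produce one randomly shifted and rotated copy of a square image."""
    out = shift_image(
        image,
        rng.randint(-max_shift, max_shift),
        rng.randint(-max_shift, max_shift),
    )
    for _ in range(rng.randint(0, 3)):
        out = rotate90(out)
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
shifted_right = shift_image(image, dx=1, dy=0)
rotated = rotate90(image)
copies = [augment(image, random.Random(seed)) for seed in range(4)]
```

Each call to `augment` yields a different variant of the same source image, so the network rarely sees exactly the same input twice during training.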

Fine tuning
The first step in the fine-tuning stage is to remove the softmax classifier and replace it with a new softmax layer initialized with random values. The back-propagation algorithm is used to train this new layer. During fine-tuning, back-propagation must also regulate the learning rate at each layer appropriately. The new layer requires a relatively large learning rate because it is initialized with random values. In the remaining layers, the network retains the parameters used in the previous network. These layers require a learning rate that is not too large, because the network transfers previously learned knowledge to the new system. The learning rate is therefore set to a small value close to zero in all layers, so that the weights are optimized again at a slower speed.
In the fine-tuning strategy, all weights change during the new training except the weights of the last layer for the original task. Fine-tuning involves several layers of the model used for feature extraction, which are trained jointly with the newly added part of the model (the fully connected classifier). The fine-tuning configuration includes epochs = 100, steps per epoch = 100, validation_steps = 50, and learning rate = $1 \times 10^{-5}$. With this step, the total number of parameters is reduced from 16,812,353 at the data augmentation stage to 14,714,688. Over 100 iterations, training took 154 seconds at 2 seconds/step; the loss was 0.0227, the testing accuracy 99.20%, the validation loss 0.0814, and the validation accuracy 97.20%, as shown in Figure 7.
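The two parameter totals can be checked with simple arithmetic. The sketch below counts the weights and biases of the 13 convolutional layers of the standard VGG16 base (3x3 kernels), which gives 14,714,688. The classifier head sizes used here (a flattened vector of 8,192 values, a 256-unit dense layer, and a single output unit) are assumptions chosen because they reproduce the reported 16,812,353; the text does not state them.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k convolution layer."""
    return k * k * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# (input channels, output channels) for the 13 convolution layers of VGG16.
VGG16_CONV_CHANNELS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

conv_base_total = sum(conv_params(i, o) for i, o in VGG16_CONV_CHANNELS)

# ASSUMED classifier head: flatten (8192) -> dense (256) -> dense (1).
head_total = dense_params(8192, 256) + dense_params(256, 1)

full_model_total = conv_base_total + head_total
```

Under these assumptions the difference between the two reported totals, 2,097,665 parameters, is entirely due to the added dense layers.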

Time consumption and comparison between methods
Four scenarios are used to obtain the average time required to run the program with the three-stage process. The stages are dropout, data augmentation, and fine-tuning. The time needed for each stage is shown in Table 2; the fine-tuning stage was trained using 25, 50, 75, and 100 iterations. The comparison of precision is shown in Table 3. The data obtained from the feature extraction output were tested using WEKA tools in the classification process. This table also refers to a comparative paper written by Giovanni Turra, entitled identification of hyperspectral bacteria using convolutional neural networks in the problem of Digital Microbiology, Springer Link, 2017. From Table 3, CNN appears to lead in accuracy. However, the amount of data used must also be considered: the more data and iterations used, the higher the accuracy. Another consideration is the computational time of the training process. The advent of graphics processing unit technology makes image processing easier. Moreover, not all business processes require high accuracy; some instead need fast service.

CONCLUSION
This research aimed to design and evaluate a Gram-negative bacterial identification model using an image processing approach. In contrast, previous identification still relied on manual methods, namely visual observation. The contribution of this research is the selection of a machine learning method that achieves high accuracy compared with previous methods. This research uses primary samples from patients with Gram-negative bacterial pneumonia for training, testing, and validation, each set totaling 840 images, so that 2,520 different images are used in total, each 256x256 pixels, 96 dpi, with a depth of 24 bits. The three-stage process used dropout, data augmentation, and fine-tuning. The technical results indicate that the convolutional neural network reaches 96% accuracy at the data testing stage using dropout and 92.95% with data augmentation. After fine-tuning, the accuracy is 99.20%. These findings show that fine-tuning improves CNN performance. The computational time for each training process is around one hour per scenario. Until now, no paper has discussed the comparison of training process time. Further work is needed to establish whether there are improved methods to minimize the computational time of the training process by reducing the number of parameters or modifying the layers of the convolutional neural network.

Figure 4. Original model vs fine tuning model

Figure 7. Accuracy of the training process and validation with fine tuning

ISSN: 1693-6930. TELKOMNIKA Telecommun Comput El Control, Vol. 18, No. 3, June 2020: 1397-1405 (Budi Dwi Satoto)

Table 1. Extraction of shape features in bacterial images

Table 2. Time consuming

Table 3. Comparison between method