Blood image analysis to detect malaria using filtering image edges and classification

Malaria is a dangerous mosquito-borne disease whose infection spreads through the bite of infected mosquitoes. It especially affects pregnant women and children under 5 years of age. Malarial species commonly occur in five different shapes. To help control this deadly disease, contemporary researchers have proposed solutions based on image analysis. In this work, we propose a diagnosis algorithm for malaria, implemented in Matlab for testing and evaluation. The method combines median filtering with a support vector machine (SVM) classifier to identify infected cells in blood images. Median filtering is used as a smoothing technique to remove noise. Feature vectors are then used to find abnormalities in blood cells; they include the form factor, roundness, shape, and the total counts of red cells and parasites. The primary aim of this research is to diagnose malaria by finding infected cells. Many techniques and algorithms based on image processing have been implemented in this field, but their accuracy is not yet satisfactory. The proposed algorithm achieves higher accuracy and efficiency than the NCC and fuzzy classifiers recently used by other researchers.


Introduction
Malaria is a dangerous disease that can cause death: it first affects the liver and then moves to the red blood cells, where the life cycle of the parasite also starts [1]. The majority of deaths are caused by Plasmodium falciparum, which has spread globally. Common symptoms of malaria include headache, fever, vomiting and tiredness, and the disease may lead to coma and death. According to the World Health Organization (WHO), an estimated 200 million cases of malaria occur every year [2]. The majority of malaria cases are found in poor countries where pollution is high. When a person is bitten by an infected mosquito, the parasites enter the human body and destroy the red blood cells [3].
For diagnosis, contemporary researchers recommend the use of image processing technology [4,5]. Various methods have been proposed for malaria testing, and several classification techniques have been applied, including the minimum distance classifier, the naive Bayes classifier and neural networks [6]. The minimum distance classifier works well when the distance between the means of the different classes is large. The limitation of the naive Bayes classifier is that the statistical properties of the pattern classes are unknown [7]. In neural network classification, accuracy decreases as the number of features increases; performance can be improved by using a minimal set of features.
In this work, the images are first loaded. A median filter is then applied to remove noise; it is also an effective smoothing technique. Feature vectors are then extracted and used to detect abnormalities in blood cells. A support vector machine (SVM) classifier is finally applied to separate infected blood cell images from normal ones. To the best of our knowledge, the SVM classifier has not previously been used for the detection of malaria infection.
This section reviews related work in the area of malaria diagnosis. Rose et al. [8] suggested a diagnostic process for malaria using light microscopy. In this technique the images are preprocessed and the species are then determined by an Artificial Neural Network (ANN) classifier; the accuracy is not more than 73%. In 2014, Kareem [9] presented an application to detect malaria using blood images. It is based on the annular ring ratio (ARR) method and estimates the infected cells in blood images. Rahman [10] presented a method for the detection of malaria parasites in thin blood smears. In the first part of the system, morphological operations are used to extract RBCs from an image with 95% accuracy; in the second part, the malaria species are detected and classified with 100% accuracy. Bhatt and Prabha [11] proposed an approach to detect and count abnormalities in red blood cells efficiently. The main purpose of this application is to reduce the time required; a form factor threshold is applied to find the abnormalities in red cells. Sreekumar [12] proposed an approach for counting the total number of red cells and determining their shape.
In this paper, however, we propose a filtering mechanism followed by feature vector extraction, which helps to find the area, shape and roundness of infected blood cells at each stage. Finally, an SVM classifier is applied to separate the infected images from the normal ones. This is the crucial step: it provides an efficient way of classifying infections and detecting malaria at various stages.

Proposed Technique
The aim of this research work is to use an SVM classifier to separate infected cells from non-infected ones. First, the images are pre-processed and resized. Next, a median filtering operation is applied to the RGB image, which is then converted into a grayscale image. Feature vectors are then extracted, and the last step is Plasmodium detection, performed using a suitable SVM classifier. Figure 1 shows the flow of the proposed work for detecting Plasmodium parasites. Each of these steps is elaborated in the following subsections.

Pre-processing
The first step is to load the image. Image preprocessing involves several basic operations. It resizes the images so that all of them have a standard size, which speeds up processing. In this work blood image samples have been taken from two image resources: the Centers for Disease Control (CDC), which contains images with 300*300 magnification [13], and a second resource [14]. It is necessary to reshape all samples to the same size, 300*300. RGB images from both resources are processed. Figure 2 (a) shows a CDC image and Figure 2 (b) shows an RGB image processed from the CDC image library at CE LAB. Similarly, Figure 3 (a) and Figure 3 (b) show the images before and after application of the median filter, respectively, as discussed in the next subsection.

Median Filter
The median filter is a nonlinear digital filtering technique that helps to remove noise. In this work the median filter has been applied to the RGB images because it preserves edges and so retains useful information. Figure 3 (a) and Figure 3 (b) show the RGB image before and after the median filter operation, respectively. After the median filter is applied, part of the noise in the RGB image is removed. The next step is conversion to grayscale, which is described briefly in the next subsection.
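The filtering step can be illustrated with a minimal sketch. It is written in plain Python rather than the Matlab used in this work, and several details are assumptions not stated in the text: a 3x3 window, independent filtering of the R, G and B channels, and replication of border pixels. The function names median_filter and median_filter_rgb are hypothetical.

```python
from statistics import median


def median_filter(channel, size=3):
    """Apply a size x size median filter to one channel (a list of rows)."""
    h, w = len(channel), len(channel[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the neighbourhood; coordinates outside the image are
            # clamped to the nearest border pixel (assumed border handling).
            window = [
                channel[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out[y][x] = median(window)
    return out


def median_filter_rgb(image, size=3):
    """Filter R, G and B independently; image is rows of (r, g, b) tuples."""
    h, w = len(image), len(image[0])
    channels = [[[px[c] for px in row] for row in image] for c in range(3)]
    filtered = [median_filter(ch, size) for ch in channels]
    return [
        [tuple(filtered[c][y][x] for c in range(3)) for x in range(w)]
        for y in range(h)
    ]
```

In this sketch an isolated bright pixel inside a uniform region is replaced by the value of its neighbours, while a straight step edge is left unchanged, which is the edge-preserving behaviour the text relies on.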

Conversion in Grayscale
After median filtering, the resulting RGB image is converted into a grayscale image, and further operations are applied to it to remove the remaining noise. The grayscale image is first converted to a binary image using an intensity value of 0.9, as shown in Figure 4 (a) and Figure 4 (b). Background pixels inside objects are converted into foreground pixels by filling holes, as shown in Figure 4 (c). Noise is then removed using bwareaopen, as shown in Figure 4 (d); this operation removes small objects of fewer than 300 pixels. The results of these operations are shown in Figure 4.
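A rough Python sketch of these operations follows (the work itself uses Matlab). The luminance weights, the choice to treat pixels darker than the 0.9 level as foreground, and the 8-connectivity are assumptions made for illustration; hole filling is omitted for brevity. The names to_gray, binarize and remove_small_objects are hypothetical, with the last one mimicking what bwareaopen does.

```python
from collections import deque


def to_gray(rgb):
    """Luminance-weighted grayscale in [0, 1] from 8-bit (r, g, b) pixels."""
    return [
        [(0.299 * r + 0.587 * g + 0.114 * b) / 255.0 for r, g, b in row]
        for row in rgb
    ]


def binarize(gray, level=0.9):
    """Mark pixels darker than `level` as foreground (1); assumed polarity."""
    return [[1 if v < level else 0 for v in row] for row in gray]


def remove_small_objects(mask, min_size=300):
    """Drop 8-connected foreground components smaller than min_size pixels."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) < min_size:
                for cy, cx in component:
                    out[cy][cx] = 0
    return out
```

With min_size=300, as in the text, any isolated speck of fewer than 300 pixels is erased while larger cell regions are kept.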

Feature Extraction
Feature extraction is applied to segment erythrocytes in the green component of the images, shown in Figure 5 (a), which produces better results for finding parasites. Malaria-infected cells appear as purple dots, as shown in Figure 5 (b). Figure 6 (a) and Figure 6 (b) show the number of red cells and the number of parasites, respectively. The next step is to prepare the feature vectors, which are useful in the detection of the disease.

Feature Vector
Feature vectors have been used for the detection of infected cells. They consist of the form factor, roundness and area, which are defined in Table 1.

Form factor
The form factor is a shape metric. Its threshold is fixed: the value is equal to 1 for a perfect circle, and for all non-circular cells it varies and is less than 1.

Roundness
Normal red blood cells are round in shape, whereas abnormal cells show variations in size.
For a perfect circle the roundness value is equal to 1.

Area
The shape of blood cells becomes rigid in the presence of infection. Infected cells are normally larger than normal cells [15], so their surface area and volume are also larger. In Matlab, the area is obtained with Area = regionprops(BW2,'area').
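The equations of Table 1 are not reproduced in the text, so the following Python sketch assumes two commonly used definitions that agree with the stated property of equalling 1 for a perfect circle: form factor = 4*pi*Area / Perimeter^2 and roundness = 4*Area / (pi * MajorAxis^2). Both the formulas and the function names are assumptions, not necessarily those used in this work.

```python
import math


def form_factor(area, perimeter):
    """Assumed definition: 4*pi*A / P^2, equal to 1 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2


def roundness(area, major_axis):
    """Assumed definition: 4*A / (pi * d^2), with d the major axis length."""
    return 4.0 * area / (math.pi * major_axis ** 2)


# A circle of radius 10 gives a value of 1 for both metrics.
radius = 10.0
circle_area = math.pi * radius ** 2
circle_perimeter = 2.0 * math.pi * radius
circle_ff = form_factor(circle_area, circle_perimeter)
circle_round = roundness(circle_area, 2.0 * radius)

# A 2 x 2 square is less circular: its form factor is pi/4.
square_ff = form_factor(4.0, 8.0)
```

Under these definitions any deviation from a circular outline lowers the form factor below 1, which is how a fixed threshold can flag deformed cells.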

Plasmodium Detection
Detection of Plasmodium parasites is performed by the SVM. The classifier is trained with the feature vectors discussed in section 3.5, which were found to be the most appropriate for the detection of Plasmodium parasites. The method is also able to count cells: Figure 6 (a) shows a total of 52 cells and Figure 6 (b) shows 37 parasites.
The life stages of a species can be detected using different feature values. In this work the life stages of Plasmodium falciparum have been detected; the corresponding feature values are shown in Table 2.
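The text does not specify the kernel or training settings of the SVM, so the sketch below is only an illustration of the idea: a linear SVM trained by sub-gradient descent on the regularized hinge loss, written in plain Python instead of Matlab. The training data are hypothetical, standardized (zero-mean) two-feature vectors, with +1 standing for "infected" and -1 for "uninfected"; the function names and hyperparameters are likewise assumptions.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Sub-gradient descent on the hinge loss; labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:
                # Sample violates the margin: move the hyperplane towards it.
                w = [wj + lr * (yi * xj - lam * wj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Margin satisfied: only the regularization term acts on w.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(w, b, x):
    """Return +1 (infected) or -1 (uninfected) for one feature vector."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical standardized feature vectors, not data from this work.
X_train = [[2, 2], [3, 3], [2, 3], [-2, -2], [-3, -3], [-2, -3]]
y_train = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X_train, y_train)
```

A new cell would then be classified with predict(w, b, features) after its features have been standardized in the same way as the training data.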

Expected Values of Uninfected Cells
The number of red cells in a normal (uninfected) human is high compared with infected ones; the number of RBCs is significantly lower in malaria patients [16]. Uninfected cells contain 0 parasites, although in some cases a minimal number of parasites may also be acceptable [17]. A cell whose form factor is equal to 1 is considered normal; a value of 0.9 is also acceptable [18]. Normal cells are round in shape, so roundness is measured: cells with a roundness value of 0.9 or 0.8 are considered normal [19]. The surface of normal cells is smooth; when a cell has even a small amount of infection, its surface becomes rigid [20].

Results Analysis
These feature vector values have been tested on 100 images. Table 3 summarizes the results achieved through our detection mechanism. It lists the total number of parasites among the total number of cells, along with the form factor, roundness, mean area and shape, which are used to detect infected cells and their stage, as detailed in Table 2. Four species with four life stages give 4*4=16 images. The performance of the proposed method is evaluated using the statistical properties shown in Table 4, where TP denotes true positive, TN denotes true negative, FP denotes false positive and FN denotes false negative. A summary of the results based on these statistical properties is shown in the form of a confusion matrix in Table 5.

Sensitivity
The sensitivity (Se) of a test is defined as the probability of a positive test result when the disease is present.
$$Se = \frac{TP}{TP + FN} \times 100$$

Specificity
The specificity (Sp) of a test is defined as the probability of a negative test result when the disease is absent.
$$Sp = \frac{TN}{TN + FP} \times 100$$

Accuracy
Accuracy (A) is measured by counting all instances whose predicted output values match the ground truth; it indicates how reliable the tests are overall.
$$A = \frac{TP + TN}{TP + TN + FP + FN} \times 100$$
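As a small worked example, the three measures can be computed from confusion-matrix counts. The sketch uses the standard definitions Se = TP/(TP+FN), Sp = TN/(TN+FP) and A = (TP+TN)/(TP+TN+FP+FN), each expressed as a percentage. The counts are purely illustrative and are not taken from Table 5; the function name evaluate is hypothetical.

```python
def evaluate(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)   # diseased cases correctly flagged
    specificity = 100.0 * tn / (tn + fp)   # healthy cases correctly cleared
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


# Illustrative counts only (not the results reported in this paper).
se, sp, acc = evaluate(tp=40, tn=45, fp=5, fn=10)
print(se, sp, acc)  # 80.0 90.0 85.0
```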
The efficiency analysis compared in Figure 7 shows that the SVM classifier performs better, achieving 97% accuracy, than the previous NCC and fuzzy methods. Various techniques have been deployed, but time is a critical factor for such dangerous diseases. The efficiency of the proposed system has therefore also been compared with the previous methods by estimating processing time. Figure 8 clearly shows that the SVM required less time to process 100 images. Our proposed method outperforms the compared methods applied to detect infected blood cells.

Conclusion
In this paper an attempt has been made to detect malaria from microscopic blood images using an SVM classifier. The aim of this work is to detect malaria by finding abnormalities in red cells and to determine the life stages of malaria. Infected images have been processed using suitable feature vectors that include the total number of red cells and the number of parasites present. The form factor threshold is fixed: for circular objects the value is equal to 1, and it varies for non-circular objects. Normal cells are round in shape; for a perfect circle the roundness value is 1, and a value of 0.9 is also considered circular. The SVM classifier is trained with data described by these feature vectors. The implementation has been evaluated on almost 100 images: the sensitivity of the proposed system is 83.3%, the specificity is 97.8%, and the overall accuracy is 97%.