Biometric identification using augmented database

Androgenic hair pattern is one of the newest soft biometric traits that can be used to identify criminals whose faces are covered in the evidence of a criminal investigation. In real-life situations, the available evidence is sometimes limited, which makes it difficult for the authorities to identify a criminal. This research developed a recognition system that identifies individuals based on their androgenic hair pattern in a limited-data situation by expanding the limited images through an augmentation process. The 50 images studied were expanded into 2,000 images by rotation, reflection, color adjustment and intensity adjustment. Furthermore, the effect of human skin color extraction was investigated by employing the HSV and YCbCr color spaces. A scale-space hierarchy was built from the images with a Gaussian function and produced a recognition precision of 70%, more than two times higher than that of a recognition system using only the limited data.


Introduction
Biometrics is the science of certifying the identity of people based on their physiological, chemical and behavioral characteristics [1][2]. The purpose of biometrics discussed in this research was to identify criminals based on their androgenic hair pattern. The androgenic hair pattern has become popular since it was first introduced as a soft biometric trait in [3]. Androgenic hair is the hair that grows on the human body once a person reaches puberty and is influenced by androgen hormones [4].
When evidence of criminals' identity, such as their faces, is captured in digital pictures and videos, the faces may be covered, which makes identification difficult for the authorities. Normally, parts of their hands and legs are not covered and become a potential source of information. Furthermore, the limited amount of evidence is another issue to be solved. There have been attempts to overcome these two difficulties. In [3], [5] the authors started to study the pattern by using the Gabor orientation histogram. In [6][7][8], the authors applied transformations such as the Haar wavelet, principal component analysis and hierarchical Gaussian scale-space to help recognize the androgenic hair pattern and to produce better recognition precision. The results were produced by analyzing 400 images in the database. These studies did not address recognition with a limited dataset: though the precision rates are quite satisfactory, the real-life case of limited data was not yet implemented in the recognition system. In [9], the performance of color spaces in extracting the human skin color area was investigated. In order to improve the recognition rate, rules for extracting the human skin color area using color spaces such as RGB, HSV, YCbCr, CIE Lab and YIQ were studied. In [10], the authors applied class-specific partial least squares (PLS) models to utilize the features of the androgenic hair pattern. The research on androgenic hair pattern is expanding, and in the next development in [11], the authors developed a recognition system based on a real-life situation and limited the data for the system.
This research attempted to design a recognition system to identify individuals based on their androgenic hair pattern in a limited-data situation, in such a way that the limited images were expanded by an augmentation process. The scale-space hierarchy was built among the augmented images by using a Gaussian function. Moreover, we also studied the effect of the HSV and YCbCr color spaces on the augmented data when extracting the human skin color area in the images. The rest of this paper is organized as follows: Section 2 describes the research method conducted in this paper. In Section 3, the results are discussed and analyzed. Finally, the conclusion is given in Section 4.
ISSN: 1693-6930 TELKOMNIKA Vol. 17, No. 1, February 2019: 103-109

Research Method
The research method consisted of three processes. The first was to build the augmented database from the limited data in the system. The second was to build the hierarchical scale structure using a Gaussian filter, and the third was to find the closest match between the training and the testing data. Figure 1 illustrates the three processes of the research method.

Augmented Database
In real-life conditions, sometimes only limited data can be acquired. Limited data here means that only a few images of the same person are acquired from the acquisition device, such as a digital camera. In this research there were only two images of each person in the system. The two images varied in pose, lighting condition, angle, background information and noise. There were 25 male respondents with two images each, giving a total of 50 images in the initial limited database. A data augmentation technique was applied to this limited database, producing more images derived from the original ones. The augmentation was based on geometric transformations, namely rotation (1) and reflection (2), together with color adjustment and intensity adjustment [12]. The color adjustment limits and enhances the contrast of the images, while the intensity adjustment makes the images darker than the originals.
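The augmentation steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used to build the 39 variants: the function names, the nearest-neighbour sampling, and the darkening factor are hypothetical, and the images are plain nested lists of grey values to keep the sketch dependency-free.

```python
import math


def reflect_horizontal(img):
    """Mirror a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in img]


def rotate(img, degrees, fill=0):
    """Rotate about the image centre using nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    t = math.radians(degrees)
    cos_t, sin_t = math.cos(t), math.sin(t)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel that lands on (x, y).
            sx = cx + (x - cx) * cos_t + (y - cy) * sin_t
            sy = cy - (x - cx) * sin_t + (y - cy) * cos_t
            ix, iy = round(sx), round(sy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = img[iy][ix]
    return out


def darken(img, factor=0.5):
    """Scale intensities by a factor below 1 (hypothetical value) to darken."""
    return [[int(p * factor) for p in row] for row in img]
```

Applying each function with several parameter values to one original image yields a family of derived images; the specific angles and adjustment strengths behind the 39 variants per image are not stated in the text, so any values chosen here are placeholders.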
The two original images in the initial database were divided into training and testing images. The augmented data were derived from both divisions, but the augmented images derived from testing images were not included in the training phase. Each original image yielded 39 augmented images. Figure 2 describes the augmentation process used in this research. All original and augmented images were 203x352 pixels. Skin color extraction ran on two color spaces, HSV and YCbCr, which were studied before in [9]. These color spaces worked best for extracting the human skin color component. Both HSV and YCbCr separate the luminance and chrominance components. Pixels within the range of human skin color have been shown to be similar in their chrominance components, which makes chrominance well suited to discriminating skin from non-skin areas [13]. The rule for HSV skin color area extraction is shown in (3) and the rule for YCbCr in (4) [9]. The extraction process removed the background (the area not detected as human skin color) and replaced it with either black or white.
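A rough sketch of chrominance-based skin extraction in YCbCr follows. The conversion is the standard full-range ITU-R BT.601 formula; the chrominance bounds `CB_RANGE` and `CR_RANGE` are commonly quoted illustrative values and are an assumption here, not the thresholds of rule (4). The function names are likewise hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range ITU-R BT.601 conversion for 8-bit RGB values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# Assumed chrominance bounds for illustration only, NOT rule (4) of the paper.
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)


def is_skin(pixel):
    """Decide on chrominance alone; luminance (Y) is ignored."""
    _, cb, cr = rgb_to_ycbcr(*pixel)
    return (CB_RANGE[0] <= cb <= CB_RANGE[1]
            and CR_RANGE[0] <= cr <= CR_RANGE[1])


def extract_skin(img, background=(0, 0, 0)):
    """Keep skin-coloured pixels and paint all others with `background`."""
    return [[p if is_skin(p) else background for p in row] for row in img]
```

Calling `extract_skin(img)` gives the black-background variant and `extract_skin(img, background=(255, 255, 255))` the white-background variant. An HSV version would follow the same pattern, for example with the standard library's `colorsys.rgb_to_hsv` and bounds taken from rule (3).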

Scale-space with Hierarchical Gaussian Filter
Building the scale-space of an image permits us to analyse the image at multiple resolutions [14]. The different resolutions improve the identification system by representing the original images at different scales. The idea behind Gaussian scale-space [15] is to filter the original image with a Gaussian function of the desired width and to decimate the last output of the filtering process, which then starts the next scale of the same image.
The process began by convolving the input image with a Gaussian function of width σ, as can be seen in (5) to (8). These equations are explained in more detail in the previous work in [8]. The width of the Gaussian was σ_0 = 1.6, with a second width parameter of 0.5, as used before in [8], [16]. The total number of levels in this research was V=3, while v denoted the individual level. Four octaves (U) were constructed. Within one octave, the image was convolved with Gaussians of the widths in (7) and (8). When the last level was reached, the process moved to the next octave, whose base level was obtained by decimating the image by a factor of 2. The process continued until it reached the 3rd level of the 4th octave. The process of building the Gaussian scale-space can be seen in Figure 3.
For the base octave (u=0) and base level (v=0), the input image was convolved with a Gaussian of the width in (7). To create the next levels (v=1,2,3) within the same octave (u=0), the base level was convolved with a Gaussian of the width in (8). For the base level of the next octave, the process only decimated (by a factor of two) the last level of the previous octave and did not convolve the image with the Gaussian. To build the next levels (v=1,2,3) of a subsequent octave (u≠0), the base level of that octave was convolved with a Gaussian of the width in (8).
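The octave and level structure can be sketched as follows. Equations (5) to (8) are not reproduced in this text, so the width schedule below is an assumption: it uses the common SIFT-style choice in which level v of an octave has absolute width σ_0 · 2^(v/V), reached by blurring the base level with an extra width of σ_0 · sqrt(2^(2v/V) − 1). The second width parameter of 0.5 is left out, and all function names are hypothetical. Images are nested lists of floats.

```python
import math


def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    k = [math.exp(-(i * i) / (2 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]


def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping at the borders."""
    r = len(kernel) // 2
    w = len(img[0])
    return [[sum(kernel[j + r] * row[min(max(x + j, 0), w - 1)]
                 for j in range(-r, r + 1))
             for x in range(w)] for row in img]


def transpose(img):
    return [list(col) for col in zip(*img)]


def gaussian_blur(img, sigma):
    """Separable 2-D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))


def decimate(img):
    """Keep every second row and column (downsample by a factor of 2)."""
    return [row[::2] for row in img[::2]]


def build_scale_space(img, sigma0=1.6, octaves=4, levels=3):
    """Return a list of octaves; each octave holds its base plus `levels` images."""
    space = []
    base = gaussian_blur(img, sigma0)  # base octave, base level
    for _ in range(octaves):
        octave = [base]
        for v in range(1, levels + 1):
            # Assumed schedule: extra blur taking sigma0 to sigma0 * 2**(v/levels).
            sigma_v = sigma0 * math.sqrt(2 ** (2 * v / levels) - 1)
            octave.append(gaussian_blur(base, sigma_v))
        space.append(octave)
        # Next base level: decimate the last level, with no further convolution.
        base = decimate(octave[-1])
    return space
```

With this schedule the last level of an octave has twice the base width, so decimating it by two gives a new base level whose width in its own pixel grid is again σ_0, which is why no convolution is needed when moving to the next octave.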
In [8], the authors adopted the hierarchical Gaussian scale-space method for androgenic hair pattern recognition. A total of 400 varied original images were studied without any augmented images. In this research, the same method was applied to study the performance of the augmented images in the database of a limited-data recognition system.

Matching Algorithm
The matching algorithm in this research employed nearest-distance calculation using the Euclidean distance. It matched each item of the testing set to the closest item of the training set. The configuration of the testing set and the training set is illustrated in Figure 4. Both testing and training images were augmented into 39 augmented images per original image. When a testing image was being processed, the augmented images derived from it were not included in the system, while the remaining 49 original images and the 39 augmented images derived from each of them were included in the matching process. The testing set for one matching process was therefore 1 image, while the training set was 49x39 augmented images + 49 original images = 1,960 images. There were 50 matching processes in total.
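The matching configuration can be sketched as a leave-one-out nearest-neighbour search. This is an illustration under assumptions: how an image is turned into a feature vector is not specified in the text, so the sketch takes vectors of numbers as given, and the names `nearest_label`, `leave_one_out_precision` and `augment` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_label(query, training):
    """training: list of (label, vector) pairs; return the closest label."""
    label, _ = min(training, key=lambda item: euclidean(query, item[1]))
    return label


def leave_one_out_precision(originals, augment):
    """originals: list of (person_id, vector); augment: vector -> list of vectors."""
    hits = 0
    for i, (person, query) in enumerate(originals):
        training = []
        for j, (other, vec) in enumerate(originals):
            if j == i:
                continue  # the query and its augmented images stay out
            training.append((other, vec))
            training.extend((other, a) for a in augment(vec))
        hits += nearest_label(query, training) == person
    return hits / len(originals)
```

With 50 originals and an `augment` function returning 39 variants, each pass builds a training set of 49 × 39 + 49 = 1,960 vectors, which matches the count given above, and the outer loop runs the 50 matching processes.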

Results and Analysis
The results are shown in Table 1, Figure 5 and Figure 6. Table 1 presents the precision of the recognition system, that is, the percentage of testing images that the system assigned to the correct class of training images. The letter on each row represents the type of dataset, while the number on each column represents the type of augmented database that was created. Every database contained 2,000 images in total, with 50 original images and 1,950 augmented images. Types A and B are databases built from varied original images with different pose, lighting condition, background and noise. The numbers 1 to 5 for types A and B denote the original augmented database (A1 and B1), skin extracted using the HSV rule (3) with a black background (A2 and B2) and with a white background (A3 and B3), and skin extracted using the YCbCr rule (4) with a black background (A4 and B4) and with a white background (A5 and B5). Types C to L are the scale-space databases built from A1 to B5 with 4 octaves and 3 levels of hierarchical Gaussian scale-space. C1-C13 are the scale-space images from A1, where C1 is the base octave and base level and C13 is the 4th octave and the 3rd level. Likewise, L1-L13 are the scale-space images from B5, where L1 is the base octave and base level and L13 is the 4th octave and the 3rd level. Figure 6 shows the performance comparison for the best recognition result from each type of database.
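The naming scheme can be made concrete with a small lookup. It assumes that the letters C to L follow A1 to B5 in order, which agrees with the pairings stated in the text (C with A1, L with B5); the names `SOURCES`, `SCALE_SPACE_OF` and `source_of` are hypothetical.

```python
from string import ascii_uppercase

# Augmented databases in order: A1..A5 followed by B1..B5.
SOURCES = [f"{group}{k}" for group in "AB" for k in range(1, 6)]

# Scale-space database letters C..L, paired with their sources in order.
SCALE_SPACE_OF = dict(zip(ascii_uppercase[2:12], SOURCES))


def source_of(name):
    """Return the augmented database behind a scale-space name such as 'D4'."""
    return SCALE_SPACE_OF[name[0]]
```

Under this mapping, `source_of("D4")` gives `"A2"`, and the white-background sources A3, A5, B3 and B5 correspond to the letters E, G, J and L.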
From Figure 5 and Figure 6, we can see the effect of the scale-space images on the recognition system. Recognition using the scale-space databases (C-L) performed better than recognition using the databases with augmented images only (A-B). The best result, 70%, came from database types D4, F4 and K4, which used the base octave at its 4th level. D4, F4 and K4 were respectively the scale-space versions of A2, A4 and B4 at the base octave and the last level. The recognition precisions of A2, A4 and B4 were 42%, 24% and 24% respectively. The A2 database contained augmented images whose skin area was extracted from the background using rule (3) of the HSV color space, with the non-skin background converted to black. The A4 database also contained augmented images with skin extraction, but using rule (4) of the YCbCr color space, again with the non-skin background converted to black. The B4 database was likewise an augmented database with skin extracted using the YCbCr rule (4) and a black background, but with a different type of noise compared with the A databases.
From this, we observed that the best recognition rates came from images within the same octave of the scale-space, which means that the decimation process used to create the next octave lowered the recognition rate. Decimating the image, that is, lowering its resolution, had an adverse effect on the recognition system. Figure 5 shows this in particular as the abrupt change from 4 to 5 for database types C, D, F, H, I and K. For database types E, G, J and L, in contrast, the decimation from the base octave to the next octave benefited the recognition system. This was attributed to the removal of the non-skin background and its replacement with white: images with a white background needed the decimation process, and hence a reduced resolution, to benefit the recognition system. Types E, G, J and L nevertheless did not produce recognition rates as high as types C, D, F, H, I and K.
Figure 7 shows examples of the augmented images for each database. As explained earlier, there were 39 augmented images per original image in each type of database. After further investigation, we found that the augmented images that gave the closest match in the testing processes were mostly those produced by intensity adjustment and by rotation. In [11], the authors studied the performance of several methods for a limited-data recognition system. The SIFT algorithm was shown to be the best method, with a recognition precision of 38%, compared with the Haar wavelet transformation, principal component analysis and hierarchical Gaussian scale-space, each of which produced around 30-32%. That research studied 50 images in total, with only one testing image and one training image from the same class (person). By developing the augmented database for the recognition system, we added 39 augmented images to every original image. The recognition precision with the augmented database was more than two times higher than that of the recognition system with limited training data. The obstacle to reaching higher precision lay in choosing the types of augmentation process used in the recognition system.

Conclusion
In this research, augmented databases were created with different types of human skin color extraction, employing the HSV and YCbCr color spaces, and with a scale-space built using a Gaussian function. The best recognition performance, 70%, was obtained from D4, F4 and K4, the databases that took the scale-space images at the base octave and 4th level of the Gaussian function after extracting the human skin color component using HSV or YCbCr and converting the background to black. The decimation process, which reduces the resolution of the images in the scale-space structure, had an adverse effect on skin-extracted images with a black background. In contrast, it had a beneficial effect on skin-extracted images with a white background. The augmentation processes that produced the best recognition were mostly intensity adjustment and rotation. Compared with the limited-data recognition system, the augmented database improved recognition performance, raising the precision to more than two times higher.

Figure 1. Research method for biometric identification

Figure 4. The configuration of training and testing set

Figure 7. The Examples of 39 Augmented Images from Type A and B Database

Table 1. Recognition Precision for Augmented Database