Improved Key Frame Extraction Using Discrete Wavelet Transform with Modified Threshold Factor

Video summarization is used in different applications such as video object recognition and classification. In video processing, numerous frames contain similar information, which leads to longer processing time, slower processing speed and higher complexity. Using key frames greatly reduces the amount of memory needed for video data processing as well as the complexity. In this paper, key frame extraction for Arabic isolated words using the discrete wavelet transform (DWT) with a modified threshold factor is proposed with different wavelet bases. The results for the wavelet bases db, sym and coif show that the best number of key frames is obtained at a threshold factor value of 0.75.


Introduction
Videos of dynamic signs consist of a large number of frames, most of which are not essential for determining the meaning of the performed sign; rather, only a few important frames from the video are sufficient. These most important, and thus distinguishing, frames are known as key frames [1]. Through key frame sequence detection and identification, sign language can be recognized rapidly. Sign language is a way for deaf people to communicate and exchange ideas through the finger alphabet and gestures instead of spoken language [2]. Extracting key frames from the video and analyzing only these frames instead of all the frames present in the video can greatly improve the performance of the system. Analysis of these key frames can also help in forming annotations for the video.
Different techniques for key frame extraction have been reported. A key frame extraction method for video copyright protection was presented in [3]; in this technique, a two-stage method is used to extract accurate key frames that cover the content of the whole video sequence. A key frame extraction method based on unsupervised clustering and mutual comparison was developed in [4]. A method of key frame extraction that thresholds the absolute difference of the histograms of consecutive frames of video data was proposed in [5]. A brief representation and comparison of effective key frame extraction methods, such as cluster-based analysis, the generalized Gaussian density (GGD) method, General-Purpose Graphical Processing Unit (GPGPU) and histogram difference, was presented in [6]. A square histogram based model using frame segmentation and automatic threshold calculation was developed in [7]. A key frame extraction method using wavelet statistics was proposed in [8], and in [9] the edge change ratio algorithm is used to detect the shots of the video, from which key frames are then extracted.
Video summarization has been proposed to enable faster browsing of large video collections and more efficient content indexing and access. As the name implies, video summarization is a mechanism for producing a short summary of a video that gives the user a synthetic and useful visual abstract of the video sequence; the summary can consist of either key frames or video skims [10]. Key frames can express the main content of the video data clearly and greatly reduce the amount of memory needed for video data processing as well as the complexity. This makes the storage organization, retrieval and recognition of video information more convenient and efficient; thus key frame extraction is an efficient method for video summarization [5]. In this study, an improved method for key frame detection using the discrete wavelet transform (DWT) with a modified threshold factor is proposed to obtain a near-optimum number of frames in order to decrease the required processing time. This paper is organized as

Uses of Key-Frame Extraction
In this section, the applications of key frame extraction are described.

Video transmission
To reduce the transmission load on the network and the amount of invalid information, the transmission, storage and management techniques for video information have become more and more important [11]. When a video is being transmitted, the use of key frames reduces the amount of data required in video indexing and provides the framework for dealing with the video content [12]. In key frame based online coding for video transmission, each frame can only choose the latest coded and reconstructed key frame as its reference frame. After coding and packetisation, the compressed video packets are transmitted with differentiated service classes. The key frame is sent from the source along with difference values, and the picture is reconstructed at the destination from the key frame picture and the difference values [13].
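The key-frame-plus-difference idea described above can be illustrated with a minimal sketch. It is not the coding scheme of [13]: the function names `encode_group` and `decode_group` are hypothetical, frames are assumed to be 8-bit gray-scale NumPy arrays, and the first frame of the group is assumed to be the key frame.

```python
import numpy as np

def encode_group(frames):
    """Return the key frame and each later frame's signed difference from it."""
    key = frames[0]  # assumption: the first frame of the group is the key frame
    # int16 can hold every difference of two uint8 values (-255..255)
    diffs = [f.astype(np.int16) - key.astype(np.int16) for f in frames[1:]]
    return key, diffs

def decode_group(key, diffs):
    """Rebuild all frames at the destination from the key frame and differences."""
    rebuilt = [key]
    for d in diffs:
        rebuilt.append((key.astype(np.int16) + d).astype(np.uint8))
    return rebuilt

# Small synthetic example: three random 4x4 gray-scale frames
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, size=(4, 4), dtype=np.uint8) for _ in range(3)]
key, diffs = encode_group(frames)
restored = decode_group(key, diffs)
```

When consecutive frames are similar, the difference arrays are mostly close to zero, which is what makes them cheaper to compress than the full frames.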

Video annotation
Video annotation is the extraction of information about a video and the addition of this information to the video, which can help in browsing, searching, analysis, retrieval, comparison and categorization. To annotate is to attach data to some other piece of data [14].

Video indexing
Key frames reduce the amount of data required in video indexing and provide a framework for dealing with the video content. If key frames are shown beside a video before it is downloaded over the internet, users can predict the content of the video and decide whether it is pertinent to their search. Other applications include creating chapter titles in DVDs and prints from video [15].

Video summarization
Video summarization is a compact representation of a video sequence. It is useful for various video applications such as video browsing and retrieval systems. A video summary can be a preview sequence formed from a collection of key frames, that is, a set of chosen frames of the video. Although key frame based video summarization may lose the spatio-temporal properties and audio content of the original video sequence, it is the simplest and most common method [10].

Proposed Method
The proposed method comprises four steps. In the first step, the video frames are read and each pair of consecutive frames is transformed with the DWT to obtain four sub-bands: LL, HL, LH and HH. Only the three detail sub-bands, HL, LH and HH, are used to detect key frames. For each sub-band, a difference value is estimated by subtracting the detail component values of the current frame and the next frame, as shown in Equation 1, where i and j are the row and column indices and the two operands are the HL, LH and HH bands of the current and next gray-scale frames, respectively. In the second step, the mean and standard deviation of the sub-band difference values from step one are calculated using Equation 2 and Equation 3. In the third step, the threshold value for each sub-band is computed by Equation 4, which gives the threshold value at each level in terms of the modified threshold factor β. In the last step, the difference values are compared with the thresholds: if the difference between the present frame and the previous frame is greater than the related threshold, the later frame is considered a key frame [7]. Figure 1 shows the flow chart for key frame extraction using the modified threshold factor.
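The four steps can be sketched as follows. This is only an illustration under stated assumptions, not the authors' MATLAB implementation: a hand-written single-level Haar transform stands in for the DWT (Haar is the same wavelet as db1), the difference value is assumed to be the sum of absolute differences over i and j, the threshold is assumed to have the form mean + β · standard deviation, and a frame is assumed to be a key frame when any of the three sub-bands exceeds its threshold. The names `haar_dwt2`, `band_differences` and `extract_key_frames` are hypothetical.

```python
import numpy as np

def haar_dwt2(frame):
    """Single-level 2-D Haar (db1) transform; returns LL and three detail bands."""
    f = np.asarray(frame, dtype=float)
    f = f[: f.shape[0] // 2 * 2, : f.shape[1] // 2 * 2]  # crop to even size
    s = np.sqrt(2.0)
    lo = (f[:, 0::2] + f[:, 1::2]) / s  # low-pass along each row
    hi = (f[:, 0::2] - f[:, 1::2]) / s  # high-pass along each row
    ll = (lo[0::2] + lo[1::2]) / s
    lh = (lo[0::2] - lo[1::2]) / s
    hl = (hi[0::2] + hi[1::2]) / s
    hh = (hi[0::2] - hi[1::2]) / s
    # Note: HL/LH naming conventions differ between toolboxes.
    return ll, (hl, lh, hh)

def band_differences(frames):
    """Step 1: one difference value per detail band for each consecutive pair."""
    details = [haar_dwt2(f)[1] for f in frames]
    rows = []
    for cur, nxt in zip(details, details[1:]):
        # Assumed form of Equation 1: sum over i, j of |next - current|
        rows.append([np.abs(n - c).sum() for c, n in zip(cur, nxt)])
    return np.array(rows)  # shape (number of frames - 1, 3)

def extract_key_frames(frames, beta=0.75):
    """Steps 2-4: per-band mean/std, threshold, and comparison."""
    d = band_differences(frames)
    mean = d.mean(axis=0)            # step 2
    std = d.std(axis=0)              # step 2
    thresh = mean + beta * std       # step 3 (assumed form of Equation 4)
    exceeds = (d > thresh).any(axis=1)  # step 4 (assumption: any band)
    # Transition k compares frame k with frame k + 1; the later frame is kept.
    return [int(k) + 1 for k in np.flatnonzero(exceeds)]

# Synthetic clip: three blank frames followed by three frames with a bright patch
blank = np.zeros((8, 8))
patch = np.zeros((8, 8))
patch[2:5, 3:6] = 10.0
clip = [blank, blank, blank, patch, patch, patch]
key_frames = extract_key_frames(clip, beta=0.75)
```

In this synthetic clip the only content change occurs between the third and fourth frames, so the fourth frame (index 3) is the only one selected. A larger β raises every threshold and can only reduce the number of selected frames.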

Experimental Results and Discussion
The data used in this section consist of sign language words used by deaf people. The isolated words were collected in the Arabic language, in the Iraqi dialect, in collaboration with the Iraqi Ministry of Labour and Social Affairs. More than forty signed words were collected; two of them, water and drug, are used as inputs to the proposed system. MATLAB 2013b was used to implement the proposed algorithm.
In Table 1 the db wavelet basis is used with different values of the modified threshold factor β and applied to the drug video, which has a size of 2.96 MB and a frame rate of 30 frames/sec and contains 43 frames. The results show that as β increases the number of key frames decreases, whereas choosing a smaller threshold factor increases the number of key frames; for this reason, β equal to 0.75 can be considered a suitable value. Furthermore, changing the wavelet order among db1, db2 and db4 has no effect on the number of key frames. In Table 2 the coif wavelet basis is used and the change in the number of key frames is observed as the threshold factor value changes; again, a value of 0.75 gives an efficient number of key frames. The key frames are not influenced by the choice of wavelet basis, as no apparent effect is observed when the basis is changed. In Table 3 the sym wavelet basis is used, and the same results and conclusions as for Table 1 and Table 2 are obtained.
Figure 2 shows the 43 frames contained in the drug.mp4 video clip, which has a size of 2.96 MB and a frame rate of 30 frames/sec. Figure 3 shows the key frame images, about 8 images, obtained after applying the algorithm with wavelet basis db1 and β=0.5. Figure 4 shows the 52 frames contained in the water.mp4 video clip, which has a size of 3.56 MB and a frame rate of 30 frames/sec. Figure 5 shows the key frame images, about 15 images, obtained after applying the algorithm with wavelet basis db1 and β=0.5.
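The inverse relation between β and the number of key frames can be checked with a short sweep. The sketch below assumes a threshold of the form mean + β · standard deviation per sub-band and uses randomly generated difference values as a stand-in for a 43-frame clip; it does not reproduce the values in Tables 1 to 3, and the grid of β values and the name `count_key_frames` are illustrative only.

```python
import numpy as np

def count_key_frames(diffs, beta):
    """Count transitions whose difference exceeds mean + beta * std in any band."""
    thresh = diffs.mean(axis=0) + beta * diffs.std(axis=0)
    return int((diffs > thresh).any(axis=1).sum())

# Synthetic stand-in: 42 frame-to-frame differences for 3 detail sub-bands
rng = np.random.default_rng(1)
diffs = rng.exponential(scale=1.0, size=(42, 3))

betas = [0.25, 0.5, 0.75, 1.0]  # illustrative grid, not the paper's settings
counts = [count_key_frames(diffs, b) for b in betas]
for b, c in zip(betas, counts):
    print(f"beta={b:.2f}: {c} key frames")
```

Because the standard deviation is non-negative, each threshold grows with β, so the count can never increase as β increases, whatever the input data.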

Conclusion
In this paper, an improved key frame extraction method using the discrete wavelet transform with a modified threshold factor was presented and evaluated with the wavelet bases db, sym and coif. The aim of this method is to reduce the redundant frames, which can lead to dimensionality reduction of the feature vector used for classifying isolated Arabic sign language words for deaf people. According to the experimental results, the number of key frames is affected by the value of the threshold factor. As future work, we will continue our research on pattern recognition of sign language based on the key frames obtained from the video clips.

Figure 1. Flowchart for Key Frame Extraction using Modified Threshold Factor

Figure 2. Frames of the Input Video of Drug Word

Figure 3.

Figure 4. Frames of Input Video of Water Word

Figure 5.

Table 1. Key Frame Numbers with a Different Value of Modified Threshold Value using db1, db2 and db3

Table 2. Key Frame Numbers with a Different Value of Modified Threshold Value using

Table 3. Key Frame Numbers with a Different Value of Modified Threshold Value using Sym1, Sym2 and Sym3
Columns: β; No. of Frames N; Sym2 (key frame t); Sym3 (key frame t)