Autism Spectrum Disorders Gait Identification Using Ground Reaction Forces

Autism spectrum disorders (ASD) are permanent neurodevelopmental disorders that can be identified during the first few years of life and are associated with abnormal walking patterns. Earlier identification of this pervasive disorder could assist diagnosis and support rapid, quantitative clinical judgment. This paper presents an automated approach for identifying ASD gait patterns using three-dimensional (3D) ground reaction forces (GRF). The study involved classification of the gait patterns of children with ASD and typically developing (TD) children. The GRF data were obtained using two force plates during self-determined barefoot walking. Time-series parameterization techniques were applied to the GRF waveforms to extract the important gait features. The most dominant features for characterizing ASD gait were selected using statistical between-group tests and stepwise discriminant analysis (SWDA). The selected features were grouped into two input datasets for the k-nearest neighbor (KNN) classifier. This study demonstrates that the 3D GRF gait features selected using SWDA can be reliably used to identify ASD gait with a KNN classifier, reaching 83.33% classification accuracy.


Introduction
Autism spectrum disorders (ASD) are characterized by persistent deficits in social communication and social interaction, and by the presence of restricted and repetitive behaviors. This pervasive and permanent neurodevelopmental disorder can be recognized during the early stage of a child's developmental period. One of the possible signs that could be used to identify ASD is the existence of motor deficits, which include abnormal gait, clumsiness, and irregular motor signs [1]. An abnormal gait is defined as an irregular style of walking, and this condition could impair occupational and other daily activities of individuals with ASD. Previous studies have reported a wide range of abnormal gait patterns in temporal and spatial measurements, kinematic joint angles, kinetic joint moments, and joint powers during walking in individuals with ASD [2,3].
The identification of gait abnormalities could be beneficial for early detection and better treatment planning for children with ASD [4]. Current gait assessment methods are often time-consuming and highly dependent on the clinician's judgment, which leads to subjective interpretations. Current advances in gait analysis and instrumentation not only provide new insights into all aspects of movement patterns, but also support the development of automated diagnosis of pathological disorders.
Ground reaction force (GRF) is one of the kinetic measurements that has been effectively used for the assessment of normal and pathological movements and for comparisons between patient and normal groups [5]. In routine gait analysis, force plates are used to measure the GRF in three dimensions, namely the medial-lateral, anterior-posterior, and vertical directions. The three components of GRF provide a complete picture of how the body weight drops and moves across the supporting foot during walking [6]. Therefore, investigating all three GRF components is expected to be more effective for identifying specific locomotion characteristics that can be used for automated identification. To the best of our knowledge, the only study that investigated GRF components in children with ASD is that of Ambrosini et al. [7]. The study reported a decrease in the second peak of vertical force in most of their subjects.
In order to recognize gait abnormalities, machine learning models are used to classify and discover underlying patterns in kinematic and kinetic measurements. The application of machine learning classifiers for automated recognition of gait pattern deviations, and in various other biomedical fields, has grown enormously in recent decades. Artificial neural networks (ANN) and support vector machines (SVM) have been employed for recognition and classification of Parkinson's disease [8], young-old gait patterns [9], children with cerebral palsy [10], and patients with neurological disorders [11]. ANN was also successfully used for classification of gender in children [12] and of post-stroke patients [13]. Apart from that, k-nearest neighbor (KNN) was also used as a pattern classifier for gait pattern identification [14] and brain balancing classification [15]. KNN is a supervised machine learning classifier that is simple but robust for statistical estimation and pattern recognition. This non-parametric classification method assigns a class label to each member of the test sample based on a vote among its k nearest neighbors, as determined by a distance metric [16].
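The voting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `knn_predict`, the Euclidean distance, and the two-feature toy data are hypothetical and are not taken from the study.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Euclidean distance from the query to every training sample
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the k closest samples and count their class labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data with two features per sample (not study data)
train_X = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
train_y = ["TD", "TD", "TD", "ASD", "ASD", "ASD"]

print(knn_predict(train_X, train_y, (1.1, 1.0), k=3))
```

With two classes, an odd k avoids tied votes, which is one reason odd values are often preferred in practice.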
Recently, statistical feature selection techniques such as the independent t-test [17], the Mann-Whitney U test [18], and the stepwise method of discriminant analysis (SWDA) [13] have been used to select significant features in gait research. The independent t-test and Mann-Whitney U test (TMWU) are between-group tests that select significant features by comparing gait feature scores between two independent groups. Meanwhile, SWDA is frequently used to determine the optimum set of input features for group membership prediction and to eliminate the least significant and unrelated features from the dataset [19]. Previous studies in gait analysis have shown that SWDA is able to identify the specific individual features that best determine group placement [13,20,21].
Research on the 3D GRF in children with ASD remains scarce, and the available information is insufficient. To date, there is no published literature dealing with automated recognition of ASD gait patterns based on 3D GRF data. Thus, this study proposes automated identification of children with ASD using a machine learning classifier based on 3D GRF input features. These features were first extracted using time-series parameterization methods and then selected using two statistical feature selection techniques. KNN was employed to model both sets of input features, and the classification performance obtained with each input dataset was compared. The rest of this paper is organized as follows. The next section explains the proposed method. Section 3 presents the experimental results and the discussion. Finally, Section 4 concludes the study.

Research Method
The ASD gait identification is built on an automatic gait classification system that uses statistical analysis and machine learning approaches. The proposed system consists of five sequential stages: data acquisition, preprocessing, feature extraction, feature selection, and gait pattern classification, as illustrated in Figure 1. In this study, the 3D GRF data from a single left limb stance in the selected valid trial were analyzed to represent the gait attributes of each participant [22,23]. The time components of the 3D GRF were normalized to the percentage of stance phase time, whereas the 3D GRF amplitudes were normalized to the percentage of the participant's body weight [24,25]. Normalization was performed to eliminate variations among participants with different height, body mass, and duration of stance phase [26,27]. After normalization, the initial foot contact corresponds to 0% and the foot-off event corresponds to 100% of the stance phase.
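The two normalization steps can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `normalize_stance`, the use of linear interpolation, and the choice of 101 output points (one per percent of stance) are assumptions made here for illustration.

```python
def normalize_stance(force_n, body_weight_n, n_points=101):
    """Resample one stance-phase force trace to 0-100% of stance and
    express its amplitude as a percentage of body weight.

    force_n       : force samples in newtons, from foot contact to foot off
    body_weight_n : participant's body weight in newtons
    n_points      : number of output samples (assumed; 101 gives 1% steps)
    """
    if len(force_n) < 2:
        raise ValueError("need at least two force samples")
    last = len(force_n) - 1
    normalized = []
    for i in range(n_points):
        # Position of the i-th output point on the original sample axis
        pos = i * last / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        # Linear interpolation between the two neighbouring samples
        value = force_n[lo] * (1 - frac) + force_n[hi] * frac
        normalized.append(100.0 * value / body_weight_n)
    return normalized

# Hypothetical vertical force trace (N) for a participant weighing 600 N
trace = [0.0, 300.0, 600.0, 300.0, 0.0]
curve = normalize_stance(trace, 600.0)
print(len(curve), curve[0], curve[50], curve[-1])
```

After this step, index 0 of the output corresponds to initial foot contact and the last index to foot off, so traces from participants with different stance durations and body masses can be compared point by point.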
In routine gait analysis, the GRF during normal walking is generally measured in three directions (Fx: medial-lateral, Fy: anterior-posterior, and Fz: vertical). The 3D GRF patterns for a typically developing (TD) participant are shown in Figure 3(a), (b), and (c). These graphs also show the 17 characteristic points that were extracted from the curves. Fy2 was excluded because the force value is zero during mid-stance. Time-series parameterization techniques were applied to each waveform to extract the instantaneous amplitude values and their relative times [24,28]. This technique is considered one of the most common methods of gait data analysis and is both widely preferred and clinically acceptable [29,30].
The following twenty GRF gait features were extracted: the local peaks and minimum values of the three GRF components (Fx1, Fx2, Fx3; Fy1, Fy3; and Fz1, Fz2, Fz3); their relative times of occurrence (Tx1, Tx2, Tx3; Ty1, Ty2, Ty3; and Tz1, Tz2, Tz3); and the loading rate, push-off rate, and peak ratio (Table 1) [5,24,28,31]. The loading rate is defined as the amplitude of the first peak of vertical force divided by its time of occurrence. The push-off rate is computed as the amplitude of the second peak of vertical force divided by the time from the second peak of vertical force until the end of the stance phase [32]. The peak ratio is calculated as the amplitude of the first peak of vertical force divided by the amplitude of the second peak of vertical force [28].
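The three derived features follow directly from these definitions. The sketch below assumes that Fz1 and Fz3 denote the first and second vertical force peaks (Table 1 is not reproduced here, so this mapping is an assumption), that amplitudes are in % body weight, and that times are in % of stance so that the stance phase ends at 100. The input values are hypothetical.

```python
def derived_features(fz1, tz1, fz3, tz3, stance_end=100.0):
    """Compute loading rate, push-off rate, and peak ratio.

    fz1, tz1 : amplitude (%BW) and time (% stance) of the first vertical peak
    fz3, tz3 : amplitude (%BW) and time (% stance) of the second vertical peak
               (assumed naming; the valley between the peaks would be Fz2)
    """
    loading_rate = fz1 / tz1                   # first peak / its time of occurrence
    push_off_rate = fz3 / (stance_end - tz3)   # second peak / time left until foot off
    peak_ratio = fz1 / fz3                     # first peak / second peak
    return loading_rate, push_off_rate, peak_ratio

# Hypothetical values, not taken from Table 2
loading, push_off, ratio = derived_features(fz1=110.0, tz1=25.0, fz3=105.0, tz3=75.0)
print(loading, push_off, ratio)
```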

Feature Selection
Generally, some extracted features may contain redundant or insignificant information, which can lead to poor performance in the classification stage. Hence, feature selection was employed to select the most significant features and to enhance the classifier performance. In this study, between-group tests and stepwise discriminant analysis (SWDA) were used to select the most significant gait features.
Before conducting the between-group tests, the extracted gait features were checked for normality using the Shapiro-Wilk (SW) test, since the sample size in each group was less than 50 [33]. A feature was considered normally distributed if the p-value of the SW test was greater than 0.05. For normally distributed features, the mean scores were compared using independent t-tests (T), whereas Mann-Whitney U tests (MWU) were used for non-normal features. A significant difference between the two groups was defined as p < 0.05 for both tests. Features that were statistically significant were chosen as input features for the classification stage.
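The decision logic of this step can be summarized in a small Python sketch. It only encodes the rules stated above and takes p-values as given inputs; in the study they were computed in SPSS. Checking normality separately in each group, the function names, and the example p-values are assumptions for illustration.

```python
ALPHA = 0.05  # significance level used for both the normality and the between-group tests

def choose_test(sw_p_asd, sw_p_td, alpha=ALPHA):
    """Pick the between-group test for one feature from Shapiro-Wilk p-values.

    Assumption: the feature is treated as normal only if both groups pass.
    """
    is_normal = sw_p_asd > alpha and sw_p_td > alpha
    return "t-test" if is_normal else "Mann-Whitney U"

def select_features(between_group_p, alpha=ALPHA):
    """Keep the features whose between-group p-value is below alpha."""
    return [name for name, p in between_group_p.items() if p < alpha]

# Hypothetical p-values (not the values reported in Table 2)
print(choose_test(0.21, 0.34))
print(choose_test(0.01, 0.34))
print(select_features({"Fz3": 0.010, "Fx1": 0.400, "Ty2": 0.030}))
```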
Another statistical method, SWDA, was used to identify the dominant features that made a significant contribution to separating the two groups. SWDA was performed using the Wilks' lambda method, with the probability of F to enter set at 0.05 and the probability of F to remove set at 0.10. Features that satisfy these criteria contribute significantly to group discrimination [29,34]. Both statistical analyses were performed using IBM SPSS Statistics version 21.0 (IBM, New York, USA).
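To make the Wilks' lambda criterion concrete, the sketch below computes it for a single feature, where it reduces to the ratio of the within-group to the total sum of squares, together with the corresponding F statistic. This covers only the univariate evaluation that a stepwise procedure performs on each candidate at its first step; it is not a reimplementation of the SPSS stepwise routine, and the data are hypothetical.

```python
def wilks_lambda_univariate(groups):
    """Return (Wilks' lambda, F statistic) for one feature measured in several groups.

    For a single feature, lambda = SS_within / SS_total, and
    F = ((1 - lambda) / lambda) * ((N - g) / (g - 1)) for N samples in g groups.
    Smaller lambda means better group separation.
    """
    all_values = [v for group in groups for v in group]
    n_total = len(all_values)
    n_groups = len(groups)
    grand_mean = sum(all_values) / n_total
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_within += sum((v - group_mean) ** 2 for v in group)
    lam = ss_within / ss_total
    # Note: lam == 0 (perfect separation) would make F undefined in this sketch
    f_stat = ((1 - lam) / lam) * ((n_total - n_groups) / (n_groups - 1))
    return lam, f_stat

# Hypothetical feature values for two groups
lam, f_stat = wilks_lambda_univariate([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(lam, f_stat)
```

In a full stepwise analysis, the feature with the most significant F would be entered first, and the remaining candidates would then be re-evaluated conditionally on the features already in the model.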

Classification Model
Classification is the process of assigning each element in a set of data to a target category or class. The ultimate goal of this process is to accurately predict the target class for each case in the dataset. The classification stage was performed using the Statistics and Machine Learning Toolbox in Matlab version R2015a (The MathWorks Inc., USA). The two sets of selected features, namely 3DGRF-TMWU and 3DGRF-SWDA, were fed into the KNN classifier. In this study, the classification tasks were explored using four types of distance metrics (cityblock, correlation, cosine, and Euclidean), while the k value was varied from 1 to 12.
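The four distance metrics and the resulting search grid can be written out as follows. This is a plain-Python sketch using the usual definitions (cosine and correlation distances taken as one minus the respective similarity), not the toolbox code used in the study; the function and variable names are chosen here for illustration.

```python
import math

def cityblock(u, v):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))

def euclidean(u, v):
    """Straight-line distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine(u, v):
    """One minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def correlation(u, v):
    """One minus the sample correlation: cosine distance of the mean-centred vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])

METRICS = {
    "cityblock": cityblock,
    "correlation": correlation,
    "cosine": cosine,
    "euclidean": euclidean,
}

# Every (metric, k) combination explored: 4 metrics x 12 values of k
grid = [(name, k) for name in METRICS for k in range(1, 13)]
print(len(grid))
```

Each entry of `grid` corresponds to one classifier configuration whose accuracy would be estimated and compared.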
In order to find the best model for the input dataset, it is important to use a cross-validation method for model evaluation. This method evaluates the model on an independent test set that has not been used during training [35]. Due to the small sample size in this study, 10-fold cross-validation was chosen to estimate the generalization ability of the KNN classifier [25,35]. In 10-fold cross-validation, the dataset is randomly divided into 10 equal or nearly equal-sized subsets, or folds. Nine folds are used for training and the remaining fold is used for testing. The procedure is repeated ten times so that, in each iteration, a different fold is held out for evaluation and the other nine folds are used for training. The classification accuracy is then calculated by averaging the accuracy over the ten folds [35].
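The fold construction and averaging described above can be sketched as follows. The helper names, the fixed random seed, and the `evaluate` callback (which stands in for training and testing a classifier on the given indices) are hypothetical and only illustrate the procedure.

```python
import random

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Randomly split sample indices into n_folds nearly equal-sized folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding over the shuffled list gives fold sizes that differ by at most one
    return [indices[i::n_folds] for i in range(n_folds)]

def cross_validate(n_samples, evaluate, n_folds=10, seed=0):
    """Average the accuracy returned by `evaluate(train_idx, test_idx)` over all folds."""
    folds = k_fold_indices(n_samples, n_folds, seed)
    accuracies = []
    for i, test_idx in enumerate(folds):
        # All folds except the i-th one form the training set
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        accuracies.append(evaluate(train_idx, test_idx))
    return sum(accuracies) / len(accuracies)

# Hypothetical evaluator that only checks the split is clean (no train/test overlap)
def no_overlap(train_idx, test_idx):
    return 1.0 if not set(train_idx) & set(test_idx) else 0.0

print(cross_validate(25, no_overlap))
```

In practice, `evaluate` would fit the KNN classifier on the training indices and return its accuracy on the held-out fold.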
The model performance with the two input datasets and the variations of model parameters was measured using a confusion matrix with two classes, TD and ASD. In this study, true positive (TP) is the number of ASD cases correctly classified and true negative (TN) is the number of TD cases correctly classified. The effectiveness of the TMWU and SWDA feature selection was measured by the classification accuracy, defined as the proportion of correct classifications (TP and TN) made by the model over the dataset.
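As a worked illustration of this accuracy definition, the sketch below counts the four confusion-matrix cells with ASD as the positive class and computes accuracy as (TP + TN) / (TP + TN + FP + FN). The labels are hypothetical and do not reproduce the study's results.

```python
def confusion_counts(y_true, y_pred, positive="ASD"):
    """Return (TP, TN, FP, FN) for a two-class problem with `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1   # ASD correctly classified
        elif truth != positive and pred != positive:
            tn += 1   # TD correctly classified
        elif truth != positive and pred == positive:
            fp += 1   # TD misclassified as ASD
        else:
            fn += 1   # ASD misclassified as TD
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Proportion of all cases that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical labels
y_true = ["ASD", "ASD", "TD", "TD", "TD", "ASD"]
y_pred = ["ASD", "TD", "TD", "TD", "ASD", "ASD"]
counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(*counts))
```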

Results and Discussion
After feature extraction using the parameterization techniques, twenty GRF gait features were obtained to represent the gait profile of each participant. Table 2 tabulates the mean, standard deviation (SD), and p-value of each extracted gait feature. Of the twenty gait features, only six significant features were chosen using the independent t-test and Mann-Whitney U test (TMWU) for ASD gait classification. These significant gait features, which have a p-value less than 0.05, are shown in bold in Table 2. The dominant features are Fy3, Ty2, Fz3, Tz3, push-off rate, and peak ratio. Regarding the mean values of the dominant features, children with ASD were found to exhibit significantly lower Fy3, Ty2, Fz3, Tz3, and push-off rate, whereas the peak ratio was significantly greater in ASD than in TD. Based on the SWDA approach, only three dominant gait features, namely Fz3, Ty2, and Tx1, were found to discriminate the ASD gait from the normal gait patterns. All three dominant features contributed significantly to the classification, with p-values below 0.05.
Table 3 summarizes the classification accuracy attained using the KNN classifier with the four distance metrics and their optimized k values for the 3DGRF-TMWU and 3DGRF-SWDA datasets. The rates of correct classification were within the range of 77% to 83%. For the 3DGRF-TMWU dataset with six input features, the cityblock distance with k = 9 produced the highest accuracy, 81.67%, among the distance metrics. Meanwhile, the combination of the three dominant features of the 3DGRF-SWDA dataset and the KNN classifier with Euclidean distance and k = 11 demonstrated an improved performance for ASD gait identification, with 83.33% accuracy.
The results indicate the potential of using both statistical feature selection techniques to determine significant and dominant gait features prior to identifying ASD gait. In this particular case, the SWDA approach produced the better set of predictors, reaching higher accuracy with three features than TMWU did with six. This study also highlights the relevance of 3D GRF measurements in ASD gait pattern identification. Future studies should explore other types of gait features and machine learning classifiers to enhance classification accuracy.

Conclusion
In this paper, an identification system for ASD gait based on the 3D GRF is presented. This study has shown that 3D GRF gait features extracted using time-series parameterization techniques and then selected using statistical feature selection methods could be used to identify abnormalities in ASD gait. This study also highlights the importance of feature selection techniques for selecting dominant gait features prior to classification. Overall, the 3D GRF gait features selected using SWDA, together with the optimized KNN classifier, successfully discriminated the 3D GRF gait patterns into ASD and TD groups with 83.33% accuracy. These findings would be beneficial for automatic screening and diagnosis of ASD and also for the detection of gait abnormalities in individuals with ASD or other neurological gait disorders.

Figure 1. The overall process of ASD gait identification.

TELKOMNIKA
Vol. 15, No. 2, June 2017

[...] order low-pass Butterworth filter with cutoff frequency of 30 Hz to reduce noise. Next, the 3D GRF data were extracted into the ASCII text format for data analysis. These processes were computed using the Vicon Nexus software version 1.8.5 (Vicon, Oxford, UK).

Figure 3. The three ground reaction force components of a single left limb stance during the stance phase of a typically developing female participant. (a) Medial-lateral direction; (b) anterior-posterior direction; and (c) vertical direction.

Table 1. The extracted 3D GRF gait features and their abbreviations.

Table 2. Mean, standard deviation (SD), and p-value of the extracted 3D GRF gait features.

Table 3. Classification accuracy of the KNN classifier with four distance metrics and their optimized k values for the TMWU and SWDA datasets.