Classification of EEG signals for facial expression and motor execution with deep learning

Machine learning algorithms are now widely used in electroencephalography (EEG)-based brain-computer interfaces (BCI). In this work, the EEG signals are preprocessed by applying the principal component analysis (PCA) algorithm to extract the important features and reduce data redundancy. A model was designed to classify EEG time-series signals for facial expressions and some motor execution processes, using a deep learning classifier built as a neural network with three hidden layers. Data from four subjects were collected with a 14-channel Emotiv EPOC+ device. The recorded EEG dataset samples cover ten action classes of facial expressions and motor execution movements. Classification accuracy in the range 91.25-95.75% was obtained for the collected samples, depending on the number of samples per class, the total number of EEG dataset samples, and the type of activation function in the hidden and output layer neurons. The time-series EEG signal was taken as signal values, not as an image or histogram, and was analysed and classified with deep learning to obtain satisfactory accuracy. Several activation functions (sigmoid, relu, softmax, and tanh) were tested within the hidden layers' neurons. The most acceptable accuracy was obtained with the tanh(x) activation function in the hidden layers and softmax(x) in the output layer neurons. A root mean square (RMS) optimizer was used to minimize the error while training the neural network.


INTRODUCTION
A brain-computer interface (BCI) is a system that connects human brain signals with appliances or devices without requiring any physical contact. It is regarded as a new means of communication in which brain activity, reflected in the brain's electrical signals, is used to control external systems such as computers, wheelchairs, switches, or neuroprosthetic extensions [1]-[6].
TELKOMNIKA Telecommun Comput El Control 1589
Electroencephalography (EEG) is the process of acquiring and recording the brain's electrical signals so that brain activity can be analyzed and made clear to the user. Electrodes are simply placed on the scalp to collect these signals. An EEG signal is band-limited in frequency (0.1-60 Hz) and is modeled and classified into five types of waves (theta, delta, beta, alpha, and gamma), which are associated with different brain activities [7], [8]. EEG recordings contain high redundancy, so feature extraction is an important stage before the signals are classified. A feature is a distinctive attribute, identifiable measure, or functional element obtained from a segment of samples. Feature extraction is used to preserve the significant information in the signal, losing as little of it as possible, and to reduce the resources needed to describe a large amount of data accurately. This leads to a simpler implementation that reduces the cost of processing the information and eliminates the need for data compression [9]-[14]. In this work, the principal component analysis (PCA) method is used for unsupervised feature extraction. PCA is a descriptive statistical technique that describes the differences between the samples of the dataset and identifies the most correlated ones. It detects the principal components of the signal dataset and thereby reduces the dimensionality of the data [15].
Algorithms for classifying EEG signals in BCIs have been grouped into four main categories, namely matrix and tensor, adaptive, deep learning, and transfer learning classifiers, along with a few other miscellaneous classifiers [2], [12], [16]-[20]. In EEG research, machine learning has been used to discover information relevant to neuroimaging and neural classification. Advances in machine learning and the availability of large EEG datasets have led to the use of deep learning for analyzing EEG signals and for understanding brain functionality by interpreting the information collected from it [6], [21]-[24]. In general, applications of deep learning to EEG fall into the following groups: motor imagery, emotion recognition, mental task workload, seizure observation, event-related potential (ERP) task detection, and sleep state recording [25].

RESEARCH METHOD
This work focuses on EEG signal features to identify the EEG signals for facial expressions (FEs) and some motor execution actions. The FEs are surprise, smile, left wink, right wink, and mouth opened. The motor execution actions are right hand lifting, left hand lifting, rotating the head to the right, rotating the head to the left, and clapping. All signals were first collected with the Emotiv EPOC+ 14-channel mobile brainwear headset and retrieved through the licensed Emotiv Pro software within a Python environment. A model for classifying these signals was then designed. Figure 1 shows the block diagram of the research methodology. Each step is detailed in the following subsections.

Data collection
The first stage of the research methodology is collecting dataset samples with the Emotiv EPOC+ headset, whose 14 channels are distributed around the head. The data was collected from four subjects of different ages (10-50 years), both male and female, while they performed the required facial expressions and motor execution actions. The EEG signals were recorded with the monthly licensed Emotiv software (Emotiv Pro) and saved as .csv files to be used later for training the neural network within a Python environment. During the recording process, about 6487 EEG samples were collected. Table 1 shows some samples of the collected EEG data for lifting the left hand for one subject.

Data pre-processing
This stage removes artifacts from the EEG signals. It is performed by the Emotiv headset itself, and the data is recorded directly as it is received from the headset. The headset applies a good amount of signal processing and filtering to remove artifacts and harmonic frequencies, so the signals appear clean when good contact quality is achieved. The signals are sampled at a 2048 Hz sampling frequency and then passed through a dual notch filter at 50 Hz and 60 Hz as well as a low-pass filter with a 64 Hz cutoff frequency. Finally, the data is downsampled to 128 or 256 Hz.
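The sampling figures above can be checked with a little arithmetic: both output rates divide the 2048 Hz internal rate evenly, and the 64 Hz low-pass cutoff coincides with the Nyquist limit of the 128 Hz output. The sketch below is illustrative only; the helper `decimation_summary` is a hypothetical name and not part of any Emotiv software.

```python
def decimation_summary(internal_rate_hz, output_rate_hz):
    """Return (decimation factor, Nyquist frequency in Hz) for an output rate."""
    if internal_rate_hz % output_rate_hz != 0:
        raise ValueError("output rate must divide the internal rate evenly")
    factor = internal_rate_hz // output_rate_hz  # keep 1 of every `factor` samples
    nyquist_hz = output_rate_hz / 2              # highest representable frequency
    return factor, nyquist_hz


INTERNAL_RATE_HZ = 2048   # internal sampling rate quoted in the text
LOW_PASS_CUTOFF_HZ = 64   # low-pass cutoff quoted in the text

summary = {rate: decimation_summary(INTERNAL_RATE_HZ, rate) for rate in (128, 256)}
for rate, (factor, nyquist) in summary.items():
    print(f"{rate} Hz output: keep 1 of every {factor} samples, Nyquist = {nyquist} Hz")
```

With a 128 Hz output the cutoff sits exactly at the Nyquist frequency, which is why the low-pass stage has to precede the rate reduction.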

Feature extraction
In this stage, the preprocessed data obtained from the Emotiv headset is processed with the PCA algorithm to improve the classifier's accuracy. PCA is a technique for reducing the dimensionality of large datasets by converting a large set of variables into a smaller one that retains most of the information in the large set [15], [26]. We have 6487 samples from each of the 14 channels of the headset. To implement PCA, the mean values must be computed first, so that the standardization (Z) of the initial values of the dataset can be computed, as in (1), to transform all the variables to the same range [26]. The second step of PCA is to compute the covariance matrix, to check whether there is any relationship or correlation between the variables of the dataset, so that information redundancy can be reduced as much as possible. The covariance between all possible pairs of the initial dataset variables was computed using (2) to construct the entries of the covariance matrix, which is a p×p symmetric matrix, where X̄ is the mean value of variable X and p is the number of dimensions.
The third step of PCA is to compute the eigenvectors and eigenvalues of the covariance matrix in order to locate the principal components. The principal components are new, uncorrelated variables constructed so that most of the information in the dataset is compressed into the first components, with the amount of information decreasing in each subsequent component. The fourth step is to form the feature vector, a matrix whose columns are the eigenvectors of the components selected in the previous step. Only k components (eigenvectors) are therefore kept instead of all p. The final step of PCA is to reorient the data from the original axes to the axes of the selected principal components by multiplying the transpose of the feature vector by the standardized dataset, as in (3).
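The five PCA steps can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses the usual textbook forms (standardization z = (x - mean) / standard deviation, sample covariance normalized by n - 1, and projection as feature vector transposed times standardized data transposed), the function name `pca_reduce` is hypothetical, and the input is synthetic data standing in for 14-channel recordings.

```python
import numpy as np


def pca_reduce(samples, k):
    """Project samples (n_samples x p) onto the top-k principal components."""
    # Step 1: standardize each variable to zero mean and unit variance.
    mean = samples.mean(axis=0)
    std = samples.std(axis=0, ddof=1)
    z = (samples - mean) / std
    # Step 2: p x p covariance matrix of the standardized variables
    # (np.cov normalizes by n - 1 by default; this is an assumption here).
    cov = np.cov(z, rowvar=False)
    # Step 3: eigen-decomposition; eigh is suited to symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]  # descending variance
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    # Step 4: feature vector = first k eigenvectors as columns (p x k).
    feature_vector = eigenvectors[:, :k]
    # Step 5: final data = feature_vector^T @ z^T, transposed back to n x k.
    projected = (feature_vector.T @ z.T).T
    explained = eigenvalues[:k] / eigenvalues.sum()
    return projected, explained


rng = np.random.default_rng(0)
# Synthetic stand-in for 14 correlated channels: 3 latent sources plus noise.
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 14))
eeg_like = latent @ mixing + 0.1 * rng.normal(size=(500, 14))

reduced, explained_ratio = pca_reduce(eeg_like, k=3)
print(reduced.shape, explained_ratio)
```

The value of k is a free choice in this sketch; a common heuristic is to keep enough components to reach a target share of the total variance.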

Classification model development
In this work, a deep neural network was built to classify the EEG signals for the ten actions, covering facial expressions and motor execution. The main advantage of deep learning is that its performance often continues to improve as the size of the dataset increases. The task was implemented in the Spyder 3.3.1/Python environment by importing the Keras libraries; Keras is a deep learning API written in Python. A Sequential model, which is a linear stack of layers, was built with 3 hidden layers containing 1024, 512, and 256 neurons respectively, each with a tanh(x) activation function. The output layer consists of 10 output neurons with a softmax(x) activation function. Figure 2 shows the sequential model of the work.
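To make the layer stack concrete, the following framework-independent NumPy sketch runs a forward pass through a network of the described shape. It is not the authors' Keras code: the weights are random and untrained, the input width `N_FEATURES = 14` is an assumption (one value per headset channel), and the Glorot-uniform initialization is chosen only for illustration. In Keras, the same structure would correspond to a Sequential stack of Dense layers.

```python
import numpy as np

HIDDEN_SIZES = [1024, 512, 256]  # hidden layers, as described in the text
N_CLASSES = 10                   # ten action classes
N_FEATURES = 14                  # assumption: one input value per channel


def init_params(n_features, hidden_sizes, n_classes, seed=0):
    """Create (weights, bias) pairs for every layer with Glorot-uniform weights."""
    rng = np.random.default_rng(seed)
    sizes = [n_features, *hidden_sizes, n_classes]
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        limit = np.sqrt(6.0 / (fan_in + fan_out))
        weights = rng.uniform(-limit, limit, size=(fan_in, fan_out))
        params.append((weights, np.zeros(fan_out)))
    return params


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def forward(batch, params):
    """tanh on every hidden layer, softmax on the output layer."""
    activations = batch
    for weights, bias in params[:-1]:
        activations = np.tanh(activations @ weights + bias)
    weights, bias = params[-1]
    return softmax(activations @ weights + bias)


params = init_params(N_FEATURES, HIDDEN_SIZES, N_CLASSES)
batch = np.random.default_rng(1).normal(size=(4, N_FEATURES))
probs = forward(batch, params)
n_parameters = sum(w.size + b.size for w, b in params)
print(probs.shape, n_parameters)
```

Each output row is a probability distribution over the ten classes; the predicted class is the index of the largest entry.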

Performance evaluation
The collected dataset samples are divided into two groups, 80% for training and 20% for testing, to build and test the sequential classification model. The performance is evaluated in each epoch with respect to two quantities: the loss value and the classification accuracy. Accuracy is the percentage of predicted values (yPred) that match the actual values (yTrue). When running the model, the effect of several parameters must be observed, since they significantly affect the accuracy and the processing time of the classification process. These parameters are the number of samples for each class, the total number of samples, and the type of activation function applied within the hidden and output layer neurons. Using an equal number of samples for each class gives better classification accuracy than using a random number of samples per class, and it clearly reduces the number of epochs required to train the neural network, and hence the overall processing time, as shown in Figure 3.
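The class balancing, the 80/20 split, and the accuracy measure described above can be sketched with the Python standard library alone. This is an illustrative sketch on toy data, not the authors' pipeline; the function names `balance_classes`, `train_test_split`, and `accuracy` are hypothetical, and the shuffling seed is arbitrary.

```python
import random
from collections import defaultdict


def balance_classes(samples, labels, per_class, seed=0):
    """Keep exactly `per_class` randomly chosen samples for every label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    balanced = []
    for label, group in sorted(by_label.items()):
        if len(group) < per_class:
            raise ValueError(f"class {label} has only {len(group)} samples")
        balanced.extend((s, label) for s in rng.sample(group, per_class))
    return balanced


def train_test_split(pairs, train_fraction=0.8, seed=0):
    """Shuffle (sample, label) pairs and split them into train and test sets."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Toy data: ten classes with unequal sizes (100, 110, ..., 190 samples).
class_sizes = {label: 100 + 10 * label for label in range(10)}
labels = [label for label, n in class_sizes.items() for _ in range(n)]
samples = [[float(i)] for i in range(len(labels))]

balanced = balance_classes(samples, labels, per_class=100)
train, test = train_test_split(balanced)
example_accuracy = accuracy([0, 1, 2, 3], [0, 1, 2, 0])
print(len(train), len(test), example_accuracy)
```

A stratified split, which preserves the class proportions in both subsets, would be a natural refinement of the plain shuffle used here.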
The total number of samples is the size of the collected dataset. As this size increases, deep learning gives better classification results, but the increase cannot continue indefinitely, because the processing time also increases while the accuracy levels off at a specific value. Finally, many types of activation functions are available, such as the sigmoid, relu, softmax, tanh, and exponential functions, and these were tried within the hidden layers' neurons. The most acceptable accuracy level was obtained when using the tanh(x) activation function in the hidden layers, while softmax(x) was used in the output layer neurons. A root mean square (RMS) optimizer was used to minimize the error while training the neural network.
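For reference, the activation functions named above have the following standard definitions, written here as a small standard-library sketch. The function names are chosen for illustration and are not taken from any particular library.

```python
import math


def sigmoid(x):
    """Logistic function with output in (0, 1); branches avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def tanh(x):
    """Hyperbolic tangent with zero-centred output in (-1, 1)."""
    return math.tanh(x)


def exponential(x):
    """Exponential activation: e raised to the power x."""
    return math.exp(x)


def softmax(values):
    """Map a list of scores to probabilities that sum to 1."""
    peak = max(values)  # subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), relu(x), tanh(x), exponential(x))
print(softmax([1.0, 2.0, 3.0]))
```

Unlike the other four, softmax acts on a whole vector of scores rather than on a single value, which is why it is the usual choice for a multi-class output layer.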

RESULTS AND DISCUSSION
EEG signal classification for 10 classes of facial expressions and motor execution actions was implemented for four subjects. The performance of the classification model was evaluated as described in the previous section. The training accuracy ranged from 91.25% to 95.75%, and the best results were obtained when training the model with 100 samples/class and a total of 973 samples. In future work, these results will be used in applications such as binding the classes to specific sentences or words. The main goal of this paper is thus to design a simple EEG classifier that helps speechless persons express their intended thoughts.

CONCLUSIONS
In this paper, ten classes of EEG time-series signal values were classified by building a deep neural network and applying deep learning techniques. A dedicated dataset was recorded for this purpose. In offline training, the classification accuracy reached 95.75%, while the cost of computation and the storage requirements were minimized by applying only the PCA algorithm to the EEG signal values, without any other filtering, and by feeding the deep neural network with EEG signal values rather than images or histograms.