Classifying Confidential Data using SVM for Efficient Cloud Query Processing

Huda Kadhim Tayyeh, Ahmed Sabah Al-Jumaili


Organizations now widely use cloud database engines offered by cloud service providers. Privacy remains their main concern, since every organization seeks a more secure and private environment for its own data. For this purpose, several studies have proposed different encryption methods to protect data in the cloud. However, the daily transactions issued as queries against such databases make encrypting all of the data an inefficient solution. Recent studies have therefore presented mechanisms for classifying the data before it is migrated to the cloud, which reduces the amount of data that must be encrypted and thus improves efficiency. Yet most of the classification methods used in the literature rely on string-based matching. Such an approach depends on exact matching of terms, so partial matches are not considered. This paper aims to take advantage of the N-gram representation together with Support Vector Machine (SVM) classification. Real-time data are used in the experiment and are represented using the N-gram approach. After classification, the Advanced Encryption Standard (AES) algorithm is used to encrypt the confidential data. Results showed that the proposed method outperformed the baseline encryption method. This emphasizes the usefulness of machine learning techniques for classifying data based on confidentiality.
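To illustrate why an N-gram representation can capture partial matches that exact string matching misses, the following is a minimal sketch and not the implementation used in this paper. It assumes character trigrams and the Dice coefficient as the overlap measure; the function names `char_ngrams` and `dice_similarity` and the example terms are hypothetical choices made for illustration only.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased string.

    Assumption: character-level n-grams with n=3; the paper may use a
    different n or a different n-gram unit.
    """
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def dice_similarity(a, b, n=3):
    """Dice coefficient between the n-gram sets of two strings (0.0 to 1.0)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        # Strings shorter than n produce no n-grams; treat as no overlap.
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


# Exact matching treats these two related terms as entirely different...
exact_match = "salary" == "salaries"

# ...while their trigram sets share "sal", "ala" and "lar".
partial_score = dice_similarity("salary", "salaries")

# An unrelated term shares no trigrams with "salary".
unrelated_score = dice_similarity("salary", "weather")

print(exact_match, partial_score, unrelated_score)
```

In a classification setting, N-gram counts of this kind would serve as the feature vector passed to the SVM, so that a term such as "salaries" can still contribute evidence of confidentiality when only "salary" appeared in the training data.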


Cloud Database, Cloud Query Processing, Support Vector Machine, Advanced Encryption Standard.

