Using Unlabeled Data Set for Mining Knowledge from DDB

Azhar F. Hassan

Abstract


This paper introduces two algorithms that apply the Clustering for Classification technique to the two types of distributed database system, and compares them. The first, Homogeneous Distributed Clustering for Classification (HOMDC4C), aims to learn a classification model from unlabeled datasets distributed homogeneously over the network. A local clustering model is built on the dataset at each of three sites, and a local classification model is then built from the labeled data that the clustering model produces. One computer, designated the control computer, builds a global classification model, which is then used for future prediction. The second, Heterogeneous Distributed Clustering for Classification (HETDC4C), aims to build a classification model over unlabeled datasets distributed heterogeneously over the sites of the network. In this algorithm the datasets are collected on one central computer, where the clustering model and then the classification model are built. The objective of this work is to use unlabeled data to produce a set of labeled data from which a classification model can be built, so that the model can predict the class of any unlabeled instance. The technique is presented in a distributed database environment to reduce the execution time and storage space required.
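The two pipelines described above can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the classifier, or the way the control computer combines the local models, so everything specific here is an assumption made for illustration: k-means for clustering, a nearest-centroid classifier, pooling of local centroids to form the global model, and a reading of "heterogeneous" as different attributes of the same records held at different sites and joined on a shared record id. All function names are hypothetical and are not taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def mean(rows):
    """Component-wise mean of a non-empty list of tuples."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-first seeding (assumed choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return centroids

def cluster_then_classify(points, k):
    """Clustering for classification: cluster ids become class labels,
    then a nearest-centroid classifier is trained on the labeled rows."""
    centroids = kmeans(points, k)
    labeled = [(p, nearest(p, centroids)) for p in points]
    classes = sorted({label for _, label in labeled})
    return [mean([p for p, lab in labeled if lab == c]) for c in classes]

def homogeneous_sketch(sites, k):
    """Each site trains locally; the control site pools the local class
    centroids and clusters them to align labels (an assumed merge rule)."""
    local_models = [cluster_then_classify(rows, k) for rows in sites]
    pooled = [c for model in local_models for c in model]
    return kmeans(pooled, k)

def heterogeneous_sketch(site_tables, k):
    """Collect attribute fragments centrally, join them on record id,
    then run the same cluster-then-classify step on the full rows."""
    shared_ids = sorted(set.intersection(*(set(t) for t in site_tables)))
    rows = [sum((t[r] for t in site_tables), ()) for r in shared_ids]
    return cluster_then_classify(rows, k)

def predict(model, point):
    """Class label (a cluster id) of the nearest class centroid."""
    return nearest(point, model)

# Toy data: three sites sharing one schema (homogeneous case).
site_a = [(0.0, 0.1), (0.2, 0.0), (10.0, 10.1), (9.8, 10.0)]
site_b = [(0.1, 0.3), (10.2, 9.9), (9.9, 10.2)]
site_c = [(0.3, 0.2), (0.0, 0.0), (10.1, 9.8)]
global_model = homogeneous_sketch([site_a, site_b, site_c], k=2)

# Toy data: two sites holding different attributes of the same records.
het_site_1 = {1: (0.0,), 2: (0.2,), 3: (10.0,), 4: (9.8,)}
het_site_2 = {1: (0.1,), 2: (0.0,), 3: (10.1,), 4: (10.0,)}
central_model = heterogeneous_sketch([het_site_1, het_site_2], k=2)
```

In this sketch only the local class centroids leave each site in the homogeneous case, whereas the heterogeneous case moves every attribute fragment to the central computer. The labels returned by `predict` are arbitrary cluster ids, not named classes.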

Keywords


Knowledge Discovery in Databases (KDD); Data Mining (DM); Clustering Data Mining; Classification Data Mining; Distributed Database (DDB); Distributed Data Mining (DDM)

DOI: http://dx.doi.org/10.26555/jiteki.v7i1.20164


Copyright (c) 2021 Azhar F. Hassan

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


 
 


Jurnal Ilmiah Teknik Elektro Komputer dan Informatika
ISSN 2338-3070 (print) | 2338-3062 (online)
Organized by Electrical Engineering Department - Universitas Ahmad Dahlan
Published by Universitas Ahmad Dahlan
Website: http://journal.uad.ac.id/index.php/jiteki
Email 1: jiteki@ee.uad.ac.id
Email 2: alfianmaarif@ee.uad.ac.id
Office Address: Kantor Program Studi Teknik Elektro, Lantai 6 Sayap Barat, Kampus 4 UAD, Jl. Ringroad Selatan, Tamanan, Kec. Banguntapan, Bantul, Daerah Istimewa Yogyakarta 55191, Indonesia