Using Unlabeled Data Set for Mining Knowledge from DDB

Authors

  • Azhar F. Hassan, Al Nahrain University, Baghdad

DOI:

https://doi.org/10.26555/jiteki.v7i1.20164

Keywords:

Knowledge Discovery in Databases (KDD), Data Mining (DM), Clustering Data Mining, Classification Data Mining, Distributed Database (DDB), Distributed Data Mining (DDM)

Abstract

This paper introduces two algorithms that describe and compare the application of the proposed technique in the two types of distributed database system. The first proposed algorithm, Homogeneous Distributed Clustering for Classification (HOMDC4C), aims to learn a classification model from unlabeled datasets distributed homogeneously over the network. It builds a local clustering model on the datasets held at three sites in the network and then builds a local classification model from the labeled data produced by the clustering model. On one computer, designated the control computer, a global classification model is built and then used for future prediction. The second proposed algorithm, Heterogeneous Distributed Clustering for Classification (HETDC4C), aims to build a classification model over unlabeled datasets distributed heterogeneously over the sites of the network. In this algorithm the datasets are collected on one central computer, where the clustering model and then the classification model are built. The objective of this work is to use unlabeled data to produce a set of labeled data from which a classification model can be built, so that any unlabeled instance can be classified by that model. This is done using the Clustering for Classification technique, which is then applied in a distributed database environment to reduce the execution time and storage space required.
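
The core Clustering for Classification idea can be illustrated with a minimal single-machine sketch. It is not the paper's HOMDC4C or HETDC4C implementation: the choice of k-means for clustering, a k-nearest-neighbour vote for classification, the toy data, and all function names (`kmeans`, `nearest`, `knn_predict`) are assumptions made here for illustration only.

```python
import math
from collections import Counter

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))

def kmeans(points, k, iters=20):
    """Plain k-means. Assumption: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = []
    for _ in range(iters):
        # Assignment step: each point takes the label of its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

def knn_predict(train_points, train_labels, x, k=3):
    """Classify `x` by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: math.dist(x, pair[0]))
    votes = Counter(lab for _, lab in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical unlabeled data: two well-separated groups, interleaved.
unlabeled = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8),
             (8.2, 7.9), (0.9, 1.1), (7.8, 8.1)]

# Step 1: clustering turns the unlabeled rows into pseudo-labeled rows.
_, pseudo_labels = kmeans(unlabeled, k=2)
print(pseudo_labels)

# Step 2: a classifier trained on the pseudo-labels predicts new instances.
print(knn_predict(unlabeled, pseudo_labels, (1.1, 0.9)))
print(knn_predict(unlabeled, pseudo_labels, (7.9, 8.0)))
```

In a homogeneous setting as described in the abstract, steps of this kind would run at each site before a global model is built on the control computer; how the local models are combined is not specified here and is left out of the sketch.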

Author Biography

Azhar F. Hassan, Al Nahrain University, Baghdad

Department of Computer Science

Published

2021-04-13

How to Cite

[1]
A. F. Hassan, “Using Unlabeled Data Set for Mining Knowledge from DDB”, J. Ilm. Tek. Elektro Komput. Dan Inform, vol. 7, no. 1, pp. 1–8, Apr. 2021.

Section

Articles
