Using Unlabeled Data Set for Mining Knowledge from DDB
DOI:
https://doi.org/10.26555/jiteki.v7i1.20164
Keywords:
Knowledge Discovery in Databases (KDD), Data Mining (DM), Clustering Data Mining, Classification Data Mining, Distributed Database (DDB), Distributed Data Mining (DDM)
Abstract
This paper introduces two algorithms that describe and compare the application of the proposed technique to the two types of distributed database system. The first proposed algorithm, Homogeneous Distributed Clustering for Classification (HOMDC4C), aims to learn a classification model from unlabeled datasets distributed homogeneously over the network. It builds a local clustering model on the datasets held at each of three sites in the network, and then builds a local classification model from the labeled data that the clustering model produces. On one computer, designated the control computer, a global classification model is built and then used for future prediction. The second proposed algorithm, Heterogeneous Distributed Clustering for Classification (HETDC4C), aims to build a classification model over unlabeled datasets distributed heterogeneously over the sites of the network. In this algorithm the datasets are collected on one central computer, where the clustering model and then the classification model are built. The objective of this work is to use unlabeled data to produce a set of labeled data from which a classification model can be built, so that the model can then predict the class of any unlabeled instance. This is done with the Clustering for Classification technique, which is then applied in a distributed database environment to reduce the execution time and storage space required.
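The two workflows described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the clustering algorithm, the classifier, or the way the control computer combines the local models, so everything concrete here is an assumption: k-means stands in for the clustering step, a nearest-centroid classifier stands in for the classification step, and the global model is formed by clustering the local class centroids. The function names (`kmeans`, `train_centroid_classifier`, `homdc4c`, `hetdc4c`, `predict`) and the toy data are hypothetical and are not taken from the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))


def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point seeding (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its closest existing seed.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, [nearest(p, centroids) for p in points]


def train_centroid_classifier(points, labels):
    """Nearest-centroid classification model: one mean vector per label."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return {lab: tuple(sum(col) / len(m) for col in zip(*m)) for lab, m in groups.items()}


def predict(model, x):
    """Label whose centroid is closest to x."""
    return min(model, key=lambda lab: dist(x, model[lab]))


def homdc4c(sites, k):
    """Homogeneous case: every site holds rows with the same attributes."""
    local_models = []
    for rows in sites:
        _, labels = kmeans(rows, k)  # local clustering labels the rows
        local_models.append(train_centroid_classifier(rows, labels))
    # Control computer (assumed merge rule): cluster the local class centroids,
    # because cluster ids are not aligned across sites.
    prototypes = [c for model in local_models for c in model.values()]
    global_centroids, _ = kmeans(prototypes, k)
    return dict(enumerate(global_centroids))


def hetdc4c(site_columns, k):
    """Heterogeneous case: each site maps record id -> its own attributes."""
    ids = sorted(set.intersection(*(set(s) for s in site_columns)))
    # Central computer joins the attribute fragments of each record.
    rows = [tuple(v for s in site_columns for v in s[i]) for i in ids]
    _, labels = kmeans(rows, k)
    return train_centroid_classifier(rows, labels)


# Toy data: two well-separated groups, spread over three sites.
sites = [
    [(0.0, 0.1), (0.2, 0.0), (10.0, 10.1), (9.8, 10.0)],
    [(0.1, 0.3), (10.2, 9.9), (9.9, 10.2), (0.3, 0.1)],
    [(10.1, 9.8), (0.0, 0.2), (0.2, 0.2), (10.0, 10.0)],
]
global_model = homdc4c(sites, k=2)

# The same four records split by attribute across two sites.
site_a = {1: (0.0,), 2: (0.2,), 3: (10.0,), 4: (9.8,)}
site_b = {1: (0.1,), 2: (0.0,), 3: (10.1,), 4: (10.0,)}
central_model = hetdc4c([site_a, site_b], k=2)
```

In the homogeneous sketch only the local class centroids travel to the control computer, which is one way the storage and transfer savings mentioned in the abstract could arise. The heterogeneous sketch has to gather every attribute fragment centrally before clustering can start. The merge step weights all local centroids equally, a simplification that ignores how many rows each site contributed.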
License
Authors who publish with JITEKI agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
This work is licensed under a Creative Commons Attribution 4.0 International License