Revisiting the challenges and surveys in text similarity matching and detection methods

Alva Hendi Muhammad, Kusrini Kusrini, Irwan Oyong

Abstract


The massive amount of information available on the internet has revolutionized the field of natural language processing. One persistent challenge is estimating the similarity between texts, which remains an open research problem despite the many methods proposed over the years. This paper surveys and traces the primary studies in the field of text similarity, aiming to give a broad overview of existing issues, applications, and methods. It identifies four issues and several applications of text similarity matching, and classifies current studies into intrinsic, extrinsic, and hybrid approaches. It then groups the methods into lexical-similarity, syntactic-similarity, semantic-similarity, structural-similarity, and hybrid categories. Finally, the paper analyzes and discusses method improvements, current limitations, and open challenges as directions for future research.

Keywords


Text similarity; Similarity detection; Document similarity; Text matching; Natural language processing


DOI: http://dx.doi.org/10.26555/jifo.v16i3.a23471


Copyright (c) 2022 Alva Hendi Muhammad, Kusrini, Irwan Oyong

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

____________________________________
JURNAL INFORMATIKA

ISSN : 1978-0524 (print) | 2528-6374 (online)
