Journal Published Online: 28 February 2019
Volume 47, Issue 6

RETRACTED: Link-Based Clustering Algorithm for Clustering Web Documents

CODEN: JTEVAB

Abstract

Following an investigation undertaken by the publisher, we have determined that this paper was accepted on the basis of a compromised peer review process. We hereby retract the paper. The corresponding author has been notified of the retraction. The retraction statement can be found here: https://doi.org/10.1520/JTE20259998.

Clustering web documents requires feeding a large number of words into clustering algorithms such as K-Means, Cosine Similarity, Latent Dirichlet Allocation, and so on. As the number of words in each document increases, the clustering process consumes more time. Many web documents contain web links along with their content, and the text of these links may carry a great deal of information useful for clustering. In our work, we show that using the web link text alone gives better clustering efficiency than considering the whole document text. We implemented our algorithm and ran it on two benchmark datasets, and the results show that it increases clustering efficiency more than the existing methods.
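The idea stated in the abstract, clustering on link text rather than on the whole document, can be illustrated with a minimal sketch. This is not the authors' algorithm, which this record does not describe. The greedy cosine-similarity grouping, the threshold of 0.3, and all function and class names below are assumptions chosen for illustration only.

```python
from collections import Counter
from html.parser import HTMLParser
import math
import re


class AnchorTextParser(HTMLParser):
    """Collects only the text that appears inside <a> ... </a> elements."""

    def __init__(self):
        super().__init__()
        self._depth = 0   # how many <a> elements are currently open
        self.chunks = []  # pieces of link text, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth > 0:
            self.chunks.append(data)


def link_text_vector(html):
    """Bag-of-words vector built from link text only, ignoring the page body."""
    parser = AnchorTextParser()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z]+", " ".join(parser.chunks).lower())
    return Counter(words)


def cosine(u, v):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def group_by_link_text(pages, threshold=0.3):
    """Greedy grouping (an illustrative stand-in, not the paper's method):
    each page joins the first cluster whose seed vector is similar enough,
    otherwise it starts a new cluster."""
    clusters = []  # list of (seed_vector, [page indices])
    for i, html in enumerate(pages):
        vec = link_text_vector(html)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]
```

Because only the anchor text is tokenized, each vector stays small even when the page body is long, which is the time saving the abstract alludes to. Whether this improves clustering quality is the paper's claim, not something the sketch demonstrates.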

Author Information

Ashokkumar, P.
School of Computer Science and Engineering, Vellore, Tamil Nadu, India
Don, S.
TIFAC CORE in Automotive Infotronics, School of Computer Science and Engineering, Vellore, Tamil Nadu, India
Pages: 12
Price: $25.00
Reprints and Permissions
Reprints and copyright permissions can be requested through the Copyright Clearance Center.
Details
Stock #: JTE20180497
ISSN: 0090-3973
DOI: 10.1520/JTE20180497