Machine Learning-based High-Dimensional Text Document Classification and Clustering



By Ansh Kataria¹

¹ Centre for Interdisciplinary Research in Business and Technology, Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India
Source: Demystifying Emerging Trends in Machine Learning, pp. 139-149
Publication Date: February 2025
Language: English

Text classification is a difficult task. Because the feature vectors used in text classification are enormous, many techniques have been developed to reduce their dimensionality. This work discusses in detail the unique parameters of an optic clustering strategy and reviews some of the most important text categorization algorithms. In this approach, words are clustered according to their degree of similarity. Each cluster's membership function is defined by the mean and standard deviation of its data. Finally, one feature is extracted from each cluster as the weighted sum of that cluster's words. The optimal number of clusters does not have to be guessed or found by trial and error.
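
The feature-extraction step described above can be illustrated with a minimal sketch. It assumes the word clusters are already available and uses a Gaussian-style membership function built from each cluster's per-dimension mean and standard deviation. The function names, the `min_std` floor, and the toy data are hypothetical and are not taken from the chapter, which may define these quantities differently.

```python
import math


def cluster_stats(vectors):
    """Per-dimension mean and standard deviation of a cluster's word vectors."""
    n = len(vectors)
    dims = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((v[d] - means[d]) ** 2 for v in vectors) / n)
        for d in range(dims)
    ]
    return means, stds


def membership(vector, means, stds, min_std=0.1):
    """Gaussian-style membership in (0, 1].

    min_std is an assumed floor that avoids division by zero
    when a cluster has no spread along some dimension.
    """
    score = 1.0
    for x, m, s in zip(vector, means, stds):
        s = max(s, min_std)
        score *= math.exp(-((x - m) / s) ** 2)
    return score


def extract_features(doc_counts, word_vectors, clusters):
    """One feature per cluster: membership-weighted sum of the document's word counts."""
    features = []
    for members in clusters:
        means, stds = cluster_stats([word_vectors[w] for w in members])
        features.append(sum(
            count * membership(word_vectors[w], means, stds)
            for w, count in doc_counts.items()
        ))
    return features


# Toy data, invented for illustration: each word vector is the
# word's assumed distribution over two document classes.
word_vectors = {
    "goal": [0.9, 0.1],
    "match": [0.8, 0.2],
    "vote": [0.1, 0.9],
    "party": [0.2, 0.8],
}
clusters = [["goal", "match"], ["vote", "party"]]  # assumed given
doc_counts = {"goal": 3, "match": 1, "vote": 1}

features = extract_features(doc_counts, word_vectors, clusters)
print(features)
```

In this sketch a document over a four-word vocabulary is reduced to two features, one per cluster, so the dimensionality of the representation equals the number of clusters rather than the number of distinct words.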

Hardbound ISBN: 9789815305401

Ebook ISBN: 9789815305395

Book DOI: https://doi.org/10.2174/97898153053951250201
