Identification of Websites Using an Efficient Method Employing Text Mining Methods
- By Madhur Taneja
- Affiliation: Centre for Interdisciplinary Research in Business and Technology, Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India
- Source: Demystifying Emerging Trends in Machine Learning, pp. 127-138
- Publication Date: February 2025
- Language: English
We introduce a method for website classification that uses deep neural networks with mixed feature extractors. The model is trained iteratively under supervision using gradient descent. It comprises a webpage encoder, a convolutional neural network (CNN) feature extractor, a bidirectional long short-term memory (BiLSTM) feature extractor, and a fully connected classifier. The extractors retrieve website features at different granularities, and the classifier can quickly select a suitable website class from the concatenation of their outputs. We run in-depth tests on a realistic website dataset compiled from domains taken from a telecom operator's DNS records. The proposed model outperforms state-of-the-art models and a range of popular machine learning algorithms in accuracy, precision, recall, and F1. The approach may also benefit other web applications, such as detecting fake websites and ads.
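The four evaluation measures named above can be illustrated with a minimal sketch. The abstract does not say how precision, recall, and F1 are averaged across website classes, so macro-averaging is an assumption here, and the function name `classification_report` and the class labels ("news", "shop", "video") are hypothetical, chosen only for illustration.

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging (an assumption, not stated in the abstract) computes
    each measure per class and then takes the unweighted mean.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical labels for six websites (not taken from the chapter's dataset).
y_true = ["news", "shop", "news", "video", "shop", "video"]
y_pred = ["news", "shop", "shop", "video", "shop", "news"]
report = classification_report(y_true, y_pred)
print({name: round(value, 4) for name, value in report.items()})
```

With imbalanced website categories, macro-averaging weights rare classes as heavily as common ones. A support-weighted average would be a reasonable alternative if the chapter reports its results that way.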