Increasing Performance of Boolean Retrieval Model by Data Parallelism Technique

Authors: Mukesh Rawat¹, Preksha Pratap², Manan Gupta³, Hardik Sharma⁴

¹ Department of Computer Science and Engineering, Meerut Institute of Engineering & Technology, Meerut, U.P., India
² Department of Computer Science and Engineering, Meerut Institute of Engineering & Technology, Meerut, U.P., India
³ Department of Computer Science and Engineering, Meerut Institute of Engineering & Technology, Meerut, U.P., India
⁴ Department of Computer Science and Engineering, Meerut Institute of Engineering & Technology, Meerut, U.P., India
Source: Recent Developments in Artificial Intelligence and Communication Technologies, pp. 185-206
Publication Date: September 2022
Language: English

Information retrieval (IR) is the task of identifying documents of an unstructured nature that satisfy an information need from within a large repository maintained on computer systems. Several models have been defined for retrieving information: the Boolean model; the statistical models, which comprise vector space and probabilistic retrieval; and the linguistic and knowledge-based retrieval models. The Boolean model is described as the “perfect match” model: a document is retrieved only if it satisfies the query exactly. If a query is not formulated accurately, it retrieves some irrelevant documents, which lowers the precision (p), that is, the proportion of retrieved documents that are relevant. The Boolean method provides good techniques for broadening or narrowing a query, and it works well for search because of the clear distinction between concepts. The Boolean retrieval model processes queries whose terms form Boolean expressions, that is, the terms of the user query are combined with the AND (&&), OR (||), and NOT (!) operators. The model represents documents by means of an inverted index. The key idea of an inverted index is to maintain a dictionary of terms and, for every term, a list of the documents in which that term occurs. Each entry in the list, recording that the term occurs in a document, is called a posting; the list is known as the postings list (or inverted list), and all the postings lists are collectively called the postings. As the number of documents grows, the postings lists grow as well, and processing them becomes time-consuming. To resolve this problem, a multithreaded model is proposed in which each postings list is broken into chunks that are processed by separate threads, so that the Boolean operations between postings lists required by a Boolean query complete faster. Using this data parallelism technique, the performance of the Boolean retrieval model is increased.
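
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the function names (`build_index`, `intersect`, `parallel_and`), the whitespace tokenization, and the chunking strategy are all assumptions made for illustration. It builds an inverted index with sorted postings lists, then evaluates a Boolean AND by splitting one postings list into chunks and intersecting each chunk in its own thread.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_index(docs):
    """Map each term to a sorted postings list of document IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # naive tokenization (assumption)
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(a, b):
    """Two-pointer merge of two sorted postings lists (Boolean AND)."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def parallel_and(a, b, workers=4):
    """Split `a` into chunks and intersect each chunk with `b` in a thread."""
    if not a or not b:
        return []
    size = -(-len(a) // workers)  # ceiling division: chunk length
    chunks = [a[k:k + size] for k in range(0, len(a), size)]

    def work(chunk):
        # Only the slice of `b` covering this chunk's ID range can match.
        lo = bisect_left(b, chunk[0])
        hi = bisect_right(b, chunk[-1])
        return intersect(chunk, b[lo:hi])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(work, chunks)  # map yields results in chunk order
    return [doc_id for part in parts for doc_id in part]


docs = {
    1: "boolean retrieval model",
    2: "vector space model",
    3: "boolean query processing",
    4: "boolean model threads",
}
index = build_index(docs)
print(parallel_and(index["boolean"], index["model"], workers=2))  # [1, 4]
```

Because the chunks cover disjoint, increasing ranges of document IDs, concatenating the per-chunk results in order yields a sorted result without a final merge. In CPython the global interpreter lock limits the speedup available to pure-Python threads, so the sketch shows the partitioning scheme rather than a measured performance gain.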

Hardbound ISBN: 9781681089683

Ebook ISBN: 9781681089676

Book DOI: https://doi.org/10.2174/97816810896761220101
