BITS WILP Information Retrieval Quiz-1 2017-H2



Question 1
Consider the following documents:
D1: Cat in the hat
D2: The cat chased the rat
D3: The rat died
D4: The cat died
What is the space requirement for an uncompressed Boolean term-document incidence matrix of the above documents?


Select one:
7 bytes
28 bits
28 bytes
 7 bits
Feedback
The correct answer is: 28 bits
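The count behind this answer can be checked with a minimal sketch. It assumes the documents are lowercased and split on whitespace (the quiz does not state a tokenization), and that the matrix stores one bit per term-document cell.

```python
# Documents from Question 1, lowercased (assumed normalization).
docs = {
    "D1": "cat in the hat",
    "D2": "the cat chased the rat",
    "D3": "the rat died",
    "D4": "the cat died",
}

# The vocabulary is the set of distinct terms across all documents.
vocabulary = sorted({term for text in docs.values() for term in text.split()})

# An uncompressed Boolean incidence matrix needs one bit per (term, document) cell.
matrix_bits = len(vocabulary) * len(docs)

print(vocabulary)   # the distinct terms
print(matrix_bits)  # number of terms times number of documents
```

With 7 distinct terms and 4 documents, the matrix has 7 x 4 = 28 cells.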
Question 2
Which of the following terms have the same soundex code?

Select one or more:
 Brightsite
 Briteside
 Brightside
Feedback
The correct answer is:  Brightside,  Brightsite
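The codes can be reproduced with a short sketch of the commonly described four-character Soundex scheme. The function name `soundex` and the treatment of h and w (skipped, so they do not separate letters with equal codes) are assumptions of this sketch; other variants exist, but they agree on these three terms.

```python
# Letter-to-digit table of the usual Soundex scheme; vowels, h, w, y get no digit.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return a 4-character Soundex code for a non-empty alphabetic word."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # assumed variant: h and w are ignored entirely
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return (first.upper() + "".join(digits) + "000")[:4]


for term in ["Brightsite", "Briteside", "Brightside"]:
    print(term, soundex(term))
```

Brightside and Brightsite share a code because d and t map to the same digit, while Briteside lacks the g that contributes the second digit in the other two.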
Question 3
Consider an index for 100,000 documents, each 750 words long. Assume there are 200K distinct terms in total. What is the minimum number of bits required to represent a Doc-ID?



Select one:
8 bits
18 bits
17 bits
20 bits
Feedback
The correct answer is: 17 bits
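The answer depends only on the number of documents, not on document length or vocabulary size. A minimal sketch, assuming Doc-IDs are the integers 0 to N-1 stored in fixed-width binary (the helper name `min_docid_bits` is hypothetical):

```python
def min_docid_bits(num_docs):
    """Smallest fixed width, in bits, that can encode IDs 0 .. num_docs - 1."""
    # int.bit_length avoids floating-point rounding that log2 could introduce.
    return max(1, (num_docs - 1).bit_length())


print(min_docid_bits(100_000))
```

Since 2^16 = 65,536 is less than 100,000 and 2^17 = 131,072 is not, 17 bits are needed.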
Question 4
Which of the following is (are) NOT true of the Google search engine?

Select one:
It offers specialized search services

It does stemming
It does stop-word removal
None of the choices
Feedback
The correct answer is: None of the choices
Question 5
A fragment of an inverted index (augmented with positional information) is given below.
Information: d1: 12; d2: 23, 32, 43; d3: 13; d5: 32, 45, 80
systems: d1: 15; d2: 34, 42; d3: 35; d5: 38
Which of the following phrase(s) has (have) possible occurrences in the above documents?

Select one or more:
 “Information retrieval systems”
 “Information systems”
  “Information theory retrieval systems”
None of the choices
Feedback
The correct answer is:  “Information retrieval systems”,   “Information theory retrieval systems”
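A phrase with n words between "Information" and "systems" is possible only in a document where "systems" occurs exactly n + 1 positions after "Information". The sketch below checks that condition on the fragment; the helper name `docs_with_offset` is hypothetical.

```python
# Positional postings copied from the fragment in Question 5.
information = {"d1": [12], "d2": [23, 32, 43], "d3": [13], "d5": [32, 45, 80]}
systems = {"d1": [15], "d2": [34, 42], "d3": [35], "d5": [38]}


def docs_with_offset(first, second, offset):
    """Documents where `second` occurs exactly `offset` positions after `first`."""
    matches = []
    for doc, positions in first.items():
        later = set(second.get(doc, []))
        if any(p + offset in later for p in positions):
            matches.append(doc)
    return matches


print(docs_with_offset(information, systems, 1))  # "Information systems"
print(docs_with_offset(information, systems, 2))  # "Information retrieval systems"
print(docs_with_offset(information, systems, 3))  # "Information theory retrieval systems"
```

Offset 1 matches nothing, offset 2 matches d2 (positions 32 and 34), and offset 3 matches d1 (positions 12 and 15). The index fragment cannot confirm the intermediate words; it only shows that the positions allow them.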
Question 6
Consider the following two postings lists with the skip pointers shown. How many postings comparisons will be made while intersecting the two lists with skip pointers?

         

Select one:
7
8
6
9
Feedback
The correct answer is: 9
Question 7
 Consider the following fragment of a positional index with the format:
word: document: (position, position, ...); document: (position, ...); ...
Gates: 1: (3); 2: (6); 3: (2,17); 4: (1);
IBM: 4: (3); 7: (14);
Microsoft: 1: (1); 2: (1,21); 3: (3); 5: (16,22,51);
The /k operator, word1 /k word2 finds occurrences of word1 within k words of word2 (either on left or right side), where k is a positive integer argument. Thus k = 1 demands that word1 be adjacent to word2.
What is the set of documents that satisfy the query Gates /2 Microsoft?




Select one:
1,3
3
1
 No document satisfies  the query
Feedback
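The correct answer is: 1,3 (document 1 has Gates at position 3 and Microsoft at position 1; document 3 has Gates at position 2 and Microsoft at position 3)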
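The /k definition above can be applied mechanically to the positional postings. This is a minimal sketch that compares every pair of positions in each shared document, which is fine for a fragment this small; the function name `within_k` is hypothetical.

```python
# Positional postings copied from the fragment in Question 7.
gates = {1: [3], 2: [6], 3: [2, 17], 4: [1]}
microsoft = {1: [1], 2: [1, 21], 3: [3], 5: [16, 22, 51]}


def within_k(postings_a, postings_b, k):
    """Documents where some occurrence of a is within k positions of some occurrence of b."""
    result = []
    for doc in sorted(postings_a.keys() & postings_b.keys()):
        if any(abs(pa - pb) <= k
               for pa in postings_a[doc]
               for pb in postings_b[doc]):
            result.append(doc)
    return result


print(within_k(gates, microsoft, 2))
print(within_k(gates, microsoft, 1))
```

For k = 2 the sketch reports documents 1 and 3: the positions differ by 2 in document 1 and by 1 in document 3. For k = 1 only document 3 remains. Document 2 fails because its closest pair of positions (6 and 1) is 5 apart.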
Question 8
Given the wildcard query uni*e, which of the following keys should be looked up in a permuterm index?
  

Select one:
 e$uni*
e$uin*
 $unie*
Ie$un*
Feedback
The correct answer is: e$uni*
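In a permuterm index every rotation of term$ is stored, and a query of the form X*Y is rotated so that the wildcard comes last, giving the lookup key Y$X*. A minimal sketch for single-wildcard queries follows; the function names and the sample term "unite" are illustrative assumptions.

```python
def permuterm_rotations(term):
    """All rotations of term + '$', as stored in a permuterm index."""
    augmented = term + "$"
    return [augmented[i:] + augmented[:i] for i in range(len(augmented))]


def permuterm_lookup_key(query):
    """Rotate a query with exactly one '*' so that the wildcard is at the end."""
    prefix, suffix = query.split("*")
    return suffix + "$" + prefix + "*"


key = permuterm_lookup_key("uni*e")
print(key)

# A term such as "unite" matches because one of its rotations starts with the key's prefix.
print([r for r in permuterm_rotations("unite") if r.startswith(key[:-1])])
```

For uni*e the prefix is uni and the suffix is e, so the key is e$uni*.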
Question 9
If X denotes the length of string s1 and Y denotes the length of string s2, then the edit distance between s1 and s2 is never more than ____


Select one:
Min(X,Y)
None of the Choices
Max(X,Y)
X+Y
Feedback
The correct answer is: Max(X,Y)
Question 10
What is the soundex code for the term “amazing”?

Select one:
  A552
A252
 A525
A255
Feedback
The correct answer is:  A525
Question 11
Given a collection of 1000 documents, of which 110 are relevant to a given query, if the IR system retrieves 30 relevant and 15 irrelevant documents, what is the recall of the system?


Select one:
0.03
0.27
0.33
0.66
Feedback
The correct answer is: 0.27
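Recall is the fraction of all relevant documents that were retrieved, and precision is the fraction of retrieved documents that are relevant. A minimal sketch with the numbers from this question (the function name `precision_recall` is hypothetical):

```python
def precision_recall(relevant_retrieved, irrelevant_retrieved, total_relevant):
    """Return (precision, recall) from raw retrieval counts."""
    retrieved = relevant_retrieved + irrelevant_retrieved
    precision = relevant_retrieved / retrieved
    recall = relevant_retrieved / total_relevant
    return precision, recall


precision, recall = precision_recall(30, 15, 110)
print(round(recall, 2))     # 30 / 110
print(round(precision, 2))  # 30 / 45
```

The collection size of 1000 does not enter either measure; only the 110 relevant documents and the 45 retrieved documents matter.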
Question 12
When lemmatization is applied to the term “Destruction”, to which of the following forms is it reduced?


Select one:
Destruct
Destroy
Destruc
Feedback
The correct answer is: Destroy
Question 13
Variable-size postings lists are used when



Select one:
 Less seek time is desired and the corpus is dynamic
Less seek time is desired and the corpus is static
More seek time is desired and the corpus is dynamic
More seek time is desired and the corpus is static
Feedback
The correct answer is: More seek time is desired and the corpus is dynamic
Question 14
 Inverted Index Dictionary is sorted by


Select one:
Term frequency
Document Frequency
Term/TermID
DocID
Feedback
The correct answer is: Term/TermID
Question 15
Which of the following is called an extended biword?
Select one:
NXNN
NNX*
NX*N
*NNX
Feedback
The correct answer is: NX*N
Question 16
If two postings lists are of lengths X and Y, then the maximum number of operations needed to merge them is
     


Select one:
max(X,Y)
X+Y
X*Y
min(X,Y)
Feedback
The correct answer is: X+Y
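The bound follows from the merge walk: each step advances at least one of the two pointers, so the loop cannot run more than X + Y times. The sketch below counts one comparison per step; the two sample lists are illustrative and were chosen so that they interleave without ever matching.

```python
def intersect(p1, p2):
    """Intersect two sorted postings lists; also return the number of merge steps."""
    i = j = comparisons = 0
    answer = []
    while i < len(p1) and j < len(p2):
        comparisons += 1
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return answer, comparisons


p1 = [1, 3, 5, 7]
p2 = [2, 4, 6, 8]
answer, comparisons = intersect(p1, p2)
print(answer, comparisons, len(p1) + len(p2))
```

Interleaved lists like these come close to the bound, because every step advances only one pointer.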
Question 17
Given the Boolean query (cat OR bat) AND NOT (dog OR mat), which of the following is the equivalent Disjunctive Normal Form of the query?

Select one:
(cat AND (NOT dog) AND (NOT mat)) OR (cat AND bat AND (NOT dog))
(cat AND (NOT dog) AND (NOT mat)) OR (bat AND (NOT mat) AND (NOT dog))
None of the Choices
(cat AND bat AND (NOT dog)) OR (cat AND bat AND (NOT mat))
Feedback
The correct answer is: (cat AND (NOT dog) AND (NOT mat)) OR (bat AND (NOT mat) AND (NOT dog))
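By De Morgan's law, NOT (dog OR mat) becomes (NOT dog) AND (NOT mat), and distributing that conjunction over (cat OR bat) gives the two disjuncts. The equivalence can be confirmed with a small truth-table check; the function names below are hypothetical.

```python
from itertools import product


def original(cat, bat, dog, mat):
    return (cat or bat) and not (dog or mat)


def dnf(cat, bat, dog, mat):
    return (cat and not dog and not mat) or (bat and not mat and not dog)


# Compare both forms on all 16 assignments of the four terms.
equivalent = all(
    original(*values) == dnf(*values)
    for values in product([False, True], repeat=4)
)
print(equivalent)
```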
Question 18
If string s1 = filosophi and s2 = philosophy, what is the minimum edit distance between s1 and s2?



Select one:
3
5
4
2
Feedback
The correct answer is: 3
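One edit sequence of length 3 is: substitute f with p, insert h after it, and substitute the final i with y. The standard dynamic-programming computation of Levenshtein distance (unit cost for insert, delete, and substitute, which is an assumption about the intended cost model) confirms that no shorter sequence exists.

```python
def edit_distance(s1, s2):
    """Levenshtein distance with unit costs, using two rows of the DP table."""
    prev = list(range(len(s2) + 1))  # distance from the empty prefix of s1
    for i, c1 in enumerate(s1, 1):
        curr = [i]
        for j, c2 in enumerate(s2, 1):
            curr.append(min(
                prev[j] + 1,               # delete c1
                curr[j - 1] + 1,           # insert c2
                prev[j - 1] + (c1 != c2),  # substitute, or copy if equal
            ))
        prev = curr
    return prev[-1]


distance = edit_distance("filosophi", "philosophy")
print(distance, max(len("filosophi"), len("philosophy")))
```

The result also respects the general bound that the edit distance never exceeds the length of the longer string.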
Question 19
Given a document containing the sentence “I left my left bag at my home”, the number of tokens in the sentence is
Select one:
8
6
4
Feedback
The correct answer is: 8
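Tokens are counted with repetition, whereas types are the distinct tokens. A minimal sketch, assuming whitespace tokenization:

```python
sentence = "I left my left bag at my home"

tokens = sentence.split()  # every occurrence counts
types = set(tokens)        # distinct tokens only

print(len(tokens), len(types))
```

The sentence has 8 tokens but only 6 types, since "left" and "my" each occur twice.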
Question 20
Given a document collection which has 35 relevant documents, if an IR system retrieves 10 relevant and 13 irrelevant documents, what is the precision value of the system?


Select one:
 0.43
0.28
0.33
0.66
Feedback
The correct answer is:  0.43
Question 21
Consider the following documents:
 Doc1: new home sales top forecasts
Doc2: home sales rise in july
Doc3: increase in home sales in july
Doc4: july new home sales rise
When the term-document incidence matrix is constructed and the query home AND (new OR july) is executed on it, the documents retrieved will be


Select one:
Doc1
Doc1, Doc3, Doc4
Doc1, Doc4
Doc1, Doc2, Doc3, Doc4
Feedback
The correct answer is: Doc1, Doc2, Doc3, Doc4
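The query can be evaluated with bitwise operations on rows of the incidence matrix. In this sketch each row is an integer whose bits, left to right, correspond to Doc1 through Doc4; the helper name `row` is hypothetical.

```python
docs = {
    "Doc1": "new home sales top forecasts",
    "Doc2": "home sales rise in july",
    "Doc3": "increase in home sales in july",
    "Doc4": "july new home sales rise",
}
names = list(docs)


def row(term):
    """Incidence-matrix row for a term, packed into an int (leftmost bit is Doc1)."""
    bits = 0
    for name in names:
        bits = (bits << 1) | (term in docs[name].split())
    return bits


# home AND (new OR july)
result_bits = row("home") & (row("new") | row("july"))

retrieved = [
    name for i, name in enumerate(names)
    if (result_bits >> (len(names) - 1 - i)) & 1
]
print(format(row("home"), "04b"), format(row("new"), "04b"), format(row("july"), "04b"))
print(retrieved)
```

The rows are home = 1111, new = 1001, and july = 0111, so new OR july is already 1111 and every document satisfies the query.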
Question 22
The Yahoo search engine uses stemming for its index generation.
                    
Select one:
True
False
Feedback
The correct answer is 'False'.
Question 23
When stemming is used, it should be used for both indexing and query processing.
Select one:
True
False
Feedback
The correct answer is 'True'.
Question 24
The Boolean retrieval model maintains term frequency. Is the statement true or false?

Select one:
True
False
Feedback
The correct answer is 'False'.
Question 25
Phrase queries can be solved using N-grams.

Select one:
True
False
Feedback
The correct answer is 'False'.
