Wednesday, July 28, 2021

Naïve Bayes Classifier for Spam Filtering

Concepts of Probability

Independent Events

Flipping a coin twice: the outcome of the first flip does not change the probabilities for the second, so P(heads, heads) = (1/2) * (1/2) = 1/4.

Dependent Events

Drawing two cards one by one from a deck, without replacement.

First draw: 52 cards, so P(Jack of Hearts) = 1/52.
At the time of the second draw, the deck has only 51 cards left.

The deck has changed between draws because we are drawing without replacement, so the probabilities for the second draw depend on the outcome of the first.
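A quick sketch in Python contrasting the two cases, using exact fractions. The two-jacks numbers here are chosen for illustration; the post's example tracks a single named card:

```python
from fractions import Fraction

# Independent: two coin flips -- the first flip does not change the second.
p_heads_twice = Fraction(1, 2) * Fraction(1, 2)
print(p_heads_twice)  # 1/4

# Dependent: drawing two jacks without replacement.
# First draw: 4 jacks out of 52 cards; second draw: 3 jacks out of 51.
p_two_jacks = Fraction(4, 52) * Fraction(3, 51)
print(p_two_jacks)  # 1/221

# With replacement the draws would be independent: (4/52) * (4/52).
p_two_jacks_replaced = Fraction(4, 52) * Fraction(4, 52)
print(p_two_jacks_replaced)  # 1/169
```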

Addition Rule
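For two events A and B: P(A or B) = P(A) + P(B) - P(A and B). The joint term is subtracted so the overlap is not counted twice; for mutually exclusive events it is zero, leaving P(A or B) = P(A) + P(B). Example: on one die roll, P(even or greater than 4) = 3/6 + 2/6 - 1/6 = 4/6.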

Multiplication Rule
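For two events A and B: P(A and B) = P(A) * P(B | A). If A and B are independent, P(B | A) = P(B), and this reduces to P(A and B) = P(A) * P(B). The card example above uses the dependent form: P(jack on both draws) = (4/52) * (3/51).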

Bayes' Theorem
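For a class C and evidence X:

P(C | X) = P(X | C) * P(C) / P(X)

posterior = (likelihood * prior) / evidence

To classify, we compute the posterior for every class and pick the highest. P(X) is the same for all classes, so it acts only as a common denominator.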

What is the probability of getting "class Ck and all the evidences x1 to xn"?

x1 to xn are our evidence events, and in the Naïve Bayes algorithm (or classifier) they are assumed to be independent of each other given the class.

By the chain rule:

P(x1, x2, x3, C) = P(x1 | x2, x3, C) * P(x2, x3, C)
= P(x1 | x2, x3, C) * P(x2 | x3, C) * P(x3, C)
= P(x1 | x2, x3, C) * P(x2 | x3, C) * P(x3 | C) * P(C)

And if x1, x2 and x3 are independent of each other given C:

= P(x1 | C) * P(x2 | C) * P(x3 | C) * P(C)
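A minimal sketch of this factorization in Python. The evidence names, prior, and likelihood values below are hypothetical placeholders, just to show the shape of the computation:

```python
def naive_bayes_numerator(evidence, c, prior, likelihood):
    """P(C) * P(x1|C) * P(x2|C) * ...: the numerator of the posterior."""
    score = prior[c]
    for x in evidence:
        score *= likelihood[(x, c)]
    return score

# Hypothetical numbers (not from the post):
prior = {"C": 0.5}
likelihood = {("x1", "C"): 0.8, ("x2", "C"): 0.7, ("x3", "C"): 0.9}

print(naive_bayes_numerator(["x1", "x2", "x3"], "C", prior, likelihood))
# 0.5 * 0.8 * 0.7 * 0.9 = 0.252
```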

Fruit Problem

A fruit is long, sweet and yellow. Is it a banana? Is it an orange? Or is it some different fruit?

P(Banana | Long, Sweet, Yellow) = P(Long, Sweet, Yellow | Banana) * P(Banana) / P(L, S, Y)

P(L, S, Y | B) = P(L, S, Y, B) / P(B)

Naïve Bayes => all the evidence events (such as L, S, Y) are independent given the class. Now, using the chain rule alongside the independence condition:

P(L, S, Y, B) = P(L | B) * P(S | B) * P(Y | B) * P(B)

P(Orange | Long, Sweet, Yellow) and P(Other | Long, Sweet, Yellow) are computed the same way. Answer: whichever posterior is higher.

Priors from the 100 fruits in the training data:

P(Banana) = 50 / 100
P(Orange) = 30 / 100
P(Other) = 20 / 100
P(Long | Banana) = 40 / 50 = 0.8
P(Sweet | Banana) = 35 / 50 = 0.7
P(Yellow | Banana) = 45 / 50 = 0.9
P(Banana | Long, Sweet, Yellow)
= P(Long | Banana) * P(Sweet | Banana) * P(Yellow | Banana) * P(Banana) / (P(Long) * P(Sweet) * P(Yellow))
= 0.8 * 0.7 * 0.9 * 0.5 / P(evidence)
= 0.252 / denom

P(Orange | Long, Sweet, Yellow) = 0 (no orange in the data is long, so P(Long | Orange) = 0 and the whole product vanishes)

P(Other Fruit | Long, Sweet, Yellow)
= P(Long | Other) * P(Sweet | Other) * P(Yellow | Other) * P(Other) / (P(Long) * P(Sweet) * P(Yellow))
= 0.018 / denom

The denominator is the same for every class, so comparing numerators is enough: 0.252 > 0.018 > 0, and the fruit is classified as a banana.
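The same computation in code. The Banana likelihood row comes from the post; the Orange and Other rows are assumptions chosen to reproduce roughly the numbers quoted above. In particular, P(Long | Orange) = 0 is what forces the orange posterior to 0:

```python
import math

priors = {"Banana": 0.5, "Orange": 0.3, "Other": 0.2}

# P(feature | class). Banana row from the post; Orange and Other rows assumed.
likelihoods = {
    "Banana": {"Long": 0.8, "Sweet": 0.7, "Yellow": 0.9},
    "Orange": {"Long": 0.0, "Sweet": 0.5, "Yellow": 0.75},
    "Other":  {"Long": 0.5, "Sweet": 0.75, "Yellow": 0.25},
}

evidence = ["Long", "Sweet", "Yellow"]

# Numerators only: the shared denominator P(Long)*P(Sweet)*P(Yellow)
# cancels when comparing classes.
scores = {
    fruit: priors[fruit] * math.prod(likelihoods[fruit][e] for e in evidence)
    for fruit in priors
}
print(scores)                       # {'Banana': 0.252, 'Orange': 0.0, 'Other': 0.01875}
print(max(scores, key=scores.get))  # Banana
```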
Spam Problem

Compute P(ham | d6) and P(spam | d6) for the document d6: "good? Bad! very bad!"

P(ham | good, bad, very, bad) = P(good, bad, very, bad, ham) / P(good, bad, very, bad)
P(good, bad, very, bad, ham) = P(good | ham) * P(bad | ham) * P(very | ham) * P(bad | ham) * P(ham)
Computing P(spam | d6) the same way gives the larger value, so d6 is classified as spam!
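As a sketch of how this becomes a working filter, here is a tiny count-based Naïve Bayes in Python. The training documents and labels are made up (the post only shows the test document d6), and Laplace smoothing is added, a standard refinement the hand calculation above does not use:

```python
from collections import Counter

# Hypothetical labelled training corpus; the post does not list its
# training documents, so these are made up for illustration.
training = [
    ("ham",  "good movie very good"),
    ("ham",  "not bad good fun"),
    ("spam", "bad very bad"),
    ("spam", "very bad awful bad"),
]

# Word counts and document counts per class.
word_counts = {"ham": Counter(), "spam": Counter()}
doc_counts = Counter()
for label, text in training:
    word_counts[label].update(text.split())
    doc_counts[label] += 1

vocab = set(word_counts["ham"]) | set(word_counts["spam"])

def posterior_numerator(words, label):
    """P(C) * product of P(w | C). Laplace (add-one) smoothing is used so
    that a word unseen for a class does not zero out the whole product."""
    total = sum(word_counts[label].values())
    score = doc_counts[label] / sum(doc_counts.values())  # prior P(C)
    for w in words:
        score *= (word_counts[label][w] + 1) / (total + len(vocab))
    return score

d6 = "good bad very bad".split()  # "good? Bad! very bad!" lower-cased
for label in ("ham", "spam"):
    print(label, posterior_numerator(d6, label))
# spam's numerator comes out larger, so d6 is classified as spam.
```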

Practice Questions

Ques 1: What is the assumption about the dataset on which we can apply the Naive Bayes classification algorithm?
Ans 1: That the evidence events should be independent of each other given the class.

Ques 2: What is the 'recall' metric in a classification report?
Ans 2: Recall measures how many of the actual instances of a class have been predicted correctly (or, we say, "have been recalled"): recall = true positives / (true positives + false negatives).
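A quick way to check this definition, assuming scikit-learn is available (the labels below are made up):

```python
from sklearn.metrics import classification_report, recall_score

# Hypothetical ground truth and predictions from a spam/ham classifier.
y_true = ["spam", "spam", "spam", "spam", "ham", "ham"]
y_pred = ["spam", "spam", "ham",  "spam", "ham", "spam"]

# Of the 4 actual spam messages, 3 were predicted (recalled) -> 3/4 = 0.75
print(recall_score(y_true, y_pred, pos_label="spam"))
print(classification_report(y_true, y_pred))
```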

Labels: Technology, Artificial Intelligence, Machine Learning