survival8: Logistic Regression Equation

Monday, July 29, 2024

Logistic Regression Equation


What is Logistic Regression?

1. Logistic Regression is a binary classification algorithm. 
However, it also has a multi-class variant called Softmax Regression.

2. It is a supervised learning algorithm.

3. Logistic Regression is a probabilistic model.

4. It learns a linear decision boundary, so it works best when the classes are (approximately) linearly separable. 
However, it can be tuned (for example, with regularization) to accommodate noise in the data.

We will arrive at the Logistic Regression equation by the end of this video.



First, a glimpse of the end result, the logistic (sigmoid) function applied to θᵀx:

h(x) = 1 / (1 + e^(-θᵀx))




y = x 

This is the simple identity function, presented here for the completeness of the slides.




y = np.exp(x)




Note:

np.exp(x) is equal to e^x or math.e**x in Python.

This is a monotonically increasing function.
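The note above can be checked numerically. This is a minimal sketch; the sample range and the variable names are arbitrary choices for illustration, not part of the original slides.

```python
import math
import numpy as np

# Arbitrary sample points, chosen only for illustration
x = np.linspace(-3.0, 3.0, 13)
y = np.exp(x)

# np.exp(x) agrees with math.e ** v for every element v of x
same_as_math = bool(np.allclose(y, [math.e ** v for v in x]))

# Every step to the right increases y, so the function is monotonically increasing
is_increasing = bool(np.all(np.diff(y) > 0))

print(same_as_math, is_increasing)
```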

Next, we change it to 1 / (e^x) or equivalently to e^(-x). 

y = np.exp(-x)







np.exp(-x) ==> e^(-x) ==> 1/(e^x) 
Note two things about it:
1: It is monotonically decreasing.
2: It is always positive: it approaches 0 as x grows but never reaches it.
So next, we shift it up by 1.
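As a quick sanity check of these two observations, here is a small sketch. The sample range and the value 50.0 are arbitrary illustrative choices.

```python
import numpy as np

# Arbitrary sample points, chosen only for illustration
x = np.linspace(-5.0, 5.0, 11)
y = np.exp(-x)

# e^(-x) is the same as 1 / e^x
matches_reciprocal = bool(np.allclose(y, 1.0 / np.exp(x)))

# Every step to the right decreases y
is_decreasing = bool(np.all(np.diff(y) < 0))

# The values get very small for large x but stay above 0
stays_positive = bool(np.all(y > 0))
tail = float(np.exp(-50.0))  # tiny, yet still positive

print(matches_reciprocal, is_decreasing, stays_positive, tail)
```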

y = 1 + np.exp(-x)








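This is just np.exp(-x) shifted up by 1.
Note two things about it:
1: It is also monotonically decreasing.
2: It is always greater than 1: it approaches 1 as x grows but never reaches it.
Since it never drops to 1 or below, its reciprocal always lies strictly between 0 and 1.
So next, we move it to the denominator.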

y = 1/(1 + np.exp(-x))







Properties of y = 1/(1 + np.exp(-x))

y = 1/(1 + np.exp(-x))
Implies: y = 1/(1 + 1/e^x) 

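1. It is a continuous function.
2. It is differentiable everywhere. 
The derivative of the sigmoid (σ) is: σ'(x) = σ(x)(1−σ(x))

3. y approaches 1 as x → +∞ (an upper bound that is never reached)
4. y approaches 0 as x → −∞ (a lower bound that is never reached)
5. It is rotationally symmetric 
around its midpoint at (0, 0.5), that is, σ(−x) = 1 − σ(x).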
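The properties listed above can be verified numerically. The sketch below is illustrative only: the helper name `sigmoid`, the sample range, and the step size `h` are assumptions made for this example, and the derivative is checked against a central finite difference rather than derived.

```python
import numpy as np

def sigmoid(x):
    """Logistic function y = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

# Arbitrary sample points, chosen only for illustration
x = np.linspace(-6.0, 6.0, 25)
y = sigmoid(x)

# Output stays strictly between 0 and 1 on this range
in_open_unit_interval = bool(np.all((y > 0) & (y < 1)))

# The midpoint of the curve is at (0, 0.5)
midpoint = float(sigmoid(0.0))

# Rotational symmetry about (0, 0.5): sigmoid(-x) == 1 - sigmoid(x)
is_symmetric = bool(np.allclose(sigmoid(-x), 1.0 - y))

# Derivative check: central finite difference vs. sigmoid(x) * (1 - sigmoid(x))
h = 1e-5
numeric_grad = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
analytic_grad = y * (1.0 - y)
derivative_matches = bool(np.allclose(numeric_grad, analytic_grad, atol=1e-8))

print(in_open_unit_interval, midpoint, is_symmetric, derivative_matches)
```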

What this means in plain English is that the sigmoid:
1. Has a well-defined output range (between 0 and 1),
2. Is easy to optimize, and 
3. Has a clear probabilistic interpretation.




What’s with the theta-transpose expression?




Note: “Theta-transpose x” is simply the dot product of two vectors of the same length: 
Vector 1 → θ: [θ1, θ2,…, θn]
Vector 2 → x: [x1, x2,…, xn]

Then “Theta-transpose x” means: θ1*x1 + θ2*x2 + … + θn*xn 
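In NumPy this sum of products is a single call. The values of `theta` and `x` below are made-up numbers for illustration only; they do not come from the original slides.

```python
import numpy as np

# Hypothetical parameter and feature vectors (illustrative values)
theta = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 3.0, 0.5])

# Written out term by term: theta1*x1 + theta2*x2 + theta3*x3
manual = sum(t * v for t, v in zip(theta, x))

# The same quantity as a dot product
dot = float(np.dot(theta, x))

# The @ operator gives the same result for two 1-D arrays
via_matmul = float(theta @ x)

print(manual, dot, via_matmul)
```

Here the three terms are 0.5, -3.0 and 1.0, so all three expressions evaluate to -1.5.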

Interpretation




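What this means is:
If value from logistic function > 0.5 (equivalently, θᵀx > 0): Class of input is 1
If value from logistic function < 0.5 (equivalently, θᵀx < 0): Class of input is 0
A value of exactly 0.5 lies on the decision boundary; which class it gets is a matter of convention.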
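The decision rule can be sketched in a few lines. Everything named here is an assumption made for illustration: the helpers `sigmoid` and `predict`, the parameter values in `theta`, the sample inputs in `X` (whose first column is a constant 1 acting as a bias term), and the choice to assign a value of exactly 0.5 to class 1.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, X, threshold=0.5):
    """Return (probabilities, class labels) for each row of X.

    Assumption: a probability exactly equal to the threshold is labeled 1.
    """
    probs = sigmoid(X @ theta)  # one theta-transpose-x per row
    labels = (probs >= threshold).astype(int)
    return probs, labels

# Hypothetical parameters: bias -1, weight 2 (illustrative values)
theta = np.array([-1.0, 2.0])

# Hypothetical inputs; the first column is a constant 1 for the bias
X = np.array([
    [1.0, 0.0],   # theta-transpose-x = -1  -> below 0.5
    [1.0, 0.5],   # theta-transpose-x =  0  -> exactly 0.5
    [1.0, 2.0],   # theta-transpose-x =  3  -> above 0.5
])

probs, labels = predict(theta, X)
print(probs, labels)
```

Note that the predicted class flips exactly where theta-transpose x changes sign, which is why the decision boundary of logistic regression is linear in the inputs.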

Ref: stanford.edu