Saturday, August 10, 2024

Getting the Geometric Intuition Behind Logistic Regression

To See All ML Articles: Index of Machine Learning
One of the first things to know about Logistic Regression is that:

• It is a Linear Model.

That means the output of this model depends on a linear combination of its features. Having said that, as a first step, let's write the linear combination of the features of a dataset with $n$ features:

$$\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_n x_n$$

where $\beta_0$ is the intercept, and $\beta_1, \beta_2, \ldots, \beta_n$ are the coefficients of the features $x_1, x_2, \ldots, x_n$.

Geometric Intuition

This equation of linear combination of features resembles the equation of a plane.

The equation of a plane in three-dimensional space is a linear equation that represents all the points $(x, y, z)$ that lie on the plane. The general form of the equation of a plane is:

$$ax + by + cz = d$$

Formula for Distance from a Point to a Plane

For a point $P(x_0, y_0, z_0)$, the distance $D$ from the point to the plane is given by:

$$D = \frac{|ax_0 + by_0 + cz_0 - d|}{\sqrt{a^2 + b^2 + c^2}}$$
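As a quick sanity check, the distance formula can be sketched in Python (the plane and point below are made-up example values, not from the article):

```python
import numpy as np

def point_to_plane_distance(point, normal, d):
    """Distance from a point to the plane a*x + b*y + c*z = d,
    where `normal` holds the coefficients (a, b, c)."""
    point = np.asarray(point, dtype=float)
    normal = np.asarray(normal, dtype=float)
    # |a*x0 + b*y0 + c*z0 - d| / sqrt(a^2 + b^2 + c^2)
    return abs(normal @ point - d) / np.linalg.norm(normal)

# Example: the plane z = 0 has (a, b, c) = (0, 0, 1), d = 0.
# The point (1, 2, 3) is 3 units above it.
print(point_to_plane_distance([1, 2, 3], [0, 0, 1], 0))  # 3.0
```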

The second thing to remember about Logistic Regression is that:

• It is a Binary Classification Model.

But how does that matter?

Being a linear model, we can say that the decision boundary for Logistic Regression is a line in 2D, a plane in 3D, and a hyperplane in nD. Being a binary classification model, we can say that points lie on either side of the decision boundary. This means that the distance of a point sitting exactly on the decision boundary is zero:

$$D = \frac{|ax_0 + by_0 + cz_0 - d|}{\sqrt{a^2 + b^2 + c^2}} = 0$$

So D = 0 for points on the decision boundary. Equivalently, we can say:

$$ax_0 + by_0 + cz_0 - d = 0$$

***

Or, for our Logistic Regression model, the decision boundary is:

$$\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_n x_n = 0$$

So the way to decide the class of a point is to evaluate the Beta expression $\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_n x_n$:

If this expression > 0, the point lies above the plane (on one side of the plane). If this expression < 0, the point lies below the plane (on the other side of the plane).
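The sign-based decision rule can be sketched as follows (the coefficient values are hypothetical, chosen only to illustrate the rule):

```python
import numpy as np

def decide_side(beta, x):
    """Sign of beta_0 + beta_1*x_1 + ... + beta_n*x_n.
    beta[0] is the intercept; beta[1:] are the feature coefficients.
    Returns +1 (one side of the plane), -1 (the other side),
    or 0 (exactly on the decision boundary)."""
    value = beta[0] + np.dot(beta[1:], x)
    return int(np.sign(value))

beta = np.array([-1.0, 2.0, 0.5])      # hypothetical coefficients
print(decide_side(beta, [1.0, 1.0]))   # -1 + 2 + 0.5 = 1.5 > 0  ->  +1
print(decide_side(beta, [0.0, 0.0]))   # -1 < 0                  ->  -1
```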

Logistic (or Sigmoid) Comes Into Picture

Now, statisticians knew that the range of $\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_n x_n$ is $(-\infty, \infty)$. To squash the Beta expression into the interval $(0, 1)$, so that it can be interpreted as a probability and follow the properties of a probability, we can pass it through the Logistic (or Sigmoid) function:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

For Logistic Regression, we write:

$$\sigma(\beta, x) = \frac{1}{1 + e^{-\left(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n\right)}}$$

Very Important Point: this expression $\sigma(\beta, x)$ is the probability that the data point under consideration lies in class $Y = 1$.
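The squashing step above can be sketched in Python. Note how a point exactly on the decision boundary (Beta expression = 0) gets probability 0.5, and points further from the plane get probabilities closer to 0 or 1 (the coefficients below are hypothetical):

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(beta, x):
    """P(Y = 1 | x) for logistic regression.
    beta[0] is the intercept; beta[1:] are the feature coefficients."""
    return sigmoid(beta[0] + np.dot(beta[1:], x))

beta = np.array([-1.0, 2.0, 0.5])        # hypothetical coefficients
print(sigmoid(0.0))                      # 0.5: a point on the boundary
print(predict_proba(beta, [1.0, 1.0]))   # Beta expression = 1.5 > 0, so P > 0.5
```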

Bonus Video:

Logistic Regression Indepth Intuition - Part 1
Logistic Regression Indepth Intuition - Part 2
Tags: Technology,Machine Learning,Mathematical Foundations for Data Science,
