Logistic Regression

What is Logistic Regression?

Logistic regression is a supervised learning algorithm that applies the logistic (sigmoid) function to a linear combination of the input features in order to predict the probability that the target variable belongs to a given class. That probability can be reported as is, or converted into a class label by comparing it with a threshold. In the second case the output is a category rather than a continuous value, so logistic regression acts as a classification technique despite its name. A typical data science use case is predicting the likelihood of customer churn.
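
The two output modes described above (a class probability, or a class label derived from it) can be sketched in a few lines of Python. This is an illustrative sketch only: the feature names, the coefficients in `weights` and `bias`, and the 0.5 threshold are all hypothetical values chosen for the example, not the result of fitting a real churn model.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Probability of the positive class for one observation."""
    # Linear combination of the inputs, then the logistic transformation.
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict_label(features, weights, bias, threshold=0.5):
    """Class label (1 or 0) obtained by thresholding the probability."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical coefficients for [support tickets opened, months as customer].
weights = [0.8, -0.05]
bias = -1.0

customer = [3, 12]  # 3 support tickets, 12 months as a customer
churn_probability = predict_proba(customer, weights, bias)
churn_label = predict_label(customer, weights, bias)

print(f"Churn probability: {churn_probability:.3f}")
print(f"Predicted label:   {churn_label}")
```

In practice the coefficients would be estimated from labelled data, and the threshold would be tuned to the relative cost of false positives and false negatives.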

 

How does Logistic Regression mathematically differ from Linear Regression?

  • Linear regression predicts a continuous numerical output (such as estimating a temperature or a price), and its output can theoretically range from negative to positive infinity.
  • Logistic regression, by contrast, outputs a probability strictly between 0 and 1. It does so by passing the result of the standard linear equation through the non-linear logistic (sigmoid) function, which makes it suitable for classification rather than continuous forecasting.
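
Written out, the difference between the two models comes down to one extra step. The notation below is an assumption made for illustration (the text above does not fix any symbols): x_1, ..., x_n are the input features, the betas are the coefficients, z is the linear output, and p is the predicted probability.

```latex
% Linear regression: the prediction is the linear combination itself.
\[
  z = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n ,
  \qquad z \in (-\infty, \infty)
\]

% Logistic regression: the same z is passed through the logistic function.
\[
  p = \sigma(z) = \frac{1}{1 + e^{-z}} ,
  \qquad p \in (0, 1)
\]

% Equivalently, the log-odds of the positive class are linear in the features.
\[
  \log \frac{p}{1 - p} = z
\]
```

The last line shows that logistic regression is still linear, but in the log-odds rather than in the output itself.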

 

What are the primary types of Logistic Regression models?

The algorithm is categorized into three types based on the structure of the target variable:

  • Binary Logistic Regression: The target variable has exactly two mutually exclusive outcomes (e.g., predicting whether an email is "Spam" or "Not Spam").
  • Multinomial Logistic Regression: The target variable has three or more discrete outcomes with no inherent order (e.g., predicting whether a user will click on Ad A, Ad B, or Ad C).
  • Ordinal Logistic Regression: The target variable has three or more outcomes with a natural order or ranking (e.g., predicting a customer satisfaction rating of "Poor", "Average", or "Excellent").
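
The three types above differ mainly in how linear scores are turned into class probabilities. The sketch below shows one common formulation for each: the sigmoid for the binary case, the softmax for the multinomial case, and a cumulative-logit (proportional odds) model for the ordinal case. All scores and cutpoints are hypothetical numbers picked for illustration, and other formulations of the ordinal model exist.

```python
import math

def sigmoid(z):
    """Logistic function used for the binary case."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Multinomial case: one linear score per class, normalised to sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def ordinal_probs(score, cutpoints):
    """Ordinal case (cumulative logit): a single linear score plus ordered
    cutpoints, with P(y <= k) = sigmoid(cutpoint_k - score)."""
    cumulative = [sigmoid(c - score) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(y = k) = P(y <= k) - P(y <= k - 1)
        previous = c
    return probs

# Hypothetical linear scores, standing in for the output of fitted models.
binary_p = sigmoid(0.4)                       # P("Spam")
multinomial_p = softmax([1.0, 0.2, -0.5])     # P(Ad A), P(Ad B), P(Ad C)
ordinal_p = ordinal_probs(0.3, [-1.0, 1.0])   # P(Poor), P(Average), P(Excellent)

print("Binary:     ", round(binary_p, 3))
print("Multinomial:", [round(p, 3) for p in multinomial_p])
print("Ordinal:    ", [round(p, 3) for p in ordinal_p])
```

Note that the ordinal model uses a single score shared across categories, with only the cutpoints separating them, which is how it encodes the ranking that the multinomial model ignores. The cutpoints must be in increasing order for the resulting probabilities to be non-negative.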