4

Classification I — Logistic Regression

Summary

From predicting numbers to predicting categories: the sigmoid function as a probability gate, binary cross-entropy loss, gradient descent for classification, decision boundary visualisation, one-vs-all multiclass classification on the Iris dataset, and the XOR problem, which no single linear decision boundary can separate and which therefore exposes the limits of linear classifiers and motivates neural networks.