A Beginner’s Journey Through Logistic Regression and Deep Learning Concepts Inspired by Andrew Ng
Introduction
Machine Learning and Deep Learning are revolutionizing the tech world, and diving into these fields can feel overwhelming. Inspired by Andrew Ng’s highly acclaimed Deep Learning Specialization, I’ve embarked on a journey to break down these concepts and share my learnings step by step. This post covers the foundational building blocks of logistic regression and its implementation, both mathematically and programmatically.
Whether you’re just starting or need a refresher, this guide will help you connect theory with practice.
Step 1: Understanding the Sigmoid Function
At the heart of logistic regression lies the sigmoid function, defined as:
σ(z) = 1 / (1 + e^(−z))
Purpose: The sigmoid maps any input z to a value between 0 and 1. This makes it perfect for binary classification problems where outputs represent probabilities.
Python Implementation:
import numpy as np
def sigmoid(z):
    # Squash any real-valued input into the range (0, 1)
    return 1 / (1 + np.exp(-z))
Example:
Let’s compute the output for z = 4 using σ(z):
z = 4
y_hat = sigmoid(z)
print(f"Sigmoid Output: {y_hat:.2f}")  # prints 0.98
Step 2: Cost Function for Logistic Regression
To evaluate how well our model performs, we use the logistic regression cost function:
J(w, b) = −(1/m) Σᵢ [ yᵢ · log(ŷᵢ) + (1 − yᵢ) · log(1 − ŷᵢ) ]
This measures the difference between the predicted probabilities ŷ and the true labels y: confident but wrong predictions are penalized heavily, while confident, correct predictions contribute almost nothing to the cost.
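Before wiring this into gradient descent, here is a tiny, self-contained sketch of the cost computation. The label and prediction values are made up purely for illustration:
import numpy as np
y = np.array([0, 1, 1])              # true labels (illustrative)
y_hat = np.array([0.1, 0.8, 0.6])    # predicted probabilities (illustrative)
cost = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
print(f"Cost: {cost:.4f}")  # ≈ 0.2798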
Step 3: Computing Gradients
Using the chain rule, we compute the gradients of the cost function with respect to the weights (w) and the bias (b); a small NumPy sketch follows the list:
- Gradient for weights: ∂J/∂w = (1/m) · Xᵀ(ŷ − y)
- Gradient for bias: ∂J/∂b = (1/m) · Σᵢ (ŷᵢ − yᵢ)
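As a minimal sketch of these two expressions in plain NumPy, here is how dw and db could be computed for the toy inputs used later in this post (the y_hat values are my own illustrative guesses):
X = np.array([[1.0], [2.0], [3.0]])       # toy inputs, shape (m, 1)
y = np.array([0, 1, 1])                   # toy labels
y_hat = np.array([0.1, 0.8, 0.6])         # illustrative predictions
m = X.shape[0]
dw = (1 / m) * np.dot(X.T, (y_hat - y))   # ∂J/∂w
db = (1 / m) * np.sum(y_hat - y)          # ∂J/∂b
print(dw, db)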
Step 4: Implementing Gradient Descent
Gradient descent is the optimization algorithm we use to minimize the cost function iteratively:
- Update rule for weights: w := w − α · ∂J/∂w
- Update rule for bias: b := b − α · ∂J/∂b
- Here, α is the learning rate, which controls the size of each update step.
Python Implementation:
def gradient_descent(X, y, w, b, alpha, num_epochs):
    m = X.shape[0]
    for epoch in range(num_epochs):
        # Forward pass: compute predictions
        z = np.dot(X, w) + b
        y_hat = sigmoid(z)
        # Cost, tracked to monitor progress
        cost = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
        # Gradients of the cost with respect to w and b
        dw = (1 / m) * np.dot(X.T, (y_hat - y))
        db = (1 / m) * np.sum(y_hat - y)
        # Parameter updates
        w -= alpha * dw
        b -= alpha * db
        if epoch % 10 == 0:
            print(f"Epoch {epoch}: Cost = {cost:.4f}")
    return w, b
Step 5: Vectorization for Faster Computation
Using vectorized operations speeds up computations significantly by avoiding Python loops and leveraging efficient matrix operations.
- Example of cost computation:
z = np.dot(X, w) + b
y_hat = sigmoid(z)
cost = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
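To see the speed-up concretely, here is a rough sketch comparing a plain Python loop against the vectorized version, reusing the sigmoid function from Step 1. The use of timeit and the array sizes are my own choices for illustration:
import timeit
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 20))
w = rng.normal(size=20)
b = 0.0

def predict_loop():
    # One example at a time, in a Python list comprehension
    return [sigmoid(np.dot(X[i], w) + b) for i in range(X.shape[0])]

def predict_vectorized():
    # One matrix-vector product over the whole dataset
    return sigmoid(np.dot(X, w) + b)

print("loop:      ", timeit.timeit(predict_loop, number=10))
print("vectorized:", timeit.timeit(predict_vectorized, number=10))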
Practical Example: Logistic Regression with Toy Data
Let’s implement everything step by step on toy data.
Dataset:
X = np.array([[1], [2], [3]])
y = np.array([0, 1, 1])
w = np.zeros((X.shape[1],))
b = 0
alpha = 0.1  # learning rate
epochs = 100
w, b = gradient_descent(X, y, w, b, alpha, epochs)
print(f"Trained weights: {w}")
print(f"Trained bias: {b}")
Conclusion
In this post, we’ve explored the foundations of logistic regression, from the sigmoid function to gradient descent and vectorization. Each concept was implemented in Python so you can connect the theory to working code.
Next, we’ll dig deeper into neural networks and explore their connections to logistic regression — taking one step closer to building complex models.
If you’re also on this learning journey, feel free to share your thoughts or questions in the comments. Let’s learn together!
Your Feedback
Did you find this post helpful? Let me know what topics you’d like me to explore next. Don’t forget to like, share, and follow for more.