Programming Assignment: Single Perceptron Neural Networks for Linear Regression

linear-algebra
python
machine-learning
Published: January 7, 2025

Objective

A single perceptron with no activation function is equivalent to linear regression. Given inputs \(\mathbf{x}\), the output is:

\[\hat{y} = \mathbf{w}^T \mathbf{x} + b\]

We train it by minimizing the mean squared error (MSE) loss using gradient descent:

\[\mathcal{L} = \frac{1}{m}\sum_{i=1}^{m}(\hat{y}^{(i)} - y^{(i)})^2\]
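Gradient descent updates the parameters using the partial derivatives of the loss with respect to \(\mathbf{w}\) and \(b\):

\[\frac{\partial \mathcal{L}}{\partial \mathbf{w}} = \frac{2}{m} X^T(\hat{\mathbf{y}} - \mathbf{y}), \qquad \frac{\partial \mathcal{L}}{\partial b} = \frac{2}{m}\sum_{i=1}^{m}(\hat{y}^{(i)} - y^{(i)})\]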

import numpy as np

np.random.seed(42)
m = 100
X = 2 * np.random.rand(m, 1)                            # m samples, 1 feature in [0, 2)
y = 3 * X.squeeze() + 1.5 + np.random.randn(m) * 0.5    # y = 3x + 1.5 + Gaussian noise

Perceptron Implementation

def perceptron_train(X, y, lr=0.1, epochs=1000):
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    losses = []

    for _ in range(epochs):
        # Forward pass and MSE loss
        y_hat = X @ w + b
        error = y_hat - y
        loss = np.mean(error ** 2)
        losses.append(loss)

        # Gradients of the MSE loss w.r.t. w and b
        dw = (2 / m) * X.T @ error
        db = (2 / m) * np.sum(error)

        # Gradient descent update
        w -= lr * dw
        b -= lr * db

    return w, b, losses

w, b, losses = perceptron_train(X, y)
print(f"Learned weight: {w[0]:.4f}  (true: 3.0)")
print(f"Learned bias:   {b:.4f}  (true: 1.5)")
print(f"Final MSE loss: {losses[-1]:.4f}")
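As a sanity check on the gradient formulas, the analytic gradients can be compared against central finite differences at an arbitrary parameter setting. This sketch (not part of the assignment; the test data and point are arbitrary assumptions) uses the same `dw`/`db` expressions as the training loop:

```python
import numpy as np

np.random.seed(0)
m, n = 20, 1
X = np.random.rand(m, n)
y = np.random.rand(m)
w = np.random.randn(n)
b = 0.3
eps = 1e-6

def mse(w, b):
    # Loss at a given parameter setting
    return np.mean((X @ w + b - y) ** 2)

# Analytic gradients (same formulas as in perceptron_train)
error = X @ w + b - y
dw = (2 / m) * X.T @ error
db = (2 / m) * np.sum(error)

# Central finite differences for w[0] and b
e = np.zeros(n); e[0] = eps
dw_num = (mse(w + e, b) - mse(w - e, b)) / (2 * eps)
db_num = (mse(w, b + eps) - mse(w, b - eps)) / (2 * eps)

print(np.isclose(dw[0], dw_num), np.isclose(db, db_num))
```

If the analytic and numerical gradients disagree beyond floating-point tolerance, the backward pass has a bug.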

Normal Equation (Closed-form Solution)

The optimal parameters (bias and weight together) can also be found analytically, provided \(X\) is augmented with a column of ones for the bias:

\[\mathbf{w}^* = (X^T X)^{-1} X^T \mathbf{y}\]

X_b = np.hstack([np.ones((m, 1)), X])  # add bias column
theta = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y
print(f"Normal equation — bias: {theta[0]:.4f}, weight: {theta[1]:.4f}")
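Explicitly inverting \(X^T X\) can be numerically unstable when the matrix is ill-conditioned. As an alternative (a sketch, regenerating the same synthetic data as above), `np.linalg.lstsq` solves the same least-squares problem via SVD without forming the inverse:

```python
import numpy as np

np.random.seed(42)
m = 100
X = 2 * np.random.rand(m, 1)
y = 3 * X.squeeze() + 1.5 + np.random.randn(m) * 0.5
X_b = np.hstack([np.ones((m, 1)), X])  # add bias column

# Solve min ||X_b @ theta - y||^2 directly; no explicit matrix inversion
theta, residuals, rank, sv = np.linalg.lstsq(X_b, y, rcond=None)
print(f"lstsq — bias: {theta[0]:.4f}, weight: {theta[1]:.4f}")
```

The recovered bias and weight should agree with both the normal-equation solution and the gradient-descent result.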