Linear Transformations
Definition
A function \(T: \mathbb{R}^n \to \mathbb{R}^m\) is a linear transformation if, for all \(\mathbf{u}, \mathbf{v} \in \mathbb{R}^n\) and every scalar \(c\):
- \(T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})\) — additivity
- \(T(c\mathbf{u}) = cT(\mathbf{u})\) — homogeneity
Every linear transformation \(T: \mathbb{R}^n \to \mathbb{R}^m\) can be represented as multiplication by an \(m \times n\) matrix \(A\): \(T(\mathbf{x}) = A\mathbf{x}\).
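The two defining properties are easy to check numerically. The sketch below uses an arbitrary matrix \(A\) (chosen here only for illustration) to define \(T(\mathbf{x}) = A\mathbf{x}\) and verifies additivity and homogeneity on sample vectors:

```python
import numpy as np

# An arbitrary 2x3 matrix standing in for some T: R^3 -> R^2.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])

def T(x):
    return A @ x

u = np.array([1.0, 0.5, -2.0])
v = np.array([3.0, -1.0, 4.0])
c = 2.5

# Additivity: T(u + v) == T(u) + T(v)
assert np.allclose(T(u + v), T(u) + T(v))
# Homogeneity: T(c u) == c T(u)
assert np.allclose(T(c * u), c * T(u))
```

Any matrix passes these checks, since matrix multiplication distributes over vector addition and commutes with scalar multiplication.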
Standard Matrix
The standard matrix of \(T\) is built by applying \(T\) to each standard basis vector \(\mathbf{e}_i\):
\[A = \begin{bmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \cdots & T(\mathbf{e}_n) \end{bmatrix}\]
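This construction can be carried out directly in code. The transformation below (swap the coordinates and negate the second) is a made-up example; the standard matrix is assembled column by column from \(T(\mathbf{e}_i)\):

```python
import numpy as np

def T(x):
    # Example transformation on R^2 (chosen for illustration):
    # swap the coordinates, then negate the second.
    return np.array([x[1], -x[0]])

n = 2
# Column i of A is T(e_i), where e_i is the i-th standard basis vector.
A = np.column_stack([T(np.eye(n)[:, i]) for i in range(n)])

# The matrix reproduces the transformation on any vector.
x = np.array([3.0, 4.0])
assert np.allclose(A @ x, T(x))
```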
Common Transformations in \(\mathbb{R}^2\)
Rotation by angle \(\theta\):
\[A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\]
Scaling:
\[A = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}\]
Reflection across the \(x\)-axis:
\[A = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\]
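These three matrices can be built and applied directly. The sketch below uses a 90° rotation, a scaling with \(s_x = 2\), \(s_y = 0.5\), and the \(x\)-axis reflection (all parameter values chosen for illustration):

```python
import numpy as np

theta = np.pi / 2  # 90-degree rotation
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

S = np.diag([2.0, 0.5])          # scaling: s_x = 2, s_y = 0.5
F = np.array([[1.0, 0.0],
              [0.0, -1.0]])      # reflection across the x-axis

x = np.array([1.0, 1.0])
print(R @ x)   # rotate x by 90 degrees counterclockwise
print(S @ x)   # stretch horizontally, shrink vertically
print(F @ x)   # flip the sign of the y-coordinate
```

A 90° rotation sends \(\mathbf{e}_1\) to \(\mathbf{e}_2\) (up to floating-point error), which matches reading \(T(\mathbf{e}_1)\) off the first column of \(R\).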
Composition
Applying \(T_1\) then \(T_2\) is equivalent to multiplying their matrices:
\[T_2(T_1(\mathbf{x})) = A_2 A_1 \mathbf{x}\]
Note the order: \(A_2 A_1\), not \(A_1 A_2\). The matrix of the transformation applied first sits on the right, next to \(\mathbf{x}\), and matrix multiplication is generally not commutative.
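The order sensitivity is easy to demonstrate. Composing a 90° rotation with the \(x\)-axis reflection from above (an example pairing, not anything canonical) gives different results depending on which is applied first:

```python
import numpy as np

theta = np.pi / 2
A1 = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])   # T1: rotate 90 degrees
A2 = np.array([[1.0, 0.0],
               [0.0, -1.0]])                       # T2: reflect across x-axis

x = np.array([1.0, 0.0])

# Applying T1 first, then T2, matches the single matrix A2 @ A1.
assert np.allclose(A2 @ (A1 @ x), (A2 @ A1) @ x)

# Swapping the order gives a different transformation.
assert not np.allclose(A2 @ A1, A1 @ A2)
```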
Connection to Neural Networks
Each layer of a neural network applies a linear transformation \(A\mathbf{x}\) (usually plus a bias term \(\mathbf{b}\), making the map affine) followed by a non-linear activation function. Understanding linear transformations is foundational to understanding how neural networks transform input data through layers.
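As a minimal sketch of this idea, the dense layer below combines a weight matrix \(W\) (the linear part), a bias \(\mathbf{b}\), and a ReLU activation. The shapes and random weights here are assumptions for illustration, not part of any assignment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer mapping R^3 -> R^4: linear part W, bias b, ReLU activation.
W = rng.standard_normal((4, 3))
b = rng.standard_normal(4)

def layer(x):
    # ReLU(Wx + b): a linear transformation, a shift, then a non-linearity.
    return np.maximum(0.0, W @ x + b)

x = np.array([0.5, -1.0, 2.0])
y = layer(x)
assert y.shape == (4,) and np.all(y >= 0)
```

Without the non-linear activation, stacking layers would collapse into a single linear map, since \(A_2 A_1\) is itself just one matrix.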
Next: Programming Assignment — Single Perceptron Neural Networks