Perceptron

  1. Perceptron Principles and Implementation
    1. Single-Output Perceptron
      1. Mathematical Formulation
      2. Gradient Calculation
      3. Python Example
    2. Multi-Output Perceptron
      1. Mathematical Formulation
      2. Gradient Calculation
      3. Python Example
    3. Chain Rule
      1. Chain Rule Formula
      2. Python Example
    4. Backpropagation Algorithm
      1. Formula Derivation
      2. Python Example

Perceptron Principles and Implementation

The perceptron is a foundational concept in machine learning and the cornerstone of neural networks. This article explores the principles and implementation of perceptrons through single-output perceptrons, multi-output perceptrons, the chain rule, and the backpropagation algorithm.


Single-Output Perceptron

The single-output perceptron is the simplest neural network. Its core idea is to compute a weighted sum of the inputs (a linear function) and pass it through an activation function to produce a single output.

Mathematical Formulation
  • Activation function: $O = \sigma(XW + b)$, where $\sigma$ is an activation function (typically the sigmoid, $\sigma(z) = \frac{1}{1 + e^{-z}}$).
  • Loss function (squared error): $E = \frac{1}{2}\sum (O - T)^2$, where $T$ is the target output.

Gradient Calculation

Using gradient descent to optimize the perceptron weights, we need $\frac{\partial E}{\partial W}$. With $z = XW + b$ and $O = \sigma(z)$, the sigmoid derivative is $\sigma'(z) = \sigma(z)(1 - \sigma(z)) = O(1 - O)$, so by the chain rule

$\frac{\partial E}{\partial W} = X^{T}\big[(O - T)\,O\,(1 - O)\big], \qquad \frac{\partial E}{\partial b} = \sum (O - T)\,O\,(1 - O)$

and the updates are $W \leftarrow W - \eta\,\frac{\partial E}{\partial W}$ and $b \leftarrow b - \eta\,\frac{\partial E}{\partial b}$, where $\eta$ is the learning rate.

Python Example
import numpy as np

# Activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Forward pass
def forward(X, W, b):
    return sigmoid(np.dot(X, W) + b)

# Loss function
def loss(O, T):
    return 0.5 * np.sum((O - T) ** 2)

# Gradient update (one step of gradient descent)
def update_weights(X, W, b, O, T, lr=0.1):
    delta = (O - T) * O * (1 - O)  # dE/dz: error times sigmoid derivative
    dW = np.dot(X.T, delta)        # dE/dW
    db = np.sum(delta)             # dE/db
    W -= lr * dW
    b -= lr * db
    return W, b
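
The functions above define the model but not the training loop. The driver below is a minimal sketch: the AND-gate dataset, random seed, and hyperparameters are assumptions inferred from the outputs that follow, so the exact loss values will differ from run to run.

# Minimal training driver (a sketch; AND-gate data and hyperparameters
# are assumptions inferred from the outputs below)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # inputs
T = np.array([[0], [0], [0], [1]])              # AND-gate targets

np.random.seed(0)  # assumed seed; initial predictions depend on it
W = np.random.randn(2, 1)
b = np.random.randn(1)

print((forward(X, W, b) > 0.5).astype(int))     # predictions before training

for epoch in range(10000):
    O = forward(X, W, b)
    if epoch % 2000 == 0:
        print(f"Epoch {epoch}, Loss: {loss(O, T)}")
    W, b = update_weights(X, W, b, O, T)

print((forward(X, W, b) > 0.5).astype(int))     # predictions after training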
Training output before:
[[1]
 [1]
 [1]
 [1]]
Epoch 0, Loss: 0.7340890587930462
Epoch 2000, Loss: 0.031960823202117496
Epoch 4000, Loss: 0.01475373398032874
Epoch 6000, Loss: 0.009359312487717476
Epoch 8000, Loss: 0.006791321484745856

Training output after:
[[0]
 [0]
 [0]
 [1]]

Multi-Output Perceptron

The multi-output perceptron extends the single-output version to allow multiple output nodes, suitable for multi-class classification tasks.

Mathematical Formulation
  • Multi-output formula: $O_j = \sigma\big(\sum_i w_{ij} x_i + b_j\big)$, i.e. each output node has its own weight vector and bias.

  • Total error: $E = \frac{1}{2}\sum_j (O_j - T_j)^2$, summed over all output nodes.

Gradient Calculation

Each output node's gradient takes the same form as in the single-output case:

$\frac{\partial E}{\partial w_{ij}} = (O_j - T_j)\,O_j\,(1 - O_j)\,x_i, \qquad \frac{\partial E}{\partial b_j} = (O_j - T_j)\,O_j\,(1 - O_j)$

Python Example
# (Code unchanged)
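
Although the source marks the code as unchanged, the shapes do generalize; a sketch of the generalized update is given below, where update_weights_multi is a hypothetical name of ours, W is an (n_features, n_outputs) matrix, and b is an (n_outputs,) vector.

# Sketch of the multi-output update (update_weights_multi is a
# hypothetical name, not from the source)
def update_weights_multi(X, W, b, O, T, lr=0.1):
    delta = (O - T) * O * (1 - O)  # per-node error, shape (n_samples, n_outputs)
    dW = np.dot(X.T, delta)        # gradients for all weight columns at once
    db = np.sum(delta, axis=0)     # one bias gradient per output node
    W -= lr * dW
    b -= lr * db
    return W, b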
Epoch 0, Loss: 2.0055848558408393
Epoch 2000, Loss: 0.2695123992451301
Epoch 4000, Loss: 0.26755760116702415
Epoch 6000, Loss: 0.2668457531681846
Epoch 8000, Loss: 0.26653101686497677

Test set accuracy: 100.00%

Chain Rule

The chain rule is the core tool used to compute gradients in complex neural networks. It propagates the error step by step through each layer.

Chain Rule Formula

For a composite function $y = f(u)$ with $u = g(x)$:

$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$

In neural networks, with $z = XW + b$ and $O = \sigma(z)$, the loss gradient factors the same way:

$\frac{\partial E}{\partial W} = \frac{\partial E}{\partial O} \cdot \frac{\partial O}{\partial z} \cdot \frac{\partial z}{\partial W} = (O - T)\,O\,(1 - O)\,X$

Python Example
def chain_rule_grad(X, W, b, T, lr=0.1):
    O = forward(X, W, b)
    # dE/dz = dE/dO * dO/dz = (O - T) * O * (1 - O)
    delta = (O - T) * O * (1 - O)
    dW = np.dot(X.T, delta)     # dz/dW = X, so dE/dW = X^T delta
    db = np.sum(delta, axis=0)  # dz/db = 1, summed over the batch
    W -= lr * dW
    b -= lr * db
    return W, b

Backpropagation Algorithm

Backpropagation applies the chain rule across all layers of a neural network.

Formula Derivation
  • Output layer: $\delta^{(2)} = (O - T)\,O\,(1 - O)$

  • Hidden layer: $\delta^{(1)} = \big(\delta^{(2)} W_2^{T}\big)\,H\,(1 - H)$, where $H$ is the hidden-layer activation; the weight gradients are $\frac{\partial E}{\partial W_2} = H^{T}\delta^{(2)}$ and $\frac{\partial E}{\partial W_1} = X^{T}\delta^{(1)}$.

Python Example
# (Code unchanged)
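
As a sketch of what the two-layer version might look like: the XOR dataset is inferred from the outputs below, while the hidden-layer width, seed, and learning rate are assumptions, so the exact losses will differ.

# Two-layer backpropagation sketch (XOR data inferred from the outputs
# below; hidden size, seed, and learning rate are assumptions)
np.random.seed(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
T = np.array([[0], [1], [1], [0]])              # XOR targets

W1, b1 = np.random.randn(2, 4), np.zeros(4)     # hidden layer, 4 units
W2, b2 = np.random.randn(4, 1), np.zeros(1)     # output layer

for epoch in range(10000):
    # Forward pass
    H = sigmoid(np.dot(X, W1) + b1)
    O = sigmoid(np.dot(H, W2) + b2)
    if epoch % 2000 == 0:
        print(f"Epoch {epoch}, Loss: {loss(O, T)}")
    # Backward pass: output delta, then hidden delta via the chain rule
    delta2 = (O - T) * O * (1 - O)
    delta1 = np.dot(delta2, W2.T) * H * (1 - H)
    # Gradient descent updates
    W2 -= 0.1 * np.dot(H.T, delta2)
    b2 -= 0.1 * np.sum(delta2, axis=0)
    W1 -= 0.1 * np.dot(X.T, delta1)
    b1 -= 0.1 * np.sum(delta1, axis=0)

H = sigmoid(np.dot(X, W1) + b1)
O = sigmoid(np.dot(H, W2) + b2)
print((O > 0.5).astype(int))                    # predictions after training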
Epoch 0, Loss: 0.5887016577000074
Epoch 2000, Loss: 0.4070781963562403
Epoch 4000, Loss: 0.09266207200904633
Epoch 6000, Loss: 0.016896752153564412
Epoch 8000, Loss: 0.008292971056384607

Training output after:
[[0]
 [1]
 [1]
 [0]]