ReLU as a Switch

The ReLU (Rectified Linear Unit) activation function is one of the most commonly used activation functions in neural networks, especially in deep learning.

Definition

The ReLU function is defined as:

f(x) = max(0, x)

This means:

  • If the input x > 0, the output is x.

  • If the input x ≤ 0, the output is 0.
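The definition translates directly into code (a minimal Python sketch):

```python
def relu(x):
    """Rectified Linear Unit: pass positive inputs through, zero out the rest."""
    return x if x > 0 else 0.0

# Positive inputs pass through unchanged; non-positive inputs give zero.
print(relu(2.5), relu(-1.0), relu(0.0))  # 2.5 0.0 0.0
```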

Graphically

It looks like a straight line with a slope of 1 for positive inputs and flat (zero) for negative inputs.

Switching Viewpoint

ReLU can also be understood from an alternative perspective.

Consider that an electrical switch behaves linearly when "on" (e.g., 1 V in gives 1 V out, 2 V in gives 2 V out) and outputs zero when "off." 

From this viewpoint, ReLU acts like a switch that is "on" when x ≥ 0 and "off" otherwise. The switching decision is the predicate (x ≥ 0).

More generally (outside of ReLU), other switching decisions are possible.
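This gate-times-identity view can be sketched directly. In the sketch below, the default predicate (x >= 0) is ReLU's switching decision, and passing a different predicate gives a different switching unit:

```python
def switched_identity(x, decide=lambda v: v >= 0.0):
    """A switch in front of the identity: "on" passes x through, "off" gives 0."""
    switch = 1.0 if decide(x) else 0.0  # the switching decision
    return switch * x

# With the default decision (x >= 0), this is exactly ReLU.
print(switched_identity(2.0))   # 2.0
print(switched_identity(-3.0) == 0.0)
```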




This switching interpretation can help demystify the behavior of ReLU-based neural networks. It highlights that ReLU units are effectively enabling or disabling connections based on the sign of their input. Once the switching states (i.e., which ReLUs are active) are known, the overall computation in the network simplifies: each neuron's output becomes a linear function of the input, and the entire network behaves as a piecewise linear system. 

In each linear region, standard linear algebra can be used to collapse the weighted sums along the active paths of the network, giving a single matrix that maps the network input to the output for that region. The same reasoning applies layer-wise during the feed-forward phase.
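A tiny worked example (hypothetical 2×2 weights, plain Python) shows the collapse: once the hidden layer's switch states are recorded, multiplying the masked weight matrices gives one matrix that reproduces the network's output for that region:

```python
# Tiny two-layer ReLU network with hypothetical 2x2 weights.
W1 = [[1.0, -2.0], [0.5, 1.0]]
W2 = [[2.0, 1.0], [-1.0, 3.0]]
x = [1.0, 2.0]

def matvec(W, v):
    return [sum(w * u for w, u in zip(row, v)) for row in W]

# Feed-forward: record the switching states at the hidden layer.
h = matvec(W1, x)                           # hidden pre-activations
mask = [1.0 if v >= 0 else 0.0 for v in h]  # which switches are "on"

# Network output with ReLU applied at the hidden layer.
y_network = matvec(W2, [m * v for m, v in zip(mask, h)])

# Same output from the frozen linear system: W2 * diag(mask) * W1.
M = [[sum(W2[i][k] * mask[k] * W1[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]
y_linear = matvec(M, x)

print(y_network, y_linear)  # [2.5, 7.5] [2.5, 7.5]
```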

During training with SGD, all the switching states become known during the feed-forward phase, so back-propagation only ever updates a simple linear system.
The non-linear effects of the update are deferred to later training examples.
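A minimal sketch of this idea (one hidden ReLU unit, hypothetical numbers): once the forward pass fixes the switch state s, the gradients are simply those of the linear system y = w2 · s · w1 · x:

```python
# y = w2 * relu(w1 * x), with the switch state fixed by the forward pass.
w1, w2, x = 0.75, -1.5, 2.0

pre = w1 * x                   # hidden pre-activation
s = 1.0 if pre >= 0 else 0.0   # switch state recorded during feed-forward
y = w2 * s * pre

# Backprop with s frozen: ordinary gradients of a linear system.
dy_dw1 = w2 * s * x
dy_dw2 = s * pre

print(y, dy_dw1, dy_dw2)  # -2.25 -3.0 1.5
```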



 


Alignment of Switching

The weights along connected switching pathways in ReLU-based neural networks strengthen in magnitude during training, making those pathways more likely to switch on again.
The result is a self-reinforcing, "fire together, wire together" effect along connected switching pathways through the layers.


