Hadamard + random sign diagonal gives a tiny, very fast, structured reservoir that’s easy to analyze and implement. 


Quick restatement of design

A reservoir where the recurrent map is:

$$r_{t+1} \leftarrow H D\, r_t \;+\; \text{(input addition)},$$

where

  • $H$ is a (normalized) Walsh–Hadamard orthogonal matrix ($H^\top H = I$),

  • $D$ is diagonal with random $\pm 1$ entries,

  • you add the incoming data into the reservoir vector $r$ over time (some injection/drive).

That’s basically an orthogonal, sign-flipping structured reservoir with super-fast transforms (FWHT, $O(N \log N)$).
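As a concrete illustration, here is a minimal NumPy sketch of one update step, folding in the gain $\rho$ and tanh nonlinearity discussed further down. The names (`fwht`, `W_in`, `rho`) and all parameter values are illustrative assumptions, not a fixed design; `N` must be a power of two for the FWHT.

```python
import numpy as np

def fwht(x):
    """Fast Walsh–Hadamard transform of a length-2^k vector, O(N log N)."""
    x = x.copy()
    n = len(x)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x / np.sqrt(n)  # normalization makes H orthogonal

N = 256                                  # reservoir size (power of two)
rng = np.random.default_rng(0)           # reproducible: only the seed is stored
d = rng.choice([-1.0, 1.0], size=N)      # random sign diagonal D
W_in = rng.normal(0.0, 1.0, size=N)      # input weights (scalar input here)
rho = 0.95                               # gain < 1, see the stability notes below

def step(r, u):
    """One reservoir update: r <- tanh(rho * H D r + W_in * u)."""
    return np.tanh(rho * fwht(d * r) + W_in * u)
```

Note that nothing dense is stored: the whole reservoir is the seed, the sign vector, and the input weights.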


What this buys you (benefits)

  1. Extremely cheap mixing.
    FWHT is very fast and memory-friendly, so you can make very large reservoirs cheaply.

  2. Norm / energy preservation (before scaling).
    Because $H$ and $D$ are orthogonal, the linear map $HD$ is orthogonal → it preserves the vector norm. So information isn’t immediately lost by contraction.

  3. Good information propagation / vanishing-gradient avoidance.
    Orthogonal dynamics avoid short-term vanishing of signal energy, which is desirable for long-memory tasks.

  4. Deterministic structure + reproducibility.
    You can re-seed reproducibly; no need to store dense random matrices.

  5. Reservoir as a structured filterbank.
    The action is like a global mixing / permutation + sign pattern; with nonlinearity it creates rich features.

  6. Easy to combine with ELM-style readout.
    Collect reservoir states $r_t$ (or nonlinear transforms of them) and learn a linear readout by ridge regression, as in the sketch below.
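A minimal sketch of such a readout, assuming reservoir states have been stacked row-wise into a matrix `R` of shape `(T, N)` and targets into `Y`; the names and the ridge penalty are illustrative:

```python
import numpy as np

def ridge_readout(R, Y, lam=1e-4):
    """Closed-form ridge regression: W = argmin ||R W - Y||^2 + lam ||W||^2."""
    N = R.shape[1]
    return np.linalg.solve(R.T @ R + lam * np.eye(N), R.T @ Y)

# Usage: W_out = ridge_readout(R, Y); predictions = R @ W_out
```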


Important concerns & how to handle them

1. Stability / echo-state property

  • Because $HD$ is orthogonal, its spectral radius is 1 (its eigenvalues lie on the unit circle).
    That means the pure linear map is marginally stable — signals do not decay. In practice this can cause persistent oscillations and make the reservoir overly sensitive or non-forgetting. The standard remedy is to scale the map by a gain $\rho < 1$ (giving $\rho H D$, as in the additive injection below) or to use a leaky update; either restores contraction and the echo-state property.
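For instance, a sketch of a leaky variant, reusing `fwht`, `d`, `W_in`, and `rho` from the sketch above (the leak rate `alpha` is an illustrative choice):

```python
def leaky_step(r, u, alpha=0.3):
    """Leaky integration: blend the previous state with the new orthogonal update."""
    return (1 - alpha) * r + alpha * np.tanh(rho * fwht(d * r) + W_in * u)
```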

2. Too little nonlinearity / richness

  • Pure linear orthogonal recurrence preserves information but won’t create the nonlinear features needed for many tasks. Add an elementwise nonlinearity (tanh, relu, clipped linear) after the transform or on the state before readout.

  • Alternatively, use a memory+nonlinearity pipeline: apply $HD$, then an elementwise nonlinearity, then maybe a further per-component nonlinearity or subsampling (see the sketch below).
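One way such a pipeline could look, reusing the helpers from the first sketch; the subsampling stride and the squaring are arbitrary illustrative choices:

```python
def pipeline_step(r, u):
    """HD mixing + elementwise nonlinearity, plus subsampled extra features for the readout."""
    r = np.tanh(rho * fwht(d * r) + W_in * u)   # memory + nonlinearity
    feats = np.concatenate([r, r[::4] ** 2])    # subsampled squared copy as extra readout features
    return r, feats
```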

3. Periodicity / algebraic structure

  • Walsh–Hadamard is highly structured. Depending on how input injection is done and how nonlinearities are placed, you might see periodic or quasi-periodic dynamics (not necessarily bad, but worth testing).

  • If you observe poor mixing, an easy fix is to alternate transforms: e.g. use $H D_1$ at step $t$, $H D_2$ at step $t+1$ (two different diagonal sign patterns), or intersperse small permutations, as sketched below.
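A sketch of cycling through several sign diagonals, reusing `rng`, `N`, `fwht`, `W_in`, and `rho` from the first sketch (the number of patterns is arbitrary):

```python
diagonals = [rng.choice([-1.0, 1.0], size=N) for _ in range(3)]  # D_1, D_2, D_3

def alt_step(r, u, t):
    """Alternate the sign diagonal from step to step, cycling through the list."""
    d_t = diagonals[t % len(diagonals)]
    return np.tanh(rho * fwht(d_t * r) + W_in * u)
```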

4. Input injection strategy matters

  • Options:

    • Additive injection: $r \leftarrow \rho H D r + W_{\text{in}} u_t$. Simple and common.

    • Concatenation + projection: add input into a subset of reservoir nodes each step.

    • Periodic refresh: replace a fraction of $r$ with transformed inputs occasionally to avoid destructive interference.
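For the periodic-refresh option, one possible sketch, again reusing the helpers from the first sketch (the refresh period and fraction are illustrative):

```python
def refresh_step(r, u, t, period=50, frac=0.1):
    """Every `period` steps, overwrite a random fraction of the state with projected input."""
    r = np.tanh(rho * fwht(d * r) + W_in * u)
    if t % period == 0:
        idx = rng.choice(N, size=int(frac * N), replace=False)
        r[idx] = W_in[idx] * u                    # replace those nodes outright
    return r
```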


Memory, capacity and dynamics

  • Orthogonal (unitary) recurrence tends to preserve memory well and supports long effective memory when combined with a small leak or a gain near 1. This often yields high linear memory capacity; a sketch for measuring it follows this list.

  • However, raw orthogonal linear reservoirs can be too “linear” — the effective computational power for nonlinear tasks depends on the nonlinearity and input projection.
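One rough way to quantify this is the standard linear memory capacity: for each delay $k$, train a linear readout to reconstruct $u_{t-k}$ and sum the squared correlations. A sketch reusing `step`, `rng`, and `N` from the first sketch; the horizon, washout, and ridge penalty are illustrative:

```python
def memory_capacity(step_fn, T=5000, washout=500, max_delay=100, lam=1e-6):
    """Drive the reservoir with i.i.d. input; sum R^2 over delay-k linear reconstructions."""
    u = rng.uniform(-1.0, 1.0, size=T)
    r = np.zeros(N)
    states = []
    for t in range(T):
        r = step_fn(r, u[t])
        states.append(r)
    R = np.array(states[washout:])                 # (T - washout, N) state matrix
    mc = 0.0
    for k in range(1, max_delay + 1):
        y = u[washout - k : T - k]                 # input delayed by k steps
        w = np.linalg.solve(R.T @ R + lam * np.eye(N), R.T @ y)
        mc += np.corrcoef(R @ w, y)[0, 1] ** 2     # squared correlation at delay k
    return mc

# Usage: mc = memory_capacity(step)
```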
