Inward & Outward Facing Fast Random Projections.
Fast Random Projections Using the Walsh–Hadamard Transform and Sign-Flips
Good afternoon everyone. Today, I want to walk you through a fast and elegant way to construct random projections, using the Walsh–Hadamard transform combined with simple sign-flip operations.
Background – Why Random Projections?
Random projections are a powerful tool in machine learning and signal processing. They help reduce dimensionality, spread information evenly, and preserve distances in high-dimensional space — all while avoiding the cost of storing large dense random matrices.
Traditionally, a random projection is implemented as:

y = Rx

where R is a dense random matrix (say, with i.i.d. Gaussian entries). The drawback? Computing Rx is O(n²) in both time and storage.
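To make the cost concrete, here is a rough NumPy sketch of that dense baseline (the sizes and names are just for illustration):

import numpy as np

n = 1024
rng = np.random.default_rng(0)

# A dense Gaussian random projection: n*n entries must be generated and stored,
# and applying it to a vector costs O(n^2) operations.
R = rng.standard_normal((n, n)) / np.sqrt(n)
x = rng.standard_normal(n)
y = R @ x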
We can do better.
The HD Construction
Here’s the idea:
Let H be the n × n Walsh–Hadamard matrix, scaled by 1/√n so that it is orthonormal, and let D be a diagonal matrix whose diagonal entries are random ±1 values. Then:

y = HDx
This is a fast random projection because:

- It is a random change of basis, rather than the fixed, structured change of basis of the Walsh–Hadamard matrix alone.
- Applying D is just a pattern of sign flips, which takes O(n) time.
- Applying H takes O(n log n) time via the Fast Walsh–Hadamard Transform (FWHT).
- No multiplications are needed; only additions, subtractions, and sign changes (plus the final 1/√n scaling, which can be folded in wherever convenient).
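Here is a minimal NumPy sketch of the whole HD projection, assuming n is a power of two; the helper names (fwht, hd_project) are mine for this example, not part of any library:

import numpy as np

def fwht(x):
    # Orthonormal Fast Walsh-Hadamard Transform, O(n log n); len(x) must be a power of two.
    x = x.astype(float).copy()
    n = len(x)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b          # butterfly: only sums...
            x[i + h:i + 2 * h] = a - b  # ...and differences
        h *= 2
    return x / np.sqrt(n)               # 1/sqrt(n) scaling makes H orthonormal

def hd_project(x, d):
    # Inward-facing fast random projection y = H D x:
    # O(n) sign flips followed by the O(n log n) transform.
    return fwht(d * x)

n = 8
rng = np.random.default_rng(0)
d = rng.choice([-1.0, 1.0], size=n)     # diagonal of D: random +/-1 signs
x = rng.standard_normal(n)
y = hd_project(x, d)
print(y)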
Orthogonality and Inverse
An important property: the rows of HD are still mutually orthogonal, so HD is an orthogonal matrix. And both H and D are self-inverse:

H² = I,  D² = I

So the inverse of HD is simply DH. Here's the derivation:

(HD)(DH) = H(DD)H = HH = I, hence (HD)⁻¹ = DH

Since H and D are also symmetric, (HD)ᵀ = DᵀHᵀ = DH: the inverse and the transpose coincide, exactly as you would expect from an orthogonal matrix.
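A quick numerical sanity check of that identity, building the normalized Walsh–Hadamard matrix explicitly with scipy.linalg.hadamard (only for verification; in practice you would never materialize H):

import numpy as np
from scipy.linalg import hadamard

n = 16
rng = np.random.default_rng(1)

H = hadamard(n) / np.sqrt(n)              # orthonormal Walsh-Hadamard matrix: H @ H = I
D = np.diag(rng.choice([-1.0, 1.0], n))   # random sign-flip matrix: D @ D = I

HD = H @ D
DH = D @ H

print(np.allclose(HD @ DH, np.eye(n)))    # True: DH inverts HD
print(np.allclose(HD.T, DH))              # True: the inverse is also the transpose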
Inward vs Outward Facing Projections
Interestingly, HD and DH behave differently:

- HD is the inward-facing projection: it spreads the data out evenly. A change in one coordinate of x is spread across all coordinates of y.
- DH is the outward-facing projection: it first applies H, which can concentrate energy into fewer coordinates, and then D just flips signs.
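A small demonstration of the two behaviours, again with explicit matrices purely for illustration:

import numpy as np
from scipy.linalg import hadamard

n = 8
rng = np.random.default_rng(2)

H = hadamard(n) / np.sqrt(n)
D = np.diag(rng.choice([-1.0, 1.0], n))

e0 = np.zeros(n)
e0[0] = 1.0                 # a change in a single coordinate of x
print(H @ D @ e0)           # inward-facing HD: every output coordinate is +/- 1/sqrt(n)

ones = np.ones(n)           # an evenly spread input
print(D @ H @ ones)         # outward-facing DH: all the energy lands in one coordinate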
Why Outward Facing is Useful for Backpropagation
Suppose you apply the outward-facing projection DH as the final step in a neural network's forward pass:

y = DHx

During backpropagation, the error vector e must be multiplied by the inverse (which, because DH is orthogonal, is also its transpose):

(DH)⁻¹e = HDe
So the error signal is automatically redistributed using an inward-facing projection — ensuring the gradient flows back in a randomized but well-spread manner. This can improve robustness and reduce overfitting tendencies.
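Here is a framework-agnostic sketch of that forward/backward pair; the function names are hypothetical, and the check at the end confirms that the backward pass really is the inward-facing HD applied to the upstream gradient:

import numpy as np
from scipy.linalg import hadamard

n = 8
rng = np.random.default_rng(3)
H = hadamard(n) / np.sqrt(n)
d = rng.choice([-1.0, 1.0], n)          # diagonal of D

def dh_forward(x):
    # Outward-facing final layer: y = D H x.
    return d * (H @ x)

def dh_backward(grad_y):
    # Gradient w.r.t. x is (DH)^T grad_y = H D grad_y: an inward-facing projection.
    return H @ (d * grad_y)

g = rng.standard_normal(n)              # upstream error signal dL/dy
DH = np.diag(d) @ H
print(np.allclose(dh_backward(g), DH.T @ g))   # True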
Multiple-Stage Projections for Sparse Data
For dense data, a single HD stage works well.

But for sparse data, one stage might not sufficiently mix the information. We can chain multiple stages, for example:

y = H D₃ H D₂ H D₁ x

Each Dᵢ is an independent random ±1 diagonal matrix. This compounding improves mixing for sparse inputs.
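A sketch of a three-stage chain (three is an arbitrary choice here), with explicit matrices so the inverse chain can be checked directly:

import numpy as np
from scipy.linalg import hadamard

n = 16
rng = np.random.default_rng(4)
H = hadamard(n) / np.sqrt(n)

# Three independent random sign-flip stages D1, D2, D3.
D1, D2, D3 = (np.diag(rng.choice([-1.0, 1.0], n)) for _ in range(3))

x = rng.standard_normal(n)
y = H @ D3 @ H @ D2 @ H @ D1 @ x        # y = H D3 H D2 H D1 x

# The inverse simply unwinds the stages in reverse order.
x_back = D1 @ H @ D2 @ H @ D3 @ H @ y
print(np.allclose(x_back, x))           # True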
Self-Inverse Random Projections
We can even design self-inverse projections:

P = DHD

The inverse is P itself, since:

(DHD)(DHD) = DH(DD)HD = D(HH)D = DD = I
This is handy when you need the forward and backward transforms to be identical — for example, in certain symmetric network architectures.
For sparse input data, one option is a palindromic chain such as:

P = D₁HD₂HD₁
which retains self-inverse properties while enhancing mixing.
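Both constructions can be checked numerically in a few lines; the sparse-data variant below is the palindromic chain D1 H D2 H D1 suggested above (my reading of the construction, so treat the exact form as an assumption):

import numpy as np
from scipy.linalg import hadamard

n = 16
rng = np.random.default_rng(5)
H = hadamard(n) / np.sqrt(n)
D1 = np.diag(rng.choice([-1.0, 1.0], n))
D2 = np.diag(rng.choice([-1.0, 1.0], n))

P = D1 @ H @ D1                          # self-inverse projection D H D
print(np.allclose(P @ P, np.eye(n)))     # True

P_sparse = D1 @ H @ D2 @ H @ D1          # palindromic variant with extra mixing
print(np.allclose(P_sparse @ P_sparse, np.eye(n)))   # True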
Summary
- HD gives a fast, orthogonal, low-storage random projection.
- HD is inward-facing: great for forward mixing.
- DH is outward-facing: great for preparing error signals in backprop.
- Multi-stage and self-inverse versions extend these ideas for sparse data.
In short, combining the Walsh–Hadamard transform with sign flips is a simple but extremely versatile trick for machine learning and signal processing.
Closing
This method gives you projections without storing large random matrices, while keeping all the nice orthogonality properties of dense random projections.
It’s a tool worth adding to your mental toolbox for efficient high-dimensional transformations.
Thank you.