Hash & Select Extreme Learning Machine

Unit of Hash & Select Extreme Learning Machine

What’s happening step by step:

  1. Input vector:
    Length n, where n = 2^k (e.g. 8, 16, 32).

  2. Random projection R:

    • A linear random projection spreads out the input information across all dimensions.

    • This ensures that even small input changes can flip some signs after binarization.

  3. Binarization → Hashing:

    • Each coordinate of Rx is binarized (sign or threshold).

    • The resulting bits are then grouped into 8-bit chunks → 256 possible keys per chunk.

    • This is essentially a locality sensitive hash (LSH). Inputs close in R-space are likely to share the same or nearby keys.

  4. Second random projection U:

    • Applying U to Rx (URx) produces a different transformed version of the input.

    • These outputs are grouped into blocks of 8 coordinates.

  5. Keyed weight block selection:

    • Each 8-dimensional block chooses a block of 8 weights from a pool of 256 blocks.

    • The chosen block index is determined by one of the 8-bit hash keys from step 3.

    • This ties the hash identity of the input (discrete) with the continuous projection URx.

  6. Weighted multiplication:

    • The 8 coordinates of the projection block are multiplied by the 8 chosen weights.

    • This creates a final nonlinear, input-dependent weighted feature (a Java sketch of these steps follows this list).
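
To make steps 3, 5 and 6 concrete, here is a minimal Java sketch of the hash-and-select core. The method name, the plain double[] vectors and the pool[group][key][weight] layout are illustrative choices only, and it assumes each group of 8 coordinates uses its own key and its own pool of 256 blocks:

class HashSelectCore {

    // Steps 3, 5 and 6: binarize Rx into 8-bit keys, use each key to select a
    // weight block, and multiply it elementwise into the matching block of URx.
    static double[] hashAndSelect(double[] rx, double[] urx, double[][][] pool) {
        int groups = rx.length / 8;                // one key per group of 8 coordinates
        double[] out = new double[urx.length];
        for (int g = 0; g < groups; g++) {
            int key = 0;
            for (int b = 0; b < 8; b++)            // step 3: pack sign bits into an 8-bit key
                if (rx[g * 8 + b] >= 0.0) key |= 1 << b;
            double[] block = pool[g][key];         // step 5: keyed selection from 256 blocks
            for (int b = 0; b < 8; b++)            // step 6: input-dependent weighting
                out[g * 8 + b] = urx[g * 8 + b] * block[b];
        }
        return out;
    }
}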


Properties / What this buys you:

  • Information spreading:
    Random projection distributes input energy uniformly, so binarization provides a “fair” hash.

  • Locality-sensitive weight selection:
    Keys cause structurally different weight paths for different inputs. This introduces discrete branching capacity into the ELM.

  • Hybrid continuous-discrete feature map:
    The continuous projection URx gets modulated by discrete keys from R. This creates richer feature diversity compared to plain ELMs.

  • Capacity expansion:
    With 256 weight blocks per group, you’ve essentially introduced a combinatorial number of possible weight configurations, vastly increasing representational power with modest parameter counts: for example, with n = 32 there are four groups, so 256^4 ≈ 4.3 × 10^9 distinct weight-block combinations drawn from only 4 × 256 × 8 = 8192 stored weights.

  • Analogy:
    It’s a bit like combining:

    • Random Fourier features (spreading info),

    • Locality sensitive hashing (discrete routing),

    • Mixture-of-experts gating (keyed weight selection).

Ensemble Several Units 


You can use Sum-then-Project or Project-then-Sum (again using random projections) to ensemble several Hash & Select Extreme Learning Machine units, all acting on the same input, into a more powerful system; both orderings are sketched below.
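
As a rough sketch of the two orderings (the method names, the double[][] layouts and the dense matrix-vector product standing in for a fast outward projection are all assumptions here):

class EnsembleSketch {

    // Sum-then-Project: add the unit outputs, then apply one outward random projection.
    static double[] sumThenProject(double[][] unitOutputs, double[][] W) {
        double[] sum = new double[unitOutputs[0].length];
        for (double[] u : unitOutputs)
            for (int i = 0; i < sum.length; i++) sum[i] += u[i];
        return project(W, sum);
    }

    // Project-then-Sum: give each unit its own outward projection, then add the results.
    static double[] projectThenSum(double[][] unitOutputs, double[][][] Ws) {
        double[] out = new double[Ws[0].length];
        for (int k = 0; k < unitOutputs.length; k++) {
            double[] p = project(Ws[k], unitOutputs[k]);
            for (int i = 0; i < out.length; i++) out[i] += p[i];
        }
        return out;
    }

    // Plain dense matrix-vector product; a fast transform could replace it.
    static double[] project(double[][] W, double[] v) {
        double[] out = new double[W.length];
        for (int i = 0; i < W.length; i++)
            for (int j = 0; j < v.length; j++) out[i] += W[i][j] * v[j];
        return out;
    }
}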

Hints:
1/ Fast Inward and Outward facing Random Projections:
Use inward facing random projections for the extreme learning machine units and outward facing random projections to ensemble. 

2/ Sum-then-Project Vs Project-then-Sum:
Sum-then-Project adds the unit outputs and applies a single outward projection to the total; Project-then-Sum gives each unit its own outward projection and adds the projected results. With one shared linear outward projection the two orderings would coincide, so the distinction only matters when the outward projections differ per unit.

3/ You could increase the block size to 16 for example, giving the extreme learning machine vast memory capacity. That would result in 65536 blocks of 16 weights for each 16-coordinate channel. You could tame that down a bit by using (for example) only 12 bits of each 16-bit key, leaving 4096 blocks (see the comment in the example code below).

4/ Example code using Java:
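Here is one way the example could look: a self-contained sketch that expands the unit described above into a complete class with dense Gaussian projections for R and U, plus a tiny Sum-then-Project ensemble in main(). The class name, fixed seeds, per-group weight pools and dense projections are all illustrative assumptions (a fast transform could replace the matrix products), and the comment in forward() notes the 16-coordinate / 12-bit-key variant from hint 3:

import java.util.Random;

public class HashSelectElmDemo {

    static final int BLOCK = 8;           // 8 coordinates per block -> 8-bit keys
    static final int POOL  = 1 << BLOCK;  // 256 weight blocks per group

    final int n;                 // input length, n = 2^k
    final double[][] R, U;       // inward random projections (steps 2 and 4)
    final double[][][] pool;     // pool[group][key][weight]

    HashSelectElmDemo(int n, long seed) {
        this.n = n;
        Random rng = new Random(seed);
        R = gaussian(n, n, rng);
        U = gaussian(n, n, rng);
        pool = new double[n / BLOCK][POOL][BLOCK];
        for (double[][] group : pool)
            for (double[] block : group)
                for (int b = 0; b < BLOCK; b++) block[b] = rng.nextGaussian();
    }

    // Steps 2-6: project, binarize into keys, select weight blocks, multiply.
    double[] forward(double[] x) {
        double[] rx  = matVec(R, x);       // step 2: spread the input
        double[] urx = matVec(U, rx);      // step 4: second projection, URx
        double[] out = new double[n];
        for (int g = 0; g < n / BLOCK; g++) {
            int key = 0;
            for (int b = 0; b < BLOCK; b++)            // step 3: sign bits -> 8-bit key
                if (rx[g * BLOCK + b] >= 0.0) key |= 1 << b;
            // Hint 3 variant: with BLOCK = 16 there are 65536 blocks per group;
            // masking, e.g. key &= 0x0FFF, keeps only 12 bits (4096 blocks).
            double[] w = pool[g][key];                 // step 5: keyed block selection
            for (int b = 0; b < BLOCK; b++)            // step 6: input-dependent weighting
                out[g * BLOCK + b] = urx[g * BLOCK + b] * w[b];
        }
        return out;
    }

    static double[][] gaussian(int rows, int cols, Random rng) {
        double[][] m = new double[rows][cols];
        for (double[] row : m)
            for (int j = 0; j < cols; j++) row[j] = rng.nextGaussian() / Math.sqrt(cols);
        return m;
    }

    static double[] matVec(double[][] m, double[] v) {
        double[] out = new double[m.length];
        for (int i = 0; i < m.length; i++)
            for (int j = 0; j < v.length; j++) out[i] += m[i][j] * v[j];
        return out;
    }

    public static void main(String[] args) {
        int n = 32;                                               // n = 2^k
        double[] x = new Random(7).doubles(n, -1.0, 1.0).toArray();

        // Three units on the same input, ensembled by Sum-then-Project.
        double[] sum = new double[n];
        for (long seed = 1; seed <= 3; seed++) {
            double[] f = new HashSelectElmDemo(n, seed).forward(x);
            for (int i = 0; i < n; i++) sum[i] += f[i];
        }
        double[] feature = matVec(gaussian(n, n, new Random(99)), sum);  // outward projection
        System.out.println("first few feature values: "
                + feature[0] + ", " + feature[1] + ", " + feature[2]);
    }
}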


