The basic behavior of an artificial neural network is determined by the dot product (weighted sum) operator in each neuron.

This also has a geometric interpretation.

The dot product of A and B is the magnitude (vector length) of vector A multiplied by the magnitude of vector B multiplied by the cosine of the angle between them.
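The identity can be checked directly. A quick sketch in Python (the example vectors are arbitrary, chosen only for illustration):

```python
import math

# two example 2-D vectors (arbitrary values)
a = (3.0, 4.0)
b = (4.0, 3.0)

# algebraic form: the weighted sum  3*4 + 4*3 = 24
dot = a[0] * b[0] + a[1] * b[1]

mag_a = math.hypot(*a)             # |a| = 5
mag_b = math.hypot(*b)             # |b| = 5
cos_theta = dot / (mag_a * mag_b)  # cosine of the angle between a and b

# geometric form: |a| * |b| * cos(theta) recovers the same number
print(dot)                         # 24.0
print(mag_a * mag_b * cos_theta)   # 24.0
```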

Dot product video.

To understand information storage in a weighted sum it helps to know the central limit theorem.

Statistical properties of the dot product.

The central limit theorem.
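As a quick illustration of the theorem (the term and sample counts below are arbitrary choices): sums of many independent uniform values are approximately Gaussian, with a variance given by the sum of the per-term variances.

```python
import random
import statistics

random.seed(0)

# Sum many independent uniform(-1, 1) values. By the central limit theorem
# the sums are approximately Gaussian with mean 0 and variance n/3,
# since each uniform(-1, 1) term has variance 1/3.
n_terms = 64
sums = [sum(random.uniform(-1.0, 1.0) for _ in range(n_terms))
        for _ in range(20000)]

mean_of_sums = statistics.mean(sums)
var_of_sums = statistics.pvariance(sums)
print(mean_of_sums)  # close to 0
print(var_of_sums)   # close to 64/3, about 21.3
```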

If you store one <vector,scalar> association in a weighted sum, the weight vector points in the same direction as the input vector. Store two <vector,scalar> associations and both input vectors end up at some angle away from the weight vector. As a result the scalar outputs become more sensitive to small changes in the input vectors: the mild error correction capacity of the single-association case is reduced.

If you store too many associations you can only get approximations to the scalar values you want; they will effectively be contaminated with Gaussian noise.
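A small sketch of that crosstalk effect, assuming the simplest storage scheme (superposing each association into one weight vector; the dimension and item counts are arbitrary). With one stored item recall is exact; as more items are superposed, recall picks up Gaussian-like noise from the other stored vectors:

```python
import random

random.seed(1)
d = 1024  # vector dimension

def rand_vec():
    # random +-1 vector; such vectors are nearly orthogonal in high dimension
    return [random.choice((-1.0, 1.0)) for _ in range(d)]

def store(pairs):
    # superpose each association as y * x / d into a single weight vector
    w = [0.0] * d
    for x, y in pairs:
        for i in range(d):
            w[i] += y * x[i] / d
    return w

def recall(w, x):
    # weighted sum (dot product) of the weight vector with a stored input
    return sum(wi * xi for wi, xi in zip(w, x))

errs = {}
for k in (1, 10, 100):
    pairs = [(rand_vec(), 1.0) for _ in range(k)]
    w = store(pairs)
    # error is crosstalk noise from the other k-1 stored items
    errs[k] = recall(w, pairs[0][0]) - 1.0
    print(k, errs[k])
```

With k = 1 the error is zero; the crosstalk noise then grows roughly as sqrt((k - 1) / d), which matches the Gaussian-noise picture above.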

Switch Net 4.

Random sign flipping before a fast Walsh Hadamard transform results in a random projection. For almost any input the result is a Gaussian distributed output vector where each output element contains knowledge of all the input elements. With Switch Net 4 the output elements are 2-way switched using (x<0?) as a switching predicate, where x is the input to the switch. The switches are grouped together in units of 4, which together form a small width-4 neural network. When a particular x>=0, the selected pattern of weights is forward projected with intensity x. When x<0 a different selected pattern of weights is forward projected with intensity x (x being negative this time.) If nothing was projected when x<0 then the situation would be identical to using a ReLU. You could view the situation in Switch Net 4 as using 2 ReLUs, one with input x and one with input -x. The reason to 2-way switch (or use +- ReLUs) is to avoid early information loss.
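The equivalence between a 2-way switch and a pair of ReLUs can be shown in a few lines. This is a minimal sketch: the weight patterns `a` and `b` and the width-4 grouping are illustrative assumptions, not the actual trained values.

```python
def relu(x):
    return x if x > 0.0 else 0.0

def switch2(x, a, b):
    # 2-way switch: pattern a projected with intensity x when x >= 0,
    # pattern b projected with intensity x when x < 0
    pattern = a if x >= 0.0 else b
    return [x * p for p in pattern]

def switch2_as_relus(x, a, b):
    # the same output written as two ReLUs, one fed x and one fed -x
    return [relu(x) * ai - relu(-x) * bi for ai, bi in zip(a, b)]

a = [0.5, -1.0, 2.0, 0.25]  # hypothetical width-4 weight patterns
b = [1.5, 0.5, -0.5, 1.0]

for x in (2.0, -3.0, 0.0):
    assert switch2(x, a, b) == switch2_as_relus(x, a, b)
    print(x, switch2(x, a, b))
```

Dropping the `relu(-x)` term reduces this to an ordinary ReLU, which discards all information about negative inputs; the 2-way switch keeps it.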

Random Projections for Neural Networks.

Variance and the CLT.

For linear combinations of random variables, a negative sign before a particular variable has no effect on the variance. And of course the Central Limit Theorem is in operation. The Walsh Hadamard transform (WHT) is a set of orthogonal weighted sums (where the weights are +1 or -1, or some constant multiple of those.) And so the variance equation for linear combinations of random variables applies (minus signs not invalidating it), as does the Central Limit Theorem.

8-Point Walsh Hadamard transform.

Therefore applying the WHT to a sequence of random numbers from the uniform distribution results in a sequence of numbers from the Gaussian (Normal) distribution. Example code is here: https://editor.p5js.org/siobhan.491/sketches/WhoxMA7pH

If you want to use that behavior to generate Gaussians you should remember that the WHT leaves vector magnitude (length) unchanged (except by a constant c.) The result is that the Gaussian values produced are not fully independent: their overall vector magnitude is fixed by the magnitude of the input.
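A small sketch of that behavior (the vector length and seed are arbitrary): a fast WHT applied to uniform random numbers, scaled by 1/sqrt(n) so the vector magnitude is preserved exactly, yielding approximately Gaussian outputs.

```python
import random
import statistics

def fwht(v):
    # in-place fast Walsh-Hadamard transform; length must be a power of 2.
    # cost is n*log2(n) add/subtracts, no multiplies
    h, n = 1, len(v)
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                x, y = v[j], v[j + h]
                v[j], v[j + h] = x + y, x - y
        h *= 2

random.seed(2)
n = 256
v = [random.uniform(-1.0, 1.0) for _ in range(n)]
norm_before = sum(x * x for x in v)

fwht(v)
v = [x / n ** 0.5 for x in v]  # scale by 1/sqrt(n) to preserve vector length
norm_after = sum(x * x for x in v)

# each output is a +-1 weighted sum of all inputs, so by the CLT the outputs
# are approximately Gaussian; the squared length is preserved (Parseval)
print(norm_before, norm_after)
print(statistics.pvariance(v))  # near 1/3, the variance of uniform(-1, 1)
```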

Switch Net.

Switch Net is a neural network based on using the fast Walsh Hadamard transform (WHT) as a fixed set of layer weights, with a parametric switching based activation function. The cost of the WHT is n·log2(n) add/subtracts, as opposed to n squared fused multiply-adds for a conventional dense neural layer's weight calculations. A fixed random pattern of sign flips is applied to the input data so that the first WHT fairly distributes information, with a Gaussian distribution, across its outputs. Then a switching decision is applied to each output of the WHT and the output is multiplied by one weight or another. The section between the red lines is basically a layer and can be repeated a number of times. A final WHT is done to combine the last lot of switched weight outputs. Some example code is here: https://editor.p5js.org/congchuatocmaydangyeu7/sketches/7ekZTwQMF

One slight issue is that if the two switching weights for a particular output have opposite signs, the response folds back on itself (an absolute-value-like response rather than a straight line through the origin.)
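The layer structure can be sketched in plain Python. This is a minimal sketch only, assuming random weight initialisation and an arbitrary layer count; the real sketches linked above train the switching weights.

```python
import random

def fwht(v):
    # in-place fast Walsh-Hadamard transform (length a power of 2),
    # used as a fixed, unlearned set of layer weights; scaled to preserve length
    h, n = 1, len(v)
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                x, y = v[j], v[j + h]
                v[j], v[j + h] = x + y, x - y
        h *= 2
    scale = n ** -0.5
    for i in range(n):
        v[i] *= scale

class SwitchNet:
    def __init__(self, n, layers, seed=0):
        rng = random.Random(seed)
        # fixed random sign flips applied once to the input
        self.signs = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # two candidate weights per element per layer (randomly initialised here)
        self.w0 = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(layers)]
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(layers)]

    def forward(self, x):
        v = [s * xi for s, xi in zip(self.signs, x)]      # sign flips
        for w0, w1 in zip(self.w0, self.w1):
            fwht(v)                                       # spread information
            # parametric switching: pick one of two weights using (x < 0?)
            v = [vi * (w1i if vi < 0.0 else w0i)
                 for vi, w0i, w1i in zip(v, w0, w1)]
        fwht(v)                                           # final combining WHT
        return v

net = SwitchNet(n=8, layers=3)
out = net.forward([1.0, -0.5, 0.25, 0.0, 2.0, -1.0, 0.5, -0.25])
print(out)
```

Each repeat of the WHT-plus-switching section corresponds to one layer in the description above, at n·log2(n) add/subtracts plus n multiplies per layer.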

Further information.

The Weighted Sum:

https://archive.org/details/the-weighted-sum

A frozen neural network:

https://archive.org/details/afrozenneuralnetwork

The Walsh Hadamard transform:

https://archive.org/details/whtebook-archive

The Walsh Hadamard transform (short):

https://archive.org/details/short-wht

Activation weight switching (Beyond ReLU):

https://archive.org/details/activation-weight-switching

Zero curvature initialisation of neural networks:

https://archive.org/details/zero-curvatue

SwitchNet4 neural network:

https://discourse.processing.org/t/switch-net-4-neural-network/33220