Normalizing Input in Neural Network
Python Automation and Machine Learning for ICs - An Online Book - http://www.globalsino.com/ICs/
=================================================================================

Normalization typically refers to the process of scaling input features so that they have a mean of 0 and a standard deviation of 1. This can help the neural network converge faster during training and can mitigate problems associated with large or small input values.

Normalizing the input in a neural network is beneficial for several reasons, including avoiding saturation issues when the input values (z) are too large or too small. When input values are too large, they can saturate activation functions such as the sigmoid or tanh functions, causing gradients to become very small during backpropagation. This is known as the vanishing gradient problem and can make training slow and difficult. Normalizing the input keeps the values within a reasonable range, reducing the likelihood of saturation. Similarly, when input values are too small, the network may have difficulty learning meaningful representations, and normalization addresses this by bringing the values to a scale more suitable for learning.

Batch normalization is a specific technique used in neural networks to normalize the input of each layer during training. It has been shown to improve training stability and speed by addressing issues related to input scale.

The normalization can be done with the formula below,

         x = (x − μ)/σ --------------------------------------------------- [3727a]

where,
         μ is the mean of the dataset.
         σ is the standard deviation of the dataset.

That is, each value is centered by subtracting the mean (μ) and scaled by dividing by the standard deviation (σ) calculated from the dataset. Normalization is a common preprocessing step in machine learning, and it is often done to bring all features to a similar scale. This can be important for machine learning algorithms that are sensitive to the scale of input features.

Figure 3717a shows the input features scaled by normalization so that they have a mean of 0 and a standard deviation of 1.
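As a minimal sketch of the normalization in equation [3727a], the NumPy snippet below (the array values are illustrative assumptions) subtracts each feature's mean and divides by its standard deviation, producing columns with mean 0 and standard deviation 1:

```python
import numpy as np

# Two input features on very different scales (illustrative values)
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

mu = X.mean(axis=0)       # per-feature mean
sigma = X.std(axis=0)     # per-feature standard deviation
X_norm = (X - mu) / sigma # equation [3727a]

print(X_norm.mean(axis=0))  # approximately [0, 0]
print(X_norm.std(axis=0))   # [1, 1]
```

Applying the same per-feature `mu` and `sigma` computed on the training set to validation and test data keeps all splits on a consistent scale.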
Figure 3717b shows the normalization effect on mean squared error (MSE).
Figure 3717c shows the benefits of normalization in gradient descent (Code).

Figure 3717d shows the data distribution, KDE (kernel density estimation), and variance before and after normalization.
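To illustrate the gradient-descent benefit discussed around Figure 3717c, the sketch below (a minimal example; the data, learning rates, and step counts are arbitrary assumptions, not the book's code) runs plain batch gradient descent on a linear-regression MSE twice: once with raw features on very different scales, and once with normalized features. The ill-conditioned raw problem forces a tiny learning rate and makes far less progress in the same number of steps:

```python
import numpy as np

def gd_mse(X, y, lr, steps=200):
    """Batch gradient descent on mean squared error; returns the final MSE."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return float(np.mean((X @ w - y) ** 2))

rng = np.random.default_rng(0)
x1 = rng.uniform(0.0, 1.0, 200)      # feature on scale ~1
x2 = rng.uniform(0.0, 1000.0, 200)   # feature on scale ~1000
X = np.column_stack([x1, x2])
y = 2.0 * x1 + 0.003 * x2            # exact linear target, no noise

# Raw features: the large-scale feature dominates the loss curvature,
# so the learning rate must be tiny to avoid divergence.
mse_raw = gd_mse(X, y, lr=1e-7)

# Normalized features: well-conditioned, so a much larger rate works.
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
y_c = y - y.mean()                   # center the target (model has no intercept)
mse_norm = gd_mse(X_norm, y_c, lr=0.1)

print(mse_raw, mse_norm)  # normalized run reaches far lower error
```

This mirrors the usual picture: without normalization the loss surface is a long narrow valley and gradient descent zig-zags, while after normalization the contours are close to circular and a single moderate learning rate suits all directions.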
============================================