Momentum Algorithm - Python Automation and Machine Learning for ICs - An Online Book
Python Automation and Machine Learning for ICs: http://www.globalsino.com/ICs/
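=================================================================================
In standard gradient descent, the update of the model parameters at each iteration is based solely on the current gradient of the loss function with respect to those parameters. The update rule is given by,

$$\theta_{t+1} = \theta_t - \alpha \nabla_\theta J(\theta_t)$$

where $\theta_t$ denotes the parameters at iteration $t$, $\alpha$ is the learning rate, and $J$ is the loss function.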
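As a minimal sketch of this update rule, the snippet below runs plain gradient descent on a one-dimensional quadratic loss. The loss $J(\theta) = (\theta - 3)^2$, the function names, the learning rate, and the step count are all assumptions chosen for illustration and do not come from the original text.

```python
def gradient_descent(grad, theta0, lr=0.1, steps=100):
    """Plain gradient descent: each step uses only the current gradient."""
    theta = theta0
    for _ in range(steps):
        theta = theta - lr * grad(theta)
    return theta


def grad_quadratic(theta):
    # Gradient of the hypothetical loss J(theta) = (theta - 3)^2.
    return 2.0 * (theta - 3.0)


theta_final = gradient_descent(grad_quadratic, theta0=0.0)
print(theta_final)  # ends up very close to the minimizer at theta = 3
```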
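Momentum is a technique used with gradient descent to accelerate convergence, especially in the presence of noisy or sparse gradients. The idea is to introduce a velocity term, a running average of past gradients, that helps the optimization process navigate the loss landscape more efficiently. One common form of the update is:

$$v_t = \beta v_{t-1} + (1 - \beta) \nabla_\theta J(\theta_t)$$

$$\theta_{t+1} = \theta_t - \alpha v_t$$

where $v_t$ is the velocity (with $v_0 = 0$), $\beta$ is the momentum coefficient, and $\alpha$ is the learning rate.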
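The velocity idea can be sketched in a few lines of Python. This example assumes the exponentially weighted form of momentum, `v = beta * v + (1 - beta) * g` followed by `theta = theta - lr * v`; other formulations (for example, without the `1 - beta` factor) are also in common use. The quadratic loss, function names, and hyperparameters are illustrative assumptions only.

```python
def momentum_descent(grad, theta0, lr=0.1, beta=0.9, steps=500):
    """Gradient descent with a velocity term (assumed EMA form of momentum)."""
    theta = theta0
    v = 0.0  # velocity starts at zero
    for _ in range(steps):
        g = grad(theta)
        v = beta * v + (1.0 - beta) * g  # blend the past velocity with the new gradient
        theta = theta - lr * v           # step along the velocity, not the raw gradient
    return theta


def grad_quadratic(theta):
    # Gradient of the hypothetical loss J(theta) = (theta - 3)^2.
    return 2.0 * (theta - 3.0)


theta_momentum = momentum_descent(grad_quadratic, theta0=0.0)
print(theta_momentum)  # ends up very close to the minimizer at theta = 3
```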
The momentum term helps the optimization process keep moving in a consistent direction even when individual gradients change direction frequently, allowing for faster convergence. It is particularly useful for damping oscillations and for smoothing small, noisy gradients. Common choices for the momentum coefficient ($\beta$) are 0.9 or 0.99. The momentum coefficient and the learning rate usually need to be tuned together for good performance on a specific task. The nature of the momentum update rule is that the velocity is an exponentially weighted combination of past gradients: gradient components that keep the same sign accumulate, while components that alternate in sign largely cancel.
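One way to see the nature of this update rule is to feed the velocity recursion two artificial gradient sequences: one that always points the same way and one that flips sign at every step. The sketch below assumes the exponentially weighted form `v = beta * v + (1 - beta) * g` with `beta = 0.9`; the sequences and the helper name `final_velocity` are hypothetical.

```python
def final_velocity(gradients, beta=0.9):
    """Run the velocity recursion over a gradient sequence and return the last value."""
    v = 0.0
    for g in gradients:
        v = beta * v + (1.0 - beta) * g
    return v


# Gradients that consistently point in the same direction.
steady = [1.0] * 100
# Gradients that flip sign at every step (pure oscillation).
alternating = [1.0 if i % 2 == 0 else -1.0 for i in range(100)]

v_steady = final_velocity(steady)
v_alternating = final_velocity(alternating)
print(v_steady, v_alternating)
```

Under these assumptions the velocity for the steady sequence approaches the gradient value 1, while for the alternating sequence it stays small (roughly 0.05 in magnitude), since successive contributions nearly cancel.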
This behavior helps the optimization process move quickly through flat or slowly changing regions (the horizontal direction) while damping its steps in steep or rapidly changing regions (the vertical direction). Because the velocity builds up along directions where the gradients consistently point, the algorithm traverses flat regions more efficiently and converges faster along the less steep directions.
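The effect can be illustrated on a hypothetical two-dimensional loss that is flat along x (the horizontal direction) and steep along y (the vertical direction). In the sketch below, the loss $0.5\,(x^2 + 50\,y^2)$, the starting point, and the hyperparameters are assumptions chosen so that the learning rate is too large for plain gradient descent in the steep direction. The momentum variant uses the exponentially weighted form of the velocity update.

```python
def loss(x, y):
    # Hypothetical elongated bowl: flat along x, steep along y.
    return 0.5 * (x * x + 50.0 * y * y)


def grad(x, y):
    return x, 50.0 * y


def run_gd(lr, steps, x=10.0, y=1.0):
    """Plain gradient descent; returns the final loss."""
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    return loss(x, y)


def run_momentum(lr, beta, steps, x=10.0, y=1.0):
    """Gradient descent with an EMA velocity per coordinate; returns the final loss."""
    vx = vy = 0.0
    for _ in range(steps):
        gx, gy = grad(x, y)
        vx = beta * vx + (1.0 - beta) * gx
        vy = beta * vy + (1.0 - beta) * gy
        x, y = x - lr * vx, y - lr * vy
    return loss(x, y)


loss_start = loss(10.0, 1.0)
loss_gd = run_gd(lr=0.05, steps=300)
loss_mom = run_momentum(lr=0.05, beta=0.9, steps=300)
print(loss_start, loss_gd, loss_mom)
```

With this step size, plain gradient descent overshoots in the steep y direction on every step and its loss grows, whereas the averaged velocity damps the sign-flipping y gradients and the momentum run settles near the minimum. This is one illustrative setting, not a general guarantee: with a sufficiently small learning rate, plain gradient descent also converges on this loss.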
 