RMSProp (C2W2L07) Summary and Q&A
TL;DR
RMSprop is an algorithm that helps speed up gradient descent by damping oscillations and allowing for larger learning rates.
Key Insights
 🚦 RMSprop is an optimization algorithm that reduces oscillations in the vertical direction during gradient descent.
 🏋️ It computes exponentially weighted averages of the squares of derivatives to adjust updates in different directions.
 💨 The algorithm enables faster learning in the horizontal direction without diverging in the vertical direction.
 ❓ RMSprop can be combined with momentum to create an even better optimization algorithm.
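For reference, the update that the insights above describe is usually written as follows. This is a sketch in the course's notation, with W as the horizontal parameter and b as the vertical one; the squares are element-wise, and the small constant ε (and its exact placement in the denominator) is an assumption added here for numerical stability.

```latex
\begin{aligned}
S_{dW} &= \beta\, S_{dW} + (1-\beta)\, dW^{2}, &\quad S_{db} &= \beta\, S_{db} + (1-\beta)\, db^{2},\\
W &:= W - \alpha\, \frac{dW}{\sqrt{S_{dW}} + \varepsilon}, &\quad b &:= b - \alpha\, \frac{db}{\sqrt{S_{db}} + \varepsilon}.
\end{aligned}
```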
Transcript
you've seen how using momentum can speed up gradient descent there's another algorithm called rmsprop which stands for root mean square prop that can also speed up gradient descent let's see how it works recall our example from before that if you implement gradient descent you can end up with huge oscillations in the vertical direction even while i...
Questions & Answers
Q: How does RMSprop help reduce oscillations in the vertical direction during gradient descent?
RMSprop keeps an exponentially weighted average of the squared derivatives for each parameter and divides each update by the square root of that average. Because the derivatives in the vertical direction are large, those updates are divided by a relatively large number, which damps the oscillations.
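A minimal sketch of one such update step, assuming NumPy. The function name `rmsprop_step`, the default hyperparameters, and the example gradients are illustrative choices, not values taken from the lecture.

```python
import numpy as np

def rmsprop_step(param, grad, s, lr=0.01, beta=0.9, eps=1e-8):
    """One RMSprop update; all operations are element-wise."""
    # Exponentially weighted average of the squared derivatives.
    s = beta * s + (1.0 - beta) * grad ** 2
    # Divide the update by the root of that average (eps avoids division by zero).
    param = param - lr * grad / (np.sqrt(s) + eps)
    return param, s

# Hypothetical gradients: small in the horizontal direction, large in the vertical one.
param = np.array([0.0, 0.0])
grad = np.array([0.1, 10.0])
new_param, s = rmsprop_step(param, grad, s=np.zeros(2))
print(new_param)
```

On this first step both coordinates move by nearly the same amount, even though the raw gradients differ by a factor of 100, because each gradient is divided by a number proportional to its own magnitude.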
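Q: What is the intuition behind the use of S_dW and S_db in RMSprop?
In the lecture's example, W is the horizontal direction and b is the vertical direction. Derivatives are small in the horizontal direction, so S_dW stays relatively small, while the large derivatives in the vertical direction make S_db relatively large. Dividing db by the square root of the large S_db damps the vertical oscillations, while dividing dW by the square root of the small S_dW keeps progress in the horizontal direction going.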
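To make the relative sizes of the two accumulators concrete, here is a small numeric sketch. It assumes, as in the lecture's picture, that W is the horizontal direction and b the vertical one; the gradient values are invented for illustration.

```python
import numpy as np

beta = 0.9
s_dW, s_db = 0.0, 0.0

for t in range(50):
    dW = 0.1                            # small, steady horizontal derivative (assumed)
    db = 10.0 if t % 2 == 0 else -10.0  # large vertical derivative that flips sign (assumed)
    # Squaring discards the sign, so the oscillating db still accumulates.
    s_dW = beta * s_dW + (1 - beta) * dW ** 2
    s_db = beta * s_db + (1 - beta) * db ** 2

# Derivatives after dividing by the root of each accumulator.
scaled_dW = dW / np.sqrt(s_dW)
scaled_db = db / np.sqrt(s_db)
print(s_dW, s_db, scaled_dW, scaled_db)
```

The vertical accumulator ends up about 10,000 times larger than the horizontal one, so after the division both derivatives have roughly the same magnitude.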
Q: How does RMSprop allow for larger learning rates?
By dampening oscillations and ensuring stable updates in the vertical direction, RMSprop allows for larger learning rates. This faster learning in the horizontal direction can speed up the convergence of the algorithm.
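The effect of a larger learning rate can be sketched on a toy problem. The elongated quadratic loss below, the starting point, and the learning rate 0.03 are all assumptions chosen so that plain gradient descent diverges in the steep direction; this is not the lecture's example.

```python
import numpy as np

def loss(w, b):
    # Elongated bowl: shallow along w (horizontal), steep along b (vertical).
    return 0.5 * (w ** 2 + 100.0 * b ** 2)

def grads(w, b):
    return w, 100.0 * b

def run_gd(w, b, lr, steps):
    for _ in range(steps):
        dW, db = grads(w, b)
        w, b = w - lr * dW, b - lr * db
    return w, b

def run_rmsprop(w, b, lr, steps, beta=0.9, eps=1e-8):
    s_dW, s_db = 0.0, 0.0
    for _ in range(steps):
        dW, db = grads(w, b)
        s_dW = beta * s_dW + (1 - beta) * dW ** 2
        s_db = beta * s_db + (1 - beta) * db ** 2
        w = w - lr * dW / (np.sqrt(s_dW) + eps)
        b = b - lr * db / (np.sqrt(s_db) + eps)
    return w, b

w0, b0, lr, steps = -5.0, 1.0, 0.03, 50
gd_loss = loss(*run_gd(w0, b0, lr, steps))
rms_loss = loss(*run_rmsprop(w0, b0, lr, steps))
print(f"start: {loss(w0, b0):.3g}  gradient descent: {gd_loss:.3g}  RMSprop: {rms_loss:.3g}")
```

At this learning rate each gradient descent step multiplies b by 1 - 100 * 0.03 = -2, so the vertical oscillation grows without bound. The RMSprop step size is bounded by lr / sqrt(1 - beta) regardless of the gradient's magnitude, so on this toy problem the same learning rate still reduces the loss.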
Q: Can RMSprop be used in high-dimensional parameter spaces?
Yes. The example uses b and W as the vertical and horizontal directions for illustration, but in practice dW and db are high-dimensional parameter vectors. The intuition is the same: updates are damped in whichever dimensions have large, oscillating derivatives.
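In code, the high-dimensional case needs no special handling because every operation is element-wise. The sketch below assumes NumPy and a hypothetical layout in which parameters, gradients, and accumulators are dictionaries keyed by name; the shapes and random gradients are placeholders.

```python
import numpy as np

def rmsprop_update(params, grads, cache, lr=0.001, beta=0.9, eps=1e-8):
    """Apply one RMSprop step to every parameter array, element by element."""
    new_params, new_cache = {}, {}
    for name, value in params.items():
        # One accumulator entry per parameter entry, same shape as the parameter.
        new_cache[name] = beta * cache[name] + (1.0 - beta) * grads[name] ** 2
        new_params[name] = value - lr * grads[name] / (np.sqrt(new_cache[name]) + eps)
    return new_params, new_cache

rng = np.random.default_rng(0)
params = {"W": rng.standard_normal((4, 3)), "b": np.zeros((4, 1))}
grads = {"W": rng.standard_normal((4, 3)), "b": rng.standard_normal((4, 1))}
cache = {name: np.zeros_like(value) for name, value in params.items()}

new_params, new_cache = rmsprop_update(params, grads, cache)
print({name: value.shape for name, value in new_params.items()})
```

Each entry of W and b gets its own divisor, so a dimension with large derivatives is damped independently of the others.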
Summary & Key Takeaways

RMSprop is an optimization algorithm that reduces oscillations in the vertical direction during gradient descent, allowing for faster learning in the horizontal direction.

The algorithm computes exponentially weighted averages of the squares of derivatives to adjust the updates in the vertical and horizontal directions.

In practice, RMSprop divides the updates in the vertical direction by a larger number and in the horizontal direction by a smaller number, resulting in more stable and efficient learning.