Random Initialization (C1W3L11): Summary and Q&A
TL;DR
Initializing the weights randomly in a neural network is crucial for preventing symmetry and promoting different functions among hidden units.
Key Insights
Initializing weights to zero in a neural network leads to symmetry among hidden units.
Symmetry among hidden units inhibits learning and prevents differentiation.
Randomly initializing weights breaks symmetry and lets hidden units compute different functions.
Multiplying randomly initialized weights by a small constant prevents activation function saturation and promotes efficient learning.
Bias terms can be initialized to zero without causing symmetry problems.
The constant for weight initialization, such as 0.01, can be adjusted based on the depth of the neural network.
Random initialization is crucial for the successful training of neural networks.
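The insights above can be sketched in a few lines of NumPy. This is a minimal example with hypothetical layer sizes (2 inputs, 4 hidden units, 1 output), not code from the lecture itself:

```python
import numpy as np

np.random.seed(0)  # for reproducibility

# Hypothetical layer sizes: 2 inputs, 4 hidden units, 1 output
n_x, n_h, n_y = 2, 4, 1

# Weights: small random values break symmetry without saturating tanh/sigmoid
W1 = np.random.randn(n_h, n_x) * 0.01
W2 = np.random.randn(n_y, n_h) * 0.01

# Biases: zeros are fine -- they do not cause the symmetry problem
b1 = np.zeros((n_h, 1))
b2 = np.zeros((n_y, 1))
```

The `* 0.01` factor keeps the initial pre-activations near zero, where tanh and sigmoid have large gradients.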
Transcript
when you train your neural network it's important to initialize the weights randomly for logistic regression it was okay to initialize the weights to zero but for a neural network if you initialize the weights the parameters to all 0 and then apply gradient descent it won't work let's see why so you have here two input features so n[0] is equal to 2 and two hid...
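The symmetry argument from the transcript can be demonstrated numerically. In this sketch (toy data, not from the lecture) both hidden units start with identical weights, of which zero initialization is the special case, and gradient descent preserves that symmetry forever:

```python
import numpy as np

# Symmetry demo: start both hidden units of a 2-2-1 network with identical
# weights and run gradient descent -- the rows of W1 never become different.
np.random.seed(1)
X = np.array([[1.0, -2.0, 0.3], [0.5, 3.0, -1.2]])  # 2 features, 3 examples
Y = np.array([[1.0, 0.0, 1.0]])
m = X.shape[1]

row = np.random.randn(1, 2)
W1 = np.vstack([row, row])   # both hidden units start identical
b1 = np.zeros((2, 1))
W2 = np.full((1, 2), 0.5)    # identical outgoing weights as well
b2 = np.zeros((1, 1))

for _ in range(200):
    # forward pass: tanh hidden layer, sigmoid output
    A1 = np.tanh(W1 @ X + b1)
    A2 = 1 / (1 + np.exp(-(W2 @ A1 + b2)))
    # backward pass for binary cross-entropy loss
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.sum(axis=1, keepdims=True) / m
    # gradient-descent update
    W1 -= 0.5 * dW1; b1 -= 0.5 * db1
    W2 -= 0.5 * dW2; b2 -= 0.5 * db2

print(np.allclose(W1[0], W1[1]))  # True: the hidden units never diverge
```

Because both hidden units always see the same inputs and the same gradients, no number of iterations can make them compute different functions; random initialization is what breaks this tie.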
Questions & Answers
Q: Why is it important to initialize the weights randomly in a neural network?
Random initialization in a neural network prevents symmetry among hidden units, allowing them to compute different functions and facilitating learning and differentiation.
Q: Is it acceptable to initialize the bias terms in a neural network to zero?
Yes, initializing the bias terms to zero is acceptable as it does not cause symmetry problems like initializing weights to zero does.
Q: How does symmetry affect the learning process in a neural network?
Symmetry in a neural network makes hidden units compute the same function, hindering learning and preventing differentiation among units.
Q: Why should the weights in a neural network be initialized with small random values?
Initializing weights with small random values keeps activation functions out of their saturated regions: large weights push tanh or sigmoid pre-activations into flat parts of the curve where gradients are tiny, making gradient descent slow.
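A small deterministic sketch (toy numbers, not from the lecture) shows why the weight scale matters for tanh's gradient:

```python
import numpy as np

# Same input, two weight scales: compare tanh's gradient at the pre-activation.
a_prev = np.ones((100, 1))          # hypothetical previous-layer activations

for scale in (0.01, 1.0):
    w = np.full((1, 100), scale)    # every weight equal to the scale
    z = float(w @ a_prev)           # z = 100 * scale
    grad = 1 - np.tanh(z) ** 2      # derivative of tanh at z
    print(f"scale={scale}: z={z:.1f}, tanh'(z)={grad:.2e}")
```

With scale 0.01 the pre-activation is z = 1.0 and tanh's derivative is about 0.42; with scale 1.0 the pre-activation is z = 100, where tanh has saturated and its derivative is numerically zero, so learning stalls.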
Q: Can a different constant be used instead of 0.01 for weight initialization?
Yes. For a very deep neural network, a different small constant, such as 0.001, may work better than 0.01; how to choose this constant is covered in later material on training deep networks.
Summary & Key Takeaways

It is important to initialize the weights randomly in a neural network, unlike logistic regression where initializing weights to zero is acceptable.

Initializing weights to zero in a neural network leads to symmetry, where hidden units compute the same function, inhibiting learning and differentiation.

Multiplying randomly initialized weights by a small constant, such as 0.01, helps prevent saturation of activation functions and promotes efficient learning.