10.1. Long Short-Term Memory (LSTM) — Dive into Deep Learning 1.0.3 documentation
d2l.ai

Top Highlights

  • the problems of learning long-term dependencies (owing to vanishing and exploding gradients) became salient
  • LSTMs resemble standard recurrent neural networks but here each ordinary recurrent node is replaced by a memory cell. Each memory cell contains an internal state, i.e., a node with a self-connected recurrent edge of fixed weight 1, ensuring that the gradient can pass across many time steps without vanishing or exploding.
  • The term “long short-term memory” comes from the following intuition. Simple recurrent neural networks have long-term memory in the form of weights. The weights change slowly during training, encoding general knowledge about the data. They also have short-term memory in the form of ephemeral activations, which pass from each node to successive nodes.
  • Each memory cell is equipped with an internal state and a number of multiplicative gates that determine whether (i) a given input should impact the internal state (the input gate), (ii) the internal state should be flushed to 0 (the forget gate), and (iii) the internal state of a given neuron should be allowed to impact the cell’s output (the output gate). (These gates are sketched in code after the list.)
  • we have dedicated mechanisms for when a hidden state should be updated and also when it should be reset
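Note (not part of the highlighted page): a minimal NumPy sketch of the gate equations the highlights above describe. The function name lstm_step and the parameter names W_*, U_*, b_* are placeholders chosen for illustration, not identifiers from the D2L code.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step.

    x: input vector; h_prev, c_prev: previous hidden and cell (internal) states.
    params: dict of input weights W_*, recurrent weights U_*, and biases b_*
    for the input (i), forget (f), and output (o) gates and the candidate cell (c).
    """
    i = sigmoid(params["W_i"] @ x + params["U_i"] @ h_prev + params["b_i"])        # input gate
    f = sigmoid(params["W_f"] @ x + params["U_f"] @ h_prev + params["b_f"])        # forget gate
    o = sigmoid(params["W_o"] @ x + params["U_o"] @ h_prev + params["b_o"])        # output gate
    c_tilde = np.tanh(params["W_c"] @ x + params["U_c"] @ h_prev + params["b_c"])  # candidate state
    c = f * c_prev + i * c_tilde   # forget gate can flush the old state; input gate admits new input
    h = o * np.tanh(c)             # output gate decides whether the internal state affects the output
    return h, c

Because the internal state c is carried forward through the element-wise product with the forget gate rather than through repeated multiplication by a weight matrix, gradients can pass across many time steps without vanishing or exploding, which is the point of the second highlight above.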
