9.5. Recurrent Neural Network Implementation from Scratch — Dive into Deep Learning 1.0.3 documentation
d2l.ai
2 Users
0 Comments
23 Highlights
7 Notes

Top Highlights

  • represent each item by a one-hot encoding
  • all entries are set to 0, except for the entry corresponding to our token
  • the length of the sequence introduces a new notion of depth. In addition to the passing through the network in the input-to-output direction, inputs at the first time step must pass through a chain of $T$ layers along the time steps in order to influence the output of the model at the final time step.
  • For example, with learning rate $\eta > 0$, each update takes the form $\mathbf{x} \gets \mathbf{x} - \eta \mathbf{g}$. Let’s further assume that the objective function $f$ is sufficiently smooth. Formally, we say that the objective is Lipschitz continuous with constant $L$, meaning that for any $\mathbf{x}$ and $\mathbf{y}$, we have
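
The highlights above describe two small mechanics: one-hot encoding of a token and a single gradient-descent update. The following is a minimal sketch of both in plain Python, not the book's implementation. The function names `one_hot` and `sgd_step`, the vocabulary size of 5, and the sample numbers are illustrative assumptions that do not appear in the source.

```python
def one_hot(index, vocab_size):
    """Return a vector of length vocab_size that is 0 everywhere except at index."""
    if not 0 <= index < vocab_size:
        raise ValueError("index must lie in [0, vocab_size)")
    vec = [0] * vocab_size
    vec[index] = 1  # the single entry corresponding to the token
    return vec


def sgd_step(x, g, lr):
    """Apply one update x <- x - lr * g, elementwise, and return the new vector."""
    return [xi - lr * gi for xi, gi in zip(x, g)]


# Token with index 2 in a hypothetical vocabulary of 5 tokens.
print(one_hot(2, 5))  # [0, 0, 1, 0, 0]

# One update with learning rate 0.5 on made-up parameter and gradient values.
print(sgd_step([1.0, -2.0], [0.5, -1.0], 0.5))  # [0.75, -1.5]
```

Plain lists keep the sketch free of dependencies; a tensor library would normally replace both helpers with vectorized operations.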

