15.3. The Dataset for Pretraining Word Embeddings — Dive into Deep Learning 1.0.0-beta0 documentation
d2l.ai

Top Highlights

  • When training word embedding models, high-frequency words can be subsampled (see the subsampling sketch after this list).
  • It uniformly samples an integer between 1 and max_window_size at random as the context window size. For any center word, those words whose distance from it does not exceed the sampled context window size are its context words (see the window-sampling sketch below).
  • In a minibatch, the i-th example includes a center word and its n_i context words and m_i noise words. Due to varying context window sizes, n_i + m_i varies for different i. Thus, for each example we concatenate its context words and noise words in the contexts_negatives variable, and pad zeros until the concatenation length reaches max_len (see the batching sketch below).
  • To distinguish between positive and negative examples, we separate context words from noise words in contexts_negatives via a labels variable. Similar to masks, there is also a one-to-one correspondence between elements in labels and elements in contexts_negatives, where ones (otherwise zeros) in labels correspond to context words (positive examples) in contexts_negatives.
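The first highlight refers to frequency-based subsampling. A minimal sketch of one common variant (the word2vec heuristic: keep a token with probability sqrt(t / f)), assuming tokenized sentences and a hypothetical threshold t = 1e-4; the function name and signature here are illustrative, not the book's exact code:

```python
import math
import random
from collections import Counter

def subsample(sentences, t=1e-4):
    """Randomly drop high-frequency tokens: a token is kept with
    probability sqrt(t / f(token)), where f(token) is its relative
    frequency, so rare words are almost always kept."""
    counter = Counter(tok for line in sentences for tok in line)
    num_tokens = sum(counter.values())

    def keep(token):
        return random.uniform(0, 1) < math.sqrt(
            t / (counter[token] / num_tokens))

    return [[tok for tok in line if keep(tok)] for line in sentences]
```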
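The second highlight describes random context-window sampling. A sketch under the same assumptions (the corpus is given as lists of token indices; the name get_centers_and_contexts is illustrative):

```python
import random

def get_centers_and_contexts(corpus, max_window_size):
    """For each center word, draw a window size uniformly from
    [1, max_window_size] and take the words within that distance
    as its context words."""
    centers, contexts = [], []
    for line in corpus:
        if len(line) < 2:  # a center-context pair needs >= 2 words
            continue
        centers += line
        for i in range(len(line)):
            window_size = random.randint(1, max_window_size)
            indices = list(range(max(0, i - window_size),
                                 min(len(line), i + 1 + window_size)))
            indices.remove(i)  # the center word is not its own context
            contexts.append([line[idx] for idx in indices])
    return centers, contexts
```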
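The last two highlights describe how minibatches are padded and masked. A sketch of such a collate function, assuming PyTorch and examples given as (center, contexts, negatives) triples of token indices:

```python
import torch

def batchify(data):
    """Pad each example's contexts + negatives to max_len; masks mark
    real tokens (1) vs. padding (0), and labels mark context words (1)
    vs. noise words or padding (0)."""
    max_len = max(len(c) + len(n) for _, c, n in data)
    centers, contexts_negatives, masks, labels = [], [], [], []
    for center, context, negative in data:
        cur_len = len(context) + len(negative)
        centers += [center]
        contexts_negatives += [
            context + negative + [0] * (max_len - cur_len)]
        masks += [[1] * cur_len + [0] * (max_len - cur_len)]
        labels += [[1] * len(context) + [0] * (max_len - len(context))]
    return (torch.tensor(centers).reshape((-1, 1)),
            torch.tensor(contexts_negatives),
            torch.tensor(masks), torch.tensor(labels))
```

Multiplying the loss elementwise by masks excludes padding from the loss computation, and comparing predictions against labels turns training into binary classification of context words versus noise words.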
