Rick Merritt


22 Quotes

"A transformer model is a neural network that learns context and thus meaning by tracking relationships in sequential data like the words in this sentence."
Rick Merritt
What Is a Transformer Model?
"Transformer models apply an evolving set of mathematical techniques, called attention or self-attention, to detect subtle ways even distant data elements in a series influence and depend on each other."
Rick Merritt
What Is a Transformer Model?
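The quote above describes self-attention only in words. A minimal sketch of one common form, scaled dot-product self-attention, may make the idea concrete. It is an illustration under assumptions, not the article's own code: the function names `softmax` and `self_attention` and the toy `tokens` are hypothetical, and the learned query/key/value projections used in real transformers are left out, so each token vector serves as its own query, key and value.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X):
    """Scaled dot-product self-attention over a list of token vectors.

    Assumption: no learned projections, so X acts as queries, keys and values.
    Returns (outputs, weights), where weights[i][j] is how much token i
    attends to token j.
    """
    d = len(X[0])
    outputs, weights = [], []
    for q in X:
        # Similarity of this token to every token in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in X]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all token vectors.
        outputs.append([sum(wj * v[i] for wj, v in zip(w, X)) for i in range(d)])
    return outputs, weights

# Toy sequence: the first and last tokens are identical, the middle one differs.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, attn = self_attention(tokens)
```

In this toy input the first token attends to the last token as strongly as to itself, even though they sit at opposite ends of the sequence, which is the "distant data elements" behavior the quote refers to.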
"Transformers are translating text and speech in near real-time, opening meetings and classrooms to diverse and hearing-impaired attendees."
Rick Merritt
What Is a Transformer Model?
"Transformers can detect trends and anomalies to prevent fraud, streamline manufacturing, make online recommendations or improve healthcare."
Rick Merritt
What Is a Transformer Model?
"People use transformers every time they search on Google or Microsoft Bing."
Rick Merritt
What Is a Transformer Model?
"Created with large datasets, transformers make accurate predictions that drive their wider use, generating more data that can be used to create even better models."
Rick Merritt
What Is a Transformer Model?
"Transformers are in many cases replacing convolutional and recurrent neural networks (CNNs and RNNs), the most popular types of deep learning models just five years ago."
Rick Merritt
What Is a Transformer Model?
"Before transformers arrived, users had to train neural networks with large, labeled datasets that were costly and time-consuming to produce."
Rick Merritt
What Is a Transformer Model?
"By finding patterns between elements mathematically, transformers eliminate that need, making available the trillions of images and petabytes of text data on the web and in corporate databases."
Rick Merritt
What Is a Transformer Model?
"In addition, the math that transformers use lends itself to parallel processing, so these models can run fast."
Rick Merritt
What Is a Transformer Model?
"Small but strategic additions to these blocks (shown in the diagram below) make transformers uniquely powerful."
Rick Merritt
What Is a Transformer Model?
"Transformers use positional encoders to tag data elements coming in and out of the network. Attention units follow these tags, calculating a kind of algebraic map of how each element relates to the others."
Rick Merritt
What Is a Transformer Model?
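The quote above does not say how positions are tagged. One widely used scheme is the fixed sinusoidal encoding, sketched below as an assumption rather than as the method the article has in mind. The function name `positional_encoding` and its parameters are hypothetical.

```python
import math

def positional_encoding(num_positions, d_model):
    """Sinusoidal position tags: one vector of length d_model per position.

    Even indices use sine and odd indices use cosine, with wavelengths that
    grow geometrically across the vector. This is one common choice, assumed
    here for illustration only.
    """
    pe = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model):
            # Each sine/cosine pair (i // 2) shares one frequency.
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

pe = positional_encoding(4, 6)
```

In practice each row would be added to the embedding of the token at that position, so that attention can distinguish identical tokens that appear in different places.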
"Attention queries are typically executed in parallel by calculating a matrix of equations in what’s called multi-headed attention."
Rick Merritt
What Is a Transformer Model?
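As a rough sketch of the multi-headed attention mentioned in the quote above: each token vector is split into equal slices, attention runs separately on each slice (a "head"), and the per-head results are concatenated. The names `attention` and `multi_head_attention` are hypothetical, and the learned projection matrices of a real transformer are omitted. The heads are looped over here for readability; real implementations typically batch them into matrix multiplications so they run in parallel.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    # Scaled dot-product attention for lists of query, key and value vectors.
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        out.append([sum(wj * v[i] for wj, v in zip(w, V)) for i in range(len(V[0]))])
    return out

def multi_head_attention(X, num_heads):
    """Split each vector into num_heads slices, attend per slice, concatenate.

    Assumption: no learned projections; each slice is used directly as
    queries, keys and values for its head.
    """
    d_model = len(X[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    head_outputs = []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        Xh = [x[lo:hi] for x in X]  # this head's slice of every token
        head_outputs.append(attention(Xh, Xh, Xh))
    # Concatenate the heads' outputs token by token.
    return [sum((head[t] for head in head_outputs), []) for t in range(len(X))]

X = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0]]
out = multi_head_attention(X, num_heads=2)
```

Because each head sees a different slice of the vector, different heads can weight the same pair of tokens differently.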
"“Meaning is a result of relationships between things, and self-attention is a general way of learning relationships.”"
Rick Merritt
What Is a Transformer Model?
"“Machine translation was a good vehicle to validate self-attention because you needed short- and long-distance relationships among words.”"
Rick Merritt
What Is a Transformer Model?
"Thanks to a basket of techniques, they trained their model in just 3.5 days on eight NVIDIA GPUs, a small fraction of the time and cost of training prior models. They trained it on datasets with up to a billion pairs of words."
Rick Merritt
What Is a Transformer Model?
"Their Bidirectional Encoder Representations from Transformers (BERT) model set 11 new records and became part of the algorithm behind Google search."
Rick Merritt
What Is a Transformer Model?
"Soon transformer models were being adapted for science and healthcare."
Rick Merritt
What Is a Transformer Model?
"DeepMind, in London, advanced the understanding of proteins, the building blocks of life, using a transformer called AlphaFold2, described in a recent Nature article."
Rick Merritt
What Is a Transformer Model?
"NVIDIA and Microsoft hit a high watermark in November, announcing the Megatron-Turing Natural Language Generation model (MT-NLG) with 530 billion parameters. It debuted along with a new framework, NVIDIA NeMo Megatron, that aims to let any business create its own billion- or trillion-parameter transformers to power custom chatbots, personal assistants and other AI applications that understand language."
Rick Merritt
What Is a Transformer Model?
"To provide the computing muscle those models need, our latest accelerator — the NVIDIA H100 Tensor Core GPU — packs a Transformer Engine and supports a new FP8 format. That speeds training while preserving accuracy."
Rick Merritt
What Is a Transformer Model?
"Retrieval-based models learn by submitting queries to a database. “It’s cool because you can be choosy about what you put in that knowledge base,” he said."
Rick Merritt
What Is a Transformer Model?
