28 Quotes
"Reduce the precision of the numerical values used within the model. For example, you can switch from float32 to float16 (or even further down to int8)."
— Prateek Joshi
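A minimal sketch of what that precision reduction looks like, assuming NumPy and a simple symmetric per-tensor scale (real quantizers typically use per-channel scales and calibration data):

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: int8 values plus one float scale."""
    scale = float(np.abs(weights).max()) / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)  # approximate reconstruction of w
# int8 storage is 4x smaller than float32; rounding error is at most scale / 2
```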

How to make LLMs faster
"Train a smaller model to imitate the behavior of a larger model."
— Prateek Joshi
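The imitation described here is knowledge distillation: the student is trained against the teacher's softened output distribution. A minimal NumPy sketch of the usual soft-target loss (the temperature and the specific logits below are illustrative assumptions):

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between temperature-softened teacher and student distributions."""
    p = softmax(teacher_logits, T)          # soft targets from the large model
    q = softmax(student_logits, T)          # predictions from the small model
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)

teacher = np.array([[5.0, 1.0, -2.0]])
student = np.array([[4.0, 1.5, -1.0]])
loss = distillation_loss(student, teacher)
# loss shrinks toward zero as the student's distribution approaches the teacher's
```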

How to make LLMs faster
"Break words into smaller units (i.e. subwords). This will allow you to reduce the size of the vocabulary."
— Prateek Joshi
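A toy illustration of subword segmentation, assuming a hand-picked vocabulary and greedy longest-match lookup; real tokenizers (e.g. BPE) learn their subword inventory from corpus statistics:

```python
def subword_tokenize(word, vocab):
    """Greedy longest-match segmentation of a word into known subwords."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest remaining piece first
            piece = word[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(word[i])          # fall back to a single character
            i += 1
    return tokens

# A tiny assumed vocabulary: five subwords cover many whole words.
vocab = {"un", "break", "able", "think", "ing"}
print(subword_tokenize("unbreakable", vocab))   # ['un', 'break', 'able']
print(subword_tokenize("unthinkable", vocab))   # ['un', 'think', 'able']
```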

How to make LLMs faster
"Use highly optimized libraries (like Nvidia's TensorRT) to run your AI workloads. It can significantly boost the performance."
— Prateek Joshi

How to make LLMs faster
"A good chunk of the chip's memory bandwidth is consumed by the model parameters that you load."
— Prateek Joshi
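A back-of-envelope version of that observation: when decoding one token at a time, every weight must be streamed from memory per token, so bandwidth caps throughput. The 7B parameter count and 1 TB/s bandwidth below are assumed, illustrative figures, not measurements:

```python
# Illustrative numbers, not a benchmark: a 7B-parameter model on an
# accelerator with roughly 1e12 bytes/second of memory bandwidth.
params = 7e9
bandwidth = 1e12  # bytes per second

def max_tokens_per_second(params, bytes_per_param, bandwidth):
    """Single-stream decoding reads every weight once per token, so memory
    bandwidth divided by model size bounds tokens per second."""
    model_bytes = params * bytes_per_param
    return bandwidth / model_bytes

print(max_tokens_per_second(params, 2, bandwidth))  # float16: ~71 tokens/s
print(max_tokens_per_second(params, 1, bandwidth))  # int8: ~143 tokens/s
```

This is also why the quantization quote above matters for speed, not just memory: halving bytes per parameter roughly doubles the bandwidth-bound decoding ceiling.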

How to make LLMs faster
"You don't have to load model parameters for every input sequence. You can batch them together and load the parameters only once."
— Prateek Joshi
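A sketch of why batching helps, assuming NumPy and a toy layer size: one matrix multiply serves the whole batch, so the weights are read from memory once rather than once per sequence:

```python
import numpy as np

d = 512
W = np.random.randn(d, d).astype(np.float32)    # weights, loaded from memory once

def forward_batched(W, X):
    """One matmul for the whole batch: W is reused across every row of X."""
    return X @ W.T

batch = np.random.randn(8, d).astype(np.float32)  # 8 input sequences together
out = forward_batched(W, batch)

# Weight bytes read per input shrink as the batch grows:
for b in (1, 8, 64):
    print(f"batch={b}: {W.nbytes / b:.0f} weight bytes per input")
```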

How to make LLMs faster
"They are compact additional layers in the model (e.g. LoRA, QLoRA). These layers are tunable, which means you can train them to do what you want. You can make these layers lightweight, which helps the model to learn quickly."
— Prateek Joshi
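A minimal NumPy sketch of a LoRA-style adapter: the base weights W stay frozen and only the small low-rank factors A and B are trained (the sizes here are toy assumptions):

```python
import numpy as np

d, r = 512, 8                                   # rank r << d keeps the adapter small
rng = np.random.default_rng(0)
W = rng.standard_normal((d, d)).astype(np.float32)          # frozen base weights
A = (rng.standard_normal((r, d)) * 0.01).astype(np.float32)  # trainable down-projection
B = np.zeros((d, r), dtype=np.float32)                       # trainable up-projection,
                                                             # zero-init so the adapter
                                                             # starts as a no-op
def lora_forward(x):
    """Base projection plus a low-rank trainable correction: W x + B (A x)."""
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d).astype(np.float32)
y = lora_forward(x)

# Only A and B are trained: 2*d*r parameters instead of d*d.
print(A.size + B.size, "adapter params vs", W.size, "frozen params")
```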

Verticalized AI
"These are AI-infused products that are designed to solve specific problems in a particular vertical."
— Prateek Joshi

Verticalized AI
"Verticalized AI models are specific to that domain and cannot really do much outside of that domain."
— Prateek Joshi

Verticalized AI
"Because it will cost way more for the customer to go out and use disjointed tools."
— Prateek Joshi

Verticalized AI
"Verticalized AI is particularly well suited to the enterprise. And companies that build verticalized AI applications are poised to win big time."
— Prateek Joshi