Kazuki
@kazuki
Cofounder of Glasp. I collect ideas and stories worth sharing 📚
San Francisco, CA
Joined Oct 9, 2020
1073 Following
5839 Followers
pmarchive.com/luck_and_the_entrepreneur.html · Sep 15, 2022 · 201
bryce.medium.com/most-people-won-t-ff0959cdefc6 · Sep 15, 2022 · 3
foundersatwork.posthaven.com/grow-the-puzzle-around-you · Sep 15, 2022 · 141
waitbutwhy.com/2015/12/the-tail-end.html · Sep 15, 2022 · 4
www.albertbridgecapital.com/post/stay-in-the-game · Sep 15, 2022 · 1
www.youtube.com/watch?v=tyL0OwAgc_I · Sep 12, 2022 · 2
nfap.com/wp-content/uploads/2022/07/Immigrant-Entrepreneurs-and-Billion-Dollar-Companies.DAY-OF-RELEASE.2022.pdf · Sep 12, 2022 · 11
hardfork.substack.com/p/the-breaking-of-the-modern-mind-the · Sep 11, 2022 · 4
www.youtube.com/watch?v=qvHhhIfu7Lo · Sep 10, 2022 · 23
ruben.verborgh.org/articles/redecentralizing-the-web/ · Sep 9, 2022 · 6
arxiv.org/pdf/2205.06345.pdf · Sep 9, 2022 · 9
hbr.org/2007/07/the-knowledge-creating-company · Sep 9, 2022 · 7
aigrant.org/ · Sep 8, 2022 · 7
www.gatesnotes.com/Health/Why-do-children-die · Sep 6, 2022 · 11
digitalnative.substack.com/p/the-long-tail-the-internet-and-the · Sep 6, 2022 · 162
e-tarjome.com/storage/panel/fileuploads/2019-12-16/1576487113_gh76.pdf · Sep 6, 2022 · 15
www.quantamagazine.org/self-taught-ai-shows-similarities-to-how-the-brain-works-20220811 · Sep 3, 2022 · 7
www.forbes.com/sites/robtoews/2022/03/27/a-wave-of-billion-dollar-language-ai-startups-is-coming/?sh=32af08f62b14 · Sep 3, 2022 · 9
www.sciencedirect.com/science/article/abs/pii/S0148296319300992 · Sep 3, 2022 · 2
every.to/divinations/dall-e-2-and-the-origin-of-vibe-shifts · Aug 31, 2022 · 123
venturebeat.com/business/ai-weekly-google-sets-the-bar-for-ai-language-models-with-palm/ · Aug 31, 2022 · 8
blog.eladgil.com/2022/08/ai-revolution-transformers-and-large.html · Aug 31, 2022 · 111
www.kleinerperkins.com/case-study/google/ · Aug 30, 2022 · 51
longnow.org/ideas/02022/07/29/how-humans-grew-acorn-brains/ · Aug 25, 2022 · 15
www.psychologytoday.com/us/blog/creative-explorations/201506/the-janusian-process-in-creativity · Aug 24, 2022 · 8
medium.com/taking-notes/yet-another-article-about-extensions-6aeca0225bfc · Aug 24, 2022 · 61
taschalabs.com/how-to-use-tokenization-for-business-growth-7-lessons-from-a-successful-project/ · Aug 24, 2022 · 202
digitalnative.substack.com/p/cac-customer-acquisition-chaos · Aug 24, 2022 · 173
writingcooperative.com/five-pieces-of-writing-wisdom-most-writers-dont-learn-until-5-years-in-d57b33dab22c · Aug 24, 2022 · 9
www.ted.com/talks/larry_page_where_s_google_going_next/transcript · Aug 23, 2022 · 71
nesslabs.com/habit-trackers · Aug 22, 2022 · 101
nesslabs.com/work-in-public · Aug 19, 2022 · 122
on.substack.com/p/one-million-strong · Aug 17, 2022 · 5
rosie.land/posts/a-guide-to-curation-in-community/ · Aug 16, 2022 · 11
nesslabs.com/flow · Aug 16, 2022 · 7
tamethestars.wordpress.com/2022/08/16/how-to-find-new-things-to-learn/ · Aug 16, 2022 · 41
whizzoe.substack.com/p/how-to-monetize-the-curation-economy · Aug 16, 2022 · 51
radreads.co/43-life-lessons-at-age-43/ · Aug 15, 2022 · 9
www.pewresearch.org/internet/2022/08/10/teens-social-media-and-technology-2022/ · Aug 15, 2022 · 81
a16zcrypto.com/cc0-nft-creative-commons-zero-license-rights/ · Aug 15, 2022 · 15
The number of parameters is important in LLMs, although more parameters don’t necessarily translate to a better-performing model.
PaLM 540B is in the same league as some of the largest LLMs available regarding the number of parameters: OpenAI's GPT-3 with 175 billion, DeepMind's Gopher and Chinchilla with 280 billion and 70 billion, Google's own GLaM and LaMDA with 1.2 trillion and 137 billion, and Microsoft-Nvidia's Megatron-Turing NLG with 530 billion.
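To make those parameter counts concrete, here is a back-of-the-envelope sketch (my own arithmetic, not from the article) of what merely storing each model's weights implies, assuming 16-bit parameters:

```python
# Parameter counts quoted above; the storage math is illustrative only.
PARAM_COUNTS = {
    "PaLM": 540e9,
    "Megatron-Turing NLG": 530e9,
    "Gopher": 280e9,
    "GPT-3": 175e9,
    "LaMDA": 137e9,
    "Chinchilla": 70e9,
    "GLaM": 1.2e12,  # sparse mixture-of-experts, so not directly comparable
}

BYTES_PER_PARAM = 2  # assuming fp16/bfloat16 weights

for name, n in sorted(PARAM_COUNTS.items(), key=lambda kv: -kv[1]):
    tb = n * BYTES_PER_PARAM / 1e12
    print(f"{name:>22}: {n / 1e9:6.0f}B params ≈ {tb:4.2f} TB of weights")
```

Even before any training or serving overhead, a 540-billion-parameter model is roughly a terabyte of raw weights, which is part of why parameter count alone says little about how practical a model is.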
The first thing to consider when discussing LLMs, like any other AI model, is the efficiency of the training process.
PaLM uses a standard Transformer model architecture, with some customizations. The Transformer is the architecture used by all LLMs, and although PaLM deviates from it in some ways, what is arguably more important is the composition of the training dataset.
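For readers who haven't seen one, below is a minimal sketch of a single decoder-style Transformer block in PyTorch. It is illustrative only: the layer sizes are arbitrary, and PaLM's actual customizations (parallel layers, multi-query attention, SwiGLU activations, and so on) are not reproduced here.

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """One pre-norm, causally masked self-attention + feed-forward block."""

    def __init__(self, d_model=512, n_heads=8, d_ff=2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x):
        # Causal mask: each token may only attend to itself and earlier tokens.
        seq_len = x.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out           # residual connection around attention
        x = x + self.ff(self.ln2(x))  # residual connection around feed-forward
        return x

block = DecoderBlock()
tokens = torch.randn(2, 16, 512)   # (batch, sequence, embedding)
print(block(tokens).shape)         # torch.Size([2, 16, 512])
```

Large models like PaLM stack dozens of such blocks and scale up the embedding and feed-forward widths; the block structure itself stays essentially the same.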
The dataset used to train PaLM is a mixture of filtered multilingual web pages (27%), English books (13%), multilingual Wikipedia articles (4%), English news articles (1%), GitHub source code (5%) and multilingual social media conversations (50%). This dataset is based on those used to train LaMDA and GLaM.
Nearly 78% of all sources are English, with German and French sources at 3.5% and 3.2% and all other sources trailing far behind.
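As a quick sanity check on the quoted mixture, here is a small sketch; the shares come from the passage above, while the code and labels are just illustrative:

```python
# Training-data mixture for PaLM as quoted in the article.
MIXTURE = {
    "social media conversations (multilingual)": 0.50,
    "filtered web pages (multilingual)":         0.27,
    "books (English)":                           0.13,
    "GitHub source code":                        0.05,
    "Wikipedia (multilingual)":                  0.04,
    "news articles (English)":                   0.01,
}

# The quoted shares sum to 100% of the training mixture.
assert abs(sum(MIXTURE.values()) - 1.0) < 1e-9

for source, share in MIXTURE.items():
    print(f"{share:5.0%}  {source}")
```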
PaLM 540B surpassed the few-shot performance of prior LLMs on 28 of 29 tasks.
PaLM outperforms the prior top score of 55% on grade-school math word problems (the GSM8K benchmark), which was achieved by fine-tuning GPT-3 with a training set of 7,500 problems and combining it with an external calculator and verifier. This new score also approaches the 60% average of problems solved by 9- to 12-year-olds, the target audience for the question set.
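For context on what "few-shot" means here: the model is given a handful of worked examples inside the prompt, with no gradient updates, and then asked a new question. A minimal sketch follows, with made-up example problems and a hypothetical prompt-building helper; it does not call any real model:

```python
# Hypothetical worked examples (the "shots") placed ahead of the test question.
EXAMPLES = [
    ("Roger has 5 tennis balls. He buys 2 more cans of 3 balls each. "
     "How many balls does he have now?", "11"),
    ("There are 3 cars in the lot and 2 more arrive. "
     "How many cars are in the lot?", "5"),
]

def build_few_shot_prompt(question: str) -> str:
    """Prepend the worked examples to a new question, few-shot style."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in EXAMPLES)
    return f"{shots}\n\nQ: {question}\nA:"

print(build_few_shot_prompt("If a pencil costs 3 cents, how much do 7 pencils cost?"))
```

The resulting string is what would be fed to the language model; no fine-tuning on the benchmark's training set is involved, which is what makes the few-shot comparison across models meaningful.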