
Patterns for Building LLM-based Systems & Products
eugeneyan.com/writing/llm-patterns/
Jun 19, 2024
1
Arvind Narayanan on X: "Tired: train/test leakage. Wired: benchmark contamination. Inspired: resample until answer is correct." / X
x.com/random_walker/status/1803392358093857127
Jun 19, 2024
1
Alex Cheema - e/acc on X: "Llama 3 running locally on iPhone with MLX Built by @exolabs_ team @mo_baioumy h/t @awnihannun MLX & @Prince_Canuma for the port https://t.co/4swkM7mOfI" / X
x.com/ac_crypto/status/1781061013716037741
Jun 19, 2024
1
2406.11741v1.pdf
arxiv.org/pdf/2406.11741
Jun 19, 2024
1

Context caching | Google AI for Developers | Google for Developers
ai.google.dev/gemini-api/docs/caching?lang=python
Jun 19, 2024
1
x.com/johnathanbi/status/1803096216299090267?s=12
Jun 19, 2024
Pass@k or Pass@1? · Issue #1 · trotsky1997/MathBlackBox
github.com/trotsky1997/MathBlackBox/issues/1
Jun 18, 2024
1
quickwit-oss/tantivy: Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust
github.com/quickwit-oss/tantivy
Jun 18, 2024
1
Olympiad Solutions - Search / X
x.com/search?q=Olympiad%20Solutions&src=typed_query
Jun 18, 2024
1

The 100 Rep Squat Challenge
kettlebellaerobics.substack.com/p/the-100-rep-squat-challenge
Jun 18, 2024
1

Applied LLMs - What We’ve Learned From A Year of Building with LLMs
applied-llms.org/
Jun 18, 2024
1
Terry Yue Zhuo on X: "In the past few months, we’ve seen SOTA LLMs saturating basic coding benchmarks with short and simplified coding tasks. It's time to enter the next stage of coding challenge under comprehensive and realistic scenarios! -- Here comes BigCodeBench, benchmarking LLMs on solving… https://t.co/w3Z6N5wnVk" / X
x.com/terryyuezhuo/status/1803076834520945117
Jun 18, 2024
1
François Chollet on X: "I believe that program synthesis will solve reasoning. And I believe that deep learning will solve program synthesis (by guiding a discrete program search process). But I don't think you can go all that far with just prompting a LLM to generate end-to-end Python programs (even…" / X
x.com/fchollet/status/1803096195684012371
Jun 18, 2024
1
Caiming Xiong on X: "🎆I am pleased to announce the release of the latest version of the Salesforce Embedding Model (SFR-embedding-v2), which has reclaimed the top-1 position on the MTEB benchmark. ✨ Key Highlights: 🥇 Achieved the distinction of being the second model to surpass a 70+ performance… https://t.co/ucs4gXfp1v" / X
x.com/CaimingXiong/status/1802879572385714496
Jun 18, 2024
1

Debunking the Chessboard: Confronting GPTs Against Chess Engines to Estimate Elo Ratings and Assess Legal Move Abilities
blog.mathieuacher.com/GPTsChessEloRatingLegalMoves/
Jun 18, 2024
1
Beyond the Basics of Retrieval for Augmenting Generation – Parlance
parlance-labs.com/education/rag/ben.html
Jun 18, 2024
1

TaskMeAnything
www.task-me-anything.org/
Jun 18, 2024
2
John David Pressman on X: "My problem with "transformers don't generalize algebraic structures and therefore don't reason" is that while I agree this is a real limitation there are important aspects of reason which these models in fact do and other methods don't. We may need to divide "reason" up." / X
x.com/jd_pressman/status/1802835378451185733
Jun 18, 2024
1
François Chollet on X: "@RyanPGreenblatt @TomDAAVID @AndrewTBurks @dwarkesh_sp This isn't reasoning, it's intuition. Intuition is a fast, perception-like, inexact, approximate way of navigating a complex space. Your LLM has "intuition" over the space of program, which can be used to fight combinatorial complexity and make discrete program search more…" / X
x.com/fchollet/status/1802790666420035646
Jun 18, 2024
1
François Chollet on X: "@mahaoo_ASI @wintermoat SOTA did not go from 35% to 50%. The 50% is on the evaluation set, the 35% is on the private test set. The solution that does ~35% on the private test set also did ~50% on the evaluation set, so 50% on the eval set is not clearly a new SOTA (it might be, but it isn't clear)" / X
x.com/fchollet/status/1802807579489468846
Jun 18, 2024
1
Warp on X: "Type plain English on the command line. Accomplish any dev task. This is the command line for the AI era. New Agent Mode is available today. https://t.co/ptqib32w8o" / X
x.com/warpdotdev/status/1802736163507118387
Jun 18, 2024
1
Gergely Orosz on X: "After having read it, I can say this is probably the best book to explain how ChatGPT (and LLMs) work (written by Stephen Wolfram, who excels at explaining complex topics simply, so perhaps not a surprise) There's also a blog post form for those wanting to read online." / X
x.com/GergelyOrosz/status/1802798002081251497
Jun 18, 2024
2