
Asterfusion Releases Its 星智 AI Network Solution for LLM Bearer Networks
asterfusion.com/a20240205-ai-llm-solution/
Jun 13, 2024
4

A Technical Survey of Cloud-Native Machine Learning Platforms (Orchestration and Scheduling) - Laiye Technology
laiye.com/news/post/2627.html
May 20, 2024
1

DeepSpeed/blogs/deepspeed-fastgen/2024-01-19 at master · microsoft/DeepSpeed
github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-fastgen/chinese
May 20, 2024
8

Nvidia Blackwell Perf TCO Analysis - B100 vs B200 vs GB200NVL72
www.semianalysis.com/p/nvidia-blackwell-perf-tco-analysis
May 11, 2024
2

Running Kubernetes Clusters on OpenStack in Production - 墨天轮
www.modb.pro/db/47575
May 9, 2024
2

The Relationship Between OpenStack and K8s: Differences Between OpenStack and Kubernetes (K8s)
www.usa-idc.com/news/idc/2023071414.shtml
May 9, 2024
1

Intel Introduces Gaudi 3 AI Accelerator: Going Bigger and Aiming Higher In AI Market
www.anandtech.com/show/21342/intel-introduces-gaudi-3-accelerator-going-bigger-and-aiming-higher
Apr 10, 2024
1

Nvidia GB200 Architecture Analysis: Interconnect Architecture and Future Evolution - EE Times China
www.eet-china.com/mp/a301182.html
Apr 8, 2024
5

Making Brute-Force Aesthetics Elegant: NVidia's Rack Scale
zhuanlan.zhihu.com/p/689424234
Apr 8, 2024
1

Analysis and Interpretation of Nvidia's AI Chip Roadmap
wallstreetcn.com/articles/3712058
Apr 7, 2024
2

A Guide to GPT-4 “Alchemy”: The Secrets of MoE, Parameter Count, Training Cost, and Inference
www.aixinzhijie.com/article/6825966
Apr 2, 2024
4

Nvidia A100 Knowledge Sharing: GPU Board Set Valued at 12,000 per Server
www.jaeaiot.com/news/detail/32.html
Feb 22, 2024
3

SemiAnalysis | Dylan Patel | Substack
www.semianalysis.com/p/groq-inference-tokenomics-speed-but?utm_source=post-email-title&publication_id=329241&post_id=141888751&utm_campaign=email-post-title&isFreemail=true&r=b0aiz&utm_medium=email
Feb 22, 2024
1

Accelerating Generative AI with PyTorch II: GPT, Fast
pytorch.org/blog/accelerating-generative-ai-2/
Jan 22, 2024
1

How Nvidia’s CUDA Monopoly In Machine Learning Is Breaking - OpenAI Triton And PyTorch 2.0
www.semianalysis.com/p/nvidiaopenaitritonpytorch
Jan 22, 2024
9

Advanced GPU Notes (1): High-Performance GPU Server Hardware Topology and Cluster Networking (2023)
arthurchiao.art/blog/gpu-advanced-notes-1-zh/
Jan 8, 2024
1

TPUv5e: The New Benchmark in Cost-Efficient Inference and Training for <200B Parameter Models
www.semianalysis.com/p/tpuv5e-the-new-benchmark-in-cost
Dec 28, 2023
3

DSA's Road to a Comeback
zhuanlan.zhihu.com/p/626287371
Dec 26, 2023
2

Counting the Moves Missing from Nvidia's "Huang-Style Knife Work" (Part 1)
zhuanlan.zhihu.com/p/642260820
Dec 26, 2023
6

On the Weak Spots in the Nvidia Empire
zhuanlan.zhihu.com/p/639181571
Dec 20, 2023
3

NLP (20): A Discussion of KV Cache Optimization Methods and an In-Depth Look at StreamingLLM
zhuanlan.zhihu.com/p/659770503
Oct 11, 2023
1

The Crazy H100: A Brief Analysis of Modern GPU Architecture, Starting from Compute Anxiety
zhuanlan.zhihu.com/p/659738090?utm_psn=1693972042385416192
Oct 9, 2023
2