
Mooncake (1): Making Mooncakes at Moonshot AI, Kimi's KVCache-Centric Disaggregated Inference Architecture
zhuanlan.zhihu.com/p/705754254
Jul 12, 2024
3

AI Inference: From Frontier Technology to Hands-On Commercialization Observations (Community Edition) - Feishu Docs
miracleplus.feishu.cn/docx/Lqe1dgVTho0vEVxZqLZcFpmgnkb
Jul 10, 2024
1

From bare metal to a 70B model: infrastructure set-up and scripts
imbue.com/research/70b-infrastructure/
Jul 3, 2024
1

Asterfusion Launches Its Xingzhi (星智) AI Network Solution for LLM Bearer Networks
asterfusion.com/a20240205-ai-llm-solution/
Jun 13, 2024
4

A Technical Survey of Cloud-Native Machine Learning Platforms (Orchestration and Scheduling) - Laiye Technology
laiye.com/news/post/2627.html
May 20, 2024
1

DeepSpeed/blogs/deepspeed-fastgen/2024-01-19 at master · microsoft/DeepSpeed
github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-fastgen/chinese
May 20, 2024
8

Nvidia Blackwell Perf TCO Analysis - B100 vs B200 vs GB200NVL72
www.semianalysis.com/p/nvidia-blackwell-perf-tco-analysis
May 11, 2024
2

Running Kubernetes Clusters on OpenStack in Production - Modb (墨天轮)
www.modb.pro/db/47575
May 9, 2024
2

The Relationship Between OpenStack and K8s: Differences Between OpenStack and Kubernetes (K8s)
www.usa-idc.com/news/idc/2023071414.shtml
May 9, 2024
1

Intel Introduces Gaudi 3 AI Accelerator: Going Bigger and Aiming Higher In AI Market
www.anandtech.com/show/21342/intel-introduces-gaudi-3-accelerator-going-bigger-and-aiming-higher
Apr 10, 2024
1

NVIDIA GB200 Architecture Analysis: Interconnect Architecture and Future Evolution - EE Times China
www.eet-china.com/mp/a301182.html
Apr 8, 2024
5

Making Brute-Force Aesthetics Elegant: NVIDIA's Rack Scale
zhuanlan.zhihu.com/p/689424234
Apr 8, 2024
1

Analysis and Interpretation of NVIDIA's AI Chip Roadmap
wallstreetcn.com/articles/3712058
Apr 7, 2024
2

A GPT-4 "Alchemy" Guide: The Secrets of MoE, Parameter Count, Training Cost, and Inference
www.aixinzhijie.com/article/6825966
Apr 2, 2024
4

NVIDIA A100 Knowledge Sharing: GPU Board Set Worth 12,000 per Server
www.jaeaiot.com/news/detail/32.html
Feb 22, 2024
3

SemiAnalysis | Dylan Patel | Substack
www.semianalysis.com/p/groq-inference-tokenomics-speed-but?utm_source=post-email-title&publication_id=329241&post_id=141888751&utm_campaign=email-post-title&isFreemail=true&r=b0aiz&utm_medium=email
Feb 22, 2024
1

Accelerating Generative AI with PyTorch II: GPT, Fast
pytorch.org/blog/accelerating-generative-ai-2/
Jan 22, 2024
1

How Nvidia’s CUDA Monopoly In Machine Learning Is Breaking - OpenAI Triton And PyTorch 2.0
www.semianalysis.com/p/nvidiaopenaitritonpytorch
Jan 22, 2024
9

Advanced GPU Notes (1): High-Performance GPU Server Hardware Topology and Cluster Networking (2023)
arthurchiao.art/blog/gpu-advanced-notes-1-zh/
Jan 8, 2024
1