AI Course Notes
Alibaba Cloud
AGI Roundtable
1 set of lecture notes in total.
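| Lecture notes | Date | Source | Resources |
| --- | --- | --- | --- |
| Roundtable: The Development Path of Large Models Toward AGI | 2024 | Alibaba Cloud | Read · LaTeX · Backup PDF |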