Algorithms & Systems for AI
What this subject is for: making models fast, cheap, and deployable. Topics include distributed training, parallelism, the KV cache, FlashAttention, quantization, and inference optimization.
Track status: 18 substantive concept pages · 6 stubs awaiting next cycle. See the live generation status and the latest retrospective.
Concepts
- Attention Mechanisms
- Data Parallelism
- Distributed Training Arc
- Distributed Training
- Flash Attention
- Inference optimization
- KV cache
- LLM Architecture Optimizations
- LLM Inference
- Mixed-precision training
- Model Deployment
- Model Parallelism
- Pipeline Parallelism
- Post-Training Quantization
- Precision scaling
- Quantization-Aware Training
- Quantization
- Tensor parallelism
Auto-seeded stubs awaiting next cycle: communication-collectives, compiler-optimizations-for-ml, differentiable-optimization, kv-cache-management, quantization-basics, reinforcement-learning-schedulers
Arcs through this subject
No arcs yet — the retrospective proposes these once concept coverage hits ≥4 pages per track.
Key thinkers
Author pages pending.
Builds tied to this subject
MVB recipes pending. For now they live inside the "Build it" sections of concept pages.
Auto-rebuilt from filesystem state by scripts/rebuild_track_indexes.py — see system architecture.