An Interview with Zen Chief Architect Mike Clark
a year ago
- #Performance Optimization
- #CPU Architecture
- #x86 vs ARM
- The Zen microarchitecture has helped raise AMD's CPU market share from 10% to 25% over eight years.
- Mike Clark, Zen's chief architect, discusses the flexibility of x86 and ARM ISAs, noting both can achieve similar performance per watt with appropriate microarchitecture.
- The x86 ISA's variable-length instructions and stronger memory model don't fundamentally limit performance or power efficiency.
- Larger page sizes (e.g., 16 KB or 64 KB) could benefit Zen architectures by reducing TLB pressure, though 4 KB pages remain manageable with techniques like page combining.
- CPUs and GPUs use different cache line and register sizes (64 bytes vs. 128 bytes or more) because CPUs target low-latency workloads while GPUs target throughput.
- Scatter/gather operations are challenging on CPUs due to bandwidth limitations, but wider adoption in software could justify hardware improvements.
- Nontemporal stores can outperform regular stores when used correctly, as they reduce cache pollution and simplify memory subsystem operations.
- Modern CPU pipelines remain conceptually similar to older designs (e.g., Bulldozer), though proprietary optimizations prevent detailed public diagrams.
- Long-latency instructions like `sqrtpd` are not fully pipelined; the scheduler accounts for this and avoids issuing operations that would overlap with them on the same unit.
- Software developers are encouraged to adopt new ISA features (e.g., wider vectors, AI instructions) and provide feedback to guide future hardware designs.
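
The point about page sizes and TLB pressure comes down to simple arithmetic: each TLB entry maps one page, so larger pages let the same number of entries cover more memory. The sketch below illustrates this under stated assumptions. The entry count (`TLB_ENTRIES = 2048`) and the 64 MiB working set are hypothetical values chosen for illustration, not published Zen figures, and the helper names are invented for this example.

```python
# Hypothetical TLB size for illustration only; not a published Zen figure.
TLB_ENTRIES = 2048

KIB = 1024
MIB = 1024 * KIB


def tlb_reach_bytes(entries: int, page_size_bytes: int) -> int:
    """Memory covered by a fully populated TLB, assuming one page per entry."""
    return entries * page_size_bytes


def pages_touched(working_set_bytes: int, page_size_bytes: int) -> int:
    """Distinct pages needed to map a contiguous, page-aligned working set."""
    return -(-working_set_bytes // page_size_bytes)  # ceiling division


if __name__ == "__main__":
    working_set = 64 * MIB  # assumed working set size
    for page_kib in (4, 16, 64):
        page = page_kib * KIB
        reach_mib = tlb_reach_bytes(TLB_ENTRIES, page) // MIB
        needed = pages_touched(working_set, page)
        print(f"{page_kib:>2} KiB pages: TLB reach {reach_mib} MiB, "
              f"{needed} pages needed for a 64 MiB working set")
```

With these assumed numbers, 4 KiB pages need far more translations than the TLB holds, while 64 KiB pages fit the whole working set. Real hardware has multiple TLB levels and may combine contiguous pages, so this is only a first-order model.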