The Return of Rigorous Full-System Timing Simulation
2 days ago
- #timing-simulation
- #full-system-simulation
- #computer-architecture
- Argues for a return to rigorous full-system timing simulation by measuring appropriate intervals, using correct metrics, and applying statistically sound methods.
- Highlights the importance of full-system simulation for capturing OS activity, interrupts, I/O, and device interactions critical to modern workloads.
- Discusses the "timing simulation wall": detailed cycle-level simulation is too slow, which pushes researchers toward approximations that may sacrifice accuracy.
- Explains that modern workloads require full-system simulation because they run on service-oriented, multi-tenant, heterogeneous systems that combine CPUs, GPUs, and accelerators.
- Compares simulation speeds, showing timing simulation is significantly slower than functional simulation or ISA emulation.
- Notes that capturing performance variability for server workloads requires simulating many seconds, which is impractical with current cycle-accurate simulators.
- Suggests using user-level IPC (U-IPC) over total IPC as a better metric for useful work, but emphasizes the need for metric validation.
- Critiques abbreviated measurement techniques like fixed instruction windows and phase-based sampling for potentially missing OS effects and lacking error bounds.
- Advocates for statistical sampling (e.g., SMARTS), which provides error bounds and confidence intervals and represents both frequent and impactful instructions.
- Describes a state-of-the-art sampling framework using QFlex 3.0, involving functional simulation, checkpointing, and parallel timing simulation.
- Outlines challenges such as accurate state generation, the functional simulation wall, checkpointing overheads, and difficulties with non-average or service-level metrics.
- Mentions open problems including multi-node full-system simulation and the need for interoperability across simulators.
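
To make the statistical-sampling point above concrete, here is a minimal Python sketch of the underlying arithmetic. It is not SMARTS or QFlex code: the function names, the 95% z-value of 1.96, and the input data are all assumptions chosen for illustration. It uses standard sampling theory (a normal-approximation confidence interval and the usual sample-size estimate based on the coefficient of variation) applied to per-unit measurements such as U-IPC.

```python
import math
import statistics


def systematic_sample(values, interval, offset=0):
    """Pick every `interval`-th measurement unit, starting at `offset`."""
    return values[offset::interval]


def confidence_interval(sample, z=1.96):
    """Return (mean, half_width) of a normal-approximation confidence interval.

    z=1.96 corresponds to roughly 95% confidence (an assumption of this sketch).
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    stdev = statistics.stdev(sample)  # sample standard deviation (n - 1)
    half_width = z * stdev / math.sqrt(n)
    return mean, half_width


def required_sample_size(sample, rel_error, z=1.96):
    """Estimate how many units are needed so the confidence-interval
    half-width stays within `rel_error` (e.g. 0.05 for 5%) of the mean."""
    mean = statistics.fmean(sample)
    cv = statistics.stdev(sample) / mean  # coefficient of variation
    return math.ceil((z * cv / rel_error) ** 2)


if __name__ == "__main__":
    # Hypothetical per-unit U-IPC values from an initial pilot sample.
    pilot = [1.0, 2.0, 3.0, 4.0, 5.0]
    mean, half_width = confidence_interval(pilot)
    print(f"U-IPC = {mean:.3f} +/- {half_width:.3f}")
    print("units needed for 10% error:", required_sample_size(pilot, 0.10))
```

The point of the sketch is that a sampled estimate comes with a quantified error: if the interval is too wide, the pilot sample's variability indicates roughly how many more measurement units are needed, which fixed instruction windows and phase-based sampling do not offer.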