The SPEC CPU 2017 Benchmark and Hardware Simulation – A User Story

One of the difficulties in processor simulation is finding workloads that are based on real applications. The gem5 simulator, combined with SPEC CPU 2017, has recently helped to highlight complex areas of processor design related to moving data more efficiently.

Jason Lowe-Power

Jason Lowe-Power, an associate professor of computer architecture research at the University of California, Davis (UCD) for eight years, leads a group working in two critical areas. One is developing ideas for new hardware that moves data more efficiently in future processors. The other is developing and specifying an infrastructure that lets other researchers and vendors simulate the performance of new hardware ideas and designs, whether developed by the group or by others, so they can assess viability and value before committing to manufacturing. It is in simulating architecture that the SPEC CPU 2017 benchmark plays a vital role.

Lowe-Power is the project management committee chair of gem5, a widely used open source platform that allows computer architecture researchers to simulate a wide range of computer systems, from simple embedded processors to complex out-of-order superscalar cores to full systems that include memory hierarchies, I/O devices, and operating systems. gem5 was primarily conceived in the 2000s at the University of Michigan and the University of Wisconsin-Madison. Since then, hundreds of individuals from Arm, AMD, Google, HP, and other companies have been instrumental in its development.

gem5 can model the behavior of a processor at a very detailed level, tracking the execution of individual instructions cycle by cycle and modeling features such as pipelining, caching, and branch prediction. It also offers different levels of simulation detail and speed, allowing researchers to choose the right trade-off for their specific needs. Atomic mode, the fastest but least detailed, treats memory accesses as completing immediately, with only approximate latencies. Timing mode is more detailed and models memory-system timing, including queuing delays and contention. Full-system mode simulates an entire system, including booting an operating system, running applications, and interacting with virtual peripherals. These capabilities enable research teams to assess the potential impact on an entire system if they replace just one hardware component with a new design.
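The speed-versus-detail trade-off described above can be illustrated with a toy model. The sketch below is not gem5 code and does not use any gem5 API; the function names, cache size, and latencies are all hypothetical values chosen for illustration. One function charges a flat cost per memory access, while the other walks a tiny least-recently-used cache and charges different costs for hits and misses.

```python
from collections import OrderedDict
from typing import Iterable


def fast_estimate(addresses: Iterable[int], cycles_per_access: int = 1) -> int:
    """Low-detail model: every access costs the same flat number of cycles."""
    return sum(cycles_per_access for _ in addresses)


def detailed_estimate(
    addresses: Iterable[int],
    num_lines: int = 4,       # hypothetical cache capacity, in lines
    line_bytes: int = 64,     # hypothetical cache line size
    hit_cycles: int = 1,      # hypothetical hit latency
    miss_cycles: int = 100,   # hypothetical miss latency
) -> int:
    """Higher-detail model: a small LRU cache decides the cost of each access."""
    cache: "OrderedDict[int, None]" = OrderedDict()
    total = 0
    for addr in addresses:
        line = addr // line_bytes
        if line in cache:
            cache.move_to_end(line)        # mark the line as most recently used
            total += hit_cycles
        else:
            total += miss_cycles
            cache[line] = None
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return total


if __name__ == "__main__":
    trace = [0, 8, 64, 0]  # made-up byte addresses
    print("fast model:", fast_estimate(trace))
    print("detailed model:", detailed_estimate(trace))
```

The flat-cost model does less work per access but cannot tell a cache-friendly trace from a cache-hostile one. The detailed model can, at the price of more bookkeeping per access. A real simulator makes the same kind of choice at far greater scale.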

Once researchers decide on their simulation strategy, they still need to assess the performance of the new design in real-world scenarios. You can have an idea that makes “matrix multiply,” a cornerstone workload used in evaluating computing system performance, go a million times faster, but that will have no impact on an application like Microsoft Word, which spends little or none of its time multiplying matrices. So it’s critical that researchers and vendors use benchmarks that are representative of real-world use.
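The matrix-multiply point can be made concrete with Amdahl's law, a standard formula that the article itself does not name. If a fraction f of a program's run time is accelerated by a factor s, the overall speedup is 1 / ((1 - f) + f / s). The sketch below is a minimal illustration; the function name and the 1% fraction are assumptions chosen for the example, not measurements of any real application.

```python
def overall_speedup(fraction_accelerated: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only part of it gets faster."""
    if not 0.0 <= fraction_accelerated <= 1.0:
        raise ValueError("fraction_accelerated must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    unchanged = 1.0 - fraction_accelerated
    return 1.0 / (unchanged + fraction_accelerated / local_speedup)


if __name__ == "__main__":
    # Assumption for illustration: matrix multiply is 1% of the run time
    # and the new hardware makes that part a million times faster.
    print(overall_speedup(0.01, 1_000_000.0))
```

Under that assumed 1% share, the application as a whole speeds up by only about 1%, however large the local speedup. This is why the choice of workload matters as much as the size of the improvement.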

This is why Lowe-Power uses the SPEC CPU 2017 benchmark suites. The benchmark is a set of standardized tests used to measure and compare compute-intensive performance. It stresses a system's processor, memory subsystem, and compiler using the broadest set of workloads developed from real user applications. There are 43 workloads in all, including integer workloads such as 500.perlbench_r, which runs the SpamAssassin email checker, and 557.xz_r, a data compression and decompression utility using the LZMA2 algorithm. There are also floating-point workloads such as 526.blender_r, a 3D modeling and rendering application, and the data-movement-intensive workloads 503.bwaves_r (explosion modeling), 549.fotonik3d_r (computational electromagnetics), and 554.roms_r (regional ocean modeling).

Recently, Lowe-Power’s team used the SPEC CPU 2017 benchmark to validate the models in gem5 against real hardware. The models provided by gem5 are highly configurable and can simulate a wide variety of CPU designs, but finding trusted values for the model parameters is difficult and time-consuming. With the SPEC CPU 2017 benchmark, Lowe-Power’s team is providing validated models that give other researchers a reliable baseline against which to compare new ideas.
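One common way to express this kind of validation, offered here as a sketch and not as the team's actual method, is to run the same workloads on the simulator and on real hardware and report the percentage error between the two. The function names, workload labels, and numbers below are hypothetical placeholders, not results from gem5 or from any SPEC CPU 2017 run.

```python
from typing import Dict, Tuple


def relative_error(simulated: float, measured: float) -> float:
    """Signed error of the simulated value relative to the hardware measurement."""
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return (simulated - measured) / measured


def mean_absolute_percentage_error(results: Dict[str, Tuple[float, float]]) -> float:
    """Average of |relative error| over all workloads, as a percentage."""
    if not results:
        raise ValueError("results must contain at least one workload")
    errors = [abs(relative_error(sim, hw)) for sim, hw in results.values()]
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    # Placeholder (simulated, measured) run times in seconds; invented values.
    placeholder_results = {
        "workload_a": (110.0, 100.0),
        "workload_b": (90.0, 100.0),
    }
    print(mean_absolute_percentage_error(placeholder_results))
```

A low average error across many different workloads is what makes a set of model parameters trustworthy as a baseline. A model tuned to match one workload may still be far off on the others.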

Lowe-Power’s team has also used the benchmark to evaluate the potential of cryogenic semiconductor computing and superconductor electronics as alternatives to traditional semiconductor devices.

“SPEC has a well-deserved reputation for using its benchmarks to measure the performance of existing hardware systems, but not as many people know about our support for the SPEC CPU 2017 benchmark in gem5,” said Lowe-Power. “There are thousands of users of gem5 around the world who could get better insights into the benefits of their architectural ideas by taking advantage of the real-world workloads in the SPEC CPU 2017 benchmark. This can help them make better decisions about what is ultimately the future of computing.”

SPEC is committed to making access to benchmarks affordable for use in academic papers and research and offers special academic pricing on the SPEC CPU 2017 benchmark for organizations with 501(c)(3) or equivalent standing.