1. Introduction
2. Research
The comparison of the many available computer benchmarks has received much attention [1-4]; techniques such as
characterizing the benchmark code and statistical analysis of lower-level program characteristics have been developed.
Saavedra and Smith [4] characterize the benchmark codes of the SPEC CPU and Perfect Club suites.
They also introduce the concept of benchmark instability to characterize benchmarks that are dominated
by a code block whose execution time varies across architectures, and show that LINPACK is one of the most
unstable benchmarks.
Yi et al. [3] identify benchmark similarity based on how the benchmarks affect the processors.
Using a Plackett-Burman experimental design, they identify the key processor design parameters.
They group a selected subset of the SPEC CPU benchmarks into eight similarity clusters.
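The Plackett-Burman design mentioned above can be illustrated with a minimal sketch. It builds the standard 8-run, 7-factor design from its cyclic generator row and estimates the main effect of each factor. The function names and the response model are hypothetical and chosen only for illustration; this is the basic design, not necessarily the exact configuration used in [3].

```python
# Standard 8-run Plackett-Burman generator row (+1 = high level, -1 = low level).
GENERATOR_8 = [+1, +1, +1, -1, +1, -1, -1]


def plackett_burman_8():
    """Return the 8-run, 7-factor Plackett-Burman design as a list of rows."""
    n = len(GENERATOR_8)
    # The first seven runs are the cyclic right-shifts of the generator row.
    rows = [GENERATOR_8[-s:] + GENERATOR_8[:-s] for s in range(n)]
    # The final run sets every factor to its low level.
    rows.append([-1] * n)
    return rows


def main_effects(design, responses):
    """Effect of factor j = mean response at +1 minus mean response at -1."""
    half = len(design) / 2
    return [
        sum(row[j] * y for row, y in zip(design, responses)) / half
        for j in range(len(design[0]))
    ]


design = plackett_burman_8()
# Hypothetical response (assumption): only factors 0 and 3 influence the result.
responses = [10 + 3 * run[0] - 2 * run[3] for run in design]
print(main_effects(design, responses))
```

Because the design columns are mutually orthogonal, the estimated effects for factors 0 and 3 come out as twice their coefficients in the hypothetical model (6 and -4), and the remaining factors show zero effect. Eight runs are enough to rank seven parameters.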
Joshi et al. [2] compare the four generations of SPEC CPU benchmarks, and conclude that
the dynamic instruction count is the most changed; as a result, the temporal data locality
has decreased over time, while the other benchmark characteristics have remained largely the same.
Hoste and Eeckhout [1] compare the microarchitecture-independent characteristics of
six benchmark suites (SPEC CPU 2000 (SPEC2k), MediaBench, MiBench, BioInfoMark, CommBench, and BioMetricsWorkload).
They find that 9 of the 14 floating-point benchmarks of the SPEC2k suite form a cluster of almost identical characteristics,
and that the benchmarks from the MediaBench and MiBench suites have correspondents in the SPEC2k benchmarks.
They also find that the BioInfoMark, BioMetricsWorkload, and CommBench suites comprise several benchmarks that
are not similar to any of the SPEC2k benchmarks.
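The similarity analyses surveyed above group benchmarks whose measured characteristics lie close together. The following is a minimal sketch of that idea, assuming each benchmark is described by a small numeric feature vector. It standardizes the features and applies single-linkage clustering with a distance threshold, which is a stand-in technique and not the specific method of [1], [2], or [3]. The benchmark names and feature values are synthetic placeholders, not measurements from any of the cited papers.

```python
import math


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def cluster(names, features, threshold):
    """Single linkage: merge groups whose closest members are within threshold."""
    points = zscore_columns(features)
    clusters = [[i] for i in range(len(names))]
    merged = True
    while merged:
        merged = False
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                if any(
                    math.dist(points[i], points[j]) <= threshold
                    for i in clusters[a]
                    for j in clusters[b]
                ):
                    clusters[a] += clusters.pop(b)
                    merged = True
                    break
            if merged:
                break
    return [sorted(names[i] for i in c) for c in clusters]


# Synthetic placeholder data (assumption): [load fraction, branch fraction].
names = ["fp_a", "fp_b", "int_a", "int_b"]
features = [[0.30, 0.05], [0.31, 0.06], [0.20, 0.20], [0.21, 0.19]]
print(cluster(names, features, threshold=1.0))
```

With these placeholder values the two floating-point-like programs end up in one group and the two integer-like programs in another. A benchmark that stays alone in its cluster at a reasonable threshold would correspond to one that is "not similar to any" benchmark in the reference suite.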
3. Benchmarks
| Idx | Name | Keywords | Description | Timeline | Links |
| 1 | Perfect Club | general, workstation | The Perfect Benchmarking effort. An acronym for PERFormance Evaluation by Cost-effective Transformations, Perfect was created at UIUC's Center for Supercomputing Research and Development (CSRD) in an effort to focus supercomputer performance evaluation at the applications level. The thirteen Perfect Benchmarks are Fortran programs drawn from a number of scientific and engineering research areas, including air pollution, fluid dynamics, seismic migration, quantum chromodynamics, and signal processing. | 1994 - SPEC and Perfect Club merge; Complete FORTRAN source code | |
| 2 | NASA Downloadable Benchmark Codes | general, workstation, parallel, etc. | A list of software titles available to download, both restricted and unrestricted. | | |
| 3 | SPEC | general, workstation, $$$ | | SPEC CPU 2000 | |
| 4 | PhysicsBench | game physics, desktop/server, single/parallel | PhysicsBench is a benchmark suite that represents physics simulation in future game scenarios, to be used by (1) computer architects and researchers for real-time physics hardware/software designs, and (2) application designers to determine gaming platform performance bounds. | PhysicsBench 1.0/2.0 -- see Chapter 4 and Section 5.1 | |
| 5 | MediaBench | media, desktop/server, single/parallel | The MediaBench Consortium is making it its mission to provide a forum for developing and refining benchmark suites for multimedia systems. MediaBench consists of a number of popular embedded applications for communications and multimedia. The list of applications is available here. | MediaBench I; MediaBench II | |
| 6 | MiBench | embedded, parallel | MiBench is a free, commercially representative embedded benchmark suite. | MiDataSets project: New MiBench benchmark datasets (20 per program) and a MiBench fork. | |
| 7 | CommBench | telecommunications networking | CommBench is a benchmark for use in evaluating and designing telecommunications network processors. The benchmark applications focus on small, computationally intense program kernels typical of the network processor environment. CommBench is composed of eight programs, four of them oriented towards packet header processing and four oriented towards data stream processing. | v.2.1.1 | |
| 8 | LMbench | general, system performance | | LMbench v.2; LMbench v.3 | |
| 9 | CommBench | scientific, parallel | CommBench is a benchmark for use in evaluating and designing telecommunications network processors. The benchmark applications focus on small, computationally intense program kernels typical of the network processor environment. CommBench is composed of eight programs, four of them oriented towards packet header processing and four oriented towards data stream processing. | v.2.1.1 | |
| 10 | PARKBench | scientific, parallel | The PARallel Kernels and BENCHmarks (PARKBENCH) suite is a comprehensive set of parallel benchmarks that includes low-level, kernel, and compact application benchmarks. | v.2.1.1 | |
| 11 | HPC Challenge Benchmark | scientific, parallel | The HPC Challenge benchmark currently consists of 7 benchmarks: HPL, STREAM, RandomAccess, PTRANS, FFTE, DGEMM, and b_eff Latency/Bandwidth. HPL is the Linpack TPP benchmark; it stresses the floating-point performance of a system. STREAM measures sustainable memory bandwidth (in GB/s). RandomAccess measures the rate of random updates of memory. PTRANS measures the rate of transfer for large arrays of data from the multiprocessor's memory. Latency/Bandwidth measures (as the name suggests) the latency and bandwidth of communication patterns of increasing complexity between as many nodes as is time-wise feasible. | HPCC Benchmark v.1.2.0 | |
| 12 | Stanford SPLASH-2 | scientific, parallel | The Modified SPLASH-2 Benchmarks Suite is based on the Original SPLASH-2 Benchmarks Suite: it incorporates a few bug fixes and, mostly, changes to make the suite compatible with modern programming practices. | Original SPLASH 2 (copy) | |
| 13 | NPB | scientific, parallel, registration | The NASA Parallel Benchmarks (NPB) are a small set of programs designed to help evaluate the performance of parallel supercomputers. The benchmarks, which are derived from computational fluid dynamics (CFD) applications, consist of five kernels and three pseudo-applications. | | |
| 14 | BioBench/BioParallel | bioinformatics, single/parallel | BioBench is a bioinformatics suite assembled by members of the Systems and Computer Architecture Lab (SCAL) in the Department of Electrical and Computer Engineering at the University of Maryland. It includes applications for Sequence Similarity Searching (BLAST, FASTA), Phylogenetic Analysis (PHYLIP), Multiple Sequence Alignment (CLUSTALW), Sequence Profile Searching (HMMER), Genome-level Alignment (MUMMER), and Sequence Assembly (TIGR). | | |
| 15 | ALPBench | multimedia, single/parallel, registration | The All Levels of Parallelism for Multimedia Benchmark Suite (ALPBench) consists of a set of parallelized complex media applications gathered from various sources and modified to expose thread-level and data-level parallelism. The application areas are speech recognition, face recognition, ray tracing, and MPEG-2 encoding/decoding. | | |
| 16 | BioMetricsWorkload | biometrics, single | BioMetricsWorkload (BMW) is a benchmark suite of biometric applications. To characterize the architectural aspects of biometrics applications, various benchmarks have been collected. Currently, the proposed BMW suite contains five applications (i.e., handwriting, fingerprint, face, voice, and gait recognition) which cover a variety of the major biometrics techniques. | | |
| 17 | BioInfoMark | bioinformatics, single | BioInfoMark is a benchmark suite of representative bioinformatics applications to facilitate the design and evaluation of computer architectures for these emerging workloads. The suite was designed, and is currently maintained, by Dr. Tao Li and Yue Li of the IDEAL research lab at the University of Florida. The suite is released freely for educational and research use, and can be downloaded via this page. | | |
| 18 | BioPerf | bioinformatics, single/parallel | BioPerf is a benchmark suite of representative bioinformatics applications to facilitate the design and evaluation of high-performance computer architectures for these emerging workloads. Currently, the BioPerf suite contains codes from 10 highly popular bioinformatics packages and covers the major fields of study in computational biology such as sequence comparison, phylogenetic reconstruction, protein structure prediction, and sequence homology & gene finding. The BioPerf suite includes benchmark source code, input datasets of various sizes, and information for compiling and using the benchmarks. It also includes parallel codes where available. | | |
| 19 | Bonnie/Bonnie++ | file system, single | Bonnie is a file system benchmark that tests sequential I/O and random accesses. Bonnie++ is a benchmark suite that is aimed at performing a number of simple tests of hard drive and file system performance. | Bonnie (archive); Bonnie++ | Bonnie; Bonnie++ |
| 20 | GrenchMark/ServMark | general, grid | GrenchMark is a flexible and extensible framework for generating synthetic grid workloads. | | |
| Idx | Reference | Links |
| [1] | Kenneth Hoste, Lieven Eeckhout: Comparing Benchmarks Using Key Microarchitecture-Independent Characteristics. IISWC 2006: 83-92. | |
| [2] | Ajay Joshi, Aashish Phansalkar, Lieven Eeckhout, Lizy Kurian John: Measuring Benchmark Similarity Using Inherent Program Characteristics. IEEE Trans. Computers 55(6): 769-782 (2006). | |
| [3] | Joshua J. Yi, David J. Lilja, Douglas M. Hawkins: A Statistically Rigorous Approach for Improving Simulation Methodology. HPCA 2003: 281-. | |
| [4] | Rafael H. Saavedra, Alan Jay Smith: Analysis of Benchmark Characteristics and Benchmark Performance Prediction. ACM Trans. Comput. Syst. 14(4): 344-384 (1996). | |