Although we’ve talked about all the potential that HPC has, we haven’t seen it in action yet. In fact, we don’t even know how it stacks up against other supercomputers. One of the team’s goals this semester is to determine the capabilities of HPC. Benchmarking means testing against a standard. Although benchmarking provides a measure of system performance, it cannot tell us how HPC will perform in every situation.
For example, a widely known benchmark for automobiles is the time it takes to accelerate to 60 mph. This gives us a frame of reference for the car’s acceleration, but it says nothing about several other factors that may be important in evaluating the car. The acceleration benchmark cannot tell us the car’s gas mileage or its top speed; in fact, it may not even reflect the car’s acceleration after wear and tear or modifications.
In the last post, we mentioned a benchmarking tool called High Performance LINPACK (HPL), a standard benchmark for high performance computing. According to a recent article on CNET, many have criticized the use of LINPACK as a benchmark in light of China’s Tianhe-1A GPU-based system claiming first place on the Top 500 list, as LINPACK only tests floating-point calculations. We realize that a benchmark merely tests a certain aspect in a set scenario, but it will help us get a better feel for how powerful HPC actually is.
HPL is just one of the benchmarks we’ll be running on HPC. We plan to install the HPC Challenge Benchmark, a benchmarking suite that includes seven tests, in order to give us a more complete picture.
• HPL will test how fast HPC can execute floating-point calculations on linear equations.
• DGEMM, a routine in the BLAS linear algebra interface, will test the floating-point rate of double-precision real matrix-matrix multiplication.
• STREAM will test the “sustainable memory bandwidth (in GB/s) and the corresponding computation rate for simple vector kernels.” This is important because certain operations can be limited by a lack of memory bandwidth rather than by processing power.
• PTRANS tests the simultaneous communication between pairs of processors.
• RandomAccess measures the rate of random updates of memory in Giga-updates per second (GUPS). It avoids the cache by randomizing what is needed from memory.
• FFT measures the floating-point execution rate of a double-precision complex one-dimensional Discrete Fourier Transform.
• A set of tests based on b_eff determines the “bandwidth and latency of a number of simultaneous communication patterns.” b_eff returns an average number, as messages can be of different sizes.
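To make the units in the list above concrete, here is a minimal Python sketch of how a few of these rates are usually derived from a problem size and a measured time. It is not part of the HPC Challenge suite: the function names are made up for illustration, and the pure-Python triad loop only shows the shape of the measurement, since an interpreted loop says nothing about what the hardware can really sustain.

```python
import time


def dgemm_flops(m, n, k):
    """Floating-point operations in C = A*B, with A (m x k) and B (k x n).

    Each of the m*n output entries needs k multiplies and k adds.
    """
    return 2.0 * m * n * k


def triad_bytes(n, word_bytes=8):
    """Bytes moved by a STREAM-style triad a[i] = b[i] + s*c[i].

    Two reads and one write per element, 8 bytes per double.
    """
    return 3 * n * word_bytes


def rate(work, seconds):
    """Generic rate: units of work per second (flop/s, bytes/s, ...)."""
    return work / seconds


def gups(updates, seconds):
    """Giga-updates per second, the unit RandomAccess reports."""
    return updates / seconds / 1e9


def time_triad(n, scalar=3.0):
    """Run an illustrative triad over Python lists and time it."""
    b = [1.0] * n
    c = [2.0] * n
    start = time.perf_counter()
    a = [b[i] + scalar * c[i] for i in range(n)]
    elapsed = time.perf_counter() - start
    return a, elapsed


if __name__ == "__main__":
    n = 1_000_000
    _, elapsed = time_triad(n)
    print(f"triad: {rate(triad_bytes(n), elapsed) / 1e9:.3f} GB/s (interpreted loop)")
    print(f"1000x1000 DGEMM needs {dgemm_flops(1000, 1000, 1000):.0f} flops")
```

The real benchmarks do the same bookkeeping (a known amount of work divided by wall-clock time), but with tuned kernels and problem sizes large enough to exercise the whole machine.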
With benchmarking, we can gauge the performance of HPC based on actual data. Once we have assessed the capabilities of our system, we will use its power to solve non-trivial problems such as those related to computational biology and chemistry. Keep checking back as we make progress.