HPC Challenge Benchmark Record

System Information
Affiliation:   Army High Performance Computing Research Center (AHPCRC)   URL:  
Location:   Minneapolis Minnesota   System Use:   Government
System Manufacturer:   Cray Inc.   System Name:   X1
Interconnect Manufacturer:   Cray   Interconnect Type:   Cray modified 2D torus
Operating System:   Unicos/MP 2.4   MPI:   MPT.
MPI Wtick:     BLAS:   Cray libsci
Language:   C   Compiler:  
Compiler Flags:     Processor Type:   Cray X1 MSP
Processor Speed:   0.8 GHz   Total Processors:   128
Processors Entered:   124   Processors determined:   124
Cores per chip:     HPL Processes:   124
MPI Processes:   124   Threads Entered:   1
Threads determined:   1   FLOPs per cycle:   16
Theoretical peak:   1.5874 TFlop/s   Total memory:   GiB
FFT library:    
Explain Optimizations:
STREAM: Aligned the data to cache line boundaries and added no_cache_alloc directives.
Single-CPU RandomAccess: Changed the vector length to 1024 and added a concurrent directive.
MPI RandomAccess: Changed the distribution so that all processors hold an equal number of elements except the last CPU, which eliminated the need for an if test. Implemented "Extra Buckets" to vectorize.
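
The MPI RandomAccess distribution change described above can be illustrated with a minimal sketch. It assumes a block distribution in which every rank holds ceil(n / p) elements and only the last rank may hold fewer. The helper names (block_size, owner_of, local_count) are hypothetical and are not taken from the submitted code or from the HPCC sources. With this layout the owner of a global index is a single integer division, so the lookup needs no if test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not the submitter's code. */

/* Block size shared by every rank except possibly the last: ceil(n / p). */
static uint64_t block_size(uint64_t n, uint64_t p)
{
    assert(p > 0);
    return (n + p - 1) / p;
}

/* Owner of global index i: one integer division, no branch. */
static uint64_t owner_of(uint64_t i, uint64_t n, uint64_t p)
{
    assert(i < n);
    return i / block_size(n, p);
}

/* Elements held by a rank; only the last non-empty rank can be short. */
static uint64_t local_count(uint64_t rank, uint64_t n, uint64_t p)
{
    uint64_t b = block_size(n, p);
    uint64_t start = rank * b;
    if (start >= n) {
        return 0; /* can only happen when n is small relative to p */
    }
    uint64_t rest = n - start;
    return rest < b ? rest : b;
}
```

For example, with n = 1024 and p = 10, ranks 0 through 8 each hold 103 elements and rank 9 holds the remaining 97. The branch in local_count runs once at setup, while owner_of, which sits on the update path, stays branch-free.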

HPL:   1.18203 Tflop/s   HPL time:   4217.86 seconds
HPL eps:   1.110223e-16   HPL Rnorm1:  
HPL Anorm1:     HPL AnormI:  
HPL Xnorm1:     HPL XnormI:  
HPL N:   195555   HPL NB:   112
HPL NProw:   4   HPL NPcol:   31
HPL depth:   1   HPL NBdiv:   2
HPL NBmin:   4   HPL CPfact:   R
HPL CRfact:   R   HPL CPtop:   1
HPL order:   R
HPL dMach EPS:     HPL sMach EPS:  
HPL dMach sfMin:     HPL sMach sfMin:  
HPL dMach Base:     HPL sMach Base:  
HPL dMach Prec:     HPL sMach Prec:  
HPL dMach mLen:     HPL sMach mLen:  
HPL dMach Rnd:     HPL sMach Rnd:  
HPL dMach eMin:     HPL sMach eMin:  
HPL dMach rMin:     HPL sMach rMin:  
HPL dMach eMax:     HPL sMach eMax:  
HPL dMach rMax:     HPL sMach rMax:  
dweps:     sweps:  
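
The HPL figures above can be cross-checked with a short sketch. It assumes the standard HPL operation count of 2/3 N^3 + 3/2 N^2 and that the HPL time is given in seconds. The function names are hypothetical. With N = 195555 and a time of 4217.86, the computed rate comes out close to the recorded 1.18203 Tflop/s, which is roughly 74% of the listed theoretical peak of 1.5874 TFlop/s.

```c
#include <assert.h>

/* Hypothetical helpers for checking the record; not part of HPCC. */

/* HPL's nominal operation count for an N x N solve: 2/3 N^3 + 3/2 N^2. */
static double hpl_flops(double n)
{
    assert(n > 0.0);
    return (2.0 / 3.0) * n * n * n + 1.5 * n * n;
}

/* Sustained rate in Tflop/s for a run lasting the given number of seconds. */
static double hpl_tflops(double n, double seconds)
{
    assert(seconds > 0.0);
    return hpl_flops(n) / seconds / 1.0e12;
}

/* Fraction of theoretical peak that was sustained. */
static double hpl_efficiency(double tflops, double peak_tflops)
{
    assert(peak_tflops > 0.0);
    return tflops / peak_tflops;
}
```

Calling hpl_tflops(195555.0, 4217.86) and passing the result to hpl_efficiency with a peak of 1.5874 reproduces the comparison described above.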

PTRANS:   39.3826 GB/s   PTRANS time:   1.94 seconds
PTRANS residual:   0   PTRANS N:   97777
PTRANS NB:   121   PTRANS NProw:   4
PTRANS NPcol:   31

S-STREAM Copy:   20.4881 GB/s   S-STREAM Scale:   21.0179 GB/s
S-STREAM Add:   23.8419 GB/s   S-STREAM Triad:   24.0247 GB/s
EP-STREAM Copy:   19.5068 GB/s   EP-STREAM Scale:   19.3419 GB/s
EP-STREAM Add:   21.1848 GB/s   EP-STREAM Triad:   21.7521 GB/s
STREAM Vector Size:   102800416   STREAM Threads:   1

S-RandomAccess:   0.208216 Gup/s   EP-RandomAccess:   0.208677 Gup/s
G-RandomAccess:   Gup/s   G-RandomAccess N:  
G-RandomAccess time:   90.268261 seconds   G-RandomAccess Check Time:   seconds
G-RandomAccess Errors:     G-RandomAccess Errors Fraction:  
G-RandomAccess TimeBound:     G-RandomAccess ExeUpdates:  
RandomAccess N:  

S-FFT:   GFlop/s   EP-FFT:   GFlop/s
MPIFFT:   GFlop/s   MPIFFT N:  
MPIFFT Max Error:     MPIFFT time0:   seconds
MPIFFT time1:   seconds   MPIFFT time2:   seconds
MPIFFT time3:   seconds   MPIFFT time4:   seconds
MPIFFT time5:   seconds   MPIFFT time6:   seconds
FFTEnblk:     FFTEnp:  

S-DGEMM:   GFlop/s   EP-DGEMM:   GFlop/s

RandomRing Latency/Bandwidth
RandomRing Latency:   20.8481 usec   RandomRing Bandwidth:   0.803881 GB/s

NaturalRing Latency/Bandwidth
NaturalRing Latency:   18.089 usec   NaturalRing Bandwidth:   4.11851 GB/s

PingPong Latency/Bandwidth
Maximum PingPong Latency:   9.69063 usec   Maximum PingPong Bandwidth:   9.103323 GB/s
Minimum PingPong Latency:   8.109 usec   Minimum PingPong Bandwidth:   4.99775 GB/s
Average PingPong Latency:   8.981 usec   Average PingPong Bandwidth:   8.502434 GB/s

Size of Data Types
char:   byte     short:   bytes
int:   bytes   long:   bytes
void ptr:   bytes   float:   bytes
double:   bytes   size t:   bytes
s64Int:   bytes   u64Int:   bytes

M OpenMP:     OpenMP Num Threads:  
OpenMP Num Procs:     OpenMP Max Threads:  

MemProc:     MemSpec:  


Version: 0.5.1.b - Run Type: opt - Parent ID: 28
Created: 2004-05-03 - Exported: Thu Jun 23 15:55:33 2022