Abstract: A portable program executes on different platforms and yields consistent performance. With the focus on portability, this paper presents an in-depth study of the performance of three NAS benchmarks (EP, MG, FT) compiled with three commercial HPF compilers (APR, PGI, IBM) on the IBM SP2. Each benchmark is evaluated in two versions: using DO loops and using F90 constructs and/or HPF's Forall statement. Base-line comparison is provided by versions of the benchmarks written in Fortran/MPI and ZPL, a data parallel language developed at the University of Washington
While some F90/Forall programs achieve scalable performance with some compilers, the results indicate a considerable portability problem in HPF programs. Two sources for the problem are identified. First, Fortran's semantics require extensive analysis and optimization to arrive at a parallel program; therefore relying on the compiler's capability alone leads to unpredictable performance. Second, the wide differences in the parallelization strategies used by each compiler often require an HPF program to be customized for the particular compiler. While improving compiler optimizations may help to reduce some performance variations, the results suggest that the foremost criteria for portability is a concise performance model that the compiler must adhere to and that the users can rely on.