[DFTB-Plus-User] Running DFTB+ on Ibm-multicores

Giovanni La Penna glapenna at iccom.cnr.it
Wed Apr 21 12:01:36 CEST 2010


Dear DFTB users and developers,

  I am measuring the performance of DFTB+ 1.1 on systems
of 1300 atoms on IBM multicore machines (SP6-based,
running AIX). Each node of the architecture I am
testing has 32 cores, so it should run OpenMP
applications with 32 threads, or even 64 with some tricks.
The nodes do run this way, but the user and wall times
(measured with the Unix "time" command) are the same
as those of the serial version (netlib LAPACK).
The makefile is attached.
Two questions:

1) Is the "time" command the proper way to measure
performance for OpenMP applications? Are there other
possibilities?

2) Before I try to install a (possibly) multithreaded
build of ATLAS (the one with "pt"), do you expect any improvement
over the thread-safe ATLAS build (the one without "pt",
the only one installed on the system so far)?
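
To make question 1 concrete: as far as I understand, "time" prints
three numbers, "real" (wall clock), "user" and "sys" (CPU seconds,
summed over all threads of the process). The small sketch below is
only an illustration of how I would compare a threaded run with the
serial one. It is plain Python, not part of DFTB+; the function name
summarize_timing and all the numbers in it are made up for the
example and are not measurements.

```python
def summarize_timing(real_s, user_s, sys_s, serial_real_s, threads):
    """Derive speedup figures from the three numbers printed by `time`.

    real_s        -- wall-clock ("real") seconds of the threaded run
    user_s, sys_s -- CPU seconds, summed over all threads of the process
    serial_real_s -- wall-clock seconds of the serial reference run
    threads       -- number of OpenMP threads requested
    """
    if real_s <= 0 or serial_real_s <= 0 or threads < 1:
        raise ValueError("times must be positive and threads >= 1")
    speedup = serial_real_s / real_s
    return {
        # How many times faster than the serial run (wall clock).
        "speedup": speedup,
        # Fraction of the ideal speedup for this thread count.
        "efficiency": speedup / threads,
        # Average number of cores kept busy during the run.
        "busy_cores": (user_s + sys_s) / real_s,
    }


# Hypothetical numbers, for illustration only: a 32-thread run whose
# wall and CPU times both match the serial run.
no_gain = summarize_timing(real_s=100.0, user_s=99.0, sys_s=1.0,
                           serial_real_s=100.0, threads=32)
print(no_gain)
```

With numbers like these, busy_cores comes out close to 1, which would
mean that only one core was doing work even though 32 threads were
requested. In a run that really uses the threads I would expect "real"
to drop while "user" stays about the same or grows.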

Thank you in advance,

               Giovanni La Penna

============================================================
Giovanni La Penna - National research council (Cnr)
Institute for chemistry of organo-metallic compounds (Iccom)
via Madonna del Piano 10,
I-50019 Sesto Fiorentino, Firenze, Italy
tel.: +39 055 522-5264, fax: +39 055 522-5203
e-mail: glapenna at iccom.cnr.it - http://www.iccom.cnr.it/lapenna
skype: giovannilapenna
============================================================
-------------- next part --------------
# -*- makefile -*-
############################################################################
# System dependent Makefile options for
# AIX, IBM xlf compiler (version 3.1)
############################################################################

# Fortran 90 compiler
# cf. http://www.nersc.gov/nusers/resources/software/ibm/xlf.php
#FC90 = xlf95 -qsuffix=f=f90 # -qsuppress=cmpmsg
FC90 = xlf95_r -qsuffix=f=f90 # -qsuppress=cmpmsg


# Optimization flags for the Fortran 90 compiler
#FC90OPT = -O3 -qstrict -qarch=auto -qtune=auto
# From Jump documentation on PowerPC6
FC90OPT = -O3 -qsmp=omp -qstrict -qarch=pwr6 -qtune=pwr6 

# Preprocessor (leave empty, if the compiler has a built in preprocessor)
#CPP = cpp -traditional
CPP = cpp

# Options for preprocessing
CPPOPT = -WF,-DDEBUG=$(DEBUG)

# Postprocessing of the preprocessor output (add-on pipe)
CPPPOST = $(ROOT)/utils/fpp/fpp.sh noln2

# Linker
LN = $(FC90)

# Linker options
LNOPT = 

# Override options for different DEBUG modes
ifeq ($(DEBUG),1)
    FC90OPT = -g
endif
ifeq ($(DEBUG),2)
    FC90OPT = -g
endif
ifeq ($(DEBUG),3)
    FC90OPT = -g -C
endif

# Library options in general
# on Cineca-sp6
# module load atlas
LIBOPT = -L${ATLAS_LIB} 
# netlib lapack (alternative to the ATLAS line above)
# module load lapack
#LIBOPT = -L${LAPACK_LIB}

# How to link specific libraries

# standard lapack
# these make a rather slow parallel dftb on Cineca-sp6
#LIB_BLAS   =  -lblas
#LIB_LAPACK = -llapack -lxlsmp

# thread-safe atlas
LIB_LAPACK = -llapack -lxlsmp
LIB_BLAS = -lf77blas -lcblas -latlas


