[DFTB-Plus-User] Parallelization performance of v. 18.1

Ben Hourahine benjamin.hourahine at strath.ac.uk
Wed Mar 21 00:15:11 CET 2018


Hello Mie,

your graph seems to be consistent with a parallel fraction of ~95% if the
speed-up saturates at about a factor of 20 for the largest case, as per
Amdahl's law. This is about what would be expected for the ScaLAPACK
eigensolvers. Later this year we will also be offering some other
parallel solvers with somewhat better scaling efficiency.
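(For reference, a minimal sketch of the Amdahl's-law estimate above; the
95% parallel fraction is the figure inferred from the graph, not a
measured property of the code.)

```python
def amdahl_speedup(p, n):
    """Ideal speedup on n cores for a code whose parallel fraction is p.

    Amdahl's law: the serial fraction (1 - p) is irreducible, so the
    speedup saturates at 1 / (1 - p) as n grows.
    """
    return 1.0 / ((1.0 - p) + p / n)

# With ~95% of the runtime parallelised the speedup can never exceed 20x,
# and 84 cores already give only ~16x.
print(amdahl_speedup(0.95, 84))
```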

We will add some material to the online recipes about parallel code use
in the next few weeks. This will include a discussion about the type of
strong scaling calculation you are using. I suspect what you actually
need is the weak scaling behaviour of the code, i.e. where the size of
the problem is also increased as the number of processors is increased.
This weak scaling behaviour is approximately quadratic in the number of
processors (at least for the example cases I've tried), as would be
expected from the cubic scaling of the eigenproblem.
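As a toy illustration of that quadratic weak scaling (the prefactor c is
arbitrary and perfect parallel efficiency is assumed, so this is only the
asymptotic cost model, not a prediction for DFTB+ timings):

```python
def solve_time(n_atoms, n_cores, c=1.0):
    """Wall time for an O(n^3) dense eigensolve spread over n_cores,
    assuming ideal parallelism and an arbitrary cost prefactor c."""
    return c * n_atoms**3 / n_cores

# Weak scaling: grow the system together with the core count.
t1 = solve_time(1600, 28)   # one node
t2 = solve_time(3200, 56)   # double the atoms, double the cores
# t2 / t1 = 2**3 / 2 = 4, i.e. wall time grows quadratically with cores.
print(t2 / t1)
```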

I guess your system is metallic, so it will probably require multiple
k-points. In that case you might be able to gain some further
scalability by using the Groups option in the Parallel settings;
based on your data, roughly 8 cores per k-point might work well for
this system.
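Something along these lines in the input (a sketch only; the number of
groups here is a guess for a hypothetical 32-core run with 4 k-points,
so adjust it to your own core and k-point counts, and see the manual for
the details of the Parallel block):

```
Parallel {
  # Split the MPI processes into independent groups, each handling a
  # subset of the k-points; e.g. 32 cores / 4 groups = 8 cores per group.
  Groups = 4
}
```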

Regards

Ben

On 20/03/18 16:56, Andersen, Mie wrote:
> Dear DFTB+ users,
> 
> 
> I am interested in calculating large systems and therefore did some
> tests on the parallelization performance of DFTB+ (details are given
> below). What I would like to know is whether the results I am getting
> (see attached plot) are the expected performance, and if not, what I
> can do to improve on them?
> 
> 
> Compilation details:
> 
> I compiled the latest v. 18.1 release using:
> 
> - the 2017 version of Intel MPI
> 
> - the Math Kernel Libraries (MKL) containing BLACS, LAPACK and ScaLAPACK
> (v. 11.3)
> 
> 
> Test system details:
> 
> - 1600 Cu atoms in 2D periodic cell
> 
> - Gamma point k-point sampling
> 
> (I attach also the input files as well as the output file for the
> calculation on 84 cores)
> 
> 
> Hardware details:
> 
> - 28-way Haswell-based nodes and FDR14 Infiniband interconnect
> 
> 
> Thanks in advance and best regards,
> 
> Mie
> 
> 
> 
> _______________________________________________
> DFTB-Plus-User mailing list
> DFTB-Plus-User at mailman.zfn.uni-bremen.de
> https://mailman.zfn.uni-bremen.de/cgi-bin/mailman/listinfo/dftb-plus-user
> 

-- 
      Dr. B. Hourahine, SUPA, Department of Physics,
    University of Strathclyde, John Anderson Building,
            107 Rottenrow, Glasgow G4 0NG, UK.
    +44 141 548 2325, benjamin.hourahine at strath.ac.uk

2013/14 THE Awards Entrepreneurial University of the Year
      2012/13 THE Awards UK University of the Year

   The University of Strathclyde is a charitable body,
        registered in Scotland, number SC015263
