[DFTB-Plus-User] Parallelization performance of v. 18.1

Andersen, Mie mie.andersen at ch.tum.de
Thu Mar 22 09:33:37 CET 2018


Dear Ben,
Thanks for the explanation, that's good to know.
The system is metallic, but the supercell is large enough that Gamma point sampling is sufficient.
Best,
Mie

-----Original Message-----
From: DFTB-Plus-User [mailto:dftb-plus-user-bounces at mailman.zfn.uni-bremen.de] On Behalf Of Ben Hourahine
Sent: Wednesday, 21 March, 2018 0:15
To: dftb-plus-user at mailman.zfn.uni-bremen.de
Subject: Re: [DFTB-Plus-User] Parallelization performance of v. 18.1

Hello Mie,

Your graph seems consistent with a parallel fraction of roughly 95% in the sense of Amdahl's law, given that the speed-up is about a factor of 20 for the largest case. This is about what would be expected for the ScaLAPACK eigensolvers. Later this year we will also be offering some other parallel solvers with somewhat better scaling efficiency.
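
As a rough illustration (the core counts below are just examples based on your 28-core nodes, not your actual data): Amdahl's law predicts a speed-up S(P) = 1 / ((1 - f) + f/P) for a parallel fraction f on P cores, which saturates at 1/(1 - f) = 20 for f = 0.95. A few lines of Python show the trend:

    # Amdahl's law: predicted speed-up on p cores for parallel fraction f.
    def amdahl(f, p):
        return 1.0 / ((1.0 - f) + f / p)

    for p in (28, 56, 84, 168):  # multiples of a 28-core node
        print(p, round(amdahl(0.95, p), 1))
    # 28 -> 11.9, 56 -> 14.9, 84 -> 16.3, 168 -> 18.0;
    # the limit as P grows is 1/(1 - f) = 20.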

We will add some material to the online recipes about parallel use of the code in the next few weeks. This will include a discussion of the type of strong-scaling test you are running. I suspect what you actually need is the weak-scaling behaviour of the code, i.e. where the size of the problem is increased along with the number of processors.
This weak-scaling behaviour is approximately quadratic (at least for the example cases I've tried), as would be expected from the cubic scaling of the eigenproblem: if the wall time goes as N^3/P for N orbitals on P cores, and N grows in proportion to P, then the time grows as P^2.
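
A minimal sketch of that estimate (the atoms-per-core count and the prefactor here are made-up illustrative numbers, not measurements):

    # Weak scaling, assuming the O(N^3) eigensolver dominates:
    # wall time T ~ c * N**3 / P, with N grown in proportion to P.
    def weak_time(p, atoms_per_core=20, c=1.0e-6):
        n = atoms_per_core * p   # problem size grows with core count
        return c * n ** 3 / p    # hence T is proportional to p**2

    for p in (28, 56, 112):
        print(p, weak_time(p))   # each doubling of P quadruples T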

I guess your system is metallic, so it will probably require multiple k-points. In that case you might be able to gain some further scalability by using the Groups option in the Parallel settings; based on your data, approximately 8 cores per k-point might work well for this system (see the sketch below).
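
As a sketch of the input (the group count is illustrative; please check the manual for the exact behaviour in your version), in dftb_in.hsd that would look something like:

    Parallel {
      Groups = 8   # split the MPI processes into 8 groups; k-points are
                   # distributed over the groups, so e.g. 64 ranks would
                   # give 8 cores per k-point group
    }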

Regards

Ben

On 20/03/18 16:56, Andersen, Mie wrote:
> Dear DFTB+ users,
> 
> 
> I am interested in calculating large systems and therefore did some 
> tests on the parallelization performance of DFTB+ (details are given 
> below). What I would like to know is whether the results I am getting
> (see attached plot) reflect the expected performance and, if not, what
> I can do to improve it.
> 
> 
> Compilation details:
> 
> I compiled the latest v. 18.1 release using:
> 
> - the 2017 version of Intel MPI
> 
> - the Math Kernel Library (MKL, v. 11.3), which provides BLACS, LAPACK
> and ScaLAPACK
> 
> 
> Test system details:
> 
> - 1600 Cu atoms in a 2D-periodic cell
> 
> - Gamma point k-point sampling
> 
> (I also attach the input files, as well as the output file for the
> calculation on 84 cores)
> 
> 
> Hardware details:
> 
> - 28-way Haswell-based nodes with an FDR14 InfiniBand interconnect
> 
> 
> Thanks in advance and best regards,
> 
> Mie
> 
> 
> 

-- 
      Dr. B. Hourahine, SUPA, Department of Physics,
    University of Strathclyde, John Anderson Building,
            107 Rottenrow, Glasgow G4 0NG, UK.
    +44 141 548 2325, benjamin.hourahine at strath.ac.uk

2013/14 THE Awards Entrepreneurial University of the Year
      2012/13 THE Awards UK University of the Year

   The University of Strathclyde is a charitable body,
        registered in Scotland, number SC015263

