[DFTB-Plus-User] Correct Way to Run MPI-enabled DFTB+ (Conda Installation)
Ankur Gupta
ankur at lbl.gov
Wed Apr 27 22:46:57 CEST 2022
Dear Dr. Aradi,
Thank you for your detailed response. Yes, it seems like only one process
and one thread are sufficient for the systems I am working with. I will
work on larger systems in the future; however, I am currently testing the
performance of xTB methods on only smaller systems.
I have another question related to the usage of DFTB+. I am currently
trying to optimize periodic organic systems (containing around 60-110
atoms) using the GFN2-xTB method; however, the geometry optimization
converges extremely slowly in terms of the gradient. For example, over
around 300 optimization steps, the gradient norm dropped from 0.72 H/a0
to only 0.0005499 H/a0. Is there a way to speed up the gradient
convergence so that the geometry optimization finishes in fewer steps?
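For reference, a minimal sketch of the driver options that control this (option names taken from the DFTB+ manual's ConjugateGradient driver; the values shown are illustrative, not my actual input):

```
Driver = ConjugateGradient {
  MaxForceComponent = 1E-4   # convergence threshold in Hartree/Bohr
  MaxSteps = 500             # upper bound on the number of optimization steps
}
```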
Best,
Ankur
On Wed, Apr 27, 2022 at 7:49 AM Bálint Aradi <aradi at uni-bremen.de> wrote:
> Dear Ankur,
>
> please note that the sizes of the DFTB and xTB Hamiltonians are very
> small (compared to ab initio Hamiltonians). Consequently,
> MPI parallelization starts to pay off only if you have large systems. A
> periodic structure with 64 atoms (where a diagonalization probably takes
> less than a second) runs best with 1 process and 1 thread only.
> Distributing it over more processes/threads will just increase the
> communication overhead and slow it down.
>
> As for the problems with thread creation: maybe you were launching too
> many processes/threads at the same time. E.g. if you run an MPI-parallel
> job and do not restrict the number of threads, each MPI process may
> create as many threads as there are cores on the node, resulting in way
> too many threads. (Talking about which, it is highly recommended to
> leave the UseOmpThreads option in the Parallel{} section at its default
> value 'No'. That helps to avoid such problems, as it stops the MPI
> version if the number of threads is greater than 1.)
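> For completeness, a minimal HSD sketch of the default being described
> (UseOmpThreads is the option named above; the surrounding syntax follows
> the DFTB+ manual):
>
> ```
> Parallel {
>   UseOmpThreads = No   # default; the MPI binary stops if more than one thread is requested
> }
> ```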
>
> Best regards,
>
> Bálint
>
> On 26.04.22 20:51, Ankur Gupta wrote:
> > Hi all,
> >
> > I have installed the DFTB+ (MPI version) program through Anaconda. The
> > program seems to be working fine on a single processor. For example, for
> > a periodic structure containing 64 atoms, the DFTB+ program completed
> > around 130 optimization steps at GFN2-xTB theory on a single processor
> > (MPI processes: 1, OpenMP threads: 1).
> >
> > However, when I use multiple processors with the command "mpirun -np 12
> > dftb+ > DFTB_output" (MPI processes: 12, OpenMP threads: 1), the
> > calculation becomes extremely slow. For the same molecule, it takes
> > around half an hour for one optimization step. I also set up the
> > following environment variables,
> > export OMP_NUM_THREADS=1
> > export OMP_PLACES=threads
> > export OMP_PROC_BIND=spread
> >
> > Another thing I noticed is that if I don't set up these environment
> > variables, I get the following error,
> > libgomp: Thread creation failed: Resource temporarily unavailable
> >
> > Therefore, I would like to know the correct way to run DFTB+ with MPI so
> > that I could make the computations faster.
> >
> > Best,
> > Ankur
> >
> >
> >
> > _______________________________________________
> > DFTB-Plus-User mailing list
> > DFTB-Plus-User at mailman.zfn.uni-bremen.de
> >
> https://mailman.zfn.uni-bremen.de/cgi-bin/mailman/listinfo/dftb-plus-user
>
>
> --
> Dr. Bálint Aradi
> Bremen Center for Computational Materials Science, University of Bremen
> http://www.bccms.uni-bremen.de/cms/people/b-aradi/
>
>