[DFTB-Plus-User] R: Fwd: DFTB+ v 21.2 - memory saturation on cluster
Alessandro Pecchia
alessandro.pecchia at cnr.it
Fri Jul 22 09:58:18 CEST 2022
Dear Giacomo,
Could you recompile DFTB+ with the GNU compiler, MKL and OpenMPI and run it on the cluster? Perhaps the Intel compiler has a problem with automatically deallocating memory; we don't know. It would be interesting to find out.
With GNU you may lose a bit of performance, but if you use MKL (including its bundled ScaLAPACK) you should get close to the Intel build.
Alex
From: DFTB-Plus-User [mailto:dftb-plus-user-bounces at mailman.zfn.uni-bremen.de] On behalf of giacomo buccella
Sent: Friday, 22 July 2022 09:29
To: dftb-plus-user at mailman.zfn.uni-bremen.de
Subject: [DFTB-Plus-User] Fwd: DFTB+ v 21.2 - memory saturation on cluster
Dear DFTB+ users,
I'd like to ask your opinion about a memory issue that appears when running the code on a cluster machine.
When I run DFTB+ (v 21.2), the percentage of memory used by the code grows as time elapses, until it reaches 100% saturation. If the run lasts for a long time (for instance with a high number of dynamical steps), the calculation ends with bad termination.
On the cluster, I installed the code using an Intel 2020 compiler, the Intel variant of MPI and MKL 2020. Please find attached the cluster.zip file, in which I included the detailed procedure I followed and the configuration files I used. The code has been compiled with PLUMED library support, which however does not seem to be the cause for memory saturation.
When I run the code on my laptop, by contrast, I don't see any memory problem. In this case I built it in a GNU environment, with the following libraries:
openmpi-4.1.3
scalapack-2.1.0
OpenBLAS-0.3.20
plumed-2.8.0
Please find attached the laptop.zip file, where I included the detailed procedure I followed and the configuration files I used.
The test-calculation.zip file includes the input files needed to reproduce a test calculation that runs fine on my laptop, but gives rise to memory saturation on cluster nodes.
To be more specific, the memory saturation happens on the cluster only when a parallel MPI run with more than one job is used.
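To put numbers on the growth for each MPI rank, the resident memory of a running process can be sampled over time. The following is only a minimal sketch, assuming a Linux node where /proc/<pid>/status is readable; the function names (parse_vmrss_kib, read_rss_kib, monitor, growth_kib_per_s) are hypothetical helpers and not part of DFTB+ or any cluster tool.

```python
import time
from pathlib import Path


def parse_vmrss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:    123456 kB"
            return int(line.split()[1])
    return None


def read_rss_kib(pid):
    """Read the resident set size of a process (Linux only, assumption)."""
    try:
        return parse_vmrss_kib(Path(f"/proc/{pid}/status").read_text())
    except FileNotFoundError:
        # The process has already exited.
        return None


def growth_kib_per_s(samples):
    """Least-squares slope of (elapsed_seconds, rss_kib) samples."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    num = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den if den else 0.0


def monitor(pid, interval_s=10.0, count=30):
    """Collect up to `count` RSS samples of `pid`, one every `interval_s` seconds."""
    samples = []
    start = time.monotonic()
    for _ in range(count):
        rss = read_rss_kib(pid)
        if rss is None:
            break
        samples.append((time.monotonic() - start, rss))
        time.sleep(interval_s)
    return samples
```

Calling monitor() once per rank PID and comparing growth_kib_per_s() between the serial and the parallel run would show whether every rank grows or only some of them.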
Has anyone experienced such an issue with DFTB+?
Thank you very much. I'd be grateful for your help.