[DFTB-Plus-User] DFTB+MPI-NEGF on SLURM
Gabriele Penazzi
penazzi at uni-bremen.de
Thu Apr 9 11:24:13 CEST 2015
On 04/09/2015 11:07 AM, Alessandro Pirrotta wrote:
> Thank you both for your emails.
>
> I ran the job using the following commands, and in both cases it seems
> like the same job is running $NCPUS times,
> giving me $NCPUS outputs overlapping in the same file.
>
> srun -n $NCPUS --cores-per-socket=$NCPUS dftb+mpi-negf.r4732_ifort_tested
> mpirun -np $NCPUS dftb+mpi-negf.r4732_ifort_tested
>
> FYI I compiled it using the extlib from the website and the
> makefile make.x86_64-linux-ifort
If you downloaded the code directly from the website, you may have a look
at this hotfix:
https://mailman.zfn.uni-bremen.de/pipermail/dftb-plus-user/2015/001784.html
If the colliding I/O persists, try lowering the verbosity settings.
Best,
Gabriele
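
For reference, a minimal sketch of a SLURM batch script that launches the MPI binary once over the allocation, assuming the executable name from this thread; the job name, core count and wall time are placeholders. If the binary is not actually built with MPI support, each launched copy runs as an independent serial job and the outputs collide as described above.

#!/bin/bash
#SBATCH --job-name=dftb-negf      # placeholder job name
#SBATCH --nodes=1                 # single node
#SBATCH --ntasks=8                # MPI processes; energy points are distributed over these
#SBATCH --time=01:00:00           # placeholder wall time

# Launch the parallel binary once over the whole allocation.
# srun inherits --ntasks from the allocation; mpirun -np $SLURM_NTASKS also works.
srun ./dftb+mpi-negf.r4732_ifort_tested
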
>
> *Alessandro Pirrotta*
> PhD student
>
>
>
> *Faculty of Science*
> Department of Chemistry &
> Nano-Science Center
> University of Copenhagen
> Universitetsparken 5, C321
> 2100 Copenhagen Ø
> Denmark
>
> DIR +45 21 18 11 90
> MOB +45 52 81 23 41
>
> alessandro.pirrotta at chem.ku.dk
>
> alessandro.pirrotta at gmail.com
>
> www.ki.ku.dk
>
>
> On 9 April 2015 at 10:56, Alessandro Pirrotta <tqn722 at alumni.ku.dk> wrote:
>
> On 9 April 2015 at 10:05, Gabriele Penazzi <penazzi at uni-bremen.de> wrote:
>
> On 04/09/2015 09:07 AM, Alessandro Pirrotta wrote:
>> Dear DFTB+ users,
>>
>> I am having a problem running DFTB+ on SLURM.
>> When I am connected to the front end of my
>> university computer cluster, the executable runs correctly: I
>> have run the test suite and only 2 tests failed
>> (
>> ======= spinorbit/EuN =======
>> electronic_stress element 0.000101791078878
>> Failed
>> stress element 0.000101791078878
>> Failed
>> )
>>
>> When I submit a job with SLURM and execute ./dftb+ normally,
>> I get an MPI error (see below).
>> If I run "mpirun -n 1 dftb+" the job runs correctly on one node
>> and a single core.
>> *How do I run dftb+ over a single node, using n cores?*
> *[cut]*
>
> Hi Alessandro,
>
> when running NEGF, the parallelization is very different from
> solving the eigenvalue problem. dftb+negf is
> parallelized on two levels: MPI, by distributing the energy
> points, and possibly OpenMP, by linking threaded BLAS/LAPACK
> libraries. The former is handled by us and requires compiling
> with MPI support; the latter is up to the BLAS/LAPACK vendor
> and may or may not be active depending on how you compile it.
> See the README.NEGF and README.PARALLEL files in the src directory.
>
> If you link a threaded library, then you will have to specify
> how many OpenMP threads to assign per process in your job
> script. For example
>
> $ export OMP_NUM_THREADS=4
> $ mpirun -n 1 dftb+
>
> would use 4 cores for 1 process (the correct specification
> depends on your architecture; you may or may not need
> additional flags, but your facility probably provides a
> howto). Therefore the answer to your question is that you
> may want to use n threads on 1 process, or n processes on n
> cores, or (more likely) something in between, depending on
> your system.
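
As an illustration of the hybrid layout described above, a sketch of a SLURM job script combining MPI processes with OpenMP threads, assuming a threaded BLAS/LAPACK, a single 16-core node and the executable name from this thread; the 4 x 4 split is only an example.

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=4                # MPI processes (energy-point parallelism)
#SBATCH --cpus-per-task=4         # cores reserved per process for OpenMP threads

# Hand the per-task core count to the threaded BLAS/LAPACK.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

srun ./dftb+mpi-negf.r4732_ifort_tested
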
>
> A note on efficiency. Since on "common" test systems (tens to
> thousands of atoms) LAPACK/ScaLAPACK scales efficiently up to
> 2-4 threads, it is usually convenient to reserve some cores
> for threading. Also, the parallelization over energy points
> implies solving N independent Green's functions, so it needs
> to allocate N times the memory, where N is the number of
> processes. For large systems it may be necessary to run one
> process per socket to get the maximum available memory. With
> the current version the Poisson solver is also a bit more
> efficient with fewer processes per socket. Considering these
> points, at the end of the day I usually run with 2 or 4
> OpenMP threads (if I don't hit memory problems).
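
For the large-system case mentioned above (one process per socket to maximize the memory available to each rank), a sketch assuming a node with two 8-core sockets; the flags and counts are placeholders to adapt to the actual hardware.

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=2                # one MPI process per socket
#SBATCH --ntasks-per-socket=1     # pin one rank to each socket
#SBATCH --cpus-per-task=8         # give each rank the whole socket for threading

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./dftb+mpi-negf.r4732_ifort_tested
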
>
> Hope this helps,
> Gabriele
>
>
>
>
> --
> --
> Dr. Gabriele Penazzi
> BCCMS - University of Bremen
>
> http://www.bccms.uni-bremen.de/
> http://sites.google.com/site/gabrielepenazzi/
> phone: +49 (0) 421 218 62337
> mobile: +49 (0) 151 19650383
>
>
--
--
Dr. Gabriele Penazzi
BCCMS - University of Bremen
http://www.bccms.uni-bremen.de/
http://sites.google.com/site/gabrielepenazzi/
phone: +49 (0) 421 218 62337
mobile: +49 (0) 151 19650383