[DFTB-Plus-User] DFTB+MPI-NEGF on SLURM

Gabriele Penazzi penazzi at uni-bremen.de
Thu Apr 9 11:24:13 CEST 2015


On 04/09/2015 11:07 AM, Alessandro Pirrotta wrote:
> Thank you both for your emails.
>
> I ran the job using the following commands, and in both cases it seems
> like the same job is running $NCPUS times,
> giving me $NCPUS outputs overlapping in the same file.
>
> srun -n $NCPUS --cores-per-socket=$NCPUS dftb+mpi-negf.r4732_ifort_tested
> mpirun -np $NCPUS dftb+mpi-negf.r4732_ifort_tested
>
> FYI I compiled it using the extlib from the website and the
> makefile make.x86_64-linux-ifort

If you downloaded the code directly from the website, you may want to take a
look at this hotfix:

https://mailman.zfn.uni-bremen.de/pipermail/dftb-plus-user/2015/001784.html

If the colliding I/O persists, try lowering the verbosity settings.
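
Just as a rough, untested sketch of how this can look in a SLURM batch
script (the 4x4 task/thread split, the walltime and the binary name are
only placeholders, adapt them to your machine), following the points from
my earlier mail quoted below:

#!/bin/bash
#SBATCH --nodes=1                 # single node, as you asked
#SBATCH --ntasks=4                # 4 MPI processes (energy point distribution)
#SBATCH --cpus-per-task=4         # 4 cores per process for the threaded BLAS/LAPACK
#SBATCH --time=02:00:00           # placeholder walltime

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun dftb+mpi-negf.r4732_ifort_tested

With this layout srun starts 4 MPI processes and each of them can run 4
OMP threads, i.e. 16 cores in total on the node.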

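If memory becomes the limiting factor (see the note on the Green's
functions below), one process per socket is a reasonable target. Again
only a sketch, assuming a node with two sockets and 8 cores per socket:

#SBATCH --nodes=1
#SBATCH --ntasks=2                # one MPI process per socket
#SBATCH --ntasks-per-socket=1
#SBATCH --cpus-per-task=8         # remaining cores of each socket for OMP threads

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun dftb+mpi-negf.r4732_ifort_tested
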
Best,
Gabriele


>
> Alessandro Pirrotta
> PhD student
>
> Faculty of Science
> Department of Chemistry &
> Nano-Science Center
> University of Copenhagen
> Universitetsparken 5, C321
> 2100 Copenhagen Ø
> Denmark
>
> DIR +45 21 18 11 90
> MOB +45 52 81 23 41
>
> alessandro.pirrotta at chem.ku.dk
> alessandro.pirrotta at gmail.com
> www.ki.ku.dk
>
>     On 9 April 2015 at 10:05, Gabriele Penazzi <penazzi at uni-bremen.de> wrote:
>
>         On 04/09/2015 09:07 AM, Alessandro Pirrotta wrote:
>>         Dear DFTB+ users,
>>
>>         I am having a problem running DFTB+ on SLURM.
>>         When I am connected to the front end of my university's
>>         computer cluster, the executable runs correctly: I have run
>>         the test suite and only 2 tests failed
>>         (
>>         ======= spinorbit/EuN =======
>>         electronic_stress    element              0.000101791078878    Failed
>>         stress               element              0.000101791078878    Failed
>>         )
>>
>>         When I submit a job with SLURM and execute ./dftb+ normally,
>>         I get an MPI error (see below).
>>         If I run "mpirun -n 1 dftb+", the job runs correctly on one
>>         node with a single core.
>>         *How do I run dftb+ over a single node, using n cores?*
>         *[cut]*
>
>         Hi Alessandro,
>
>         when running NEGF, the parallelization is very different from
>         the one used for solving the eigenvalue problem. dftb+negf is
>         parallelized on two levels: MPI, by distributing the energy
>         points, and possibly OpenMP, by linking against threaded
>         BLAS/LAPACK libraries. The former is handled by us and
>         requires compiling with MPI support; the latter is up to the
>         BLAS/LAPACK vendor and may or may not be active depending on
>         how you compile. See the README.NEGF and README.PARALLEL
>         files in the src directory.
>
>         If you link a threaded library, you will have to specify how
>         many OMP threads to assign to each process in your job
>         script. For example
>
>         $ export OMP_NUM_THREADS=4
>         $ mpirun -n 1 dftb+
>
>         would use 4 cores on 1 process (the exact specification
>         depends on your architecture; you may or may not need
>         additional flags, but your facility probably provides a
>         how-to). So the answer to your question is that you may want
>         to use n threads on 1 process, or n processes on n cores, or
>         (more likely) something in between, depending on your system.
>
>         A note on efficiency. Since on "common" test systems (tens to
>         thousands of atoms) LAPACK/ScaLAPACK scales efficiently up to
>         2-4 threads, it is usually convenient to reserve some cores
>         for threading. Also, the parallelization over energy points
>         implies solving N independent Green's functions, so it needs
>         N times the memory, where N is the number of processes. For
>         large systems it may be necessary to run one process per
>         socket to get the maximum available memory. With the current
>         version the Poisson solver is also a bit more efficient if
>         there are fewer processes per socket. Considering these
>         points, at the end of the day I usually run with 2 or 4 OMP
>         threads (if I don't hit memory problems).
>
>         Hope this helps,
>         Gabriele
>
>
>
>
>         --
>         Dr. Gabriele Penazzi
>         BCCMS - University of Bremen
>
>         http://www.bccms.uni-bremen.de/
>         http://sites.google.com/site/gabrielepenazzi/
>         phone: +49 (0) 421 218 62337
>         mobile: +49 (0) 151 19650383
>
>

--
Dr. Gabriele Penazzi
BCCMS - University of Bremen

http://www.bccms.uni-bremen.de/
http://sites.google.com/site/gabrielepenazzi/
phone: +49 (0) 421 218 62337
mobile: +49 (0) 151 19650383
