[DFTB-Plus-User] Fwd: Parallel OpenMP with the DFTB+ API

David Furman sirok4 at gmail.com
Thu Nov 28 13:04:58 CET 2019


OK. Problem solved!
After looking at the linker line for my code, I noticed that it links
against lapack and blas as well (needed for some other packages), but I
forgot to remove these for the DFTB+ API compilation.
After removing the links to lapack and blas, and leaving only the mkl
libraries, everything works as expected.
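
For anyone hitting the same symptom, a check of the kind described above can be sketched in a few lines of Python. This is only an illustration: the link line below is hypothetical (the real one was not posted), and the set of names treated as reference BLAS/LAPACK is an assumption. The sketch reports whether a link line names both MKL libraries and a separate lapack/blas, either via `-l` flags or as explicit library files.

```python
import os
import shlex

# Names treated as "reference" BLAS/LAPACK. This set is an assumption made
# for illustration; it is not an exhaustive list.
REFERENCE_LIBS = {"lapack", "blas"}


def lib_name(token):
    """Return the bare library name for a linker token, or None."""
    if token.startswith("-l"):
        return token[2:]
    base = os.path.basename(token)
    if base.startswith("lib") and (base.endswith(".a") or ".so" in base):
        return base[3:].split(".")[0]
    return None


def find_blas_conflicts(link_line):
    """Return (mkl_libs, reference_libs) found on a link line, in order."""
    mkl, reference = [], []
    for token in shlex.split(link_line):
        name = lib_name(token)
        if name is None:
            continue
        if name.startswith("mkl_"):
            mkl.append(name)
        elif name in REFERENCE_LIBS:
            reference.append(name)
    return mkl, reference


# Hypothetical link line; only the library names mirror this thread.
link_line = (
    "g++ main.o -o mycode -llapack -lblas "
    "-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5"
)

mkl, reference = find_blas_conflicts(link_line)
if mkl and reference:
    print("MKL and a separate BLAS/LAPACK are both linked:", reference)
```

If both lists are non-empty, the BLAS/LAPACK calls may be served by the non-MKL library rather than by threaded MKL, which would be consistent with the behaviour reported here.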

Thanks for the help!

David

---------- Forwarded message ---------
From: David Furman <sirok4 at gmail.com>
Date: Wed, Nov 27, 2019 at 9:27 PM
Subject: Re: [DFTB-Plus-User] Parallel OpenMP with the DFTB+ API
To: Bálint Aradi <aradi at uni-bremen.de>


Hi Bálint,
Thanks for reaching out!

Following the manual, I compiled the DFTB+ API via `make api` and linked against
the *mkl_intel_lp64*, *mkl_intel_thread* and *mkl_core* libraries. (I also tried
linking against libiomp5, which did not help.)

When I run my code, which is interfaced to the DFTB+ API, I can see (in
the DFTB+ output file) that the correct number of threads is recognized
(OpenMP threads:              6),
but there is no speedup whatsoever, regardless of the number of threads I
use.
The same dftb_in.hsd file, when run directly with the DFTB+ executable
(which was compiled alongside the API library), does benefit from the
increased number of threads.

Doing `ldd mycode` results in the following:
linux-vdso.so.1 =>  (0x00007ffff7290000)
libmkl_intel_lp64.so => /usr/local/shared/intel/compilers_and_libraries_2019.5.281/linux/mkl/lib/intel64/libmkl_intel_lp64.so (0x00007f6f47758000)
libmkl_intel_thread.so => /usr/local/shared/intel/compilers_and_libraries_2019.5.281/linux/mkl/lib/intel64/libmkl_intel_thread.so (0x00007f6f451eb000)
libmkl_core.so => /usr/local/shared/intel/compilers_and_libraries_2019.5.281/linux/mkl/lib/intel64/libmkl_core.so (0x00007f6f40eba000)
libiomp5.so => /usr/local/shared/intel/compilers_and_libraries_2019.5.281/linux/compiler/lib/intel64/libiomp5.so (0x00007f6f40ac4000)
libstdc++.so.6 => /usr/local/chemistry/gcc/7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0/libstdc++.so.6 (0x00007f6f4072c000)
libm.so.6 => /lib64/libm.so.6 (0x0000003be3800000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003be3c00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003be3400000)
libgcc_s.so.1 => /usr/local/chemistry/gcc/7.3.0/lib/gcc/x86_64-pc-linux-gnu/lib64/libgcc_s.so.1 (0x00007f6f404fe000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003be4000000)
/lib64/ld-linux-x86-64.so.2 (0x000055d981518000)

The same command for the standalone DFTB+ (compiled alongside the
DFTB+ API library) results in:
linux-vdso.so.1 =>  (0x00007ffd9f784000)
libmkl_intel_lp64.so => /usr/local/shared/intel/compilers_and_libraries_2019.5.281/linux/mkl/lib/intel64/libmkl_intel_lp64.so (0x00007f4f83897000)
libmkl_intel_thread.so => /usr/local/shared/intel/compilers_and_libraries_2019.5.281/linux/mkl/lib/intel64/libmkl_intel_thread.so (0x00007f4f8132a000)
libmkl_core.so => /usr/local/shared/intel/compilers_and_libraries_2019.5.281/linux/mkl/lib/intel64/libmkl_core.so (0x00007f4f7cff9000)
libiomp5.so => /usr/local/shared/intel/compilers_and_libraries_2019.5.281/linux/compiler/lib/intel64/libiomp5.so (0x00007f4f7cc03000)
libm.so.6 => /lib64/libm.so.6 (0x0000003be3800000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003be3c00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003be3400000)
libgcc_s.so.1 => /usr/local/chemistry/gcc/7.3.0/lib/gcc/x86_64-pc-linux-gnu/lib64/libgcc_s.so.1 (0x00007f4f7c9d5000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003be4000000)
/lib64/ld-linux-x86-64.so.2 (0x000055be98040000)

The same behavior is reproducible for different systems, from small to
large.
So, at the moment, I'm not sure what can cause the difference in
"performance"... it's as if the mkl threading libraries have absolutely no
effect...
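
The comparison of the two `ldd` listings above can also be done mechanically. The following is a minimal Python sketch that reduces each listing to its library names and prints the differences; the paths and most load addresses are left out of the sample strings for brevity, since only the names are compared. For these two listings the only difference is `libstdc++.so.6`. Note that `ldd` reports shared-library dependencies only, so a statically linked library would not show up in either listing; whether that applies to this case is an assumption.

```python
def parse_ldd(text):
    """Return the set of library names that appear in ldd output."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The name is everything before "=>"; lines without "=>" (the
        # dynamic loader) start with the name followed by the address.
        names.add(line.split("=>")[0].split()[0])
    return names


# Library names copied from the two listings in this message
# (paths and most addresses omitted).
MYCODE_LDD = """\
linux-vdso.so.1 =>  (0x00007ffff7290000)
libmkl_intel_lp64.so =>
libmkl_intel_thread.so =>
libmkl_core.so =>
libiomp5.so =>
libstdc++.so.6 =>
libm.so.6 =>
libpthread.so.0 =>
libc.so.6 =>
libgcc_s.so.1 =>
libdl.so.2 =>
/lib64/ld-linux-x86-64.so.2 (0x000055d981518000)
"""

DFTB_LDD = """\
linux-vdso.so.1 =>  (0x00007ffd9f784000)
libmkl_intel_lp64.so =>
libmkl_intel_thread.so =>
libmkl_core.so =>
libiomp5.so =>
libm.so.6 =>
libpthread.so.0 =>
libc.so.6 =>
libgcc_s.so.1 =>
libdl.so.2 =>
/lib64/ld-linux-x86-64.so.2 (0x000055be98040000)
"""

only_in_mycode = parse_ldd(MYCODE_LDD) - parse_ldd(DFTB_LDD)
only_in_dftb = parse_ldd(DFTB_LDD) - parse_ldd(MYCODE_LDD)
print("only in mycode:", sorted(only_in_mycode))
print("only in dftb+: ", sorted(only_in_dftb))
```

Since both binaries resolve the same MKL and OpenMP shared libraries, the `ldd` output alone does not explain the difference in threading behaviour.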



On Tue, Nov 26, 2019 at 3:37 PM Bálint Aradi <aradi at uni-bremen.de> wrote:

> Dear David,
>
> > Simply doing 'export OMP_NUM_THREADS=6' and running my code is just
> > using 1 thread. I did manage to run DFTB+ directly on 6 threads with
> > the above command though...
>
> Actually, that should work exactly this way. Are you sure you linked
> the right (threaded) lapack/blas libraries to your application when you
> linked it with the library?
>
> Note: When you compile DFTB+ with API support (-DWITH_API) and install
> it (via `make install`), it will create
> _install/lib/pkgconfig/dftbplus.pc. This file contains the flags which
> were used to link the DFTB+ standalone. If that standalone shows proper
> threading behaviour, an application using the library (which was
> produced during the same build) should show it as well, provided you use
> the same flags.
>
> Best regards,
>
> Bálint
>
> --
> Dr. Bálint Aradi
> Bremen Center for Computational Materials Science, University of Bremen
> http://www.bccms.uni-bremen.de/cms/people/b-aradi/
>
>
>
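
The note above about `dftbplus.pc` suggests querying the recorded link flags with `pkg-config`. A minimal Python sketch of that query follows. It assumes the install prefix `_install` mentioned above, that the `pkg-config` tool is available, and that the module name matches the file name (`dftbplus`); the helper names are made up for illustration.

```python
import os
import shlex
import subprocess


def pkg_config_command(install_prefix, module="dftbplus"):
    """Build the pkg-config call and environment for querying link flags."""
    pc_dir = os.path.join(install_prefix, "lib", "pkgconfig")
    env = dict(os.environ)
    existing = env.get("PKG_CONFIG_PATH")
    # Put the DFTB+ install directory first so its .pc file is found.
    env["PKG_CONFIG_PATH"] = (
        pc_dir if not existing else pc_dir + os.pathsep + existing
    )
    return ["pkg-config", "--libs", module], env


def query_link_flags(install_prefix):
    """Run pkg-config and return the link flags as a list of tokens."""
    cmd, env = pkg_config_command(install_prefix)
    result = subprocess.run(
        cmd, env=env, check=True, capture_output=True, text=True
    )
    return shlex.split(result.stdout)


if __name__ == "__main__":
    cmd, env = pkg_config_command("_install")
    print("PKG_CONFIG_PATH=" + env["PKG_CONFIG_PATH"], " ".join(cmd))
```

Calling `query_link_flags("_install")` after `make install` would return the flags to put on the application's link line. If some of the flags are recorded as private libraries in the `.pc` file, adding `--static` to the `pkg-config` call may be needed to see them.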

-- 

David Furman, PhD, MRSC
Herchel Smith Research Fellow
Department of Chemistry
University of Cambridge
Lensfield Road
CB2 1EW


