[DFTB-Plus-User] On parallel version of DFTB+

ZHAOHUI HUANG zuh101 at psu.edu
Mon Sep 19 21:45:28 CEST 2016


Hi,

  Thanks for the quick reply. Yes, I know you mean the first step of BerkeleyGW. When you solve for the exciton eigenvalues in the space of quasiparticle wave functions, don't you diagonalize an exciton Hamiltonian containing the two-body effect? Last year I had a 114-atom structure whose exciton Hamiltonian had dimension around 170,000, and an iterative method was used to diagonalize it. Can I say the ScaLAPACK solution is a direct method for obtaining the eigenvalues, for example CHEEV or CHEEVD (the divide-and-conquer algorithm)? So, is it possible to use an iterative method to handle a large tight-binding Hamiltonian? I expect DFTB+ could be extended to handle structures with one million atoms or so. Comments?
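(For illustration only, a minimal Python sketch of what an iterative solver looks like, using SciPy's ARPACK-based eigsh as a stand-in for the Lanczos/Davidson-type solvers used in BSE codes; this is not the DFTB+ implementation. An iterative solver only needs matrix-vector products and returns a small window of eigenpairs, which suits the exciton problem; a ground-state tight-binding calculation needs all occupied eigenstates, which is one reason dense ScaLAPACK diagonalization is normally used there.)

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

# Toy 1D tight-binding chain: onsite energy 0, hopping -1. For large
# systems the Hamiltonian is extremely sparse, so a Lanczos-type solver
# never stores a dense matrix and only needs H @ v products.
n = 100_000                               # number of orbitals/sites
hop = -1.0 * np.ones(n - 1)
H = sp.diags([hop, hop], offsets=[-1, 1], format="csr")

# Lowest 20 eigenpairs; cost per iteration is O(nnz), not O(n^3).
vals, vecs = eigsh(H, k=20, which="SA")
print(vals[:5])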

ZhaoHui Huang,


----- Original Message -----
From: "Jacek Jakowski" <jjakowski at gmail.com>
To: "User list for DFTB+ related questions" <dftb-plus-user at mailman.zfn.uni-bremen.de>
Sent: Monday, September 19, 2016 3:28:14 PM
Subject: Re: [DFTB-Plus-User] On parallel version of DFTB+

My estimates are based on diagonalization of dense matrices, which is
what ScaLAPACK does. The specific numbers are from diagonalizations on
a Cray XC30 (Intel Xeons, 16 cores per node). Diagonalization, like
other matrix-matrix operations, scales cubically with system size,
which means that if you decrease your system by a factor of 2 the
computational cost drops by a factor of 8 (= 2^3). Actually, I need to
correct my previous message: 10 hours on 4000 cores for a single
diagonalization is not for 100k but for 400k basis functions. Yes,
tight-binding is much faster than conventional DFT, but dense linear
algebra still scales cubically and dominates the computation. The
speedup with respect to DFT comes from two factors: (1) for a given
number of atoms the matrices are about 5-10 times smaller than in
conventional DFT with a localized basis set, and (2) building the DFTB
matrices is very cheap compared to building DFT matrices of the same
size.
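To put rough numbers on this, here is a back-of-envelope sketch in Python, assuming ideal cubic scaling on the same machine and ignoring parallel efficiency; the 10 h / 4000 cores / 400k reference point is the one quoted above, and the 80 GB figure for a 100k matrix appears further down in this thread.

# Rough estimates for dense O(N^3) diagonalization, anchored to the
# figure above: ~10 hours on 4000 cores for N = 400,000.
ref_n, ref_hours = 400_000, 10.0

def diag_hours(n):
    """Scale the reference timing cubically with matrix dimension."""
    return ref_hours * (n / ref_n) ** 3

def dense_matrix_gb(n):
    """Memory for one dense double-precision N x N matrix."""
    return n * n * 8 / 1e9

for n in (400_000, 100_000, 32_000):
    print(f"N = {n:>7,}: ~{diag_hours(n):8.3f} h per diagonalization, "
          f"~{dense_matrix_gb(n):7.1f} GB per dense matrix")

# N = 100,000 needs ~80 GB per dense matrix; a full SCC + relaxation/MD
# run multiplies the per-diagonalization cost by the number of SCC
# iterations and geometry steps.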

According to the official information, BerkeleyGW does not do the
diagonalization itself; it takes the results of diagonalization from
other codes as input (and computes higher-order corrections). Also, it
is intended for up to a few hundred atoms.

Besides DFTB+, you can try the divide-and-conquer implementation called
DC-DFTB-K (from Japan), or the CP2K implementation (they use ELPA, if I
remember correctly).


Jacek


On Mon, Sep 19, 2016 at 2:01 PM, ZHAOHUI HUANG <zuh101 at psu.edu> wrote:
> Can you describe some of the algorithmic details used in DFTB+, especially the Hamiltonian diagonalization? Tight-binding calculations are supposed to run very fast, but your reply gives a totally different picture; it will take me some time to think over your words. I had not realized that DFTB+ might require a few thousand processors. Put simply: could you tell me where most of the CPU time is spent in a DFTB+ calculation? Diagonalization?
> Thanks a lot.
>
> If you used an iterative method to solve for the Hamiltonian eigenvalues, as implemented in BerkeleyGW, what do you think the calculation speed would be?
>
> ZhaoHui Huang,
>
>
> ----- Original Message -----
> From: "Jacek Jakowski" <jjakowski at gmail.com>
> To: "User list for DFTB+ related questions" <dftb-plus-user at mailman.zfn.uni-bremen.de>
> Sent: Saturday, September 17, 2016 8:26:15 PM
> Subject: Re: [DFTB-Plus-User] On parallel version of DFTB+
>
> Most likely you don't have enough memory to fit the 26,000 atoms on
> your computer, even if DFTB+ can handle it. Assuming that your 26k
> atoms are carbons (or similar), you need 80 GB to fit a single matrix
> (~100k basis functions) in memory, and much more (roughly 10 times)
> for a real calculation.
> But even if this fits into your memory, a 100k matrix on 4000 cores
> takes about 10 hours for a single diagonalization (real case). It
> would probably take something like a month to do the SCF, and about
> half a year for a few MD steps.
>
> I suggest that you decrease the size of the cell so that your
> matrices are below 32,000.
>
> Jacek
>
> On Fri, Sep 9, 2016 at 1:36 PM, ZHAOHUI HUANG <zuh101 at psu.edu> wrote:
>> Hello,
>>
>>      Sorry to bother you if this is of no interest.
>>
>>      I have an issue running parallel DFTB+. My unit cell contains 26,000 atoms and I just want to relax the structure for a few steps. When running the code, I first got an output overflow error message, so I increased the MAXRECL parameter defined in the HSDParser package. With that it does run, but then fails with a ScaLAPACK error:
>>
>> MAXNEIGHBORS: 8847
>>   iSCC Total electronic   Diff electronic      SCC error
>> Operation failed!
>> ppotrf in scalafx_ppotrf_dreal
>> Info: 23233
>>
>>
>>     Is there any code developer who is familiar with this part of the code? Thanks.
>>
>>
>> ZhaoHui Huang,
_______________________________________________
DFTB-Plus-User mailing list
DFTB-Plus-User at mailman.zfn.uni-bremen.de
https://mailman.zfn.uni-bremen.de/cgi-bin/mailman/listinfo/dftb-plus-user