[DFTB-Plus-User] DFTB+ mpi running issue

Pei Wang n8413274 at qut.edu.au
Thu Oct 21 04:08:35 CEST 2021


Dear DFTB+ users,

I'm calculating the DOS of a carbon system with 1e4 atoms using 96 CPU 
cores and 380 GB of memory. The program reported an error after it 
finished 1 geometry step, before generating charge.in. I asked the HPC 
technical support, and they suggested that it might be a bug or a 
feature of the software. Could you help identify what causes the 
problem? Thanks.
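
For scale, here is a rough back-of-the-envelope estimate of the dense 
matrix sizes involved (assuming the standard s+p basis for carbon, 
i.e. 4 orbitals per atom, and real-valued matrices at the single Gamma 
k-point used in the input below):

    N_orb = 1e4 atoms x 4 orbitals/atom = 4e4
    one dense N_orb x N_orb double-precision matrix:
        (4e4)^2 x 8 bytes ~= 12.8 GB

The dense eigensolver keeps several such distributed matrices at once 
(Hamiltonian, overlap, eigenvectors, work space), so a job like this 
can plausibly approach a 380 GB limit.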

Best Regards,

Pei Wang

------------------------------------------------------------------------------------------------- 
input file

Geometry = GenFormat {
   <<< geo.gen
}

Driver = ConjugateGradient {
   MaxSteps = 2
   LatticeOpt = Yes
   MaxLatticeStep = 0.005
}


Hamiltonian = DFTB {
   SCC = Yes
   # ReadInitialCharges = Yes
   SCCTolerance = 1e-6
   Solver = DivideAndConquer{}
   MaxAngularMomentum = {
     C = "p"
   }
   Filling = Fermi {
     Temperature [Kelvin] = 300
   }
   SlaterKosterFiles = Type2FileNames {
     Prefix = "../../slako/"
     Separator = "-"
     Suffix = ".skf"
   }
   KPointsAndWeights = {
     0.0 0.0 0.0 1.0
   }
}

Parallel{
   # UseOmpThreads = Yes
}

Analysis {
   ProjectStates {
     Region {
       Atoms = C
       ShellResolved = Yes
       Label = "pdos.C"
     }
   }
}

ParserOptions {
   ParserVersion = 8
}
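
Since the log below ends with the job being killed on signal 9, it 
seems worth checking first whether the run simply hit its memory 
limit. A minimal sketch, assuming PBS on Gadi (the job ID is a 
placeholder, and the exact qstat flags may differ on other systems):

    # Compare the finished job's peak memory with the request
    # (hypothetical job ID; -x also lists finished jobs in PBS Pro)
    qstat -fx 12345678 | grep -E 'resources_used.mem|Resource_List.mem|Exit_status'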


---------------------------------------------------------------- error

Loading dftbplus/20.1
   Loading requirement: openmpi/4.0.2
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line        Source
dftb+.mpi          0000000001332DEB  for__signal_handl  Unknown     Unknown
libpthread-2.28.s  000014ABBE82AB30  Unknown            Unknown     Unknown
libuct.so.0.0.0    000014ABA95F36DE  uct_mm_iface_prog  Unknown     Unknown
libucp.so.0.0.0    000014ABA98270EA  ucp_worker_progre  Unknown     Unknown
mca_pml_ucx.so     000014ABA9C88317  mca_pml_ucx_progr  Unknown     Unknown
libopen-pal.so.40  000014ABBDAA9E8B  opal_progress      Unknown     Unknown
libmpi.so.40.20.2  000014ABBEE081AB  ompi_request_defa  Unknown     Unknown
libmpi.so.40.20.2  000014ABBEE3162D  MPI_Testall        Unknown     Unknown
libmkl_blacs_open  000014ABBF7FE818  MKLMPI_Testall     Unknown     Unknown
libmkl_blacs_open  000014ABBF802C42  BI_BuffIsFree      Unknown     Unknown
libmkl_blacs_open  000014ABBF802816  BI_UpdateBuffs     Unknown     Unknown
libmkl_blacs_open  000014ABBF7DCFED  dgesd2d_           Unknown     Unknown
dftb+.mpi          000000000130A227  Unknown            Unknown     Unknown
dftb+.mpi          0000000001232901  Unknown            Unknown     Unknown
dftb+.mpi          0000000001232ACB  Unknown            Unknown     Unknown
dftb+.mpi          000000000053FFFC  Unknown            Unknown     Unknown
dftb+.mpi          00000000004B41F3  Unknown            Unknown     Unknown
dftb+.mpi          000000000049CB88  Unknown            Unknown     Unknown
dftb+.mpi          0000000000419608  Unknown            Unknown     Unknown
dftb+.mpi          00000000004177A2  Unknown            Unknown     Unknown
libc-2.28.so       000014ABBE4764A3  __libc_start_main  Unknown     Unknown
dftb+.mpi          00000000004176AE  Unknown            Unknown     Unknown
forrtl: error (78): process killed (SIGTERM)

...

...

dftb+.mpi          00000000004177A2  Unknown            Unknown     Unknown
libc-2.28.so       000014F7D7A504A3  __libc_start_main  Unknown     Unknown
dftb+.mpi          00000000004176AE  Unknown            Unknown     Unknown
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node gadi-cpu-clx-2950 
exited on signal 9 (Killed).
--------------------------------------------------------------------------
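
If memory does turn out to be the problem, one possible workaround is 
to keep the rank count but spread it over more nodes, so each MPI rank 
gets more memory. A sketch only, with placeholder values; --map-by 
ppr:N:node is OpenMPI syntax matching the openmpi/4.0.2 module shown 
above, and the binary name is taken from the traceback:

    module load dftbplus/20.1
    # e.g. 24 ranks per node instead of a full node's worth (placeholders)
    mpirun -np 96 --map-by ppr:24:node dftb+.mpi > output.log 2>&1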
