All-electron full-potential DFT: The Jülich FLEUR family of codes
Member of the Helmholtz Association

Daniel Wortmann
Quantum Theory of Materials, Institute for Advanced Simulation

The Team (IAS-1/PGI-1)

Kohn-Sham Density Functional Theory
Standard model in many fields: physics, chemistry, materials science, biology, …
Basic idea:
$$\Big(\sum_i h_i + \sum_{i,j} U_{ij}\Big)\,\Psi(\vec r_1,\ldots,\vec r_N) = E\,\Psi(\vec r_1,\ldots,\vec r_N)$$
Mapping of the many-electron system onto a system of noninteracting electrons described by an effective single-particle Hamiltonian:
$$h\,\psi_i(\vec r) = \epsilon_i\,\psi_i(\vec r)$$

Self-consistency problem
Starting from an initial density n_init, the cycle runs through:
• Potential generation: $$h = -\tfrac{1}{2}\nabla^2 + V_0 + V_C[n] + V_{EX}[n]$$
• Diagonalization: $$h\,\psi_i(r) = \epsilon_i\,\psi_i(r)$$
• Fermi level: $$N = \sum_{\epsilon_i < \epsilon_F} 1$$
• Construction of charge: $$n(r) = \sum_{\epsilon_i < \epsilon_F} |\psi_i(r)|^2$$
• Mixing of charge: $$n = F[n_{old}, n_{new}]$$

Simulations
Density Functional Theory: $$\big(-\tfrac{1}{2}\nabla^2 + V_{eff}\big)\,\psi = \epsilon\,\psi$$
• Different algorithms, variety of codes, parallelization, analysis of data
• Supercomputers, clusters, workstations, inhomogeneous machines
• Bandgaps, strong correlations, different energy scales
• Structural relaxations, electronic structure, magnetic properties, phonons & magnons, transport

Typical Applications
10x10x10
• Atomic structure
• Electronic structure
• Magnetic structure

Zoo of methods
$$\big(-\tfrac{1}{2}\nabla^2 + V_{eff}\big)\,\psi = \epsilon\,\psi$$
Basis sets:
• Plane waves
• Numerical/analytical localized basis sets
• Real space grids
• Green functions
• Local density approximation (LDA), GGA, LDA+U, hybrid functionals, GW approximation
• Finite difference approximation, non-relativistic equation, scalar-relativistic approximation, spin-orbit coupling, Dirac equation
• All-electron, pseudo-potential, shape approximations, full potential
• Spin-polarized calculations

Method development
Codes presented today:
• FLEUR
• juRS (P. Baumeister)
• KKRnano (E. Rabel)
Further codes developed in IAS-1:
• Tight-binding code juTiBi
• Various KKR codes
• Spin-dynamics code juSpinX

FLEUR: the Jülich FLAPW codes
• All-electron, full-potential DFT
• All elements, open systems
Atomic structure: total energies, forces
Electronic structure: bandgaps, band structures, charge density, surface states

Linearized Augmented Plane Waves
All-electron code: V(r) contains a singularity due to the nuclei.
Basis functions:
$$\phi_G(r) = \begin{cases} e^{i(k+G)\cdot r} & \text{interstitial} \\ \sum_{lm} \big(a_{lm,G}\,u(r) + b_{lm,G}\,\dot u(r)\big)\,Y_{lm}(\hat r) & \text{muffin-tin} \end{cases}$$
Numerical radial basis:
$$\Big(-\frac{1}{2}\frac{\partial^2}{\partial r^2} + V^0_l(r)\Big)\, r\,u(r) = \epsilon_l\, r\,u(r)$$
Energy derivative:
$$\dot u = \frac{\partial u}{\partial \epsilon}$$
Additional localized functions (local orbitals) can be added.

Where does the CPU time go?
Self-consistency loop: potential generation, Hamiltonian setup, diagonalization, Fermi level, construction of charge, mixing of charge.

H,S     Diagonalization   Charge   Time     PE
50%     13%               33%      28 min   1
27%     20%               44%      36 min   32
33%     50%               17%      10 min   30
23%     61%               11%      22 min   40

Parallelisation
Multiple-level parallelism: k-loop + further loops
• Outside the k-point loop: potential generation, Fermi level, mixing of charge
• Inside the k-point loop: Hamiltonian setup, diagonalization, construction of charge
• Labels on the diagram: MPI+OMP, MPI+OMP, MPI+(OMP), MPI
• k-point loop: MPI parallel, little communication

Eigenvalue Problem
$$H c_i = \epsilon_i\, S c_i$$
For large systems this is the computationally most relevant problem.
• Generalized eigenvalue problem
• Full-matrix solver needed
• Usually only about 5-10% of the eigenvectors are needed
• Many iterations with similar matrices
• For machines with few processors: need to store many solutions of the eigenvalue problem

Hamiltonian Setup
$$H_{ij} = \langle \mathrm{lapw}_i | \hat H | \mathrm{lapw}_j \rangle = \frac{1}{V}\int_{INT} e^{-i(k+G_i)\cdot r}\,\hat H\, e^{i(k+G_j)\cdot r}\, dr + \int_{MT} \Big(\sum_{lm} Y_{lm}\,(a_{lm;i}\,u_l + b_{lm;i}\,\dot u_l)\Big)^{*}\, \hat H\, \Big(\sum_{lm} Y_{lm}\,(a_{lm;j}\,u_l + b_{lm;j}\,\dot u_l)\Big)\, dr$$
Interstitial contribution:
• Plane wave part
• Not a simple integration over all space!
Muffin-tin contribution:
• For each atom
• For each pair of basis functions i,j
Hamiltonian:
• Kinetic energy, spherical potential
• Non-spherical potential couples different l,m
Similar for the overlap matrix.

Eigenvalue Problem
The ELPA library provides OMP+MPI parallelism.
But: only on Juropa so far.

Unified Interface for output quantities?
DFT codes face a lot of common tasks:
• Generation of input, symmetry
• Determination of relaxed positions
• Plotting of output:
  • Band structure
  • Density of states
  • Charge density
• Charge density mixing
• k-integration

Implementation of hybrid functionals
Will it blend? Muffin-tin Recipes (Martin Schlipf)
Becke 1993, JCP 98, p.1372 and p.5648
• Hybrid functionals combine bare (or screened) nonlocal (NL) Hartree-Fock exchange with local (L) xc functionals:
$$E_{xc}^{hyb} = E_{xc}^{LDA} + a_0\,(E_x^{HF} - E_x^{LDA}) + a_x\,(E_x^{GGA} - E_x^{LDA}) + a_c\,(E_c^{GGA} - E_c^{LDA})$$
• Kohn-Sham equation with an additional non-local operator
• FLAPW basis:
$$V^{NL}_{x,GG'}(k) = -\sum_{n}^{occ.}\sum_{q}^{BZ} \iint \phi^{*}_{kG}(r)\,\psi_{nq}(r)\,v(r,r')\,\psi^{*}_{nq}(r')\,\phi_{kG'}(r')\; d^3r\, d^3r'$$
• Employ a mixed product basis for the exchange potential:
$$V^{NL}_{x,GG'}(k) = -\sum_{n}^{occ.}\sum_{q}^{BZ}\sum_{IJ} \langle \phi_{kG} | \psi_{nk-q}\, M^{q}_{I} \rangle\, v_{IJ}(q)\, \langle M^{q}_{J}\, \psi_{nk-q} | \phi_{kG'} \rangle$$
with the bare (screened) Coulomb matrix $v_{IJ}(q)$.

Hybrid functionals
• Hybrid functionals are a factor of 10 up to 100 more expensive computationally than conventional LDA or GGA calculations.
• More than 90% of the time is spent computing the matrix elements of the non-local exchange potential.
• By introducing an auxiliary basis $\{M^{q}_{I}\}$, the matrix elements are cast into a sum over vector-matrix-vector products:
$$V_{n_1 n_2}(k) = -\sum_{q}^{BZ}\sum_{n}^{occ.}\sum_{IJ} \langle \psi_{n_1 k} | \psi_{nq}\, M^{q}_{I} \rangle\, C_{IJ}(q)\, \langle M^{q}_{J}\, \psi_{nq} | \psi_{n_2 k} \rangle$$
Loop structure:
• Loop over k-points
  • Loop over q-points
    • Loop over bands n1
      • Loop over occupied bands n
        • Sparse matrix-vector product
        • Loop over bands n2
          • Scalar product

GW approximation (SPEX code)
Direct calculation of electronic excitation energies $E_{kn}$.
Equation of motion of interacting particles (electrons or holes):
$$\hat h_0(r)\,\psi_{kn}(r) + \int \Sigma^{GW}(r,r';E_{kn})\,\psi_{kn}(r')\, d^3r' = E_{kn}\,\psi_{kn}(r)$$
Numerical scaling: (Size)^4
Self-energy operator: depends on r, r', and E (complex) → 8 independent parameters
GW approximation:
$$\Sigma^{GW}(r,r';\omega) = \frac{i}{2\pi}\int G^{KS}(r,r';\omega+\omega')\,W(r,r';\omega')\,e^{i\eta\omega'}\, d\omega'$$
For comparison, DFT: equation of motion formally of noninteracting particles:
$$\hat h_0(r)\,\psi^{KS}_{kn}(r) + v^{xc}(r)\,\psi^{KS}_{kn}(r) = \epsilon_{kn}\,\psi^{KS}_{kn}(r)$$
Numerical scaling: (Size)^3
Exchange-correlation potential: depends on r → only 3 independent parameters

Green function embedding
• Density Functional Theory for broken symmetries
• Green function method
• Complex band structure, transport
Figures: Bloch spectral function; interface transmission Ag/Pt
Order-N scaling
• Separate the system into layers
• Each step scales linearly with size:
  1) Green function for each isolated layer
  2) Propagate embedding potentials
  3) Green function for each layer with correct embedding potentials
Efficient distribution on many processors

Additional features
• Film setups
• Semi-infinite vacuum, no supercells
• Wannier functions
• Wannier interpolation
• Construction of TB models
• Non-collinear magnetism
• Electric fields
• Spin-spirals
• Spin-orbit interaction
• LDA+U
• Calculation of U by constrained RPA
• Optimized effective potential method + RPA
• Interface to van der Waals code
• …

Summary
FLEUR:
• All-electron full-potential DFT
• Features
• CPU intensive parts
• Parallelization
Possible Projects:
• Eigenvalue problem
• OMP parallelization
• Unified input/output interface
• Charge density mixing for large systems
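The cycle on the "Self-consistency problem" slide (potential generation, diagonalization, Fermi level, construction of charge, mixing of charge) can be illustrated with a small toy model. The sketch below is not FLEUR code: the 1D chain, the on-site interaction `u`, the external potential, and the linear mixing parameter `alpha` are all assumptions chosen only to make the loop runnable with NumPy.

```python
import numpy as np

def scf_loop(n_sites=8, n_electrons=4, u=1.0, alpha=0.3, tol=1e-8, max_iter=500):
    """Toy self-consistency cycle on a 1D chain (hypothetical model, not FLEUR)."""
    # Density-independent part: nearest-neighbour hopping and an external potential.
    hopping = -1.0 * (np.eye(n_sites, k=1) + np.eye(n_sites, k=-1))
    v_ext = 0.5 * np.cos(2.0 * np.pi * np.arange(n_sites) / n_sites)
    density = np.full(n_sites, n_electrons / n_sites)  # n_init: uniform density
    for iteration in range(1, max_iter + 1):
        # Potential generation: the effective potential depends on the density.
        v_eff = v_ext + u * density
        # Diagonalization of the effective single-particle Hamiltonian.
        eps, psi = np.linalg.eigh(hopping + np.diag(v_eff))
        # Fermi level: fill the n_electrons lowest states (spinless, T = 0).
        occupied = psi[:, :n_electrons]
        # Construction of charge: n(i) = sum over occupied states of |psi_i|^2.
        new_density = np.sum(np.abs(occupied) ** 2, axis=1)
        if np.linalg.norm(new_density - density) < tol:
            return density, eps, iteration
        # Mixing of charge: simple linear mixing, n = F[n_old, n_new].
        density = (1.0 - alpha) * density + alpha * new_density
    raise RuntimeError("SCF cycle did not converge")

density, eps, iterations = scf_loop()
print(f"converged after {iterations} iterations, total charge = {density.sum():.6f}")
```

Linear mixing is the simplest possible choice for F; the "charge density mixing for large systems" project in the summary refers to more elaborate schemes that this sketch does not attempt.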
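The generalized eigenvalue problem H c_i = ε_i S c_i from the "Eigenvalue Problem" slide can be reduced to a standard Hermitian problem by a Cholesky factorization of the overlap matrix. The following is a minimal NumPy sketch under stated assumptions: the matrices are random stand-ins for H and S, the function name `lowest_generalized_eigenpairs` is hypothetical, and the dense solver computes all eigenpairs before slicing, whereas LAPACK-style subset drivers would be used when only 5-10% of the eigenvectors are needed.

```python
import numpy as np

def lowest_generalized_eigenpairs(h, s, n_wanted):
    """Solve H c = eps S c for the n_wanted lowest eigenpairs (illustrative only)."""
    chol = np.linalg.cholesky(s)                 # S = L L^H with L lower triangular
    a = np.linalg.solve(chol, h)                 # L^-1 H
    h_std = np.linalg.solve(chol, a.conj().T).conj().T  # L^-1 H L^-H (Hermitian)
    eps, y = np.linalg.eigh(h_std)               # standard problem: h_std y = eps y
    c = np.linalg.solve(chol.conj().T, y[:, :n_wanted])  # back-transform c = L^-H y
    return eps[:n_wanted], c

# Random Hermitian H and positive-definite S as placeholder data.
rng = np.random.default_rng(0)
n = 40
a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
h = (a + a.conj().T) / 2
b = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
s = np.eye(n) + 0.01 * (b @ b.conj().T)

eps, c = lowest_generalized_eigenpairs(h, s, n_wanted=4)  # 10% of the spectrum
residual = np.linalg.norm(h @ c - (s @ c) * eps)
print("residual of H c - eps S c:", residual)
```

The returned eigenvectors are S-orthonormal, that is, c^H S c is the identity.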
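The loop structure listed for the hybrid-functional exchange matrix (q-points, bands n1, occupied bands n, a matrix-vector product, bands n2, a scalar product) can be written out for a single k-point as follows. This is a sketch with random data, not the FLEUR implementation: the array `proj[q, n, m, I]` is a hypothetical stand-in for the projections ⟨M_I ψ_n | ψ_m⟩, `coulomb[q]` stands in for C_IJ(q), and the matrix is stored densely here although the slide calls for a sparse product.

```python
import numpy as np

def exchange_matrix_loops(proj, coulomb):
    """V[n1, n2] = -sum_{q,n,I,J} conj(proj[q,n,n1,I]) coulomb[q,I,J] proj[q,n,n2,J]."""
    n_q, n_occ, n_bands, _ = proj.shape
    v = np.zeros((n_bands, n_bands), dtype=complex)
    for q in range(n_q):                    # loop over q-points
        for n1 in range(n_bands):           # loop over bands n1
            for n in range(n_occ):          # loop over occupied bands n
                # vector-matrix product (dense here; sparse in the slide's scheme)
                w = proj[q, n, n1].conj() @ coulomb[q]
                for n2 in range(n_bands):   # loop over bands n2
                    v[n1, n2] -= w @ proj[q, n, n2]  # scalar product
    return v

def exchange_matrix_einsum(proj, coulomb):
    """Same contraction in one call, used as a cross-check of the explicit loops."""
    return -np.einsum("qnaI,qIJ,qnbJ->ab", proj.conj(), coulomb, proj)

# Placeholder sizes and random data (assumptions for illustration).
rng = np.random.default_rng(1)
n_q, n_occ, n_bands, n_aux = 2, 3, 5, 7
proj = rng.normal(size=(n_q, n_occ, n_bands, n_aux)) \
    + 1j * rng.normal(size=(n_q, n_occ, n_bands, n_aux))
g = rng.normal(size=(n_q, n_aux, n_aux)) + 1j * rng.normal(size=(n_q, n_aux, n_aux))
coulomb = g @ g.conj().transpose(0, 2, 1)   # Hermitian matrix for each q

v_loops = exchange_matrix_loops(proj, coulomb)
v_fast = exchange_matrix_einsum(proj, coulomb)
print("loops and einsum agree:", np.allclose(v_loops, v_fast))
```

Because the product `w` is computed once per (n1, n) pair and reused for every n2, the expensive matrix-vector step sits outside the innermost loop, which is the point of casting the matrix elements as vector-matrix-vector products.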