Particle-laden Turbulence in a Radiation Environment: Efficient Linear Solver

Particle-laden Turbulence in a Radiation Environment: Efficient Linear Solver
Slide Note
Embed
Share

This research focuses on developing a parallel linear solver using hierarchical matrix techniques to solve particle-laden turbulence problems in radiation environments. Motivations include tackling large linear systems from various applications and leveraging heterogeneous machine architectures. The solver demonstrates superior efficiency and accuracy through hierarchical matrix compression of fill-in structures. Algorithmic approaches optimize mesh partitioning, fill-in compression, and coarse-level vertex elimination.

  • Linear Solver
  • Hierarchical Matrices
  • Particle-laden Turbulence
  • Radiation Environment
  • Efficiency

Uploaded on Feb 16, 2025 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Particle-laden Turbulence in a Radiation Environment Parallel Linear Solver Using Hierarchical Matrix C. Chen, L. Cambier, E. Darve, Stanford; S. Rajamanickam, R. Tuminaro, E. Boman, Sandia PSAAP II

  2. Motivations (1/2) Increasingly large linear systems from various applications Ice Sheet Simulation Structural Mechanics Solid Mechanics Electro- magnetics [NY times, 2015, Liu, 2017, Aminfar, 2015, Takahashi, 2017]

  3. Motivations (2/2) Heterogeneous machine architectures NERSC Cori Computer NVIDIA GPU Intel KNL

  4. Solve Ax = b Existing linear solvers - sparse direct solvers (e.g., SuperLU) - iterative methods - preconditioners (e.g., ILU, multigrid) Hierarchical solvers/preconditioners - tunable accuracy - fast and robust - high arithmetic intensity - reduced communication and synchronization

  5. Key tool: hierarchical matrices [Hackbusch, Bebendorf, Borm, Gu, etc]: off-diagonal blocks of typical differential and integral operators are effectively low-rank Theoretically optimal complexity of operation counts and memory footprint Data-sparse representation - H, H2, HSS and HODLR [Aminfar, 2015]

  6. Key idea: compressing fill-in Exploit the low-rank structure of fill-in - singular value-based truncation - in contrast to level-based and threshold-based dropping rules in ILU or IChol - compression of fill-in in a hierarchical fashion [Aminfar, 2015]

  7. Algorithm of our solver Partition the sparse matrix (mesh or graph) Compress fill-in and create coarse vertices (red in fig.) Eliminate fine vertices (green in fig.) Recurse on the coarse system (level-1 fill-in) Partition of mesh grid/unknowns [Pouransari, 2017] Far-field (fill-in) compression

  8. Algorithm illustration (1) original partition (2) one coarse node (3) two coarse nodes (6) coarse level (5) many coarse nodes (4) three coarse nodes

  9. Particle-laden Turbulence in a Radiation Environment Parallel algorithm PSAAP II

  10. Parallel algorithm: concurrency Fill-in exists between two clusters with at most distance 2 Fill-in pattern: ILU(1) [Yang, 2017]

  11. Parallel algorithm: data dependency d1: boundary nodes; on critical path; requires coloring d2: lower priority than d1 d3: interior nodes; lowest priority; no communication needed

  12. d1: relaxed distance-2 coloring P0 P1 Four colors for regular grids P3 P2

  13. Parallel algorithm Asynchronous algorithm: d3 computation overlaps with d1, d2 communication. d1 nodes d2 nodes No communication d3 nodes

  14. Communication & Computation Local communication with neighbors at every level Batched dense linear algebra, good for many-core processors (e.g., GPU, MIC) Assume: problem size=N, # processors=p, rank=r and M = Nr2/p - computation: O(M) - communication: O(M2/3) - # messages: O( log(N/rp)+log(p) )

  15. Comparison with SuperLU-Dist 16 cores on NERSC Edison

  16. Matrices from UFL collections

  17. Particle-laden Turbulence in a Radiation Environment Application: ice sheet modeling PSAAP II

  18. Solve for ice sheet velocity Larsen C ice shelf Extruded (thin) mesh in the vertical direction Neumann & Robin boundary conditions [NASA, 2017, Tuminaro, 2016]

  19. Improved convergence Model problem on a cube: strong vertical coupling; anisotropic Preserve near-null space (piecewise constant) in the low-rank approximation [Yang, 2017]

  20. Summary Hierarchical solver - data-sparse representation - fast and robust Parallel algorithm - local communication - high computational intensity - reduced global synchronization Various applications - symmetric and nonsymmetric - positive definite or indefinite

  21. Acknowledgements

  22. References https://www.nytimes.com/2015/09/12/science/climate-study-predicts-huge-sea-level-rise-if-all-fossil-fuels- are-burned.html https://www.nasa.gov/feature/goddard/2017/massive-iceberg-breaks-off-from-antarctica Liu, Yinjian, Jinyou Xiao, An Enhanced Hierarchical Solver for Preconditioning of Large-Scale Finite Element, working manuscript. Aminfar, AmirHossein, A fast and memory efficient sparse solver with applications to finite element matrices (PhD thesis), Stanford University, 2015. Takahashi, Toru, Pieter Coulier, and Eric Darve. "Application of the inverse fast multipole method as a preconditioner in a 3D Helmholtz boundary element method." Journal of Computational Physics 341 (2017): 406-428. Pouransari, Hadi, Pieter Coulier, and Eric Darve. "Fast hierarchical solvers for sparse matrices using extended sparsification and low-rank approximation." SIAM Journal on Scientific Computing 39, no. 3 (2017): A797-A830. Yang, Kai, Hadi Pouransari, and Eric Darve. "Sparse hierarchical solvers with guaranteed convergence." arXiv preprint arXiv:1611.03189 (2016). Tuminaro, Raymond, Mauro Perego, Irina Tezaur, Andrew Salinger, and Stephen Price. "A matrix dependent/algebraic multigrid approach for extruded meshes with applications to ice sheet modeling." SIAM Journal on Scientific Computing 38, no. 5 (2016): C504-C532. Chen, Chao, Hadi Pouransari, Siva Rajamanickam, Erik G. Boman, and Erik Darve. "A distributed-memory hierarchical solver for general sparse linear systems." Journal of Parallel Computing, accepted subject to minor changes.

More Related Content