Sobolev (+Node 6, 7) Showcase and Supercomputer Hardware Specifications


A showcase of the Sobolev cluster with nodes 6 and 7 and the Tesla K20m GPU accelerator. It covers the hardware specifications (Intel Xeon CPUs and an NVIDIA GPU), the monitoring page, and benchmark results for Jacobi and conjugate gradient solvers on GPU and CPU.

Uploaded by sjoz on Jun 03, 2025


Presentation Transcript

Sobolev (+Node 6, 7) Showcase
+ K20m GPU Accelerator

Supercomputer (www.top500.org)

The No. 1, Tianhe-2, and the No. 7, Stampede
- Intel Xeon Phi processors
The No. 2, Titan, and the No. 6, Piz Daint
- NVIDIA GPUs

Share
GPU : NVIDIA 46, ATI Radeon 3
Xeon Phi : 21
Hybrid : 4

Hardware Specification

Main module
4 * Intel Xeon X7550 : 2GHz, 18MB Cache, 8 Cores
Memory : 64GB
QDR 40Gb/s Infiniband

Sub-module (*5)
2 * Intel Xeon X5660 : 2.8GHz, 12MB Cache, 6 Cores
Memory : 48GB
QDR 40Gb/s Infiniband

Sub-module (*2)
2 * Intel Xeon E5-2650 : 2.6GHz, 20MB Cache, 8 Cores
Memory : 128GB
QDR 40Gb/s Infiniband

Monitoring : sobolev.kaist.ac.kr

[Diagram: Sobolev and Node1 through Node7; two of the nodes carry a GPU label]

Tesla K20m

CUDA parallel processing cores : 2496
Memory size : 5GB GDDR5
Processor core clock : 706 MHz
Peak double precision floating point performance : 1.17 Tflops
Thermal solution : Passive
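The peak double-precision figure can be cross-checked against the core clock. This is a minimal sketch, assuming the Kepler GK110 layout of 13 SMX units with 192 CUDA cores and 64 double-precision units each (these per-SMX counts are not on the slide), and one fused multiply-add, counted as 2 flops, per double-precision unit per cycle.

```python
# Cross-check of the K20m peak double-precision figure.
# Assumptions (not stated on the slide): 13 SMX units, each with 192 CUDA
# cores and 64 double-precision units; one FMA = 2 flops per unit per cycle.
SMX_COUNT = 13
CORES_PER_SMX = 192
DP_UNITS_PER_SMX = 64
CLOCK_HZ = 706e6  # processor core clock from the slide

cuda_cores = SMX_COUNT * CORES_PER_SMX
peak_dp_tflops = SMX_COUNT * DP_UNITS_PER_SMX * 2 * CLOCK_HZ / 1e12

print(cuda_cores)                # 2496
print(round(peak_dp_tflops, 2))  # 1.17
```

Both values agree with the numbers listed on the slide.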

Test problem

$-\Delta u = f$ in $\Omega = [0,1] \times [0,1]$
$u = 0$ on $\partial\Omega$
Solution : $u = \sin^2(\pi x)\,\sin^2(\pi y)$

1. Jacobi (GPU) vs Block Jacobi (CPU)
2. Conjugate gradient method
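To make the first method concrete, here is a minimal serial sketch of a Jacobi iteration on this test problem. It is not the presenter's CUDA or MPI code. It assumes the problem reads -Δu = f with exact solution sin²(πx) sin²(πy), a standard 5-point finite-difference stencil on a uniform mesh, and a stopping rule based on the largest update; the names `jacobi_poisson` and `max_error` are hypothetical.

```python
import math

def jacobi_poisson(n, tol=1e-8, max_iter=100_000):
    """Jacobi iteration for -Laplace(u) = f on the unit square, u = 0 on the boundary.

    n is the number of mesh intervals per side (h = 1/n).
    Returns (u, iterations) with u an (n+1) x (n+1) list of lists.
    """
    h = 1.0 / n
    pi = math.pi
    # Assumed right-hand side: -Laplace applied to sin^2(pi x) sin^2(pi y).
    f = [[-2 * pi**2 * (math.cos(2 * pi * i * h) * math.sin(pi * j * h) ** 2
                        + math.sin(pi * i * h) ** 2 * math.cos(2 * pi * j * h))
          for j in range(n + 1)] for i in range(n + 1)]
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    for it in range(1, max_iter + 1):
        new = [row[:] for row in u]  # boundary values stay zero
        change = 0.0
        for i in range(1, n):
            for j in range(1, n):
                # Every point is updated from the previous iterate only,
                # which is what makes Jacobi easy to parallelize.
                new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                    + u[i][j - 1] + u[i][j + 1]
                                    + h * h * f[i][j])
                change = max(change, abs(new[i][j] - u[i][j]))
        u = new
        if change < tol:
            return u, it
    return u, max_iter

def max_error(u, n):
    """Largest pointwise difference from the assumed exact solution."""
    h = 1.0 / n
    return max(abs(u[i][j] - math.sin(math.pi * i * h) ** 2 * math.sin(math.pi * j * h) ** 2)
               for i in range(n + 1) for j in range(n + 1))

u, iters = jacobi_poisson(16)
print(iters, max_error(u, 16))
```

The mesh here (h = 1/16) is far coarser than the ones benchmarked in the slides, so that the pure-Python loop finishes quickly.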

1. Jacobi (GPU) vs Block Jacobi (CPU)

Meshsize(h)                1/128    1/256     1/512      1/1024
Jacobi CUDA(GPU)           0.49     4.06      47.3       938.85
Block Jacobi mpi3*3(CPU)   0.6630   20.1053   273.5613   4297.2409
Block Jacobi mpi6*6(CPU)   0.3285   4.8383    103.8234   1438.0965
Block Jacobi mpi9*9(CPU)   1.3400   3.0964    54.2547    741.5949
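One way to read the Jacobi numbers is to compare the GPU run with the fastest CPU run at each mesh size. The sketch below copies the values from the slide. The slide does not state units, so treating the values as elapsed times (lower is better) is an assumption, and the helper name `fastest_cpu` is hypothetical.

```python
# Values copied from the Jacobi table. Units are not stated on the slide;
# they are treated here as elapsed times (lower is better), an assumption.
mesh = ["1/128", "1/256", "1/512", "1/1024"]
cuda = [0.49, 4.06, 47.3, 938.85]
block_jacobi = {
    "mpi3*3": [0.6630, 20.1053, 273.5613, 4297.2409],
    "mpi6*6": [0.3285, 4.8383, 103.8234, 1438.0965],
    "mpi9*9": [1.3400, 3.0964, 54.2547, 741.5949],
}

def fastest_cpu(k):
    """Return (configuration, value) of the smallest CPU entry at mesh index k."""
    name = min(block_jacobi, key=lambda cfg: block_jacobi[cfg][k])
    return name, block_jacobi[name][k]

for k, h in enumerate(mesh):
    name, t = fastest_cpu(k)
    print(f"h = {h}: fastest CPU run {name}, CPU/GPU ratio {t / cuda[k]:.2f}")
```

Under that reading, the GPU run beats the best CPU run only at h = 1/512; at the other three mesh sizes one of the block Jacobi configurations has the smaller value.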


2. Conjugate Gradient

Meshsize(h)   1/256   1/512   1/1024   1/2048
CUDA          0.11    0.4     2.23     15
mpi1          1.17    8.90    79.37    649.91
mpi2          0.59    4.46    38.58    320.47
mpi4          0.31    2.26    20.35    178.33
mpi8          0.17    1.19    11.31    114.00
mpi16         0.12    0.69    5.74     69.20
mpi32         0.13    0.45    2.89     30.68
mpi64         0.23    0.51    2.11     15.42
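For reference, the conjugate gradient method timed above can be sketched serially as follows. This is a minimal matrix-free illustration, not the CUDA or MPI code behind the numbers. It assumes the test problem is -Δu = f with exact solution sin²(πx) sin²(πy), discretized with the 5-point stencil and scaled by h²; the names `apply_stencil`, `poisson_rhs`, and `conjugate_gradient` are hypothetical.

```python
import math

def apply_stencil(v, n):
    """h^2-scaled 5-point stencil for -Laplace; boundary entries of v are zero."""
    out = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n):
        for j in range(1, n):
            out[i][j] = 4.0 * v[i][j] - v[i - 1][j] - v[i + 1][j] - v[i][j - 1] - v[i][j + 1]
    return out

def dot(a, b):
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def poisson_rhs(n):
    """h^2 * f on interior points, with f assumed to be -Laplace of sin^2(pi x) sin^2(pi y)."""
    h, pi = 1.0 / n, math.pi
    b = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n):
        for j in range(1, n):
            b[i][j] = h * h * -2 * pi**2 * (
                math.cos(2 * pi * i * h) * math.sin(pi * j * h) ** 2
                + math.sin(pi * i * h) ** 2 * math.cos(2 * pi * j * h))
    return b

def conjugate_gradient(b, n, tol=1e-10, max_iter=10_000):
    """Standard CG for the stencil system, starting from x = 0."""
    x = [[0.0] * (n + 1) for _ in range(n + 1)]
    r = [row[:] for row in b]  # residual b - A x with x = 0
    p = [row[:] for row in b]  # first search direction
    rr = dot(r, r)
    if math.sqrt(rr) < tol:
        return x, 0
    for it in range(1, max_iter + 1):
        ap = apply_stencil(p, n)
        alpha = rr / dot(p, ap)
        for i in range(1, n):
            for j in range(1, n):
                x[i][j] += alpha * p[i][j]
                r[i][j] -= alpha * ap[i][j]
        rr_new = dot(r, r)
        if math.sqrt(rr_new) < tol:
            return x, it
        beta = rr_new / rr
        for i in range(1, n):
            for j in range(1, n):
                p[i][j] = r[i][j] + beta * p[i][j]
        rr = rr_new
    return x, max_iter

n = 16
x, cg_iters = conjugate_gradient(poisson_rhs(n), n)
cg_error = max(abs(x[i][j] - math.sin(math.pi * i / n) ** 2 * math.sin(math.pi * j / n) ** 2)
               for i in range(n + 1) for j in range(n + 1))
print(cg_iters, cg_error)
```

Each iteration needs one stencil application and two dot products. In a parallel version the dot products become global reductions, which is one plausible reason the small-mesh timings stop improving at high process counts.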

