Circuit Simulation via Matrix Exponential Method

Slide Note

This presentation explores the application of Matrix Exponential Method in circuit simulation, covering topics such as design flow, timing analysis, logic synthesis, and more. It also delves into emerging demands in system verification, scalability, and power grid analysis. The included publications highlight advancements in circuit simulation techniques, adaptive time stepping, stiffness handling, parallel processing, and more.

quey454 Follow

Uploaded on Mar 11, 2025 | 2 Views

Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

Download Presentation

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript

Circuit Simulation via Matrix Exponential Method Speaker: Shih-Hung Weng Adviser: Chung-Kuan Cheng Date: 05/31/2013 1

Foundation of Design Flow lookup table Timing Analysis Logic Synthesis Placement Routing Abstraction Layer characterization Circuit Simulation Circuit Simulation 2

Emerging Demands Full system verification and analysis scalability and performance on-chip power grid voltage low frequency time 3

Publications (1/3) Circuit Simulation with Matrix Exponential Method: 1. S.-H. Weng, H. Zhuang and C.K. Cheng, Adaptive Time Stepping for Power Grid Simulation using Matrix Exponential Method , submitted to IEEE ICCAD 2013 2. S.-H. Weng, Q. Chen and C.K. Cheng, Circuit Simulation using Matrix Exponential Method for Stiffness Handling and Parallel Processing , IEEE ICCAD, Nov. 2012 3. Q. Chen, W. Schoenmaker, S.-H. Weng, C.K. Cheng, G.-H. Chen, L.-J. Jiang and N. Wong, A Fast Time- Domain EM-TCAD Coupled Simulation Framework via Matrix Exponential, IEEE ICCAD, Nov. 2012 (Best Paper Award Candidate) 4. Y. Li, Q. Cheng, S.-H. Weng, C.K. Cheng and N. Wong, Globally Stable, Highly Parallelizable Fast Transient Circuit Simulation via Faber Series , IEEE NewCAS May. 2012 5. S.-H. Weng, Q. Chen and C.K. Cheng, Time-Domain Analysis of Large-Scale Circuits by Matrix Exponential Method with Adaptive Control , IEEE Trans. on CAD, Jul. 2012 6. Q. Chen, S.-H. Weng and C.K. Cheng, A Practical Regularization Technique for Modified Nodal Analysis in Large-Scale Time-Domain Circuit Simulation , IEEE Trans. on CAD, Jun. 2012 7. S.-H. Weng, Q. Chen and C.K. Cheng, Circuit Simulation by Matrix Exponential Method, IEEE ASIC Conference, Oct. 2011 8. S.-H. Weng, P. Du and C.K. Cheng, A Fast and Stable Explicit Integration Method by Matrix Exponential Operator for Large Scale Circuit Simulation , IEEE ISCAS, May. 2011 4

Publications (2/3) Clock Gating Synthesis: 9. S.-H Weng, Y.-M. Kuo and S.-C. Chang, Timing Optimization in Sequential Circuit by Exploiting Clock-Gating Logic, ACM Trans. on DAES, April 2012. 10. Y.-M. Kuo, S.-H. Weng, and S.-C. Chang, A Novel Sequential Circuit Optimization with Clock Gating Logic, IEEE ICCAD, Nov. 2008 High-speed Interconnect: 11. G. Sun, S.-H. Weng, C.K, Cheng, B. Lin and L. Zeng, An On-Chip Global Broadcast Network Design with Equalized Transmission Lines in the 1024-Core Era , IEEE SLIP Jun. 2012 12. S.-H. Weng, Y. Zhang, J. F. Buckwalter and C.K. Cheng, Energy Efficiency Optimization through Co- Design of the Transmitter and Receiver in High-Speed On-Chip Interconnects , accepted by IEEE Trans. on VLSI Placement and Routing: 13. C.K. Cheng, P. Du, A.B. Kahng and S.-H. Weng, Low-Power Gated Bus Synthesis for 3D IC via Rectilinear Shortest-path Steiner Graph, IEEE ISPD, Mar., 2012 14. P. Du, W. Zhao, S.H. Weng, C.K. Cheng and R.L. Graham, Character Design and Stamp Algorithms for Character Projection Electron-Beam Lithography, IEEE ASPDAC, Feb., 2012 5

Publications (3/3) Power Grid Analysis: 15. X. Hu, P. Du, S.-H. Weng and C.K. Cheng, Worst-Case Noise Prediction With Non-zero Current Transition Times for Power Grid Planning, accepted by IEEE Trans. on VLSI. 16. C.-C. Chou, H.-H. Chuang, T.-L. Wu, S.-H. Weng, and C.K. Cheng, Eye Prediction of Digital Driver with Power Distribution Network Noise, IEEE EPEPS, Nov. 2012 (Best Student Paper Award) 17. P. Du, S.-H. Weng, X. Hu and C.K. Cheng, Power Grid Sizing via Convex Programming, IEEE ASIC Conference, Oct. 2011 18. P. Du, X. Hu, S.H. Weng, A. Shayan, X. Chen, A. E. Engin and C.K. Cheng, Worst-Case Noise Prediction with Non-zero Current Transition Times for Early Power Distribution System Verification, IEEE ISQED, Mar. 2010 19. S.-H. Weng, Y.-M. Kuo, S.-C. Chang, and M. Marek-Sadowska, Timing Analysis Considering IR Drop Waveforms in Power Gating Designs, IEEE ICCD, Oct. 2008 6

Outline Numerical Integration in Circuit Simulation Matrix Exponential Method Krylov Subspace Approximation Rational Krylov Subspace Approximation Parallelism Experimental Results Conclusions 7

Circuit Formulation Formulated as a system of DAEs[Ho et. al. 75] capacitance & inductance branch currents & nodal voltages input sources ( ) x q ( ) t ( ) t ( ) x ( ) t + = + + C x G x i u L L currents of nonlinear devices derivative of charges in nonlinear devices resistance & incidence linearized by compact model (BSIM, PSP, etc.) 8

Circuit Formulation Formulated as a system of DAEs[Ho et. al. 75] ( ) ( ) t x C x q L = + ( ) t ( ) x ( ) t + + G x i u L after linearization = Gx + C x u ( ) ( ) ( ) t t t Solve x(t) in implicit or explicit numerical method 9

Numerical Integration (1/2) = Gx + C x u Forward Euler (1st order explicit) h G C I x 1 ( + = ( ) ( ) ( ) t t t + 1 1 x C u ) h sparse matrix-vector product n n n Backward Euler (1st order implicit) ) / ( 1 + + = n h G C x + 1 C x u ( / ) h solving a linear system + 1 n n Stability issue for stiff circuit forward Euler unstable result backward Euler performance & scalability issues 10

Numerical Integration (2/2) Methods Computation Scalability Error Stability Step size O(h2) Forward Euler x=Av high low tiny O(h2) Backward Euler Ax=b low A-stable medium O(h3) Trapezoidal Ax=b low A-stable > Backward Euler O(hn) and beyond? simple high high large more #steps Linear Nonlinear stiffness Methods Performance = # steps x computation per step High Mild Low High Mild Low Forward Euler slow fast circuit dependent slow fast Backward Euler medium slow lots Ax=b Trapezoidal > Backward Euler and beyond? fast one Ax=b with fixed step size in C/h+G 11

Outline Numerical Integration in Circuit Simulation Matrix Exponential Method Krylov Subspace Approximation Rational Krylov Subspace Approximation Parallelism Experimental Results Conclusions 12

Matrix Exponential Method (1/2) = Gx + Analytical solution of Let A=-C-1G, b=C-1u(Ccan be regularized [TCAD 12]) C x u ( ) ( ) ( ) t t t h b + = + + d A A ( ) h h x x ( ) ( ) ( ) t h e t e t 0 + b b ( ) ( ) t h t = + b b ( ) ( ) t h Let input be piecewise linear + ( ) ( ) b b ( ) ( ) t h t + = + + + A A A 1 2 h h h x x A b A I A ( ) ( ) ( ) ( ) t h e t e t e h h 13

Matrix Exponential Method (2/2) One-exponential formulation [Al-Mohy&Higham 11] reduce three matrix exponential to one + ( ) ( ) b b ( ) ( ) t h t + = + + + A A A 1 2 h h h x x A b A I A ( ) ( ) ( ) ( ) t h e t e t e h h 0 1 = J b 0 0 A W x ( ) t h 0 J + b + = = ( ) ( ) t h t W b x I ( ) t ( ) 0 t h e where h n e 0 2 = e 2 1 14

Advantages Accuracy: Analytical solution ApproximateeAh as (I+Ah) Forward Euler ApproximateeAh as (I-Ah)-1 Backward Euler Stability: A-stable for passive circuits reference solution How to compute eAv? 15

Computation on Matrix Exponential 19 dubious ways[van Loan03] Categories Based on 2 3 A A + + + + I A Series Method ! 2 ! 3 ( ) ( ) A SBS A N eA Rational Approximation small D = C 1 A B Decomposition + = = B C BC CB e e e Splitting spec(A) 1 ( ) 1 z I A e z dz Quadrature Rule 2 i large eAv m v Av A v { , , , } span Krylov Subspace regular basis and rational basis 16

Outline Numerical Integration in Circuit Simulation Matrix Exponential Method Krylov Subspace Approximation Rational Krylov Subspace Approximation Parallelism Experimental Results Conclusions 17

Krylov Subspace Approximation (1/2) Krylov subspace K(A, v) = {v, Av, A2v, , Am-1v} orthogonalized by Arnoldi process scaling invariant = H V AV {v, Av, A2v, , Am-1v} m m m Arnoldi process sparse matrix-vector multiplication approximate eAhv by eHmh adaptivity H v A h h v V e e e efficiency m m 1 2 m is about 10~100 posteriori error estimation[Saad92] ( ) H = + h m v H , 1 Err m m h e e e m fast error estimation 1 krylov 2 18

Krylov Subspace Approximation (2/2) Stiffness affects step size and dimension Arnoldi process captures extreme and clustered eigenvalues Image{h } Image{h } captured regions - max - min Real{h } Real{h } Arnoldi process with a small m highly stiff critical part for eAh shrink h or increase m for capturing critical eigenvalues remedied by restarted scheme and scaling effect [ICCAD 12] Error bound [Saad92] + 1 m e v 2 Err = A h ( )! where krylov + 2 1 m 2 19

Outline Numerical Integration in Circuit Simulation Matrix Exponential Method Krylov Subspace Approximation Rational Krylov Subspace Approximation Parallelism Experimental Results Conclusions 20

Rational Krylov Subspace Approximation (1/2) Rational basis (I- A)-1 K((I- A)-1, v) = {v, (I- A)-1v, , (I- A)-mv} Arnoldi process avoid regularization of C ( ) .. for j = 1, 2, . . . , m solve (I- A)w = vj for i = 1, 2, . . . , j Hi,j = wTvi w = w Hi,jvi end Hj+1,j = |w|2 vj+1 = w/Hj+1,j end 1 = H V I A V m m m (C+ G)w=Cvj mV H , w=Avj m subspace for A one LU for linear circuit ( ) 1 1 I H V AV m m m 21

Rational Krylov Subspace Approximation (1/2) Rational basis (I- A)-1 K((I- A)-1, v) = {v, (I- A)-1v, , (I- A)-mv} Approximation of eAhv adaptivity h ~ H ( ) ~ m A h v v V e e e = 1 H I H m 1 m m 2 Posteriori error estimation[van den Eshof 06] ~ Hm v / h I e ~ H ( ) ( ) = + m A v , 1 Err m m e e 2 ~ H + 1 1 rational m / h m 22

Rational Krylov Subspace Approximation (2/2) Spectral transformation similar to preconditioning relax stiffness constraint enable large step size with less dimension Image{h } applying large h to 1/ (I-H-1) transforming spectrum by (I- A)-1 small gap - min - min -h min- max - max max min -h max within a unit circle Real{h } determined by captured by Arnoldi process small m is acceptable critical part for eA projecting back to A by 1/ (I-H-1) 23

Rational Krylov Subspace Approximation (2/2) Spectral transformation similar to preconditioning relax stiffness constraint enable large step size with less dimension h ~ H m = A h v v V Error e e e m 1 2 fix , sweep m and h small step size 24

Rational Krylov Subspace Approximation (2/2) Spectral transformation similar to preconditioning relax stiffness constraint enable large step size with less dimension h ~ H m = A h v v V Error e e e m 1 2 fix h , sweep m and large error = 10-12 25

Wrap Up Methods Computation Scalability Error Stability Step size O(h2) Forward Euler x=Av high low tiny O(h2) Backward Euler Ax=b low A-stable medium O(h3) Trapezoidal Ax=b low A-stable > Backward Euler O(hn) Krylov Approx x=Av high high medium O(hn) Ration Krylov Ax=b low high large Linear Nonlinear Methods High Mild Low High Mild Low Forward Euler slow fast slow fast Backward Euler medium slow Trapezoidal > Backward Euler Krylov Approx slow fast slow medium Ration Krylov fast slow 26

Outline Numerical Integration in Circuit Simulation Matrix Exponential Method Krylov Subspace Approximation Rational Krylov Subspace Approximation Parallelism Experimental Results Conclusions 27

Parallelism in Krylov Subspace Arnoldi process sparse matrix-vector multiplication [Bell&Garland 09] a a x thread 1 thread 2 1 , 1 , 1 n 1 x 2 x thread n-1 thread n 1 n a a x 1 , , n n n n Exponential of a small matrix [Higham 05] dense matrix by matrix operation 28

Input Grouping Constant slope within a step + ( ) ( ) b b ( ) ( ) t h t + = + + + A A A 1 2 h h h x x A b A I A ( ) ( ) ( ) ( ) t h e t e t e h h input 1 input 2 time tiny steps due to maintaining constant slope 29

Input Grouping Constant slope within a step thread 1 group 1 time thread 2 group 2 time 30

Outline Numerical Integration in Circuit Simulation Matrix Exponential Method Krylov Subspace Approximation Rational Krylov Subspace Approximation Parallelism Experimental Results Conclusions 31

Settings of Experiments Environment Implemented in Matlab Intel i7 2.67GHz with 4GB memory Benchmarks Nonlinear and large-scale circuits Power distribution networks IBM power grid testcases[Nassif 08] Design Category # R # C # Trans. Size Stiffness 1.1x103 D1 16bit adder 723 34 448 579 5.4x106 D2 ALU 13.6K 4.3K 6502 10K R min 1.6x106 D3 IO 1.26M 34.6K 1461 630K R max 2.6x105 D4 Power grid 10.4M 8.6M 0 12M 32 generalized eigenvalues of (G, C)

Settings of Experiments RC tanks for PCB and package Environment Implemented in Matlab Intel i7 2.67GHz with 4GB memory Benchmarks Nonlinear and large-scale circuits Power distribution networks IBM power grid testcases[Nassif 08] Area (mm2) Design # R # C # L Size Stiffness 0.352 8.7x109 P1 23K 15K 15K 45.7K 1.402 8.3x109 P2 348K 228K 228K 688K 2.802 1.0x1010 P3 1.46M 0.97M 0.97M 2.90M 5.002 1.0x1010 P4 3.75M 2.47M 2.47M 7.40M 33

Settings of Experiments Environment Implemented in Matlab Intel i7 2.67GHz with 4GB memory Benchmarks Nonlinear and large-scale circuits Power distribution networks IBM power grid testcases[Nassif 08] Design # R # C # L # I # V Size Stiffness 3.5x1012 ibmpg2t 245K 36K 330 36K 330 164K 3.4x1011 ibmpg3t 1.60M 201K 955 201K 955 1M 2.5x1011 ibmpg4t 1.83M 265K 962 266K 962 1.2M 4.7x1011 ibmpg5t 1.55M 473K 277 473K 539K 2.1M 3.8x1011 ibmpg6t 2.41M 761K 281 761K 836K 3.2M 34

Nonlinear and Large-scale Circuits Matrix exponential method (MEXP) Krylov subspace approximation Restarted scheme and parallel SpMV on GPU Trapezoidal method (TRAP) same adaptive scheme as MEXP Parallel SpMV Design Size time m TRAP MEXP-Krylov speedup D1 579 100ps 20 671.4s 408.7s 1.64X D2 10K 100ps 30 3,085.91s 982.14s 3.14X D3 630K 100ps 30 8,053.45s 535.92s 15.05X D4 12M 1ns 20 fails 629.56 n/a 35

Power Distribution Networks Simulate long time span (1 s) for step response One LU factorization averaged by forward/backward substitutions MEXP with rational basis adaptively scales h/ TRAP uses predetermined step size adaptive & large step size MEXP Rational ( = 10-10) TRAP (h = 10ps) Design LU(s) Total LU(s) Total Speedup P1 0.67 44.85m 0.68 2.86m 15.73X P2 15.60 15.43h 15.48 54.57m 16.96X P3 91.60 76.92h 93.28 4.30h 17.91X P4 293.81 203.64h 298.83 11.26h 18.08X 36

Power Distribution Networks 37

IBM Testcases Widely adopted benchmarks Many input current sources Same MEXP with rational basis and TRAP ill alignment MEXP Rational ( = 10-10) TRAP (h = 10ps) Design LU(s) Total(s) LU(s) Total(s) Speedup ibmpg2t 1.31 48.19 1.29 41.81 1.15X ibmpg3t 18.05 493.97 18.41 413.90 1.19X ibmpg4t 30.32 675.78 31.01 229.13 2.95X ibmpg5t 16.16 657.13 16.48 649.97 1.01X ibmpg6t 23.99 965.53 34.60 915.62 1.05X 38

IBM Testcases 39

IBM Testcases Applying simple grouping each group of inputs has the same pivot points 6X speedup on average MEXP Rational ( = 10-10) TRAP (h = 10ps) Design LU(s) Total (s) # Group LU (s) Total (s) Speedup ibmpg2t 1.31 48.19 25 1.29 7.93 6.77X ibmpg3t 18.05 493.97 25 18.41 86.24 6.08X ibmpg4t 30.32 675.78 4 31.01 124.16 5.73X ibmpg5t 16.16 657.13 25 16.48 111.97 5.44X ibmpg6t 23.99 965.53 25 34.60 166.34 5.80X 40

Conclusions Emerging challenges in the circuit simulation scalability and performance Matrix exponential method accuracy, adaptivity and stability regular and rational Krylov subspace approximation Effectiveness of matrix exponential method Simulate a large-scale circuit with 12M nodes Nonlinear circuits: 6.61X speedup on average Impulse response for PDNs: 15X speedup IBM testcases: 6X speedup using input grouping 41

Future Works Variant basis in Krylov subspace inverted, extended basis Model Order Reduction and matrix exponential method both exploiting Krylov subspace utilizing well-developed MOR to MEXP Hybrid simulation via matrix exponential handle thermal, mechanical phenomena with FEM 42

Thank you! 43

Where are we? Trade off between stability and performance stability high Matrix Exponential Method [Weng et. al. 11] Backward Euler Trapezoidal Method(SPICE) ETD in numerical community: [Saad 92] [Ban et. al. 11] [Aluffi-Pentini et. al. 03] [Hochbruck et. al. 97] SILCA [Li & Shi, 03] Waveform Relaxation [E Lelarasmee et. al, 82] Domain Decomposition [K. Sun et. al., 07] Tailor for circuit simulation: Adaptive step control Scaling effect Nonlinear device Parallelization ACES [Devgan & Rohrer, 97] Telescopic [Dong & Li, 10] LIM [J. E. Schutt-Aine, 01] Forward Euler computational effort low high 44

Adaptive Step Control Typical circuit behavior error budget Err larger h err total h total T smaller h 45

Adaptive Step Size Strategy Adjustment of step size Krylov subspace approximation require only to scale Hm: A Hm re-calculate eHm backward Euler (C/h+G) changes and needs to solve linear system again Strategy: maximize step size with a given error budget Errtotal error are from Krylov space method and linearization A W x ( ) t h 0 J + = x I ( ) 0 t h e n e 2 = + + 1 x C G C x Bu ( / ) / ( ) h h + + 1 1 n n n L NL total Err Err total Err h Err h krylov nonlinear total T total T 46

Nonlinear Formulation Decouple nonlinear and linear components d ( ( ) ) ( ) h + = + + + + A A ( ) h h x x F x b ( ) ( ) t h e t e t t 0 ( ( ) t ) 1 C i x approximate eAF + ) ( + ) ( ) b b ( ) ( ) h 2 t h t ( ( ) ) ( ( ( ) t ) + = + + + + + A A A 1 2 h h h x F x x F x A b A I A ( ) ( ) ( ) ( ) t h t h e t e t e h h calculate Jacobian matrix constant during Newton s iteration J(F) in MEXP has less non-zeros ( ) 2 + + C G G / h C G MEXP: BE: L NL NL h 47

Only Inverted Rational basis A-1 K( A-1, v) = {v, A-1v, , A-mv} requires more m and smaller h Image{h } only inverted after shifted-and-inverted -1/ min Real{h } smaller spectrum 48

Different needs large m 49

Different 50

Circuit Simulation via Matrix Exponential Method

Download Presentation

Presentation Transcript

Related

More Related Content