Maximum Fanout-Free Window Enumeration

maximum fanout free window enumeration towards n.w
1 / 44
Embed
Share

This study explores the concept of Maximum Fanout-Free Window Enumeration towards Local Multi-Output Sub-structure Synthesis. It discusses the motivation, algorithms, applications, and implications of cutting-edge techniques in the field.

  • Maximum 4 words in a tag. scalability
  • synthesis
  • algorithms
  • circuits

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Maximum Fanout-Free Window Enumeration: Towards the Local Multi-Output Sub-structure Synthesis Ruofei Tang*, Xuliang Zhu, Lei Chen, Xing Li, Xin Huang, Mingxuan Yuan, Weihua Sheng and Jianliang Xu 17/3/2025

  2. Outline Preliminaries Motivation Maximum Fanout-Free Windows & Enumeration Algorithms Experimental Results Application - Rewriting 17/3/2025 2

  3. And-Inverter Graph (AIG) A classical and scalable representation of combinational circuits Primary Output Solid Line for Direct Pass of Signal Logic AND Gate Dashed Line for Negation of Signal Primary Input 17/3/2025 3

  4. Cuts A cone-shaped area in AIG The same in graph algorithms Root 17/3/2025 4

  5. Cuts A cone-shaped area in AIG Denoted as (r, L) = (17, {10, 12, 15}) (r, L) is called k-feasible if |L| k Root Leaf-nodes (input nodes) 17/3/2025 5

  6. Applications of Cuts Cuts are proposed for LUT-mapping. Widely used for efficient enumeration methods I4 I3 I1 I2 I3 I3 I2 I1 I2I3 I1 I4I5 I6 I1 I2 I1 I2 I3 I4 LUT Mapping Balancing Rewriting 17/3/2025 6

  7. Motivation Cuts only supports a single-output substructure Scale of cuts is small Hard to apply multiple output synthesis algorithms on cuts. 17/3/2025 7

  8. Maximum Fanout-Free Window (MFFW) Expand upon cuts and enable multiple outputs An MFFW contains all nodes that can be expressed by input nodes 10 10 9 9 8 8 7 7 5 6 5 6 I1 I2 I3 I4 I1 I2 I3 I4 17/3/2025 8

  9. MFFW Enumeration MFFW need efficient enumeration methods for synthesis Propose two enumeration methods: Dynamic and Static Cut Synthesis Algorithm Enumeration AIG MFFW Enumeration 17/3/2025 9

  10. Dynamic Enumeration Step 1: Select a set of leaf-nodes as inputs of MFFW 10 9 8 7 5 6 I1 I2 I3 I4 17/3/2025 10

  11. Dynamic Enumeration Step 2. Initialize set S = {}, queue Q = leaf-nodes 10 9 Q = { I1, I2, I3, I4 } Nodes to process 8 S = {} 7 Processed Nodes 5 6 I1 I2 I3 I4 17/3/2025 11

  12. Dynamic Enumeration Step 3. Process until Q = {}, S contains all nodes of MFFW in the end Q = { I1, I2, I3, I4 } 10 9 fanout(I1)= {5, 7} 8 Another fanin in S? If Y: Push node to Q Move I1 to S 7 5 6 I1 I2 I3 I4 S = {I1} 17/3/2025 12

  13. Dynamic Enumeration Step 3. Process until Q = {}, S contains all nodes of MFFW in the end Q = {I2, I3, I4, 5} 10 9 fanout(I2)= {5} 8 Another fanin in S? If Y: Push node to Q Move I2 to S 7 5 6 I1 I2 I3 I4 S = {I1, I2} 17/3/2025 13

  14. Dynamic Enumeration Step 3. Process until Q = {}, S contains all nodes of MFFW in the end Q = {I3, I4, 5, 7} 10 9 fanout(I3)= {6, 7} 8 Another fanin in S? If Y: Push node to Q Move I3 to S 7 5 6 I1 I2 I3 I4 S = {I1, I2, I3} 17/3/2025 14

  15. Dynamic Enumeration Step 3. Process until Q = {}, S contains all nodes of MFFW in the end 10 9 Q = {I4, 5, 7, 6} S = {I1, I2, I3, I4} 8 7 5 6 I1 I2 I3 I4 17/3/2025 15

  16. Dynamic Enumeration Step 3. Process until Q = {}, S contains all nodes of MFFW in the end 10 9 Q = {5, 7, 6, 8} S = {I1, I2, I3, I4, 5} 8 7 5 6 I1 I2 I3 I4 17/3/2025 16

  17. Dynamic Enumeration Step 3. Process until Q = {}, S contains all nodes of MFFW in the end 10 9 Q = {7, 6, 8} S = {I1, I2, I3, I4, 5, 7} 8 7 5 6 I1 I2 I3 I4 17/3/2025 17

  18. Dynamic Enumeration Step 3. Process until Q = {}, S contains all nodes of MFFW in the end 10 9 Q = {6, 8, 10} S = {I1, I2, I3, I4, 5, 7, 6} 8 7 5 6 I1 I2 I3 I4 17/3/2025 18

  19. Dynamic Enumeration Step 3. Process until Q = {}, S contains all nodes of MFFW in the end 10 9 Q = {8, 10, 9} S = {I1, I2, I3, I4, 5, 7, 6, 8} 8 7 5 6 I1 I2 I3 I4 17/3/2025 19

  20. Dynamic Enumeration Step 3. Process until Q = {}, S contains all nodes of MFFW in the end 10 9 Q = {10, 9} S = {I1, I2, I3, I4, 5, 7, 6, 8} 8 7 5 6 I1 I2 I3 I4 17/3/2025 20

  21. Dynamic Enumeration Step 3. Process until Q = {}, S contains all nodes of MFFW in the end 10 9 Q = {} 8 S = {I1, I2, I3, I4, 5, 7, 6, 8, 9, 10} 7 5 6 I1 I2 I3 I4 17/3/2025 21

  22. Problem Consider a node with super large fanout Have to enumerate all fanout Tend to appear in many MFFWs, have to enumerate it many times 17/3/2025 22

  23. Solution: Structural Hashing Each node is assigned an unique number Each node will be recorded to a hash table depending on 4 elements n 3 4 1 2 a b strash( number(a) ,number(b) , is_compl(a) , is_compl(b) ) 17/3/2025 23

  24. Dynamic Enumeration Step 3. Process until Q = {}, S contains all nodes of MFFW in the end Two choices: Enumerate fanout Query structural hashing table S = { I1, I2, I3, I4, 7} Q = {5, 6} 10 9 8 For each n ?, check strash(n,5), strash( n,5), strash(n, 5), strash( n, 5) existence in AIG fanout(5) = {8, 10} Check another fanin of {8, 10} 7 5 6 depend on the cost I1 I2 I3 I4 17/3/2025 24

  25. Static Enumeration Lemma: Given a set of MFFW input nodes S, for any cut c=(r, L) where ? ?, r belongs to the MFFW e.g. 10 9 8 7 5 6 I1 I2 I3 I4 17/3/2025 25

  26. Static Enumeration Lemma: Given a set of MFFW input nodes S, for any cut c=(r, L) where ? ?, r belongs to the MFFW e.g. 10 9 Cut: (5, {I1, I2}) 8 7 5 6 I1 I2 I3 I4 17/3/2025 26

  27. Static Enumeration Lemma: Given a set of MFFW input nodes S, for any cut c=(r, L) where ? ?, r belongs to the MFFW e.g. 10 9 Cut: (5, {I1, I2}) 8 Cut: (6, {I3, I4}) 7 5 6 I1 I2 I3 I4 17/3/2025 27

  28. Static Enumeration Lemma: Given a set of MFFW input nodes S, for any cut c=(r, L) where ? ?, r belongs to the MFFW e.g. 10 9 Cut: (5, {I1, I2}) 8 Cut: (6, {I3, I4}) 7 Cut: (8, {I1, I2, I4}) 5 6 I1 I2 I3 I4 17/3/2025 28

  29. Static Enumeration Step 1: Compute all k-feasible cuts for each node Cache them in a hash table, indexed by leaf-nodes 3-input: (10, {5, 7, I4}) hash: {5, 7, I4} {9} 4-input: (9, {I1, I2, I3, I4}) hash: {I1, I2, I3, I4} {9} Compute All 4-feasible cut of node 9: 10 9 8 7 5 6 I1 I2 I3 I4 17/3/2025 29

  30. Static Enumeration Step 1: Compute all k-feasible cuts for each node Cache them in a hash table, indexed by leave-nodes 3-input: (10, {5, 7, I4}) hash: {5, 7, I4} {9} 4-input: (9, {I1, I2, I3, I4}) hash: {I1, I2, I3, I4} {9} Compute All 4-feasible cut of node 9: 10 9 8 7 4-input: (10, {I1, I2, I3, I4}) hash: {I1, I2, I3, I4} {9, 10} 5 6 Compute All 4-feasible cut of node 10: I1 I2 I3 I4 17/3/2025 30

  31. Static Enumeration Step 2: Select a set of cut leaf-nodes with K elements 10 9 8 7 5 6 I1 I2 I3 I4 17/3/2025 31

  32. Static Enumeration Step 3: Enumerate subsets of selected leaf-nodes Add nodes by hash table Query {I1, I2} {5} Query {I1, I3} {7} 10 9 8 Query {I3, I4} {6} Query {I1, I2, I4} {8} 7 5 6 Query {I1, I2, I3, I4} {9, 10} I1 I2 I3 I4 17/3/2025 32

  33. Differences: Static & Dynamic Enumeration Static Enumeration: o All cuts are pre-computed and cached o Limited modification on structure of AIG during enumeration o Works significantly faster on some circuits Dynamic Enumeration: o All cuts are computed during enumeration o Arbitrary modification on structure of AIG during enumeration 17/3/2025 33

  34. Experiments Setup Relative large circuits form EPFL Combinational Suite and IWLS 2005 oAll the circuits have more than 10,000 nodes Three enumeration algorithms: oBasic: Dynamic Enumeration without structural hash oDynamic oStatic 17/3/2025 34

  35. Experimental Results K=4 Dynamic time(s) K=5 K=6 Benchmarks Basic Static time(s) Basic time(s) Dynamic Static Basic time(s) 17.31 465.19 83.94 75.39 46.40 43.50 43.04 136.75 133.71 262.13 260.94 144.69 559.53 Dynamic time(s) 15.91 442.62 79.28 72.42 47.95 45.70 46.72 147.81 140.97 261.49 250.06 73.82 271.80 208.94 23.10 Static time(s) 32.58 496.99 82.82 74.38 49.29 27.72 27.90 96.68 98.18 178.40 177.21 60.31 304.31 281.80 31.23 6,959.78 8,680.80 9,911.83 name div hyp log2 multiplier square b17 b17_1 b18 b18_1 b19 b19_1 des_perf leon2 netcard vga_lcd sixteen twenty twentythree size 57,247 214,335 32,060 27,062 18,484 52,323 52,166 133,081 133,391 261,297 261,371 83,792 792,039 864,603 134,866 16,216,836 20,732,893 23,339,737 time(s) 0.69 4.47 1.10 0.78 0.34 1.60 1.61 4.13 4.08 8.52 8.56 3.16 46.51 524.92 33.36 time(s) 3.04 29.15 15.59 13.10 6.81 14.06 13.83 37.16 35.84 72.70 74.31 11.34 56.98 46.19 6.10 16,469.18 19,814.40 22,514.85 time(s) 4.96 35.38 12.90 10.50 6.01 4.73 4.78 14.86 14.63 27.77 28.85 9.31 63.29 57.88 7.78 2,147.21 2,571.32 2,867.43 0.55 3.49 0.92 0.72 0.37 1.40 1.40 3.56 3.54 7.01 7.05 1.84 9.83 9.17 1.39 0.96 4.77 0.94 0.67 0.40 0.90 0.89 2.51 2.50 5.08 5.09 1.60 13.88 14.20 2.06 583.40 786.37 1,056.62 INF means over 10 hours 3.32 32.58 17.83 14.58 6.33 14.05 13.81 37.57 35.19 75.72 75.26 20.43 156.68 1,366.95 89.61 INF 251.44 INF INF INF 2,691.97 3,932.82 4,471.13 INF INF INF INF INF INF INF INF INF 17/3/2025 35

  36. Application: Rewriting MFFW allows more nodes to be replaced in one replacement than cuts 17/3/2025 36

  37. Rewriting Framework Original MFFW Calculate canonical forms of outputs truth tables match MFFW database Key enumerate extract reply Found MFFW Compare AIG Better MFFW AIG Optimize Update Replace Optimized MFFW 17/3/2025 37

  38. Experimental Results for Rewriting State-of-the-art Methods Benchmark Our Method Drw Rewriting Window Rewriting Time 0.00 0.01 0.01 0.61 210,402 0.12 0.00 0.09 0.02 0.10 0.06 0.01 0.00 0.00 0.00 0.00 0.00 0.10 0.00 0.00 0.05 0.06 Name Adder Size 1,020 3,336 57,247 214,335 32,060 2,865 27,062 5,416 24,618 18,484 11,839 Time 0.01 0.02 0.69 2.29 0.33 0.02 0.25 0.06 0.79 0.16 0.09 0.00 0.00 0.00 0.01 0.00 0.34 0.01 0.00 0.14 0.26 Size 1,020 3,141 41,197 213,149 29,761 2,862 24,764 5,191 18,481 17,758 11,839 Improv. 0.00% 5.85% 28.04% 0.55% 7.17% 0.10% 8.49% 4.15% 24.93% 3.93% 0.00% 1.30% 29.89% 0.00% 4.62% 14.62% 1.16% 12.88% 4.28% 24.52% 8.83% Size 892 3,141 41,267 Improv. 12.55% 5.85% 27.91% 1.83% 5.51% 0.10% 5.01% 4.76% 23.96% 5.31% 0.00% 0.00% 33.33% 0.00% 0.15% 0.77% 0.47% 19.43% 5.06% 30.27% 9.13% Time 0.03 0.07 1.15 4.13 0.78 0.06 0.06 0.15 0.51 0.38 0.21 0.04 0.03 0.03 0.04 0.03 0.58 0.05 0.03 0.27 0.43 Size 892 3,141 40,432 206,794 29,494 2,862 24,279 5,092 18,369 17,081 11,839 Improv. 12.55% 5.85% 29.37% 3.52% 8.00% 0.10% 10.28% 5.98% 25.38% 7.59% 0.00% 3.03% 44.25% 0.00% 4.62% 16.54% 1.49% 48.98% 5.06% 37.41% 13.50% Barrel shifter Divisor Hypotenuse Log2 Max Multiplier Sine Square-root Square Arbiter Cavlc Ctrl Dec I2c Int2float Mem_ctrl Priority Router Voter Average 30,294 2,862 25,706 5,158 18,719 17,502 11,839 693 116 304 1,340 258 46,617 788 244 9,593 693 174 304 1,342 260 46,836 978 257 13,758 684 122 304 1,280 222 46,291 852 246 10,384 672 97 304 1,280 217 46,137 499 244 8,611 17/3/2025 38

  39. Summary Maximum Fanout Free Window: o A novel structure for local multi-output synthesis algorithms o Two enumeration methods, Static and Dynamic o Experimental results show the efficiency of the algorithms MFFW-based Rewriting: o Experimental results show the effectiveness of our proposed rewriting framework 17/3/2025 39

  40. References 1. Martinello O, Marques F S, Ribas R P, et al. KL-cuts: a new approach for logic synthesis targeting multiple output blocks[C]//2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010). IEEE, 2010: 777-782. 2. Chatterjee, S., Mishchenko, A., & Brayton, R. (2006, November). Factor cuts. In Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design (pp. 143-150). Mishchenko, A., Chatterjee, S., & Brayton, R. (2006, July). DAG-aware AIG rewriting a fresh look at combinational logic synthesis. In Proceedings of the 43rd annual Design Automation Conference (pp. 532-535). 3. 4. Yang, W., Wang, L., & Mishchenko, A. (2012, November). Lazy man's logic synthesis. In Proceedings of the International Conference on Computer-Aided Design (pp. 597-604). 5. Riener, H., Lee, S. Y., Mishchenko, A., & De Micheli, G. (2022, January). Boolean rewriting strikes back: Reconvergence- driven windowing meets resynthesis. In 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC) (pp. 395-402). IEEE. 6. Huang, Z., Wang, L., Nasikovskiy, Y., & Mishchenko, A. (2013, December). Fast Boolean matching based on NPN classification. In 2013 International Conference on Field-Programmable Technology (FPT) (pp. 310-313). IEEE. 7. Amar , L., Gaillardon, P. E., & De Micheli, G. (2015). The EPFL combinational benchmark suite. In Proceedings of the 24th International Workshop on Logic & Synthesis (IWLS) (No. CONF). 8. Brayton, R., & Mishchenko, A. (2010). ABC: An academic industrial-strength verification tool. In Computer Aided Verification: 22nd International Conference, CAV 2010, Edinburgh, UK, July 15-19, 2010. Proceedings 22 (pp. 24-40). Springer Berlin Heidelberg. 17/3/2025 40

  41. Thank you! 17/3/2025 41

  42. Appendix 1 Benchmark Adder Size Cut node # MFFW node # MFFW output # 1,020 3,336 29,040 214,306 31,707 2,865 27,060 5,372 24,506 18,483 11,839 3.93 3.24 3.57 4.35 5.03 3.23 5.10 4.66 3.35 4.79 3.00 3.23 4.23 3.00 3.11 3.15 3.24 3.59 3.99 4.52 10.47 9.23 8.47 9.34 9.10 8.14 9.60 9.04 8.00 9.44 7.13 11.29 14.73 11.56 10.01 10.01 8.16 7.78 8.86 10.10 2.73 4.49 5.13 3.81 4.30 4.28 3.81 4.61 4.87 3.87 5.02 8.42 9.95 10.02 6.33 6.33 5.49 4.36 4.08 4.08 Barrel shifter Divisor Hypotenuse Log2 Max Multiplier Sine Square-root Square Arbiter Cavlc Ctrl Dec I2c Int2float Mem_ctrl Priority Router Voter 690 169 304 1,321 258 46,719 978 257 11,952 17/3/2025 42

  43. Appendix 2 dot-product vs. set insert {3, 4} n a b {1, 3, 4} {2, 3, 4} 17/3/2025 43

  44. Appendix 3 LUT-Mapping MFFW 1 FFW 1 Overlap Zone MFFW 2 MFFW 2 Cache Node Index Cache MFFWs 17/3/2025 44

More Related Content