
Customizable Application-Specific NOC Generation
This talk describes a flexible, application-specific NOC generator built with CHISEL, targeting accurate, high-performance, power-efficient designs and parametric design space exploration. It covers the Chisel workflow developed at UC Berkeley, the customizable parameters of the Network-on-Chip Generator, the parameterized router, and a 2D mesh example.
Presentation Transcript

Synthesizable, Application-Specific NOC Generation using CHISEL
Maysam Lavasani, Eric Chung, John Davis
The University of Texas at Austin; Microsoft Research
Acknowledgement: Jonathan Bachrach and the rest of the CHISEL team.

Problem/motivation
Goal: flexible, app-specific NOC generation
- Accuracy
- Performance
- Power
- Design space exploration
- Support for parametric design
Available solutions
- C-based software simulation (e.g. Orion): inaccurate
- RTL: too low-level
- Bluespec: not free
- Web-based solutions: closed source
This talk: our experience building NOCs w/ CHISEL

Chisel Workflow
- Developed @ UC Berkeley
- Open-source
- Built on top of Scala
  - Object-oriented
  - Functional
Tool flow (tool input/output):
- Inputs: hardware in Chisel, test-bench code in Scala
- Chisel compiler emits C++ simulation code and Verilog code
- C++ simulation code feeds C++ simulation
- Verilog code feeds Verilog simulation and the synthesis flow
- Outputs: functional/performance results

Network-on-Chip Generator
Customizable features
- Topology (e.g., mesh, ring, torus)
- Buffer sizes
- Link widths
- Routing
Targeted for
- FPGA (evaluated)
- ASIC (future work)
Fully synthesizable
- Xilinx ISE 13+
[Figure: example networks of routers (R), including a mix of big and small routers]
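
The customizable features listed on this slide can be pictured as one parameter record that a generator consumes. The sketch below is plain Scala, not the generator's actual interface: `NocParams`, `NocConfig`, `bufferDepth`, `linkWidthBits`, and `numLinks` are hypothetical names introduced for illustration, and the link counts assume the usual textbook definitions of mesh, torus, and ring.

```scala
object NocParams {
  // Hypothetical parameter bundle; names are illustrative only.
  sealed trait Topology
  case object Mesh extends Topology
  case object Ring extends Topology
  case object Torus extends Topology

  final case class NocConfig(
    topology: Topology,
    rows: Int,
    cols: Int,
    bufferDepth: Int,   // entries per input buffer (assumed unit)
    linkWidthBits: Int  // bits per link
  ) {
    require(rows > 0 && cols > 0, "grid dimensions must be positive")
    require(bufferDepth > 0 && linkWidthBits > 0, "sizes must be positive")

    def numRouters: Int = rows * cols

    // Number of bidirectional router-to-router links.
    def numLinks: Int = topology match {
      case Mesh  => rows * (cols - 1) + (rows - 1) * cols
      // Assumes rows >= 3 and cols >= 3 so wraparound links are distinct.
      case Torus => 2 * rows * cols
      // Routers form a single cycle; assumes at least 3 routers.
      case Ring  => rows * cols
    }
  }

  // Design space exploration: sweep link width on a fixed 4x4 mesh.
  val sweep: Seq[NocConfig] =
    Seq(8, 16, 32).map(w => NocConfig(Mesh, 4, 4, bufferDepth = 4, linkWidthBits = w))
}
```

Keeping every knob in one immutable record makes a sweep a simple `map` over parameter values, which is the style of exploration the slide describes.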

Parameterized Router
[Figure: router block diagram. Labeled blocks: input ports, output ports, State, Route logic, Stored Route, Mediator, RR Arbiter, Switch]
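
Assuming "RR Arbiter" on this slide stands for a round-robin arbiter, its behavior can be sketched as a small software model. This is plain Scala, not Chisel hardware and not the authors' implementation; `RoundRobin.grant` and its arguments are hypothetical names.

```scala
object RoundRobin {
  // Software model of round-robin arbitration (an assumption about what
  // the slide's "RR Arbiter" block does).
  // requests(i) is true when input i wants the shared output.
  // last is the index granted most recently, or -1 if none yet.
  // Returns the next input to grant, or None if nobody is requesting.
  def grant(requests: Seq[Boolean], last: Int): Option[Int] = {
    val n = requests.length
    // Search starts just after the previous winner and wraps around,
    // so the previous winner has the lowest priority this round.
    (1 to n).map(k => (last + k) % n).find(i => requests(i))
  }
}
```

For example, with inputs 0 and 2 requesting and input 0 granted last, the model grants input 2 next; once input 2 has been served, priority wraps back to input 0.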
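
2D Mesh Example in Chisel
    // One router per grid position: routers(i)(j) is row i, column j.
    // The first argument, 5, is presumably the port count
    // (north, south, east, west, cpu).
    val routers = Range(0, numRows, 1).map(i =>
      Range(0, numColumns, 1).map(j =>
        new MyRouter(5, routerID(i, j), XYrouting)))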
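
The slide passes `routerID(i, j)` and `XYrouting` to each router without defining them. A minimal sketch of what such helpers might look like follows, in plain Scala rather than Chisel. Everything here is an assumption: row-major ID numbering, a three-argument `routerID`, the `Port` names, and the orientation (columns grow eastward, rows grow southward). XY routing itself is the standard dimension-order scheme: correct the X (column) offset first, then the Y (row) offset.

```scala
object XYModel {
  sealed trait Port
  case object North extends Port
  case object South extends Port
  case object East extends Port
  case object West extends Port
  case object Cpu extends Port

  // Assumed row-major numbering; the slides do not define routerID.
  def routerID(row: Int, col: Int, numColumns: Int): Int =
    row * numColumns + col

  // Dimension-order (XY) routing decision at router cur for a packet
  // headed to dst. Positions are (row, column) pairs.
  // Assumes column index grows eastward and row index grows southward.
  def xyRoute(cur: (Int, Int), dst: (Int, Int)): Port = {
    val (r, c) = cur
    val (dr, dc) = dst
    if (dc > c) East        // fix the column first
    else if (dc < c) West
    else if (dr > r) South  // then fix the row
    else if (dr < r) North
    else Cpu                // arrived: deliver to the local port
  }
}
```

Because a packet never turns from the Y dimension back into the X dimension, this routing function is deadlock-free on a mesh, which is one reason it is a common default.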

2D Mesh Example in Chisel
    for (i <- 0 until numRows) {
      for (j <- 1 until numColumns) {
        routers(i)(j).io.ins(south)  <> routers(i)(j-1).io.outs(north)
        routers(i)(j).io.outs(south) <> routers(i)(j-1).io.ins(north)
      }
    }

2D Mesh Example in Chisel
    // i indexes the first dimension of routers (numRows entries),
    // j the second (numColumns entries).
    for (j <- 0 until numColumns) {
      for (i <- 1 until numRows) {
        routers(i)(j).io.ins(west)  <> routers(i-1)(j).io.outs(east)
        routers(i)(j).io.outs(west) <> routers(i-1)(j).io.ins(east)
      }
    }

2D Mesh Example in Chisel
    // Expose each router's cpu port on the top-level tap interface.
    for (i <- 0 until numRows) {
      for (j <- 0 until numColumns) {
        io.tap(routerID(i, j)).deq <> routers(i)(j).io.outs(cpu)
        io.tap(routerID(i, j)).enq <> routers(i)(j).io.ins(cpu)
      }
    }

2D Mesh Example in Chisel
Fits on 1 page!
    // One router per grid position: routers(i)(j) is row i, column j.
    val routers = Range(0, numRows, 1).map(i =>
      Range(0, numColumns, 1).map(j =>
        new MyRouter(5, routerID(i, j), XYrouting)))

    // Links between routers whose first index differs by one.
    for (j <- 0 until numColumns) {
      for (i <- 1 until numRows) {
        routers(i)(j).io.ins(west)  <> routers(i-1)(j).io.outs(east)
        routers(i)(j).io.outs(west) <> routers(i-1)(j).io.ins(east)
      }
    }

    // Links between routers whose second index differs by one.
    for (i <- 0 until numRows) {
      for (j <- 1 until numColumns) {
        routers(i)(j).io.ins(south)  <> routers(i)(j-1).io.outs(north)
        routers(i)(j).io.outs(south) <> routers(i)(j-1).io.ins(north)
      }
    }

    // Expose each router's cpu port on the top-level tap interface.
    for (i <- 0 until numRows) {
      for (j <- 0 until numColumns) {
        io.tap(routerID(i, j)).deq <> routers(i)(j).io.outs(cpu)
        io.tap(routerID(i, j)).enq <> routers(i)(j).io.ins(cpu)
      }
    }
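
One way to sanity-check wiring loops like these before elaborating any hardware is to enumerate the neighbor pairs in plain Scala and compare against the expected link count for a mesh. The sketch below is an illustrative model only; `MeshLinks` and its methods are hypothetical names, and it assumes the first index ranges over rows and the second over columns.

```scala
object MeshLinks {
  type Node = (Int, Int) // (row, column)

  // Neighbor pairs within a row (second index differs by one).
  def rowLinks(numRows: Int, numColumns: Int): Seq[(Node, Node)] =
    for (i <- 0 until numRows; j <- 1 until numColumns)
      yield ((i, j), (i, j - 1))

  // Neighbor pairs within a column (first index differs by one).
  def columnLinks(numRows: Int, numColumns: Int): Seq[(Node, Node)] =
    for (i <- 1 until numRows; j <- 0 until numColumns)
      yield ((i, j), (i - 1, j))

  def allLinks(numRows: Int, numColumns: Int): Seq[(Node, Node)] =
    rowLinks(numRows, numColumns) ++ columnLinks(numRows, numColumns)
}
```

For a mesh with R rows and C columns, the expected total is R(C-1) + (R-1)C bidirectional links. Running the model on a non-square grid such as 3 by 4 is a quick way to catch loop bounds that only happen to work when R equals C.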

Application Case Study: K-means
Cluster N points in D-dim space into C clusters
1. Pick C initial centers
2. Assign N points to nearest center
3. Compute new centers
4. Max iterations or converged? No: go back to step 2. Yes: done.
Example shown: N = 12, C = 3, D = 2
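
The flow on this slide is the standard K-means iteration, and a compact software reference can make the steps concrete. The sketch below is plain Scala and is not the accelerator's implementation; the object and method names are hypothetical, initial centers are passed in rather than picked, distance is squared Euclidean, and a center that attracts no points is assumed to keep its old position.

```scala
object KMeans {
  type Point = Vector[Double]

  // Squared Euclidean distance between two D-dimensional points.
  def dist2(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Index of the center closest to p.
  def nearest(p: Point, centers: Seq[Point]): Int =
    centers.indices.minBy(c => dist2(p, centers(c)))

  // Coordinate-wise mean of a non-empty group of points.
  def mean(ps: Seq[Point]): Point =
    ps.transpose.map(col => col.sum / ps.length).toVector

  // Repeats assign / recompute until centers stop moving or maxIter is hit.
  def run(points: Seq[Point], init: Seq[Point], maxIter: Int): Seq[Point] = {
    var centers = init
    var iter = 0
    var converged = false
    while (iter < maxIter && !converged) {
      // Assign every point to its nearest center.
      val groups = points.groupBy(p => nearest(p, centers))
      // Compute new centers; an empty cluster keeps its old center.
      val next = centers.indices.map(c =>
        groups.get(c).map(mean).getOrElse(centers(c)))
      converged = next == centers
      centers = next
      iter += 1
    }
    centers
  }
}
```

The assignment step dominates the work (N times C distance computations per iteration) and is independent per point, which is what makes the algorithm a natural fit for multiple cores.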

Parallel K-means accelerator
[Figure: accelerator block diagram. Blocks: three cores (nearest distance), a streamer DMA, a reduction core, and memory banks, connected by routers (R) of the customized Network-on-Chip]
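
The block diagram suggests a split between cores that find the nearest center and a reduction core that combines their results. One plausible reading, sketched below as a plain Scala model, is that each core produces per-cluster partial sums and counts for its share of the points, and the reduction core merges them into new centers. This division of labor is an assumption, not something the slide states, and all names (`ParallelKMeansStep`, `Partial`, `coreStep`, `reduce`, `step`) are hypothetical.

```scala
object ParallelKMeansStep {
  type Point = Vector[Double]

  // Per-cluster running total and point count produced by one core.
  final case class Partial(total: Point, count: Int)

  def dist2(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  def add(a: Point, b: Point): Point =
    a.zip(b).map { case (x, y) => x + y }

  // Work of one nearest-distance core on its share of the points.
  def coreStep(points: Seq[Point], centers: Seq[Point]): Vector[Partial] = {
    val dims = centers.head.length
    val zero = Vector.fill(centers.length)(Partial(Vector.fill(dims)(0.0), 0))
    points.foldLeft(zero) { (acc, p) =>
      val c = centers.indices.minBy(i => dist2(p, centers(i)))
      acc.updated(c, Partial(add(acc(c).total, p), acc(c).count + 1))
    }
  }

  // Work of the reduction core: merge partials and emit new centers.
  // A cluster with no points keeps its old center.
  def reduce(partials: Seq[Vector[Partial]], old: Seq[Point]): Seq[Point] = {
    val merged = partials.reduce { (a, b) =>
      a.zip(b).map { case (x, y) =>
        Partial(add(x.total, y.total), x.count + y.count)
      }
    }
    merged.zip(old).map { case (m, o) =>
      if (m.count == 0) o else m.total.map(_ / m.count)
    }
  }

  // One K-means iteration with the points split across numCores cores.
  def step(points: Seq[Point], centers: Seq[Point], numCores: Int): Seq[Point] = {
    require(points.nonEmpty && centers.nonEmpty && numCores > 0)
    val chunk = math.max(1, (points.length + numCores - 1) / numCores)
    val partials = points.grouped(chunk).map(coreStep(_, centers)).toVector
    reduce(partials, centers)
  }
}
```

In this model only the centers and the small per-cluster partials cross between cores and the reduction stage, so traffic of that kind would be what the interconnect carries each iteration.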

Performance Sensitivity to NOC
[Figure: "K-means and Mesh Performance". Y-axis: speedup (0 to 4.5). Series: number of cores (1, 2, 4). X-axis: link width (8, 16, 32), grouped by number of clusters]

My experience - positives
- Chisel (V.1.0) improves productivity
  - Bulk interfaces
  - Parameterized classes
  - Type inference reduces errors
  - Functional features
- Faster C++ based simulation
- Open source (BSD license)
- UCB support
- Tested on large-scale UCB projects

My experience - negatives
- Compiler (V.1.0) not as robust as commercial tools
  - Long compile time
  - Memory leak
  - Long loading time for large circuits
- Single clock domain
- Cannot mix synthesizable and behavioral code

Thank you
Please come and see my poster