
Efficient SRAM Optimization Using Device-Circuit Co-Optimization Framework
This study focuses on minimizing the Energy-Delay Product of SRAM arrays through a comprehensive Device-Circuit Architecture Co-Optimization framework. It explores techniques such as using high-Vt devices in SRAM cells, operating at low voltages, and employing assist circuits to enhance stability and performance metrics. The research highlights the advantages and challenges of implementing HVT devices, along with strategies to mitigate bitline delay and improve read stability in SRAM cells. Overall, the paper presents valuable insights into optimizing SRAM arrays for efficiency and performance.
Presentation Transcript
Minimizing the Energy-Delay Product of SRAM Arrays using a Device-Circuit-Architecture Co-Optimization Framework
Alireza Shafaei, Hassan Afzali-Kusha, and Massoud Pedram
Department of Electrical Engineering, University of Southern California
http://sportlab.usc.edu/
Outline
o Introduction
o 6T SRAM cell with HVT devices
o Read/write-assist techniques
o SRAM array model
o Simulation results
o Conclusion
6T SRAM Cell
[Figure: 6T SRAM cell schematic with wordline WL, the bitline pair, storage nodes Q and QB, and transistors M1 to M6.]
PD: Pull-down transistors (M1 and M2)
PU: Pull-up transistors (M3 and M4)
AC: Access transistors (M5 and M6)
Read stability (non-destructive read) requirement: $I_{ON,M5} < I_{ON,M1}$ and $I_{ON,M6} < I_{ON,M2}$
Write-ability (successful write) requirement: $I_{ON,M5} > I_{ON,M3}$ and $I_{ON,M6} > I_{ON,M4}$
Both requirements can easily fail under process variations
o Sizing up transistors or adopting more robust SRAM cells (e.g., 8T) comes at the cost of larger layout area
o Width quantization property in FinFETs
Ideal case: single-fin devices for all transistors
Area/Power-Efficient SRAMs
All-single-fin 6T SRAM
o Operating at low voltages to reduce the power consumption
o Equipped with assist circuits to improve degraded stability and performance metrics
Alternative solution: use high-Vt (HVT) devices in SRAM cells
Devices used in this paper: a 7nm FinFET library under Vdd = 450mV (nominal supply voltage)
o Compared with their low-Vt (LVT) counterparts, HVT devices have 2× lower ON current, 20× lower OFF current, and 10× higher ON/OFF current ratio.
http://sportlab.usc.edu/downloads/packages/
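The three device figures are mutually consistent when read as factors of 2×, 20×, and 10×. As a quick check, with the superscripts LVT and HVT introduced here only for illustration:

```latex
\frac{I_{ON}^{HVT}}{I_{OFF}^{HVT}}
  = \frac{I_{ON}^{LVT}/2}{I_{OFF}^{LVT}/20}
  = 10\,\frac{I_{ON}^{LVT}}{I_{OFF}^{LVT}}
```

The ON/OFF ratio improves by the OFF-current factor divided by the ON-current factor, so the leakage benefit is bought with a 2× loss in drive current.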
6T SRAM with HVT Devices
Advantages of using HVT devices in SRAM cells:
o Lower OFF current: significant leakage power reduction
o Higher ON/OFF current ratio: static noise margin (SNM) improvement
[Figure: leakage power (nW, log scale) and HSNM (mV) versus Vdd (450 down to 100 mV) for 6T-LVT and 6T-HVT, with a 35%-of-Vdd level marked.]
Major drawback:
o Lower ON current reduces the read current, which increases the bitline (BL) delay and degrades performance
Mitigating the BL Delay
BL delay can be modeled as follows: $D_{BL} = \frac{C_{BL}\,\Delta V}{I_{read}}$
Reduce $\Delta V$ (sensing voltage):
o Difficult to do due to the increased effect of process variations
Increase $I_{read}$ (read current):
o By assist circuits. Such circuits are also needed to improve the RSNM and write margin of the all-single-fin 6T SRAM cell.
Decrease $C_{BL}$ (BL capacitance):
o By decreasing (i) the number of rows (or the number of SRAM cells in each column) of the array, and (ii) the number of pre-charger/write buffer transistors
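The three levers on this slide can be tried out numerically. The following is a minimal sketch, assuming the BL delay is BL capacitance times sensing voltage divided by read current, and assuming (this is not stated in the slides) that the BL capacitance grows linearly with the number of rows and with the number of pre-charger and write-buffer fins. All function names and numeric values are hypothetical.

```python
def bitline_capacitance_fF(n_rows, c_cell_fF, n_pre, n_wr, c_fin_fF):
    """Assumed linear model: each cell and each peripheral fin adds a fixed load."""
    return n_rows * c_cell_fF + (n_pre + n_wr) * c_fin_fF


def bitline_delay_ps(c_bl_fF, delta_v_mV, i_read_uA):
    """BL delay = C_BL * dV / I_read.

    Units: fF * mV / uA = (1e-15 * 1e-3 / 1e-6) s = 1e-12 s, i.e. picoseconds.
    """
    return c_bl_fF * delta_v_mV / i_read_uA


# Hypothetical numbers, chosen only to show the trends.
c_bl = bitline_capacitance_fF(n_rows=128, c_cell_fF=0.07, n_pre=2, n_wr=2, c_fin_fF=0.26)
base = bitline_delay_ps(c_bl, delta_v_mV=120.0, i_read_uA=10.0)

# Lever 1: an assist circuit that doubles the read current halves the delay.
with_assist = bitline_delay_ps(c_bl, delta_v_mV=120.0, i_read_uA=20.0)

# Lever 2: halving the number of rows shrinks C_BL and therefore the delay.
c_bl_short = bitline_capacitance_fF(n_rows=64, c_cell_fF=0.07, n_pre=2, n_wr=2, c_fin_fF=0.26)
fewer_rows = bitline_delay_ps(c_bl_short, delta_v_mV=120.0, i_read_uA=10.0)

print(f"baseline {base:.1f} ps, assist {with_assist:.1f} ps, fewer rows {fewer_rows:.1f} ps")
```

Because the peripheral fins contribute a fixed term, halving the rows gives less than a 2× improvement in this sketch, which is one reason the number of fins is treated as a separate design knob.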
Read-Assist Techniques
o Wordline underdrive: the wordline voltage VWL is set below Vdd
o Vdd boost: the cell supply voltage VDDC is raised above Vdd
o Negative ground: the cell ground voltage VSSC is lowered below 0
The slide annotates the techniques as effective in improving RSNM and effective in reducing BL delay.
[Figure: RSNM (mV) and bitline delay (ps) versus VWL (450 down to 200 mV) for wordline underdrive, versus VDDC (450 to 900 mV) for Vdd boost, and versus VSSC (0 down to -250 mV) for negative ground.]
Write-Assist Techniques
WL overdrive: VWL is set to a voltage level higher than Vdd to strongly turn on the access transistor.
Negative BL: Without write-assist, a write into the SRAM cell conventionally occurs from the BL that is at 0. Driving that BL to a negative voltage increases the gate-to-source voltage of the access transistor and turns it on more strongly.
[Figure: write margin WM (mV) and cell write delay (ps) versus VWL (450 to 825 mV) for wordline overdrive, and versus bitline voltage (0 down to -375 mV) for negative bitline.]
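A short worked example makes the negative-BL argument concrete. It assumes the access transistor's source is the low bitline during the write and picks an illustrative bitline voltage of -150 mV, a value inside the plotted range but not one the slides single out:

```latex
V_{GS} = V_{WL} - V_{BL}
\qquad\Rightarrow\qquad
V_{GS} = 450\,\mathrm{mV} - (-150\,\mathrm{mV}) = 600\,\mathrm{mV}
```

Under this simplification, a 150 mV negative bitline gives the access transistor the same gate drive as a wordline overdriven to 600 mV with the bitline at 0. The two techniques still differ in how they affect the unselected cells, which this one-line estimate does not capture.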
SRAM Array Model
[Figure: SRAM array schematic showing the row decoder with a 1X, 3X, 9X, 27X wordline driver chain, the pre-charger and write buffer transistors, the sense amplifier, and the cell supply rails CVDD and CVSS connected to VDDC and VSSC.]
RE: Read Enable
WE: Write Enable
Pre: Precharge signal
Npre: # of fins of precharger transistors
Nwr: # of fins of write buffer transistors
SRAM Array Model (2)
An SRAM array with $n_r$ rows and $n_c$ columns is assumed.
o Capacity: $M = n_r \times n_c$ bits
o $V_{DDC}$, $V_{SSC}$, and $V_{WL}$ are provided by external sources or on-die DC-DC converters
o In each access cycle, $W$ bits are read or written
o If $n_c > W$, then a column decoder/multiplexer is also needed
o $N_{pre}$ ($N_{wr}$): # of fins of pre-charger (write buffer) transistor
Component-level expressions: Tables 1, 2, and 3 in the paper
$D_{rd}$ ($D_{wr}$): read (write) access delay
$D_{array} = \max(D_{rd}, D_{wr})$
$E_{rd,sw}$ ($E_{wr,sw}$): read (write) access energy
$E_{array,sw} = \beta E_{rd,sw} + (1 - \beta) E_{wr,sw}$
$\beta$: ratio of read accesses to total accesses
$E_{array,leak} = M \cdot P_{cell,leak} \cdot D_{array}$
$\alpha$: array activity factor
$E_{array} = \alpha E_{array,sw} + E_{array,leak}$
$P_{cell,leak}$: leakage power of an SRAM cell
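The array-level bookkeeping on this slide can be sketched in a few lines. The version below assumes the reading in which the array delay is the larger of the read and write delays, the access energy is a read/write-weighted average, the leakage energy is capacity times per-cell leakage power times array delay, and the total scales the access energy by the activity factor. The function and parameter names are illustrative, and the numeric inputs are made up rather than taken from the paper.

```python
def array_delay(d_rd, d_wr):
    """Array delay: the slower of the read and write accesses (seconds)."""
    return max(d_rd, d_wr)


def array_energy(e_rd, e_wr, read_ratio, activity, m_bits, p_cell_leak, d_array):
    """Total energy per cycle in joules.

    e_rd, e_wr   : read / write access energy (J)
    read_ratio   : fraction of accesses that are reads
    activity     : array activity factor
    m_bits       : array capacity in bits
    p_cell_leak  : leakage power of one cell (W)
    d_array      : array delay (s)
    """
    e_access = read_ratio * e_rd + (1.0 - read_ratio) * e_wr
    e_leak = m_bits * p_cell_leak * d_array
    return activity * e_access + e_leak


# Hypothetical inputs for a 1 KB (8192-bit) array.
d = array_delay(d_rd=200e-12, d_wr=150e-12)
e = array_energy(e_rd=10e-15, e_wr=12e-15, read_ratio=0.5, activity=0.5,
                 m_bits=8192, p_cell_leak=0.1e-9, d_array=d)
print(f"delay = {d * 1e12:.0f} ps, energy = {e * 1e15:.3f} fJ, EDP = {e * d:.3e} J*s")
```

Note that the leakage term depends on the delay, so a slower array pays twice in the energy-delay product: once through the delay itself and once through the extra leakage energy accumulated per cycle.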
Optimization Problem
Given $M$ (the capacity of the SRAM array in bits)
Find the values of $V_{DDC}$, $V_{SSC}$, $V_{WL}$, $n_r$ ($n_c = M / n_r$), $N_{pre}$, and $N_{wr}$
Such that $E_{array} \cdot D_{array}$ is minimized
While yield requirements of SRAM cells are satisfied
Yield constraint: stated over HSNM, RSNM, and WM, with
o $\min(\mathrm{HSNM}, \mathrm{RSNM}, \mathrm{WM}) \ge \delta$
o a factor $k$ with $1 \le k \le 6$, depending on yield requirements
o $\delta$: the minimum acceptable noise margin level
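One simple way to picture this problem is as an exhaustive search over a small design grid. The sketch below is not the authors' solver: the slides do not say how the search is performed, and `toy_array_model` is a hypothetical stand-in with invented coefficients that only mimics the qualitative trends (more rows raise the BL capacitance, assist voltages raise the read current and the margins). Only the structure (enumerate, discard designs below the noise-margin floor, keep the lowest energy-delay product) follows the problem statement.

```python
from itertools import product

DELTA_MV = 158.0  # minimum acceptable noise margin, as used in the simulation setup


def toy_array_model(m_bits, n_r, v_ddc, v_ssc, v_wl, n_pre, n_wr):
    """Hypothetical array model in arbitrary units; every coefficient is invented.

    Returns (energy, delay, min_margin_mV).
    """
    n_c = m_bits // n_r
    c_bl = 0.1 * n_r + 0.05 * (n_pre + n_wr)
    i_read = 1.0 + 0.004 * (v_ddc - 450) - 0.006 * v_ssc
    delay = c_bl * 120.0 / i_read + 0.05 * n_c + 20.0 / n_pre + 20.0 / n_wr
    energy = c_bl * (v_ddc - v_ssc) / 450.0 + 0.01 * n_c * (v_wl / 450.0) ** 2
    rsnm = 120.0 + 0.3 * (v_ddc - 450) - 0.3 * (v_wl - 450) - 0.2 * v_ssc
    wm = 150.0 + 0.4 * (v_wl - 450)
    return energy, delay, min(rsnm, wm)


def minimize_edp(m_bits, model=toy_array_model):
    """Return the feasible design with the lowest energy-delay product, or None."""
    row_options = [2 ** k for k in range(4, 11) if m_bits % 2 ** k == 0]
    grid = product(row_options,        # n_r (n_c follows from the capacity)
                   (450, 550, 640),    # V_DDC in mV
                   (0, -100, -200),    # V_SSC in mV
                   (450, 490, 550),    # V_WL in mV
                   (1, 2, 4),          # N_pre
                   (1, 2, 4))          # N_wr
    best = None
    for n_r, v_ddc, v_ssc, v_wl, n_pre, n_wr in grid:
        energy, delay, margin = model(m_bits, n_r, v_ddc, v_ssc, v_wl, n_pre, n_wr)
        if margin < DELTA_MV:
            continue  # violates the noise-margin constraint
        edp = energy * delay
        if best is None or edp < best["edp"]:
            best = {"n_r": n_r, "n_c": m_bits // n_r, "v_ddc": v_ddc,
                    "v_ssc": v_ssc, "v_wl": v_wl, "n_pre": n_pre,
                    "n_wr": n_wr, "edp": edp, "margin": margin}
    return best


print(minimize_edp(8192))
```

An exhaustive grid is workable here because the design space is small: a handful of row counts, a few discrete voltage levels, and small fin counts. A finer voltage grid or a statistical yield model would call for a smarter search, which this sketch does not attempt.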
Simulation Setup
$\alpha = 0.5$, $\beta = 0.5$, $\delta = 0.35\,V_{dd} = 158\,\mathrm{mV}$, $W = 64$, $\Delta V = 120\,\mathrm{mV}$
M1: Only one extra voltage level (a high voltage) other than $V_{dd}$ is available, whose value is set to $\max(V_{DDC}, V_{WL})$, i.e., 640mV (550mV) for 6T-LVT (6T-HVT).
M2: No restriction on the number of voltage levels is considered. Hence, we have three extra pins for the LVT-based array ($V_{DDC}$ = 640mV, $V_{WL}$ = 490mV, and $V_{SSC}$). However, since $V_{DDC}$ and $V_{WL}$ are very close in 6T-HVT, only two pins are used for the HVT-based array ($V_{DDC} = V_{WL}$ = 550mV, and $V_{SSC}$).
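The arithmetic behind the two scenarios is small enough to check directly. The sketch below assumes the M1 level is the larger of the cell supply and wordline voltages, counts M2 pins as the number of distinct extra voltage levels, and takes the noise-margin floor as 35% of the 450 mV supply. The function names are illustrative, and the VSSC value of -100 mV is a placeholder, since the slides do not state it.

```python
def noise_margin_floor_mV(percent, vdd_mV):
    """Integer percentage of Vdd, rounded half up, using integer arithmetic."""
    return (percent * vdd_mV + 50) // 100


def m1_high_voltage_mV(v_ddc_mV, v_wl_mV):
    """M1: a single extra high-voltage level shared by the cell supply and the wordline."""
    return max(v_ddc_mV, v_wl_mV)


def m2_extra_pins(v_ddc_mV, v_wl_mV, v_ssc_mV):
    """M2: one pin per distinct extra voltage level."""
    return len({v_ddc_mV, v_wl_mV, v_ssc_mV})


V_SSC_PLACEHOLDER_MV = -100  # hypothetical; the slides give no value for VSSC

print(noise_margin_floor_mV(35, 450))                      # floor in mV
print(m1_high_voltage_mV(640, 490))                        # 6T-LVT under M1
print(m1_high_voltage_mV(550, 550))                        # 6T-HVT under M1
print(m2_extra_pins(640, 490, V_SSC_PLACEHOLDER_MV))       # 6T-LVT under M2
print(m2_extra_pins(550, 550, V_SSC_PLACEHOLDER_MV))       # 6T-HVT under M2
```

Treating the HVT levels as exactly equal mirrors the slide's choice to merge two voltages that are "very close" onto one pin.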
Simulation Results: Delay
[Figure: delay (ps, log scale) versus SRAM array capacity (M) for 6T-LVT-M1, 6T-HVT-M1, 6T-LVT-M2, and 6T-HVT-M2.]
Simulation Results: Delay
[Figure: BL delay and total delay (ps) of 6T-HVT for array capacities of 2KB, 4KB, 8KB, and 16KB.]
Simulation Results: Energy
[Figure: energy (fJ, log scale) versus SRAM array capacity (M) for 6T-LVT-M1, 6T-HVT-M1, 6T-LVT-M2, and 6T-HVT-M2.]
Simulation Results: Energy-Delay Product
[Figure: energy-delay product (log scale) versus SRAM array capacity (M) for 6T-LVT-M1, 6T-HVT-M1, 6T-LVT-M2, and 6T-HVT-M2.]
Conclusion
By using the proposed optimization framework, for SRAM array capacities ranging from 1KB to 16KB, the energy-delay product is on average 59% lower, with a maximum performance penalty of 12% (9% on average).