
Optimizing Computation through Pipelining Techniques
"Learn about the benefits and complexities of implementing pipelining in combinational circuits for faster computation and improved system throughput. Explore the advantages and disadvantages to make informed decisions when optimizing system performance."
Presentation Transcript
Pipeline Principle
A non-pipelined system of three combinational circuits (A, B, C), each with a delay of 100 ps, requires a total of 300 ps per computation. A new operation cannot start until the previous one completes.
Delay = 300 ps / Throughput = 1/300 ps = 3.33 GOPS
[Non-pipelined diagram: Comb. logic A, B and C at 100 ps each; OP1, OP2 and OP3 run back to back along the time axis]
A pipelined version adds a register at the output of each combinational circuit. Saving a result in a register takes an additional 20 ps, so a new operation can begin every 120 ps. Up to 3 operations are in process simultaneously, but overall latency increases.
Delay = 3 x 120 ps = 360 ps / Throughput = 1/120 ps = 8.33 GOPS
[Pipelined diagram: Comb. logic A, B and C at 100 ps each, each followed by a 20 ps register driven by a common clock; OP1, OP2 and OP3 overlap along the time axis, each passing through stages A, B, C]
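The arithmetic on this slide can be reproduced with a short Python sketch. The function names unpipelined and pipelined are illustrative and do not come from the slides; the model assumes one register per stage and a clock period set by the slowest stage plus the register delay.

```python
# Sketch: latency and throughput for the three-stage example.
# Function names are illustrative, not taken from the slides.

def unpipelined(logic_delays_ps):
    """Return (latency_ps, throughput_gops) with no registers inserted."""
    latency_ps = sum(logic_delays_ps)
    # One operation every latency_ps picoseconds equals 1000 / latency_ps GOPS.
    return latency_ps, 1000.0 / latency_ps


def pipelined(logic_delays_ps, reg_delay_ps):
    """Return (clock_period_ps, latency_ps, throughput_gops), one register per stage."""
    # The clock must accommodate the slowest stage plus its register.
    clock_period_ps = max(logic_delays_ps) + reg_delay_ps
    latency_ps = clock_period_ps * len(logic_delays_ps)
    return clock_period_ps, latency_ps, 1000.0 / clock_period_ps


if __name__ == "__main__":
    print(unpipelined([100, 100, 100]))
    print(pipelined([100, 100, 100], 20))
```

With the slide's numbers, the first call gives a 300 ps latency and roughly 3.33 GOPS; the second gives a 120 ps clock period, a 360 ps latency and roughly 8.33 GOPS.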
Non-uniform delays
[Diagram: Comb. logic A (50 ps), Comb. logic B (150 ps) and Comb. logic C (100 ps), each followed by a 20 ps register driven by a common clock; OP1, OP2 and OP3 each pass through stages A, B, C along the time axis]
Delay = 3 x 170 ps = 510 ps / Throughput = 1/170 ps = 5.88 GOPS
Throughput is limited by the slowest stage (B: 150 ps + 20 ps = 170 ps).
Other stages sit idle for much of the time.
It is challenging to partition a system into balanced stages.
www.cs.cmu.edu/afs/cs/academic/class/15349-s02/lectures/class4-pipeline-a.ppt
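To put a number on how long the faster stages sit idle, the following Python sketch computes the fraction of each clock period that a stage's combinational logic is actually busy. The helper name stage_utilization is hypothetical, and the sketch assumes every stage is clocked at the period set by the slowest stage.

```python
# Sketch: per-stage utilization when stage delays are unbalanced.
# stage_utilization is an illustrative helper, not part of the slides.

def stage_utilization(logic_delays_ps, reg_delay_ps):
    """Return (clock_period_ps, list of busy fractions, one per stage)."""
    clock_period_ps = max(logic_delays_ps) + reg_delay_ps
    busy = [d / clock_period_ps for d in logic_delays_ps]
    return clock_period_ps, busy


if __name__ == "__main__":
    period, busy = stage_utilization([50, 150, 100], 20)
    print(f"clock period: {period} ps")
    for name, frac in zip("ABC", busy):
        print(f"stage {name}: logic busy {frac:.0%} of each cycle")
```

For the 50/150/100 ps example the period is 170 ps, and stage A's logic is active for only about 29% of each cycle, compared with about 88% for stage B and 59% for stage C.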
Individual functions are marked with their delay. You may apply two-way interleaving to a single component; the other components may not be divided further. Draw lines to indicate where you would insert pipelining flip-flops.
The two-way interleaving circuit may be represented by:
Advantages of Pipelining
1. The cycle time of the processor is reduced.
2. It increases the throughput of the system.
3. It makes the system reliable.
Disadvantages of Pipelining
1. A pipelined processor is complex to design and costly to manufacture.
2. Instruction latency increases.
Fundamental Operation of Retiming
A retiming move consists of moving all of the memory elements at the inputs of a combinational block to all of its outputs, or vice versa.
Example: a synthesis tool can move registers to balance the combinational delay on each side of the registers.
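As a rough illustration of why moving a register helps, the Python sketch below models a linear chain of combinational blocks with a single pipeline register and tries every register position. This is a deliberately simplified model with made-up delays, not the algorithm a synthesis tool uses; real retiming operates on general circuit graphs. The function name best_register_position is hypothetical.

```python
# Sketch: moving one register along a chain of blocks to balance stage delay.
# Simplified linear-chain model with hypothetical delays; not a tool's algorithm.

def best_register_position(block_delays_ps):
    """Return (cut, worst_stage_delay_ps) for the best single register position.

    A cut of k places the register after the first k blocks.
    """
    best = None
    for cut in range(1, len(block_delays_ps)):
        before = sum(block_delays_ps[:cut])   # logic ahead of the register
        after = sum(block_delays_ps[cut:])    # logic behind the register
        worst = max(before, after)
        if best is None or worst < best[1]:
            best = (cut, worst)
    return best


if __name__ == "__main__":
    delays = [30, 40, 30, 60]  # hypothetical block delays in ps
    # Register after block 1: stages of 30 ps and 130 ps.
    print(max(sum(delays[:1]), sum(delays[1:])))
    # Moving the register across the second block balances the stages.
    print(best_register_position(delays))
```

With these delays, a register placed after the first block leaves a 130 ps critical stage; moving it forward across the second block gives stages of 70 ps and 90 ps, so the worst stage drops to 90 ps.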
Cycle Time ≥ Critical Path Delay + Setup Time + FF Delay
T ≥ Tmax + Tsetup + TCTQ
Tmax: delay of the critical (longest) path, which is a function of the total gate and wire delays along it (5 gates in the slide's example).
TCTQ: FF propagation delay, the time from the arrival of the clock signal until the FF output changes.
Tsetup: for FFs to latch data correctly, input data must be stable for the setup time before the clock arrives.
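A minimal Python sketch of the cycle-time constraint follows. The numeric values (a 5-gate critical path at 40 ps per gate, 30 ps setup, 20 ps clock-to-output delay) are assumptions chosen for illustration and are not given in the slides; the function names are likewise hypothetical.

```python
# Sketch: minimum clock period from the setup constraint T >= Tmax + Tsetup + TCTQ.
# All numeric values below are illustrative assumptions.

def min_cycle_time_ps(t_max_ps, t_setup_ps, t_ctq_ps):
    """Smallest clock period that satisfies the setup constraint."""
    return t_max_ps + t_setup_ps + t_ctq_ps


def max_frequency_ghz(cycle_time_ps):
    """Clock frequency in GHz for a period given in picoseconds."""
    return 1000.0 / cycle_time_ps


if __name__ == "__main__":
    t_max = 5 * 40  # assumed: 5 gates on the critical path, 40 ps each
    period = min_cycle_time_ps(t_max, t_setup_ps=30, t_ctq_ps=20)
    print(period, "ps ->", max_frequency_ghz(period), "GHz")
```

Under these assumed delays the minimum period is 250 ps, which corresponds to a maximum clock frequency of 4 GHz.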
Min Path Delay ≥ Hold Time
For FFs to latch data correctly, input data must be stable for the hold time (Thold) after the clock arrives. Whether this holds is determined by the delay of the shortest path in the circuit (Tmin):
Tmin ≥ Thold
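The hold check can be sketched in Python in the same way. This follows the slide's simplified form, comparing the shortest path delay directly with the hold time; more detailed timing models also account for the FF clock-to-output delay and clock skew, which are omitted here. The function name and the example delays are assumptions.

```python
# Sketch: hold-time check in the slide's simplified form, Tmin >= Thold.
# Clock-to-output delay and clock skew are deliberately left out.

def hold_slack_ps(t_min_ps, t_hold_ps):
    """Positive or zero slack means the hold constraint is met."""
    return t_min_ps - t_hold_ps


if __name__ == "__main__":
    # Assumed example values in picoseconds.
    for t_min in (15, 40):
        slack = hold_slack_ps(t_min, t_hold_ps=25)
        status = "met" if slack >= 0 else "VIOLATED"
        print(f"Tmin={t_min} ps, slack={slack} ps, hold {status}")
```

A negative slack means the shortest path is too fast; lengthening the clock period does not help in that case, because the period does not appear in the constraint. The usual remedy is to add delay to the short path.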