Computer Arithmetic and ALU Design
Explore the intricate world of computer arithmetic and ALU design, covering topics such as addition, subtraction, logic operations, fast adders, multiplication, division, and more. Dive into designing a MIPS ALU and understanding functional specifications for ALU operations using innovative design tricks.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
Outline Addition and subtraction Constructing an arithmetic logic unit Building ALU Add, sub, and, or, nor Set-on-less-than, overflow detection, zero detection Fast adders Cascaded carry look-ahead adder Multiple level carry look-ahead adder Multiplication Unsigned multiply Signed multiply Division Floating point Representations Addition and multiplication 1
Outline Addition and subtraction Constructing an arithmetic logic unit Building ALU Add, sub, and, or, nor Set-on-less-than, overflow detection, zero detection Fast adders Cascaded carry look-ahead adder Multiple level carry look-ahead adder Multiplication Unsigned multiply Signed multiply Division Floating point Representations Addition and multiplication 2
Problem: Designing MIPS ALU Requirements: must support the following arithmetic and logic operations add, sub: two s complement adder/subtractor with overflow detection and, or, nor : logical AND, logical OR, logical NOR slt (set on less than): two s complement adder with inverter, check sign bit of result 3
Functional Specification ALUop 4 A 32 Zero ALU Result 32 Overflow B 32 CarryOut ALU Control (ALUop) 0000 0001 0010 0110 0111 1100 Function and or add subtract set-on-less-than nor 4
Functional Specification ALUop 4 A 32 Zero ALU Result 32 Overflow B 32 CarryOut ALU Control (ALUop) 0000 0001 0010 0110 0111 1100 Function and or add subtract set-on-less-than nor 5 5
A Bit-slice ALU Design trick 1: divide and conquer Break the problem into simpler problems, solve them and glue together the solution Design trick 2: solve part of the problem and extend 32 A B 32 a0 b0 m cin 4 a31 b31 m cin ALU0 cos0 ALU31 c31s31 ALUop Overflow Zero 32 Result 6
A 1-bit ALU Design trick 3: take pieces you know (or can imagine) and try to put them together Operation CarryIn and A 0 or Result 1 Mux add 1-bit Full Adder 2 B CarryOut 7
Functional Specification ALUop 4 A 32 Zero ALU Result 32 Overflow B 32 CarryOut ALU Control (ALUop) 0000 0001 0010 0110 0111 1100 Function and or add subtract set-on-less-than nor 8 8
A 4-bit ALU 1-bit ALU 4-bit ALU Operation CarryIn0 Operation CarryIn A0 1-bit ALU Result0 A B0 CarryOut0 CarryIn1 A1 1-bit ALU Result1 Result B1 Mux CarryOut1 CarryIn2 A2 1-bit ALU Result2 B2 1-bit Full Adder CarryOut2 CarryIn3 A3 1-bit ALU B Result3 B3 CarryOut CarryOut3 9
How about Subtraction? 2 (at cin of first stage) A + B + 1 = A + (B + 1) = A + (-B) = A - B Bit-wise inverse of B is B s complement: take inverse of every bit and add 1 Subtract (Bnegate) CarryIn Operation A ALU Result Sel B 0 Mux 1 CarryOut B 10
Revised Diagram LSB and MSB need to do a little extra 32 A B 32 a0 b0 4 a31 b31 ALU0 cos0 ALU31 c31s31 ALUop cin ? cin Supply a 1 on subtraction 32 Combining the CarryIn and Bnegate Overflow Zero Result 11
Functional Specification ALUop 4 A 32 Zero ALU Result 32 Overflow B 32 CarryOut ALU Control (ALUop) 0000 0001 0010 0110 0111 1100 Function and or add subtract set-on-less-than nor 12
R-Format Instructions (1/2) Define the following fields : 6 5 5 rt 5 rd 5 6 opcode opcode: partially specifies what instruction it is (Note: 0 for all R-Format instructions) funct: combined with opcode to specify the instruction Question: Why aren t opcode and funct a single 12-bit field? rs (Source Register): generally used to specify register containing first operand rt (Target Register): generally used to specify register containing second operand rd (Destination Register): generally used to specify register which will receive result of computation rs shamt funct 13
Nor Operation A nor B = (not A) and (not B) ALUop 2 Ainvert Operation CarryIn a 0 1 0 1 Bnegate Result b 0 1 2 CarryOut 14
Functional Specification ALUop 4 A 32 Zero ALU Result 32 Overflow B 32 CarryOut ALU Control (ALUop) 0000 0001 0010 0110 0111 1100 Function and or add subtract set-on-less-than nor 15
Outline Addition and subtraction Constructing an arithmetic logic unit Building ALU Add, sub, and, or, nor Set-on-less-than, overflow detection, zero detection Fast adders Cascaded carry look-ahead adder Multiple level carry look-ahead adder Multiplication Unsigned multiply Signed multiply Division Floating point Representations Addition and multiplication 16 16
Functional Specification ALUop 4 A 32 Zero ALU Result 32 Overflow B 32 CarryOut ALU Control (ALUop) 0000 0001 0010 0110 0111 1100 Function and or add subtract set-on-less-than nor 17
Set on Less Than (I) 1-bit in ALU (for bits 1-30) ALUop Ainvert Operation CarryIn a 0 1 0 1 Bnegate Result b 0 1 2 3 Less (0:bits 1-30) CarryOut 18
Set on Less Than (II) Sign bit in ALU Operation Ainvert CarryIn a 0 1 0 1 Bnegate Result b 0 1 2 3 Less Set Overflow detection Overflow 19
Set on Less Than (III) Bit 0 in ALU ALUop Ainvert Operation CarryIn a 0 1 0 1 Bnegate Result b 0 1 2 3 Set CarryOut 20
A Ripple Carry Adder and Set on Less Than ALUop Function 0000 and 0001 or 0010 add 0110 subtract 0111 set-less-than 1100 nor 21
Functional Specification ALUop 4 A 32 Zero ALU Result 32 Overflow B 32 CarryOut ALU Control (ALUop) 0000 0001 0010 0110 0111 1100 Function and or add subtract set-on-less-than nor 22 22
Overflow Decimal 0 1 2 3 4 5 6 7 Ex: 7 + 3 = 10 but ... - 4 - 5 = - 9 but 0 1 1 1 1 0 0 0 0 1 1 1 7 1 1 0 0 -4 + 0 0 1 1 3 + 1 0 1 1 -5 1 0 1 0 -6 0 1 1 1 7 Binary 0000 0001 0010 0011 0100 0101 0110 0111 Decimal 0 -1 -2 -3 -4 -5 -6 -7 -8 2 s complement 0000 1111 1110 1101 1100 1011 1010 1001 1000 23
Overflow Detection Overflow: result too big/small to represent -8 4-bit binary number 7 When adding operands with different signs, overflow cannot occur! Overflow occurs when adding: 2 positive numbers and the sum is negative 2 negative numbers and the sum is positive => sign bit is set with the value of the result Overflow if: Carry into MSB Carry out of MSB 0 1 1 1 1 0 0 0 0 1 1 1 7 1 1 0 0 -4 + 0 0 1 1 3 + 1 0 1 1 -5 1 0 1 0 -6 0 1 1 1 7 24
Overflow Detection Logic Overflow = CarryIn[N-1] XOR CarryOut[N-1] CarryIn0 A0 1-bit ALU Result0 X Y X XOR Y B0 CarryOut0 0 0 1 1 0 1 0 1 0 1 1 0 CarryIn1 A1 1-bit ALU Result1 B1 CarryOut1 CarryIn2 A2 1-bit ALU Result2 B2 CarryIn3 Overflow A3 1-bit ALU Result3 B3 CarryOut3 25
Dealing with Overflow Some languages (e.g., C) ignore overflow Use MIPS addu addu, addui addui, subu Other languages (e.g., Ada, Fortran) require raising an exception Use MIPS add add, addi addi, sub sub instructions On overflow, invoke exception handler Save PC in exception program counter (EPC) register Jump to predefined handler address mfc0 mfc0 (move from coprocessor reg) instruction can retrieve (copy) EPC value (to a general purpose register), to return after corrective action (by jump register instruction) subu instructions 26
Zero Detection Logic Zero Detection Logic is a one BIG NOR gate (support conditional jump) CarryIn0 A0 Result0 1-bit ALU B0 CarryOut0 CarryIn1 A1 Result1 1-bit ALU B1 Zero CarryOut1 CarryIn2 A2 Result2 1-bit ALU B2 CarryOut2 CarryIn3 A3 Result3 1-bit ALU B3 CarryOut3 27
Problems with Ripple Carry Adder Carry bit may have to propagate from LSB to MSB => worst case delay: N-stage delay CarryIn0 CarryIn A0 1-bit ALU Result0 B0 A CarryOut0 CarryIn1 A1 1-bit ALU Result1 B1 CarryOut1 CarryIn2 A2 1-bit ALU Result2 B B2 CarryOut CarryOut2 CarryIn3 Design Trick: look for parallelism and throw hardware at it A3 1-bit ALU Result3 B3 CarryOut3 28 28
Outline Addition and subtraction Constructing an arithmetic logic unit Building ALU Add, sub, and, or, nor Set-on-less-than, overflow detection, zero detection Fast adders Cascaded carry look-ahead adder Multiple level carry look-ahead adder Multiplication Unsigned multiply Signed multiply Division Floating point Representations Addition and multiplication 29 29
Carry Look-ahead: Theory (I) (Appendix C) B1 B0 A1 A0 Cin1 Cin0 Cin2 1-bit ALU 1-bit ALU Cout1 Cout0 CarryOut=(B*CarryIn)+(A*CarryIn)+(A*B) Cin1=Cout0= (B0 * Cin0)+(A0 * Cin0)+ (A0 * B0) Cin2=Cout1= (B1 * Cin1)+(A1 * Cin1)+ (A1 * B1) Substituting Cin1 into Cin2: Cin2=(A1*A0*B0)+(A1*A0*Cin0)+(A1*B0*Cin0) +(B1*A0*B0)+(B1*A0*Cin0)+(B1*B0*Cin0) +(A1*B1) 30
Carry Look-ahead: Theory (II) Now define two new terms: Generate Carry at Bit i: Propagate Carry via Bit i: We can rewrite: Cin1=g0+(p0*Cin0) Cin2=g1+(p1*g0)+(p1*p0*Cin0) Cin3=g2+(p2*g1)+(p2*p1*g0)+(p2*p1*p0*Cin0) Carry going into bit 3 is 1 if We generate a carry at bit 2 (g2) Or we generate a carry at bit 1 (g1) and bit 2 allows it to propagate (p2 * g1) Or we generate a carry at bit 0 (g0) and bit 1 as well as bit 2 allows it to propagate .. gi = Ai * Bi pi = Ai xor Bi 31
A Plumbing Analogy for Carry Loo-kahead (1, 2, 4 bits) 32
Common Carry Look-ahead Adder Expensive to build a full carry look-ahead adder Ex: Cin3=g2+(p2*g1)+(p2*p1*g0)+(p2*p1*p0*Cin0) Just imagine length of the equation for Cin31 Common practices: Cascaded carry look-ahead adder Multiple level carry look-ahead adder 33
Cascaded Carry Look-ahead Connects several N-bit look-ahead adders to form a big one A[31:24] B[31:24] A[23:16] B[23:16] A[15:8] B[15:8] A[7:0] B[7:0] 8 8 8 8 8 8 8 8 8-bit Carry Lookahead Adder 8-bit Carry Lookahead Adder 8-bit Carry Lookahead Adder 8-bit Carry Lookahead Adder C24 C16 C8 C0 8 8 8 8 Result[31:24] Result[23:16] Result[15:8] Result[7:0] 34
Example: Carry Look-ahead Unit cin cout Carry Look-ahead Unit 4 4 4 gi pi 35
Example: Cascaded Carry Look-ahead Connects several N-bit look-ahead adders to form a big one 4-bit Carry Lookahead Unit 4-bit Carry Lookahead Unit 4-bit Carry Lookahead Unit 4-bit Carry Lookahead Unit c12 c8 c4 c0 p[15:12] g[15:12] p[11:8] g[11:8] p[7:4] g[7:4] p[3:0] g[3:0] c[4:1] c[8:5] c[12:9] c[16:13] + + + + + + + + + + + + + + + + 36
Outline Addition and subtraction Constructing an arithmetic logic unit Building ALU Add, sub, and, or, nor Set-on-less-than, overflow detection, zero detection Fast adders Cascaded carry look-ahead adder Multiple level carry look-ahead adder Multiplication Unsigned multiply Signed multiply Division Floating point Representations Addition and multiplication 37 37
Multiple Level Carry Look-ahead Adder View an N-bit look-ahead adder as a block Where to get Cin of the block ? A[31:24] B[31:24] A[23:16] B[23:16] A[15:8] B[15:8] A[7:0] B[7:0] 8 8 8 8 8 8 8 8 C16 C8 C24 8-bit Carry Lookahead Adder 8-bit Carry Lookahead Adder 8-bit Carry Lookahead Adder 8-bit Carry Lookahead Adder C0 8 8 8 8 Result[31:24] Result[23:16] Result[15:8] Result[7:0] Generate super Pi and Gi of the block Use next level carry look-ahead structure to generate block Cin 38
A Plumbing Analogy for Carry Look- ahead (Next Level P0 and G0) 39
CarryIn A Carry Look-ahead Adder a0 b0 a1 b1 a2 b2 a3 b3 CarryIn Result0--3 ALU0 pi gi P0 G0 Carry-lookahead unit C1 ci + 1 a4 b4 a5 b5 a6 b6 a7 b7 CarryIn A B Cout 0 0 0 0 1 Cin propagate 1 0 Cin propagate 1 1 1 generate Result4--7 ALU1 kill pi + 1 gi + 1 P1 G1 C2 ci + 2 CarryIn a8 b8 a9 b9 a10 b10 a11 b11 Result8--11 ALU2 pi + 2 gi + 2 P2 G2 C3 G = A * B P = A + B ci + 3 CarryIn a12 b12 a13 b13 a14 b14 a15 b15 Result12--15 ALU3 pi + 3 gi + 3 P3 G3 C4 ci + 4 CarryOut 40
Example: Carry Look-ahead Unit P G cin cout Carry Look-ahead Unit 4 4 4 gi pi 41
Example: Multiple Level Carry Look-ahead 4-bit Carry Lookahead Unit C[4:0] P3, G3 P0, G0 P2, G2 P1, G1 4-bit Carry Lookahead Unit 4-bit Carry Lookahead Unit 4-bit Carry Lookahead Unit 4-bit Carry Lookahead Unit c12 c8 c4 c0 p[15:12] g[15:12] p[11:8] g[11:8] p[7:4] g[7:4] p[3:0] g[3:0] c[4:1] c[8:5] c[12:9] c[16:13] + + + + + + + + + + + + + + + + + 42
Carry-select Adder CP(2n) = 2*CP(n) n-bit adder n-bit adder CP(2n) = CP(n) + CP(mux) n-bit adder n-bit adder 0 n-bit adder 1 Design trick: guess Cout 43
Arithmetic for Multimedia Graphics and media processing operates on vectors of 8-bit and 16-bit data Use 64-bit adder, with partitioned carry chain Operate on 8 8-bit, 4 16-bit, or 2 32-bit vectors SIMD (single-instruction, multiple-data) Saturating operations On overflow, result is largest representable value c.f. 2s-complement modulo arithmetic E.g., clipping in audio, saturation in video 44
Outline Addition and subtraction Constructing an arithmetic logic unit Building ALU Add, sub, and, or, nor Set-on-less-than, overflow detection, zero detection Fast adders Cascaded carry look-ahead adder Multiple level carry look-ahead adder Multiplication Unsigned multiply Signed multiply Division Floating point Representations Addition and multiplication 45 45
Outline Addition and subtraction Constructing an arithmetic logic unit Building ALU Add, sub, and, or, nor Set-on-less-than, overflow detection, zero detection Fast adders Cascaded carry look-ahead adder Multiple level carry look-ahead adder Multiplication Unsigned multiply Signed multiply Division Floating point Representations Addition and multiplication 46 46
Memory MIPS R2000 Organization CPU Coprocessor 1 (FPU) Registers Registers $0 $0 $31 $31 Arithmetic unit Multiply divide Arithmetic unit Lo Hi Coprocessor 0 (traps and memory) Registers BadVAddr Cause Status EPC 47
Multiplication in MIPS mult $t1, $t2 # t1 * t2 No destination register: product could be ~264; need two special registers to hold it 3-step process: $t1 01111111111111111111111111111111 01000000000000000000000000000000 X $t2 00011111111111111111111111111111 11000000000000000000000000000000 Hi Lo mfhi $t3 $t3 00011111111111111111111111111111 mflo $t4 $t4 11000000000000000000000000000000 48
MIPS Multiplication Two 32-bit registers for product HI: most-significant 32 bits LO: least-significant 32-bits Instructions mult rs, rt / multu rs, rt 64-bit product in HI/LO mfhi rd / mflo rd Move from HI/LO to rd Can test HI value to see if product overflows 32 bits mul rd, rs, rt Least-significant 32 bits of product > rd 49