
Masking Floating-Point Multiplication Hardware for Side-Channel Attacks Research Overview
This research presents the first hardware masking technique tailored for floating-point multiplication, aimed at hardening it against side-channel attacks. The approach secures the multiplications in FALCON's Fourier Transform (FFT), applies to floating-point multiplications in AI/ML algorithms, and introduces sub-circuits for efficient computation of nontraditional operations. The talk explains why floating-point arithmetic needs protection, especially in embedded systems, and reviews why masking is costly, notably for non-linear operations.
Presentation Transcript
TCHES 2024. Masking Floating-Point Multiplication Hardware for Side-Channel Attacks. 19 Mar 2024. Emre Karabulut and Aydin Aysu, North Carolina State University & MithrilAI Corp.
BLUF. The first hardware masking technique tailored for floating-point multiplication. A pioneering technique to secure FALCON's Fourier Transform (FFT) multiplication. Applicable to AI/ML's floating-point multiplications. A generic solution, extendable to different floating-point precisions. Novel sub-circuits that ensure efficient computation on nontraditional operations. Security and functionality are empirically verified. High area-performance overheads!
Introduction. Floating-point multiplication is increasingly targeted by side-channel attacks. FALCON is (about to be) standardized by NIST for post-quantum signatures. FALCON is favorable in embedded contexts due to its smaller signatures. AI/ML algorithms use floating-point arithmetic. Floating point needs protection (too).
Simplified Side-Channel Attack. [Figure: an adder computes c = a + b, where a is public and b is secret; the leakage is modeled as the Hamming weight of the result, HW(c), observed over several values of a.] How can we make the side-channels independent of secrets?
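The attack idea on this slide can be sketched in a few lines of Python. This is a noise-free toy model, assuming a 4-bit adder that wraps modulo 16 and an attacker who observes the exact Hamming weight of c; the names `leak`, `recover_secret`, and `SECRET_B` are hypothetical and not from the talk.

```python
def hamming_weight(x: int) -> int:
    """Number of set bits in x."""
    return bin(x).count("1")


def leak(a: int, secret_b: int, bits: int = 4) -> int:
    """Idealized leakage: Hamming weight of c = a + b (mod 2^bits)."""
    return hamming_weight((a + secret_b) % (1 << bits))


def recover_secret(observations, bits: int = 4):
    """Keep every guess for b that explains all (a, HW(c)) observations."""
    candidates = []
    for guess in range(1 << bits):
        if all(hamming_weight((a + guess) % (1 << bits)) == hw
               for a, hw in observations):
            candidates.append(guess)
    return candidates


SECRET_B = 11  # hypothetical secret, unknown to the attacker
observations = [(a, leak(a, SECRET_B)) for a in range(16)]
candidates = recover_secret(observations)
print(candidates)
```

With all 16 public inputs observed, only the true secret remains consistent with the leakage. Real measurements are noisy, so practical attacks use statistical distinguishers rather than exact matching.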
Masking Background. An effective countermeasure against side-channel attacks. Splits sensitive variables into multiple randomized shares and then operates on these shares. Example for c = a + b: a0 = (a - r0), a1 = r0 and b0 = (b - r1), b1 = r1, so a = a0 + a1 and b = b0 + b1. Adding share-wise gives c0 = a0 + b0 and c1 = a1 + b1, hence c = c0 + c1. Boolean shares: a → (a0 = a ⊕ r, a1 = r). Arithmetic shares: a → (a0 = (a - r) mod q, a1 = r mod q).
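The two sharing schemes on this slide can be sketched as follows. This is a software illustration only: the modulus `Q` and the 16-bit width are assumptions chosen for the example, and the helper names are hypothetical.

```python
import secrets

Q = 1 << 16  # assumed modulus for arithmetic shares (illustrative)


def arith_share(x: int, q: int = Q):
    """Arithmetic sharing: x -> ((x - r) mod q, r)."""
    r = secrets.randbelow(q)
    return ((x - r) % q, r)


def arith_unshare(shares, q: int = Q) -> int:
    return sum(shares) % q


def masked_add(a_sh, b_sh, q: int = Q):
    """Addition is linear: add share-wise, never recombining a or b."""
    return tuple((x + y) % q for x, y in zip(a_sh, b_sh))


def bool_share(x: int, bits: int = 16):
    """Boolean sharing: x -> (x XOR r, r)."""
    r = secrets.randbits(bits)
    return (x ^ r, r)


def bool_unshare(shares) -> int:
    return shares[0] ^ shares[1]


def masked_xor(a_sh, b_sh):
    """XOR is linear in the Boolean domain: XOR share-wise."""
    return (a_sh[0] ^ b_sh[0], a_sh[1] ^ b_sh[1])


c_sh = masked_add(arith_share(1234), arith_share(4321))
print(arith_unshare(c_sh))
```

Linear operations stay cheap because they act on each share independently. Non-linear operations such as multiplication or AND mix shares and need dedicated gadgets with fresh randomness.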
Background: Complexity in Masking. Non-linear operations are complex and increase cost by more than 4x. We need to handle glitches and memory transitions. We need to ensure composability. Randomness generation is expensive. An arithmetic share cannot be processed within the Boolean masking domain without conversion (A2B, B2A).
Floating-point Multiplication in FALCON. FALCON's reference implementation follows the IEEE-754 standard. The operations are not performed over a modulo field. They consist of integer multiplication, addition, logical shifts, and bitwise operations (AND, OR, XOR).
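To make the list of integer operations concrete, here is a minimal sketch of an IEEE-754 double-precision multiplication built only from integer multiplication, addition, shifts, and bitwise operations. It is not FALCON's reference code: the function names are hypothetical, it assumes normal (non-zero, non-subnormal, finite) inputs and outputs, and it includes round-to-nearest-even so the result can be compared with native arithmetic.

```python
import struct

FRAC_BITS = 52
EXP_BIAS = 1023


def decompose(x: float):
    """Split a double into (sign, biased exponent, fraction) integers."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits >> 63, (bits >> FRAC_BITS) & 0x7FF, bits & ((1 << FRAC_BITS) - 1)


def fp_mul_normal(x: float, y: float) -> float:
    sx, ex, fx = decompose(x)
    sy, ey, fy = decompose(y)
    assert 0 < ex < 0x7FF and 0 < ey < 0x7FF, "normal inputs only"

    sign = sx ^ sy                      # bitwise XOR of the sign bits
    mx = fx | (1 << FRAC_BITS)          # restore the implicit leading 1
    my = fy | (1 << FRAC_BITS)
    prod = mx * my                      # 53 x 53 -> up to 106-bit integer product
    exp = ex + ey - EXP_BIAS            # integer addition of the exponents

    # Normalize: the product lies in [2^104, 2^106).
    shift = 53 if prod >> 105 else 52
    if shift == 53:
        exp += 1
    mant = prod >> shift                # logical shift keeps the top 53 bits
    rem = prod & ((1 << shift) - 1)     # discarded bits decide the rounding
    half = 1 << (shift - 1)
    if rem > half or (rem == half and (mant & 1)):
        mant += 1                       # round to nearest, ties to even
        if mant >> 53:                  # rounding overflowed the mantissa
            mant >>= 1
            exp += 1

    assert 0 < exp < 0x7FF, "normal outputs only"
    bits = (sign << 63) | (exp << FRAC_BITS) | (mant & ((1 << FRAC_BITS) - 1))
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


print(fp_mul_normal(1.5, 2.25))
```

The 53-by-53-bit mantissa product is the expensive step, and it is where the 106-bit intermediate value comes from.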
Challenge-1: Large Operands in Masking. This is the first attempt to tackle this research problem. Masked integer multiplication and addition involve large operand sizes. The multiplication output is 106 bits, which requires 140 DSP48E1 blocks and 4480 registers. It also adds significant randomness overhead: 1120 bits of randomness. Cascading several DSP48E1 blocks might cause glitches.
Solution for Large Operands in Masking. Choose a smaller modulo field. This enables multiplying large numbers chunk-by-chunk (for example, an 8-bit operand split into two 4-bit chunks), significantly reducing DSP needs. How to choose the smaller modulo field: both the input and output need to remain in the same field, and a DSP48E1 can handle multiplications up to 17-by-24 bits.
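The chunk-by-chunk idea can be illustrated with a plain schoolbook multiplication over small limbs. This sketch is unmasked and purely arithmetic: the 16-bit limb width (`LIMB_BITS`) and the function names are assumptions for illustration, not the parameters of the presented design.

```python
LIMB_BITS = 16  # assumed chunk width, chosen only for this example


def split_limbs(x: int, limb_bits: int = LIMB_BITS):
    """Split x into little-endian chunks of limb_bits each."""
    mask = (1 << limb_bits) - 1
    limbs = [x & mask]
    x >>= limb_bits
    while x:
        limbs.append(x & mask)
        x >>= limb_bits
    return limbs


def chunked_mul(x: int, y: int, limb_bits: int = LIMB_BITS) -> int:
    """Multiply two large integers using only small limb-by-limb products."""
    result = 0
    for i, xi in enumerate(split_limbs(x, limb_bits)):
        for j, yj in enumerate(split_limbs(y, limb_bits)):
            partial = xi * yj                     # small multiplier-sized product
            result += partial << (limb_bits * (i + j))
    return result


# Two 53-bit mantissa-sized operands (arbitrary example values).
x = (1 << 52) | 0xABCDE12345
y = (1 << 52) | 0x123456789
print(chunked_mul(x, y).bit_length())
```

Each partial product fits a small multiplier, so the same hardware block can be reused across cycles instead of instantiating one wide multiplier. In a masked design, the accumulation of partial products is where carries between chunks have to be handled with care.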
Challenge-2: Masking of Multiple Modulo Fields. Assume A = 2 and B = 3, hence M = 6 (mod 16). With shares A0 = 10, A1 = 8, B0 = 2, B1 = 1: (A0·B0) = 4, (A0·B1) = 10, (A1·B0) = 0, (A1·B1) = 8, so M0 = 14 and M1 = 8. Carry bits distort the computation.
Challenge-2: Carry-bit. With shares {10, 8} and {2, 1}, M0 + M1 = 22 = 5'b1_0110, while 4'b0110 = 6. 352 = 9'b1_0110_0000, while 8'b0110_0000 = 96. 5632 = 13'b1_0110_0000_0000, while 12'b0110_0000_0000 = 1536. How do we find the carry-bit without combining the shares?
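The numbers in this example can be reproduced with a short script. It uses the slide's values (shares of A = 2 and B = 3 modulo 16); the helper name `share_products` is hypothetical. Note that the script recombines the shares to expose the carry, which is exactly what a masked circuit must avoid.

```python
K = 4
Q = 1 << K  # the small field from the example: arithmetic modulo 16

A_SHARES = (10, 8)  # 10 + 8 = 18 = 2 (mod 16)
B_SHARES = (2, 1)   # 2 + 1 = 3


def share_products(a_sh, b_sh, q: int = Q):
    """Cross products of the shares, reduced modulo q and grouped by A's share."""
    a0, a1 = a_sh
    b0, b1 = b_sh
    m0 = ((a0 * b0) % q + (a0 * b1) % q) % q
    m1 = ((a1 * b0) % q + (a1 * b1) % q) % q
    return m0, m1


m0, m1 = share_products(A_SHARES, B_SHARES)

in_field = (m0 + m1) % Q   # correct result inside the 4-bit field
wide_sum = m0 + m1         # what a wider datapath sees: extra bit at position 4
carry = wide_sum >> K      # NOTE: computed here by combining shares, for illustration only

for shift in (0, 4, 8):
    # The stray carry is scaled along with the chunk when it is placed
    # at a higher position of the full-width product.
    print(shift, wide_sum << shift, in_field << shift)
```

Subtracting `carry << K` from the wide sum restores the in-field value, which is why the design needs the carry bit in shared form.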
Solution for Modulo Conversion. Keep the shares in the arithmetic domain and still calculate the carry-bit. A carry-bit calculator gadget: works in the Boolean domain without a need for conversion, using one AND gate* and three XOR gates. *HPC1: a masked AND gate.
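One textbook construction that matches the stated gate count is the full-adder carry written as carry_out = ((a ⊕ c) ∧ (b ⊕ c)) ⊕ c, which needs one AND and three XORs per bit. The sketch below applies it bit-serially to two arithmetic shares, treating each share's bits as a trivial Boolean sharing. Whether this is the authors' exact gadget is an assumption. The masked AND here is a simple two-share software model, not HPC1, and it says nothing about glitch resistance in hardware.

```python
import secrets


def masked_and(x, y):
    """Two-share masked AND (software model only, NOT the HPC1 gadget)."""
    x0, x1 = x
    y0, y1 = y
    r = secrets.randbits(1)  # fresh random bit per AND
    z0 = (x0 & y0) ^ r
    z1 = (x1 & y1) ^ ((r ^ (x0 & y1)) ^ (x1 & y0))
    return z0, z1


def masked_xor(x, y):
    """XOR is linear: applied share-wise."""
    return x[0] ^ y[0], x[1] ^ y[1]


def masked_carry_out(m0: int, m1: int, k: int):
    """Boolean-shared carry-out of m0 + m1 over k bits, without adding m0 and m1."""
    carry = (0, 0)
    for i in range(k):
        a = ((m0 >> i) & 1, 0)  # bit of share 0, held only in Boolean share 0
        b = (0, (m1 >> i) & 1)  # bit of share 1, held only in Boolean share 1
        # carry_out = ((a ^ c) & (b ^ c)) ^ c: one AND and three XORs per bit
        t = masked_and(masked_xor(a, carry), masked_xor(b, carry))
        carry = masked_xor(t, carry)
    return carry


c0, c1 = masked_carry_out(14, 8, 4)
print(c0 ^ c1)
```

The ripple structure processes one bit position per step, which is consistent with a carry calculation that costs many cycles in a low-area design.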
Hardware Design: Integer Multiplication. 32-bit randomness per cycle. The randomness need is constant, irrespective of operand size. Efficient utilization of DSP resources, and avoids A2B conversion.
Hardware Design: Boolean League. The remaining operations are performed in the Boolean masking domain. Masked zero-check gadget: the bitwise OR operation is a foundational mechanism. Masked mantissa selection: requires only four logical gates.
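A zero-check built on OR can be sketched in the Boolean masking domain by OR-reducing the bits of a shared value, with OR obtained from a masked AND via De Morgan's law. This is a generic software illustration under assumed two-share masking, and the helper names are hypothetical. It is not the gadget from the talk.

```python
import secrets


def masked_and(x, y):
    """Two-share masked AND (software model only)."""
    x0, x1 = x
    y0, y1 = y
    r = secrets.randbits(1)
    return (x0 & y0) ^ r, (x1 & y1) ^ ((r ^ (x0 & y1)) ^ (x1 & y0))


def masked_not(x):
    """NOT flips the unmasked value: invert exactly one share."""
    return x[0] ^ 1, x[1]


def masked_or(x, y):
    """a OR b = NOT(NOT a AND NOT b)."""
    return masked_not(masked_and(masked_not(x), masked_not(y)))


def masked_is_nonzero(shares, bits: int):
    """Boolean-shared flag that is 1 iff the shared value is non-zero."""
    s0, s1 = shares
    acc = (0, 0)
    for i in range(bits):
        bit = ((s0 >> i) & 1, (s1 >> i) & 1)
        acc = masked_or(acc, bit)
    return acc


mask = secrets.randbits(16)
flag = masked_is_nonzero((0x1234 ^ mask, mask), 16)
print(flag[0] ^ flag[1])
```

The flag stays in shared form, so later masked logic can consume it without ever unmasking the checked value.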
Implementation Results. Hardware masking overhead is high! Carry-bit calculation is a bottleneck (3920 cycles in the low-area design). Note (a): rounding omitted.
Leakage Assessment Method. TVLA is an effective and commonly used technique for leakage detection. It identifies statistically significant differences between two groups of side-channel measurements (fixed vs. random tests). Welch's t-score: $t = \frac{\mu_F - \mu_R}{\sqrt{\sigma_F^2 / N_F + \sigma_R^2 / N_R}}$, where $\mu$, $\sigma^2$, and $N$ are the sample mean, sample variance, and number of traces of the fixed (F) and random (R) groups.
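A fixed-vs-random Welch's t-test can be sketched with the Python standard library. The function names are hypothetical, and the 4.5 threshold is the value conventionally used with TVLA rather than something stated on the slide.

```python
from math import sqrt
from statistics import mean, variance

TVLA_THRESHOLD = 4.5  # conventional TVLA pass/fail bound (assumption, not from the slides)


def welch_t(fixed, random_):
    """Welch's t-score between two groups of measurements at one time sample."""
    return (mean(fixed) - mean(random_)) / sqrt(
        variance(fixed) / len(fixed) + variance(random_) / len(random_)
    )


def tvla(fixed_traces, random_traces):
    """Per-sample t-scores; each trace is a list of samples of equal length."""
    return [
        welch_t(f, r)
        for f, r in zip(zip(*fixed_traces), zip(*random_traces))
    ]


# Toy traces with two time samples each: only the second sample differs.
fixed_traces = [[0, 10], [1, 11], [2, 12]]
random_traces = [[0, 0], [1, 1], [2, 2]]
scores = tvla(fixed_traces, random_traces)
leaky = [abs(t) > TVLA_THRESHOLD for t in scores]
print(scores, leaky)
```

A second-order test applies the same statistic to preprocessed traces, for example mean-centered and squared samples.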
SCA Evaluation Experimental Setup. Target: SAKURA-X FPGA board at 12 MHz. Measurement: Picoscope 3206D at 125 MHz. Randomness activated/deactivated for masked (protected) vs. unmasked designs. Up to 10M traces for protected designs. Each trace contains up to 170k samples. TVLA with first- and second-order tests.
SCA Evaluation: Unmasked Design. [Figure panels: first-order attack on unmasked; first-order attack on masked; second-order attack on masked; first-order attack on unmasked, balanced design; first-order attack on masked, balanced design; first-order fixed-vs-fixed.] What is supposed to leak leaks, and what isn't supposed to leak doesn't.
Conclusion. The first hardware masking technique tailored for floating-point multiplication. A pioneering technique to secure FALCON's Fourier Transform (FFT) multiplication. Applicable to AI/ML's floating-point multiplications. A generic solution, extendable to different floating-point precisions. Novel sub-circuits that ensure efficient computation on nontraditional operations. Security and functionality are empirically verified. High area-performance overheads!