Convolutional Networks in Lecture 15


An overview of convolutional networks from Lecture 15, presented by Justin Johnson & David Fouhey. Topics covered include a review of backpropagation, the spatial structure of images, fully-connected vs. convolutional layers, padding, stride, and receptive fields.


Uploaded on Apr 12, 2025



Presentation Transcript


  1. Lecture 15: Convolutional Networks. Justin Johnson & David Fouhey, EECS 442 WI 2021, March 11, 2021.

  2. Administrative

  3. Last Time: Backpropagation. Represent complex expressions as computational graphs (e.g. input x, weights W, scores s, hinge loss, regularizer R, total loss L). The forward pass computes outputs; the backward pass computes gradients: each node in the graph receives upstream gradients and multiplies them by its local gradients to compute downstream gradients.

  4. Problem: So far our classifiers don't respect the spatial structure of images! We stretch the pixels of the input image into a column: input x (3072-dim) -> W1 -> hidden layer h (100-dim) -> W2 -> output scores s (10-dim). Solution: Define new computational nodes that operate on images!

  5. Components of a Fully-Connected Network. Fully-connected layers: f(x) = Wx + b. Activation function: f(x) = max(0, x).

  6. Components of a Convolutional Network. Fully-connected layers: f(x) = Wx + b. Activation function: f(x) = max(0, x). Convolution layers, pooling layers, and normalization: x̂_{i,j} = (x_{i,j} - μ_j) / sqrt(σ_j^2 + ε).

  7. Components of a Convolutional Network. Fully-connected layers: f(x) = Wx + b. Activation function: f(x) = max(0, x). Convolution layers, pooling layers, and normalization: x̂_{i,j} = (x_{i,j} - μ_j) / sqrt(σ_j^2 + ε).

  8. Fully-Connected Layer. 32x32x3 image -> stretch to 3072 x 1. Input: 3072 x 1; weights: 10 x 3072; output: 10 x 1.

  9. Fully-Connected Layer. 32x32x3 image -> stretch to 3072 x 1. Input: 3072 x 1; weights: 10 x 3072; output: 10 x 1. Each output is 1 number: the result of taking a dot product between a row of W and the input (a 3072-dimensional dot product).
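The dot-product view above can be sketched in a few lines of numpy. This is a minimal illustration; the random data and weights are placeholders, not the lecture's actual values:

```python
import numpy as np

rng = np.random.default_rng(0)

image = rng.standard_normal((3, 32, 32))  # C x H x W image
x = image.reshape(3072)                   # stretch pixels into a column

W = rng.standard_normal((10, 3072))       # 10 x 3072 weights
b = rng.standard_normal(10)               # 10-dim bias

s = W @ x + b                             # 10 output scores
assert s.shape == (10,)
# each score is one 3072-dimensional dot product with a row of W:
assert np.allclose(s[0], W[0] @ x + b[0])
```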

  10. Convolution Layer. 3x32x32 image: preserve spatial structure. Width 32, height 32, depth / channels 3.

  11. Convolution Layer. 3x32x32 image, 3x5x5 filter. Convolve the filter with the image: slide over the image spatially, computing dot products.

  12. Convolution Layer. Filters always extend the full depth of the input volume. 3x32x32 image, 3x5x5 filter. Convolve the filter with the image: slide over the image spatially, computing dot products.

  13. Convolution Layer. 3x32x32 image, 3x5x5 filter. Each output is 1 number: the result of taking a dot product between the filter and a small 3x5x5 chunk of the image (i.e. a 3*5*5 = 75-dimensional dot product plus bias): wᵀx + b.
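The single number at one spatial location can be sketched directly: a 75-dimensional dot product between the filter and a 3x5x5 chunk of the image. The position (10, 10) and random data below are arbitrary placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((3, 32, 32))  # 3x32x32 image
w = rng.standard_normal((3, 5, 5))        # one 3x5x5 filter
b = 0.5                                   # scalar bias for this filter

chunk = image[:, 10:15, 10:15]            # a 3x5x5 chunk at (10, 10)
out = np.sum(w * chunk) + b               # w^T x + b

# the same value as flattening both and taking a 75-dim dot product:
assert np.isclose(out, w.reshape(-1) @ chunk.reshape(-1) + b)
```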

  14. Convolution Layer. 3x32x32 image, 3x5x5 filter: convolve (slide) over all spatial locations to produce a 1x28x28 activation map.

  15. Convolution Layer. 3x32x32 image, 3x5x5 filter: convolve (slide) over all spatial locations to produce a 1x28x28 activation map. Convolution layer vs. image filtering: >1 input and output channels; forget about convolution vs. cross-correlation.

  16. Convolution Layer. Consider repeating with a second (green) filter: convolving (sliding) over all spatial locations now gives two 1x28x28 activation maps.

  17. Convolution Layer. Consider 6 filters, each 3x5x5 (6x3x5x5 filters): 6 activation maps, each 1x28x28. Stack activations to get a 6x28x28 output image!

  18. Convolution Layer. 6x3x5x5 filters, plus a 6-dim bias vector: 6 activation maps, each 1x28x28. Stack activations to get a 6x28x28 output image!

  19. Convolution Layer. 6x3x5x5 filters, plus a 6-dim bias vector. Equivalent view of the 6x28x28 output: a 28x28 grid, at each point a 6-dim vector.

  20. Convolution Layer. Batch of images: 2x3x32x32; 6x3x5x5 filters plus a 6-dim bias vector; batch of outputs: 2x6x28x28.

  21. Convolution Layer. Batch of images: N x Cin x H x W. Weight: Cout x Cin x Kh x Kw; bias: Cout-dim vector. Batch of outputs: N x Cout x H' x W'.
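The general shapes can be checked with a naive, loop-based convolution. This is a slow reference sketch for clarity (no padding, stride 1), not how real frameworks implement it:

```python
import numpy as np

def conv2d(x, weight, bias):
    """Naive convolution sketch (no padding, stride 1).
    x: N x Cin x H x W, weight: Cout x Cin x Kh x Kw, bias: Cout."""
    N, Cin, H, W = x.shape
    Cout, _, Kh, Kw = weight.shape
    Hout, Wout = H - Kh + 1, W - Kw + 1
    out = np.zeros((N, Cout, Hout, Wout))
    for n in range(N):
        for co in range(Cout):
            for i in range(Hout):
                for j in range(Wout):
                    chunk = x[n, :, i:i + Kh, j:j + Kw]
                    out[n, co, i, j] = np.sum(weight[co] * chunk) + bias[co]
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 32, 32))   # batch of 2 images
w = rng.standard_normal((6, 3, 5, 5))     # 6 filters, each 3x5x5
b = rng.standard_normal(6)                # 6-dim bias vector
y = conv2d(x, w, b)
assert y.shape == (2, 6, 28, 28)          # batch of 6x28x28 outputs
```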

  22. Stacking Convolutions. Input: N x 3 x 32 x 32 -> Conv (W1: 6x3x5x5, b1: 6) -> first hidden layer: N x 6 x 28 x 28 -> Conv (W2: 10x6x3x3, b2: 10) -> second hidden layer: N x 10 x 26 x 26 -> Conv (W3: 12x10x3x3, b3: 12) -> ...

  23. Stacking Convolutions. Q: What happens if we stack two convolution layers? Input: N x 3 x 32 x 32 -> Conv (W1: 6x3x5x5, b1: 6) -> N x 6 x 28 x 28 -> Conv (W2: 10x6x3x3, b2: 10) -> N x 10 x 26 x 26 -> Conv (W3: 12x10x3x3, b3: 12) -> ...

  24. Stacking Convolutions. Q: What happens if we stack two convolution layers? A: It's equivalent to just one convolution layer! (Recall y = W2 W1 x is a linear classifier.) Input: N x 3 x 32 x 32 -> Conv (W1: 6x3x5x5, b1: 6) -> N x 6 x 28 x 28 -> Conv (W2: 10x6x3x3, b2: 10) -> N x 10 x 26 x 26 -> Conv (W3: 12x10x3x3, b3: 12) -> ...

  25. Stacking Convolutions. A: It's equivalent to just one convolution layer! (Recall y = W2 W1 x is a linear classifier.) Solution: Add a nonlinearity between each conv layer. Input: N x 3 x 32 x 32 -> Conv, ReLU (W1: 6x3x5x5, b1: 6) -> N x 6 x 28 x 28 -> Conv, ReLU (W2: 10x6x3x3, b2: 10) -> N x 10 x 26 x 26 -> Conv, ReLU (W3: 12x10x3x3, b3: 12) -> ...
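The "two linear layers collapse into one" point can be verified numerically. A small sketch with random fully-connected matrices (the same argument applies to convolutions, which are also linear):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((100, 3072))  # first linear layer
W2 = rng.standard_normal((10, 100))    # second linear layer
x = rng.standard_normal(3072)

# Stacking two linear maps is the same as one linear map with W = W2 W1:
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)

# With a ReLU in between, the composition is no longer a single linear map:
h = np.maximum(0, W1 @ x)
assert not np.allclose(W2 @ h, (W2 @ W1) @ x)
```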

  26. What do Conv Filters Learn? Input: N x 3 x 32 x 32 -> Conv, ReLU (W1: 6x3x5x5, b1: 6) -> N x 6 x 28 x 28 -> Conv, ReLU (W2: 10x6x3x3, b2: 10) -> N x 10 x 26 x 26 -> ...

  27. What do Conv Filters Learn? Linear classifier: one template per class.

  28. What do Conv Filters Learn? MLP: bank of whole-image templates.

  29. What do Conv Filters Learn? First-layer conv filters: local image templates. (Often learns oriented edges, opposing colors.) Example: AlexNet's first layer has 64 filters, each 3x11x11.

  30. Convolution Spatial Dimensions. Input: 7x7, filter: 3x3. Q: How big is the output?

  31. Convolution Spatial Dimensions. Input: 7x7, filter: 3x3. Q: How big is the output? (Sliding the filter one step over.)

  32. Convolution Spatial Dimensions. Input: 7x7, filter: 3x3. Q: How big is the output? (Sliding the filter further.)

  33. Convolution Spatial Dimensions. Input: 7x7, filter: 3x3. Q: How big is the output? (Sliding the filter further.)

  34. Convolution Spatial Dimensions. Input: 7x7, filter: 3x3, output: 5x5.

  35. Convolution Spatial Dimensions. Input: 7x7, filter: 3x3, output: 5x5. In general: input W, filter K, output W - K + 1. Problem: feature maps shrink with each layer!

  36. Convolution Spatial Dimensions. Problem: feature maps shrink with each layer! Solution: padding. Add zeros around the input. In general: input W, filter K, padding P.

  37. Convolution Spatial Dimensions. In general: input W, filter K, padding P, output W - K + 1 + 2P. Very common: "same" padding. Set P = (K - 1) / 2; then output size = input size.
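The padding formula from the slides can be sketched and checked directly. Integer division is used for P, assuming odd kernel sizes as in the slides:

```python
def conv_output_size(W, K, P=0):
    """Spatial output size for input W, filter K, padding P (stride 1)."""
    return W - K + 1 + 2 * P

# the worked example from the slides: 7x7 input, 3x3 filter -> 5x5 output
assert conv_output_size(7, 3) == 5

# "same" padding P = (K - 1) / 2 keeps output size = input size (odd K):
for K in (1, 3, 5, 7):
    P = (K - 1) // 2
    assert conv_output_size(32, K, P) == 32
```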

  38. Receptive Fields. For convolution with kernel size K, each element in the output depends on a K x K receptive field in the input.

  39. Receptive Fields. Each successive convolution adds K - 1 to the receptive field size. With L layers the receptive field size is 1 + L * (K - 1). Careful: receptive field in the input vs. receptive field in the previous layer. Hopefully clear from context!
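The receptive-field growth can be sketched as a running sum, one K - 1 increment per layer:

```python
def receptive_field(L, K):
    """Receptive field in the input after L conv layers with kernel size K."""
    rf = 1                    # a single output element, before any conv
    for _ in range(L):
        rf += K - 1           # each layer adds K - 1
    return rf

assert receptive_field(1, 3) == 3    # one 3x3 conv sees a 3x3 patch
assert receptive_field(2, 3) == 5
# to cover a large image with 3x3 convs we need many layers, e.g.:
assert receptive_field(112, 3) == 225
```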

  40. Receptive Fields. Each successive convolution adds K - 1 to the receptive field size. With L layers the receptive field size is 1 + L * (K - 1). Problem: For large images we need many layers for each output to see the whole image.

  41. Receptive Fields. Each successive convolution adds K - 1 to the receptive field size. With L layers the receptive field size is 1 + L * (K - 1). Problem: For large images we need many layers for each output to see the whole image. Solution: Downsample inside the network.

  42. Strided Convolution. Input: 7x7, filter: 3x3, stride: 2.

  43. Strided Convolution. Input: 7x7, filter: 3x3, stride: 2. (Sliding the filter two positions at a time.)

  44. Strided Convolution. Input: 7x7, filter: 3x3, stride: 2, output: 3x3.

  45. Strided Convolution. Input: 7x7, filter: 3x3, stride: 2, output: 3x3. In general: input W, filter K, padding P, stride S, output (W - K + 2P) / S + 1.

  46. Convolution Example. Input volume: 3 x 32 x 32; 10 5x5 filters with stride 1, pad 2. Output volume size: ?

  47. Convolution Example. Input volume: 3 x 32 x 32; 10 5x5 filters with stride 1, pad 2. Output volume size: (32 + 2*2 - 5)/1 + 1 = 32 spatially, so 10 x 32 x 32.

  48. Convolution Example. Input volume: 3 x 32 x 32; 10 5x5 filters with stride 1, pad 2. Output volume size: 10 x 32 x 32. Number of learnable parameters: ?

  49. Convolution Example. Input volume: 3 x 32 x 32; 10 5x5 filters with stride 1, pad 2. Output volume size: 10 x 32 x 32. Number of learnable parameters: 760. Parameters per filter: 3*5*5 + 1 (for bias) = 76; 10 filters, so total is 10 * 76 = 760.
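The parameter count works out the same in code, a quick arithmetic check of the slide's numbers:

```python
# 10 filters of size 3x5x5, each with one bias term
Cin, K, Cout = 3, 5, 10
params_per_filter = Cin * K * K + 1   # 75 weights + 1 bias
total_params = Cout * params_per_filter

assert params_per_filter == 76
assert total_params == 760
```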

  50. Convolution Example. Input volume: 3 x 32 x 32; 10 5x5 filters with stride 1, pad 2. Output volume size: 10 x 32 x 32. Number of learnable parameters: 760. Number of multiply-add operations: ?
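The transcript cuts off before the answer slide, but the count follows from the shapes already given: each of the 10 x 32 x 32 output values is a 3*5*5 = 75-dimensional dot product.

```python
# Multiply-add count for the example layer (10 x 32 x 32 outputs,
# 75 multiply-adds per output element):
Cout, Hout, Wout = 10, 32, 32
muls_per_output = 3 * 5 * 5           # one per weight in the filter
total_madds = Cout * Hout * Wout * muls_per_output
assert total_madds == 768_000
```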
