From Linear Classifiers to Neural Networks - A Comprehensive Overview

This content delves into the transition from linear classifiers to neural networks, covering topics such as discriminant functions, cost functions, loss functions, and the structure of linear classifiers. Explore the representation power of sigmoidal neural networks and the challenges posed by non-differentiable functions in the ideal case scenario.

  • Linear Classifiers
  • Neural Networks
  • Discriminant Functions
  • Sigmoidal Networks
  • Loss Functions

Uploaded on Mar 09, 2025 | 0 Views


Presentation Transcript


  1. From linear classifiers to neural network

  2. Skeleton: Recap linear classifier; Nonlinear discriminant function; Nonlinear cost function; Example linear classifiers; Introduction to neural network; Representation power of sigmoidal neural network

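  3. Linear Classifier: Recap & Notation. We focus on two-class classification for the entire class. $x_i$ - input feature, $x_i = [x_{i1}, x_{i2}, \dots, x_{iN}]^T$, where $i$ is the index of training tokens and $N$ is the feature dimension; $w = [w_1, w_2, \dots, w_N]^T$ - weight vector; $a_i = b + w^T x_i$ - linear output; $\hat{y}_i$ - predicted class; $y_i$ - labelled class. The predicted class is $\hat{y}_i = 1$ if $a_i \ge 0$ and $0$ otherwise, or $\hat{y}_i = 1$ if $a_i \ge 0$ and $-1$ otherwise.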
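
The decision rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the names `linear_output` and `predict`, the example weights, and the choice of `>=` at the threshold are all assumptions made here.

```python
def linear_output(w, x, b):
    """Linear output a = b + w^T x for a single token x."""
    return b + sum(wj * xj for wj, xj in zip(w, x))


def predict(w, x, b, negative_label=-1):
    """Predicted class: 1 if a >= 0, otherwise negative_label (0 or -1)."""
    return 1 if linear_output(w, x, b) >= 0 else negative_label


# Hypothetical weights and tokens, chosen only for illustration.
w, b = [2.0, -1.0], 0.5
print(predict(w, [1.0, 1.0], b))    # a = 0.5 + 2 - 1 = 1.5, so class 1
print(predict(w, [-1.0, 1.0], b))   # a = 0.5 - 2 - 1 = -2.5, so class -1
```

Passing `negative_label=0` gives the 0/1 labelling convention instead of the -1/1 one.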

  4. Discriminant function. $a_i = b + w^T x_i$ - linear output. The predicted class is $\hat{y}_i = \sigma(a_i)$, where $\sigma(a) = 1$ if $a \ge 0$ and $-1$ otherwise. $\sigma(\cdot)$ - nonlinear discriminant function (plotted as a step from $-1$ to $1$).

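  5. Loss function. To evaluate the performance of the classifier, sum a per-token loss over the training tokens: $E(\hat{y}_{1:M}, y_{1:M}) = \sum_{i=1}^{M} \ell(\hat{y}_i, y_i)$, where $\ell(\hat{y}, y)$ is the loss function. (Plots: $\ell(\hat{y}, 1)$ and $\ell(\hat{y}, -1)$ against $\hat{y}$, with ticks at $-1$ and $1$.)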

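  6. Structure of linear classifier. (Diagram: the inputs $x_{i1}, \dots, x_{iN}$ and a constant 1 feed the linear output $a_i = b + w^T x_i$, followed by the discriminant function $\hat{y}_i = \sigma(a_i)$ and the loss $\ell(\hat{y}_i, y_i)$, shown with the plots of $\ell(\hat{y}, 1)$ and $\ell(\hat{y}, -1)$.)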

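  7. Alternatively: nonlinear function $\sigma(a) = a$; loss function $\ell(\hat{y}, y)$ given by a step function of the product $\hat{y}y$. (Plots: $\ell(\hat{y}, 1)$ and $\ell(\hat{y}, -1)$ against $\hat{y}$.)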

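  8. Structure of linear classifier. (Diagram: the same structure as before, $a_i = b + w^T x_i$ followed by $\hat{y}_i = \sigma(a_i)$ and $\ell(\hat{y}_i, y_i)$, now shown with both choices of nonlinear function and both pairs of loss plots $\ell(\hat{y}, 1)$ and $\ell(\hat{y}, -1)$.)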

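  9. Ideal case - Problem? The nonlinear function (a step between $-1$ and $1$) and the loss function (plots of $\ell(\hat{y}, 1)$ and $\ell(\hat{y}, -1)$) are not differentiable, so the classifier cannot be trained using gradient methods. We need a proxy for both.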

  11. Nonlinear Discriminant function. Our goal: find a function that is differentiable and approximates the step function. Solution: sigmoid function. Definition: a bounded, differentiable and monotonically increasing function.

  12. Sigmoid Function - Examples. Logistic function: $\sigma(a) = \frac{1}{1 + e^{-a}}$, with derivative $\sigma'(a) = \frac{e^{-a}}{(1 + e^{-a})^2}$. (Plots: $\sigma(a)$ rising from 0 to 1 and $\sigma'(a)$ peaking at 0.25, both for $a$ from $-5$ to $5$.)

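  13. Sigmoid Function - Examples. Hyperbolic tangent: $\sigma(a) = \tanh(a) = \frac{e^{a} - e^{-a}}{e^{a} + e^{-a}}$, with derivative $\sigma'(a) = \frac{(e^{a} + e^{-a})^2 - (e^{a} - e^{-a})^2}{(e^{a} + e^{-a})^2} = 1 - \tanh^2(a)$. (Plots: $\sigma(a)$ rising from $-1$ to $1$ and $\sigma'(a)$ peaking at 1, both for $a$ from $-5$ to $5$.)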
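
The two sigmoid examples and their derivatives can be checked numerically. The sketch below is illustrative only: the function names and the finite-difference helper `numeric_grad` are assumptions introduced here, and the logistic derivative is written in the equivalent form $\sigma(a)(1 - \sigma(a))$.

```python
import math


def logistic(a):
    """Logistic sigmoid 1 / (1 + e^{-a})."""
    return 1.0 / (1.0 + math.exp(-a))


def logistic_grad(a):
    """Derivative of the logistic; s * (1 - s) equals e^{-a} / (1 + e^{-a})^2."""
    s = logistic(a)
    return s * (1.0 - s)


def tanh_grad(a):
    """Derivative of tanh: 1 - tanh^2(a)."""
    return 1.0 - math.tanh(a) ** 2


def numeric_grad(f, a, h=1e-6):
    """Central finite difference, used only to sanity-check the formulas."""
    return (f(a + h) - f(a - h)) / (2.0 * h)


for a in (-2.0, 0.0, 1.5):
    assert abs(logistic_grad(a) - numeric_grad(logistic, a)) < 1e-6
    assert abs(tanh_grad(a) - numeric_grad(math.tanh, a)) < 1e-6

# Values at a = 0: logistic, its slope, and the slope of tanh.
print(logistic(0.0), logistic_grad(0.0), tanh_grad(0.0))
```

Both derivatives are largest at $a = 0$ and shrink toward zero as $|a|$ grows, which matters later when training with gradients.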

  15. Nonlinear Loss function. Our goal: find a function that is differentiable and is an UPPER BOUND of the step function. Why? For training: min loss => error not large. For test: generalized error < generalized loss < upper bounds.

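  16. Loss function example. Square loss: $\ell(\hat{y}, y) = (\hat{y} - y)^2$, with derivative $\frac{\partial \ell}{\partial \hat{y}} = 2(\hat{y} - y)$. (Plots: $\ell(\hat{y}, 1)$ and $\ell(\hat{y}, -1)$ against $\hat{y}$.) Advantage: easy to solve, common for regression. Disadvantage: it punishes correctly classified tokens (for example, $\hat{y} = 3$ with $y = 1$ still incurs a loss of 4).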

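  17. Loss function example. Hinge loss: $\ell(\hat{y}, y) = -\hat{y}y + 1$ if $\hat{y}y < 1$, and $0$ otherwise. Its derivative is $\frac{\partial \ell}{\partial \hat{y}} = -y$ if $\hat{y}y < 1$, and $0$ otherwise. (Plots: $\ell(\hat{y}, 1)$ and $\ell(\hat{y}, -1)$ against $\hat{y}$.) Advantage: easy to solve, good for classification.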
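
A short sketch can make the two loss examples concrete and check the upper-bound property numerically. The step-shaped `zero_one_loss` below is an assumed form of the ideal loss (1 when $\hat{y}y \le 0$), and all function names are chosen here for illustration.

```python
def zero_one_loss(y_hat, y):
    """Assumed step-shaped loss: 1 when y_hat and y disagree in sign."""
    return 1.0 if y_hat * y <= 0 else 0.0


def square_loss(y_hat, y):
    return (y_hat - y) ** 2


def square_loss_grad(y_hat, y):
    return 2.0 * (y_hat - y)


def hinge_loss(y_hat, y):
    return max(0.0, 1.0 - y_hat * y)


def hinge_loss_grad(y_hat, y):
    return -y if y_hat * y < 1 else 0.0


# Both proxies stay above the step-shaped loss on a grid of outputs.
grid = [k / 10.0 for k in range(-30, 31)]
for y in (-1, 1):
    for y_hat in grid:
        assert square_loss(y_hat, y) >= zero_one_loss(y_hat, y)
        assert hinge_loss(y_hat, y) >= zero_one_loss(y_hat, y)

# A confidently correct token: the square loss punishes it, the hinge loss does not.
print(square_loss(3.0, 1), hinge_loss(3.0, 1))
```

The last line illustrates the disadvantage noted for the square loss: an output far on the correct side of the boundary is still penalized.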

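  19. Linear classifiers example. Choices of nonlinear discriminant function $\sigma(\cdot)$: linear or sigmoid. Choices of loss function $\ell(\hat{y}_i, y_i)$: square or hinge. Linear + Square: MSE classifier. Sigmoid + Square: nonlinear MSE classifier. Linear + Hinge + Regularization: SVM. (Diagram: the inputs $x_{i1}, \dots, x_{iN}$ and a constant 1 feed $a_i = w^T x_i$, followed by $\hat{y}_i = \sigma(a_i)$.)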

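  20. MSE Classifier. $E = \sum_i \ell(\hat{y}_i, y_i) = \sum_i (\hat{y}_i - y_i)^2 = \sum_i (w^T x_i - y_i)^2$. Setting the gradient to zero: $\frac{\partial E}{\partial w} = 2 \sum_i (w^T x_i - y_i)\, x_i = 0$, so $\sum_i x_i x_i^T w = \sum_i x_i y_i$. With $X = [x_1, \dots, x_M]$ and $y = [y_1, \dots, y_M]^T$, this reads $X X^T w = X y$, giving $w = (X X^T)^{-1} X y$.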
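
The closed-form solution can be sketched for two-dimensional tokens without any linear-algebra library. This is a minimal illustration under assumptions made here: the function name `mse_classifier_2d`, the toy data, and the convention that the second feature is a constant 1 standing in for the bias are not from the slides.

```python
def mse_classifier_2d(tokens, labels):
    """Solve (X X^T) w = X y for 2-dimensional tokens using Cramer's rule."""
    # Accumulate A = sum_i x_i x_i^T (symmetric 2x2) and r = sum_i x_i y_i.
    a11 = a12 = a22 = r1 = r2 = 0.0
    for (x1, x2), y in zip(tokens, labels):
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        r1 += x1 * y
        r2 += x2 * y
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("X X^T is singular; the closed form does not apply")
    w1 = (r1 * a22 - a12 * r2) / det
    w2 = (a11 * r2 - a12 * r1) / det
    return w1, w2


# Hypothetical 1-D data with a constant-1 feature appended, labels in {-1, 1}.
tokens = [(-2.0, 1.0), (-1.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
labels = [-1.0, -1.0, 1.0, 1.0]
w = mse_classifier_2d(tokens, labels)
print(w)

# Classify each token with the sign of w^T x.
predictions = [1.0 if w[0] * x1 + w[1] * x2 >= 0 else -1.0 for x1, x2 in tokens]
print(predictions == labels)
```

For higher-dimensional features the same normal equations would normally be solved with a linear solver rather than an explicit inverse.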

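  21. Nonlinear MSE Classifier. Linear case: $E = \sum_i \ell(\hat{y}_i, y_i) = \sum_i (\hat{y}_i - y_i)^2 = \sum_i (w^T x_i - y_i)^2$. Nonlinear case: $E = \sum_i \ell(\hat{y}_i, y_i) = \sum_i (\hat{y}_i - y_i)^2 = \sum_i (\sigma(w^T x_i) - y_i)^2$.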

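  22. Training a Nonlinear MSE Classifier. $E = \sum_i \ell(\hat{y}_i, y_i) = \sum_i (\hat{y}_i - y_i)^2 = \sum_i (\sigma(w^T x_i) - y_i)^2$. Chain rule: $\frac{\partial E}{\partial w} = \sum_i 2\,(\sigma(w^T x_i) - y_i)\, \sigma'(w^T x_i)\, x_i$. Disadvantage: training can stagnate, because $\sigma'(w^T x_i)$ is close to zero wherever the sigmoid saturates.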
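
The chain-rule gradient and the stagnation problem can be illustrated together. The sketch below assumes a logistic sigmoid with labels in {0, 1}; the function names and the single-token example are hypothetical and chosen only to show a saturated unit.

```python
import math


def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def total_error(w, tokens, labels):
    """E = sum_i (sigma(w^T x_i) - y_i)^2."""
    return sum((logistic(dot(w, x)) - y) ** 2 for x, y in zip(tokens, labels))


def gradient(w, tokens, labels):
    """dE/dw = sum_i 2 (sigma(a_i) - y_i) sigma'(a_i) x_i, with a_i = w^T x_i."""
    grad = [0.0] * len(w)
    for x, y in zip(tokens, labels):
        s = logistic(dot(w, x))
        common = 2.0 * (s - y) * s * (1.0 - s)  # sigma'(a) = s (1 - s) for the logistic
        for j, xj in enumerate(x):
            grad[j] += common * xj
    return grad


# Hypothetical single token with label 0 and a large, badly wrong weight.
tokens, labels = [(1.0,)], [0.0]
w_bad = [20.0]
print(total_error(w_bad, tokens, labels))  # close to 1: the token is misclassified
print(gradient(w_bad, tokens, labels))     # tiny: the sigmoid is saturated
```

The error is near its maximum while the gradient is nearly zero, so a gradient step from this point barely moves the weight.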

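  24. Introduction of neural network. $\hat{y}_i$ is a function of $x_i$. For a linear classifier, this function takes a simple form. What if we need more complicated functions? (Diagram: the inputs $x_{i1}, \dots, x_{iN}$ and a constant 1 feed $a_i$, followed by $\hat{y}_i = \sigma(a_i)$.)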

  25. Introduction of neural network. (Diagram: a layer of units $\sigma(\cdot)$ with outputs $z_1$, stacked on the inputs $z_0$ and a constant 1.)

  26. Introduction of neural network. (Diagram: three such layers with outputs $z_1$, $z_2$ and $z_3$, stacked on the inputs $z_0$.)

  27. Introduction of neural network. (Diagram: the general case, layer $L$ with outputs $z_L$ stacked on layer $L-1$ with outputs $z_{L-1}$.)

  28. Notation. Input: $z_0$. Hidden layer 1: $a_1 = b_1 + W_1 z_0$, $z_1 = \sigma_1(a_1)$. Layer 2: $a_2 = b_2 + W_2 z_1$, $z_2 = \sigma_2(a_2)$. (Diagram: the two layers stacked on the input.)

  29. Notation. Layer $L$: $a_L = b_L + W_L z_{L-1}$, $z_L = \sigma_L(a_L)$. Layer $L-1$: $a_{L-1} = b_{L-1} + W_{L-1} z_{L-2}$, $z_{L-1} = \sigma_{L-1}(a_{L-1})$. (Diagram: layer $L$ stacked on layer $L-1$.)
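
The layer-by-layer notation amounts to a simple forward pass. The sketch below assumes the recurrence "linear output = bias + weight matrix times previous layer output, then a sigmoid elementwise" with a logistic sigmoid in every layer; the names `layer` and `forward` and the example weights are hypothetical.

```python
import math


def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))


def layer(W, b, z_prev):
    """One layer: a = b + W z_prev, then z = sigma(a) elementwise."""
    a = [bk + sum(wkj * zj for wkj, zj in zip(row, z_prev)) for row, bk in zip(W, b)]
    return [logistic(ak) for ak in a]


def forward(layers, z0):
    """Apply the layers in order, starting from the input z0."""
    z = z0
    for W, b in layers:
        z = layer(W, b, z)
    return z


# Hypothetical network: 2 inputs, 2 hidden nodes, 1 output node.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),  # W_1 (2x2), b_1
    ([[2.0, -2.0]], [0.0]),                    # W_2 (1x2), b_2
]
print(forward(layers, [2.0, 2.0]))
```

Each `(W, b)` pair is one layer, so deeper networks are built by appending more pairs to the list.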

  30. Question. $z_L$ is a function of $z_0$. Are sigmoid functions good candidates for $\sigma(\cdot)$, and how many kinds of function can be represented by a neural net? Answer: given enough nodes, a 3-layer network with sigmoid or linear activation functions can approximate ANY function with bounded support to sufficient accuracy.

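  32. Proof: representation power. For simplicity, we only consider functions of one variable, with real input and real output. In this case, two layers are enough: $f(x) \approx \sum_{k=1}^{K} \left[ f((k+1)\Delta) - f(k\Delta) \right] z_{1k}$, where $z_{1k}$ is the output of the $k$-th hidden node for input $x$.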

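  33. Proof: representation power. First layer: $K$ hidden nodes, each node represents $z_{1k} = \sigma_1(a_{1k})$, a function of the input $z_0$; $\sigma_1$ - logistic function, $w_{1k} = 1$, with bias $b_1$. Second (output) layer: $w_{2k} = f((k+1)\Delta) - f(k\Delta)$, so the output is $\sum_{k=1}^{K} \left[ f((k+1)\Delta) - f(k\Delta) \right] z_{1k}$.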
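
The idea behind this construction, approximating a one-variable function by a sum of shifted sigmoid steps weighted by the jumps of the function, can be tried numerically. The sketch below is one possible reading, not the slide's exact construction: the steepness parameter `gain`, the output offset `f(delta)`, the grid spacing `delta`, and the choice of `math.sin` as the target are all assumptions made here.

```python
import math


def logistic(a):
    """Numerically stable form of 1 / (1 + e^{-a})."""
    return 0.5 * (1.0 + math.tanh(0.5 * a))


def staircase_network(f, delta, num_nodes, gain):
    """Two-layer sketch: g(x) = f(delta) + sum_k w_k * logistic(gain * (x - k * delta))."""
    # Second-layer weights: the jump of f across each interval of width delta.
    weights = [f((k + 1) * delta) - f(k * delta) for k in range(1, num_nodes + 1)]
    offset = f(delta)  # output bias, an assumption of this sketch

    def g(x):
        # First layer: one steep logistic step per hidden node, centred at k * delta.
        hidden = [logistic(gain * (x - k * delta)) for k in range(1, num_nodes + 1)]
        return offset + sum(w * h for w, h in zip(weights, hidden))

    return g


delta, num_nodes = 0.05, 60
g = staircase_network(math.sin, delta, num_nodes, gain=50.0 / delta)

# Evaluate on a fine grid covering [delta, (num_nodes + 1) * delta].
xs = [delta + 0.001 * i for i in range(3001)]
max_error = max(abs(g(x) - math.sin(x)) for x in xs)
print(max_error)
```

Shrinking `delta` while adding nodes makes the staircase finer, which is the sense in which enough hidden nodes give an arbitrarily accurate approximation on a bounded interval.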
