Machine Learning Concepts: From Basic Ideas to Case Studies

Dive into the world of machine learning with this comprehensive overview covering fundamental concepts, strategies, case studies like Pokémon vs. Digimon, and the role of unknown parameters in creating classifiers. Explore loss functions, optimization, model complexity, and training examples to enhance your understanding of ML processes.

  • Machine Learning
  • Concepts
  • Strategies
  • Case Studies
  • Pokémon vs. Digimon

Presentation Transcript


  1. Review: Basic Idea of ML (https://youtu.be/Ye018rCVvOo, https://youtu.be/bHcJCp2Fyxs). Step 1: write down a function with unknown parameters. Step 2: define the loss. Step 3: optimization.

  2. Review: Strategy (https://youtu.be/WeHM2xpYQpw). More parameters, easier to overfit. Why?

  3. Case Study: Pokémon vs. Digimon (https://medium.com/@tyreeostevenson/teaching-a-computer-to-classify-anime-8c77bc89b881)

  4. Pokémon vs. Digimon

  5. Pokémon vs. Digimon

  6. Pokémon/Digimon Classifier: We want to find a function that takes an image and outputs Pokémon or Digimon. First, determine a function with unknown parameters (based on domain knowledge).

  7. Observation: example Digimon and Pokémon images (shown side by side on the slide for comparison).

  8. Observation: apply edge detection and count the edge pixels e. The Pokémon example gives e = 3558, while the Digimon example gives e = 7389.

  9. Function with Unknown Parameters: f_h(image) = Digimon if e ≥ h, and Pokémon if e < h, where e is the edge-pixel count and h is an unknown threshold. The candidate set is H = {1, 2, ..., 10000}; |H|, the number of candidate functions, is the model complexity.
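
As a rough illustration, here is a minimal sketch of such a function in Python. The helper names (count_edge_pixels, classify) and the use of OpenCV's Canny detector are assumptions made for this sketch; the slides only fix the feature e (number of edge pixels) and the unknown threshold h in H = {1, ..., 10000}.

import numpy as np
import cv2  # OpenCV, assumed here as the edge detector

def count_edge_pixels(image_gray):
    # Feature e: number of edge pixels reported by a Canny edge detector.
    # The Canny thresholds (100, 200) are illustrative defaults, not from the slides.
    edges = cv2.Canny(image_gray, 100, 200)
    return int(np.count_nonzero(edges))

def classify(e, h):
    # f_h: predict Digimon if e >= h, otherwise Pokemon.
    return "Digimon" if e >= h else "Pokemon"

# Candidate functions H = {1, 2, ..., 10000}; |H| is the model complexity.
H = range(1, 10001)

# Usage on a random grayscale image standing in for a real sprite.
img = np.random.randint(0, 256, size=(120, 120), dtype=np.uint8)
print(classify(count_edge_pixels(img), h=4824))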

  10. Loss of a function (given data): Given a dataset D = {(x_1, ŷ_1), (x_2, ŷ_2), ..., (x_N, ŷ_N)}, the loss of a threshold h on D is L(h, D) = (1/N) Σ_{n=1}^{N} l(h, x_n, ŷ_n), where l(h, x_n, ŷ_n) outputs 1 if f_h(x_n) ≠ ŷ_n and 0 otherwise, so L(h, D) is the error rate. Don't like it? Of course, you can choose cross-entropy instead.
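
A minimal sketch of this loss as a 0-1 error rate. For simplicity the dataset here is a list of (edge_count, label) pairs, i.e. the feature is precomputed; that simplification is mine, the slides define the loss over (x_n, ŷ_n) pairs.

def zero_one_loss(h, dataset):
    # L(h, D): fraction of examples the threshold classifier gets wrong.
    errors = sum(1 for e, label in dataset
                 if ("Digimon" if e >= h else "Pokemon") != label)
    return errors / len(dataset)

# Usage on a toy dataset of (edge_count, label) pairs.
D = [(3558, "Pokemon"), (7389, "Digimon"), (5100, "Pokemon")]
print(zero_one_loss(4824, D))  # 1 of 3 examples is misclassified -> 0.333...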

  11. Training Examples: If we could collect all the Pokémon and Digimon in the universe as D_all, we could find the best threshold h_all = argmin_h L(h, D_all). In practice we only collect some examples D_train = {(x_1, ŷ_1), (x_2, ŷ_2), ..., (x_N, ŷ_N)} from D_all, where each (x_n, ŷ_n) ~ D_all is sampled independently and identically distributed (i.i.d.). On this sample we find h_train = argmin_h L(h, D_train).

  12. Training Examples (continued): h_all = argmin_h L(h, D_all) uses the whole universe D_all, while h_train = argmin_h L(h, D_train) uses only the sampled D_train. We hope L(h_train, D_all) and L(h_all, D_all) are close.
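
Since H = {1, ..., 10000} is small, h_train = argmin_h L(h, D_train) can be found by exhaustive search. A self-contained sketch (the toy numbers below are invented for illustration):

def zero_one_loss(h, dataset):
    # Error rate of the threshold classifier with threshold h on the given dataset.
    return sum(1 for e, y in dataset
               if ("Digimon" if e >= h else "Pokemon") != y) / len(dataset)

def best_threshold(dataset, H=range(1, 10001)):
    # argmin over candidate thresholds; min() keeps the smallest h among ties.
    return min(H, key=lambda h: zero_one_loss(h, dataset))

D_train = [(3558, "Pokemon"), (4100, "Pokemon"), (7389, "Digimon"), (6900, "Digimon")]
h_train = best_threshold(D_train)
print(h_train, zero_one_loss(h_train, D_train))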

  13. We hope L(h_train, D_all) and L(h_all, D_all) are close. Treat all the Pokémon and Digimon we know as D_all (Pokémon: 819, Digimon: 971). Then h_all = 4824 and L(h_all, D_all) = 0.28. (In most applications you cannot obtain D_all; the testing data D_test serves as a proxy of D_all.) Source of Digimon: https://github.com/mrok273/Qiita. Source of Pokémon: https://www.kaggle.com/kvpratama/pokemon-images-dataset/data

  14. We hope L(h_train, D_all) and L(h_all, D_all) are close. Sample 200 Pokémon and Digimon from D_all as D_train1 (recall h_all = 4824, L(h_all, D_all) = 0.28). On this sample, h_train1 = 4727 and L(h_train1, D_train1) = 0.27, even lower than L(h_all, D_all)! On the full set, L(h_train1, D_all) = 0.28.

  15. We hope L(h_train, D_all) and L(h_all, D_all) are close. Sample another 200 Pokémon and Digimon as D_train2. This time h_train2 = 3642 and L(h_train2, D_train2) = 0.20, but on the full set L(h_train2, D_all) = 0.37, much worse than L(h_all, D_all) = 0.28.
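
These two samples behave very differently purely by luck of the draw. A small synthetic simulation makes the same point; the data below is invented (Gaussian edge counts, not the real sprites), so the numbers will not match the 0.27 / 0.37 figures on the slides.

import random

random.seed(0)

def zero_one_loss(h, dataset):
    return sum(1 for e, y in dataset
               if ("Digimon" if e >= h else "Pokemon") != y) / len(dataset)

def best_threshold(dataset, H=range(1, 10001, 50)):  # coarse grid keeps the sketch fast
    return min(H, key=lambda h: zero_one_loss(h, dataset))

# Synthetic "universe" D_all: overlapping edge-count distributions for the two classes.
D_all = ([(random.gauss(4000, 1200), "Pokemon") for _ in range(2000)] +
         [(random.gauss(6000, 1200), "Digimon") for _ in range(2000)])

h_all = best_threshold(D_all)
print("L(h_all, D_all) =", round(zero_one_loss(h_all, D_all), 3))

for i in range(3):  # three independent training sets of 200 examples each
    D_train = random.sample(D_all, 200)
    h_train = best_threshold(D_train)
    print("D_train%d:" % (i + 1),
          "L(h_train, D_train) =", round(zero_one_loss(h_train, D_train), 3),
          " L(h_train, D_all) =", round(zero_one_loss(h_train, D_all), 3))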

  16. So L(h_train, D_train) can be smaller than L(h_all, D_all). What do we want? We want L(h_train, D_all) − L(h_all, D_all) ≤ δ. What kind of D_train fulfills it? One sufficient condition: for all h ∈ H, |L(h, D_train) − L(h, D_all)| ≤ δ/2, i.e. D_train is a good proxy of D_all for evaluating the loss of any h.

  17. Why the condition is sufficient: if for all h ∈ H, |L(h, D_train) − L(h, D_all)| ≤ δ/2, then L(h_train, D_all) ≤ L(h_train, D_train) + δ/2 ≤ L(h_all, D_train) + δ/2 (because h_train = argmin_h L(h, D_train)) ≤ L(h_all, D_all) + δ/2 + δ/2 = L(h_all, D_all) + δ.

  18. Let ε = δ/2. We want to sample a good D_train, i.e. one with |L(h, D_train) − L(h, D_all)| ≤ ε for all h ∈ H. What is the probability of sampling a bad D_train?

  19. Very General! The following discussion is model-agnostic, makes no assumption about the data distribution, and works for any loss function.

  20. Probability of Failure: picture the space of all possible training sets; each point is one training set. Some points are good D_train (like D_train1) and some are bad D_train (like D_train2).

  21. Probability of Failure: each point is a training set. We want to bound P(D_train is bad), the probability that the sampled D_train lands in the bad region.

  22. Probability of Failure: each point is a training set. If a D_train is bad, then at least one h ∈ H makes |L(h, D_train) − L(h, D_all)| > ε. So the bad region is the union, over h ∈ H, of the regions "D_train is bad due to h" (e.g. due to h_1, h_2, h_3).

  23. P(D_train is bad) = P(∪_{h∈H} {D_train is bad due to h}) ≤ Σ_{h∈H} P(D_train is bad due to h), by the union bound over the overlapping regions for h_1, h_2, h_3, ...

  24. P(D_train is bad) ≤ Σ_{h∈H} P(D_train is bad due to h). Here "D_train is bad due to h" means |L(h, D_train) − L(h, D_all)| > ε. Note that L(h, D_train) = (1/N) Σ_{n=1}^{N} l(h, x_n, ŷ_n) is the average of the per-example losses over the N sampled examples, while L(h, D_all) is the average over all of D_all.

  25. Hoeffding's Inequality: P(D_train is bad due to h) ≤ 2 exp(−2Nε²), provided the per-example loss l takes values in [0, 1] and N is the number of examples in D_train.
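
A small numerical sanity check of this step (synthetic 0/1 per-example losses, purely illustrative): the observed frequency of |L(h, D_train) − L(h, D_all)| > ε for one fixed h should stay below 2 exp(−2Nε²).

import math
import random

random.seed(1)

p = 0.28            # assumed true error rate L(h, D_all) for one fixed h (illustrative)
N, eps = 500, 0.1   # training-set size and tolerance
trials = 10000      # number of training sets sampled in the simulation

bad = 0
for _ in range(trials):
    # Empirical loss L(h, D_train): mean of N i.i.d. 0/1 losses with mean p.
    mean = sum(1 if random.random() < p else 0 for _ in range(N)) / N
    if abs(mean - p) > eps:
        bad += 1

print("observed P(bad due to h) ~", bad / trials)
print("Hoeffding bound          =", 2 * math.exp(-2 * N * eps ** 2))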

  26. Putting the two together: P(D_train is bad) ≤ Σ_{h∈H} P(D_train is bad due to h) ≤ Σ_{h∈H} 2 exp(−2Nε²) = |H| · 2 exp(−2Nε²). How do we make P(D_train is bad) smaller? A larger N and a smaller |H|.

  27. P(D_train is bad) ≤ |H| · 2 exp(−2Nε²). Larger N: with more training examples, each region "D_train is bad due to h" (for h_1, h_2, h_3, ...) shrinks, so the whole bad region shrinks.

  28. P(D_train is bad) ≤ |H| · 2 exp(−2Nε²). Smaller |H|: with fewer candidate functions there are fewer "bad due to h" regions, so the bad region shrinks as well.

  29. Example: H = {1, 2, ..., 10000}, and D_train = {(x_1, ŷ_1), (x_2, ŷ_2), ..., (x_N, ŷ_N)} is good when |L(h, D_train) − L(h, D_all)| ≤ ε for all h. The bound is P(D_train is bad) ≤ |H| · 2 exp(−2Nε²). With |H| = 10000, N = 100, ε = 0.1: P(D_train is bad) ≤ 2707, a vacuous bound (bad samples usually happen, QQ). With |H| = 10000, N = 500, ε = 0.1: P(D_train is bad) ≤ 0.91. With |H| = 10000, N = 1000, ε = 0.1: P(D_train is bad) ≤ 0.00004.
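
These numbers follow directly from the bound; a quick reproduction:

import math

def bad_train_bound(H_size, N, eps):
    # Upper bound on P(D_train is bad): |H| * 2 * exp(-2 * N * eps^2).
    return H_size * 2 * math.exp(-2 * N * eps ** 2)

for N in (100, 500, 1000):
    print(N, bad_train_bound(10000, N, 0.1))
# N = 100  -> about 2707 (vacuous, since a probability is at most 1)
# N = 500  -> about 0.91
# N = 1000 -> about 0.00004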

  30. Example: P(D_train is bad) ≤ |H| · 2 exp(−2Nε²). If we want P(D_train is bad) ≤ δ, how many training examples do we need? Requiring |H| · 2 exp(−2Nε²) ≤ δ gives N ≥ log(2|H|/δ) / (2ε²). With |H| = 10000, δ = 0.1, ε = 0.1, this gives N ≥ 610.
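
Solving |H| · 2 exp(−2Nε²) ≤ δ for N gives the sample-size requirement above; a one-line check of the 610 figure:

import math

def examples_needed(H_size, delta, eps):
    # Real-valued threshold: N must be at least log(2*|H|/delta) / (2*eps^2).
    return math.log(2 * H_size / delta) / (2 * eps ** 2)

print(examples_needed(10000, 0.1, 0.1))  # about 610.3, matching the slide's N >= 610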

  31. Model Complexity: the |H| in P(D_train is bad) ≤ |H| · 2 exp(−2Nε²) is the number of possible functions you can select from. What if the parameters are continuous? Answer 1: everything that happens in a computer is discrete. Answer 2: VC-dimension (not covered in this course).

  32. Model Complexity: given P(D_train is bad) ≤ |H| · 2 exp(−2Nε²), why don't we simply use a very small |H|? "D_train is good" means |L(h, D_train) − L(h, D_all)| ≤ ε for all h, which (with ε = δ/2) guarantees L(h_train, D_all) ≤ L(h_all, D_all) + δ. But with fewer candidates, h_all = argmin_h L(h, D_all) itself may have a larger loss.

  33. Tradeoff of Model Complexity: we have L(h_train, D_all) ≤ L(h_all, D_all) + δ. Larger N and smaller |H| make δ smaller, so L(h_train, D_all) stays close to L(h_all, D_all). But a smaller |H| means a larger L(h_all, D_all), while a larger |H| means a smaller L(h_all, D_all) at the cost of a larger gap. Can we get both a small L(h_all, D_all) and a small gap? Yes, Deep Learning.

  34. https://forms.gle/FKGwMczbJPxnWe9o7
