
Machine Learning Concepts: From Basic Ideas to Case Studies
Dive into the world of machine learning with this comprehensive overview covering fundamental concepts, strategies, case studies like Pokémon vs. Digimon, and the role of unknown parameters in creating classifiers. Explore loss functions, optimization, model complexities, and training examples to enhance your understanding of ML processes.
Presentation Transcript
Review: Basic Idea of ML (https://youtu.be/Ye018rCVvOo, https://youtu.be/bHcJCp2Fyxs) — Step 1: define a function with unknown parameters. Step 2: define the loss. Step 3: optimization.
Review: Strategy (https://youtu.be/WeHM2xpYQpw) — More parameters make it easier to overfit. Why?
Case Study: Pokémon vs. Digimon (https://medium.com/@tyreeostevenson/teaching-a-computer-to-classify-anime-8c77bc89b881)
Pokémon/Digimon Classifier — We want to find a function that takes an image and outputs "Pokémon" or "Digimon". Step 1: determine a function with unknown parameters, based on domain knowledge.
Observation — Given an image, is it a Digimon or a Pokémon? What feature distinguishes them?
Observation — Apply edge detection to each image and count the edge pixels, e(x). One example image gives e(x) = 3558, another gives e(x) = 7389.
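The edge-count feature e(x) can be sketched in a few lines. The slides do not specify which edge detector was used, so this is a crude stand-in: count pixels whose horizontal or vertical intensity jump exceeds a threshold, on an image given as a nested list of grayscale values (the threshold value 30 and the toy image are assumptions for illustration).

```python
def edge_count(img, thresh=30):
    """e(x): count 'edge' pixels, i.e. pixels whose horizontal or vertical
    intensity jump exceeds `thresh`. A crude stand-in for a real edge
    detector (the slides' exact detector is not specified)."""
    h, w = len(img), len(img[0])
    count = 0
    for i in range(h):
        for j in range(w):
            dx = abs(img[i][j] - img[i][j - 1]) if j > 0 else 0
            dy = abs(img[i][j] - img[i - 1][j]) if i > 0 else 0
            if max(dx, dy) > thresh:
                count += 1
    return count

# A toy 4x4 "image": a bright square on a dark background.
toy = [[0,   0,   0, 0],
       [0, 255, 255, 0],
       [0, 255, 255, 0],
       [0,   0,   0, 0]]
print(edge_count(toy))  # → 7
```

On real sprites, images with more intricate line art would yield larger e(x), which is exactly the signal the classifier below thresholds on.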
Function with Unknown Parameters — If e(x) ≥ h, output "Digimon"; if e(x) < h, output "Pokémon". The threshold h is the unknown parameter. Its candidate values form the set H = {1, 2, …, 10000}; the number of candidate functions |H| is the model complexity.
Loss of a function (given data) — Given a dataset D = {(x¹, ŷ¹), (x², ŷ²), …, (x^N, ŷ^N)}, the loss of a threshold h on D is L(h, D) = (1/N) Σₙ l(h, xⁿ, ŷⁿ), the error rate, where l(h, xⁿ, ŷⁿ) outputs 1 if f_h(xⁿ) ≠ ŷⁿ and 0 otherwise. Don't like it? Of course, you can choose cross-entropy instead.
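The threshold classifier and its 0/1 loss translate directly into code. The example data below is hypothetical (not the slide's dataset); each example is a pair of an edge count e(x) and a label.

```python
def classify(e_x, h):
    """f_h(x): predict 'Digimon' if the edge count e(x) >= h, else 'Pokemon'."""
    return "Digimon" if e_x >= h else "Pokemon"

def loss(h, dataset):
    """L(h, D): error rate of threshold h on a dataset of
    (edge_count, label) pairs — the 0/1 loss from the slides."""
    errors = sum(1 for e_x, y in dataset if classify(e_x, h) != y)
    return errors / len(dataset)

# Hypothetical toy examples, not the real slide data.
D = [(3558, "Pokemon"), (7389, "Digimon"), (5200, "Digimon"), (2100, "Pokemon")]
print(loss(5000, D))  # → 0.0 (this threshold separates the toy data perfectly)
```

Swapping `loss` for cross-entropy would only require the classifier to output a probability instead of a hard label; nothing else in the argument changes.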
Training Examples — If we could collect all Pokémon and Digimon in the universe, D_all, we could find the best threshold h_all = arg minₕ L(h, D_all). In practice we only collect some examples D_train = {(x¹, ŷ¹), (x², ŷ²), …, (x^N, ŷ^N)} from D_all, with each (xⁿ, ŷⁿ) ~ D_all sampled independently and identically distributed (i.i.d.), and we find h_train = arg minₕ L(h, D_train). We hope L(h_train, D_all) and L(h_all, D_all) are close.
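Since H is finite (10,000 thresholds), arg minₕ L(h, D_train) can be found by brute force. A minimal sketch, again on made-up toy data:

```python
def loss(h, dataset):
    """L(h, D): 0/1 error rate; predict 'Digimon' when edge count >= h."""
    wrong = sum(1 for e_x, y in dataset
                if ("Digimon" if e_x >= h else "Pokemon") != y)
    return wrong / len(dataset)

def arg_min_loss(dataset, candidates=range(1, 10001)):
    """h_train = arg min over h in H of L(h, D_train), by brute force
    over all candidate thresholds (ties broken by the smallest h)."""
    return min(candidates, key=lambda h: loss(h, dataset))

# Toy training set (hypothetical numbers, not the slide data).
D_train = [(2100, "Pokemon"), (3558, "Pokemon"),
           (5200, "Digimon"), (7389, "Digimon")]
h_train = arg_min_loss(D_train)
print(h_train, loss(h_train, D_train))  # → 3559 0.0
```

`min` returns the first candidate achieving the minimum, so the reported h_train is the smallest threshold that separates the toy data.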
We hope L(h_train, D_all) and L(h_all, D_all) are close. Take all the Pokémon and Digimon we know as D_all: 819 Pokémon and 971 Digimon. (In most applications you cannot obtain D_all; the testing data D_test serves as its proxy.) Here h_all = 4824 and L(h_all, D_all) = 0.28. Source of Digimon: https://github.com/mrok273/Qiita. Source of Pokémon: https://www.kaggle.com/kvpratama/pokemon-images-dataset/data
Sample 200 Pokémon and Digimon as D_train1: h_train1 = 4727, with L(h_train1, D_train1) = 0.27 — even lower than L(h_all, D_all) = 0.28! On the full data, L(h_train1, D_all) = 0.28.
Sample another 200 as D_train2: h_train2 = 3642, with L(h_train2, D_train2) = 0.20 on the sample, but L(h_train2, D_all) = 0.37 on the full data.
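The lucky-sample / unlucky-sample phenomenon is easy to reproduce on synthetic data. Everything below is an assumption for illustration: the Gaussian edge-count distributions, the class sizes, the coarse threshold grid (step 100, to keep brute force fast), and the use of sampling without replacement as a stand-in for i.i.d. sampling.

```python
import random

def loss(h, data):
    """L(h, D): 0/1 error rate; predict 'Digimon' when edge count >= h."""
    return sum(("Digimon" if e >= h else "Pokemon") != y
               for e, y in data) / len(data)

def arg_min_loss(data, candidates=range(0, 10001, 100)):
    """Brute-force arg min of L(h, D) over a coarse threshold grid."""
    return min(candidates, key=lambda h: loss(h, data))

random.seed(0)
# Synthetic "universe" D_all with made-up, overlapping edge-count
# distributions (overlap is what makes the minimum loss nonzero).
D_all = ([(int(random.gauss(4000, 1500)), "Pokemon") for _ in range(800)] +
         [(int(random.gauss(6000, 1500)), "Digimon") for _ in range(900)])

h_all = arg_min_loss(D_all)
D_train = random.sample(D_all, 200)   # one sampled training set
h_train = arg_min_loss(D_train)

print(loss(h_all, D_all), loss(h_train, D_train), loss(h_train, D_all))
```

Two inequalities hold by construction, whatever the sample: L(h_train, D_train) ≤ L(h_all, D_train), since h_train minimizes the training loss, and L(h_all, D_all) ≤ L(h_train, D_all), since h_all minimizes the loss on D_all. The gap between the last two is exactly what the rest of the analysis bounds.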
So L(h_train, D_train) can be smaller than L(h_all, D_all). What do we want? We want L(h_train, D_all) − L(h_all, D_all) ≤ δ. What kind of D_train fulfills this? A sufficient condition: for all h ∈ H, |L(h, D_train) − L(h, D_all)| ≤ δ/2, i.e., D_train is a good proxy of D_all for evaluating the loss of any h.
Why is that condition sufficient? Chain the inequalities: L(h_train, D_all) ≤ L(h_train, D_train) + δ/2 (D_train is a good proxy for h_train) ≤ L(h_all, D_train) + δ/2 (h_train = arg minₕ L(h, D_train), so its training loss is no larger than h_all's) ≤ L(h_all, D_all) + δ/2 + δ/2 (good proxy again, for h_all) = L(h_all, D_all) + δ.
Let ε = δ/2. We want to sample a good D_train: for all h ∈ H, |L(h, D_train) − L(h, D_all)| ≤ ε. What is the probability of sampling a bad D_train?
Very General! The following discussion is model-agnostic: it makes no assumption about the data distribution, and it works for any loss function.
Probability of Failure — Picture the space of all possible training sets: each point is one possible D_train (e.g., D_train1, D_train2). Some points are good samples, others are bad. What is P(D_train is bad)? If a D_train is bad, at least one h ∈ H makes |L(h, D_train) − L(h, D_all)| > ε. So the bad region is the union, over all h ∈ H, of the events "D_train is bad due to h" (and these regions may overlap).
P(D_train is bad) = P(∪_{h∈H} [D_train is bad due to h]) ≤ Σ_{h∈H} P(D_train is bad due to h), by the union bound.
"D_train is bad due to h" means |L(h, D_train) − L(h, D_all)| > ε, where L(h, D_train) = (1/N) Σₙ l(h, xⁿ, ŷⁿ) is the average of the per-example losses l(h, xⁿ, ŷⁿ) over the sampled examples, and L(h, D_all) is the corresponding average over all of D_all.
By Hoeffding's Inequality, P(D_train is bad due to h) ≤ 2 exp(−2Nε²), provided the range of the loss l is [0, 1]; N is the number of examples in D_train.
Putting the pieces together: P(D_train is bad) ≤ Σ_{h∈H} P(D_train is bad due to h) ≤ Σ_{h∈H} 2 exp(−2Nε²) = |H| · 2 exp(−2Nε²). How do we make P(D_train is bad) smaller? Larger N and smaller |H|.
P(D_train is bad) ≤ |H| · 2 exp(−2Nε²). Larger N: each event "bad due to h" becomes less probable, so every piece of the bad region shrinks. Smaller |H|: there are fewer events to union over, so the bad region shrinks as well.
Example — H = {1, 2, …, 10000}, D_train = {(x¹, ŷ¹), …, (x^N, ŷ^N)}, and "good" means for all h ∈ H, |L(h, D_train) − L(h, D_all)| ≤ ε. Then P(D_train is bad) ≤ |H| · 2 exp(−2Nε²). With |H| = 10000, N = 100, ε = 0.1: P(D_train is bad) ≤ 2707 — a vacuous bound, so failure can usually happen (QQ). With N = 500: P(D_train is bad) ≤ 0.91. With N = 1000: P(D_train is bad) ≤ 0.00004.
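The three numbers on this slide come straight from plugging into the bound; a one-liner reproduces them:

```python
import math

def failure_bound(H_size, N, eps):
    """Upper bound on P(D_train is bad): |H| * 2 * exp(-2 * N * eps^2)."""
    return H_size * 2 * math.exp(-2 * N * eps ** 2)

for N in (100, 500, 1000):
    print(N, failure_bound(10000, N, 0.1))
# N=100  → about 2707 (vacuous: a probability bound above 1 says nothing)
# N=500  → about 0.91
# N=1000 → about 0.00004
```

Note the bound scales linearly in |H| but exponentially in N, which is why a modest increase in sample size beats a large decrease in model complexity here.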
Example — P(D_train is bad) ≤ |H| · 2 exp(−2Nε²). If we want P(D_train is bad) ≤ δ, how many training examples do we need? Setting |H| · 2 exp(−2Nε²) ≤ δ and solving gives N ≥ log(2|H|/δ) / (2ε²). With |H| = 10000, ε = 0.1, δ = 0.1: N ≥ 610.3, so 611 examples suffice.
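Solving the bound for N, as above, can be wrapped in a small helper:

```python
import math

def examples_needed(H_size, eps, delta):
    """Smallest integer N with |H| * 2 * exp(-2 * N * eps^2) <= delta,
    i.e. N >= log(2 * |H| / delta) / (2 * eps^2)."""
    return math.ceil(math.log(2 * H_size / delta) / (2 * eps ** 2))

print(examples_needed(10000, 0.1, 0.1))  # → 611 (the slide rounds to 610)
```

The logarithmic dependence on |H| means even a much larger hypothesis set only adds a handful of required examples.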
Model Complexity — P(D_train is bad) ≤ |H| · 2 exp(−2Nε²), where |H| is the number of possible functions you can select from. What if the parameters are continuous, making |H| infinite? Answer 1: everything that happens in a computer is discrete (parameters have finite precision). Answer 2: VC dimension (not covered in this course).
Model Complexity — Why don't we simply use a very small |H|? "D_train is good" means for all h ∈ H, |L(h, D_train) − L(h, D_all)| ≤ ε, which with ε = δ/2 guarantees L(h_train, D_all) − L(h_all, D_all) ≤ δ. But with fewer candidates, h_all = arg minₕ L(h, D_all) itself is worse: L(h_all, D_all) becomes larger.
Tradeoff of Model Complexity — We want L(h_train, D_all) − L(h_all, D_all) ≤ δ, which favors larger N and smaller |H|. Smaller |H|: the gap is small, so L(h_train, D_all) stays close to L(h_all, D_all), but L(h_all, D_all) itself is large. Larger |H|: L(h_all, D_all) is small, but the gap can be large, so L(h_train, D_all) may still be large. Can we get both a small L(h_all, D_all) and a small gap? Yes — Deep Learning.