
Machine Learning Introduction and Model Evaluation at IHEP
Explore machine learning concepts, model evaluation, and selection discussed at a tracking software meeting in a new building at IHEP. Topics include model prediction, data training, Occam's razor, inductive bias, overfitting, underfitting, and performance measurement methods like error rate, accuracy, precision, recall, and ROC curve.
Presentation Transcript
Introduction of Machine Learning (slide 1)
Chen Zhengyuan, 2019.11.14
Tracking software meeting, new building, IHEP
Outline (slide 2)
- Introduction
- Model evaluation and selection: some jargon; performance measurement; evaluation methods; comparison tests
- Preliminary: linear model
- Learning algorithms: decision tree; neural networks; support vector machine; Bayes optimal classifier
- Summary
Introduction (slide 3)
People make predictions from experience; a machine makes predictions from data, by training a model.
Training data: n examples, each described by d attributes and a label,
D = { (x_11, x_12, ..., x_1d; y_1), (x_21, x_22, ..., x_2d; y_2), ..., (x_i1, x_i2, ..., x_id; y_i), ..., (x_n1, x_n2, ..., x_nd; y_n) }.
Training turns the data into a model. Machine learning is aimed at the learning algorithm! We hope the model has strong generalization ability.
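
The training set above maps naturally onto arrays: a matrix X of shape (n, d) holding the attributes and a vector y of n labels. A minimal sketch (my own illustration, not from the slides), assuming scikit-learn is available, to make the data -> learning algorithm -> model flow concrete:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy training set: n = 6 examples, d = 2 attributes each, with a binary label.
X = np.array([[0.3, 0.7],
              [0.5, 0.9],
              [0.8, 0.2],
              [0.1, 0.4],
              [0.9, 0.6],
              [0.6, 0.1]])
y = np.array([1, 1, 0, 1, 0, 0])   # 1 = "good", 0 = "bad"

model = DecisionTreeClassifier()   # the learning algorithm
model.fit(X, y)                    # training: data -> model

# The trained model is then used to predict unseen examples.
print(model.predict(np.array([[0.4, 0.8]])))
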
Introduction (slide 4)
[Figure: two candidate models, A and B, fitted to the same data.]
Occam's razor: if more than one hypothesis agrees with the observations, choose the simplest one.
Inductive bias: the preference of a machine learning algorithm for certain types of hypotheses during learning.
Model evaluation and selection: some jargon (slide 5)
Overfitting: regarding local features of the training samples as general features (in the slide's example the over-fitted model rejects a new leaf: training result "not a leaf").
Underfitting: not fully capturing the features of the samples (the under-fitted model accepts almost anything as "a leaf").
Training set: used to learn the model; the learning algorithm adjusts the model parameters on it.
Validation set: used in model evaluation and selection to choose the model and adjust the hyperparameters.
Testing set: the samples to be predicted, used for the final evaluation.
[Figure: workflow from training set through the learning algorithm to the model, with the validation set driving model/hyperparameter choices and the testing set kept for the final check.]
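
A short sketch (mine, not from the slides) of the three-way split just described, assuming scikit-learn; the synthetic data, split fractions and the tuned hyperparameter (max_depth) are arbitrary choices for illustration:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))            # 200 examples, 4 attributes
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic binary labels

# Split: 60% training, 20% validation, 20% testing.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Choose a hyperparameter on the validation set ...
best_depth, best_acc = None, -1.0
for depth in (1, 2, 3, 5, None):
    acc = DecisionTreeClassifier(max_depth=depth).fit(X_train, y_train).score(X_val, y_val)
    if acc > best_acc:
        best_depth, best_acc = depth, acc

# ... and report the final performance on the untouched testing set.
final = DecisionTreeClassifier(max_depth=best_depth).fit(X_train, y_train)
print("chosen max_depth:", best_depth, "test accuracy:", final.score(X_test, y_test))
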
Model evaluation and selection: performance measures (slide 6)
The quality of a model is relative; it also depends on the mission requirements!
Error rate: the proportion of incorrectly classified samples among all samples.
Accuracy: 1 - error rate.
Mean squared error: E(f; D) = (1/m) sum_{i=1}^{m} (f(x_i) - y_i)^2.
Precision: the proportion of truly good examples among the selected examples, P = TP / (TP + FP).
Recall: the proportion of selected good examples among all good examples, R = TP / (TP + FN).
[Figures: P-R curve (precision vs. recall, with the Break-Even Point), ROC curve (true positive rate vs. false positive rate, comparing learners A, B, C), and the cost curve built from the points (0, FPR) and (1, FNR).]
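
The measures above can be computed directly; a small sketch (not from the talk), assuming scikit-learn, with made-up labels and scores:

import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             mean_squared_error, roc_curve)

y_true  = np.array([1, 0, 1, 1, 0, 0, 1, 0])           # ground-truth labels
y_pred  = np.array([1, 0, 0, 1, 0, 1, 1, 0])           # hard predictions
y_score = np.array([.9, .2, .4, .8, .3, .6, .7, .1])   # scores used for the ROC curve

print("error rate:", 1 - accuracy_score(y_true, y_pred))
print("accuracy:  ", accuracy_score(y_true, y_pred))
print("precision: ", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall:    ", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("MSE:       ", mean_squared_error(y_true, y_score))

fpr, tpr, thresholds = roc_curve(y_true, y_score)      # points of the ROC curve
print("ROC points (FPR, TPR):", list(zip(fpr, tpr)))
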
Model evaluation and selection: evaluation methods (slide 7)
Hold-out: divide the data set into two mutually exclusive sets; repeat the stratified sampling several times and take the average.
Cross-validation: divide the data set into k subsets; each time use one subset for testing and the others for training.
Bootstrapping: build the training set D' by sampling from D with replacement; use D \ D' as the testing set.
Comparison tests:
Hypothesis test: test a hypothesis about a single learner.
Paired t-test: compare two learners using cross-validation.
McNemar test: for dichotomous (binary classification) problems.
Friedman test and Nemenyi post-hoc test: for multiple data sets and multiple algorithms; if the hypothesis is rejected, draw the Friedman test plot.
[Figure: Friedman test plot of the average rank values (1-3) of algorithms a, b, c.]
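
A sketch (my own, under obvious assumptions) contrasting hold-out, k-fold cross-validation and bootstrapping, using scikit-learn for the first two and NumPy for the bootstrap sample:

import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

# Hold-out: one stratified split (repeat and average in practice).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=1)
print("hold-out accuracy:", LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te))

# k-fold cross-validation: k = 5 subsets, each used once as the test set.
scores = cross_val_score(LogisticRegression(), X, y, cv=5)
print("5-fold CV accuracy:", scores.mean())

# Bootstrapping: sample n indices with replacement; the left-out examples (D \ D') form the test set.
idx = rng.integers(0, len(X), size=len(X))
oob = np.setdiff1d(np.arange(len(X)), idx)
model = LogisticRegression().fit(X[idx], y[idx])
print("bootstrap out-of-bag accuracy:", model.score(X[oob], y[oob]))
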
Preliminary: linear model (slide 8)
Linear regression: f(x) = w^T x + b.
Generalized linear model: y = g^{-1}(w^T x + b), where g(.) is the link function.
Logistic regression achieves classification. For binary classification the ideal output is the unit-step function:
y = 1 if w^T x + b > 0; y = 0.5 if w^T x + b = 0; y = 0 if w^T x + b < 0.
The step is replaced by the sigmoid y = 1 / (1 + e^{-(w^T x + b)}), whose logit (log odds) is ln( y / (1 - y) ) = w^T x + b.
[Figure: the unit-step function and the sigmoid, both passing through 0.5 at w^T x + b = 0.]
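
A short sketch (not from the slides) of the sigmoid link used above; the weights w and bias b are arbitrary illustrative values:

import numpy as np

def sigmoid(z):
    # y = 1 / (1 + e^{-z}), with z = w^T x + b
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([1.5, -2.0])   # illustrative weights
b = 0.3                     # illustrative bias
x = np.array([0.8, 0.1])

z = w @ x + b
y = sigmoid(z)
print("P(class 1) =", y, "-> predicted class:", int(y > 0.5))
print("logit ln(y/(1-y)) =", np.log(y / (1 - y)), "= w^T x + b =", z)
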
Preliminary: linear model (slide 9)
Linear Discriminant Analysis (LDA): project the samples onto a direction such that samples of the same class end up as close as possible and samples of different classes as far apart as possible.
[Figure: two-dimensional diagram of LDA in the (x1, x2) plane showing the projection direction.]
Multi-class classification learning: split into binary classification problems.
Class imbalance: rescaling, oversampling, undersampling.
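
A minimal LDA sketch (mine, not the speaker's), assuming scikit-learn; it fits the projection direction on toy two-class data in the (x1, x2) plane:

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
X0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))   # class 0
X1 = rng.normal(loc=[2.0, 1.0], scale=0.5, size=(50, 2))   # class 1
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

lda = LinearDiscriminantAnalysis(n_components=1)
z = lda.fit_transform(X, y)          # samples projected onto the LDA direction

print("projection direction (up to scale):", lda.scalings_.ravel())
print("projected class means:", z[y == 0].mean(), z[y == 1].mean())
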
Learning algorithm: decision tree, binary classification (slide 10)
Watermelon example: starting from the root node, each internal node tests one attribute (colour: green/black/white, roots, sound, texture: clear/turbid, touch, ...) and each leaf node outputs the decision "good" or "bad"; the tree determines the sequence in which attributes are examined.
Special cases (a node becomes a leaf): all the samples at the node belong to the same category, or there is no attribute left to split on.
[Figure: example decision tree over the watermelon attributes, with root node, internal nodes and leaf nodes marked.]
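
A sketch (my own, with a hypothetical attribute encoding) of fitting a small decision tree on watermelon-style categorical data with scikit-learn; the encoded values and labels are invented for illustration:

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical encoding: colour (0=green, 1=black, 2=white), texture (0=clear, 1=turbid),
# sound (0=dull, 1=crisp); label 1 = good melon, 0 = bad melon.
X = np.array([[0, 0, 0],
              [1, 0, 0],
              [0, 0, 1],
              [2, 1, 1],
              [1, 1, 1],
              [2, 1, 0]])
y = np.array([1, 1, 1, 0, 0, 0])

tree = DecisionTreeClassifier(criterion="entropy")   # entropy-based purity, cf. information gain
tree.fit(X, y)

# Print the learned root / internal / leaf structure.
print(export_text(tree, feature_names=["colour", "texture", "sound"]))
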
Learning algorithm: decision tree, binary classification (slide 11)
Division:
Information gain: an indicator that measures the purity of a sample set.
Gain ratio: information gain / intrinsic value.
Pruning: pre-pruning and post-pruning.
Continuous and missing values:
Continuous attributes: discretization.
Missing values: use the complete examples to choose the dividing attribute, and assign an example with a missing value to all child nodes according to weights.
Multivariate decision tree: each split uses a combination of attributes rather than a single one.
[Figure: scatter plot of two classes in the (x, y) plane illustrating the resulting decision boundary.]
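
A small sketch (not from the slides) computing entropy and the information gain of one attribute, matching the definitions above; the toy counts are invented:

import numpy as np

def entropy(labels):
    # Ent(D) = - sum_k p_k * log2(p_k)
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(attribute, labels):
    # Gain(D, a) = Ent(D) - sum_v |D_v| / |D| * Ent(D_v)
    gain = entropy(labels)
    for v in np.unique(attribute):
        mask = attribute == v
        gain -= mask.mean() * entropy(labels[mask])
    return gain

colour = np.array([0, 1, 0, 2, 1, 2])   # attribute values of six examples
label  = np.array([1, 1, 1, 0, 0, 0])   # 1 = good, 0 = bad
print("Ent(D) =", entropy(label))
print("Gain(D, colour) =", information_gain(colour, label))
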
Learning algorithm: neural networks (slide 12)
A neuron receives inputs x_1, ..., x_n with weights w_1, ..., w_n and outputs y = f( sum_{i=1}^{n} w_i x_i ), where f is the activation function.
Activation functions:
sgn(x) = 1 if x >= 0, 0 if x < 0;
sigmoid(x) = 1 / (1 + e^{-x});
ReLU(x) = max(0, x).
[Figure: plots of the three activation functions.]
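
The three activation functions written out in NumPy (my sketch, nothing slide-specific):

import numpy as np

def sgn(x):
    # step function: 1 if x >= 0, else 0
    return (x >= 0).astype(float)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print("sgn:    ", sgn(z))
print("sigmoid:", sigmoid(z))
print("ReLU:   ", relu(z))
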
Learning algorithm: neural networks (slide 13)
Perceptron and multi-layer neural networks: a perceptron computes y = sgn(w_1 x_1 + w_2 x_2 - θ); with weights w_1 = w_2 = 1 and threshold θ = 1.5 it realizes AND, while XOR is not linearly separable and requires a multi-layer network. The weights are learned by w_i <- w_i + Δw_i with Δw_i = η (y - ŷ) x_i.
[Figure: the points (0,0), (0,1), (1,0), (1,1) in the (x_1, x_2) plane for AND and XOR, with the separating lines.]
Local minimum and global minimum: training searches for the optimal parameters that minimize the error, e.g. by gradient descent.
Leaving a local minimum: multiple initializations, simulated annealing, stochastic gradient descent.
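
A sketch (mine, not the speaker's code) of the perceptron update rule Δw_i = η (y - ŷ) x_i learning the AND function; the learning rate, initial weights and epoch count are arbitrary:

import numpy as np

# AND truth table: inputs (x1, x2) and targets y.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

w = np.zeros(2)   # weights w1, w2
theta = 0.0       # threshold
eta = 0.1         # learning rate

for _ in range(50):                               # a few passes over the data
    for xi, yi in zip(X, y):
        y_hat = float(w @ xi - theta >= 0)        # y_hat = sgn(w^T x - theta)
        w += eta * (yi - y_hat) * xi              # Delta w_i = eta * (y - y_hat) * x_i
        theta -= eta * (yi - y_hat)               # the threshold acts as a weight with fixed input -1

print("weights:", w, "threshold:", theta)
print("outputs:", [float(w @ xi - theta >= 0) for xi in X])   # reproduces AND
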
Learning algorithm: support vector machine (slide 14, to be continued).