Introduction to Machine Learning: Important Elements and Data Formats

Explore the foundational concepts of machine learning through the lens of data formats, mathematical foundations, and prediction functions. Delve into the distinctions between univariate and multivariate distributions, essential for understanding supervised learning problems. Gain insights into classic and adaptive machines, deep learning, and bio-inspired adaptive systems as Prof. V.B. More from MET's IOE BKC Nashik guides you through the world of machine learning.

  • Machine Learning
  • Data Formats
  • Mathematical Foundations
  • Univariate Distribution
  • Multivariate Distribution

Presentation Transcript


  1. Introduction to Machine Learning. Prof. V. B. More, MET's IOE, BKC, Nashik

  2. Unit 1: Introduction to Machine Learning. Classic and adaptive machines, machine learning matters, beyond machine learning (deep learning and bio-inspired adaptive systems), machine learning and big data. Important Elements of Machine Learning: data formats, learnability, statistical learning approaches, elements of information theory.

  3. Important Elements of Machine Learning. It is important to understand the mathematical foundations of data formats and prediction functions. Most algorithms treat these concepts in different ways, but the goal is always the same. More recent techniques, such as deep learning, make extensive use of energy/loss functions of the data.

  4. Data formats. In a supervised learning problem, there is always a dataset, defined as a finite set of real-valued vectors with m features each.
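
A dataset matching this description can be written as follows. The symbol n for the number of samples and the bar notation for vectors are assumptions, since the transcript does not preserve the slide's own formula.

```latex
X = \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n\}, \qquad \bar{x}_i \in \mathbb{R}^m
```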

  5. Consider each sample in X as drawn from a statistical multivariate distribution D. All samples are independent and identically distributed (i.i.d.). A multivariate distribution describes two or more measurements jointly, together with the relationships among them. A multivariate normal distribution is the distribution of a vector of normally distributed variables such that any linear combination of the variables is also normally distributed.
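
The linear-combination property can be checked numerically. The sketch below is only an illustration under stated assumptions: it builds two correlated standard normal variables (the correlation rho and the weights a, b are arbitrary choices, not values from the slides) and compares the empirical variance of a*x1 + b*x2 with the value the property predicts. It checks the mean and variance only, not full normality.

```python
import random
import statistics

def sample_bivariate_normal(rng, rho):
    """Draw (x1, x2): standard normal marginals with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x1 = z1
    x2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * z2
    return x1, x2

def combination_variance(a, b, rho):
    """Theoretical variance of a*x1 + b*x2 when both variances are 1."""
    return a * a + b * b + 2.0 * a * b * rho

rng = random.Random(0)            # fixed seed so the run is repeatable
rho, a, b = 0.6, 2.0, -1.0        # illustrative values (assumptions)
samples = [sample_bivariate_normal(rng, rho) for _ in range(50_000)]
combo = [a * x1 + b * x2 for x1, x2 in samples]

empirical_var = statistics.pvariance(combo)
theoretical_var = combination_variance(a, b, rho)
print(empirical_var, theoretical_var)
```

The two printed values should be close to each other; the gap shrinks as the number of samples grows.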

  6. What is the difference between univariate and multivariate distributions? A univariate distribution describes a single random variable, while a multivariate distribution describes a vector of two or more. For a multivariate normal distribution, any linear combination of the variables has a univariate normal distribution, and any conditional distribution for a subset of the variables, given known values for another subset, is again a multivariate normal distribution.

  7. This means that all variables belong to the same distribution D and that, for an arbitrary subset of m values, the joint probability is the product of the individual probabilities.
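
The factorization implied by the i.i.d. assumption can be written as follows. The notation is an assumption, chosen to match the definition of independence rather than copied from the slide.

```latex
P(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_m) = \prod_{i=1}^{m} P(\bar{x}_i)
```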

  8. The corresponding output values can be either numerical-continuous (regression) or categorical (classification).

  9. Examples of numerical outputs and examples of categorical outputs.

  10. We define a generic regressor as a vector-valued function that associates an input value with a continuous output, and a generic classifier as a vector-valued function whose predicted output is categorical (discrete).

  11. If they also depend on an internal parameter vector, the approach is called parametric learning. This applies to both the regressor and the classifier.
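
A minimal sketch of what "depending on an internal parameter vector" can look like in code. The function names, the linear form of the regressor, and the logistic threshold in the classifier are all hypothetical choices made for illustration; the slides do not prescribe them.

```python
import math

def regressor(x, theta):
    """Hypothetical parametric regressor: weighted sum of the features."""
    return sum(t * xi for t, xi in zip(theta, x))

def classifier(x, theta):
    """Hypothetical parametric classifier: threshold a logistic score."""
    score = 1.0 / (1.0 + math.exp(-regressor(x, theta)))
    return 1 if score >= 0.5 else 0

theta = [0.5, -1.0]                   # internal parameter vector (assumed values)
x = [4.0, 1.0]                        # one sample with m = 2 features
y_continuous = regressor(x, theta)    # continuous output
y_class = classifier(x, theta)        # discrete output (0 or 1)
print(y_continuous, y_class)
```

Changing theta changes the predictor without changing the code, which is the sense in which the family of predictors is parametric.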

  12. On the other hand, non-parametric learning doesn't make initial assumptions about the family of predictors. A very common non-parametric family is instance-based learning, which makes real-time predictions based on hypotheses determined only by the training samples.
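
One simple instance-based method is nearest-neighbor prediction, sketched below as an illustration. The data points and labels are made up, and nearest neighbor is only one member of the instance-based family, not necessarily the one the slides have in mind.

```python
import math

def nearest_neighbor_predict(train_X, train_y, query):
    """Return the label of the training sample closest to `query`."""
    distances = [math.dist(x, query) for x in train_X]
    best_index = distances.index(min(distances))
    return train_y[best_index]

# Made-up training samples: no parameters are fitted, the data is the model.
train_X = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
train_y = ["A", "A", "B", "B"]
prediction = nearest_neighbor_predict(train_X, train_y, (4.5, 4.0))
print(prediction)
```

No training step exists here: the hypothesis is determined at prediction time by the stored samples.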

  13. A generic parametric training process must find the best parameter vector, the one that minimizes the regression/classification error on a specific training dataset, and it should also generate a predictor that generalizes correctly when unknown samples are provided.
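
As a rough illustration of such a training process, the sketch below fits the one-parameter model y = theta * x by least squares and then evaluates it on a sample that was not used for fitting. The model form, the data, and the helper names are assumptions made for the example.

```python
def fit_slope(xs, ys):
    """Closed-form least-squares slope for the model y = theta * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def squared_error(theta, xs, ys):
    """Total squared error of y = theta * x on the given samples."""
    return sum((theta * x - y) ** 2 for x, y in zip(xs, ys))

# Made-up training data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

theta = fit_slope(xs, ys)                      # best parameter on the training set
training_error = squared_error(theta, xs, ys)

# A sample that was not seen during training, used to check generalization.
unseen_x, unseen_y = 5.0, 10.1
unseen_error = abs(theta * unseen_x - unseen_y)
print(theta, training_error, unseen_error)
```

Any other value of theta gives a larger training error, and the small error on the unseen sample suggests that the fitted parameter also generalizes for this toy data.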

  14. Another interpretation is in terms of additive noise, where E is the expected value, μ the population mean, σ² the variance, and n the additive noise.
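
One way to write the additive-noise interpretation is sketched below. The symbols r for the regressor and θ for its parameter vector are assumptions, and the zero-mean Gaussian form is one plausible reading of the slide rather than a recovered formula.

```latex
\tilde{y} = r(\bar{x}; \bar{\theta}) + n, \qquad n \sim \mathcal{N}(\mu, \sigma^2), \qquad \mu = E[n] = 0
```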

  15. We can expect zero-mean, low-variance Gaussian noise added to a perfect prediction. A training task must increase the signal-to-noise ratio by optimizing the parameters. High noise variance means that X is dirty and its measurements are not reliable.

  16. In unsupervised learning, we have an input set X of m-length vectors, and we define a clustering function (with n target clusters) that maps each vector to one of the clusters.
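
A clustering function of this kind can be sketched as a mapping from a vector to a cluster index. The nearest-centroid rule and the centroid values below are assumptions chosen for illustration; the slides do not specify how the clusters are determined.

```python
import math

def cluster_of(x, centroids):
    """Clustering function: index of the centroid nearest to x."""
    distances = [math.dist(x, c) for c in centroids]
    return distances.index(min(distances))

# Assumed centroids for n = 3 target clusters (normally learned from data).
centroids = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
X = [(1.0, 0.5), (9.0, 11.0), (0.5, 8.0)]     # m = 2 features per vector
assignments = [cluster_of(x, centroids) for x in X]
print(assignments)
```

Each input vector receives an integer cluster index and no labels are involved, which is what distinguishes this from the supervised case.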

  17. In scikit-learn linear models, the fitted instance has an attribute coef_ that contains the trained coefficients (the intercept is stored separately in intercept_). For example, in a single-parameter linear regression, with X and Y already defined, the output will be:
      >>> from sklearn.linear_model import LinearRegression
      >>> model = LinearRegression()
      >>> model.fit(X, Y)
      >>> model.coef_
      array([ 9.10210898])

  18. Multiclass strategies. When the number of output classes is greater than two, there are two main possibilities to manage a classification problem: one-vs-all and one-vs-one.

  19. One-vs-all: this is the most common strategy and is widely adopted by scikit-learn for most of its algorithms. If there are n output classes, n classifiers are trained in parallel, each separating one class from all the others.

  20. The one-vs-all approach is relatively lightweight (at most n-1 checks are needed to find the right class, so it has O(n) complexity) and, for this reason, it is normally the default choice and no further action is needed.

  21. One-vs-one: a model is trained for each pair of classes, n(n-1)/2 models in total. The complexity is O(n²) and the right class is determined by a majority vote. In general, this choice is more expensive and should be adopted only when a full dataset comparison is not required.
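
The difference in cost between the two strategies, and the majority vote used by one-vs-one, can be sketched as follows. The class names and the pairwise outcomes are hypothetical, and no real classifiers are trained here; the code only counts models and tallies votes.

```python
from collections import Counter
from itertools import combinations

def one_vs_all_count(n_classes):
    """One-vs-all trains one binary classifier per class."""
    return n_classes

def one_vs_one_pairs(classes):
    """One-vs-one trains one binary classifier per unordered pair of classes."""
    return list(combinations(classes, 2))

def majority_vote(pair_winners):
    """Pick the class that wins the most pairwise contests."""
    return Counter(pair_winners).most_common(1)[0][0]

classes = ["orange", "apple", "pear"]          # illustrative class names
pairs = one_vs_one_pairs(classes)
# Hypothetical winners of each pairwise classifier for one sample,
# listed in the same order as `pairs`.
winners = ["apple", "pear", "apple"]
predicted = majority_vote(winners)
print(one_vs_all_count(len(classes)), len(pairs), predicted)
```

With 3 classes both strategies need 3 models, but with 10 classes one-vs-all needs 10 while one-vs-one needs 45, which is where the O(n) versus O(n²) difference shows up.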

  22. Multiclass classification is a classification task with more than two classes. Each sample can be labelled as only one class.

  23. For example, consider classification using features extracted from a set of images of fruit, where each image shows an orange, an apple, or a pear. Each image is one sample and is labelled as one of the three possible classes.

  24. Multiclass classification makes the assumption that each sample is assigned to one and only one label: one sample cannot be both a pear and an apple.

  25. Multiclass classifiers. Inherently multiclass classifiers:
      • sklearn.naive_bayes.BernoulliNB
      • sklearn.tree.DecisionTreeClassifier
      • sklearn.tree.ExtraTreeClassifier
      • sklearn.ensemble.ExtraTreesClassifier
      • sklearn.naive_bayes.GaussianNB
      • sklearn.neighbors.KNeighborsClassifier
      • sklearn.semi_supervised.LabelPropagation
      • sklearn.semi_supervised.LabelSpreading
      • sklearn.discriminant_analysis.LinearDiscriminantAnalysis
      • sklearn.svm.LinearSVC

  26. Inherently multiclass classifiers (continued):
      • sklearn.linear_model.LogisticRegression
      • sklearn.linear_model.LogisticRegressionCV
      • sklearn.neural_network.MLPClassifier
      • sklearn.neighbors.NearestCentroid
      • sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis
      • sklearn.neighbors.RadiusNeighborsClassifier
      • sklearn.ensemble.RandomForestClassifier
      • sklearn.linear_model.RidgeClassifier
      • sklearn.linear_model.RidgeClassifierCV

  27. Multiclass as one-vs-one:
      • sklearn.svm.NuSVC
      • sklearn.svm.SVC
      • sklearn.gaussian_process.GaussianProcessClassifier
      Multiclass as one-vs-the-rest:
      • sklearn.ensemble.GradientBoostingClassifier
      • sklearn.gaussian_process.GaussianProcessClassifier
      • sklearn.svm.LinearSVC
      • sklearn.linear_model.LogisticRegression
      • sklearn.linear_model.LogisticRegressionCV
      • sklearn.linear_model.SGDClassifier
      • sklearn.linear_model.Perceptron
      • sklearn.linear_model.PassiveAggressiveClassifier

  28. Classifiers that support multilabel:
      • sklearn.tree.DecisionTreeClassifier
      • sklearn.tree.ExtraTreeClassifier
      • sklearn.ensemble.ExtraTreesClassifier
      • sklearn.neighbors.KNeighborsClassifier
      • sklearn.neural_network.MLPClassifier
      • sklearn.neighbors.RadiusNeighborsClassifier
      • sklearn.ensemble.RandomForestClassifier
      • sklearn.linear_model.RidgeClassifierCV
      Classifiers that support multiclass-multioutput:
      • sklearn.tree.DecisionTreeClassifier
      • sklearn.tree.ExtraTreeClassifier
      • sklearn.ensemble.ExtraTreesClassifier
      • sklearn.neighbors.KNeighborsClassifier
      • sklearn.neighbors.RadiusNeighborsClassifier
      • sklearn.ensemble.RandomForestClassifier

  29. For more information on the multiclass strategies implemented by scikit-learn, see http://scikit-learn.org/stable/modules/multiclass.html

  30. Learnability. A parametric model can be one of two types: static or dynamic. A static model is determined by the choice of a specific algorithm and is generally unchangeable. A dynamic model is based on a learning hypothesis and can operate on a dynamic set of parameters.

  31. The goal of a parametric learning process is to find the best hypothesis: the one with the lowest prediction error that also avoids overfitting, so that it generalizes well.

  32. In the example dataset, points must be classified as red (Class A) or blue (Class B).

  33. Three hypotheses are shown: the first one (the middle line starting from the left) misclassifies one or two samples, while the lower and upper ones misclassify 13 and 23 samples respectively.

  34. The first hypothesis is optimal and should be selected. The dataset X is linearly separable if there exists a hyperplane that divides the sample space into two subspaces, each containing only elements belonging to the same class.
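
The definition can be illustrated by testing whether one given hyperplane separates a small labelled dataset. The points, labels, and hyperplane coefficients below are made up, and the check only verifies a candidate hyperplane with a fixed orientation (class +1 on the positive side); it does not search for one.

```python
def side(w, b, x):
    """Sign of w.x + b: which side of the hyperplane x lies on."""
    value = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if value > 0 else -1

def separates(w, b, X, y):
    """True if the hyperplane puts every sample on the side given by its label."""
    return all(side(w, b, x) == label for x, label in zip(X, y))

# Made-up two-class dataset with labels -1 and +1.
X = [(1.0, 1.0), (2.0, 1.5), (4.0, 5.0), (5.0, 4.5)]
y = [-1, -1, 1, 1]

good = separates((1.0, 1.0), -6.0, X, y)   # hyperplane x1 + x2 = 6
bad = separates((1.0, 0.0), -1.5, X, y)    # hyperplane x1 = 1.5
print(good, bad)
```

Because at least one separating hyperplane exists, this toy dataset is linearly separable; the second hyperplane simply is not one of the separating ones.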

  35. Overfitting must also be taken into consideration when separating the classes. A parametric model adopts only a family of non-periodic, approximate functions whose ability to oscillate and fit the dataset is determined by the number of parameters.

  36. In the example dataset, the blue classifier is linear while the red one is cubic.

  37. To generalize in classification/categorization, there should be a function that separates the sample data into their respective categories.

  38. If we use a linear function, some of the sample points are misclassified because of oscillations in the data points. The classification function must also take the future trend into account.

  39. If we apply a cubic approach, it can fit this data almost perfectly but, at the same time, it loses its ability to keep a global linear trend.

  40. Therefore, there are two possibilities. First, if we expect future data to be distributed exactly like the training samples, a more complex model can be a good choice. In this case, a linear model will lead to underfitting, because it won't be able to capture all samples for correct classification.

  41. Second, if we think that future data can be locally distributed differently but keep a global trend, it is preferable to accept a higher residual misclassification error in exchange for better generalization ability. Focusing only on the training data can lead to overfitting.

  42. Underfitting and overfitting. The purpose of a machine learning model is to approximate an unknown function that associates input elements with the best possible output. A training set is normally a representation of a global distribution, but it cannot contain all possible elements; otherwise the problem could be solved with a one-to-one association.

  43. If we don't know the future trend while training, we have to fit the model while keeping it free to generalize when an unknown input is presented. Unfortunately, this ideal condition is not always easy to find.

  44. There are two different dangers to consider. Underfitting: the model isn't able to capture the dynamics shown by the training set itself (probably because its capacity is too limited).

  45. Overfitting: the model has excessive capacity and is no longer able to generalize from the original dynamics provided by the training set. It can associate almost all the known samples with the corresponding output values, but when an unknown input is presented, the prediction error will be very high.

  48. It's very important to avoid both underfitting and overfitting. Underfitting is easier to detect by observing the prediction error, while overfitting may prove more difficult to discover, as it could initially be taken for a perfect fit.
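
A common way to expose overfitting is to compare the error on the training set with the error on samples held out from training. The sketch below is a toy illustration under stated assumptions: the data follow the trend y = 2x with alternating +1/-1 perturbations (no randomness), the "memorizer" stands in for an excessively flexible model, and all names are hypothetical.

```python
def mse(predictions, targets):
    """Mean of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def memorizer(train_x, train_y):
    """High-capacity model: returns the target of the closest training input."""
    def predict(x):
        nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
        return train_y[nearest]
    return predict

# Training set: global trend y = 2x with alternating +1/-1 perturbations.
train_x = [float(i) for i in range(20)]
train_y = [2.0 * x + (1.0 if i % 2 == 0 else -1.0) for i, x in enumerate(train_x)]
# Held-out set: unseen inputs that follow the global trend exactly.
val_x = [x + 0.25 for x in train_x]
val_y = [2.0 * x for x in val_x]

slope, intercept = fit_line(train_x, train_y)

def line(x):
    return slope * x + intercept

memo = memorizer(train_x, train_y)

# Each entry holds (training error, held-out error).
errors = {
    "line": (mse([line(x) for x in train_x], train_y),
             mse([line(x) for x in val_x], val_y)),
    "memorizer": (mse([memo(x) for x in train_x], train_y),
                  mse([memo(x) for x in val_x], val_y)),
}
print(errors)
```

The memorizer reaches zero training error, which looks like a perfect fit, yet its held-out error is far larger than that of the simple line. Looking at the training error alone would hide the problem.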

  49. Error measures. When working with supervised learning, we define a non-negative error measure e_m that takes two arguments (expected and predicted output) and is used to compute the total error.
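
A total error built from such a measure can be sketched as a sum over the n training samples. The names Error_H, the tilde for predicted outputs, and the summation form are assumptions consistent with the description, not a formula recovered from the slide.

```latex
\mathrm{Error}_H = \sum_{i=1}^{n} e_m(\tilde{y}_i, y_i), \qquad e_m \geq 0 \;\; \forall\, \tilde{y}_i, y_i
```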

  50. This value is implicitly dependent on the specific hypothesis H through the parameter set, so optimizing the error implies finding an optimal hypothesis. In many cases, it's useful to consider the mean square error (MSE).
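
The mean square error is the average of the squared differences between expected and predicted outputs. A minimal sketch follows; the function name and the sample values are illustrative assumptions.

```python
def mean_square_error(expected, predicted):
    """MSE = (1/n) * sum over samples of (predicted - expected) ** 2."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    return sum((p - e) ** 2 for e, p in zip(expected, predicted)) / len(expected)

expected = [3.0, -0.5, 2.0, 7.0]     # made-up target values
predicted = [2.5, 0.0, 2.0, 8.0]     # made-up model outputs
error = mean_square_error(expected, predicted)
print(error)
```

Squaring makes every term non-negative and penalizes large deviations more heavily than small ones.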
