Supervised Learning Algorithms: Techniques and Steps
"Explore various supervised learning algorithms for classification tasks, including statistical-based methods, distance-based methods, decision tree methods, kernel-based methods, and biological-based methods. Learn the steps involved in building a classification model from data collection to deployment and monitoring."
Presentation Transcript
ENG6600: Advanced Machine Learning
Introduction to Supervised Learning: Classification, Part II
S. Areibi, School of Engineering, University of Guelph
Supervised Learning Algorithms
Many machine learning algorithms have been developed over the past 50-60 years, so today a data scientist has several of them at his or her disposal for solving hard problems. The main task of machine learning is to search the space of possible hypotheses for one that will perform well even on new examples beyond the training set (i.e., one that generalizes).
Typical algorithms:
o Logistic Regression
o Decision Trees
o Rule-based induction
o Neural Networks
o K-Nearest Neighbor
o Bayesian Networks
o Support Vector Machines
o Ensemble-Based Methods
How to categorize them?
Supervised Learning Algorithms
A number of classification techniques are known; they can be broadly grouped into the following categories:
1. Statistical-Based Methods: Logistic Regression, Bayesian Classifiers (Naïve Bayes)
2. Distance-Based Methods: K-Nearest Neighbours
3. Decision Tree (Rule-Based) Methods: ID3, C4.5, CART
4. Kernel-Based Methods: Support Vector Machines (SVM)
5. Biological-Based Methods: Artificial Neural Networks (ANN)
Supervised Learning Steps
The following steps are involved in building a classification model:
1. Prepare your data: clean it, preprocess it, and extract features.
2. Initialize the classifier to be used.
3. Train the classifier: every classifier in scikit-learn provides a fit(X, y) method that fits (trains) the model on the training data X and training labels y.
4. Predict the target: given an unlabeled observation X, predict(X) returns the predicted label y.
5. Evaluate the classifier model (accuracy, TPR, TNR, ...).
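To make these steps concrete, here is a minimal sketch of the workflow, using LogisticRegression on a make_blobs() toy dataset purely as an illustrative choice; any scikit-learn classifier exposes the same fit/predict interface.

# Minimal sketch: initialize, train, predict, evaluate (illustrative estimator)
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_blobs(n_samples=1000, centers=2, random_state=1)   # toy data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
clf = LogisticRegression()              # step 2: initialize the classifier
clf.fit(X_train, y_train)               # step 3: train on labeled data
y_pred = clf.predict(X_test)            # step 4: predict unseen observations
print(accuracy_score(y_test, y_pred))   # step 5: evaluate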
Supervised Learning Algorithms
1. Data collection (gathering) means collecting the relevant data required by the supervised learning algorithm. This data can originate from regular activities such as transactions, demographics, and surveys.
2. Data preparation is where we modify and transform the data as needed. Unwanted data points must be removed and inconsistencies in the data filled in; this step ensures accuracy.
3. Modeling, or the training phase, is where the relationship between the label and the other variables is established.
4. In the evaluation phase, we check for errors and try to improve the model. We may spend a lot of time in this loop!
5. Deployment and monitoring happen on unseen data; this is the stage where the model is implemented and prediction outputs are generated.
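As a sketch of how the preparation, modeling, and evaluation phases are often chained in practice, the following assumes scikit-learn's Pipeline utilities; StandardScaler stands in for the data-preparation step, and cross-validation plays the role of the evaluate-and-improve loop.

# Sketch: preparation + modeling chained, evaluated by cross-validation
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, random_state=1)     # stand-in data
model = make_pipeline(StandardScaler(), LogisticRegression())  # prepare + model
scores = cross_val_score(model, X, y, cv=5)                    # evaluation loop
print(scores.mean())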
Classification Tasks: Types
There are many different types of classification tasks that you may encounter in machine learning, and specialized approaches to modeling may be used for each:
1. Binary Classification: refers to classification tasks that have only two class labels.
2. Multi-Class Classification (each sample belongs to exactly one class): refers to classification tasks that have more than two class labels.
3. Multi-Label Classification (each sample can belong to multiple classes): refers to classification tasks that have two or more class labels, where one or more class labels may be predicted for each example.
4. Imbalanced Classification: refers to classification tasks where the number of examples/records in each class is unequally distributed. Typically, imbalanced classification tasks are binary classification tasks where the majority of examples in the training dataset belong to the normal class and a minority belong to the abnormal class.
Classification Tasks: Types
1. Binary Classification: refers to classification tasks that have only two class labels.
2. Multi-Class Classification: refers to classification tasks that have more than two class labels.
[Figure: side-by-side scatter plots labeled Binary Classification and Multi-Class Classification]
Classification Tasks: Types
1. Binary Classification: only one classifier model is needed!
Examples include:
o Email spam detection, cancer detection, conversion prediction
Popular algorithms:
o Logistic Regression, Support Vector Machines, ...
2. Multi-Class Classification:
Examples include:
o Face classification, plant species classification, OCR
Unlike binary classification, multi-class classification does not have the notion of normal and abnormal outcomes; instead, examples are classified as belonging to one among a range of known classes. Algorithms designed for binary classification can be adapted for multi-class problems. How? By fitting multiple binary classification models, one for each class vs. all other classes (called one-vs-all), e.g. the Easy Level class vs. all the rest (in this case Intermediate, Advanced, ...).
o One-vs-All: fit one binary classification model for each class vs. all other classes.
o If we have a 5-class problem, we need 5 binary classifiers to accomplish the task (see the sketch below).
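A minimal sketch of one-vs-all, assuming scikit-learn's OneVsRestClassifier wrapper; the 5-class blob dataset and LogisticRegression base estimator are illustrative choices.

# Sketch: one-vs-rest fits one binary model per class
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = make_blobs(n_samples=1000, centers=5, random_state=1)  # 5 classes
ovr = OneVsRestClassifier(LogisticRegression())
ovr.fit(X, y)
print(len(ovr.estimators_))  # 5 underlying binary classifiers, one per class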
One-vs-All (One-vs-Rest)
Is there a problem with this approach? Data imbalance may be an issue when we use one-vs-all: each binary classifier sees one class against all the rest combined, so its negative examples usually far outnumber its positives.
One-vs-All (One-vs-Rest)
What steps do we need to take for 3 classes?
1. Make 3 copies of the data:
Data #1: Class_A vs. all other classes (relabeled Class_X)
Data #2: Class_B vs. all other classes (relabeled Class_X)
Data #3: Class_C vs. all other classes (relabeled Class_X)
2. Train each binary classifier on its own dataset.
3. Pass the test data to the classifier models:
Classifier #1: Positive, with probability score 0.9
Classifier #2: Positive, with probability score 0.4
Classifier #3: Negative, with probability score 0.5
Hence, based on the positive responses and the decisive probability score, we can say that our test input belongs to Class_A.
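A hand-rolled sketch of these three steps, assuming a toy 3-class make_blobs() dataset; each binary model plays the role of one "class vs. Class_X" classifier, and the class whose model is most confident wins.

# Sketch: manual one-vs-all with three binary classifiers
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=300, centers=3, random_state=1)
models = []
for c in range(3):
    y_bin = (y == c).astype(int)                 # Class_c vs. Class_X copy
    models.append(LogisticRegression().fit(X, y_bin))

x_test = X[:1]                                   # one test sample
scores = [m.predict_proba(x_test)[0, 1] for m in models]
print(np.argmax(scores))                         # most confident classifier wins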
One-vs-One
One-vs-One (OvO for short) is another heuristic method for using binary classification algorithms for multi-class classification. Like one-vs-rest, one-vs-one splits a multi-class classification dataset into binary classification problems. Unlike one-vs-rest, which creates one binary dataset for each class, the one-vs-one approach creates one dataset for each class versus every other class.
For example, consider a multi-class classification problem with four classes: red, blue, green, and black. This could be divided into six binary classification datasets as follows:
o Binary Classification Problem 1: red vs. blue
o Binary Classification Problem 2: red vs. green
o Binary Classification Problem 3: red vs. black
o Binary Classification Problem 4: blue vs. green
o Binary Classification Problem 5: blue vs. black
o Binary Classification Problem 6: green vs. black
Each binary classification model predicts one class label, and the class with the most predictions (votes) is the one predicted by the one-vs-one strategy. In general, n classes require n(n-1)/2 pairwise classifiers.
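A minimal sketch, assuming scikit-learn's OneVsOneClassifier wrapper; with 4 classes it fits the 6 pairwise models listed above (SVC is just an example base estimator).

# Sketch: one-vs-one fits one binary model per pair of classes
from sklearn.datasets import make_blobs
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

X, y = make_blobs(n_samples=1000, centers=4, random_state=1)  # 4 classes
ovo = OneVsOneClassifier(SVC())
ovo.fit(X, y)
print(len(ovo.estimators_))  # 4*(4-1)/2 = 6 pairwise classifiers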
Classification Tasks: Types
3. Multi-Label Classification:
Examples include:
o Photo classification (multiple objects), object detection
o Identifying pedestrians and traffic signs at the same time in an image
o Decision making: choosing among several algorithms in two or three stages
Popular algorithms:
o Multi-label versions of Decision Trees, Random Forests, Gradient Boosting
o Another approach is to use a separate classification algorithm to predict the labels for each class.
4. Imbalanced Classification:
Examples include:
o Fraud detection, outlier detection, medical diagnostic tests
Popular approaches:
o Since the data is imbalanced, specialized techniques may be used to change the composition of samples in the training dataset by under-sampling the majority class or over-sampling the minority class.
o Techniques: Random Under-Sampling, SMOTE over-sampling
o Cost-sensitive algorithms (Logistic Regression, Decision Trees, SVM); see the sketch after this list.
Evaluation and performance metrics:
o Precision, Recall, F-Measure, ROC
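As a sketch of the cost-sensitive option, scikit-learn's class_weight parameter reweights training errors inversely to class frequency, so mistakes on the rare class cost more; the 99:1 dataset below mirrors the imbalanced example shown later.

# Sketch: cost-sensitive logistic regression via class weighting
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, weights=[0.99, 0.01], random_state=1)
clf = LogisticRegression(class_weight='balanced')  # rare-class errors weigh more
clf.fit(X, y)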
Multi-Label Classification
For example, if we are building a model that predicts all the clothing articles a person is wearing, we can use a multi-label classification model, since more than one option can apply at once.
[Figure: recommender-system example predicting several clothing labels for one image]
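A minimal sketch of training such a model, assuming scikit-learn's MultiOutputClassifier wrapper, which fits one binary classifier per label column (this is the separate-classifier-per-class approach mentioned earlier).

# Sketch: one binary classifier per label column
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

X, y = make_multilabel_classification(n_samples=1000, n_classes=3, random_state=1)
clf = MultiOutputClassifier(LogisticRegression())
clf.fit(X, y)
print(clf.predict(X[:3]))  # rows like [1 0 1]: several labels active at once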
SKLearn Example
We show how to synthesize data in scikit-learn that can be used for:
o Binary Classification
o Multi-Class Classification
o Multi-Label Classification
o Imbalanced Classification
We can use the make_blobs(), make_multilabel_classification(), and make_classification() functions to generate the synthetic datasets.
SKLearn: Binary Classification

# example of a binary classification task
from numpy import where
from collections import Counter
from sklearn.datasets import make_blobs
from matplotlib import pyplot

# define dataset
X, y = make_blobs(n_samples=1000, centers=2, random_state=1)
# summarize dataset shape
print(X.shape, y.shape)
# summarize observations by class label
counter = Counter(y)
print(counter)
# summarize first few examples
for i in range(10):
    print(X[i], y[i])

Output:
(1000, 2) (1000,)
Counter({0: 500, 1: 500})
[-3.05837272 4.48825769] 0
[-8.60973869 -3.72714879] 1
[1.37129721 5.23107449] 0
[-9.33917563 -2.9544469 ] 1
[-11.57178593 -3.85275513] 1
[-11.42257341 -4.85679127] 1
[-10.44518578 -3.76476563] 1
[-10.44603561 -3.26065964] 1
[-0.61947075 3.48804983] 0
[-10.91115591 -4.5772537 ] 1
SKLearn: Binary Classification

# plot the dataset and color the points by class label
for label, _ in counter.items():
    row_ix = where(y == label)[0]
    pyplot.scatter(X[row_ix, 0], X[row_ix, 1], label=str(label))
pyplot.legend()
pyplot.show()

A scatter plot is created for the input variables in the dataset, and the points are colored by their class value. We can see two distinct clusters that we would expect to be easy to discriminate.
SKLearn: Multi-Class

# example of a multi-class classification task
from numpy import where
from collections import Counter
from sklearn.datasets import make_blobs
from matplotlib import pyplot

# define dataset
X, y = make_blobs(n_samples=1000, centers=3, random_state=1)
# summarize dataset shape
print(X.shape, y.shape)
# summarize observations by class label
counter = Counter(y)
print(counter)
# summarize first few examples
for i in range(10):
    print(X[i], y[i])

Output:
(1000, 2) (1000,)
Counter({0: 334, 1: 333, 2: 333})
[-3.05837272 4.48825769] 0
[-8.60973869 -3.72714879] 1
[1.37129721 5.23107449] 0
[-9.33917563 -2.9544469 ] 1
[-8.63895561 -8.05263469] 2
[-8.48974309 -9.05667083] 2
[-7.51235546 -7.96464519] 2
[-7.51320529 -7.46053919] 2
[-0.61947075 3.48804983] 0
[-10.91115591 -4.5772537 ] 1
SKLearn: Multi-Class

# plot the dataset and color the points by class label
for label, _ in counter.items():
    row_ix = where(y == label)[0]
    pyplot.scatter(X[row_ix, 0], X[row_ix, 1], label=str(label))
pyplot.legend()
pyplot.show()

A scatter plot is created for the input variables in the dataset, and the points are colored by their class value. We can see three distinct clusters that we would expect to be easy to discriminate.
SKLearn: Multi-Label

# example of a multi-label classification task
from sklearn.datasets import make_multilabel_classification

# define dataset
X, y = make_multilabel_classification(n_samples=1000, n_features=2,
                                      n_classes=3, n_labels=2, random_state=1)
# summarize dataset shape
print(X.shape, y.shape)
# summarize first few examples
for i in range(10):
    print(X[i], y[i])

Output:
(1000, 2) (1000, 3)
[18. 35.] [1 1 1]
[22. 33.] [1 1 1]
[26. 36.] [1 1 1]
[24. 28.] [1 1 0]
[23. 27.] [1 1 0]
[15. 31.] [0 1 0]
[20. 37.] [0 1 0]
[18. 31.] [1 1 1]
[29. 27.] [1 0 0]
[29. 28.] [1 1 0]

The code above generates a dataset with 1,000 examples, each with two input features. There are three classes, each of which may take on one of two label values (0 or 1) for every example.
SKLearn: Imbalanced

# example of an imbalanced binary classification task
from numpy import where
from collections import Counter
from sklearn.datasets import make_classification
from matplotlib import pyplot

# define dataset
X, y = make_classification(n_samples=1000, n_features=2, n_informative=2,
                           n_redundant=0, n_classes=2, n_clusters_per_class=1,
                           weights=[0.99, 0.01], random_state=1)
# summarize dataset shape
print(X.shape, y.shape)
# summarize observations by class label
counter = Counter(y)
print(counter)
# summarize first few examples
for i in range(10):
    print(X[i], y[i])

Output:
(1000, 2) (1000,)
Counter({0: 983, 1: 17})
[0.86924745 1.18613612] 0
[1.55110839 1.81032905] 0
[1.29361936 1.01094607] 0
[1.11988947 1.63251786] 0
[1.04235568 1.12152929] 0
[1.18114858 0.92397607] 0
[1.1365562 1.17652556] 0
[0.46291729 0.72924998] 0
[0.18315826 1.07141766] 0
[0.32411648 0.53515376] 0

https://machinelearningmastery.com/cost-sensitive-logistic-regression/
SKLearn: Imbalanced

# plot the dataset and color the points by class label
for label, _ in counter.items():
    row_ix = where(y == label)[0]
    pyplot.scatter(X[row_ix, 0], X[row_ix, 1], label=str(label))
pyplot.legend()
pyplot.show()

We can see one main cluster of examples belonging to class 0 and a few scattered examples belonging to class 1. The intuition is that datasets with this property of imbalanced class labels are more challenging to model.
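On data this skewed, overall accuracy is misleading: a model that ignores class 1 entirely still scores about 98% (983 of 1,000 examples are class 0). Here is a short sketch of the per-class metrics recommended on the imbalanced-classification slide, using scikit-learn's classification_report.

# Sketch: per-class precision/recall/F1 on the imbalanced dataset above
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=2, n_informative=2,
                           n_redundant=0, n_clusters_per_class=1,
                           weights=[0.99, 0.01], random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=1)   # keep the 99:1 ratio in both splits
clf = LogisticRegression().fit(X_train, y_train)
# per-class metrics expose how the minority class is handled,
# which overall accuracy hides
print(classification_report(y_test, clf.predict(X_test)))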
Summary
o Supervised learning (SL) is the machine learning task of learning a function that maps an input to an output based on example input-output pairs.
o It infers a function from labeled training data consisting of a set of training examples.
o In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (i.e., the supervisory signal).
o A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.
o An optimal scenario will allow the algorithm to correctly determine the class labels for unseen instances.
o This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way (see inductive bias).
o This statistical quality of an algorithm is measured through the so-called generalization error.
Misc. Resources
o YouTube:
https://www.youtube.com/watch?v=xtOg44r6dsE
https://www.youtube.com/watch?v=Ig1nfPjrETc
https://www.youtube.com/watch?v=Quh6x4kG6VY
https://www.youtube.com/watch?v=olFxW7kdtP8
https://www.youtube.com/watch?v=I7NrVwm3apg
https://www.youtube.com/watch?v=WACw0UPl3BA
o Tutorials:
https://machinelearningmastery.com/introduction-machine-learning-scikit-learn/
https://machinelearningmastery.com/machine-learning-in-python-step-by-step/
https://www.guru99.com/machine-learning-tutorial.html
https://scikit-learn.org/stable/tutorial/basic/tutorial.html
o Documents:
https://www.toptal.com/machine-learning/machine-learning-theory-an-introductory-primer
https://monkeylearn.com/blog/classification-algorithms/
https://machinelearningmastery.com/types-of-learning-in-machine-learning/
https://www.digitalocean.com/community/tutorials/an-introduction-to-machine-learning
https://www.section.io/engineering-education/parametric-vs-nonparametric/
ML Types & Categories
o YouTube:
https://www.youtube.com/watch?v=qY-dzPR4Al8
https://www.youtube.com/watch?v=dRECy6Bv9jk
https://www.youtube.com/watch?v=OXcV0EXSRb4
o Documents:
https://www.linkedin.com/pulse/parametric-non-parametric-machine-learning-algorithm-mayank-verma/
Misc. Resources
o YouTube:
Serrano Academy: https://www.youtube.com/c/LuisSerrano
Serrano, Logistic Regression and the Perceptron algorithm: https://www.youtube.com/watch?v=jbluHIgBmBo&t=1816s
One-Hot Encoding: https://www.youtube.com/watch?v=W_eH0oIXDRI
o Documents:
One-Hot Encoding: https://www.educative.io/blog/one-hot-encoding
Introduction to Machine Learning: https://machinelearningmastery.com/types-of-classification-in-machine-learning/
One-vs-One for Multi-Class Classification: https://machinelearningmastery.com/one-vs-rest-and-one-vs-one-for-multi-class-classification/
Cost-Sensitive Logistic Regression for Imbalanced Classification: https://machinelearningmastery.com/cost-sensitive-logistic-regression/
Misc. Resources: Multi-Label
o YouTube:
Multi-label Classification with Scikit-Learn:
https://www.youtube.com/watch?v=nNDqbUhtIRg
https://www.youtube.com/watch?v=qMhNYVJH5Ag
https://www.youtube.com/watch?v=265-t5HxOR4
o Tutorials:
https://machinelearningmastery.com/multi-label-classification-with-deep-learning/
http://scikit.ml/tutorial.html
o Documents:
https://towardsdatascience.com/multi-label-text-classification-with-scikit-learn-30714b7819c5
http://manikvarma.org/downloads/XC/XMLRepository.html
o Source Code:
https://www.analyticsvidhya.com/blog/2017/09/common-machine-learning-algorithms/