
Probabilistic Formulation in Linear Regression: Foundations and Algorithms
Explore the foundational concepts of probabilistic models for linear regression, including the least squares formulation, regularization techniques, and maximum likelihood estimation. Understand how overfitting arises in regression and how to mitigate it with regularization methods such as Ridge and Lasso. Then develop the probabilistic formulation by modeling X and Y as random variables related by a linear equation with Gaussian noise.

Presentation Transcript
Probabilistic Models for Linear Regression
Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya
Regression Problem
- N iid training samples $\{(x_i, y_i)\}$
- Response / output / target: $y_i \in \mathbb{R}$
- Input / feature vector: $x_i \in \mathbb{R}^d$
- Linear regression: $y_i = w^\top x_i + \epsilon_i$
- Polynomial regression: $y_i = w^\top \phi(x_i) + \epsilon_i$ with $\phi_j(x) = x^j$; still a linear function of $w$ (see the sketch below)
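To make the basis-function point concrete, here is a minimal sketch (not from the slides) of building a polynomial design matrix in NumPy; the degree and inputs are illustrative. The model is nonlinear in $x$ but remains linear in the weights $w$:

```python
import numpy as np

def poly_design_matrix(x, degree):
    """Map scalar inputs x to features [1, x, x^2, ..., x^degree]."""
    return np.vander(x, N=degree + 1, increasing=True)

x = np.linspace(0.0, 1.0, 20)          # 20 illustrative scalar inputs
Phi = poly_design_matrix(x, degree=3)  # shape (20, 4): columns are x^0 .. x^3
# A linear fit on Phi is polynomial regression, still linear in w.
```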
Least Squares Formulation
- Deterministic error term $\epsilon_i$
- Minimize total error: $E(w) = \sum_i (y_i - w^\top x_i)^2$
- $\hat{w} = \arg\min_w E(w)$
- Find the gradient w.r.t. $w$ and equate it to 0: $\hat{w} = (X^\top X)^{-1} X^\top y$
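A minimal NumPy sketch of this closed-form solution on synthetic data (the true weights and noise level below are illustrative); `np.linalg.solve` is used rather than an explicit matrix inverse for numerical stability:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # N = 100 samples, d = 3 features
w_true = np.array([1.0, -2.0, 0.5])          # illustrative true weights
y = X @ w_true + 0.1 * rng.normal(size=100)  # linear model plus noise

# Solve the normal equations (X^T X) w = X^T y for w_hat.
w_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(w_hat)  # close to w_true
```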
Regularization for Regression
- How does regression overfit?
- Adding regularization to regression: minimize $E_1(w, D) + \lambda E_2(w)$, a data-fit term plus a penalty on $w$
Regularization for Regression
- Possibilities for regularizers:
- $\ell_2$ norm $w^\top w$ (Ridge regression). Quadratic: continuous, convex; closed form $\hat{w} = (\lambda I + X^\top X)^{-1} X^\top y$ (sketched below)
- $\ell_1$ norm (Lasso)
- Choosing $\lambda$: cross validation (wastes training data)
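A sketch of the ridge closed form, assuming $\lambda$ has already been chosen (e.g., by the cross validation the slide mentions); note the Lasso has no such closed form and is typically fit by coordinate descent or proximal methods:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution w = (lam*I + X^T X)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(lam * np.eye(d) + X.T @ X, X.T @ y)
# lam shrinks the weights toward zero; lam = 0 recovers plain least squares.
```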
Probabilistic Formulation
- Model X and Y as random variables; directly model the conditional distribution of Y
- IID: $Y_i \mid X_i = x_i \sim P(y \mid x)$
- Linear: $Y_i = w^\top x_i + \epsilon_i$, with $\epsilon_i \sim P(\epsilon)$
- Gaussian noise: $P(\epsilon) = \mathcal{N}(0, \sigma^2)$, so $P(y \mid x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(y - w^\top x)^2}{2\sigma^2}\right\}$
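A small sketch of this generative story, sampling data from the model; the weights and noise level below are illustrative, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(1)
w, sigma = np.array([2.0, -1.0]), 0.3   # illustrative parameters
X = rng.normal(size=(50, 2))
eps = rng.normal(scale=sigma, size=50)  # eps_i ~ N(0, sigma^2)
y = X @ w + eps                         # Y | X = x ~ N(w^T x, sigma^2)
```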
Probabilistic Formulation
[Figure omitted: image from Michael Jordan's book]
Maximum Likelihood Estimation
- Formulate the likelihood: $L(w) = \prod_i P(y_i \mid x_i; w) = \left(\frac{1}{2\pi\sigma^2}\right)^{N/2} \exp\left\{-\frac{1}{2\sigma^2} \sum_i (y_i - w^\top x_i)^2\right\}$
- Log-likelihood: $\ell(w) = -\frac{1}{2\sigma^2} \sum_i (y_i - w^\top x_i)^2 + \text{const}$; recovers the least squares (LMS) formulation!
- Maximize to get the MLE: $\hat{w}_{ML} = (X^\top X)^{-1} X^\top y$, and $\hat{\sigma}^2_{ML} = \frac{1}{N} \sum_{i=1}^N (y_i - \hat{w}_{ML}^\top x_i)^2$ (sketched below)
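A sketch of both maximum likelihood estimates; since $\hat{w}_{ML}$ coincides with the least squares solution, the only new quantity is the noise variance, estimated as the mean squared residual:

```python
import numpy as np

def mle_fit(X, y):
    """ML estimates for linear regression with Gaussian noise."""
    w_ml = np.linalg.solve(X.T @ X, X.T @ y)  # identical to least squares
    sigma2_ml = np.mean((y - X @ w_ml) ** 2)  # (1/N) * sum of squared residuals
    return w_ml, sigma2_ml
```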
Bayesian Linear Regression
- Model W as a random variable with a prior distribution $P(w) = \mathcal{N}(\mu_0, \Sigma_0)$, where $\mu_0$ is $d \times 1$ and $\Sigma_0$ is $d \times d$
- Derive the posterior distribution $P(w \mid D) = \mathcal{N}(\mu_N, \Sigma_N)$ (for some $\mu_N, \Sigma_N$)
- Derive the mean of the posterior distribution: $\hat{w}_B = E[w \mid D] = \mu_N$ (one standard update is sketched below)
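The slide leaves $\mu_N, \Sigma_N$ to be derived; a sketch of the standard conjugate Gaussian update, assuming the noise variance $\sigma^2$ is known, is:

```python
import numpy as np

def gaussian_posterior(X, y, mu0, Sigma0, sigma2):
    """Posterior N(mu_N, Sigma_N) over w, given prior N(mu0, Sigma0)."""
    Sigma0_inv = np.linalg.inv(Sigma0)
    SigmaN = np.linalg.inv(Sigma0_inv + (X.T @ X) / sigma2)
    muN = SigmaN @ (Sigma0_inv @ mu0 + (X.T @ y) / sigma2)
    return muN, SigmaN  # the posterior mean muN is the Bayes estimate w_B
```

With $\mu_0 = 0$ and $\Sigma_0 = (\sigma^2/\lambda) I$, the posterior mean reduces to the ridge solution $(\lambda I + X^\top X)^{-1} X^\top y$, tying the regularizer back to the prior.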
Iterative Solutions for the Normal Equations
- Direct solutions have limitations
- Iterative solutions; first-order method: gradient descent, $w^{(t+1)} \leftarrow w^{(t)} + \rho \sum_i (y_i - w^{(t)\top} x_i)\, x_i$
- Convergence guarantees: convergence in probability to the correct solution for an appropriate fixed step size; sure convergence with decreasing step sizes
- Stochastic gradient descent: update based on a single data point at each step; often converges faster (both variants are sketched below)
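A sketch of both first-order variants for the least squares objective; the step size `rho` and iteration counts are illustrative and must be small enough for convergence:

```python
import numpy as np

def gradient_descent(X, y, rho=1e-3, steps=1000):
    """Batch gradient descent on E(w) = sum_i (y_i - w^T x_i)^2."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w += rho * X.T @ (y - X @ w)  # full-data gradient step
    return w

def sgd(X, y, rho=1e-3, epochs=20, seed=0):
    """Stochastic gradient descent: one (LMS) update per data point."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            w += rho * (y[i] - w @ X[i]) * X[i]
    return w
```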
Advantages of Probabilistic Modeling
- Makes assumptions explicit
- Modularity: conceptually simple to change a model by swapping in appropriate distributions
Summary
- Probabilistic formulation of linear regression
- Recovers the least squares formulation
- Iterative algorithms for training
- Forms of regularization