Understanding Statistical Learning Theory and Applications

Dive into the world of statistical learning theory and its significance in inference, prediction, decision-making, and modeling from data. Explore how statistical regularities impact visual and linguistic data, such as Zipf's law for word frequency. Uncover the implications of communicative efficiency on language and the arguments supporting it.

  • Statistical Learning
  • Inference
  • Prediction
  • Communicative Efficiency
  • Language


Presentation Transcript


  1. Statistical Learning Overview_2_21_2018_reading

  2. What is statistical learning and why do we care? Statistical learning (Introduction to Statistical Learning Theory, Olivier Bousquet, Stéphane Boucheron, and Gábor Lugosi): the main goal of statistical learning theory is to provide a framework for studying the problem of inference, that is, of gaining knowledge, making predictions, making decisions, or constructing models from a set of data. 1. Observe a phenomenon. 2. Construct a model of that phenomenon. 3. Make predictions using this model. In recent decades there has been enormous progress in machine learning techniques for extracting structure from data: supervised learning and unsupervised learning.

  3. Are there statistical regularities in the world? For visual data, yes. What about for language? Let's take two simple examples: words and their structure, and grammatical categories.

  4. Words and frequency. Zipf's law: log frequency as a function of log rank is approximately linear.
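The rank–frequency relation behind Zipf's law can be sketched in a few lines of Python. The toy corpus below is hypothetical and far too small for a real fit; it only shows how rank–frequency pairs are computed:

```python
from collections import Counter

def rank_frequency(tokens):
    """Return (rank, frequency) pairs, most frequent word first (rank 1)."""
    counts = Counter(tokens)
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))

# Hypothetical toy corpus; real demonstrations use large text collections,
# where log(frequency) falls roughly linearly with log(rank).
corpus = ("the cat sat on the mat the dog saw the cat "
          "a dog and a cat ran on the mat").split()

print(rank_frequency(corpus))  # rank 1 is "the" with frequency 5
```

Plotting log frequency against log rank for a large corpus yields the near-linear relationship the slide describes.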

  5. What does Zipf's law reflect? Let's brainstorm.

  6. What does this tell us? Everything? (No one has made this claim.) That communicative efficiency shapes language? (Lots of people working within information-theoretic frameworks; brief example of information in bits; language shaped by both speakers and listeners.) Nothing? (Chomsky's argument from ambiguity; Miller's argument from chance.)
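The slide's "brief example of information in bits" can be made concrete: in information theory, the information carried by an event with probability p is its surprisal, −log₂ p. A minimal sketch:

```python
import math

def surprisal_bits(p):
    """Information content (surprisal) of an event with probability p, in bits."""
    return -math.log2(p)

# A fair coin flip carries exactly 1 bit of information.
print(surprisal_bits(0.5))    # 1.0
# Rarer events carry more information: p = 1/8 gives 3 bits.
print(surprisal_bits(0.125))  # 3.0
```

This is why rare (low-probability) words are more informative than frequent ones, a building block of information-theoretic accounts of communicative efficiency.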

  7. Some arguments for communicative efficiency. Ambiguity (Piantadosi et al.): requires that people implicitly know frequencies; requires that people use contexts; assumes communicative efficiency involves taking into account the mutual effort of speaker and listener. Do people know frequencies? Word frequency effects. Do people use context? Context effects in language processing. Do speakers care about efficiency? Hyper-articulation.

  8. Bringing context and frequency together: frequency and contextual diversity (CD). CD vs. frequency in low-constraint contexts: CD wins. CD vs. frequency in high-constraint contexts: frequency wins. Why? Frequency and CD are both estimates of predictability. Are people predictors? Predictive processing and learning.
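One crude way to see how frequency and contextual diversity (CD) can dissociate: count how often each word occurs versus how many distinct contexts (here, documents) it occurs in. The mini-corpus is hypothetical, and "document" is only one possible operationalization of a context:

```python
from collections import Counter

def frequency_and_cd(documents):
    """Raw frequency vs. contextual diversity (number of distinct
    documents a word occurs in) -- both crude predictability proxies."""
    freq = Counter()
    cd = Counter()
    for doc in documents:
        tokens = doc.split()
        freq.update(tokens)          # every occurrence counts
        cd.update(set(tokens))       # each document counts at most once
    return freq, cd

# "cat" is frequent but concentrated in one context;
# "time" is equally frequent but spread across contexts.
docs = ["the cat saw the cat chase the cat",
        "time flies fast",
        "no time to lose",
        "a stitch in time"]

freq, cd = frequency_and_cd(docs)
print(freq["cat"], cd["cat"])    # 3 1
print(freq["time"], cd["time"])  # 3 3
```

Two words with identical raw frequency can thus differ sharply in CD, which is why the two measures can make different predictions about processing.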

  9. Are infants and young children statistical learners? Saffran, Aslin & Newport (1998). Next: sequences. What are children doing? Passive learning (children as sponges) or active learning (children engaged in predictive learning)? Attention is directed toward information that informs internal models.

  10. One type of statistic: transitional probabilities (TPs). Syllable-to-syllable TPs within a word > between words: hap-py > py-bay ("Look at the happy baby"); py-kit < kit-ty ("Look at the happy kitty").
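The within-word vs. between-word contrast can be made concrete by estimating TPs from the slide's two example utterances, where TP(a→b) = count(a, b) / count(a). The syllable segmentation below follows the slide's notation and is a simplification:

```python
from collections import Counter

def transitional_probs(syllables):
    """Estimate TP(a -> b) = count(a, b) / count(a) over a syllable sequence."""
    bigrams = Counter(zip(syllables, syllables[1:]))
    unigrams = Counter(syllables[:-1])
    return {(a, b): n / unigrams[a] for (a, b), n in bigrams.items()}

# The two utterances from the slide, segmented into syllables.
seq = ("look at the hap py bay by "
       "look at the hap py kit ty").split()

tps = transitional_probs(seq)
print(tps[("hap", "py")])  # within word: 1.0
print(tps[("py", "bay")])  # across a word boundary: 0.5
print(tps[("kit", "ty")])  # within word: 1.0
```

Even in this tiny sample, within-word transitions (hap→py, kit→ty) are more predictable than transitions spanning a word boundary (py→bay).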

  11. The artificial language consists of concatenated CV syllables: pabikutibudogolatudaropi = pabiku + tibudo + golatu + daropi.

  12. Experiment 2: Transitional probabilities. Design: Saffran, Aslin & Newport (1996). TPs across the syllable sequence GO-LA-TU-DA-RO-PI: 1.0, 1.0, .33, 1.0, 1.0 (1.0 within words, .33 across the word boundary). Test items: word (GO-LA-TU) vs. part-word (TU-DA-RO).
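The design's statistics can be sketched by generating a familiarization stream from the four words with no immediate repeats (as in the original stream) and estimating TPs from it. The stream length and random seed here are arbitrary choices for the simulation:

```python
import random
from collections import Counter

random.seed(0)  # arbitrary seed, for reproducibility only

words = ["pabiku", "tibudo", "golatu", "daropi"]

def syllabify(word):
    """Split a six-letter word into three CV syllables."""
    return [word[i:i + 2] for i in range(0, 6, 2)]

# Familiarization stream: words in random order, no immediate repeats.
stream, prev = [], None
for _ in range(1000):
    word = random.choice([w for w in words if w != prev])
    stream += syllabify(word)
    prev = word

def tp(a, b, seq):
    """Estimate TP(a -> b) = count(a, b) / count(a)."""
    bigrams = Counter(zip(seq, seq[1:]))
    return bigrams[(a, b)] / Counter(seq[:-1])[a]

print(tp("go", "la", stream))  # within-word TP: exactly 1.0
print(tp("tu", "da", stream))  # word-boundary TP: close to .33
```

With no immediate repeats, a word-final syllable like "tu" is followed by the first syllable of one of the three other words, so the boundary TP converges on 1/3, while within-word TPs are exactly 1.0.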

  13. Headturn Preference Procedure

  14. Sequence of syllables: A-B-C-D-E-F-G-H-I-J-K-L . . . Test triplets: D-E-F vs. I-J-K

  15. Are infants and young children statistical learners? Saffran, Aslin & Newport (1998). Next: sequences. What are children doing? Passive learning (children as sponges) or active learning (children engaged in predictive learning)? Attention is directed toward information that informs internal models.

  16. Why are statistical learning and big data important? Let's brainstorm.
