
Tail Bounds
Explore the concept of tail bounds in probability theory, along with confidence intervals for random variables. Learn about Markov's Inequality and how it provides upper bounds for probabilities, illustrated with examples involving dice rolls and online advertising.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
Tail Bounds CSE 312 Spring 24 Lecture 21
Confidence Intervals A confidence interval tells you the probability (how confident you should be) that your random variable fell in a certain range (interval) Usually close to its expected value ? ? > ? ? equivalently ? ? ? > 1 ? If your RV has expectation equal to the value you re searching for (like our polling example) you get a probability of being close enough to the target value.
Whats a Tail Bound? When we were finding our margin of error, we didn t need an exact calculation of the probability. We needed an inequality: the probability of being outside the margin of error was at most at most 5%. A tail bound (or concentration inequality) is a statement that bounds the probability in the tails of the distribution (says there s very little probability far from the center) or (equivalently) says that the probability is concentrated near the expectation.
Our First bound Two statements are equivalent. Left form is often easier to use. Right form is more intuitive. Markov s Inequality Markov s Inequality Let ? be a random variable supported (only) on non-negative numbers. For any ? > ? ? ? ?[?] Let ? be a random variable supported (only) on non-negative numbers. For any ? > ? ? ??[?] ? ? ? To apply this bound you only need to know: 1. it s non-negative 2. Its expectation.
Proof ? ? = ? ? (? = ?) = ? (? = ?) + ? (? = ?) ?:? ? ?:?<? ? (? = ?) + 0 ? 0 whenever ? = ? > 0 ?:? ? Markov s Inequality ? ? = ? ?:? ? Let ? be a random variable supported (only) on non-negative numbers. For any ? > ? ? ? ?[?] = ? ? = ? ?:? ? = ? (? ?) ? ? ? (? ?) ?
Example with geometric RV Suppose you roll a fair (6-sided) die until you see a 6. Let ? be the number of rolls. Bound the probability that ? 12 ? 12 ? ? 6 12=1 2. 12= Markov s Inequality Let ? be a random variable supported (only) on non-negative numbers. For any ? > ? ? ? ?[?] Exact probability? 1 ? < 12 1 0.865 = .135 ?
A Second Example Suppose the average number of ads you see on a website is 25. Give an upper bound on the probability of seeing a website with 75 or more ads. Markov s Inequality Let ? be a random variable supported (only) on non-negative numbers. For any ? > ? ? ? ?[?] ?
A Second Example Suppose the average number of ads you see on a website is 25. Give an upper bound on the probability of seeing a website with 75 or more ads. ? 75 ? ? 75=25 75=1 3 Markov s Inequality Let ? be a random variable supported (only) on non-negative numbers. For any ? > ? ? ? ?[?] ?
Useless Example Suppose the average number of ads you see on a website is 25. Give an upper bound on the probability of seeing a website with 20 or more ads. Markov s Inequality Let ? be a random variable supported (only) on non-negative numbers. For any ? > ? ? ? ?[?] ?
Useless Example Suppose the average number of ads you see on a website is 25. Give an upper bound on the probability of seeing a website with 20 or more ads. ? 20 ? ? 20=25 20= 1.25 Well, that s true. Technically. But without more information we couldn t hope to do much better. What if every page gives exactly 25 ads? Then the probability really is 1.
Sowhat do we do? A better inequality! We re trying to bound the tails of the distribution. What parameter of a random variable describes the tails? The variance!
Chebyshevs Inequality Two statements are equivalent. Left form is often easier to use. Right form is more intuitive. Chebyshev s Inequality Chebyshev s Inequality Let ? be a random variable. For any ? > ? ? ???(?) Let ? be a random variable. For any ? > ? ? ?? ? ? ? ? ??? ? ? ? ? ??
Proof of Chebyshev Chebyshev s Inequality Let ? be a random variable. For any ? > ? ? ???(?) ? ? ? ?? Let ? = ? ? ? Markov s Inequality ? ? = 0 ? ?2 ?2 ? ?2 ? ? 2 =Var ? =Var(?) |?| ? = ?2 ?2 = ?2 ?2 ?2 Inequalities are equivalent (square each side). ? is just ? shifted. Variance is unchanged.
Example with geometric RV Suppose you roll a fair (6-sided) die until you see a 6. Let ? be the number of rolls. Bound the probability that ? 12 5/6 1/36 62=5 ? 12 ? 6 6 6 Chebyshev s Inequality Let ? be a random variable. For any ? > ? ? ???(?) Not any better than Markov ? ? ? ??
Example with geometric RV Suppose you roll a fair (6-sided) die until you see a 6. Let ? be the number of rolls. Bound the probability that ? 12 5/6 1/36 62=5 ? 12 ? 6 6 6 Chebyshev s Inequality Let ? be a random variable. For any ? > ? ? ???(?) Not any better than Markov ? ? ? ??
Example with geometric RV Let ? be a geometric rv with parameter ? Bound the probability that ? 2 ? 1 ? ?2 1/?2= 1 ? ? 2/? ? 1/? 1/? Markov gives: Chebyshev s Inequality 2 ? =? ? 1 ? ? 2=1 Let ? be a random variable. For any ? > ? ? ???(?) 2. ? 2/?= For large ?, Chebyshev is better. ? ? ? ??
Better Example Suppose the average number of ads you see on a website is 25. And the variance of the number of ads is 16. Give an upper bound on the probability of seeing a website with 30 or more ads.
Better Example Suppose the average number of ads you see on a website is 25. And the variance of the number of ads is 16. Give an upper bound on the probability of seeing a website with 30 or more ads. ? 30 ? 25 5 16 25
Chebyshevs Repeated Experiments How many coin flips (each head with probability ?) are needed until you get ? heads. Let ? be the number necessary. What is probability ? 2?/?? Markov Chebyshev
Chebyshevs Repeated Experiments How many coin flips (each head with probability ?) are needed until you get ? heads. Let ? be the number necessary. What is probability ? 2?/?? Markov ? 2? ?/? 2?/?=1 ? 2 ?2/?2=?(1 ?)/?2 Chebyshev ? 2? ? ? ? ? Var(?) =1 ? ?2/?2 ? ? ?
Tail Bounds Takeaways Useful when an experiment is complicated and you just need the probability to be small (you don t need the exact value). Choosing a minimum ? for a poll don t need exact probability of failure, just to make sure it s small. Designing probabilistic algorithms just need a guarantee that they ll be extremely accurate Learning more about the situation (e.g. learning variance instead of just mean) usually lets you get more accurate bounds. Next time: more assumptions to get better bounds.