{"id":3712,"date":"2018-04-13T13:15:50","date_gmt":"2018-04-13T18:15:50","guid":{"rendered":"https:\/\/scottaaronson.blog\/?p=3712"},"modified":"2019-01-28T14:32:55","modified_gmt":"2019-01-28T20:32:55","slug":"how-to-upper-bound-the-probability-of-something-bad","status":"publish","type":"post","link":"https:\/\/scottaaronson.blog\/?p=3712","title":{"rendered":"How to upper-bound the probability of something bad"},"content":{"rendered":"<p>Scott Alexander has a <a href=\"http:\/\/slatestarcodex.com\/2018\/04\/12\/recommendations-vs-guidelines\/\">new post<\/a> decrying how rarely experts encode their knowledge in the form of detailed guidelines with conditional statements and loops&#8212;or what one could also call flowcharts or expert systems&#8212;rather than just blanket recommendations.&nbsp; He gives, as an illustration of what he&#8217;s looking for, an algorithm that a psychiatrist might use to figure out which antidepressants or other treatments will work for a specific patient&#8212;with the huge proviso that you shouldn&#8217;t try his algorithm at home, or (most importantly) sue him if it doesn&#8217;t work.<\/p>\n<p>Compared to a psychiatrist, I have the huge advantage that if <em>my<\/em> professional advice fails, normally no one gets hurt or gets sued for malpractice or commits suicide or anything like that.&nbsp; OK, but what do I actually know that can be encoded in if-thens?<\/p>\n<p>Well, one of the commonest tasks in the day-to-day life of any theoretical computer scientist, or mathematician of the Erd\u00f6s flavor, is to <em>upper bound the probability that something bad will happen<\/em>: for example, that your randomized algorithm or protocol will fail, or that your randomly constructed graph or code or whatever it is won&#8217;t have the properties needed for your proof.<\/p>\n<p>So without further ado, here are my secrets revealed, my ten-step plan to probability-bounding and computer-science-theorizing success.<\/p>\n<p><strong>Step 1.<\/strong> &#8220;1&#8221; is definitely an upper bound on the probability of your bad event happening.&nbsp; Check whether that upper bound is good enough.&nbsp; (Sometimes, as when this is an inner step in a larger summation over probabilities, the answer will actually be yes.)<\/p>\n<p><strong>Step 2.<\/strong> Try using <a href=\"https:\/\/en.wikipedia.org\/wiki\/Markov%27s_inequality\">Markov\u2019s inequality<\/a> (a nonnegative random variable exceeds its mean by a factor of k at most a 1\/k fraction of the time), combined with its close cousin in indispensable obviousness, the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Boole%27s_inequality\">union bound<\/a> (the probability that any of several bad events will happen, is at most the sum of the probabilities of each bad event individually).&nbsp; About half the time, you can stop right here.<\/p>\n<p><strong>Step 3.<\/strong> See if the bad event you\u2019re worried about involves a sum of independent random variables exceeding some threshold. If it does, hit that sucker with a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Chernoff_bound\">Chernoff<\/a> or <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hoeffding%27s_inequality\">Hoeffding<\/a> bound.<\/p>\n<p><strong>Step 4.<\/strong> If your random variables aren&#8217;t independent, see if they at least form a&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Martingale_(probability_theory)\">martingale<\/a>: a fancy word for a sum of terms, each of which has a mean of 0 conditioned on all the earlier terms, even though it might depend on the earlier terms in subtler ways.&nbsp; If so,&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Azuma%27s_inequality\">Azuma<\/a>&nbsp;your problem into submission.<\/p>\n<p><strong>Step 5.<\/strong> If you don\u2019t have a martingale, but you still feel like your random variables are only weakly correlated, try calculating the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Variance\">variance<\/a> of whatever combination of variables you care about, and then using <a href=\"https:\/\/en.wikipedia.org\/wiki\/Chebyshev%27s_inequality\">Chebyshev\u2019s inequality<\/a>: the probability that a random variable differs from its mean by at most k times the standard deviation (i.e., the square root of the variance) is at most 1\/k<sup>2<\/sup>.&nbsp; If the variance doesn&#8217;t work, you can try calculating some higher moments too&#8212;just beware that, around the 6<sup>th<\/sup> or 8<sup>th<\/sup> moment, you and your notebook paper will likely both be exhausted.<\/p>\n<p><strong>Step 6.<\/strong>&nbsp;OK, umm &#8230; see if you can upper-bound the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Total_variation_distance_of_probability_measures\">variation distance<\/a> between your probability distribution and a different distribution for which it\u2019s already known (or is easy to see) that it\u2019s unlikely that anything bad happens. A good example of a tool you can use to upper-bound variation distance is <a href=\"https:\/\/en.wikipedia.org\/wiki\/Pinsker%27s_inequality\">Pinsker\u2019s inequality<\/a>.<\/p>\n<p><strong>Step 7.<\/strong> Now is the time when you start ransacking Google and Wikipedia for things like the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Lov%C3%A1sz_local_lemma\">Lov\u00e1sz Local Lemma<\/a>, and <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/03\/polycon.pdf\">concentration bounds for low-degree polynomials<\/a>, and <a href=\"https:\/\/en.wikipedia.org\/wiki\/H%C3%B6lder%27s_inequality\">H\u00f6lder&#8217;s inequality<\/a>, and <a href=\"https:\/\/en.wikipedia.org\/wiki\/Talagrand%27s_concentration_inequality\">Talagrand&#8217;s inequality<\/a>, and other <a href=\"https:\/\/en.wikipedia.org\/wiki\/Isoperimetric_inequality\">isoperimetric-type inequalities<\/a>, and <a href=\"https:\/\/www.cs.cmu.edu\/~odonnell\/boolean-analysis\/lecture16.pdf\">hypercontractive inequalities<\/a>, and other stuff that you\u2019ve heard your friends rave about, and have even seen successfully used at least twice, but there\u2019s no way you\u2019d remember off the top of your head under what conditions any of this stuff applies, or whether any of it is good enough for your application. (Just between you and me: you may have <em>already<\/em> visited Wikipedia to refresh your memory about the earlier items in this list, like the Chernoff bound.) &#8220;Try a hypercontractive inequality&#8221; is surely the analogue of the psychiatrist\u2019s \u201ctry electroconvulsive therapy,\u201d for a patient on whom all milder treatments have failed.<\/p>\n<p><strong>Step 8.<\/strong> So, these bad events &#8230; how bad are they, anyway? Any chance you can live with them?&nbsp; (See also: Step 1.)<\/p>\n<p><strong>Step 9.<\/strong> You <em>can\u2019t<\/em> live with them? Then back up in your proof search tree, and look for a whole different approach or algorithm, which would make the bad events less likely or even kill them off altogether.<\/p>\n<p><strong>Step 10.<\/strong> Consider the possibility that the statement you\u2019re trying to prove is false&#8212;or if true, is far beyond any existing tools.&nbsp; (This might be the analogue of the psychiatrist&#8217;s: consider the possibility that evil conspirators&nbsp;<em>really&nbsp;are<\/em>&nbsp;out to get your patient.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Scott Alexander has a new post decrying how rarely experts encode their knowledge in the form of detailed guidelines with conditional statements and loops&#8212;or what one could also call flowcharts or expert systems&#8212;rather than just blanket recommendations.&nbsp; He gives, as an illustration of what he&#8217;s looking for, an algorithm that a psychiatrist might use to [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"_wpas_customize_per_network":false},"categories":[29],"tags":[],"class_list":["post-3712","post","type-post","status-publish","format-standard","hentry","category-nerd-self-help"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/posts\/3712","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3712"}],"version-history":[{"count":5,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/posts\/3712\/revisions"}],"predecessor-version":[{"id":4103,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/posts\/3712\/revisions\/4103"}],"wp:attachment":[{"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3712"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3712"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3712"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}