{"id":2293,"date":"2015-05-22T21:41:23","date_gmt":"2015-05-23T01:41:23","guid":{"rendered":"https:\/\/scottaaronson.blog\/?p=2293"},"modified":"2017-01-12T16:30:00","modified_gmt":"2017-01-12T21:30:00","slug":"nsa-in-ppoly-the-power-of-precomputation","status":"publish","type":"post","link":"https:\/\/scottaaronson.blog\/?p=2293","title":{"rendered":"NSA in P\/poly: The Power of Precomputation"},"content":{"rendered":"<p>Even after the Snowden revelations, there remained at least one big mystery about what the NSA was doing and how. \u00a0The NSA&#8217;s classified 2013 budget request mentioned, as a priority item,\u00a0&#8220;groundbreaking cryptanalytic capabilities to defeat adversarial cryptography and exploit internet traffic.&#8221; \u00a0There was a requested increase, of several hundred million dollars, for &#8220;cryptanalytic IT services&#8221; and &#8220;cryptanalysis and exploitation services program C&#8221; (whatever that was). \u00a0And a classified presentation slide showed encrypted data being passed to a high-performance computing system called &#8220;TURMOIL,&#8221; and decrypts coming out. \u00a0But whatever was going on inside TURMOIL\u00a0seemed to be secret even <em>within<\/em> NSA; someone at\u00a0Snowden&#8217;s level wouldn&#8217;t have had access\u00a0to the\u00a0details.<\/p>\n<p>So, what was (or is) inside the NSA&#8217;s cryptanalytic black box? \u00a0A quantum computer? \u00a0Maybe even one that they bought from D-Wave? \u00a0(Rimshot.) \u00a0A fast classical factoring algorithm? \u00a0A proof of P=NP? \u00a0Commentators on the Internet rushed to suggest each of these far-reaching possibilities. \u00a0Some of us tried to\u00a0<a href=\"https:\/\/scottaaronson.blog\/?p=1517\">pour cold water<\/a> on these speculations&#8212;pointing out that one could envision\u00a0many\u00a0scenarios\u00a0that were a little\u00a0more prosaic, a little more tied to the details of how public-key crypto is actually used in the real world. 
\u00a0Were we just na\u00efve?<\/p>\n<p>This week, a\u00a0<a href=\"https:\/\/weakdh.org\/imperfect-forward-secrecy.pdf\">new bombshell 14-author paper<\/a>\u00a0(see also the <a href=\"https:\/\/weakdh.org\/\">website<\/a>) advances an exceedingly plausible hypothesis about what <em>may<\/em> have been\u00a0the NSA&#8217;s greatest\u00a0cryptanalytic secret of recent years. \u00a0One of the authors is <a href=\"https:\/\/jhalderm.com\/\">J. Alex Halderman<\/a> of the University of Michigan, my\u00a0best friend since junior high school, who I&#8217;ve <a href=\"https:\/\/scottaaronson.blog\/?p=32\">blogged about<\/a> <a href=\"https:\/\/scottaaronson.blog\/?p=507\">before<\/a>. \u00a0Because of that,\u00a0I had some advance knowledge\u00a0of this scoop, and found myself having to do what regular <em>Shtetl-Optimized<\/em> readers will know is the single hardest thing in the world for me: <em>bite my tongue and not say\u00a0anything.<\/em> \u00a0Until now, that is.<\/p>\n<p>Besides Alex, the other authors are\u00a0David Adrian, Karthikeyan Bhargavan, Zakir Durumeric, Pierrick Gaudry, Matthew Green,\u00a0Nadia Heninger, Drew Springall, Emmanuel Thom\u00e9, Luke Valenta,\u00a0Benjamin VanderSloot, Eric Wustrow, Santiago Zanella-B\u00e9guelin, and Paul Zimmermann (two of these, Green and Heninger, have previously turned up on <em>Shtetl-Optimized<\/em>).<\/p>\n<p>These authors study vulnerabilities in <a href=\"http:\/\/en.wikipedia.org\/wiki\/Diffie%E2%80%93Hellman_key_exchange\">Diffie-Hellman key exchange<\/a>, the\u00a0&#8220;original&#8221; (but still widely-used) public-key cryptosystem, the one that predates even RSA. 
\u00a0Diffie-Hellman is the thing where Alice and Bob first agree on a huge prime number <em>p<\/em> and a number <em>g<\/em>, then Alice picks a secret <em>a<\/em> and sends Bob <em>g<sup>a<\/sup><\/em> (mod <em>p<\/em>), and Bob picks a secret <em>b<\/em> and sends Alice <em>g<sup>b<\/sup><\/em> (mod <em>p<\/em>), and then Alice and Bob\u00a0can both compute (<em>g<sup>a<\/sup><\/em>)<em><sup>b<\/sup><\/em>=(<em>g<sup>b<\/sup><\/em>)<em><sup>a<\/sup><\/em>=<em>g<sup>ab<\/sup><\/em> (mod <em>p<\/em>), but an eavesdropper who&#8217;s listening in only knows <em>p<\/em>, <em>g<\/em>, <em>g<sup>a<\/sup><\/em> (mod <em>p<\/em>), and <em>g<sup>b<\/sup><\/em> (mod <em>p<\/em>), and one can plausibly conjecture that it&#8217;s hard from those things alone to get <em>g<sup>ab<\/sup><\/em> (mod <em>p<\/em>). \u00a0So then Alice and Bob share a secret unknown to the eavesdropper, which they didn&#8217;t before, and they can use that secret to start doing cryptography.<\/p>\n<p>As far as anyone knows today, the best\u00a0way to break Diffie-Hellman is simply by calculating discrete logarithms: that is, solving the problem of recovering <em>a<\/em> given only <em>g<\/em> and <em>h<\/em>=<em>g<sup>a<\/sup><\/em> (mod <em>p<\/em>). \u00a0At least on a classical computer, the fastest known algorithm for discrete logarithms (over fields of prime order) is the number field sieve (NFS). \u00a0Under plausible conjectures about the distribution of &#8220;smooth&#8221; numbers, NFS uses time that grows like exp((1.923+o(1))(log <em>p<\/em>)<sup>1\/3<\/sup>(log log <em>p<\/em>)<sup>2\/3<\/sup>), where the exp and logs are base <em>e<\/em> (and yes, even the lower-order stuff like (log log <em>p<\/em>)<sup>2\/3<\/sup>\u00a0makes a big difference in practice). 
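The Alice-and-Bob exchange described above can be sketched in a few lines of Python. This is purely a toy illustration (the Mersenne prime and generator below are chosen for brevity, not vetted for cryptographic use; real deployments use much larger, carefully selected parameters):

```python
# Toy Diffie-Hellman key exchange -- illustration only. The prime below
# is far too small and unvetted for real use; deployments should use
# standardized groups with primes of 2048+ bits.
import secrets

p = 2**127 - 1          # a Mersenne prime, fine for a demo
g = 3                   # toy generator (not validated here)

a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent
b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent

A = pow(g, a, p)        # Alice transmits g^a mod p
B = pow(g, b, p)        # Bob transmits g^b mod p

alice_key = pow(B, a, p)   # Alice computes (g^b)^a mod p
bob_key = pow(A, b, p)     # Bob computes (g^a)^b mod p
assert alice_key == bob_key  # both hold the shared secret g^(ab) mod p
```

The eavesdropper sees only p, g, A, and B; recovering the shared key from those is exactly the problem conjectured to be hard.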
\u00a0Of course, once you know the running time of the best-known algorithm, you can then try to choose a key size (that is, a value of log(<em>p<\/em>)) that&#8217;s out of reach for that algorithm on the computing hardware of today.<\/p>\n<p>(Note that the recent <a href=\"https:\/\/www-fourier.ujf-grenoble.fr\/JC2\/exposes\/joux.pdf\">breakthrough<\/a> of Antoine Joux, solving discrete log in quasipolynomial time in fields of small characteristic, also relied heavily on sieving\u00a0ideas. \u00a0But there are no improvements from this\u00a0yet for the &#8220;original&#8221; discrete log problem, over prime fields.)<\/p>\n<p>But there&#8217;s one\u00a0crucial further fact, which has been understood\u00a0for at least a decade by theoretical cryptographers, but somehow was slow to\u00a0filter out to the people who deploy\u00a0practical cryptosystems. \u00a0The further fact is that in NFS, <em>you can arrange things so that almost all the discrete-logging effort\u00a0depends only on the prime number p, and not at all on the specific numbers g and h for which you&#8217;re trying to take\u00a0the discrete log<\/em>. \u00a0After this initial &#8220;precomputation&#8221; step, you then have a massive database\u00a0that you can use to speed up the &#8220;descent&#8221; step: the step of solving <em>g<sup>a<\/sup><\/em>=<em>h<\/em> (mod <em>p<\/em>), for any\u00a0(<em>g<\/em>,<em>h<\/em>) pair that you want.<\/p>\n<p>It&#8217;s a little like the complexity class <a href=\"http:\/\/en.wikipedia.org\/wiki\/P\/poly\">P\/poly<\/a>, where a single, hard-to-compute &#8220;advice string&#8221; unlocks exponentially many inputs once you have it. 
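The precompute-once, descend-cheaply structure can be seen in miniature in the classic baby-step giant-step algorithm. BSGS is not NFS, and its table depends on g as well as p, but it shares the amortization idea: one expensive table, then cheap solves for as many targets h as you like. A toy sketch:

```python
# Baby-step giant-step discrete log: a miniature analogue of the
# precomputation/descent split described above (BSGS is not NFS, but
# it amortizes the same way: build a table once, reuse it per target).
from math import isqrt

def bsgs_table(g, p):
    """Precomputation: depends only on (g, p), not on any target h."""
    m = isqrt(p - 1) + 1
    return m, {pow(g, j, p): j for j in range(m)}

def dlog(h, g, p, m, table):
    """Cheap per-target step: solve g^x = h (mod p) from the cached table."""
    factor = pow(g, -m, p)          # modular inverse of g^m (Python 3.8+)
    y = h
    for i in range(m):
        if y in table:              # h * g^(-i*m) hit a baby step
            return i * m + table[y]
        y = y * factor % p
    raise ValueError("no discrete log found")

p, g = 104729, 2                    # 104729 is prime; toy-sized on purpose
m, table = bsgs_table(g, p)         # done once per (g, p)
# The same table now serves many different targets h:
for a in (123, 4567, 89012):
    h = pow(g, a, p)
    assert pow(g, dlog(h, g, p, m, table), p) == h
```

The point of the analogy: whoever has already paid for the table solves each new instance at a tiny marginal cost, which is exactly the asymmetry NFS precomputation creates at cryptographic sizes.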
\u00a0(Or\u00a0a bit more precisely, one could say that NFS reveals\u00a0that exponentiation modulo a prime number is <em>sort of<\/em>\u00a0a <a href=\"http:\/\/en.wikipedia.org\/wiki\/Trapdoor_function\">trapdoor one-way function<\/a>, except that the trapdoor information is subexponential-size, and given the trapdoor, inverting the function is still\u00a0subexponential-time, but a milder subexponential than before.)<\/p>\n<p>The kicker is that,\u00a0in practice, a large percentage of all clients and servers that\u00a0use Diffie-Hellman key exchange use the same few prime numbers <em>p<\/em>. \u00a0This means that, if you wanted to decrypt a large fraction of all the traffic encrypted with Diffie-Hellman, you wouldn&#8217;t need\u00a0to do NFS over and over: you could just do it for a few <em>p<\/em>&#8216;s and cache the results. \u00a0This fact can singlehandedly change the outlook for\u00a0breaking Diffie-Hellman.<\/p>\n<p>The story is different depending on the key size, log(<em>p<\/em>). \u00a0In the 1990s, the US government insisted on &#8220;export-grade&#8221; cryptography for products sold overseas (what a quaint concept!), which meant that the key size was restricted to 512 bits. \u00a0For 512-bit keys, Adrian et al. were able to implement NFS and use it to do the precomputation\u00a0step in about 7 days on a cluster with a few thousand cores. \u00a0After this initial precomputation step (which produced 2.5GB of data), doing the descent, to find the discrete log for a specific (<em>g<\/em>,<em>h<\/em>) pair, took only about 90 seconds on a 24-core machine.<\/p>\n<p>OK, but no one <em>still<\/em> uses 512-bit keys, do they? 
\u00a0The first part of Adrian et al.&#8217;s paper demonstrates\u00a0that, because of implementation issues, even today you can force many servers to &#8220;downgrade&#8221; to the 512-bit, export-grade keys&#8212;and then, having done so, you can stall for time for 90 seconds as you figure out the session key, and then do a man-in-the-middle attack and take over and impersonate the server. \u00a0It&#8217;s an impressive example of the sort of game computer security researchers have been playing for a long time&#8212;but it&#8217;s really just a warmup to the main act.<\/p>\n<p>As you&#8217;d expect, many servers today are configured more intelligently, and will only agree to 1024-bit keys. \u00a0But even there, Adrian et al. found that a large fraction of servers\u00a0rely on just a single 1024-bit prime (!), and many of the ones that don&#8217;t rely on just a few other primes. \u00a0Adrian et al. estimate that, for a single 1024-bit prime, doing the NFS precomputation would take about 45 million years using a single core&#8212;or to put it more ominously, 1 year using 45 million cores. \u00a0If you built special-purpose hardware, that could go down by almost two orders of magnitude, putting the monetary cost at a few hundred million dollars, completely within the reach of a sufficiently determined nation-state. \u00a0Once the precomputation was done, and the terabytes\u00a0of output stored in a data center somewhere, computing a particular discrete log would then take about 30 days using 1 core, or mere minutes using a supercomputer. \u00a0Once again, none of this is assuming any algorithmic advances beyond what&#8217;s publicly known. 
\u00a0(Of course, it&#8217;s possible that the NSA <em>also<\/em> has some algorithmic advances; even modest ones could obviate the need for special-purpose hardware.)<\/p>\n<p>While writing this post, I did my own back-of-the-envelope, and got that using NFS, calculating\u00a0a 1024-bit discrete log should be about\u00a07.5 million times harder than calculating\u00a0a 512-bit discrete log. \u00a0So, extrapolating from the 7 days it took Adrian et al.\u00a0to do it for 512 bits, this suggests that it might&#8217;ve\u00a0taken them about\u00a0143,840 years to calculate\u00a01024-bit discrete logs with the few thousand\u00a0cores\u00a0they had, or 1 year if they had 143,840 times as many cores\u00a0(since almost all this stuff is extremely parallelizable). \u00a0Adrian et al. mention optimizations that they expect would improve this by a factor of 3,\u00a0giving us about 100 million core-years, very\u00a0similar to\u00a0Adrian et al.&#8217;s\u00a0estimate of 45 million core-years (the lower-order terms in the running time of NFS might account for some of the remaining discrepancy).<\/p>\n<p>Adrian et al. mount a detailed argument in their paper that all\u00a0of the details about NSA&#8217;s &#8220;groundbreaking cryptanalytic capabilities&#8221; that we learned from the Snowden documents match what <em>would<\/em>\u00a0be true if the NSA were doing something like the above. \u00a0The way Alex put it to me is that, sure, the NSA might not have been doing this, but if not, then he would like\u00a0to\u00a0understand <em>why<\/em> not&#8212;for it would&#8217;ve been completely\u00a0feasible within the cryptanalytic budget they had, and the NSA would&#8217;ve known that, and it would&#8217;ve been a\u00a0very good\u00a0codebreaking value for the money.<\/p>\n<p>Now that we know about this weakness of Diffie-Hellman key exchange, what can be done?<\/p>\n<p>The most obvious solution&#8212;but a good one!&#8212;is just to use longer keys. 
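The back-of-the-envelope estimate above is easy to reproduce from the stated NFS running time, exp((1.923+o(1))(log p)^(1/3)(log log p)^(2/3)), by simply dropping the o(1); the same little calculation also shows how much longer keys buy you:

```python
# Reproducing the back-of-the-envelope NFS cost comparison, using the
# running-time estimate exp(1.923 * (ln p)^(1/3) * (ln ln p)^(2/3))
# with the o(1) term dropped (so these are rough figures only).
from math import exp, log

def nfs_log_cost(bits):
    """Natural log of the heuristic NFS cost for a prime of `bits` bits."""
    ln_p = bits * log(2)
    return 1.923 * ln_p ** (1 / 3) * log(ln_p) ** (2 / 3)

ratio = exp(nfs_log_cost(1024) - nfs_log_cost(512))
print(f"1024-bit vs 512-bit: {ratio:.2e}x harder")   # roughly 7-8 million
print(f"2048-bit vs 1024-bit: "
      f"{exp(nfs_log_cost(2048) - nfs_log_cost(1024)):.2e}x harder")
```

The first ratio lands near the 7.5 million figure in the post; the second shows that moving from 1024 to 2048 bits buys roughly nine further orders of magnitude against NFS, which is why "just use longer keys" is blunt but effective.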
\u00a0For decades, when applied cryptographers would announce some attack like this, theorists like me would say with exasperation: &#8220;dude, why don&#8217;t you fix\u00a0<em>all<\/em> these problems in one stroke by\u00a0just, like, increasing the key sizes by a\u00a0factor of 10? \u00a0when it&#8217;s an exponential against a polynomial, we all know the exponential\u00a0will win eventually, so why not just go out to where it does?&#8221; \u00a0The applied cryptographers explain to us, with equal exasperation in their voices, that there are all sorts of reasons why not, from efficiency to\u00a0(maybe the biggest thing) <em>backwards-compatibility.<\/em>\u00a0 You can&#8217;t unilaterally demand 2048-bit keys, if millions of your customers are using browsers that only understand\u00a01024-bit keys. \u00a0On the other hand, given the new revelations, it looks like there really will be\u00a0a big\u00a0push\u00a0to migrate to larger key sizes, as the theorists would&#8217;ve suggested from their ivory towers.<\/p>\n<p>A second, equally-obvious solution is to <em>stop relying so much on the same few prime numbers<\/em> in Diffie-Hellman key exchange. \u00a0(Note that the reason RSA isn&#8217;t vulnerable to this particular attack is that it inherently requires a different composite number <em>N<\/em> for each public key.) \u00a0In practice, generating a new huge\u00a0random prime number tends to be expensive&#8212;taking, say, a few minutes&#8212;which is why people so often rely on &#8220;standard&#8221; primes. \u00a0At the least, we could use\u00a0libraries of millions of &#8220;safe&#8221; primes, from which a prime for a given session is\u00a0chosen randomly.<\/p>\n<p>A third\u00a0solution is to migrate to <a href=\"http:\/\/en.wikipedia.org\/wiki\/Elliptic_curve_cryptography\">elliptic-curve cryptography<\/a> (ECC), which as far as anyone knows today, is much less vulnerable to descent attacks than the original Diffie-Hellman scheme. 
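To make concrete why fresh parameters are expensive, here is a sketch of "safe prime" generation (p = 2q+1 with q also prime), the sort of fresh Diffie-Hellman modulus one would want. The bit-length below is toy-sized so it finishes quickly; at 2048 bits the same search takes minutes, because candidates are sparse and each primality test is costly, and that expense is precisely what pushes implementations toward a few standard primes:

```python
# Sketch of safe-prime generation: search random q until both q and
# p = 2q + 1 pass a Miller-Rabin primality test. Toy bit-length only;
# at 2048 bits this same loop is what takes minutes in practice.
import secrets

def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % small == 0:
            return n == small
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        x = pow(secrets.randbelow(n - 3) + 2, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # definitely composite
    return True                          # probably prime

def random_safe_prime(bits):
    """Search random candidates q until q and 2q + 1 are both prime."""
    while True:
        # odd, with the top bit set so p = 2q + 1 has exactly `bits` bits
        q = secrets.randbits(bits - 1) | (1 << (bits - 2)) | 1
        if is_probable_prime(q) and is_probable_prime(2 * q + 1):
            return 2 * q + 1

p = random_safe_prime(64)
q = (p - 1) // 2
```

A middle ground the post suggests: pay this cost offline, building a large library of vetted safe primes, and pick one at random per session.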
\u00a0Alas, there&#8217;s been a lot of understandable distrust of ECC after the <a href=\"http:\/\/en.wikipedia.org\/wiki\/Dual_EC_DRBG\">DUAL_EC_DRBG<\/a> scandal, in which it came out\u00a0that the NSA backdoored some of NIST&#8217;s elliptic-curve-based\u00a0pseudorandom generators by choosing particular parameters\u00a0that it knew how to handle. \u00a0But maybe the right lesson to draw is that mod-p\u00a0groups and elliptic-curve groups\u00a0<em>both<\/em>\u00a0seem to be pretty good for cryptography, but the mod-p groups are less good if everyone is using the same few prime numbers p (<em>and<\/em>\u00a0those primes are &#8220;within nation-state range&#8221;), and the elliptic-curve groups are less good if everyone is using the same few parameters. \u00a0(A lot of these things do seem pretty\u00a0predictable\u00a0with hindsight, but how many did <em>you<\/em> predict?)<\/p>\n<p>Many people will use this paper to ask political questions, like: hasn&#8217;t the NSA&#8217;s codebreaking mission once again usurped\u00a0its mission to ensure the nation&#8217;s information security? \u00a0Doesn&#8217;t the 512-bit vulnerability that many Diffie-Hellman implementations still face, as a holdover from the 1990s export rules, illustrate why\u00a0encryption should never\u00a0be deliberately weakened for purposes of &#8220;national security&#8221;? \u00a0How can we get over the issue of backwards-compatibility, and get everyone using strong crypto? \u00a0People absolutely should be asking such\u00a0questions.<\/p>\n<p>But for\u00a0readers of this blog, there&#8217;s one question that probably looms even larger than those of freedom versus security, openness versus secrecy, etc.: namely, the question of theory versus practice. \u00a0Which &#8220;side&#8221; should be said to have &#8220;won&#8221; this round? \u00a0Some will say: those useless theoretical cryptographers, they didn&#8217;t even know how their coveted Diffie-Hellman system could be broken in the real world! 
\u00a0The theoretical cryptographers might reply: <em>of course<\/em>\u00a0we\u00a0knew about the ability to do precomputation with NFS! \u00a0This wasn&#8217;t some NSA secret; it&#8217;s something we discussed openly for years. \u00a0And if someone told us\u00a0how Diffie-Hellman\u00a0was actually being used (with much of the\u00a0world relying on the same few primes), we could&#8217;ve immediately spotted\u00a0the potential for such an attack. \u00a0To which others might reply: then why didn&#8217;t you spot it?<\/p>\n<p>Perhaps the right lesson to draw is how silly such debates really are. \u00a0In the end, piecing this story together\u00a0took a team that was willing to do everything from learning some fairly difficult number theory to\u00a0coding up\u00a0simulations\u00a0to\u00a0poring over the Snowden documents for clues about the NSA&#8217;s budget. \u00a0Clear thought doesn&#8217;t respect the boundaries between disciplines, or between theory and practice.<\/p>\n<p><em>(Thanks very much to Nadia Heninger and Neal Koblitz for reading this post and correcting a few errors in it. \u00a0For more about this, see <a href=\"https:\/\/www.schneier.com\/blog\/archives\/2015\/05\/the_logjam_and_.html\">Bruce Schneier&#8217;s post<\/a>\u00a0or <a href=\"http:\/\/blog.cryptographyengineering.com\/2015\/05\/attack-of-week-logjam.html\">Matt Green&#8217;s post<\/a>.)<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Even after the Snowden revelations, there remained at least one big mystery about what the NSA was doing and how. 
\u00a0The NSA&#8217;s classified 2013 budget request mentioned, as a priority item,\u00a0&#8220;groundbreaking cryptanalytic capabilities to defeat adversarial cryptography and exploit internet traffic.&#8221; \u00a0There was a requested increase, of several hundred million dollars, for &#8220;cryptanalytic IT services&#8221; [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"_wpas_customize_per_network":false},"categories":[5,11],"tags":[],"class_list":["post-2293","post","type-post","status-publish","format-standard","hentry","category-complexity","category-nerd-interest"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/posts\/2293","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2293"}],"version-history":[{"count":10,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/posts\/2293\/revisions"}],"predecessor-version":[{"id":2306,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=\/wp\/v2\/posts\/2
293\/revisions\/2306"}],"wp:attachment":[{"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2293"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2293"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/scottaaronson.blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2293"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}