Why are amplitudes complex?

[By prior agreement, this post will be cross-posted on Microsoft’s Q# blog, even though it has nothing to do with the Q# programming language.  It does, however, contain many examples that might be fun to implement in Q#!]

Why should Nature have been quantum-mechanical?  It’s totally unclear what would count as an answer to such a question, and also totally clear that people will never stop asking it.

Short of an ultimate answer, we can at least try to explain why, if you want this or that piece of quantum mechanics, then the rest of the structure is inevitable: why quantum mechanics is an “island in theoryspace,” as I put it in 2003.

In this post, I’d like to focus on a question that any “explanation” for QM at some point needs to address, in a non-question-begging way: why should amplitudes have been complex numbers?  When I was a grad student, it was his relentless focus on that question, and on others in its vicinity, that made me a lifelong fan of Chris Fuchs (see for example his samizdat), despite my philosophical differences with him.

It’s not that complex numbers are a bad choice for the foundation of the deepest known description of the physical universe—far from it!  (They’re a field, they’re algebraically closed, they’ve got a norm, how much more could you want?)  It’s just that they seem like a specific choice, and not the only possible one.  There are also the real numbers, for starters, and in the other direction, the quaternions.

Quantum mechanics over the reals or the quaternions still has constructive and destructive interference among amplitudes, and unitary transformations, and probabilities that are absolute squares of amplitudes.  Moreover, these variants turn out to lead to precisely the same power for quantum computers—namely, the class BQP—as “standard” quantum mechanics, the one over the complex numbers.  So none of those are relevant differences.

Indeed, having just finished teaching an undergrad Intro to Quantum Information course, I can attest that the complex nature of amplitudes is needed only rarely—shockingly rarely, one might say—in quantum computing and information.  Real amplitudes typically suffice.  Teleportation, superdense coding, the Bell inequality, quantum money, quantum key distribution, the Deutsch-Jozsa and Bernstein-Vazirani and Simon and Grover algorithms, quantum error-correction: all of those and more can be fully explained without using a single i that’s not a summation index.  (Shor’s factoring algorithm is an exception; it’s much more natural with complex amplitudes.  But as the previous paragraph implied, their use is removable even there.)

It’s true that, if you look at even the simplest “real” examples of quantum systems—or as a software engineer might put it, at the application layers built on top of the quantum OS—then complex numbers are everywhere, in a way that seems impossible to remove.  The Schrödinger equation, energy eigenstates, the position/momentum commutation relation, the state space of a spin-1/2 particle in 3-dimensional space: none of these make much sense without complex numbers (though it can be fun to try).

But from a sufficiently Olympian remove, it feels circular to use any of this as a “reason” for why quantum mechanics should’ve involved complex amplitudes in the first place.  It’s like, once your OS provides a certain core functionality (in this case, complex numbers), it’d be surprising if the application layer didn’t exploit that functionality to the hilt—especially if we’re talking about fundamental physics, where we’d like to imagine that nothing is wasted or superfluous (hence Rabi’s famous question about the muon: “who ordered that?”).

But why should the quantum OS have provided complex-number functionality at all?  Is it possible to answer that question purely in terms of the OS’s internal logic (i.e., abstract quantum information), making minimal reference to how the OS will eventually get used?  Maybe not—but if so, then that itself would seem worthwhile to know.

If we stick to abstract quantum information language, then the most “obvious, elementary” argument for why amplitudes should be complex numbers is one that I spelled out in Quantum Computing Since Democritus, as well as my Is quantum mechanics an island in theoryspace? paper.  Namely, it seems desirable to be able to implement a “fraction” of any unitary operation U: for example, some V such that V^2=U, or V^3=U.  With complex numbers, this is trivial: we can simply diagonalize U, or use the Hamiltonian picture (i.e., take e^{-iH/2} where U=e^{-iH}), both of which ultimately depend on the complex numbers being algebraically closed.  Over the reals, by contrast, a 2×2 orthogonal matrix like $$ U = \left(\begin{array}[c]{cc}1 & 0\\0 & -1\end{array}\right)$$

has no 2×2 orthogonal square root, as follows immediately from its determinant being -1.  If we want a square root of U (or rather, of something that acts like U on a subspace) while sticking to real numbers only, then we need to add another dimension, like so: $$ \left(\begin{array}[c]{ccc}1 & 0 & 0\\0 & -1 & 0\\0 & 0&-1\end{array}\right)=\left(\begin{array}[c]{ccc}1 & 0 & 0\\0 & 0 & 1\\0 & -1 & 0\end{array}\right) ^{2} $$

This is directly related to the fact that there’s no way for a Flatlander to “reflect herself” (i.e., switch her left and right sides while leaving everything else unchanged) by any continuous motion, unless she can lift off the plane and rotate herself through the third dimension.  Similarly, for us to reflect ourselves would require rotating through a fourth dimension.
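The square-root claims above are easy to check numerically. Here is a minimal sketch, assuming Python with NumPy (a choice of tooling made for illustration; the variable names are hypothetical and nothing in the argument depends on them). It verifies that a complex diagonal matrix squares to U, that the real 3×3 rotation squares to the padded reflection, and that U has determinant -1, which is what blocks a real 2×2 root.

```python
import numpy as np

# U is the 2x2 reflection from the text; it has determinant -1.
U = np.array([[1.0, 0.0], [0.0, -1.0]])

# Over the complex numbers a square root is immediate: U is already
# diagonal, so take square roots of its eigenvalues (sqrt(-1) = i).
V_complex = np.diag([1.0, 1.0j])

# Over the reals, pad U with one extra dimension and rotate by 90 degrees
# in the plane spanned by the last two coordinates.
U_padded = np.diag([1.0, -1.0, -1.0])
V_real = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [0.0, -1.0, 0.0]])

complex_root_ok = np.allclose(V_complex @ V_complex, U)
complex_root_unitary = np.allclose(V_complex.conj().T @ V_complex, np.eye(2))
real_root_ok = np.allclose(V_real @ V_real, U_padded)
real_root_orthogonal = np.allclose(V_real.T @ V_real, np.eye(3))

# A real V with V @ V == U would need det(V)**2 == det(U) == -1.
det_U = np.linalg.det(U)
```

All four boolean flags should come out `True`, and `det_U` should be -1 up to rounding.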

One could reasonably ask: is that it?  Aren’t there any “deeper” reasons in quantum information for why amplitudes should be complex numbers?

Indeed, there are certain phenomena in quantum information that, slightly mysteriously, work out more elegantly if amplitudes are complex than if they’re real.  (By “mysteriously,” I mean not that these phenomena can’t be 100% verified by explicit calculations, but simply that I don’t know of any deep principle by which the results of those calculations could’ve been predicted in advance.)

One famous example of such a phenomenon is due to Bill Wootters: if you take a uniformly random pure state in d dimensions, and then you measure it in an orthonormal basis, what will the probability distribution (p_1,…,p_d) over the d possible measurement outcomes look like?  The answer, amazingly, is that you’ll get a uniformly random probability distribution: that is, a uniformly random point on the simplex defined by p_i≥0 and p_1+…+p_d=1.  This fact, which I’ve used in several papers, is closely related to Archimedes’ Hat-Box Theorem, beloved by friend-of-the-blog Greg Kuperberg.  But here’s the kicker: it only works if amplitudes are complex numbers.  If amplitudes are real, then the resulting distribution over distributions will be too bunched up near the corners of the probability simplex; if they’re quaternions, it will be too bunched up near the middle.
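The bunching can be seen in a quick Monte Carlo experiment. The sketch below is only an illustration, assuming Python with NumPy and the standard trick that normalizing a vector of i.i.d. Gaussians gives a uniformly random point on the sphere; the function name and parameters are hypothetical. For d=2 it compares the variance of the first outcome probability across the three number systems. A uniform distribution on [0,1] has variance 1/12; more weight near the corners raises the variance, and more weight near the middle lowers it.

```python
import numpy as np

def first_outcome_probs(d, reals_per_amplitude, samples, rng):
    """Sample uniformly random unit vectors and return p_1 for each.

    reals_per_amplitude is 1 (real), 2 (complex) or 4 (quaternionic):
    each amplitude is a block of that many i.i.d. Gaussians.
    """
    g = rng.standard_normal((samples, d, reals_per_amplitude))
    weights = (g ** 2).sum(axis=2)   # |amplitude|^2 before normalizing
    probs = weights / weights.sum(axis=1, keepdims=True)
    return probs[:, 0]

rng = np.random.default_rng(0)
d, samples = 2, 200_000
variances = {
    name: float(first_outcome_probs(d, k, samples, rng).var())
    for name, k in [("real", 1), ("complex", 2), ("quaternion", 4)]
}
# Theory for d = 2: real 1/8, complex 1/12 (uniform), quaternion 1/20.
```

The sample variances should land close to 1/8, 1/12 and 1/20 respectively, with only the complex case matching the uniform distribution.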

There’s an even more famous example of such a Goldilocks coincidence—one that’s been elevated, over the past two decades, to exalted titles like “the Axiom of Local Tomography.”  Namely: suppose we have an unknown finite-dimensional mixed state ρ, shared by two players Alice and Bob.  For example, ρ might be an EPR pair, or a correlated classical bit, or simply two qubits both in the state |0⟩.  We imagine that Alice and Bob share many identical copies of ρ, so that they can learn more and more about it by measuring this copy in this basis, that copy in that basis, and so on.

We then ask: can ρ be fully determined from the joint statistics of product measurements—that is, measurements that Alice and Bob can apply separately and locally to their respective subsystems, with no communication between them needed?  A good example here would be the set of measurements that arise in a Bell experiment—measurements that, despite being local, certify that Alice and Bob must share an entangled state.

If we ask the analogous question for classical probability distributions, the answer is clearly “yes.”  That is, once you’ve specified the individual marginals, and you’ve also specified all the possible correlations among the players, you’ve fixed your distribution; there’s nothing further to specify.

For quantum mixed states, the answer again turns out to be yes, but only because amplitudes are complex numbers!  In quantum mechanics over the reals, you could have a 2-qubit state like $$ \rho=\frac{1}{4}\left(\begin{array}[c]{cccc}1 & 0 & 0 & -1\\0 & 1 & 1 & 0\\0 & 1 & 1 & 0\\-1& 0 & 0 & 1\end{array}\right) ,$$

which clearly isn’t the maximally mixed state, yet which is indistinguishable from the maximally mixed state by any local measurement that can be specified using real numbers only.  (Proof: exercise!)
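Short of doing the exercise, one can at least test the claim numerically. This is a minimal sketch, assuming Python with NumPy, and assuming that "local measurement specified with real numbers" can be modeled by tensor products of real symmetric 2×2 matrices (by linearity, agreement on those implies agreement on real local measurement operators). The helper name is hypothetical. It also shows that a complex local observable, Y⊗Y, does tell the two states apart.

```python
import numpy as np

rho = 0.25 * np.array([[1, 0, 0, -1],
                       [0, 1, 1, 0],
                       [0, 1, 1, 0],
                       [-1, 0, 0, 1]], dtype=float)
maximally_mixed = np.eye(4) / 4

def random_real_symmetric(rng):
    """A random real symmetric 2x2 matrix (a 'real' single-qubit observable)."""
    m = rng.standard_normal((2, 2))
    return (m + m.T) / 2

rng = np.random.default_rng(1)
max_gap = 0.0
for _ in range(1000):
    a, b = random_real_symmetric(rng), random_real_symmetric(rng)
    local = np.kron(a, b)
    gap = abs(np.trace(rho @ local) - np.trace(maximally_mixed @ local))
    max_gap = max(max_gap, gap)

# A complex local observable does distinguish them: rho = (I + Y(x)Y)/4.
Y = np.array([[0, -1j], [1j, 0]])
yy_in_rho = np.trace(rho @ np.kron(Y, Y)).real
yy_in_mixed = np.trace(maximally_mixed @ np.kron(Y, Y)).real
```

Here `max_gap` should be zero up to rounding, while `yy_in_rho` is 1 and `yy_in_mixed` is 0.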

In quantum mechanics over the quaternions, something even “worse” happens: namely, the tensor product of two Hermitian matrices need not be Hermitian.  Alice’s measurement results might be described by the 2×2 quaternionic density matrix $$ \rho_{A}=\frac{1}{2}\left(\begin{array}[c]{cc}1 & -i\\i & 1\end{array}\right), $$

and Bob’s results might be described by the 2×2 quaternionic density matrix $$ \rho_{B}=\frac{1}{2}\left(\begin{array}[c]{cc}1 & -j\\j & 1\end{array}\right), $$

and yet there might not be (and in this case, isn’t) any 4×4 quaternionic density matrix corresponding to ρ_A⊗ρ_B, which would explain both results separately.
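To see the failure concretely, here is a worked check of two entries. It assumes the usual entrywise (Kronecker) definition of the tensor product, with Alice's factor multiplied on the left; that convention is an assumption, since the text doesn't fix one. Labeling rows and columns by two-bit strings, $$ (\rho_A\otimes\rho_B)_{00,11} = (\rho_A)_{01}(\rho_B)_{01} = \tfrac{1}{4}(-i)(-j) = \tfrac{k}{4}, \qquad (\rho_A\otimes\rho_B)_{11,00} = (\rho_A)_{10}(\rho_B)_{10} = \tfrac{1}{4}\,i\,j = \tfrac{k}{4}. $$

A Hermitian matrix would need the second entry to be the quaternionic conjugate of the first, namely -k/4, so the product fails to be Hermitian.  Putting Bob's factor on the left instead gives -k/4 in both slots, which fails in the same way.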

What’s going on here?  Why do the local measurement statistics underdetermine the global quantum state with real amplitudes, and overdetermine it with quaternionic amplitudes, being in one-to-one correspondence with it only when amplitudes are complex?

We can get some insight by looking at the number of independent real parameters needed to specify a d-dimensional Hermitian matrix.  Over the complex numbers, the number is exactly d^2: we need 1 parameter for each of the d diagonal entries, and 2 (a real part and an imaginary part) for each of the d(d-1)/2 upper off-diagonal entries (the lower off-diagonal entries being determined by the upper ones).  Over the real numbers, by contrast, “Hermitian matrices” are just real symmetric matrices, so the number of independent real parameters is only d(d+1)/2.  And over the quaternions, the number is d+4[d(d-1)/2] = d(2d-1).

Now, it turns out that the Goldilocks phenomenon that we saw above—with local measurement statistics determining a unique global quantum state when and only when amplitudes are complex numbers—ultimately boils down to the simple fact that $$ (d_A d_B)^2 = d_A^2 d_B^2, $$

but $$\frac{d_A d_B (d_A d_B + 1)}{2} > \frac{d_A (d_A + 1)}{2} \cdot \frac{d_B (d_B + 1)}{2},$$

and conversely $$ d_A d_B (2 d_A d_B - 1) < d_A (2 d_A - 1) \cdot d_B (2 d_B - 1).$$

In other words, only with complex numbers does the number of real parameters needed to specify a “global” Hermitian operator exactly match the product of the number of parameters needed to specify an operator on Alice’s subsystem and the number of parameters needed to specify an operator on Bob’s.  With real numbers it overcounts, and with quaternions it undercounts.
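The counting can be tabulated directly. Below is a small sketch in plain Python (function names are hypothetical) that uses the unsimplified count from the text: one real parameter per diagonal entry, plus 1, 2 or 4 real parameters for each of the d(d-1)/2 upper off-diagonal entries. It then compares the global count with the product of the local counts.

```python
def hermitian_params(d, reals_per_entry):
    """Real parameters in a d x d Hermitian matrix over R (1), C (2) or H (4).

    Diagonal entries are real (1 parameter each); each of the d(d-1)/2
    upper off-diagonal entries carries reals_per_entry parameters.
    """
    return d + reals_per_entry * (d * (d - 1) // 2)

def compare(d_a, d_b, reals_per_entry):
    """Return (global count, product of local counts) for a bipartite system."""
    global_count = hermitian_params(d_a * d_b, reals_per_entry)
    local_product = (hermitian_params(d_a, reals_per_entry)
                     * hermitian_params(d_b, reals_per_entry))
    return global_count, local_product

two_qubits = {name: compare(2, 2, k)
              for name, k in [("real", 1), ("complex", 2), ("quaternion", 4)]}
# two_qubits == {"real": (10, 9), "complex": (16, 16), "quaternion": (28, 36)}
```

For two qubits the global count exceeds the local product over the reals (10 versus 9), matches it over the complex numbers (16 versus 16), and falls short over the quaternions (28 versus 36).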

A major research goal in quantum foundations, since at least the early 2000s, has been to “derive” the formalism of QM purely from “intuitive-sounding, information-theoretic” postulates—analogous to how, in 1905, some guy whose name I forget derived the otherwise strange-looking Lorentz transformations purely from the assumption that the laws of physics (including a fixed, finite value for the speed of light) take the same form in every inertial frame.  There have been some nontrivial successes of this program: most notably, the “axiomatic derivations” of QM due to Lucien Hardy and (more recently) Chiribella et al.  Starting from axioms that sound suitably general and nontechnical (if sometimes unmotivated and weird), these derivations perform the impressive magic trick of deriving the full mathematical structure of QM: complex amplitudes, unitary transformations, tensor products, the Born rule, everything.

However, in every such derivation that I know of, some axiom needs to get introduced to capture “local tomography”: i.e., the “principle” that composite systems must be uniquely determined by the statistics of local measurements.  And while this principle might sound vague and unobjectionable, to those in the business, it’s obvious what it’s going to be used for the second it’s introduced.  Namely, it’s going to be used to rule out quantum mechanics over the real numbers, which would otherwise be a model for the axioms, and thus to “explain” why amplitudes have to be complex.

I confess that I was always dissatisfied with this.  For I kept asking myself: would I have ever formulated the “Principle of Local Tomography” in the first place—or if someone else had proposed it, would I have ever accepted it as intuitive or natural—if I didn’t already know that QM over the complex numbers just happens to satisfy it?  And I could never honestly answer “yes.”  It always felt to me like a textbook example of drawing the target around where the arrow landed—i.e., of handpicking your axioms so that they yield a predetermined conclusion, which is then no more “explained” than it was at the beginning.

Two months ago, something changed for me: namely, I smacked into the “Principle of Local Tomography,” and its reliance on complex numbers, in my own research, when I hadn’t in any sense set out to look for it.  This still doesn’t convince me that the principle is any sort of a-priori necessity.  But it at least convinces me that it’s, you know, the sort of thing you can smack into when you’re not looking for it.

The aforementioned smacking occurred while I was writing up a small part of a huge paper with Guy Rothblum, about a new connection between so-called “gentle measurements” of quantum states (that is, measurements that don’t damage the states much), and the subfield of classical CS called differential privacy.  That connection is a story in itself; here’s our paper and here are some PowerPoint slides.

Anyway, for the paper with Guy, it was of interest to know the following: suppose we have a two-outcome measurement E (let’s say, on n qubits), and suppose it accepts every product state with the same probability p.  Must E then accept every entangled state with probability p as well?  Or, a closely-related question: suppose we know E’s acceptance probabilities on every product state.  Is that enough to determine its acceptance probabilities on all n-qubit states?

I’m embarrassed to admit that I dithered around with these questions, finding complicated proofs for special cases, before I finally stumbled on the one-paragraph, obvious-in-retrospect “Proof from the Book” that slays them in complete generality.

Here it is: if E accepts every product state with probability p, then clearly it accepts every separable mixed state (i.e., every convex combination of product states) with the same probability p.  Now, a well-known result of Braunstein et al., from 1998, states that (surprisingly enough) the separable mixed states have nonzero density within the set of all mixed states, in any given finite dimension.  Also, the probability that E accepts ρ can be written as f(ρ)=Tr(Eρ), which is linear in the entries of ρ.  OK, but a linear function that’s determined on a subset of nonzero density is determined everywhere.  And in particular, if f is constant on that subset then it’s constant everywhere, QED.

But what does any of this have to do with why amplitudes are complex numbers?  Well, it turns out that the 1998 Braunstein et al. result, which was the linchpin of the above argument, only works in complex QM, not in real QM.  We can see its failure in real QM by simply counting parameters, similarly to what we did before.  An n-qubit density matrix requires 4^n real parameters to specify (OK, 4^n-1, if we demand that the trace is 1).  Even if we restrict to n-qubit density matrices with real entries only, we still need 2^n(2^n+1)/2 parameters.  By contrast, it’s not hard to show that an n-qubit real separable density matrix can be specified using only 3^n real parameters—and indeed, that any such density matrix lies in a 3^n-dimensional subspace of the full 2^n(2^n+1)/2-dimensional space of 2^n×2^n symmetric matrices.  (This is simply the subspace spanned by all possible tensor products of n Pauli I, X, and Z matrices—excluding the Y matrix, which is the one that involves imaginary numbers.)
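The dimension gap can be checked by brute force for small n. The following is a minimal sketch, assuming Python with NumPy; the helper names are hypothetical. It builds every n-fold tensor product of I, X and Z, flattens each into a vector, and compares the rank of their span with the dimension of the space of real symmetric matrices.

```python
import itertools
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def real_product_span_dim(n):
    """Dimension of the span of all n-fold tensor products of I, X and Z."""
    vectors = []
    for combo in itertools.product([I2, X, Z], repeat=n):
        m = np.array([[1.0]])
        for factor in combo:
            m = np.kron(m, factor)
        vectors.append(m.ravel())
    return int(np.linalg.matrix_rank(np.array(vectors)))

def symmetric_dim(n):
    """Dimension of the space of real symmetric 2^n x 2^n matrices."""
    d = 2 ** n
    return d * (d + 1) // 2

dims = {n: (real_product_span_dim(n), symmetric_dim(n)) for n in (1, 2, 3)}
# dims == {1: (3, 3), 2: (9, 10), 3: (27, 36)}
```

For a single qubit the two dimensions agree, but from two qubits onward the span of real product operators is strictly smaller. For n=2 the one missing direction is Y⊗Y, which is a real symmetric matrix even though Y itself is not real.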

But it’s not only the Braunstein et al. result that fails in real QM: the fact that I wanted for my paper with Guy fails as well.  As a counterexample, consider the 2-qubit measurement that accepts the state ρ with probability Tr(Eρ), where $$ E=\frac{1}{2}\left(\begin{array}[c]{cccc}1 & 0 & 0 & -1\\0 & 1 & 1 & 0\\0 & 1 & 1 & 0\\-1 & 0 & 0 & 1\end{array}\right).$$

I invite you to check that this measurement, which we specified using a real matrix, accepts every product state (a|0⟩+b|1⟩)(c|0⟩+d|1⟩), where a,b,c,d are real, with the same probability, namely 1/2—just like the “measurement” that simply returns a coin flip without even looking at the state at all.  And yet the measurement can clearly be nontrivial on entangled states: for example, it always rejects $$\frac{\left|00\right\rangle+\left|11\right\rangle}{\sqrt{2}},$$ and it always accepts $$ \frac{\left|00\right\rangle-\left|11\right\rangle}{\sqrt{2}}.$$
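For anyone who would rather let a computer do the checking, here is a minimal sketch, assuming Python with NumPy; the helper names are hypothetical. It evaluates the acceptance probability on random real product states and on the two Bell states, and then on one product state with complex amplitudes, to show that the blindness is specific to real QM.

```python
import numpy as np

E = 0.5 * np.array([[1, 0, 0, -1],
                    [0, 1, 1, 0],
                    [0, 1, 1, 0],
                    [-1, 0, 0, 1]], dtype=float)

def accept_prob(state):
    """Tr(E |psi><psi|) for a normalized state vector psi."""
    return float(np.real(np.vdot(state, E @ state)))

def random_real_qubit(rng):
    """A random unit vector a|0> + b|1> with real a, b."""
    v = rng.standard_normal(2)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(2)
product_probs = [
    accept_prob(np.kron(random_real_qubit(rng), random_real_qubit(rng)))
    for _ in range(1000)
]

bell_plus = np.array([1, 0, 0, 1]) / np.sqrt(2)    # (|00> + |11>)/sqrt(2)
bell_minus = np.array([1, 0, 0, -1]) / np.sqrt(2)  # (|00> - |11>)/sqrt(2)
p_bell_plus = accept_prob(bell_plus)
p_bell_minus = accept_prob(bell_minus)

# Once complex amplitudes are allowed, product states can see E:
plus_i = np.array([1, 1j]) / np.sqrt(2)            # (|0> + i|1>)/sqrt(2)
p_complex_product = accept_prob(np.kron(plus_i, plus_i))
```

Every entry of `product_probs` should equal 1/2 up to rounding, `p_bell_plus` should be 0 and `p_bell_minus` should be 1. The complex product state is accepted with probability 1, so in complex QM this E is not constant on product states.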

Is it a coincidence that we used exactly the same 4×4 matrix (up to scaling) to produce a counterexample to the real-QM version of Local Tomography, and also to the real-QM version of the property I wanted for the paper with Guy?  Is anything ever a coincidence in this sort of discussion?

I claim that, looked at the right way, Local Tomography and the property I wanted are the same property, their truth in complex QM is the same truth, and their falsehood in real QM is the same falsehood.  Why?  Simply because Tr(Eρ), the probability that the measurement E accepts the mixed state ρ, is a function of two Hermitian matrices E and ρ (both of which can be either “product” or “entangled”), and—crucially—is symmetric under the interchange of E and ρ.

Now it’s time for another confession.  We’ve identified an elegant property of quantum mechanics that’s true but only because amplitudes are complex numbers: namely, if you know the probability that your quantum circuit accepts every product state, then you also know the probability that it accepts an arbitrary state.  Yet, despite its elegance, this property turns out to be nearly useless for “real-world applications” in quantum information and computing.  The reason for the uselessness is that, for the property to kick in, you really do need to know the probabilities on product states almost exactly—meaning (say) to 1/exp(n) accuracy for an n-qubit state.

Once again a simple example illustrates the point.  Suppose n is even, and suppose our measurement simply projects the n-qubit state onto a tensor product of n/2 Bell pairs.  Clearly, this measurement accepts every n-qubit product state with exponentially small probability, even as it accepts the entangled state 
$$\left(\frac{\left|00\right\rangle+\left|11\right\rangle}{\sqrt{2}}\right)^{\otimes n/2}$$

with probability 1.  But this implies that noticing the nontriviality on entangled states would require knowing the acceptance probabilities on product states to exponential accuracy.
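The "exponentially small" claim can be made quantitative with one application of Cauchy-Schwarz. As a sketch, write a two-qubit product state as (a_0|0⟩+a_1|1⟩)(b_0|0⟩+b_1|1⟩), where the symbols a_0, a_1, b_0, b_1 are introduced here only for illustration. Its overlap with a single Bell pair satisfies $$ \left|\frac{\langle 00|+\langle 11|}{\sqrt{2}}\,\big(a_0|0\rangle+a_1|1\rangle\big)\big(b_0|0\rangle+b_1|1\rangle\big)\right|^2 = \frac{|a_0 b_0 + a_1 b_1|^2}{2} \le \frac{\big(|a_0|^2+|a_1|^2\big)\big(|b_0|^2+|b_1|^2\big)}{2} = \frac{1}{2}. $$

Since both the n-qubit product state and the projector factorize across the n/2 pairs, the acceptance probability on any product state is at most 2^{-n/2}.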

In a sense, then, I come back full circle to my original puzzlement: why should Local Tomography, or (alternatively) the-determination-of-a-circuit’s-behavior-on-arbitrary-states-from-its-behavior-on-product-states, have been important principles for Nature’s laws to satisfy?  Especially given that, in practice, the exponential accuracy required makes it difficult or impossible to exploit these principles anyway?  How could we have known a-priori that these principles would be important—if indeed they are important, and are not just mathematical spandrels?

But, while I remain less than 100% satisfied about “why the complex numbers? why not just the reals?,” there’s one conclusion that my recent circling-back to these questions has made me fully confident about.  Namely: quantum mechanics over the quaternions is a flaming garbage fire, which would’ve been rejected at an extremely early stage of God and the angels’ deliberations about how to construct our universe.

In the literature, when the question of “why not quaternionic amplitudes?” is discussed at all, you’ll typically read things about how the parameter-counting doesn’t quite work out (just like it doesn’t for real QM), or how the tensor product of quaternionic Hermitian matrices need not be Hermitian.  In this paper by McKague, you’ll read that the CHSH game is winnable with probability 1 in quaternionic QM, while in this paper by Fernandez and Schneeberger, you’ll read that the non-commutativity of the quaternions introduces an order-dependence even for spacelike-separated operations.

But none of that does justice to the enormity of the problem.  To put it bluntly: unless something clever is done to fix it, quaternionic QM allows superluminal signaling.  This is easy to demonstrate: suppose Alice holds a qubit in the state |1⟩, while Bob holds a qubit in the state |+⟩ (yes, this will work even for unentangled states!).  Also, let $$U=\left(\begin{array}[c]{cc}1 & 0\\0 & j\end{array}\right) ,~~~V=\left(\begin{array}[c]{cc}1 & 0\\0& i\end{array}\right).$$

We can calculate that, if Alice applies U to her qubit and then Bob applies V to his qubit, Bob will be left with the state $$ \frac{j \left|0\right\rangle + k \left|1\right\rangle}{\sqrt{2}}.$$

By contrast, if Alice decided to apply U only after Bob applied V, Bob would be left with the state 
$$ \frac{j \left|0\right\rangle - k \left|1\right\rangle}{\sqrt{2}}.$$

But Bob can distinguish these two states with certainty, for example by applying the unitary $$ \frac{1}{\sqrt{2}}\left(\begin{array}[c]{cc}j & k\\k & j\end{array}\right). $$

Therefore Alice communicated a bit to Bob.
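The whole calculation fits in a few lines of code. What follows is a minimal sketch in plain Python, with a hand-rolled `Quat` class and a `run` helper that are both hypothetical names introduced for illustration. It assumes that a gate acts by multiplying amplitudes on the left, which is the convention that reproduces the two states quoted above.

```python
from dataclasses import dataclass
import math

@dataclass(frozen=True)
class Quat:
    """Quaternion w + x*i + y*j + z*k."""
    w: float = 0.0
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0

    def __mul__(self, o):
        # Hamilton product; note that it is not commutative.
        return Quat(
            self.w*o.w - self.x*o.x - self.y*o.y - self.z*o.z,
            self.w*o.x + self.x*o.w + self.y*o.z - self.z*o.y,
            self.w*o.y - self.x*o.z + self.y*o.w + self.z*o.x,
            self.w*o.z + self.x*o.y - self.y*o.x + self.z*o.w,
        )

    def __add__(self, o):
        return Quat(self.w + o.w, self.x + o.x, self.y + o.y, self.z + o.z)

    def scale(self, s):
        return Quat(self.w*s, self.x*s, self.y*s, self.z*s)

    def norm_sq(self):
        return self.w**2 + self.x**2 + self.y**2 + self.z**2

ONE, I, J, K = Quat(1), Quat(0, 1), Quat(0, 0, 1), Quat(0, 0, 0, 1)

def run(order):
    """Bob's outcome probabilities after applying the gates in `order`."""
    s = 1 / math.sqrt(2)
    # Joint amplitudes indexed by (alice_bit, bob_bit); start in |1>|+>.
    amp = {(1, 0): Quat(s), (1, 1): Quat(s)}
    # Gate name -> (which qubit it acts on, its diagonal entries).
    diag = {"U": (0, (ONE, J)), "V": (1, (ONE, I))}
    for gate in order:
        qubit, entries = diag[gate]
        amp = {bits: entries[bits[qubit]] * a for bits, a in amp.items()}
    # Alice's qubit is definitely |1>, so these are Bob's amplitudes.
    b0, b1 = amp[(1, 0)], amp[(1, 1)]
    # Bob applies (1/sqrt(2)) [[j, k], [k, j]] and measures.
    out0 = (J * b0 + K * b1).scale(s)
    out1 = (K * b0 + J * b1).scale(s)
    return out0.norm_sq(), out1.norm_sq()

probs_u_then_v = run(["U", "V"])
probs_v_then_u = run(["V", "U"])
```

Under these assumptions, `probs_u_then_v` comes out as (1, 0) and `probs_v_then_u` as (0, 1) up to rounding, so Bob's measurement outcome reveals the order in which the two gates were applied.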

I’m aware that there’s a whole literature on quaternionic QM, including for example a book by Adler.  Would anyone who knows that literature be kind enough to enlighten us on how it proposes to escape the signaling problem?  Regardless of the answer, though, it seems worth knowing that the “naïve” version of quaternionic QM—i.e., the version that gets invoked in quantum information discussions like the ones I mentioned above—is just immediately blasted to smithereens by the signaling problem, without the need for any subtle considerations like the ones that differentiate real from complex QM.

Update (Dec. 20): In response to this post, Stephen Adler was kind enough to email me with further details about his quaternionic QM proposal, and to allow me to share them here. Briefly, Adler completely agrees that quaternionic QM inevitably leads to superluminal signaling—but in his proposal, the surprising and nontrivial part is that quaternionic QM would reduce to standard, complex QM at large distances. In particular, the strength of a superluminal signal would fall off exponentially with distance, quickly becoming negligible beyond the Planck or grand unification scales. Despite this, Adler says that he eventually abandoned his proposal for quaternionic QM, since he was unable to make specific particle physics ideas work out (but the quaternionic QM proposal then influenced his later work).

Unrelated Update (Dec. 18): Probably many of you have already seen it, and/or already know what it covers, but the NYT profile of Donald Knuth (entitled “The Yoda of Silicon Valley”) is enjoyable and nicely written.

114 Responses to “Why are amplitudes complex?”

  1. Joshua Zelinsky Says:

    So one takeaway is that if we want to make a scifi story that has FTL communication, one reasonable technobabble is to have people discover a class of particles that act with a quaternionic version of QM? I’m surprised this isn’t more well known (it certainly is more interesting than just saying the word “tachyon”).

    More serious comments: My understanding is that non-linearity allows both larger complexity classes than BQP and allows FTL signaling. But quaternionic QM despite allowing FTL only is giving BQP for its polynomial time class. How big is quaternionic QM if it is done with special relativity? My guess would be that this should allow you to do all of PSPACE in polynomial time. Is this correct? Is this well-defined? (It isn’t completely obvious to me that there’s a consistent version of quaternionic QM with special relativity).

    Second, how much worse do things get if one tries to do QM over the octonions? Is this even something we can reasonably define? What about generic levels of the Cayley-Dickson construction?

  2. Sanketh Says:

    How could you mention Chris Fuchs and not talk about SICs?? Maybe I am disillusioned but it seems like you are talking about SICs all the way through the post.

    A natural question that comes up when trying to implement a quantum OS is quantum mechanics over a finite field. My understanding is that a similar question comes up in cryptography. I wonder if all these properties stick if we are working over (the algebraic closure of) a finite field?

  3. Scott Says:

    Sanketh #2: I’m going to institute a blanket ban on comments of the form “how could you not have mentioned X??,” if they don’t concretely explain how X relates to what I was talking about (the same person having thought about both is not a sufficient relationship).

    Yes, I’ve seen some amusing papers about QM over finite fields; the obvious difficulty there is that you’ve got no “greater than” and “less than” relationships and no clear way to get probabilities. Or if you do try to put an operational interpretation on it, you find you can solve NP-hard problems in polynomial time and things like that.

  4. Peter Morgan Says:

    There are two elementary ways in which a complex structure appears: fourier analysis and the Hodge dual. Engineers introduce a j just to give a basis for the 2-dimensional space spanned by sine and cosine components, which of course comes in handy. Fourier analysis appears naturally in field theories just because of translation invariance, which plays down into finite-dimensional subsystems, but it also appears in probability theory if we ever find it useful to introduce a characteristic function, which leads to a curious elementary paper by Leon Cohen, Foundations of Physics 18, 983 (1988), “Rules of Probability in Quantum Mechanics”. The Hodge dual more-or-less only works cleanly if we work only with 2-forms in electromagnetism/quantum optics, but anyway it’s there.
    I’d be glad to know of other elementary ways in which a complex structure appears?

  5. Sniffnoy Says:

    Scott: So with the particular result that didn’t work over the reals, does that work over the quaternions, since you don’t mention it? It seems naively like it might, since quaternions fail local tomography in the opposite direction, even if the quaternions don’t work for other reasons. Or maybe it really is equivalent to local tomography and they fail it too!

    Also, sorry, basic question, but I’m confused by the superluminal signalling — how do Alice and Bob’s operations affect the other’s qubits? I thought they had two separate qubits? There must be something implicit in the setup I’m missing here.

    Josh: I’m pretty sure the non-associativity wrecks things. Like, consider that L_{ab} is not necessarily L_a L_b. (Where L_a is left-multiplication by a.) So — even in one dimension, where the matrices are individual numbers — matrix multiplication and matrix composition are no longer the same thing. I mean, composition has to be associative, while multiplication here isn’t. That… well, OK, maybe that’s possible to recover from somehow, but I’d be very surprised. I mean, will L_a L_b even in general be of the form L_c at all? Probably not, right? “Composition” is probably not well-defined unless you pass to a larger space. This seems pretty hard to recover from.

    This does make me wonder how bad other sorts of pathologies are… like, rather than confining ourselves to Cayley-Dickson, we could ask about algebras over the reals more generally. What about ones that aren’t division algebras, that have zero-divisors? Could those work at all? (There’s likely an obvious problem with those too but I don’t know enough QM for it to be obvious to me. 😛 )

    What I’m wondering about, and this might also be a dumb question, is what if you didn’t start from the reals but rather, say, the rationals. Because then you get all sorts of other algebras over those, right? Do those all embed in the quaternions? I don’t know this area well enough. Maybe some of those might produce something interesting (if totally unphysical 😛 ).

  6. Scott Says:

    Joshua #1:

      My understanding is that non-linearity allows both larger complexity classes than BQP and allows FTL signaling. But quaternionic QM despite allowing FTL only is giving BQP for its polynomial time class.

    Interesting observation, yes!

      How big is quaternionic QM if it is done with special relativity?

    If you say: quaternionic QM gives you superluminal signaling, and superluminal signaling + SR gives you CTCs, and if you model CTCs the way Deutsch did, then my result with Watrous would indeed give an answer of PSPACE!

      Second, how much worse do things get if one tries to do QM over the octonions?

    Given what a garbage fire I decided quaternionic QM already was, I shudder to think… 🙂

  7. Scott Says:

    Sniffnoy #5: Yes, I believe you do have unique reconstruction of the global state (or circuit) from its behavior on product probes in quaternionic QM. But there, again, you have to be extremely careful what you mean, since the global state might not even exist!

    For present purposes, I was treating the distinction between (e.g.) the reals and the rationals as “unphysical,” since no finite experiment could ever tell us whether we were dealing with real amplitudes, or rational ones that approximated them arbitrarily closely.

  8. Job Says:

    Moreover, these variants turn out to lead to precisely the same power for quantum computers—namely, the class BQP—as “standard” quantum mechanics, the one over the complex numbers.

    Does that mean quantum computers can efficiently simulate a quaternionic world?

    That is confusing to me given your superluminal example. I would expect superluminal signaling to have applications beyond BQP.

  9. Dmitry Says:

    Hi Scott,

    Thanks for this post, it’s a topic I am interested in.
    Is there a good writeup somewhere attempting to get “as far as possible” into a derivation of QM over the reals? I feel like it might be useful pedagogically to introduce QM without complex numbers if such a formulation captures most of the essential properties of QM.

  10. murmur Says:

    Like Sniffnoy #5 I also don’t understand how Alice and Bob’s local operations (in quaternionic QM) affect others’ qubits. Can you explain?

  11. Sniffnoy Says:

    For present purposes, I was treating the distinction between (e.g.) the reals and the rationals as “unphysical,” since no finite experiment could ever tell us whether we were dealing with real amplitudes, or rational ones that approximated them arbitrarily closely.

    But what if you were using not rationals, but some weird algebra over the rationals, that doesn’t work over the reals, is what I was wondering? Like I said, this is possibly a stupid question, but it’s not about just using Q directly (or Q(i), or rational quaternions).

  12. Scott Says:

    Sniffnoy #5 and murmur #10: The key is that, because quaternions don’t commute, multiplying the wavefunction by a global phase can actually matter. And multiplying by a global phase is something that Bob can notice even if Alice does it. If you don’t believe me, then go through the example in the OP with pen and paper and see for yourself! 🙂

  13. Scott Says:

    Sniffnoy #11: Can you give me an example of the sort of algebra you’re talking about? Keeping in mind that, for physics purposes, I’m making no distinction between A and B if A is dense in B?

  14. Sniffnoy Says:

    Well, I guess come to think of it, my “what about over the rationals” question is closely related to my “what if we allow zero-divisors” question. Since we can always perform extension of scalars, right? The only question is whether the result has zero-divisors or not. So, maybe forget the rationals and just focus on the zero-divisors. 🙂

    (A silly example here would be, like, Q(√2); obviously if you perform extension of scalars to R there you will get zero-divisors. As for less-silly examples… um, I dunno, like I said I don’t know this area well enough. 😛 I was under the impression they exist but maybe they don’t.)

  15. Job Says:

    If Alice and Bob’s qubits are not entangled, then what distinguishes Alice’s qubit from any other qubit in the universe?

    How would Bob know that U was applied to Alice’s qubit rather than a qubit belonging to someone else?

  16. Sniffnoy Says:

    Er to be clear Q(√2) is a silly example because it already embeds in R. Ideally we’d want something that doesn’t embed in H, like. This is where once again I admit that I don’t know this area well enough.

  17. Scott Says:

    Job #15: Yes, that’s precisely the kind of question that caused me to characterize quaternionic QM as a “flaming garbage fire” that makes zero physical sense—unless Adler or the others have some clever way to tame the nonlocality that I don’t understand.

  18. Anonymous Says:

    One obvious thing about QM that bugs almost everyone is how a particle knows about the paths it didn’t take, and how they can all interfere with measurement. I was thinking maybe a particle is always entangled with at least the universe. The universe would have the resources to calculate the paths the particle could have taken, and that could be beneficial to the universe, acting sort of like a Maxwell’s Demon that generates more energy than it uses and allows the universe to assert its will on the particles because it has way more knowledge of the possible futures.

    If a free single particle is entangled with the universe, then only real numbers are needed instead of complex numbers: the universe would control and store the imaginary part, and the particle would have the real part. With two entangled particles, the universe would be the hidden third partner: instead of 4 complex numbers, 8 real numbers.

    I searched for QM methods that use only real numbers and found several, dating back to at least 2005, that used only Hadamard and Toffoli gates, both of which have only real entries. One qubit was added to store the imaginary part used in ordinary QM. This not only uses just real numbers but a restricted set of real numbers that are related to roots of 1/2, since that is the only non-integer in the Hadamard gate and the Toffoli gate is all integers.

    Although none of the authors on the arXiv offered the interpretation of being entangled with the universe, if you aren’t theophobic the idea is fascinating.

  19. Sniffnoy Says:

    Scott #12: So, to see if I have this straight, those |1>’s in the descriptions of Bob’s state are not the actual original |1>, but rather we have “renamed” |1> to mean the state that Alice now has, rather than the one she originally had? And renamed |0> similarly because we are applying a global phase change?

  20. Scott Says:

    Sniffnoy #19: No, nothing remotely like that is going on; don’t know where you got the idea that it was. There’s just the usual matrix-vector multiplication that happens in any quantum evolution, except with the non-commutativity of the quaternions now wreaking havoc. In particular, taking a single state |ψ⟩=(|0⟩+i|1⟩)/√2 and multiplying all amplitudes by j can now lead to two orthogonal states, depending on whether the multiplication takes place on the left or on the right!

    Which presumably means: either you need to fix a total ordering on all spatial locations (determining who multiplies to the left and to the right of whom), or else you need to accept signaling between the locations even when they aren’t entangled.
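
    The left-versus-right claim above can be checked mechanically. The following is a minimal Python sketch, not anything from the post: quaternions are stored as (w, x, y, z) tuples, the helper names (qmul, qconj, inner) are made up for illustration, and the 1/√2 normalization is dropped since it doesn't affect orthogonality.

```python
# Quaternions as (w, x, y, z) tuples, meaning w + x*i + y*j + z*k.
ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

def qmul(a, b):
    """Hamilton product a*b (so ij = k and ji = -k)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def qconj(a):
    """Quaternionic conjugate."""
    w, x, y, z = a
    return (w, -x, -y, -z)

def inner(u, v):
    """Sum of conj(u_n) * v_n over components (one common convention)."""
    total = (0, 0, 0, 0)
    for a, b in zip(u, v):
        term = qmul(qconj(a), b)
        total = tuple(s + t for s, t in zip(total, term))
    return total

# |psi> proportional to |0> + i|1> (normalization omitted).
psi = [ONE, I]
left = [qmul(J, amp) for amp in psi]    # j on the left:  (j, ji) = (j, -k)
right = [qmul(amp, J) for amp in psi]   # j on the right: (j, ij) = (j,  k)
overlap = inner(left, right)            # zero means the two results are orthogonal
print(left, right, overlap)
```

    With this convention the two results are (j, -k) and (j, k), and their inner product vanishes, so a would-be "global phase" j produces two orthogonal states depending on the side it multiplies on.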

  21. Aaron Denney Says:

    I couldn’t believe the nonlocal communication example until I worked through the math explicitly myself:

    Thankfully, we always have pure states.

    Starting with [0 0 1 1]^T / sqrt(2), and applying U = diag(1 1 j j) followed by V = diag(1 i 1 i) (i.e. the matrix VU = diag(1 i j k)), we get
    [0 0 j k]^T / sqrt(2), vs [0 0 j -k]^T / sqrt(2) for V then U (i.e. the matrix UV = diag(1 i j -k)).

    Bob applying W = [j k; k j]/sqrt(2) takes these to [0 0 -1 0]^T and
    [0 0 0 -i]^T, which are indeed disjoint in the standard basis.

    And W is indeed unitary: W W^\dagger = W W^* = I_2 (jj + kk = -2, and jk + kj = 0), though I’m not sure why you didn’t take the simpler looking [1 i; i 1]/sqrt(2) for W.

    The key difference is that in complex-QM applying standard-basis phases in a local manner always commutes, but *doesn’t* in quaternionic-QM, and not merely in an up-to-global-phase manner.
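
    For anyone who would rather let a machine do the bookkeeping, here is a rough Python sketch of the calculation above. It assumes the Hamilton convention ij = k (so ji = -k), matrices acting by left multiplication, and basis order |Alice Bob>; the helper names are hypothetical, and the factors of 1/sqrt(2) are dropped and restored at the end so that everything stays in integers.

```python
# Quaternions as (w, x, y, z) tuples, meaning w + x*i + y*j + z*k.
ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)
I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

def qmul(a, b):
    """Hamilton product a*b (so ij = k and ji = -k)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def qadd(a, b):
    return tuple(p + q for p, q in zip(a, b))

def apply_diag(diag, state):
    """Left-multiply each amplitude by the matching diagonal entry."""
    return [qmul(d, s) for d, s in zip(diag, state)]

def apply_to_bob(w, state):
    """Apply a 2x2 quaternionic matrix to Bob's (second) qubit only."""
    out = []
    for alice in (0, 1):
        b0, b1 = state[2 * alice], state[2 * alice + 1]
        for row in w:
            out.append(qadd(qmul(row[0], b0), qmul(row[1], b1)))
    return out

def halve(state):
    """Restore the two dropped 1/sqrt(2) factors (all entries are even here)."""
    return [tuple(c // 2 for c in q) for q in state]

start = [ZERO, ZERO, ONE, ONE]   # sqrt(2) * |1>|+>
U = [ONE, ONE, J, J]             # Alice's diagonal: phase j when her qubit is |1>
V = [ONE, I, ONE, I]             # Bob's diagonal: phase i when his qubit is |1>
W2 = [[J, K], [K, J]]            # sqrt(2) * W

u_then_v = apply_diag(V, apply_diag(U, start))   # the matrix VU
v_then_u = apply_diag(U, apply_diag(V, start))   # the matrix UV
final_u_first = halve(apply_to_bob(W2, u_then_v))
final_v_first = halve(apply_to_bob(W2, v_then_u))
print(final_u_first)
print(final_v_first)
```

    Under these conventions the two orderings end on different standard-basis states (amplitude -1 on |10> versus amplitude -i on |11>), so Bob's measurement reveals which of the two orders occurred.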

  22. Sniffnoy Says:

    Scott #20: I’m quite confused then. Can we do this out?

    Alice has |1>. Bob has |+>, which means (|0>+|1>)/√2, right? (Or is it (|0>+i|1>)/√2, based on the rest of the comment? Well, it won’t make much difference.) Alice applies U to her qubit. So she now has j|1>, and Bob still has (|0>+|1>)/√2. Then Bob applies V to his qubit, so Alice still has j|1>, and Bob now has (|0>+i|1>)/√2. I’m not seeing anything like what you say; and as mentioned if I had what |+> means wrong it doesn’t really affect anything. Indeed, since the operations are acting on different qubits, they commute. I’m assuming I’m misunderstanding something here. What is it?

    Pre-posting edit: Oh, I think I see what my mistake is. The mistake is the assumption that Alice acting on her qubit doesn’t affect Bob’s; rather, when I actually do it out… well, I didn’t actually do it all the way out because I’m too lazy, but rather it looks like there’s a phase change on Bob’s. Which normally we could ignore, and say that Bob’s qubit is unaffected — because, I mean, duh — but due to the issues you mention, we can’t do that here. Yikes! So yeah, global phase becomes a real thing you have to keep track of, meaning the normally-fictional effect on everyone else’s qubits of acting on your own suddenly becomes very real, allowing superluminal signalling. Hoo boy…

    In particular, taking a single state |ψ⟩=(|0⟩+i|1⟩)/√2 and multiplying all amplitudes by j can now lead to two orthogonal states, depending on whether the multiplication takes place on the left or on the right!

    That at least is an easy problem to solve; you have to fix a side to multiply on in advance. I mean, when we speak of linear algebra over H, we just mean modules over H, not bimodules. Only multiplication on one side is meaningful, and you have to pick in advance which it is.

    …except that, due to noncommutativity, whichever side you’re doing your multiplications on, your linear transformations will be given by multiplication on the other side. So I guess I should say, the two sides have different meanings, rather than one side not being meaningful. In any case, you don’t get to choose and end up with two different results.

  23. Matt V. Says:

    Scott– A question that’s been bugging me: How is extending quantum theory from the complex numbers to the quaternions materially different from just taking quantum theory over the complex numbers and adjoining an ancillary spin-1/2 system, given that the 2×2 Pauli matrices give a representation of the quaternions?

  24. James B Says:

    John Baez made an argument for the complex numbers in QM from the correspondence of observables with symmetry generators. (See e.g. https://johncarlosbaez.files.wordpress.com/2018/09/noether_cqt_web2.pdf ) Any thoughts?

  25. Spencer Bliven Says:

    Scott #6:

    Second, how much worse do things get if one tries to do QM over the octonions?

    Given what a garbage fire I decided quaternionic QM already was, I shudder to think…

    Assuming it doesn’t work with octonions, would that provide evidence against string theories that rely on spinors?

  26. Scott Says:

    Sniffnoy #22: If we had to pick in advance which side to multiply on, what would that physically mean? Suppose, for example, that we had three agents A, B, and C, and we decided that A always multiplies to the left of B who always multiplies to the left of C. And suppose it happens that first A applies a unitary operation, then C, then B. Do B’s contributions now need to get inserted in the middle? Meaning: does our “state” now need to record, not merely the quaternionic amplitude for each basis state, but also the detailed formula that gave rise to that amplitude, so that we know how to perform such an insertion if necessary?

  27. Scott Says:

    Matt V. #23: Indeed, what you say is precisely how you simulate an n-qubit quaternionic quantum computer using an (n+1)-qubit standard quantum computer—which I alluded to in the post as being possible (see the Fernandez-Schneeberger paper for a fully explicit proof). But notice that that simulation, because of its reliance on a “magical (n+1)st qubit in the sky,” breaks spatial locality (even as it preserves computational efficiency).

    In a completely analogous way, you can simulate an n-qubit standard quantum computer using an (n+1)-qubit real amplitude quantum computer (indeed, I typically give this as homework in my QC courses). Again, the simulation relies on a “magical (n+1)st qubit in the sky” and so breaks spatial locality.

    But the difference is the following: complex QM is a perfectly sensible theory in its own right (well, modulo the measurement problem and so forth 🙂 ), one where, in particular, a No-Communication Theorem can be formulated and proved. With quaternionic QM, by contrast, the nonlocality in the simulation of the theory is gesturing toward a nonlocality that actually manifests in the theory itself (because of quaternions’ noncommutativity), and that leads to superluminal signaling!

  28. Scott Says:

    Spencer #25:

      Assuming it doesn’t work with octonions, would that provide evidence against string theories that rely on spinors?

    No, the things have nothing to do with each other. The Dirac theory of the electron, supersymmetry, and string theory are all examples of theories that involve algebraic structures more complicated than the complex numbers—but crucially, they all use those structures in what this post called the “application layer,” while still running on the same old quantum OS underneath. I.e., in all these theories, the amplitude for an event to happen is still a complex number, the same as it was in 1926.

  29. Matt V. Says:

    Thanks Scott. That’s exactly what I thought – the ancillary spin-1/2 system (what you call “the (n+1)st qubit in the sky”) violates commutativity between spatially separated systems and thereby violates the no-communication theorem, and that’s one way to understand why quaternionic quantum theory is nonlocal.

    A lingering question: naively, going from complex to quaternionic quantum theory is trivial in this way, but going from real to complex quantum theory seems less trivial, because adding an “(n+1)st qubit in the sky” to real quantum theory likewise adds new noncommutativity and violates spatial locality, no? Actually, I’m also not quite clear on how this approach supposedly gives us just “i,” which commutes with everything.

  30. Scott Says:

    Matt #29: If you like, “adding an (n+1)st qubit in the sky” has the potential to break locality and introduce noncommutativity of spacelike-separated operations. But when you pass from real to complex QM, it ends up not doing so—precisely because all you’re ever using the extra qubit for is to simulate complex QM, and we already know that complex QM doesn’t have those bad properties. (Here I’m assuming that you never actually measure the qubit in the sky, or do anything else with it that wouldn’t be allowed in the complex QM that you’re simulating.)

    Other possible extensions involving “qubits in the sky” do indeed have the bad properties—with quaternionic QM (or at least the naïve version of it) furnishing a perfect example.

  31. Matt V. Says:

    Scott– Right, but how do we get “i” in this way from starting with real QM and adding an ancillary qubit? The Pauli matrices (let’s stick with sigma_x and sigma_z to stay intrinsically real-valued) for the (n+1)st ancillary qubit do commute with all the operators for the first n qubits, just as “i” should commute with everything in the n-qubit system, but we need more properties than that – we need the real-transpose operation to be the complex-conjugation/adjoint operation. We also need “i” to show up in energy eigenstates, [x,p]=ihbar, etc.

  32. Scott Says:

    Matt #31: See the Fernandez-Schneeberger paper for the details. It’s just a mathematical point about embedding a unitary group into a higher-dimensional orthogonal group, and encoding a single complex amplitude by a pair of real amplitudes (on basis states that you’ll never distinguish)—you seem to be reading way too much into it.
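
    As a concrete illustration of "encoding a single complex amplitude by a pair of real amplitudes," here is a small pure-Python sketch. It is only one standard way to write the embedding (block form [[Re, -Im], [Im, Re]], with the extra qubit labelling real versus imaginary part), not necessarily the construction used in the paper; the function names and the sample U and psi are made up for the example.

```python
import math

def real_embed_matrix(U):
    """Map an n x n complex matrix to the 2n x 2n real matrix [[Re, -Im], [Im, Re]]."""
    n = len(U)
    top = [[U[r][c].real for c in range(n)] + [-U[r][c].imag for c in range(n)]
           for r in range(n)]
    bottom = [[U[r][c].imag for c in range(n)] + [U[r][c].real for c in range(n)]
              for r in range(n)]
    return top + bottom

def real_embed_vector(v):
    """Map a complex vector to (real parts, then imaginary parts)."""
    return [z.real for z in v] + [z.imag for z in v]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

s = 1 / math.sqrt(2)
U = [[complex(s, 0), complex(0, s)],
     [complex(0, s), complex(s, 0)]]          # a sample 1-qubit unitary
psi = [complex(0.6, 0), complex(0, 0.8)]      # a sample normalized state

# Evolve in complex QM, then encode; versus encode, then evolve with real numbers only.
direct = real_embed_vector(matvec(U, psi))
via_real = matvec(real_embed_matrix(U), real_embed_vector(psi))

# The embedded matrix should be orthogonal: O^T O = identity.
O = real_embed_matrix(U)
gram = [[sum(O[k][r] * O[k][c] for k in range(4)) for c in range(4)] for r in range(4)]
print(direct, via_real)
```

    The two routes agree, and the 4x4 real matrix is orthogonal, which is the sense in which the unitary group U(n) sits inside the orthogonal group O(2n).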

  33. Ezra Says:

    You can already find quaternions hiding in regular, complex QM – in the words of Wikipedia, “The real linear span of {I, iσ[1], iσ[2], iσ[3]} [for σ[a] the Pauli matrices] is isomorphic to the real algebra of quaternions ℍ.”

    One effect of that is that expressions over complex numbers and Pauli matrices can be rewritten as expressions over complex numbers and quaternions, without reference to matrices. Which is at least pleasing to algebraists.
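
    The isomorphism quoted above is easy to verify by hand or by machine. Below is a minimal Python sketch; the identification i ↦ -iσ[1], j ↦ -iσ[2], k ↦ -iσ[3] is one sign convention among several that work, and the names QI, QJ, QK are chosen just for this example.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def scale(s, A):
    return tuple(tuple(s * x for x in row) for row in A)

ID = ((1, 0), (0, 1))
SX = ((0, 1), (1, 0))          # sigma_1
SY = ((0, -1j), (1j, 0))       # sigma_2
SZ = ((1, 0), (0, -1))         # sigma_3

# Candidate images of the quaternion units i, j, k.
QI, QJ, QK = scale(-1j, SX), scale(-1j, SY), scale(-1j, SZ)
MINUS_ID = scale(-1, ID)

relations = {
    "i^2 = -1": matmul(QI, QI) == MINUS_ID,
    "j^2 = -1": matmul(QJ, QJ) == MINUS_ID,
    "k^2 = -1": matmul(QK, QK) == MINUS_ID,
    "ij = k": matmul(QI, QJ) == QK,
    "jk = i": matmul(QJ, QK) == QI,
    "ki = j": matmul(QK, QI) == QJ,
    "ji = -k": matmul(QJ, QI) == scale(-1, QK),
}
print(relations)
```

    All seven relations hold, so real linear combinations of I, QI, QJ, QK multiply exactly like quaternions.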

  34. Veit Elser Says:

    The complex numbers, unlike the reals, have a non-trivial automorphism. Apologies if this was already pointed out, but it didn’t show up when I searched “automorphism” on this page and your post. As you know, complex-conjugation is the mathematical formalism’s way of letting us know about discrete symmetries in the underlying model (Nature).

  35. Scott Says:

    Veit #34: We’re going to have to institute a policy against “drive-by factings”! 🙂 I’m aware, of course, that the complex numbers have a nontrivial automorphism—but why exactly is it important for physics that the quantum-mechanical amplitudes should take values in a field that has such an automorphism?

  36. Sniffnoy Says:

    Scott #26: Yeah that would appear to be a problem, yes.

  37. Job Says:

    The key is that, because quaternions don’t commute, multiplying the wavefunction by a global phase can actually matter. And multiplying by a global phase is something that Bob can notice even if Alice does it.

    Does that also apply to the global phase for a subset of qubits, or is there only one global phase for the universe?

    What constitutes a system in this model? That’s the part I find confusing.

    I guess there’s only one system (the universe) since apparently there’s no locality.

    But then the superluminal channel would operate on the shared global phase for the whole universe, which makes it really noisy.

  38. Ted Says:

    I really like this post, partially because I think it clarifies the “degrees-of-freedom” argument which, respectfully, I think you slightly oversimplified in Quantum Computing Since Democritus.

    In QCSD you say (slightly reordering phrases for clarity) “There are exactly N^2 independent real parameters in an N-dimensional mixed state – provided we assume, for convenience, that the state doesn’t have to be normalized. … Intuitively, it seems like the number of parameters needed to describe [a composite system] AB … should equal the product of the number of parameters needed to describe A and the number of parameters needed to describe B.”

    Indeed that would be a natural supposition, but it doesn’t hold in complex QM! I never understood your argument, because your neglect of normalization “for convenience” always seemed to completely undermine the argument. A physical mixed state *doesn’t* uniquely correspond to a (positive-semidefinite) Hermitian operator, but to a *trace-one* (positive-semidefinite) Hermitian operator. When your entire argument hinges on counting degrees of freedom, it seems awfully suspicious to insert an extra dummy d.o.f. “for convenience”.

    But I now see (I think) that what you were really getting at was the axiom of local tomography, not just “d.o.f.’s of a composite system is product of d.o.f.’s of individual systems.”
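
    The counting behind the degrees-of-freedom argument can be written out explicitly. The sketch below uses the unnormalized convention discussed above: an N×N Hermitian matrix has N real diagonal entries plus N(N-1)/2 independent off-diagonal entries, each worth 1, 2, or 4 real numbers over the reals, complexes, or quaternions. The function names are hypothetical.

```python
def hermitian_params(n, field):
    """Real parameters in an n x n Hermitian matrix (unnormalized mixed state)."""
    reals_per_entry = {"real": 1, "complex": 2, "quaternion": 4}[field]
    return n + reals_per_entry * n * (n - 1) // 2

def joint_vs_product(n_a, n_b, field):
    """Compare parameters of the composite AB with the product of A's and B's."""
    joint = hermitian_params(n_a * n_b, field)
    product = hermitian_params(n_a, field) * hermitian_params(n_b, field)
    return joint, product

for field in ("real", "complex", "quaternion"):
    print(field, joint_vs_product(2, 2, field))
```

    For two qubits this gives 10 versus 9 over the reals, 16 versus 16 over the complex numbers, and 28 versus 36 over the quaternions. Only the complex case matches, which is the local-tomography statement; counting normalized states instead (one parameter fewer each) breaks the simple product even in the complex case, which is the objection raised above.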

  39. Alex Wilce Says:

    Scott, a nice post, and a nice example. Just two comments:

    1) There’s a point of view according to which, while “quaternionic quantum mechanics” isn’t a thing (because there’s no sensible tensor product), there *does* exist a sensible hybrid of real and quaternionic QM, in which the tensor product of two quaternionic Hilbert spaces is a real Hilbert space, and the tensor product of a real and a quaternionic Hilbert space is quaternionic. This is explained well, and I think compellingly, by Baez in “Division algebras and quantum theory”, https://arxiv.org/pdf/1101.5690.pdf. Thus, the various pathologies of the quaternionic tensor product aren’t in themselves a conclusive argument against the reasonableness of individual quaternionic quantum systems.

    2) While I don’t find local tomography completely compelling as an axiom, neither do I think it’s just an ad hoc technical device for ruling out non-complex quantum systems. In fact, I think it’s a sufficiently natural principle that anyone starting to think abstractly about what a composite system (one describing joint measurements and joint probabilities) should look like would be very likely to assume it, at least on a first pass.

  40. Will Says:

    Is there any reason why quantum mechanics needs to have a concept of space? There’s nothing about space in the basic axioms of QM, right? This would “solve” the superluminal signalling problem–nothing can be superluminal if everything is happening in the same “place”.

  41. Scott Says:

    Will #40: Indeed, abstract QM (by which I mean, quantum information) doesn’t necessarily have any notion of “space.” However, one difficulty with this is that the world we live in does have space. A second difficulty is that, while quantum information doesn’t have “space,” it does have Alice and Bob, and they need to be in tensor product with each other.

  42. jkl; Says:

    Cool fact: The group SU(2) is isomorphic to the unit quaternions, much in the same way as SO(2) is isomorphic to the unit complex numbers. So, much as the complex numbers represent rotation and scaling operations in 2D *real* space, the quaternions represent rotations and scalings in 2D *complex* space.

    Said another way, unitary operators that act on a qubit include as a subset all the unit quaternions.

  43. Will Says:

    Or maybe more to the point–what’s wrong with superluminal signalling? Doesn’t ordinary non-relativistic QM also allow superluminal signalling? Does that make it a “flaming garbage fire” or does it just make it the wrong theory for our particular reality where special relativity holds? Yes, it means that the quaternionic “quantum OS” can’t support the special relativity app, but is that really a devastating blow? We’re already talking about totally imaginary realities here. Is the problem with quaternionic QM you are trying to point out not exactly superluminal signalling, but something more like “maximal nonlocality”, where Alice’s measurement instantly affects every system in the Universe in a potentially measurable way?

  44. Scott Says:

    Will #43: Yes, that’s a good way to put it. In nonrelativistic QM, a sum of local Hamiltonians can propagate signals arbitrarily far in an arbitrarily short time, but at least the effect falls off exponentially with the distance. So it “could have been” a garbage fire but something douses the flame. With quaternionic QM, by contrast, you effectively don’t seem to have a useful tensor product decomposition at all.

  45. Mark Says:

    in 1905, some guy whose name I forget derived the otherwise strange-looking Lorentz transformations purely from the assumption that the laws of physics (including a fixed, finite value for the speed of light) take the same form in every inertial frame

    You might be thinking of Vladimir Ignatowski, who attempted to derive the Lorentz transform without the speed of light assumption. Some modern papers along the same line include:

    – Palash B. Pal, Nothing But Relativity

    – Joel W. Gannett, Nothing But Relativity, Redux

    – Jean-Marc Lévy-Leblond, One More Derivation of the Lorentz Transformation

  46. Scott Says:

    Mark #45: No, Ignatowski’s not the one, Wikipedia says that was 1910 … am I thinking of Poincare? Minkowski?

  47. jkl; Says:

    This is way outside my area, but let me see if I can help:

    The complex dot product is weird because: a.b = (b.a)*
    But this weirdness can be “justified” by doing the following trick to the expression a.b:

    – Replace the components of ‘a’ and ‘b’ with 2×2 matrices. These matrices are the standard representation of complex numbers as 2×2 real matrices. Call these matrices A and B.

    – Write A.B as A^T B

    – Turn the 2×2 blocks in the matrices A^T and B back into complex numbers

    – You get the weird conjugation operation in the dot product out. Congratulations!

    Similarly, and I’m asking this as a question: can you interpret a quaternionic quantum-mechanical operator as actually being a complex operator, with the quaternions simply representing complex numbers? That would be similar to what I wrote above concerning “a.b”. If you do that, then quaternionic QM should (I guess) be a *restriction* of ordinary QM, where the complex operators have to be made up of 2×2 blocks, where each such 2×2 block is “quaternionic”. See my previous comment for more context.
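
    The first half of this (the complex dot product via 2×2 real blocks) can be checked in a few lines. This is just a sketch of the trick as described in the steps above, with made-up helper names and sample vectors; it says nothing about whether the quaternionic analogue behaves as hoped.

```python
def rep(z):
    """Standard 2x2 real matrix representing the complex number z."""
    z = complex(z)
    return ((z.real, -z.imag), (z.imag, z.real))

def transpose(m):
    return tuple(zip(*m))

def matmul2(a, b):
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def madd(a, b):
    return tuple(tuple(x + y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))

a_vec = [1 + 2j, 0.5 - 1j]
b_vec = [3 - 1j, 2 + 4j]

# "A^T B" computed block by block with real matrices only.
block_sum = ((0.0, 0.0), (0.0, 0.0))
for a, b in zip(a_vec, b_vec):
    block_sum = madd(block_sum, matmul2(transpose(rep(a)), rep(b)))

# The usual complex inner product, with the conjugate on the first argument.
expected = rep(sum(a.conjugate() * b for a, b in zip(a_vec, b_vec)))
print(block_sum, expected)
```

    The real transpose of rep(a) is rep of the conjugate of a, which is where the "weird" conjugation in the complex dot product comes from.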


  48. Veit Elser Says:

    Scott: Imagine you are Schroedinger. You have just written down the equation that, you speculate, may supersede the equations of mechanics. “Classical mechanics”, if you’re right, would turn out to be an approximate solution. Backward compatibility is a concern, and the single time derivative has you worried. But then it hits you: time-reversibility is beautifully upheld if t -> -t is combined with i -> -i. Suddenly you’re feeling a lot better about that wave being complex-valued!

  49. fred Says:

    Is this question fundamentally different in its nature than asking why quaternions so elegantly describe 3D spatial rotations (they avoid gimbal lock problem of pitch/yaw/roll, are more compact than matrices)?

    Also, I would assume that it’s common practice to use real or complex numbers to prove facts about number theory? Why is it the case?

    It’s interesting in the sense that it ties the power of mathematical tools with the observed nature of reality.

  50. fred Says:

    I think this is related to the Archimedes’ Hat-Box Theorem – why is a sphere’s surface area four times its shadow?

  51. Scott Says:

    fred: Yeah, I already mentioned in the post that Wootters’ theorem is directly related to Archimedes’ Hat-Box Theorem. Taking a random 1-qubit pure state, and asking what probability distribution it yields when measured in the |0⟩,|1⟩ basis, is the same thing as taking a random point on the surface of the Bloch sphere, and projecting it horizontally onto the cylinder enclosing the sphere. But the whole point of the Hat-Box Theorem is that that’s going to give you a uniform distribution over points on the cylinder (which is also why the sphere and the cylinder have the same surface areas).

    As for quaternions being useful to describe 3D rotations—yes, they are, but can you spell out why that’s related to any of the questions discussed in the post, and is not just another “drive-by facting”? 🙂
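
    The Hat-Box statement about random 1-qubit states is easy to test numerically. Here is a rough Monte Carlo sketch in Python; sampling a uniformly random pure state by normalizing Gaussian amplitudes is a standard technique, but the function name, seed, sample size, and bin count are arbitrary choices for the illustration.

```python
import random

def random_qubit_p0(rng):
    """Probability of outcome |0> for a uniformly random 1-qubit pure state."""
    # Two complex amplitudes = four i.i.d. real Gaussians; normalizing gives
    # a uniformly random point on the unit sphere in C^2.
    g = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm_sq = sum(x * x for x in g)
    return (g[0] ** 2 + g[1] ** 2) / norm_sq

rng = random.Random(0)
samples = [random_qubit_p0(rng) for _ in range(20000)]

mean = sum(samples) / len(samples)
variance = sum((p - mean) ** 2 for p in samples) / len(samples)

# Histogram with ten equal-width bins; a uniform distribution fills each with ~10%.
bins = [0] * 10
for p in samples:
    bins[min(int(p * 10), 9)] += 1
fractions = [count / len(samples) for count in bins]
print(mean, variance, fractions)
```

    If the probability of |0⟩ is uniform on [0, 1], the mean should be near 1/2, the variance near 1/12, and every bin near 10%. Repeating the experiment with real amplitudes only (two Gaussians instead of four) should give a visibly non-uniform histogram.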

  52. Jon A. Says:

    This is great! However, Greg Moore has shown that families of quantum systems parametrized by non-commutative control parameters allow non-commutative amplitudes:

    Beyond the Xzibit meme (“Yo Dawg, I heard you like Quantum Observables, so I controlled your Quantum Observables with a Quantum Observables so your Observations are Quantum Observables!”), Moore lists 5 applications. The Xzibit meme is, however, suggestive – maybe there’s somehow relevance to the post-quantum world.

  53. John Sidles Says:

    Scott says (#46) … Ignatowski’s not the one, Wikipedia says that was 1910 … am I thinking of Poincare? Minkowski?

    Lol — perhaps the pioneering relativist you’re thinking of is Robert Cromie (1895)?

    A little more seriously, and with reference to last week’s Shtetl Optimized post “The NP genie“, here is a Nondeterministic Polynomial Genie Query (NPGQ) that addresses the OP question:

    An ECT NPGQ: Genie, what is a 1000-character proof strategy that, with ordinary mathematical and scientific diligence, can produce a proof (specifically, a peer-reviewed, journal-published, consensus-accepted proof) that of the following five traits of a dynamical formalism,

      • Second-Law thermodynamics
      • local informatic causality
      • special relativistic invariance
      • complex-structured state-space
      • local gauge invariance

    any three traits imply all five, and moreover, all such five-trait dynamical formalisms satisfy the Extended Church-Turing Thesis (ECT)?

    Resources: for a lucid in-context discussion of the ECT, it’s mighty tough to beat the just-published Quantum Computing: Progress and Prospects (National Academies of Sciences, Engineering, and Medicine, 2018, PDF here, see in particular pages ix,xi,1,10,14,41,58). The authors and reviewers of this outstanding consensus study report — as such National Academy documents are called — deserve all of our appreciation and thanks.

    An important practical motivation for posing this particular ECT-centric NPGQ, is that an affirmative answer would guarantee the existence of simulation-guided engineering strategies for realizing ongoing improvements in the reproducibility of humanity’s new — and for the first time, wholly quantum! — Revised International System of Units (SI).

    Note in particular, that an affirmative answer to the above ECT NPGQ, which would establish the physical and engineering “goodness” of an ECT-compatible quantum SI, in turn requires a complex-structured quantum state-space … hence the natural appearance of “i” in quantum equations of motion.

  55. Blake Stacey Says:

    Joshua Zelinsky (#1) asked,

    Second, how much worse do things get if one tries to do QM over the octonions? Is this even something we can reasonably define?

    This is a thing we can define, but only up to dimension 3. That is, there is no octonionic analogue of any system of higher dimensionality than a qutrit. The state space is the set of unit-trace elements of the “Albert algebra“; pure states are points in the octonionic projective plane. While there have been attempts over the years to find a physics application for this, to my knowledge, it’s mostly of interest to group theorists; for example, the Albert algebra’s group of determinant-preserving linear maps is the exceptional Lie group E_6.

    Sanketh (#2) asked,

    How could you mention Chris Fuchs and not talk about SICs??

    Chris is busy giving final exams this week, but as a colleague, I’ll chime in here: One reason that SICs might be very relevant indeed for this topic is that they give us a reason to prefer the complex numbers. The maximum number of equiangular lines in a space of dimension d (the so-called “Gerzon bound”) is d(d+1)/2 for real vector spaces and d^2 for complex. In the real case, the Gerzon bound is only known to be attained in dimensions 2, 3, 7 and 23, and we know it can’t be attained in general.* That is, symmetric informationally complete measurements don’t generally exist for real-vector-space quantum mechanics, whereas it sure looks like they do for the complex case. For example, over the real numbers you can’t get more equiangular lines in dimension 4 than in dimension 3, and you can’t do better in dimension 8 than in dimension 7. People haven’t investigated the quaternionic analogue as much, but numerical searches to date have not found any “quaternionic SICs” in dimensions larger than 3.

    *If you like peculiar alignments of mathematical topics, the appearance of 7 and 23 might make your ears prick up here. If you made the wild guess that the octonions and the Leech lattice are just around the corner … you’d be absolutely right.
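
    To make the Gerzon bound and the simplest SIC concrete, here is a short Python sketch. It checks the well-known qubit example, four states whose Bloch vectors form a regular tetrahedron, and confirms that they are equiangular with squared overlap 1/(d+1) = 1/3. The function names and the particular tetrahedron orientation are choices made for this illustration.

```python
import cmath
import math

def gerzon_bound(d, field):
    """Maximum possible number of equiangular lines in dimension d."""
    if field == "real":
        return d * (d + 1) // 2
    if field == "complex":
        return d * d
    raise ValueError("field must be 'real' or 'complex'")

def bloch_to_state(x, y, z):
    """Qubit state vector for a unit Bloch vector (x, y, z)."""
    theta = math.acos(z)
    phi = math.atan2(y, x)
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

def overlap_sq(u, v):
    """Squared magnitude of the inner product of two state vectors."""
    return abs(sum(a.conjugate() * b for a, b in zip(u, v))) ** 2

r = 1 / math.sqrt(3)
tetrahedron = [(r, r, r), (r, -r, -r), (-r, r, -r), (-r, -r, r)]
states = [bloch_to_state(*vec) for vec in tetrahedron]

overlaps = [
    overlap_sq(states[i], states[j])
    for i in range(len(states))
    for j in range(i + 1, len(states))
]
print(gerzon_bound(2, "complex"), len(states), overlaps)
```

    Four states saturate the complex bound d^2 for d = 2, and all six pairwise overlaps come out as 1/3. For comparison, gerzon_bound(3, "real") is 6 and gerzon_bound(7, "real") is 28, two of the dimensions mentioned above where the real bound is attained.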

  56. Pontus Says:

    Not a derivation in any way, but the rationale that I always used to convince myself that complex numbers are necessary for QM is by looking at the path integral formulation. The most democratic assignment of amplitudes gives each path an amplitude of equal weight. If we only use real numbers that means that we only have the sign to mess around with. By continuity of the action with respect to variations in the path, we see that we can’t even flip the sign. This in turn means no interference.

    In order to both assign equal weights to all paths AND allow interference, we need some way of smoothly interpolating between positive and negative numbers without ever changing their magnitude. The simplest way to do that is using complex numbers.

  57. akhmeteli Says:

    Scott Aaronson wrote: “The Schrödinger equation [does not] make much sense without complex numbers (though it can be fun to try).”

    I would like to mention one such (all but forgotten) attempt by none other than Schrödinger himself (Nature, v.169, p.538 (1952)). He considered the Klein-Gordon equation in electromagnetic field, rather than the nonrelativistic Schrödinger equation, and noted that the wave function can be made real by a gauge transformation. His comment: “That the wave function of [the Klein-Gordon equation] can be made real by a change of gauge is but a truism, though it contradicts the widespread belief about `charged’ fields requiring complex representation.”

    Schrödinger concluded his 1952 article with the following note: “One is interested in what happens when [the Klein-Gordon equation] is replaced by Dirac’s wave equation of 1927, or other first-order equations.” The approach of the 1952 article cannot be applied directly to the Dirac equation, as one cannot make real all four complex components of the Dirac spinor by a gauge transformation, but I found out that in a general case the Dirac equation in arbitrary electromagnetic field is equivalent to an equation for just one real function, as three out of four complex components of the Dirac spinor function can be algebraically eliminated from the Dirac equation, and the remaining component can be made real by a gauge transformation (J. Math. Phys. 52, 082303 (2011), http://akhmeteli.org/wp-content/uploads/2011/08/JMAPAQ528082303_1.pdf ) A similar result is true for the Dirac equation in Yang-Mills field (https://arxiv.org/abs/1811.02441). So real functions might be sufficient for quantum theory, at least in some important and general cases.

  58. Daniel Says:

    Hi Scott,

    “Why do the local measurement statistics underdetermine the global quantum state with real amplitudes, and overdetermine it with quaternionic amplitudes, being in one-to-one correspondence with it only when amplitudes are complex?”

    I think this also hinges on the question of why matrices (a non-commutative structure) are required in QM, and in a way that is not reducible to its commutative subsets. It seems to me that Why Complex Numbers and Why Non-commutativity are closely related questions.
    But I sometimes also wonder if there really are any meaningful, physical reasons behind these questions, just like why the gravitational force is proportional to r^-2 and not to r^-1.5 or to r^-3, other than the fact that if it is not r^-2 then the universe would be a very different place. What kind of reason do we expect to find, and how do we know it is the reason we have been looking for when we see it?

  59. fred Says:

    Scott #51

    “As for quaternions being useful to describe 3D rotations—yes, they are, but can you spell out why that’s related to any of the questions discussed in the post, and is not just another “drive-by facting”?”

    That was just an example to a more meta question about your question.
    I was not just pointing out that quaternions are useful for 3D rotations, but similarly to you, asking why it’s the case (something special about transformations in 3D space vs 2D, 4D,… space).
    Did we ever get meaningful answers from asking why any given mathematical tool is the right one to describe such and such aspect of nature? (besides just noting that it works).
    Are there simple (but non trivial) examples where such a “why only this mathematical tool works?” has led to some extra insight?

  60. fred Says:

    To expand a little bit, there’s one equivalent in CS, where we say that a given system/language just isn’t expressive/powerful enough to lead to a Turing Complete system.

  61. Neel Krishnaswami Says:

    When you post the paper on differential privacy on the arxiv, add a link on your blog, too. The category of metric spaces and short maps gives a pretty nice setting for studying differential privacy, and it’s a monoidal closed category, which makes me curious if there is a mostly-algebraic path to relating QC and DP in an interesting way.

  62. Age bronze Says:

    What would be the probability structure that would be recognized on a system which is completely deterministic and reversible?

    If it was finite, the probabilities of each state would be vectors, and still transform by matrix multiplication, and the matrix would have to be a permutation matrix.

    Now if you happened to imagine the system is continuous, you will be forced to introduce complex probabilities, so the permutation matrix would have intermediate states. (Its diagonalization is complex, all roots of unity)
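
    A minimal sketch of the "all roots of unity" remark, assuming the simplest permutation (a single cycle of length n): it checks that the vectors with entries w^(jk), w = exp(2*pi*i/n), are eigenvectors of the cyclic shift, and that each eigenvalue is an n-th root of unity. The function names are illustrative, not taken from the comment.

```python
import cmath

def cyclic_shift(v):
    """Deterministic, reversible update: whatever sat in state j-1 moves to state j."""
    n = len(v)
    return [v[(j - 1) % n] for j in range(n)]

def fourier_vector(n, k):
    """Candidate eigenvector with entries w^(j*k), where w = exp(2*pi*i/n)."""
    w = cmath.exp(2j * cmath.pi / n)
    return [w ** (j * k) for j in range(n)]

def shift_eigenvalue(n, k):
    """Return lam such that cyclic_shift(v) == lam * v for v = fourier_vector(n, k)."""
    v = fourier_vector(n, k)
    shifted = cyclic_shift(v)
    lam = shifted[0] / v[0]  # v[0] is 1, so this is just shifted[0]
    # Every component must be scaled by the same factor for v to be an eigenvector.
    assert all(abs(s - lam * x) < 1e-9 for s, x in zip(shifted, v))
    return lam

n = 5  # hypothetical number of states in the cycle
eigenvalues = [shift_eigenvalue(n, k) for k in range(n)]
for k, lam in enumerate(eigenvalues):
    # lam ** n should equal 1 up to rounding: lam is an n-th root of unity.
    print(k, lam, abs(lam ** n - 1) < 1e-9)
```

    For odd n only the k = 0 eigenvalue is real, so diagonalizing even this purely classical permutation already requires complex numbers.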

    Classical probability theory would not allow any intermediate states between two certain states, and it would fail to describe a completely deterministic system, if you incorrectly assumed continuous state transition.

    (I’m aware of Gerard ’t Hooft’s work on CAI, and this is why I find it so convincing)

    Maybe the real reason for the complex probabilities is that the system is actually completely deterministic and discrete?

  63. Mark Palenik Says:

    Energy eigenstates make sense without complex numbers for time-reversible Hamiltonians, but I suppose that’s just being pedantic. Also, Majorana Fermions are real valued fields that obey a real wave equation, and the Klein-Gordon equation is real and permits real solutions (although admittedly, problems arise when you try to treat the Klein-Gordon equation in a non-field theoretical context, and the Dirac/Majorana equation has problems with causality in that context as well). Of course, this doesn’t really remove imaginary numbers from quantum mechanics, because the relation [x,p]=i still holds, and in fact, the entirety of the dynamics of a quantum system can be derived from classical mechanics plus this commutation relation. So in that sense, imaginary numbers are necessary for quantum behavior (although, I suppose it wouldn’t have to be i, any nonzero commutation relation would give you some kind of weird, non-classical behavior)

    I think the continuous unitary transformation thing is a bit of a red herring, though, because there’s no reason we should expect to be able to perform a continuous transformation that ends in an inversion. At least, it’s not something we see in everyday life. In everyday life, pretty much everything we see is an SO(3) transformation. We can rotate things continuously, which can be done with orthogonal matrices.

  64. Scott Says:

    Stephen Adler, the author of Quaternionic Quantum Mechanics and Quantum Fields, was kind enough to send me a detailed and informative email in response to this post, and to give me permission to reproduce it here. Adler’s message follows:

    I agree that there is faster than light signaling in quaternionic QM. This is evident from the fact that the non-existence of a tensor product with the usual properties implies the breakdown of clustering (Sec. 9.3 of my book, “The tensor product problem and the failure of clustering”). Theories without clustering permit instantaneous communication between widely separated systems. But I did not explicitly discuss signaling in my book.

    However, the interest in quaternionic QM is not in applying it to large scale physics, but to possible hidden layers near the grand unification or Planck scale, where I know of no experimental evidence for clustering or no-superluminal signaling. The main result in the first part of my book is that in quaternionic potential scattering, the S-matrix is complex, not quaternionic. The quaternionic parts of the wave function decay exponentially at large distances. This discovery, that I made first by solving the quaternionic delta function potential model, and then generalized, was my main motivation in writing my book. One could then envision a very short distance quaternionic layer of physics, giving rise to the observed complex QM layer as its asymptotic theory at large distances. However, my attempts to construct a quaternionic realization of the Harari-Shupe preon idea, sketched in the final chapter of my book, never worked in detail, so I eventually abandoned this idea. I now think that if the observed particles are composites, it will be in a standard complex QM framework.

    My result about a complex S matrix surprised the experts, and has experimental consequences.
    A famous PRL of Asher Peres proposed to test for quaternionic QM by doing a neutron interference experiment, with materials A followed by B in one arm and B followed by A in the other, and looking for non-commuting phase shifts. This will give a null result in quaternionic potential scattering, since the quaternionic S matrix is complex. One has to do near-zone scattering experiments to set bounds on quaternionic physics, which is much harder.

    Another thing to come out of my book was the idea of using a trace variational principle to generate operator equations of motion without canonical quantization. This figured in my later ideas on generating complex QM as the thermodynamics of a non-commutative but classical dynamics, as discussed in my Cambridge U Press book “Quantum Theory as an Emergent Phenomenon”.

    In response to my questions, Adler later sent me the followup below:

    In scattering with a general quaternionic potential, a suitable choice of ray representatives makes all the asymptotic states behave as e^{ikr}/r, up to a complex phase, where k is the wave number. The quaternionic effects then go as e^{-kr}, with k again the wave number. So there are in principle superluminal signaling effects, but they are very hard to see. This doesn’t rule out the possibility that a hypothetical sufficiently advanced civilization might be able to detect them, by methods unavailable to us at present. But “for all practical purposes” the asymptotic wave function is complex.

    Just to put numbers on [this], the longest wavelength used for radio communication is ~100 km. The nearest “advanced civilization” must be more than 1 light year ~ 10^{13} km away. So e^{-kr} ~ e^{-10^{11}}, which is FAPP zero.
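
    The arithmetic behind that estimate can be checked in a few lines. This is only a sketch: the light-year value of 9.461e12 km and the choice between k ~ 1/wavelength (order of magnitude, as in the estimate) and k = 2*pi/wavelength are assumptions made here, not part of Adler’s message.

```python
import math

wavelength_km = 100.0    # longest radio wavelength quoted above
distance_km = 9.461e12   # one light year in km (assumed value, about 10^13 km)

k_rough = 1.0 / wavelength_km            # order-of-magnitude wave number
k_exact = 2.0 * math.pi / wavelength_km  # wave number with the factor 2*pi kept

exponent_rough = k_rough * distance_km   # the "kr" in exp(-kr), rough version
exponent_exact = k_exact * distance_km   # the "kr" in exp(-kr), with 2*pi

for label, exponent in (("rough", exponent_rough), ("with 2*pi", exponent_exact)):
    # log10 of the suppression factor exp(-kr); the factor itself underflows to 0.0.
    log10_factor = -exponent / math.log(10.0)
    print(label, exponent, log10_factor, math.exp(-exponent))
```

    Either way the exponent is of order 10^{11}, and the suppression factor is far below anything a double-precision float can represent.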

  65. fred Says:

    Hey Scott,

    since you posted a link to the Knuth article…
    Roger Penrose was on the Joe Rogan podcast, pretty interesting (he covers consciousness, black holes, multiverses)

  66. Scott Says:

    fred #59: The difference is that complex numbers are not merely “useful” for QM, not merely one tool among several equally good ones that could be used. They literally are what the amplitudes are. I.e., if you tried to state the physics of our universe to someone from a completely different universe (e.g., Conway’s Game of Life), you’d either mention complex numbers on the first page, or else it would be obvious that you were intentionally avoiding them. That’s the part for which one might want an explanation, to the effect that complex numbers are the only choice satisfying these or those axioms.

  67. Scott Says:

    Daniel #58: On the contrary, the inverse-square law provides a perfect example of a mathematical choice in physics for which we do have a complete and satisfying explanation in terms of more basic principles. Even in Newtonian physics, one can handwave that we live in 3-dimensional space, where the surface area of a sphere increases like the square of the radius, so it stands to reason that if the “same amount of gravity” has to get spread thinner and thinner as a sphere of gravity expands, the amount per unit area should fall off as 1/r^2. But in general relativity, it’s more than that: once you have the basic setup, you can derive the 1/r^2 law in the Newtonian limit; it literally couldn’t have been anything other than that. I think people would be happy to explain the origin of quantum mechanics, in terms of something more fundamental, in a way that was half as unambiguous and clear.

  68. fred Says:

    Scott #67

    I was wondering if this could be answered by searching for some quantity that has to be conserved through the evolution of a QM system.
    As the wave function “propagates” in space, is there an analogue of the various fluxes that are conserved in Maxwell’s equations?
    In other words, can the discussion be moved from matrix mechanics to the Schrodinger equation, where i (the imaginary unit) appears as well?

    But, from the wiki on the Schrodinger equation:

    “The Schrödinger equation is a variation on the diffusion equation where the diffusion constant is imaginary. A spike of heat will decay in amplitude and spread out; however, because the imaginary i is the generator of rotations in the complex plane, a spike in the amplitude of a matter wave will also rotate in the complex plane over time. The solutions are therefore functions which describe wave-like motions. Wave equations in physics can normally be derived from other physical laws – the wave equation for mechanical vibrations on strings and in matter can be derived from Newton’s laws, where the wave function represents the displacement of matter, and electromagnetic waves from Maxwell’s equations, where the wave functions are electric and magnetic fields. The basis for Schrödinger’s equation, on the other hand, is the energy of the system and a separate postulate of quantum mechanics: the wave function is a description of the system. The Schrödinger equation is therefore a new concept in itself; as Feynman put it:

    “Where did we get that (equation) from? Nowhere. It is not possible to derive it from anything you know. It came out of the mind of Schrödinger.”
    — Richard Feynman ”

    Btw, conservation of energy flux is why EM static fields decay in r^-2, but it’s not as obvious (to me, without doing the math!) why changes in those fields decay in r^-1 (making radio communication possible).

  69. Mark Palenik Says:

    But the key there is in the Newtonian limit. In GR, the gravitational field doesn’t fall off quite as 1/r^2 because it’s a nonlinear field (the 1/r^2 law is closely related to linearity). And the weak force is also not a 1/r^2 law. We can say this is “because” it’s massive bosons, but then of course, we’re already using quantum mechanics, which we’ve already argued is nonintuitive, to intuitively explain why we have something other than 1/r^2.

    As for the non-commutativity that Daniel mentioned, sure it’s weird from the standpoint of what seems normal based on what we see every day, but if x and p did commute, we could also ask the question of why they do. It’s sort of like how everyone just assumed that parallel lines don’t intersect, until someone finally asked, “hey, what happens if we don’t assume that.”

    One final point: I mentioned before that you can derive the dynamics of a quantum system from the canonical commutation relation [x,p]=i, which automatically implies that time-evolution goes as e^iHt. If the commutator were real (and non-zero), this would be a non-unitary transformation. I haven’t given any thought as to what a quaternion would imply.

  70. Sniffnoy Says:

    Mark #69: Well, if q is any purely imaginary quaternion, e^q will have unit norm, same as for complex numbers; an easy way to see this is to remember that all purely imaginary unit quaternions are conjugate.
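
    A quick numerical check of the unit-norm claim, as a sketch: quaternions are stored as tuples (w, x, y, z), the exponential is taken from its power series, and the sample values of q are arbitrary choices made here for illustration.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qnorm(q):
    """Euclidean norm of a quaternion."""
    return math.sqrt(sum(c * c for c in q))

def qexp(q, terms=80):
    """exp(q) from the power series sum_n q^n / n!, truncated after `terms` terms."""
    total = (1.0, 0.0, 0.0, 0.0)
    term = (1.0, 0.0, 0.0, 0.0)
    for n in range(1, terms):
        term = tuple(c / n for c in qmul(term, q))
        total = tuple(t + c for t, c in zip(total, term))
    return total

pure = (0.0, 0.3, -1.2, 2.0)    # purely imaginary: zero real part
mixed = (0.5, 0.3, -1.2, 2.0)   # same imaginary part, real part 0.5

norm_pure = qnorm(qexp(pure))    # expected to be 1 up to rounding
norm_mixed = qnorm(qexp(mixed))  # expected to be exp(0.5) up to rounding
print(norm_pure, norm_mixed, math.exp(0.5))
```

    The second case shows the contrast: a nonzero real part rescales the norm by exp(real part), just as for complex numbers.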

  71. Ajit R. Jadhav Says:

    Scott #67:

    If you entertain the notions of (i) matter waves (i.e. wavefunctions), and (ii) de-Broglie relations (which are the same as the Planck-Einstein relations “lifted” from radiation physics, and applied without modification to matter), then complex amplitudes become inevitable. Here are the relevant points. (I refer to David Morin’s online chapter on QM, but only through memory):

    Let the ansatz for the matter waves be complex-valued plane-waves: \Psi = A e^{-i(kx -\omega t)}. Then \Psi_{t} = i \omega \Psi (where the subscript denotes partial differentiation), and so, \omega = -i \Psi_{t}.

    Now, energy of the system is E = T + V. On the LHS, use the de Broglie (Planck) relation to get: \hbar \omega = T + V. If you now substitute the above expression for \omega, you get the usual -\hbar \Psi_{t} time-dependence on the LHS. (With a similar procedure, on the RHS, T gets replaced by an expression involving \Psi_{xx}, but it does not concern us here.)

    Now the question is: Why take the entire complex number for the solution of SE? Why not drop the imaginary part the way electrical engineers often do—i.e. in classical physics? The reason is this:

    If the time-derivative is of the first order, then with a real-valued \Psi the solution would be diffusive, not oscillatory. But by the postulate of matter-waves (see above), we require the solution to be oscillatory. The only way to have oscillations in time even when the derivative is of the first-order is for the field variable to be (taken as being) fully complex. That’s why.
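
    The diffusive-versus-oscillatory distinction can be seen on a single spatial mode. The sketch below assumes units in which the coefficient D of the second space derivative is 1 and picks an arbitrary wave number k. It compares the amplitude of a mode cos(kx) under the real first-order equation Psi_t = D Psi_xx with the amplitude of exp(ikx) under i Psi_t = -D Psi_xx. The sign conventions are chosen here for illustration and need not match the ansatz above.

```python
import cmath
import math

D = 1.0  # hypothetical coefficient of Psi_xx (units chosen so that D = 1)
k = 2.0  # hypothetical wave number of the spatial mode

def heat_mode(t):
    """Amplitude a(t) of cos(k*x) under Psi_t = D * Psi_xx, i.e. a' = -D*k^2*a."""
    return math.exp(-D * k * k * t)

def schrodinger_mode(t):
    """Amplitude a(t) of exp(i*k*x) under i*Psi_t = -D*Psi_xx, i.e. a' = -i*D*k^2*a."""
    return cmath.exp(-1j * D * k * k * t)

times = [0.0, 0.5, 1.0, 2.0]
heat_amplitudes = [heat_mode(t) for t in times]
wave_amplitudes = [schrodinger_mode(t) for t in times]

for t, h, w in zip(times, heat_amplitudes, wave_amplitudes):
    # The real amplitude only shrinks; the complex one keeps modulus 1 and rotates.
    print(t, h, abs(w), cmath.phase(w))
```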

    It’s all an issue of what kind of postulates to pick up so that they might produce a theory which fits the experimental evidence.

    If the cavity radiation were to be such that Planck’s relation were to be E = \hbar \omega^2, and then, if matter waves were still to mimic light waves, then, the time-derivative would turn out to be of the second-order, and so, the matter-waves could perhaps be taken to be only real-valued (though not necessarily so, they could still be fully complex). With such a solution, the characteristics would travel at a finite speed, be sharp (i.e., signals would live on the surface of the “light” cone), and hence, entanglement wouldn’t be possible (or so I guess).

    But Planck’s hypothesis, which did explain the experimental observations right, was that E = \hbar \omega. And de Broglie’s hypothesis was that there are matter waves. And so…



  72. Ajit R. Jadhav Says:

    Sorry for a lot of typos/mistakes in my reply #71 above. While the choice of the ansatz is not ideal (I take -i in place of +i even while keeping the rest as (kx - \omega t)) and some of the subsequent expressions are incomplete or erroneous, these are relatively minor. (I mean, the typical reader of this blog could easily correct for them.) The most significant error was stating E = \hbar \omega^2. As we change the power of \omega, the constant of proportionality would no longer have the dimensions of \hbar. I think that was a significant error. (You either write in dimensionless form, or you take care to write quantities of the right dimensions.)

    Have been running mild fever and nausea for a few days, and so, with the medication, my fingers literally were shaking when I typed the above reply. Also, was feeling a bit drowsy too. It’s just that I thought that the relevant (and logically prior) physics-based considerations were worth noting, and also that I could pull it too. So I hurried it up. … Looks like what came out didn’t have the best possible form.

    Bye for now.


  73. Ahron Maline Says:

    Is it really an observable fact that quantum amplitudes are complex? It has been argued that real QM would lead to physics identical to our own. The following is an argument that I heard from Ron Maimon, who sometimes comments here; I don’t know the original source.
    Given that the time-evolution of the universal quantum state is linear and norm-preserving, it will be represented by unitaries satisfying U(t_2,t_3)U(t_1,t_2)=U(t_1,t_3). If there is no external time dependence, and the evolution is smooth in time, it must take the form U(0,t)=exp(-itH) with H a hermitian Hamiltonian. Then the only real solutions to |psi(t)>=U(0,t)|psi(0)> are either a constant vector that is a zero eigenvector of H, or of the form cos(Et)|E_1> – sin(Et)|E_2> where |E_1> and |E_2> are degenerate eigenvectors of H, with energy E (or linear combinations of the above).
    Now since we are dealing with the universal Hamiltonian, this solution is all there will ever be. All events, including all measurements or transformations that “we may perform” are just interference patterns between the universal eigenstates, as seen in a classical reality generated by decoherence, itself an interference effect. In particular, nothing will ever separate |E_1> and |E_2>; they will carry on their circular dance for all time.
    So there is nothing stopping us from relabeling |E_2> as i|E_1>, and presto! – we get complex QM. Any time we (using complex QM) describe some state, we can in principle express it as a sum of eigenstates of the universal Hamiltonian, and then think of the real and imaginary parts of each coefficient as real coefficients multiplying two degenerate vectors.
    You see, I always agreed with SMBC: “Wait. You guys put complex numbers in your ontologies?”
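
    The relabeling step in the argument above can be illustrated numerically. This sketch evolves the two real coefficients of |E_1> and |E_2> by small real rotations and, in parallel, evolves the single complex number z = c1 + i*c2 by phase multiplication. The energy E, the step size, and the function name are hypothetical choices made here.

```python
import cmath
import math

def rotate_pair(c1, c2, theta):
    """Real 2-d rotation acting on the coefficients of |E_1> and |E_2>."""
    return (c1 * math.cos(theta) + c2 * math.sin(theta),
            -c1 * math.sin(theta) + c2 * math.cos(theta))

E = 1.7      # hypothetical energy
dt = 0.01    # hypothetical time step
steps = 500

c1, c2 = 1.0, 0.0   # real description: start in |E_1>
z = 1.0 + 0.0j      # complex relabeling: z = c1 + i*c2
for _ in range(steps):
    c1, c2 = rotate_pair(c1, c2, E * dt)  # real orthogonal evolution
    z *= cmath.exp(-1j * E * dt)          # complex phase evolution

t = steps * dt
# Both descriptions should agree with cos(E t)|E_1> - sin(E t)|E_2>.
print(c1, math.cos(E * t))
print(c2, -math.sin(E * t))
print(z, complex(c1, c2))
```

    The two descriptions track each other exactly; what the sketch cannot settle is the question debated in the following comments, namely whether operations that treat c1 and c2 separately are physically available.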

  74. Scott Says:

    Ahron #73: It’s obvious that you can simulate complex QM by a theory involving real amplitudes only. That was mentioned in the original post, and discussed further in the comments above. For that matter, you can simulate QM to arbitrary accuracy, albeit probably not efficiently, by a theory like Conway’s Game of Life that involves only 0’s and 1’s. None of that is relevant to the question at hand. For, if you want to reproduce the physics of our world, then when it finally comes time to tell your simulator what to simulate, either you’ll make explicit reference to complex numbers, or else it will be obvious that you’re intentionally avoiding them. So the question, again, is why?

  75. Ahron Maline Says:

    Scott #74: No, I’m saying much more than the fact that you can simulate complex QM by a theory involving real amplitudes only. I’m saying that any real QM will necessarily “simulate” complex QM, nothing more and nothing less.

  76. Anonymous Says:

    I was thinking about my earlier comment that the universe might store the imaginary part and the particles the real part whether entangled with other particles or not. Complex numbers can be in rectangular or polar — polar seems much more likely. By storing theta for every particle and every entangled particle combination, the universe can control the time speed of anything in its domain since theta is just the angle of the clock hand.

    Polar form has some advantages in calculations: a multiplication uses fewer real multiplications. If the top half of the n+1 real qubit vector had magnitudes and the bottom half angles, what kind of matrices would be equivalent to the unitary matrix, if any? Has this been worked out?

  77. Scott Says:

    Ahron #75: But that’s just wrong. Did you read the post? In a world governed by real QM (and without artificial restrictions on the observables that make it equivalent to complex QM), you could see, for example, that there were pairs of bipartite states that couldn’t be distinguished from each other by any products of local observables, but could be distinguished by global observables. That’s an actual empirical test that would distinguish that world, which is perfectly self-consistent and so forth, from the actual world governed by complex QM.
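
    One standard example of such a pair, sketched here rather than taken from the comment: in two-rebit real QM the local observables are real symmetric 2x2 matrices, spanned by I, X and Z, while Y⊗Y is a real symmetric matrix on the joint system even though Y itself is imaginary. The two states (I ± Y⊗Y)/4 then agree on every product of local observables but differ on the global observable Y⊗Y. The helper names are illustrative.

```python
def kron(a, b):
    """Kronecker product of two 2x2 matrices given as nested lists."""
    return [[a[i // 2][j // 2] * b[i % 2][j % 2] for j in range(4)] for i in range(4)]

def trace_of_product(a, b):
    """tr(a @ b) for square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
# Y (x) Y written out directly: it is real although Y = [[0, -i], [i, 0]] is not.
YY = [[0, 0, 0, -1],
      [0, 0, 1, 0],
      [0, 1, 0, 0],
      [-1, 0, 0, 0]]
I4 = kron(I2, I2)

# Two real density matrices: (I + YY)/4 and (I - YY)/4.
rho_plus = [[(I4[i][j] + YY[i][j]) / 4 for j in range(4)] for i in range(4)]
rho_minus = [[(I4[i][j] - YY[i][j]) / 4 for j in range(4)] for i in range(4)]

local = {"I": I2, "X": X, "Z": Z}  # basis of real symmetric 2x2 observables
product_stats = {
    (a, b): (trace_of_product(rho_plus, kron(A, B)),
             trace_of_product(rho_minus, kron(A, B)))
    for a, A in local.items() for b, B in local.items()
}
global_stats = (trace_of_product(rho_plus, YY), trace_of_product(rho_minus, YY))

for pair, (plus, minus) in product_stats.items():
    print(pair, plus, minus)       # identical for the two states
print("Y(x)Y", global_stats)       # differs between the two states
```

    Counting dimensions gives the same conclusion: products of local observables span 3 x 3 = 9 dimensions, while real symmetric 4x4 matrices form a 10-dimensional space, and Y⊗Y is the missing direction.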

  78. Anonymous Says:

    I was still thinking (so far) of an exact emulation of 2^n complex ket with a 2^(n+1) real ket. Rectangular decomposition would be the usual way and unitary matrices would work because at the end (a+bI)(a-bI)=a^2+b^2. I was wondering about a polar decomposition with r’s on the top half and thetas on the bottom half. I guess at the end a probability could be obtained by just squaring an r on the top half and ignoring the bottom half that have thetas since a^2+b^2=r^2. The matrix would have to be unitary for the top half, I am not sure what restrictions the bottom half with thetas would impose.

  79. Diego Says:

    Scott, about your offer, I’m interested in a post about the relationship between “gentle” measurements and differential privacy.

  80. anonymous Says:

    The idea was if there was a hidden player that sometimes played with exclusive info, complex numbers might not be the best model, otherwise complex numbers appear to be best.

  81. Ahron Maline Says:

    Scott #77: Sorry for the delay in responding.
    I am arguing that these “artificial restrictions on the observables” are actually the unavoidable result of unitary dynamics.
    When we compare the observables and unitaries of a d-dimensional complex Hilbert space to those of a 2d-dimensional real Hilbert space, the extra operators that appear in the real case are precisely those that separate a pair of dimensions that, in the complex case, are the real and imaginary parts of one vector. My claim is that since these pairings are generated at the level of the universal Hamiltonian, it is in principle impossible to break them up.
    In the context of quantum computing, we are accustomed to Alice & Bob sitting outside of “the system” and applying unitaries and measurements. But of course, unitaries can only happen in practice as part of the single, global unitary of time evolution. Measurements are also just unitaries in disguise; unitaries that entangle the measured degree of freedom with the pointer states of some device.
    Thus, once we know that the relative amplitudes of the pair of global vectors are set for all time as cos(Et)|E_1> – sin(Et)|E_2>, and that the full state is a sum of such pairs, we can conclude that the only physically possible local unitaries are those where the dimensions come in pairs that will not be separated, i.e. those unitaries that “simulate” complex QM.
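
    To make the comparison between a d-dimensional complex space and a 2d-dimensional real space concrete, here is a sketch using the usual encoding psi -> (Re psi, Im psi). In that encoding, multiplication by i becomes a real matrix J, the generator -iH becomes a real antisymmetric matrix G that commutes with J, and complex conjugation K is an example of a real orthogonal operator that does not commute with J. The Hamiltonian H and the helper names are hypothetical, and the sketch only exhibits the structure; it does not decide whether operators like K are physically realizable.

```python
def real_generator(H):
    """Real 2d x 2d matrix acting on (Re psi, Im psi) the way -i*H acts on psi."""
    d = len(H)
    R = [[H[i][j].real for j in range(d)] for i in range(d)]  # symmetric part of H
    S = [[H[i][j].imag for j in range(d)] for i in range(d)]  # antisymmetric part of H
    top = [S[i] + R[i] for i in range(d)]
    bottom = [[-r for r in R[i]] + S[i] for i in range(d)]
    return top + bottom

def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix-matrix product for square nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

H = [[1.0 + 0j, 0.5 - 2.0j],
     [0.5 + 2.0j, -0.3 + 0j]]  # hypothetical Hermitian Hamiltonian with d = 2
G = real_generator(H)

# J encodes "multiply by i" on (Re psi, Im psi); K encodes complex conjugation.
J = [[0, 0, -1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, 1, 0, 0]]
K = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

psi = [0.2 + 0.7j, -1.1 + 0.4j]  # arbitrary test vector
complex_result = [-1j * sum(H[i][j] * psi[j] for j in range(2)) for i in range(2)]
real_result = matvec(G, [p.real for p in psi] + [p.imag for p in psi])

print(real_result)
print([c.real for c in complex_result] + [c.imag for c in complex_result])
print(matmul(G, J) == matmul(J, G))  # G respects the pairing defined by J
print(matmul(K, J) == matmul(J, K))  # K does not
```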

    One important correction: I wrote above that the real vectors |E_1> and |E_2> will be degenerate eigenstates of the Hamiltonian. This is wrong: the Hamiltonian exchanges the two, so neither will be an eigenstate. Rather, in real QM the Hamiltonian simply cannot be diagonalized. It is only once we move to complex QM, and declare that |E_2> = i|E_1>, that the vector (now there is only one) becomes an eigenstate with energy E.

  82. Scott Says:

    Ahron #81: Nope. There’s a perfectly internally consistent theory of “real QM,” where yes, you could do a simulation of complex QM, but then you’d always have full access to the real and imaginary parts of each amplitude separately. And it’s interesting to study the similarities and differences between that theory and complex QM. Declaring at the outset that you’re only going to think about real QM, if restrictions are placed on the observables that make it equivalent to complex QM, and therefore that real QM is equivalent to complex QM, is as willfully perverse as, I don’t know, saying you’ll only think about Gaussian integers if they happen to be real, and therefore nothing distinguishes the Gaussian integers from the ordinary ones.

  83. Ahron Maline Says:

    Scott #82: Please forgive my frustration, but you have now responded three times without addressing my argument at all! No, I am not imposing any restrictions. Of course that would be perverse. The restrictions appear on their own; they fall out inevitably from the dynamics.

    Yes, sure there is an internally consistent theory of “real QM”. But that assumes there is a classical “you”, sitting outside the quantum system and applying operators at will. Of course “you” would then have access to the real and imaginary parts of each amplitude separately.

    But we don’t believe in a separate, classical reality, do we? The world is a single quantum system, and we are part of it. There is only One True Hamiltonian. If I, a subsystem, “apply an operator” to another subsystem, that process must show up somehow in the solution to the global Schroedinger equation (including decoherence to give the appearance of classicality).

    Since we generally assume the global Hamiltonian is time-independent (that’s how we get energy conservation), the time evolution is just O(0,t) =exp(-itH) (writing O instead of U to emphasize that it is an orthogonal matrix). Any nontrivial O of this form cannot be diagonalized with real eigenvectors, but instead can be made block-diagonal, with each block a 2-d rotation matrix (or the identity matrix, for the zero-energy subspace). This pairs up the dimensions in a natural way.

    If we take a scenario where someone is applying an operator to a subsystem, we can in principle express the story in terms of interference of the global energy states. Yes, this will be a crazy, utterly unrecognizable description, but correct nonetheless. We will then find that the pairings of dimensions remain as they always were. Converting back to our local, understandable basis, the pairings will still show up, and the operator being applied will respect them. Thus operators that access the real and imaginary parts of an amplitude separately will simply never happen.

    I suppose that by using global unitary dynamics, I am basically committing myself to MWI. But to my mind, the only alternative is some form of objective collapse, which needs to be added to the dynamics. When and if we have a believable model for that, we will need to check again what happens in real QM, i.e. whether or not the collapse respects the dimensional pairing. But my bet is that most such models will, and so will still simulate complex QM only.

    [Why are these the only alternatives? Because I am committed to realism, and Bohmian Mechanics and the like are just “MWI in denial”. But I don’t want to get into that discussion here.]

  84. venky Says:

    On a similar note, can we get by with only rationals in TCS, or do we need to assume the reals for convergence, etc.? Practically, we only use fractions anyway.

  85. Anonymous Says:

    I think maybe rationals are possible for a 2^(n+1) ket to simulate a quantum computation — that would be a gold mine of new formulas. In polar decomposition r is never negative — so r^2 can be used instead, the actual probability, which is arguably rational: the expected yeses/2^n. A matrix would be stochastic for the top half of the vector tracking r^2 because it must sum to 1. The bottom half of the vector tracking the angle could be a fraction of the half circle traversed either positive or negative also rational since quantum means discrete not continuous.

    Of course there may be some very basic reasons this wouldn’t work since I just thought of it.

  86. Scott Says:

    Ahron #83: Ok, so you’re just saying that once you take into account what I called the “application layer,” including a time-independent Hamiltonian and Schrödinger equation for the universe (as opposed to just the abstract rules of quantum information), complex numbers play a key role, though one could also imagine the abstract rules of quantum information having been implemented in some other way where complex numbers didn’t play such a role. I was confused because this was also a point made in the original post.

  87. Ahron Maline Says:

    Scott #86: So you agree to my claim? If any time-independent Hamiltonian system running on the real QM “hardware” contains observer subsystems who operate on & measure other subsystems, then those observers will need to use complex QM? If so, then we have a pretty solid explanation for “why” the amplitudes in our physics are complex. It also means we don’t know whether they are real or complex at the hardware level.

    I admit that my argument – that the pairings of universal energy states will necessarily be reflected in the operators that can be implemented – is a bit hand-wavy, and I’m not sure how to go about making it rigorous. I was kind of hoping you would know…

    As for whether one could imagine the rules of quantum information having been implemented in some other way… well, maybe my imagination is a bit limited, but I don’t really see how. Asking for a time-independent Hamiltonian is really not a very demanding request. It is merely the requirements that:

    1. There is a continuous dimension called “time”.
    2. A realization of the system means the assignment of a vector state to each point in time.
    3. The state for any one time point determines the correct states for the whole history.
    4. The above determination is linear.
    5. The rules for the above determination are invariant under time translation.
    6. Each allowed history is a differentiable function of time.

    Which of these would the hypothetical “alternative implementation” give up? If you don’t have determinism or linearity, at least some of the time, then the system isn’t very meaningfully “quantum”. If the dynamic laws change with time, then something that isn’t part of the state must be changing them, meaning you have a reality that is only partially quantum. If the dynamics are not differentiable, then by time-invariance they are not differentiable at any point. To me that sounds pretty ugly. And if the state is not a function of time at all, then… I don’t know what, but not a reality I can imagine.

  88. Ahron Maline Says:

    Continuing my response to Scott #86:
    To further clarify: I am not trying to explain why QM is complex at the fundamental level. Just the opposite – I think it likely isn’t! I am trying to explain where the complex QM we use comes from, if the True amplitudes are real. I argue that a “complex QM simulation” is the only possible outcome, under very reasonable assumptions about how dynamics should work.

  89. AJ Says:

    Will string theory ever lead to a useful string computer? Will string theory say anything about quantum computing?

  90. Greg Kuperberg Says:

    There is another approach to quantum probability where you begin with a suitable abstract algebra of random variables, then define a state as a suitable linear functional on that algebra. In the standard version the algebra is a von Neumann algebra, and there are ways to define such an algebra so that you don’t even use Hilbert spaces at first; you can bring them in later to describe states. In this approach I think that you actually can work over the real numbers rather than the complex numbers: you can consider real rather than complex von Neumann algebras. Although I haven’t worked out the details of this, I think that one way or another the end result is not very different from working over the complex numbers. I expect that the outcome of this approach is similar to that of (say) the fundamental theorem of algebra over the reals instead of the complex numbers. Complex numbers “lurk inside” real polynomials whether you like it or not, in the sense that a real polynomial might still have complex roots.

    Meanwhile quaternionic quantum probability suffers from a different problem besides (in some interpretations) superluminal signaling. Namely, in order to express a joint system, you should use tensor products. This already happens in classical probability — a joint probability distribution for Alice and Bob lives in the tensor product of the vector spaces of Alice’s and Bob’s distributions. It still happens at both the Hilbert space level and the algebra level in quantum probability. But, although you can make quaternionic vector spaces and quaternionic Hilbert spaces, there isn’t any natural tensor product of two quaternionic vector spaces to produce any better than a real vector space. The subtle algebraic reason for this is that left scalar multiplication differs from right scalar multiplication for quaternions, and you need both kinds to make basis-independent tensors. Meanwhile the simplest form of a quaternionic vector space only has scalar multiplication on one side. You can consider the more complicated structure of a quaternionic bivector space, but I suspect that at best you would end up with the same process of assimilation under complex numbers as mentioned previously in the real case.

    On the other hand, if you don’t mind ditching the greater symmetry present in basis-independent formalism, then this other problem with superluminal signaling seems worth mentioning. 🙂

  91. Richard Gaylord Says:

    “why should amplitudes have been complex numbers?” i want to strongly recommend reading the experimental biography “The Quantum Astrologer’s Handbook: a history of the Renaissance mathematics that birthed imaginary numbers, probability, and the new physics of the universe”. It is an excellent read. and i think that scott has the reasonable approach to understanding q.m.: he treats it as a type of probability (see the comic http://www.smbc-comics.com/comic/the-talk-3). and if that doesn’t satisfy you, remember what Bertrand Russell said: “Probability is amongst the most important science, not least because no one understands it”.

  92. Robert Says:

    What does “nonzero density” mean? I’m not familiar with this terminology. It is not correct that the separable states are dense (topologically) in all mixed states; they form a closed proper subset. The Braunstein et al. paper does not mention density, but rather says that separable states are a neighbourhood of the totally mixed state. This implies what you say later because a linear function is determined by its values on one neighbourhood of one given point (because neighbourhoods can always be dilated to contain any given point, similar to the proof that a linear function is determined by its values on the unit ball).

  93. Scott Says:

    Robert #92: I guess the correct term would’ve been “nonzero measure.” Thanks; I hadn’t thought about that!

  94. Ahron Maline Says:

    Scott, now that I see the thread is not yet dead, I’ll try once more to hawk my wares. I believe that the argument I made above (in comments #73,#81,#83 and #87) is correct, and that it largely answers one of the main questions of your post. Comment #83 probably expresses the argument most clearly. You must not have understood my writing- if you had, you would either be very excited or be coming up with refutations!

    In brief: quantum amplitudes may well be real. There is no way we could know the difference! Any reasonable dynamics, applied to real QM, would automatically restrict the physically realizable operators to those of complex QM – as long as the universe is considered as a single quantum system, including the experimenters and measurement devices.
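One standard way to make "complex QM sitting inside real QM" concrete is to replace each complex amplitude by a pair of real ones. The sketch below is an illustration under that assumption, not a reconstruction of the argument in the comments above; the helper name `realify` and the example gates are hypothetical choices. It shows that complex unitaries become real orthogonal matrices that commute with a fixed matrix J playing the role of i, while a generic real orthogonal matrix does not respect that pairing.

```python
import numpy as np

def realify(m):
    """Map a complex n x n matrix A + iB to the real 2n x 2n matrix [[A, -B], [B, A]]."""
    m = np.asarray(m, dtype=complex)
    return np.block([[m.real, -m.imag], [m.imag, m.real]])

n = 2
J = realify(1j * np.eye(n))      # real stand-in for multiplication by i

hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
phase = np.diag([1, 1j])
u = phase @ hadamard             # a unitary with genuinely complex entries
r = realify(u)

is_orthogonal = np.allclose(r.T @ r, np.eye(2 * n))
commutes_with_j = np.allclose(r @ J, J @ r)

# A real orthogonal matrix that ignores the pairing of dimensions.
d = np.diag([1.0, 1.0, 1.0, -1.0])
d_commutes_with_j = np.allclose(d @ J, J @ d)

print(is_orthogonal, commutes_with_j, d_commutes_with_j)
```

In this picture, "restricting the realizable operators to those of complex QM" means restricting to the real operators that commute with J.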

  95. Doug Sweetser Says:

    The quaternions contain the complex numbers as a subalgebra. For this simple reason, there can be no result ever created using complex numbers that cannot also be written using quaternions. Just pick a simple form of quaternions, say (a, b, 0, 0) or (a, 0, b, 0), and the rest can be done by a machine. A tiny bit more interesting is to pick a fixed multiple of the 3 imaginary numbers, say (0, 1, 2, 3). Everything done with a complex-valued Hilbert space can be done in the “point one way” quaternion-valued Hilbert space. That is the one trick one must use to reproduce all the results of complex-valued QM. Of course the one direction is arbitrary, any will do. The reason the trick works is that quaternions commute with all quaternions pointing in the same direction.

    What happens if one tries to work with quaternions where the imaginary points in different directions? That is a novel situation, different from what can be done using complex numbers. At this point, I do not know how to deal with it with the kind of intellectual precision required. I do know the problem Adler points out in 9.3 will not arise if one uses “point one way” quaternions.
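The commutation claim can be checked directly. This is a minimal sketch with quaternions stored as (w, x, y, z) tuples; the function name `qmul` and the sample values are assumptions chosen for illustration.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

# Two quaternions whose imaginary parts both point along (1, 2, 3).
p = (2, 1, 2, 3)
q = (5, -2, -4, -6)
same_direction_commutes = qmul(p, q) == qmul(q, p)

# Imaginary parts along different axes: the order of multiplication matters.
i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
ij, ji = qmul(i, j), qmul(j, i)

print(same_direction_commutes, ij, ji)
```

The first pair behaves exactly like two complex numbers; the second pair gives k in one order and -k in the other, which is the noncommutativity the "point multiple ways" case has to confront.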

  96. Scott Says:

    Doug #95: Yes, it’s evident that if we restrict ourselves to quaternions that “point one way,” then we’ll get back standard complex QM. In such a case, I’d say that our theory simply was complex QM: it’s empirically indistinguishable from it, so Occam’s Razor would tell us to cut all mention of quaternions from the formulation of our theory.

    The interesting question, the question that concerned Adler and that also concerns me, is: what happens if you generalize QM to allow quaternionic amplitudes that can “point multiple ways”—so in particular, you have to deal with noncommutativity?

    Based on Adler’s discoveries, my post, and the discussion above, here’s a very short summary of what we’ve learned about that question:

    If the noncommutativity were still noticeable at arbitrary spatial separations, then you’d get superluminal signaling, which to most physicists means that the theory is sick (the term I used was “flaming garbage fire” 🙂 ) and needs to be discarded.

    If, on the other hand, the noncommutativity were exponentially suppressed with distance—as seems to be the case in the QFTs that Adler studied—then quaternionic quantum theories could have some hope of describing our world. But even in that case, Adler says that he no longer sees a motivation for these theories from particle physics, and has moved on to other ideas.

  97. Doug Sweetser Says:

    Scott #96: The reason I have decided not to take a shave from Occam’s Razor at this time is the “point one way” quaternions have a physical interpretation. There are three dimensions in space and there are three imaginary dimensions in quaternions. If and only if that is a meaningful one-to-one and onto relationship, then I understand what, say, a conjugate is. In a normal mirror reflection, one spatial dimension flips signs while the other two stay the same. Flipping one sign on space is why a right hand appears in the mirror to be left, but one still is standing on their feet. In point-one-(arbitrary)-way quaternion series quantum mechanics, one uses mirrors that flip all 3. My classical mirror reflection feels nonlocal to me. The same would apply to the conjugate of a space-time wave function. Perhaps it is a delusion, but now nonlocality has a physical, not metaphysical interpretation: it’s a 3D mirror. No wonder quantum mechanics is odd, every calculation involves mirrors. I bet you know this already, but even the uncertainty principle, which looks like a simple inequality, involves a conjugate in its derivation. Mirrors, mirrors, everywhere in space-time (just not the time part).

    Thanks for including the comments from Adler. It was not clear from his book, which I was rereading last night, where he saw a serious technical issue with his approach. I want to avoid getting sick, which is why I am sticking sooo close to complex-valued QM. I have to accept that I am too close to the complex-valued QM, but when I pass a mirror these days, I imagine the plane shrinking to a point and getting a 3D spatial reflection and smile (I be too ugly to smile at my mere reflection, so I have to create this abstraction).

  98. Scott Says:

    Doug #97: My difficulty with your response is that, if quaternionic QM described our world, then quaternions would be showing up as amplitudes—i.e., phases in an abstract (quaternionic generalization of a) Hilbert space, not as spatial directions. So it’s totally unclear what relation, if any, the i,j,k would have to the three orthogonal directions of physical space. I won’t say that there couldn’t possibly be any relation—after all, in standard QM, I wouldn’t have predicted that the 3-dimensional Bloch sphere would have anything to do with 3-dimensional space, and yet it does!—but the relation would certainly need to be spelled out.

  99. Doug Sweetser Says:

    Scott #98: I use the word “dimension” in two ways. There are space-time dimensions. This has four degrees of freedom, no more no less. This is my path to the physical world. There are also state dimensions. There can be one, two, finite or an infinite number of state dimensions depending on what physical system is being described. Each of these state dimensions is composed of a space-time quaternion. The Hilbert vector space is about the state dimensions.

    Say one has 10 states. It is easy enough to calculate all the conjugates and then calculate the inner product of the two and get one quaternion of the form (p, 0, 0, 0). For metaphysical me, those three zeroes are important! They may be saying “I am the observer, here at my spatial origin”. The p is just the odds the observer gets to see the event. What is used to construct the odds? The wave function and its spatial (but not temporal) reflection. The calculation for p will be identical either way (complex or quaternion series), but it is fun trying to read something physical into the zeroes.

    There may be a reason why “point one way” quaternions are reasonable. All experiments have to point a direction in space to collect data. Experiments that cover areas and volumes do so by adding up smaller observations. Maybe. Or not.

  100. Doug Sweetser Says:

    I have decided to accept the critique that “point one way” quaternions, while they might work, are far too constrained to be of general interest. I do this at a point where I have a specific technical idea of how to fix the flaw but the work has not been completed. If and when I get things to work (specifically, address the superluminal signaling problem), then I will comment here again. No promise on the time scale since it is possible it cannot be done.

    Enjoy Australia, should be easy.

  101. Philippe Grangier Says:

    A simple “explanation” about why complex numbers are needed, rather than real numbers, is that it must be possible to connect continuously any permutation matrix (applied to the basis states) to the identity matrix (i.e. to not changing these states). This is possible with unitary (complex) matrices, but not with orthogonal (real) matrices, because they split into two unconnected parts with determinants ±1.
    For details about why asking for that, see e.g. http://www.nature.com/articles/srep43365 , page 5 of pdf. This generalizes Scott’s argument about taking the square root of a matrix with determinant -1, and it is also related to the above comment by Age bronze. I did not try to find a similar argument for quaternions, help welcome ! Personally I’m happy with the simplest way to do the job, i.e. complex numbers, from an Occam-type argument.
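The connectivity claim can be illustrated for the simplest permutation, the swap of two basis states. The sketch below is an illustration written for this point, not taken from the linked paper; the particular path is one assumed choice among many. Over the reals no such path exists, because the determinant of an orthogonal matrix is +1 or -1 and cannot jump continuously from +1 (identity) to -1 (swap).

```python
import numpy as np

swap = np.array([[0.0, 1.0], [1.0, 0.0]])   # permutation exchanging two basis states
identity = np.eye(2)

# Projectors onto the +1 and -1 eigenspaces of the swap.
p_plus = (identity + swap) / 2
p_minus = (identity - swap) / 2

def path(t):
    """Unitary path with path(0) = identity and path(1) = swap."""
    return p_plus + np.exp(1j * np.pi * t) * p_minus

ts = np.linspace(0.0, 1.0, 11)
all_unitary = all(np.allclose(path(t).conj().T @ path(t), identity) for t in ts)

# The determinant travels along the complex unit circle from +1 to -1.
dets = np.array([np.linalg.det(path(t)) for t in ts])

print(all_unitary, np.round(dets, 3))
```

The intermediate matrices have complex entries, which is exactly what the real orthogonal group cannot supply.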

  102. Rich peterson Says:

    I didn’t get too far in this since I don’t know any QM. But when you talk about eigenvalues you think about polynomials, the Cayley-Hamilton polynomials. So perhaps the fact that the complex numbers are the algebraic closure of the integers gives you enough roots but not too many roots?

  103. rich petersen Says:

    maybe im just restating your point.

  104. Robert2 Says:

    My favorite way to justify the appearance of complex numbers in quantum theory goes as follows: Starting from an operational way of defining things, you would start out with experiments being formalized as observables (which will later be hermitean operators) which start out as a real vector space (and states being dual). But from “doing non-linear transformations of the scale of your measurement device” you argue that with an observable A you also want f(A) for a reasonable class of functions f, at least powers. But then you can define a commutative but not associative product a.b as ((a+b)^2-a^2-b^2)/2. (This is a priori all you can have in terms of operators as you want to stay within what will later be hermitean operators.) This turns your observables into a Jordan algebra. Then you can invoke a classification result for such algebras that yields that, apart from some exceptional cases, all Jordan products arise as anti-commutators of some complex algebra. And this is where the complex numbers show up. Thinking a bit more about this, it is actually quite hard to give an operational justification for the product of operators (doing “one after the other” would apply more to unitaries but then you have to postulate some version of Stone-von Neumann which you don’t have yet) as that takes you out of the real cone.
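The product defined above can be tried out on small Hermitian matrices. This is a minimal sketch using the Pauli matrices as sample observables; the function names and the specific matrices are assumptions made for illustration, and the sketch does not touch the classification theorem itself.

```python
import numpy as np

def jordan(a, b):
    """Product built only from sums and squares: ((a+b)^2 - a^2 - b^2) / 2."""
    return ((a + b) @ (a + b) - a @ a - b @ b) / 2

def is_hermitian(m):
    return np.allclose(m, m.conj().T)

x = np.array([[0, 1], [1, 0]], dtype=complex)
y = np.array([[0, -1j], [1j, 0]], dtype=complex)
z = np.array([[1, 0], [0, -1]], dtype=complex)

a = x + 2 * z
b = y - z

# The Jordan product equals half the anti-commutator and stays Hermitian,
# while the ordinary matrix product a @ b leaves the Hermitian matrices.
equals_anticommutator = np.allclose(jordan(a, b), (a @ b + b @ a) / 2)
stays_hermitian = is_hermitian(jordan(a, b))
plain_product_hermitian = is_hermitian(a @ b)

# Commutative, but not associative: (x.x).y = y whereas x.(x.y) = 0.
commutative = np.allclose(jordan(a, b), jordan(b, a))
left = jordan(jordan(x, x), y)
right = jordan(x, jordan(x, y))

print(equals_anticommutator, stays_hermitian, plain_product_hermitian, commutative)
```

The non-associativity shows up already for Pauli matrices, since x and y anti-commute and so their Jordan product vanishes.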

  105. Ahron Maline Says:

    Update on #73 etc.: I just spoke with Ron Maimon, and he tells me that the argument is his own (rather than coming from a published work as I had believed) and that he does not have a proof for it (that is, a proof that the realizable operators will turn out to be restricted to those of complex QM).

    Nevertheless, the hand-wavy argument does seem correct: the dynamics of the measurement must be expressible as some set of terms within the global Hamiltonian, and so they must respect the dimension pairing that block-diagonalizes said Hamiltonian.

  106. Luke Says:

    Would it be totally naive and stupid to say that the amplitude is imaginary because it doesn’t really exist in the real world? I.e. the wavefunction is a vibration in a space that is not part of regular 4-D spacetime, but in an imaginary dimension. That was always my undergrad-level intuition behind it. Then it’s not that weird to think of particles having a wavefunction of imaginary amplitude, because the amplitude is not part of regular spacetime.

  107. Scott Says:

    Luke #106: Yes, I’m afraid. 🙂

    Amplitudes are complex numbers, not just imaginary ones. But more to the point, real numbers don’t “exist in the real world” any more than complex ones do! They’re both mathematical constructions. The question addressed in this post was just: why does the latter mathematical construction and not the former accurately describe this part of physics?

  108. Richard L. Peterson Says:

    Sorry, should have said the analytic completion of the algebraic closure of the integers.

  109. Doug Sweetser Says:

    Hi Scott:

    It is cold here in Massachusetts, but I was not able to warm myself by the claimed flaming garbage fire discussed in the main blog. I was able to spot a few issues in the calculation with the state |1> for Alice, |+> for Bob, U/V stuff.

    I recently rolled my own tools for doing quaternion series quantum mechanics calculations. Just last week I took my first mini-class on quantum computation during MIT’s IAP. The subject of the CHSH inequality came up which provided a perfect test for my software. All I had to do was set j=k=0, and if I was doing things right, I had to get the anti-correlation of -2 2^(1/2) out of the process. The code was up to the task. Great, and boring too since the CHSH inequality has been proven long ago for complex numbers.
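For reference, the standard value of -2·2^(1/2) can be reproduced with nothing but real matrices. The sketch below is not the notebook mentioned in this comment; the singlet state and the measurement settings are assumptions chosen here because they are a common textbook choice.

```python
import numpy as np

x = np.array([[0.0, 1.0], [1.0, 0.0]])
z = np.array([[1.0, 0.0], [0.0, -1.0]])

# Singlet state (|01> - |10>) / sqrt(2) in the basis |00>, |01>, |10>, |11>.
singlet = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)

def correlation(a, b):
    """Expectation value of (a on Alice) tensor (b on Bob) in the singlet state."""
    return singlet @ np.kron(a, b) @ singlet

# Alice measures Z or X; Bob measures along the two diagonals in the X-Z plane.
a0, a1 = z, x
b0, b1 = (z + x) / np.sqrt(2), (z - x) / np.sqrt(2)

chsh = (correlation(a0, b0) + correlation(a0, b1)
        + correlation(a1, b0) - correlation(a1, b1))

print(chsh, -2 * np.sqrt(2))
```

Every amplitude and every matrix entry here is real, and the result still exceeds the classical bound of 2 in absolute value.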

    It then became possible to extend the CHSH inequality to quaternions, to go from (1 i, 0, 0) to (n i, m j, p k) where n, m, and p can take whatever real values. The only thing that changed was a normalization factor. That change is important at two levels. First it is so simple a computer can do it. Second it means that quaternion series quantum mechanics is not exactly complex-valued quantum mechanics. If anyone wants to see the details of the work, it is on a GitHub Jupyter notebook, to avoid filters, url of bit dot ly slash vp dash CHSH.

    That was my warm up exercise I did before trying to figure out the calculations presented in the main blog.

    My approach was simple… I know that if one calculates ⟨a|a⟩ for a quaternion series, the result will always be a positive real number. If one has a quaternion series operator O, then ⟨a|O|a⟩ will be a real number and all three imaginary terms will be zero. I was able to show that ⟨1|U|1⟩ = (1, 0, 0, 0). This is a good thing. In the details of the calculation, it is also not too interesting since only real values go into it, no imaginaries take the stage.

    It turns out that ⟨+|V|+⟩ = (1/2, 1/2, 0, 0). This means that V for the state |+> is not of interest because it is not an operator, so cannot be observed.

    Can we ask a different question that is of interest? What I decided to do was calculate ⟨0|V|0⟩ = (0, 1, 0, 0). This too fails, but can be fixed.

    Vi =
    | -i 0 |
    | 0 1 |

    This leads to ⟨0|Vi|0⟩ = (1, 0, 0, 0). Combine these so: ⟨1|U|1⟩⟨0|Vi|0⟩ = (1, 0, 0, 0). Nice.

    Some might object that Vi does not look like what they expect for a Hermitian matrix. The operator Vi most definitely is not a Hermitian operator. It is unclear at this moment if Vi would be considered part of work by Carl Bender on non-Hermitian quantum mechanics, or something novel, the transition from one imaginary number to three. While you may judge a complex Hermitian matrix by its trace which must be real, you cannot do so for the considerably more diverse quaternion series. The trace rule still exists for quaternion series if and only if two of the three imaginaries are zero. What happens in the other cases? I have no idea, this is too new to know. Fun.

    What about the suggested rotation? We can predict it won’t work: quaternion series are very fussy about direction, so if this was chosen without calculations, the odds are great it is a little off. That said, if the pattern holds, it will be simple to fix.

    The pattern holds. I decided to rotate both U and V with the rotation matrix I chose to call jk. When I multiplied jk and U, took that and calculated ⟨1|jkU|1⟩, the number was a pure j. That can be fixed with multiplication by -j. Here is the result:

    jjkU =
    | 1 -i |
    | -k j |

    ⟨1|jjkU|1⟩ = (1, 0, 0, 0)

    The same story applies to Vi, including the -j. The rotation does nothing to alter the odds. So I formally disagree that Alice can communicate a bit to Bob. A Jupyter notebook is available on GitHub at bit dot ly slash vp dash why dash complex.

    **The Big Picture: non-locality is space-like separated reflections of information**

    A little more effort is required to implement quaternion series quantum mechanics. There are normalization factors and more care to determine where things have to point in 3D space. The relatively simple rules for Hermitian matrices have to be expanded in ways that currently are not clear. Why bother when the calculations will generate the same probabilities?

    There is a physical interpretation to wave functions and their conjugates. Treat a wave function as a series of events in space-time. The conjugate of that series of events will flip the signs of the 3D spatial terms while keeping the time part identical. This has some similarities to a regular mirror which flips the sign of one of the three spatial directions. A mirror reflection always feels distant and unreachable. Mirrors involve photons bouncing around which is not what is going on in quantum mechanics. Instead, this is a mathematical reflection that always necessarily must be done for observations in quantum mechanics. One takes the information one has – one state being say (1, 2, 3, 4) – and calculates its conjugate, (1, -2, -3, -4). These two pieces of information are necessarily space-like separated because the times are the same but the spatial locations are opposite.
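The conjugate of the example state, and the fact that multiplying by it leaves only a real part, can be checked in a few lines. This is a small sketch of the arithmetic only; the helper names `qmul` and `qconj` are assumptions, and the sketch takes no position on the proposed physical interpretation.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qconj(q):
    """Conjugate: keep the first (time-like) term, flip the three spatial terms."""
    w, x, y, z = q
    return (w, -x, -y, -z)

state = (1, 2, 3, 4)
mirror = qconj(state)

# The product of the conjugate with the state has zero imaginary part.
norm_squared = qmul(mirror, state)

print(mirror, norm_squared)
```

For this state the product is (30, 0, 0, 0), that is, 1 + 4 + 9 + 16 with three zeros.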

    I have been living with this idea for two months. I think it raises more questions than it answers, but it does start to answer a question that has been around since the 1920s: what physical reason is there that requires quantum mechanics to be non-local? Non-locality is due to 3D spatial reflections required so that has a least lower bound of zero. Or for a non-technical audience: physicists have to use mirrors to get the math right of almost empty space. Mirrors are odd when used all the time.

  110. Doug Sweetser Says:

    #109 Mangled in WordPress translation.
    There are ~10 times in the above where I tried to use bracket notation. I didn’t notice that bracket notation gets hidden… I will try to recreate the missing parts using () instead of angled brackets.

    1-3: I know that if one calculates (a|a) for a quaternion series, the result will always be a positive real number. If one has a quaternion series operator O, then (a|O|a) will be a real number and all three imaginary terms will be zero. I was able to show that (1|U|1) = (1, 0, 0, 0).

    4. It turns out that (+|V|+) = (1/2, 1/2, 0, 0).

    5. What I decided to do was calculate (0|V|0) = (0, 1, 0, 0).

    6-7. This leads to (0|Vi|0) = (1, 0, 0, 0). Combine these so: (1|U|1)(0|Vi|0) = (1, 0, 0, 0). Nice.

    8. When I multiplied jk and U, took that and calculated (1|jkU|1), the number was a pure j.

    9. (1|jjkU|1) = (1, 0, 0, 0)

    My bad for not noticing.

  111. (W^2 + X^2) / (W^2 + X^2 + Y^2 + Z^2) | Complex Projective 4-Space Says:

    […] in quantum theory (which seems to be a special case of a theorem by Bill Wootters, according to this article by Scott Aaronson arguing why quantum theory is more elegant over the complex numbers as opposed to the reals or […]

  112. (W^2 + X^2) / (W^2 + X^2 + Y^2 + Z^2) | cp4space Says:

    […] in quantum theory (which seems to be a special case of a theorem by Bill Wootters, according to this article by Scott Aaronson arguing why quantum theory is more elegant over the complex numbers as opposed to the reals or […]

  113. Shtetl-Optimized » Blog Archive » Research (by others) proceeds apace Says:

    […] to the title might be “well duh … who ever thought it didn’t?” (See this post of mine for a survey of explanations for why quantum mechanics “should have” involved complex […]

  114. Shtetl-Optimized » Blog Archive » Why Quantum Mechanics? Says:

    […] theorem, and of the specialness of the 1-norm and 2-norm in linear algebra, and of the arguments for complex amplitudes as opposed to reals or quaternions, and of the beautiful work of Lucien Hardy and of Chiribella et […]