Quantum teleportation from scratch to magic part 3 — How it works: nonlocality

22 min read · Aug 27, 2020

So you’ve teleported some qubits, now you must want to know — how?

Welcome back. This is part 3 of my series on quantum teleportation. In the first part, I introduced the foundational theory and notation for quantum computing. In the second part, I showed how to do it in real life (with some minor caveats) on a real quantum computer. Why not check them out if you haven’t already (for the sake of The Algorithm if not your curiosity).

Finally, in this part, I’m going to give you my very unqualified thoughts about what sort of crazy universe could ever permit such black magic. I originally sat down to write this article right after publishing the last one, but I quickly realised I had no idea what I was talking about (not that anyone really does). So now I return to you, feeling quite significantly less sane, with a slightly more informed attempt to try and describe what’s going on.

[edit: I wrote this early on in my exploration of quantum information and so there are several misconceptions in here. I have decided to keep it mostly as I wrote it, however, to preserve my line of thinking at the time]

(sdrawrof elcitra eht daer esaelp woN)

What’s right

For the purposes of this article, the only thing we will assume to be ‘true’ within quantum mechanics is this lovely thing called the Schrödinger equation: iħ d/dt |ψ(t)> = H |ψ(t)>

While it looks ghastly, the general idea is that the evolution of a quantum state through time (left side) is determined by a special operator called a Hamiltonian (right side). I’m emphasising determined because this equation contains no probabilities, which creates tensions down the line as we’ll see.

This equation is the quantum equivalent of Newton’s second law, F=ma, in that it allows us to predict when a particle will be where if we know where it was when.

This equation was developed in response to the puzzling results of some experiments, most notably the double-slit experiment. In the double-slit experiment, light is fired at a barrier with two small slits and then recorded on a screen behind it. Classically, assuming the photons spread out slightly randomly from the beam, and then again from the slits, you would expect to see a relatively uniform pattern on the screen, with a bigger density in the middle. Instead, this bizarre thing called an interference pattern is observed, as seen below:

Source: https://en.wikipedia.org/wiki/Double-slit_experiment

There are some regions with quite high density and others with absolutely none. The pattern is far more like what you would expect if you were sending lots of waves and they interfered with each other, cancelling out at certain points. But here’s where it gets really weird: the same pattern emerges if you send the photons one at a time! Therefore, a single photon seems to be going through both slits and interfering with itself before ending up at a single point. It’s behaving like a wave and a particle at the same time!?!

OK, OK that can’t be right. Why can’t you just measure which slit the light is going through? Nice try, sucker! Measuring a photon turns it from a wave back to a particle and we no longer see our lovely interference pattern. So, light only behaves as a wave when we’re not looking at it!

And so we have quantum mechanics: things can be in several places at once but only if they don’t tell anyone. Not convinced? Let’s explore the difficulties.

What’s wrong

When introducing the key concepts to you, I missed out pretty much all of the key questions of quantum mechanics. So here’s a list of some juicy ones to get us going:

Why does every operation have to be reversible?
Why except measurement? Wait, what even counts as a measurement?
What actually is a superposition and why can’t I see one?
How does entanglement really work?
Why do we measure probabilistically according to amplitudes like |α|² rather than just α?

By the end of this article, I hope to have at least brought all of these questions into a slightly more cohesive framework. Ideally, a couple may even be resolved.

Entanglement and the Bell inequality

The key to the process of teleportation is entanglement. So much so that it’s basically all we’ll talk about for the rest of this article. As a reminder, this innocent-looking fella, (|00> + |11>)/√2, is called a Bell pair. All it takes to prepare is an H and a CX gate and yet I’m claiming it breaks metaphysics as we know it. (Don’t worry, it breaks normal physics too)
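If you'd like to check the H-then-CX recipe without a quantum computer, here is a minimal sketch in plain Python (no qiskit) that tracks the four amplitudes by hand. The helper names apply_h and apply_cx are made up for this illustration, and the bit labels assume qiskit's convention that qubit 0 is the rightmost character.

```python
import math

def apply_h(state, q):
    """Apply a Hadamard gate to qubit q of a list of amplitudes."""
    s = 1 / math.sqrt(2)
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        bit = (i >> q) & 1
        new[i & ~(1 << q)] += s * amp                      # contribution to q = 0
        new[i | (1 << q)] += s * amp * (-1 if bit else 1)  # contribution to q = 1
    return new

def apply_cx(state, control, target):
    """Flip the target qubit wherever the control qubit is 1."""
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        j = i ^ (1 << target) if (i >> control) & 1 else i
        new[j] += amp
    return new

state = [1, 0, 0, 0]                       # start in |00>
state = apply_cx(apply_h(state, 0), 0, 1)  # H on qubit 0, then CX from 0 to 1
bell_probabilities = {format(i, "02b"): abs(a) ** 2 for i, a in enumerate(state)}
print(bell_probabilities)
```

Only '00' and '11' come out with non-zero probability, roughly one half each, which is the "always the same result" behaviour of the Bell pair.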

The weirdness of a Bell pair is that you will always measure the same result for both qubits if they are measured in the same basis (both in Z, say, or both in X). Measure them in unrelated bases and the results look random relative to each other.

A central argument in interpreting quantum mechanics is the debate over whether an entangled particle decides its value at the point of measurement (and then somehow sends it to the other one) or whether it was that value all along (and we just didn’t know it).

There was also a third point of view at the beginning. Einstein, being a fan of speed limits, asserted that nothing could go faster than the speed of light (doing so would break his lovely theory of special relativity). This (and anti-semitic persecution) is one of the reasons he left Germany. It is also why he argued strongly that quantum mechanics was incomplete because there must be some ‘hidden variables’ written into every entangled particle telling it what value it should take for any particular measurement. In a famous paper, along with Podolsky and Rosen, he strongly defended our old friend local realism (properties of things exist independent of measurement; things can’t communicate faster than light).

Everything was going well for them until Bell came along and said “hey lads, we’re physicists not philosophers, right? So let’s stop arguing and start measuring.” And measure they did.

But what did they measure? Well, I’m glad you asked. I won’t go into all of the details (see here for an excellent overview) but the basic idea was you could test these ‘hidden variable’ strategies by randomly measuring in one of three bases and comparing what happens. If local realism is true, then the number of times the values agree should never exceed a certain value. But sure enough, when Aspect (and then many others) started blasting their photon beams, they observed an agreement at a rate that should have been impossible. This is known as a violation of Bell’s inequality, and it has since been reproduced many times over significant distances, with increasing attention to closing potential “loopholes”.
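To make "a rate that should have been impossible" a bit more concrete, here is a hedged sketch of one common form of the test, the CHSH inequality. Note that this is a swap-in: CHSH uses two measurement settings per side rather than the three bases described above, and the angles below are just the textbook choice, not anything taken from Aspect's setup. The names observable, correlation, quantum_s and classical_max are invented for this illustration.

```python
import math
from itertools import product

def observable(theta):
    """Measurement along angle theta in the X-Z plane: cos(theta) Z + sin(theta) X."""
    return [[math.cos(theta), math.sin(theta)],
            [math.sin(theta), -math.cos(theta)]]

def correlation(a, b):
    """Expected product of Alice's and Bob's +1/-1 results on (|00> + |11>)/sqrt(2)."""
    A, B = observable(a), observable(b)
    # <Phi+| A (x) B |Phi+> = 1/2 * sum over i, j of A[i][j] * B[i][j]
    return 0.5 * sum(A[i][j] * B[i][j] for i in (0, 1) for j in (0, 1))

# Textbook CHSH settings (an assumption, not from the original experiments)
a0, a1 = 0.0, math.pi / 2
b0, b1 = math.pi / 4, 3 * math.pi / 4

quantum_s = (correlation(a0, b0) - correlation(a0, b1)
             + correlation(a1, b0) + correlation(a1, b1))

# Local hidden variables: every particle carries a fixed +1/-1 answer per setting
classical_max = max(x0 * y0 - x0 * y1 + x1 * y0 + x1 * y1
                    for x0, x1, y0, y1 in product((-1, 1), repeat=4))

print(quantum_s, classical_max)
```

The pre-agreed answers can never push this score past 2, while the Bell pair reaches 2√2 (about 2.83).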

Bad news for speed limits, bad news for Einstein and very bad news for local realism. We must, therefore, decide whether to cut locality or to cut realism. This evidence alone does not tell us which.

So let’s consider our options. If the two particles can’t talk to each other instantaneously (i.e. locality) then they must have been in that state before we measured them. There is no way they could have been in a superposition and agreed beforehand on what to output as proven by Team Aspect.

Alternatively, the particles were in a superposition and upon the measurement of the first particle, a message is sent to the other one telling it what’s going on. This happens faster than the speed of light and so permits, if not necessitates, information being sent back in time.

There are many theories that have now been developed to explain this and quantum mechanics as a whole. The quantity alone should be a warning sign that none of them has got it worked out. For both of our sakes, I’m not going to talk about them all, just the two most popular. For a longer list see this video and for a complete overview, see the paper which includes this diagram:

Source: https://arxiv.org/pdf/1509.04711.pdf

Don’t worry too much about the distinctions that diagram makes, just check out some of the wacky names.

Copenhagen Interpretation

The prevailing interpretation of quantum mechanics is called the Copenhagen interpretation after Bohr and Heisenberg who both lived there. It is what is taught in textbooks and universities and has a bit of a reputation for side-stepping the big problems, giving it the nickname “The shut-up-and-calculate interpretation”. So what does it say?

Well, it says that the so-called wave-function of particles (how likely something is to be where) exists independently of us and that we could never see it to find out if it’s real. Before measurement, reality is this sticky mess of weighted superpositions where everything is a little bit everywhere. What we do know is how that wave-function translates to probabilities of actually finding it somewhere, so that is the main focus.

So how does a measurement occur? “Not our problem really,” they say, “there’s enough on our plate as it is.” The Copenhagen Crew introduces measurement as the third postulate of quantum mechanics and simply asserts that you’ll measure each outcome with a probability equal to the square of the modulus of its amplitude, no questions asked.
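As a tiny illustration of that rule (a sketch, with made-up names born_probabilities, plus, minus and phased): three different single-qubit states whose amplitudes differ in sign or phase all give the same measurement probabilities, because only the squared modulus survives.

```python
import math

def born_probabilities(amplitudes):
    """Probability of each outcome is the squared modulus of its amplitude."""
    return [abs(a) ** 2 for a in amplitudes]

s = 1 / math.sqrt(2)
plus = [s, s]          # |+>  = (|0> + |1>)/sqrt(2)
minus = [s, -s]        # |->  = (|0> - |1>)/sqrt(2)
phased = [s, 1j * s]   # (|0> + i|1>)/sqrt(2)

for name, state in (("plus", plus), ("minus", minus), ("phased", phased)):
    print(name, born_probabilities(state))
```

Each state prints roughly [0.5, 0.5], even though the raw amplitudes are positive, negative and imaginary respectively, so "just α" could not serve as a probability.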

Quantum Computation and Quantum Information, the ‘Bible of quantum computing’, passes off the issue ‘pragmatically’:

“According to Postulate 2, the evolution of this larger isolated system can be described by a unitary evolution. Might it be possible to derive Postulate 3 [measurement] as a consequence of this picture? Despite considerable investigation along these lines there is still disagreement between physicists about whether or not this is possible. We, however, are going to take the very pragmatic approach that in practice it is clear when to apply Postulate 2 and when to apply Postulate 3, and not worry about deriving one postulate from the other.”

A famous thought experiment about a cat was proposed (I can’t remember by whom) to test this intuition of when it is apparently so ‘clear’ to provide a measurement. There is a cat in an isolated box where the release of some deadly poison is triggered by the radioactive decay of a (quantum) particle. Before opening the box, the particle is in a superposition of decaying and not. Therefore, the cat is in a superposition of being dead and alive? Sometimes people interpret this to mean that’s what literally could happen but, in fact, the Cat Guy proposed this to show the absurdity of having such a poorly defined barrier between the wave function and measurement.

Source: https://en.wikipedia.org/wiki/Schr%C3%B6dinger%27s_cat

Clearly, I’m not a huge fan of this one, so let’s move on.

Many worlds

The many-worlds interpretation, by contrast, says that there is no barrier between superposition and measurement. There is no breakdown between the classical and the quantum world. Instead, when a superposition is created, any observer will find themselves in a superposition too. Since we clearly can only ever see one thing at a time, then the you who sees one outcome exists in a different “alternate universe” to the one which sees the other outcome. Both literally exist but you will only ever find yourself in one of them (depending on how we define you).

This theory is quite mindblowing at first but theoretically, it is quite neat. Proponents of the theory, including quantum computing pioneer David Deutsch, argue that this is the truest way to interpret the Schrödinger equation since there is no clumsy addition of any measurement condition and it remains deterministic.

But if it’s deterministic, where do the probabilities come from? This interpretation says it comes from your ignorance of which resulting universe you’re actually in. It emerges from this that your ‘best bet’ would be proportional to the number of total universes with that same outcome. The many-worlders claim, not without criticism, that our old friend |α|² drops out as the most rational expectation you could have.

I think that this interpretation is very appealing because of its effectiveness in dealing with superpositions and simplicity as a whole. However, the derivation of the probability amplitudes does seem a little contrived. It seems to me that if there is a 5% chance of finding myself in some alternate universe then it’s more of a hypothetical than an actual existence.

Other people take issue with what that means for the problem of identity. If there’s a version of me that sees one outcome and a version of me that sees another, what is me? That is an extremely important question, lying very deep in the pit of the eternally debated philosophical questions. It certainly does make it hard to say which is me, but I think no more than the problem of time does: if “I” like eating apples for breakfast now but as a child “I” liked eating cereal and one day “I” will like eating porridge, then what do “I” like to eat for breakfast? For me, the answer to this is that “I” am the line joining the points of my life throughout, rather than any of those individual points. I think that this kind of line could be drawn through the different worlds too.

But there’s a bigger problem too, which I’ve never seen addressed anywhere.

Remember from the double-slit experiment that even when sent individually, particles could cancel themselves out of being measured at regular, predictable points. This is called interference and is actually a very powerful feature of quantum mechanics, rather than an accident. But how could this happen in the many-worlds theory? The whole point is that they’re completely independent but if the different possible paths of the particle are able to interact then they must be together in some way or another. If so, then they are more simultaneous universes than alternate ones and should probably be treated like a single system.

(edit: turns out there’s an equation which exactly describes how these many-worlds can interact: the very same Schrödinger equation I introduced at the beginning! The entire system I said they should be treated as would be all the different universes. The theory also says that they can interact up until the point of decoherence which is basically just a better-defined measurement)

By the way, Sean Carroll gives an excellent introduction to the theory in this video.

Testing Entanglement

While writing this article, I thought I’d test some stuff out a little bit to try and understand entanglement a little better. Perhaps you might too.

The All-Along Hypothesis

Recall (again) that a Bell pair is created when a |+> state acts as a control in a CNOT gate. So what happens when you ambiguously control some other operations? Check out this circuit and its results:

That purple line is a CSWAP: when the control (the dot) is 0, nothing happens. When it's 1, the targets (the two crosses) are swapped.

If you want to make this circuit yourself, the code is just:

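from qiskit import QuantumCircuit, QuantumRegister

# One control qubit (index 0) and two target qubits (indices 1 and 2)
qc = QuantumCircuit(QuantumRegister(1, name='control'), QuantumRegister(2, name='target'))
qc.h(0)                # put the control into the |+> superposition
qc.x(1)                # flip the first target to |1> so we can tell the targets apart
qc.cswap(0, 1, 2)      # swap the two targets only when the control is 1
qc.measure_all()       # measure all three qubits
qc.draw(output='mpl')  # render the circuit (needs matplotlib)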

NB That simulate function I wrote myself and have tried to add to qiskit but hasn’t been accepted yet :(

Personally, the results were very surprising. Working backwards, either:

the top qubit was 0, meaning the swap wasn’t performed and so the middle qubit was 1 (I put the X gate there so we could distinguish it) and we measure ‘010’
the top qubit was 1, meaning the swap was performed and so the 1 ends up on the bottom qubit and we measure ‘101’
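The two cases above can be checked by hand with a minimal plain-Python sketch that pushes the amplitudes through the same three gates. This is not the simulate helper mentioned above; apply_h, apply_x and apply_cswap are hypothetical names for this illustration, and labels follow qiskit's ordering (qubit 0 is the rightmost character).

```python
import math

def apply_h(state, q):
    """Hadamard on qubit q."""
    s = 1 / math.sqrt(2)
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        sign = -1 if (i >> q) & 1 else 1
        new[i & ~(1 << q)] += s * amp
        new[i | (1 << q)] += s * amp * sign
    return new

def apply_x(state, q):
    """Bit flip on qubit q."""
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        new[i ^ (1 << q)] += amp
    return new

def apply_cswap(state, control, t1, t2):
    """Swap qubits t1 and t2 wherever the control qubit is 1."""
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        j = i
        if (i >> control) & 1 and ((i >> t1) & 1) != ((i >> t2) & 1):
            j = i ^ (1 << t1) ^ (1 << t2)
        new[j] += amp
    return new

state = [0j] * 8
state[0] = 1                          # start in |000>
state = apply_h(state, 0)             # control into superposition
state = apply_x(state, 1)             # mark the middle qubit
state = apply_cswap(state, 0, 1, 2)   # controlled swap of the targets

cswap_probabilities = {format(i, "03b"): abs(a) ** 2
                       for i, a in enumerate(state) if abs(a) ** 2 > 1e-12}
print(cswap_probabilities)
```

On an ideal (noise-free) simulation like this, only '010' and '101' appear, each about half of the time.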

If the control bit was in a superposition, then the cswap gate almost ‘knew in advance’ what it was going to be measured as. When doing this with a cnot gate, we normally say that the qubits become entangled but what exactly is becoming entangled here? The layout of the circuit?

In the ‘all-alonger’ spirit, a much more intuitive explanation is that after the H gate, sometimes the qubit is |0> and sometimes it’s |1>. This conforms to realism.

Remember in the last part I briefly mentioned something called the ‘Deferred Measurement Principle’: moving a measurement to the end of a circuit makes no difference to the results. This means that the following circuit will measure the exact same results as our last one:

Sure enough, the exact same results (with some fluctuations because of noise)! This time it makes complete intuitive sense as the value we measure in the middle does in fact tell us what is going to happen for the rest of the circuit. The question now is: what’s the difference between the two circuits?

You could say that entanglement does exist in the former one but you could also just say that they are identical. Imagine taking the position that if you could never measure a difference between two things then they are the same. By this logic:

The principle of deferred measurement taken seriously

If so, then the magic of entanglement is broken. It's like flipping some coins, sending them out to space and then calling whatever they do spooky because you forgot to check what they were.

But if taken to the extreme, does the principle of deferred measurement (PDM) mean you could put measurements anywhere in your circuit? Wouldn’t that mean that we could very easily simulate all quantum circuits and so they’re not that special after all? Well, let’s test it.

After some experimentation, it turns out that while most (randomly generated) circuits give the same results with and without deferred measurement, this is not always true. Consider the following:

That’s all it takes. You might remember from the first part of this series that HXH = Z so the result of the complete circuit is Z|0> = |0>.

However, with the mid-circuit interruption, it would measure 1 half of the time, and then restart the circuit in the |1> state, giving random end results, as we see.
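Here is a hedged plain-Python sketch of that comparison. I'm assuming the interrupting measurement sits right after the first H gate (the position isn't spelled out in the text), and the function names run_unmeasured and run_with_mid_measurement are invented for this illustration. Instead of sampling shots, it averages exactly over both measurement outcomes.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]
X = [[0, 1], [1, 0]]

def apply(gate, state):
    """Multiply a 2x2 gate into a single-qubit state."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]

def probabilities(state):
    return [abs(a) ** 2 for a in state]

def run_unmeasured():
    """H, X, H on |0> with a single measurement at the very end."""
    state = [1, 0]
    for gate in (H, X, H):
        state = apply(gate, state)
    return probabilities(state)

def run_with_mid_measurement():
    """Same gates, but with a Z measurement after the first H (assumed position)."""
    after_h = apply(H, [1, 0])
    final = [0.0, 0.0]
    for outcome, p in enumerate(probabilities(after_h)):
        collapsed = [1, 0] if outcome == 0 else [0, 1]   # state after the measurement
        end = probabilities(apply(H, apply(X, collapsed)))
        final = [final[k] + p * end[k] for k in (0, 1)]  # weight by outcome probability
    return final

print(run_unmeasured(), run_with_mid_measurement())
```

The uninterrupted circuit gives 0 with (numerically) certainty, while the interrupted one gives an even split between 0 and 1.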

This clearly shows that the PDM cannot be applied universally. In particular, we need to keep track of which basis our qubits are in. If we had done an X measurement (rather than a Z one), we could have preserved the state of the circuit. Another explanation for the deterministic results of the first circuit is that when the second H gate is applied, the |0> and |1> branches (which it sends to |+> and |->) destructively interfere, i.e. their |1> components cancel each other out. However, this contradicts the all-along approach which says that the qubit wasn’t in a superposition, we just didn't know what it was. It seems pretty absurd to imagine a qubit that is actually in the |0> state interfering with a hypothetical |1> state that comes from our uncertainty.

Therefore, the decision of when to measure a circuit is, in fact, an important one. This is contrary to what the many-worlders say when they try to get rid of the measurement problem. The only way they could resolve this would be to allow the universes to ‘cancel out’ after the second H gate but as I said before, that doesn’t seem like they’re particularly alternate if they can interact in such a way (edit: unless you change your perspective on what the whole universe is…)

So there we are. In this section, I have tried to argue that the “all-along” approach does not hold up very well if we use it to take the PDM seriously. Now it's time to give a bit of time for the ‘on the flyers’.

The On-the-fly Hypothesis

So if instead, we assume that a Bell pair actually is in a superposition until it’s measured, how is it so damn consistent? How does one particle even know if the other one has been measured yet? What provider are they using to communicate so fast? (mine only offers 4G)

Somehow, information is being communicated faster than the speed of light. As we saw earlier, Einstein didn’t like that idea. That’s because in his theory of space-time if you sent a message faster than light to someone moving in one direction who sent it to someone else moving in another direction who sent it back to you, then it could arrive before you sent the first one! However, the reconciliation of Einstein’s relativity (“things bend when they go fast”) with quantum mechanics (“the most accurate theory ever”) is the biggest problem in physics right now and I’m not here to settle it but to take sides (guess which).

Apart from, “because it has to” why is there this insistence that time can only go forwards? The ‘answer’ lies in something called entropy. Entropy is a pretty poorly understood concept but basically boils down to the probabilities of arrangements of particles. Imagine I gave you a bag containing 10 green marbles and 10 blue marbles. If you shook the bag would you expect to see the marbles separated by colour (ordered) or mixed more uniformly in the bag (disordered)? You’d have to be overwhelmingly lucky to randomly separate the marbles and so we assume that with shaking (ie random movement over time) the disorder (ie entropy) increases. Formally, this is called the second law of thermodynamics.
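To put a rough number on "overwhelmingly lucky", here is a back-of-the-envelope sketch. It assumes a deliberately crude model that is not in the argument above: after shaking, the 20 marbles end up in a random row, and "separated" means all of one colour sit at one end.

```python
from math import comb

# Crude model (an assumption): a shaken bag is a random row of 20 slots
total_arrangements = comb(20, 10)   # ways to choose which 10 slots hold green
separated_arrangements = 2          # green-then-blue, or blue-then-green

chance_of_order = separated_arrangements / total_arrangements
print(total_arrangements, chance_of_order)
```

That is 2 ordered arrangements out of 184,756, or about one shake in ninety thousand, and that is with only 20 marbles.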

All (cool) physical laws are time-symmetric. This means if I showed you a video of some particles bouncing around according to physical laws (eg frictionless billiard balls), and then I showed you it backwards, you wouldn’t be able to tell me which was which. Or if I showed you a video of a pendulum swinging you wouldn’t be able to tell (ignoring air resistance).

Entropy is the only major exception to this: if I showed you a video of a glass dropping and breaking then you could obviously tell that that was going forward in time. Thus it is given the honour of creating the “arrow of time” that we seem to experience. Another way to think of it is that the reason you can’t unscramble an egg is the same reason you can’t go back in time, change your mind and fry it.

Full disclosure: I’m not a fan of either time or entropy. This is an unpopular opinion, but to be honest it just doesn’t sit well with me that a probabilistic argument about subjectively perceived macrostates is used to justify one of the seemingly most fundamental features of the universe (I also don’t see how gravity, a genuinely fundamental force which results in the clumping of particles over time, agrees with this; nor do I see why time couldn’t then in principle be reversed in some places at the expense of a greater increase in the rest of the system, as we do with entropy by eating food). I won’t spend too much time complaining about time though, since the philosophers Hume and Kant have shown that we can’t help but think about things in terms of causation and time which makes it impossible to consider a universe without them.

Getting back to the point, in a quantum system where the particles don’t even have a certain position, it isn’t so obvious (to me at least) that increasing entropy can emerge. Furthermore, notice that the Schrödinger equation is time-symmetric. I said at the beginning that that is one of the only things we’re going to be taking for granted, so now let’s see what can happen if we let quantum information travel either direction in time.

Imagine two entangled qubits were created and then separated:

After being spread out over space and time, the orange qubit is then measured, to give 1. This result is then observed by Alice on the left. Imagine that because entangled qubits are so desperate to agree with each other that this result is then also propagated backwards in time to the creation of the entangled state. This would mean that the measurement of the green state is now fixed to 1 as well and that is what Bob, on the right, sees too.

This is a greatly simplified adaptation of the transactional interpretation of quantum mechanics by John Cramer but it still gives a bit of explanation of the consistency.

An interesting question is whether this is equivalent to the all-along interpretation we discussed earlier. In what state is the green qubit sent out? In the beginning, we said Bob receives it as a superposition but isn’t that then overwritten by the backwards-time correction? Well, I think it depends on how you look at the sequence of events. On the one hand, Bob could have only measured 1 but on the other hand, that result was only forced because of Alice’s measurement (ie it caused it without preceding it in the normal sense) and if Bob had measured first, something else could have happened.

Causes from the future are normally quite worrying because of paradoxes like “what would happen if I went back in time and killed my grandmother?” However, fixing the measurement outcome of a superposition does not appear to be in danger of creating any such paradoxes. In fact, Lev Vaidman has run real-world experiments where he found that future measurements affected present results so I’m not making as much up here as it might seem! What’s more, you can create situations where quantum states behaving independently could violate something like the conservation of momentum and so “communication” between entangled quantum states appears as a necessary explanation. But wait: doesn't this imply I can send information back in time because entanglement is spooky? The answer is no (for the time being) because you can’t know in advance what the first qubit is going to output and therefore you have no way of controlling what the second one will say. All you do know is that they’ll end up being the same.
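Here is a small sketch of why that trick fails, in plain Python. It assumes Alice's only freedom is which basis she measures in (Z or X), and works out what Bob sees on his own qubit in each case. The names BELL, joint_probabilities and bob_marginal are invented for this illustration.

```python
import math

S = 1 / math.sqrt(2)
IDENTITY = [[1, 0], [0, 1]]   # Alice measures in the Z basis
HADAMARD = [[S, S], [S, -S]]  # Alice measures in the X basis

# BELL[a][b] is the amplitude for Alice's qubit = a and Bob's qubit = b
BELL = [[S, 0], [0, S]]

def joint_probabilities(alice_basis_change):
    """Joint outcome probabilities when Alice rotates her basis and Bob measures in Z."""
    probs = [[0.0, 0.0], [0.0, 0.0]]
    for a in (0, 1):
        for b in (0, 1):
            amp = sum(alice_basis_change[a][k] * BELL[k][b] for k in (0, 1))
            probs[a][b] = abs(amp) ** 2
    return probs

def bob_marginal(probs):
    """What Bob sees if he knows nothing about Alice's result."""
    return [probs[0][b] + probs[1][b] for b in (0, 1)]

bob_if_alice_z = bob_marginal(joint_probabilities(IDENTITY))
bob_if_alice_x = bob_marginal(joint_probabilities(HADAMARD))
print(bob_if_alice_z, bob_if_alice_x)
```

Either way Bob's statistics are a 50/50 coin flip, so nothing Alice chooses shows up on his side until the two compare notes.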

Anyways, this transactional interpretation basically says that a ket, e.g. |ψ>, is a particle going forward in time, while a bra, e.g. <ψ|, is the same particle going backwards in time. Together they make up the state <ψ|ψ>.

“Yeah big deal, you’ve doubled the notation, whatever,” you say. It is a big deal, you’re right! If a particle is going in two opposite directions then the observed probability of measuring that state literally just drops out as ψ*ψ = |ψ|²!

Another interesting result of allowing this ‘bidirectional time’ is found by making use of the reversibility of (ALL) quantum circuits. Remember this circuit and its results from earlier:

Implicitly, the state is being prepared in the ‘000’ state, but are there any other inputs we could use that would give the same result? Well, you could examine the circuit and work through the possibilities yourself or you could run it backwards starting either in the state ‘010’ or ‘101’. Once again, I’ll be using my yet-to-be-merged qiskit helper function quick_simulate to prepare my circuit in the state x=‘101’:

These results are identical to when it’s prepared in the state ‘010’:

There are a few things to notice here:

There are two possible inputs (‘000’ and ‘001’) that map the circuit to the outputs ‘010’ and ‘101’.
These very same outputs, when used as inputs on the circuit in reverse, give back those original inputs
The set of inputs has the same size as the set of outputs
In this case, it could be written (‘000’, ‘001’): (‘101’, ‘010’). For want of a better data structure (let me know if you think of one), enumerating all possible inputs gives us:

A dictionary mapping indistinguishable inputs to indistinguishable outputs
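For the CSWAP circuit from earlier, that dictionary can be rebuilt with a plain-Python sketch like the one below. This is not the quick_simulate helper; GATES, apply, possible_outcomes, forward and backward are hypothetical names for this illustration. Running the circuit backwards simply applies the gates in reverse order, which works here because H, X and CSWAP are each their own inverse.

```python
import math

GATES = [("h", (0,)), ("x", (1,)), ("cswap", (0, 1, 2))]  # the circuit from earlier

def apply(gate, qubits, state):
    """Apply one named gate to a list of 8 amplitudes."""
    s = 1 / math.sqrt(2)
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        if gate == "h":
            q = qubits[0]
            sign = -1 if (i >> q) & 1 else 1
            new[i & ~(1 << q)] += s * amp
            new[i | (1 << q)] += s * amp * sign
        elif gate == "x":
            new[i ^ (1 << qubits[0])] += amp
        elif gate == "cswap":
            c, a, b = qubits
            differ = ((i >> a) & 1) != ((i >> b) & 1)
            j = i ^ (1 << a) ^ (1 << b) if (i >> c) & 1 and differ else i
            new[j] += amp
    return new

def possible_outcomes(bits, gates):
    """Set of bitstrings that can be measured after preparing `bits`."""
    state = [0j] * 8
    state[int(bits, 2)] = 1
    for gate, qubits in gates:
        state = apply(gate, qubits, state)
    return {format(i, "03b") for i, a in enumerate(state) if abs(a) ** 2 > 1e-12}

labels = [format(i, "03b") for i in range(8)]
forward = {x: possible_outcomes(x, GATES) for x in labels}
backward = {y: possible_outcomes(y, GATES[::-1]) for y in labels}

for x in labels:
    print(x, "->", sorted(forward[x]))
```

In this sketch '000' and '001' both lead to {'010', '101'}, and feeding either of those outputs into the reversed circuit leads back to {'000', '001'}.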

You’re probably wondering why I’m talking so much about what must be a contrived example. The thing is, I’ve been running some tests and it seems that the above is true for all circuits (with varying mappings and set sizes). For example, here’s what it looks like for a randomly generated circuit of 5 qubits:

Notice that this time, the outputs and inputs are not quite so interchangeable. For example, ‘00000’ going forwards (first row left to right) gives different results to ‘00000’ going backwards (last row right to left). Nevertheless, there is still this very strong relationship between the two sets. You can test it yourself with my repository here.

So this means we can interpret the outputs of a quantum circuit as the set of inputs that are indistinguishable from the one that was put in. Perhaps quantum computing is just all about distinguishing potential paths through a circuit.

This leads me to speculate that a quantum circuit is constantly running forwards and backwards on the sets of possible inputs/outputs which might explain a bit about the randomness of measurement. The stable structure of the maps shown above (which I’ll try to prove when I actually know what I’m talking about) would guarantee that a sort of stable equilibrium would emerge as it runs backwards and forwards (through the circuit and time).

Note that there’s probably some way of explaining this using linear algebra and properties of unitary matrices, however, I am far more interested in what is physically happening here. In particular, I think it raises the question: at the quantum level, what is the difference between going forwards and backwards?
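One candidate for that linear-algebra explanation, offered as a sketch rather than a proof of the full set structure: for basis states x and y and a circuit with unitary U, the backwards amplitude is the complex conjugate of the forwards one, so the two probabilities match.

```latex
\langle x | U^{\dagger} | y \rangle = \overline{\langle y | U | x \rangle}
\quad\Longrightarrow\quad
\left| \langle x | U^{\dagger} | y \rangle \right|^{2}
  = \left| \langle y | U | x \rangle \right|^{2}
```

In words: y can be measured after preparing x exactly when x can be measured after preparing y on the reversed circuit, and with the same probability.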

I am trying to convince you there is none. You might be thinking “yeah, yeah quantum mechanics breaks another fundamental assumption I had of the universe, bit of a tired trick at this point mate. My clock is still going up last time I checked”.

Indeed your clock may be going up (although I’d keep measuring if I were you). Measurement irreversibly collapses a quantum state so surely that’s what is bringing time back in? Well almost, except I also claim that

Preparation irreversibly ‘uncollapses’ a quantum state. In other words, measuring any given state from a circuit is indistinguishable from preparing that state on the reverse of the circuit.

In that view, consider a measurement as what you could have prepared the state in (and vice versa). This is of course circular (or at least self-referential), but so is bi-directional time! As a state bounces forwards and backwards in a circuit, perhaps it finds itself hopelessly wondering how it got there, too.

If you’ve managed to read this far, then perhaps you would be interested in exploring some of this further. Some interesting next steps would be to also consider the probabilities of various measurements in a set (I’ll speculate they add up to something cool) or maybe to try and find a problem where you’d like to know what else could have given you that answer (e.g. modular division) and use this approach to quickly solve it backwards.

But for now, we’re done.

Conclusion

Let’s return to our original questions and see what we have answered:

Why is the measurement probability the amplitude squared? Because every state is going forward in time, |ψ>, and backwards in time, <ψ| and, trivially, <ψ|ψ> = ψ*ψ = |ψ|²
Why does every operation, U, have to be unitary? Because all quantum circuits run both directions in time, which cancel out: U†U = I. Information is conserved.
How does entanglement work? Because a signal can be sent back in time after the first measurement, ensuring the second one will be consistent.
What is measurement? I still don’t know, but it’s the opposite of causation/preparation (which also doesn’t make sense but is conventionally overlooked)
What actually is a superposition and why can’t I see one? I’m still not sure, so I suppose I’m in a superposition of understanding too!
How does teleportation work? I know that’s the whole reason you came here, but it was just a trick to get you to read my rants about time. Stay tuned for part 4 of my series where I tell you how teleportation REALLY works.

3.5/6 apparently answered? I’ll take that!

Disclaimer: none of these are mainstream answers, nor are they supported by any experts (yet) so don’t take them as gospel. They are simply what I came to believe in the course of writing this article and which I have done my best to convince you of as well.

I would love to hear any and all feedback about this so please feel free to comment on this post or send me an email to edwinagnew1*at*gmail.com

(Now please read the article in reverse)