New papers dance!

Two new papers were recently posted on the arXiv with my first two official PhD students since becoming a faculty member! The earlier paper is titled Efficient online quantum state estimation using a matrix-exponentiated gradient method by Akram Youssry and the more recent paper is Minimax quantum state estimation under Bregman divergence by Maria Quadeer. Both papers are co-authored by Marco Tomamichel and are on the topic of quantum tomography. If you want an expert’s summary of each, look no further than the abstracts. Here, I want to give a slightly more popular summary of the work.

Efficient online quantum state estimation using a matrix-exponentiated gradient method

This work is about a practical algorithm for online quantum tomography. Let’s unpack that. First, the term work. Akram did most of that. Algorithm can be understood to be synonymous with method or approach. It’s just a way, among many possibilities, to do a thing. The thing is called quantum tomography. It’s online because it works on-the-fly as opposed to after-the-fact.

Quantum tomography refers to the problem of assigning a description to a physical system that is consistent with the laws of quantum physics. The context of the problem is one of data analysis. It is assumed that experiments on this to-be-determined physical system will be made and the results of measurements are all that will be available. From those measurement results, one needs to assign a mathematical object to the physical system, called the quantum state. So, another phrase for quantum tomography is quantum state estimation.

The laws of quantum physics are painfully abstract and tricky to deal with. Usually, then, quantum state estimation proceeds in two steps: first, get a crude idea of what’s going on, and then find something nearby which satisfies the quantum constraints. The new method we propose automatically satisfies the quantum constraints and is thus more efficient. Akram proved this and performed many simulations of the algorithm doing its thing.

Minimax quantum state estimation under Bregman divergence

This work is more theoretical. You might call it mathematical quantum statistics… quantum mathematical statistics? It doesn’t yet have a name. Anyway, it definitely has those three things in it. The topic is quantum tomography again, but the focus is different. Whereas for the above paper the problem was to devise an algorithm that works fast, the goal here was to understand what the best algorithm can achieve (independent of how fast it might be).

Work along these lines in the past considered a single figure of merit, the thing that defines what “best” means. In this work Maria looked at general figures of merit called Bregman divergences. She proved several theorems about the optimal algorithm and the optimal measurement strategy. For the smallest quantum system, a qubit, a complete answer was worked out in concrete detail.
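If you want something concrete, the classical cousin of a Bregman divergence is easy to write down: given a convex function f, it is D_f(x, y) = f(x) - f(y) - ⟨∇f(y), x - y⟩. The quantum versions in the paper act on density matrices, but a minimal classical sketch (the function names and the probability vectors here are my own illustration, not from the paper) shows how familiar figures of merit drop out as special cases:

```python
import numpy as np

def bregman(f, grad_f, x, y):
    """Classical Bregman divergence: f(x) - f(y) - <grad f(y), x - y>."""
    return f(x) - f(y) - np.dot(grad_f(y), x - y)

x = np.array([0.2, 0.8])  # two probability vectors
y = np.array([0.5, 0.5])

# f = squared norm  ->  squared Euclidean distance
sq = bregman(lambda v: np.dot(v, v), lambda v: 2 * v, x, y)
print(sq, np.sum((x - y)**2))  # these agree

# f = negative Shannon entropy  ->  relative entropy (KL divergence)
kl = bregman(lambda v: np.sum(v * np.log(v)),
             lambda v: np.log(v) + 1, x, y)
print(kl, np.sum(x * np.log(x / y)))  # these agree
```

Choosing f to be the squared norm recovers squared Euclidean distance, while negative Shannon entropy recovers relative entropy, whose quantum analogue is a standard figure of merit in tomography.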

Both Maria and Akram are presenting their work next week at AQIS 2018 in Nagoya, Japan.

Quantum computing worst case scenario: we are Lovelace and Babbage

As we approach the peak of the second hype cycle of quantum computing, I thought it might be useful to consider the possible analogies to other technological timelines of the past. Here are three.

Considered Realism

We look most like Lovelace and Babbage, historical figures before their time. That is, many conceptual, technological, and societal shifts need to happen before—hundreds of years from now—future scientists say “hey, they were on to something”.

Charles Babbage is often described as the “father of the computer” and he is credited with inventing the digital computer. You might be forgiven, then, if you thought he actually built one. The Analytical Engine, Babbage’s proposed general purpose computer, was never built. Ada Lovelace is credited with creating the first computer program. But, again, the computer didn’t exist. So the program is not what you are currently imagining—probably, like, Microsoft Excel, but with parchment?

By the time computing began in earnest, Lovelace and Babbage were essentially forgotten. Eventually, historians restored them to their former glory—and rightfully so as they were indeed visionaries. Lovelace anticipated many ideas in theoretical computer science. However, the academic atmosphere at the time lacked the language and understanding to appreciate it.

Perhaps the same is true of quantum computation? After all, we love to tout the mystery of it all. Does this point to a lack of understanding comparable to that in computing 200 years ago?

This I see as the worst case scenario for quantum computation. We are missing several conceptual—and possibly societal—ideas to articulate this thing which obviously has merit. Eventually, humanity will have a quantum computer. But, will that future civilisation look at us as their contemporaries or a bunch of idiots mostly interested in killing each other while a few of our social elite played with ideas of quantum information?

Cautious Optimism

We are on the cusp of a quantum AI winter. We’re in for a long calm before any storm.

This is probably where most academic quantum scientists sit. We’ve seen 10-year roadmaps, 20-year roadmaps, even 50-year roadmaps. The truth is that every “scalable” proposal for quantum technology contains a little magic. We really don’t know what secret sauce is going to suddenly scale us up to a quantum computer.

On the other hand, very very few scientists believe quantum computing to be impossible—it’s going to happen eventually. At the same time, most would also not bet their own money on it happening any time soon. And, if most scientists are correct, the hype doesn’t match reality and we’re headed for a crash—a crash in funding, a crash in interest, and—worst of all—a crash in trust.

Some would argue that there are too many basic science questions unanswered before we harness the full potential of this theory that even its practitioners continue to call strange, weird, and counterintuitive. The science will march on anyway, though. Memes with truth and merit have a habit of slow and steady longevity. The ideas will evolve and—much like AI—eventually become mainstream, probably in our lifetime.

Unabated Opportunism

We will follow the same steady forward march that digital computers did the past 50 years.

If you are involved with a start-up company with an awkwardly placed “Q” in its name, this is where you sit. You believe our current devices are “quantum ENIAC machines”. Following the historical trajectory of classical computers, we just need some competitive players making a steady stream of breakthroughs and—voila!—quantum iPads for your alchemy simulations in no time. Along the way, we will reap continuing benefits from the spin-offs of quantum tech.

This is the quantum tech party line: quantum supremacy (yep, that’s a term of art now) is near. We are on the precipice of a technological—no, societal—revolution. It’s a new space race with equally high stakes. Get your Series A while the gettin’s good.

Like it or not, this is the best case scenario for the field. Scientists like to argue about what the “true” resource for quantum computation is. Turns out, it was money all along. Perhaps the hype will create a self-fulfilling prophecy that draws the hobbyists and tinkerers that fueled much of the digital revolution. Can we engineer such a situation? I think we better find that out sooner rather than later.



Estimation… with quantum technology… using machine learning… on the blockchain

A snarky academic joke which might actually be interesting (but still a snarky joke).


A device verification protocol using quantum technology, machine learning, and blockchain is outlined. The self-learning protocol, SKYNET, uses quantum resources to adaptively come to know itself. The data integrity is guaranteed with blockchain technology using the FelixBlochChain.


You may have a problem. Maybe you’re interested in leveraging the new economy to maximize your B2B ROI in the mission-critical logistic sector. Maybe, like some of the administration at an unnamed university, you like to annoy your faculty with bullshit about innovation mindshare in the enterprise market. Or, maybe like me, you’d like to solve the problem of verifying the operation of a physical device. Whatever your problem, you know about the new tech hype: quantum, machine learning, and blockchain. Could one of these solve your problem? Could you really impress your boss by suggesting the use of one of these buzzwords? Yes. Yes, you can.

Here I will solve my problem using all the hype. This is the ultimate evolution of disruptive tech. Synergy of quantum and machine learning is already a hot topic1. But this is all in-the-box. Now maybe you thought I was going outside-the-box to quantum agent-based learning or quantum artificial intelligence—but, no! We go even deeper, looking into the box that was outside the box—the meta-box, as it were. This is where quantum self-learning sits. Self-learning is a protocol wherein the quantum device itself comes to learn its own description. The protocol is called Self Knowing Yielding Nearly Extremal Targets (SKYNET). If that was hard to follow, it is depicted below.

Inside the box is where the low hanging fruit lies—pip install tensorflow type stuff. Outside the box is true quantum learning, where a “quantum agent” lives. But even further outside-the-meta-box is this work, quantum self-learning—SKYNET.

Blockchain is the technology behind bitcoin2 and many internet scams. The core protocol was quickly realised to be applicable beyond digital currency and has been suggested to solve problems in health, logistics, bananas, and more. Here I introduce FelixBlochChain—a data ledger which stores runs of experimental outcomes (transactions) in blocks. The data chain is an immutable database and can easily be delocalised. As a way to solve the data integrity problem, this could be one of the few legitimate, non-scammy uses of blockchain. So, if you want to give me money for that, consider this the whitepaper.



Above: the conceptual problem. Below: the problem cast in its purest form using the formalism of quantum mechanics.

The problem is succinctly described above. Naively, it seems we desire a description of an unknown process. A complete description of such a process using traditional means is known as quantum process tomography in the physics community3. However, by applying some higher-order thinking, the envelope can be pushed and a quantum solution can be sought. Quantum process tomography is, after all, data-intensive and not scalable.

The solution proposed is shown below. The paradigm shift is a reverse-datafication which breaks through the clutter of the data-overloaded quantum process tomography.

The proposed quantum-centric approach, called self-learning, is one wherein the device itself learns to know itself. Whoa. 

It might seem like performing a measurement of \{|\psi\rangle\!\langle \psi|, \mathbb I - |\psi\rangle\!\langle \psi|\} is the correct choice since this would certainly produce a deterministic outcome when V = U. However, there are many other unitaries which would do the same for a fixed choice of |\psi\rangle. One solution is to turn to repeating the experiment many times with a complete set of input states. However, this gets us nearly back to quantum process tomography—killing any advantage that might have been had with our quantum resource.


Schematic of the self-learning protocol, SKYNET. Notice me, Senpai!

This is addressed by drawing inspiration from ancilla-assisted quantum process tomography4. This is depicted above. Now the naive looking measurement, \{|\mathbb I\rangle\!\langle\mathbb I |, \mathbb I - |\mathbb I\rangle\!\langle \mathbb I|\}, is a viable choice as

|\langle\mathbb I |V^\dagger U \otimes \mathbb I |\mathbb I\rangle|^2 = |\langle V | U\rangle|^2,

where |U\rangle = U\otimes \mathbb I |\mathbb I\rangle. This is exactly the entanglement fidelity or channel fidelity5. Now, we have |\langle V | U\rangle| = 1 \Leftrightarrow U = V, and we’re in business.

Though |\langle V | U\rangle| is not accessible directly, it can be approximated with the estimator P(V) = \frac{n}{N}, where N is the number of trials and n is the number of successes. Clearly, \mathbb E[P(V)] = |\langle V | U\rangle|^2.
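To make the estimator concrete, here is a minimal numerical sketch. It uses the identity ⟨V|U⟩ = Tr(V†U)/d (which follows from |U⟩ = U⊗I|I⟩ with |I⟩ the normalised maximally entangled state) and simulates the N binary-outcome experiments as Bernoulli draws; the function names and the example rotation gate are my own illustration, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def channel_fidelity(U, V):
    """|<V|U>|^2 = |Tr(V^dag U)|^2 / d^2, using <V|U> = Tr(V^dag U)/d."""
    d = U.shape[0]
    return abs(np.trace(V.conj().T @ U))**2 / d**2

def estimate_fidelity(U, V, N):
    """The estimator P(V) = n/N from N simulated binary-outcome trials."""
    p = channel_fidelity(U, V)   # success probability of each trial
    n = rng.binomial(N, p)       # number of successes
    return n / N

# Example: a small qubit rotation measured against the identity as a guess
theta = 0.1
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
V = np.eye(2)
print(channel_fidelity(U, V))         # true value, cos(theta)^2
print(estimate_fidelity(U, V, 1000))  # noisy estimate n/N
```

Since each trial is a Bernoulli draw with success probability |⟨V|U⟩|², the estimator is unbiased, exactly as claimed above.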

Thus, we are left with the following optimisation problem:
\max_{V} \mathbb E[P(V)] \label{eq:opt},

subject to V^\dagger V= \mathbb I. This is exactly the type of problem suitable for the gradient-free cousin of stochastic gradient ascent (of deep learning fame), called simultaneous perturbation stochastic approximation6. I’ll skip to the conclusion and give you the protocol. Each epoch consists of two experiments and an update rule:

V_{k+1} = V_{k} + \frac12\alpha_k \beta_k^{-1} (P(V+\beta_k \triangle_k) - P(V-\beta_k \triangle_k))\triangle_k.

Here V_0 is some arbitrary starting unitary (I chose \mathbb I). The gain sequences \alpha_k, \beta_k are chosen as prescribed by Spall6. The main advantage of this protocol is \triangle_k, which is a random direction in unitary-space. Each epoch, a random direction is chosen, which guarantees an unbiased estimation of the gradient and avoids all the measurements necessary to estimate the exact gradient. As applied to the estimation of quantum gates, this can be seen as a generalisation of Self-guided quantum tomography7 beyond pure quantum states.
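Here is a minimal sketch of the protocol in that spirit. To keep it self-contained I parametrise qubit unitaries by Pauli rotation angles and use noiseless fidelity evaluations in place of the finite-sample estimates P(V); the gain-sequence exponents are Spall’s standard recommendations, and everything else (the names, the “true” gate) is my own illustration rather than the paper’s actual implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

I2 = np.eye(2, dtype=complex)
PAULIS = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]

def unitary(theta):
    """exp(-i theta . sigma) for a qubit, computed analytically."""
    a = np.linalg.norm(theta)
    if a == 0:
        return I2.copy()
    n = theta / a
    return np.cos(a) * I2 - 1j * np.sin(a) * sum(ni * P for ni, P in zip(n, PAULIS))

def fidelity(theta, U):
    """Channel fidelity |<V|U>|^2 = |Tr(V^dag U)|^2 / 4 for qubits."""
    V = unitary(theta)
    return abs(np.trace(V.conj().T @ U))**2 / 4

# The "true" gate SKYNET must come to know
U = unitary(np.array([0.3, -0.2, 0.5]))

theta = np.zeros(3)                 # start at V_0 = identity
for k in range(300):                # each epoch: two evaluations, one update
    a_k = 0.2 / (k + 1)**0.602      # gain sequences per Spall
    c_k = 0.1 / (k + 1)**0.101
    delta = rng.choice([-1.0, 1.0], size=3)   # random direction
    g = (fidelity(theta + c_k * delta, U)
         - fidelity(theta - c_k * delta, U)) / (2 * c_k)
    theta = theta + a_k * g * delta           # ascend the estimated gradient

print(fidelity(theta, U))  # should approach 1
```

Each epoch costs just two evaluations no matter how many parameters there are, which is exactly the appeal of SPSA over a full finite-difference gradient.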

To ensure integrity of the data—to make sure I’m not lying, fudging the data, p-hacking, or post-selecting—a blochchain-based solution is implemented. In analogy with the original bitcoin proposal, each experimental datum is a transaction. After a set number of epochs, a block is added to the datachain. Since this is not implemented in a peer-to-peer network, I have the datachain—called FelixBlochChain—tweet the block hashes at @FelixBlochChain. This provides a timestamp and validation that the data taken was that used to produce the final result.
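The mechanics are simple enough to sketch in a few lines. This toy version (the function and field names are my own, not FelixBlochChain’s actual implementation) hash-links blocks of outcomes so that altering any earlier datum invalidates every later hash:

```python
import hashlib
import json
import time

def make_block(transactions, prev_hash):
    """Bundle a batch of experimental outcomes into a hash-linked block."""
    block = {
        "time": time.time(),
        "transactions": transactions,  # one experimental datum per entry
        "prev_hash": prev_hash,        # links this block to the previous one
    }
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

# A toy data ledger: start from a genesis block, then append each epoch
chain = [make_block(["genesis"], prev_hash="0" * 64)]
for epoch_outcomes in [[0, 1, 1, 0], [1, 1, 0, 1]]:
    chain.append(make_block(epoch_outcomes, prev_hash=chain[-1]["hash"]))

# Every block commits to the hash of the one before it
assert chain[2]["prev_hash"] == chain[1]["hash"]
```

Publishing the block hashes (say, by tweeting them) timestamps the data: any later fudging would change the hashes and break the public chain.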


SKYNET finds a description of its own process. Each N is a different number of bits per epoch. The shaded region is the interquartile range over 100 trials using a randomly selected “true” gate. The solid black lines are fits which suggest the expected 1/\sqrt{N} performance.

Speaking of final result, it seems SKYNET works quite well, as shown above. There is still much to do—but now that SKYNET is online, maybe that’s the least of our worries. In any case, go download the source8 and have fun!


The author thanks the quantum technology start-up community for inspiring this work. I probably shouldn’t say this was financially supported by ARC DE170100421.

  1. V. Dunjko and H. J. Briegel, Machine learning and artificial intelligence in the quantum domain, arXiv:1709.02779 (2017)
  2. S. Nakamoto, Bitcoin: A peer-to-peer electronic cash system, (2008)
  3. I. L. Chuang and M. A. Nielsen, Prescription for experimental determination of the dynamics of a quantum black box, Journal of Modern Optics 44, 2455 (1997)
  4. J. B. Altepeter, D. Branning, E. Jeffrey, T. C. Wei, P. G. Kwiat, R. T. Thew, J. L. O’Brien, M. A. Nielsen, and A. G. White, Ancilla-assisted quantum process tomography, Phys. Rev. Lett. 90, 193601 (2003)
  5. B. Schumacher, Sending quantum entanglement through noisy channels, arXiv:quant-ph/9604023 (1996)
  6. J. C. Spall, Multivariate stochastic approximation using a simultaneous perturbation gradient approximation, IEEE Transactions on Automatic Control 37, 332 (1992)
  7. C. Ferrie, Self-guided quantum tomography, Physical Review Letters 113, 190404 (2014)
  8. The source code for this work is available at

Why are there so many symbols in math?

“Mathematics is the language of the universe.” — every science popularizer ever

I am a mathematician. In fact, I have a PhD in mathematics. But, I am terrible at arithmetic. Confused? I certainly would have been if a self-proclaimed mathematician told me that 15 years ago.

The answer to this riddle is simple: math is not numbers. Whenever a glimpse of my research is seen by nearly anyone but another mathematician, they ask where the numbers are. It’s just a bunch of gibberish symbols, they say.

The whiteboard in my office on 12 December 2017.

Well, they are right. Without speaking the language, it is just gibberish. But why—why all these symbols?

The symbols are necessary because communicating the ideas requires it. A simple analogy is common human language.

Mandarin Chinese, for example, has many more like-sounding syllables than English. This has led to a great number of visual puns, which have become a large part of Chinese culture. For example, the phrase 福到了(“fortune has arrived”) sounds the same as 福倒了(“fortune is upside down”). Often you will see the character 福 (“fortune”, fú, which you pronounce as ‘foo’ with an ascending pitch) upside down. While, like most puns, this has no literal meaning, it denotes fortune has arrived.

Fudao, CC BY-SA 3.0,

Not laughing? OK, well, not even English jokes are funny when they have to be explained, but you get the idea. This pun just doesn’t translate to English. (Amusingly, there is also no simple common word for pun in Chinese.)

The point here is that upside down 福, with its intended emotional response, is not something you can even convey in English. The same is true in mathematics. Ideas can be explained in long-winded and confusing English sentences, but it is much easier if symbols are used.

And, there really is a sense in which the symbols are necessary. Much like the example of 福, most mathematicians use symbols in a way that is just impossible to translate to English, or any other language, without losing most of the meaning.

Here is a small example. In the picture above you will see p(x|θ). First, why θ? (theta, the eighth letter of the Greek alphabet, by the way). That’s just convention—mathematicians love Greek letters. So, you could replace all the θ’s by another symbol and the meaning wouldn’t change. It’s like the difference between writing Chinese using characters or pinyin: 拼音 = pīnyīn.

You might think that it is weird to mix symbols, such as Roman and Greek, but it is now very common in many languages, particularly in online conversations. For example, Chinese speakers write 三Q to mean “thank you”, because 三 is 3 and, in English, 3Q sounds like “thank you”. In English, and probably all languages now, emojis are mixed with the usual characters to great effect. You could easily write, “Have a nice day. By the way, my mood is happy and I am trying to convey warmth while saying this.” But, “Have a nice day :)” is much easier, and actually better at conveying the message.

OK, so we are cool with Greek letters now, how about p(x|θ)? That turns out to be easy to translate—it means “the probability of x given θ.” Unfortunately, much like any statement, context is everything. In this case, not even a mathematician could tell you exactly what p(x|θ) means since they have not been told what x or θ means. It’s like saying “Bob went to place to get thing that she asked for.” An English speaker recognises this as a grammatically correct sentence, but who is “she”, what is the “thing”, and what is the “place”? No one can know without context.

What the English speaker knows is that (probably) a man, named Bob, went to a store to purchase something for a woman, whose name we don’t know. The amazing thing is that many more sentences could follow this and an English speaker could easily understand without the context. Have you ever read or listened to a story in which the characters are never named or described? You probably filled in your own context to make the story understandable for you. Maybe that invented context is fluid and changes as you hear more of the story.

The important point is that such actions are not taught. They come from experience—from being immersed in the language and a culture built from it. The same is true in mathematics. A mathematician with experience in probability theory could follow most of what is written on that whiteboard, or at least get the gist of it, without knowing the context. This isn’t something innate or magical—it’s just experience.

Milking a new theory of physics

For the first time, physicists have found a new fundamental state of cow, challenging the current standard model. Coined the cubic cow, the ground-breaking new discovery is already re-writing the rules of physics.

A team of physicists at Stanford and Harvard University have nothing to do with this but you are probably already impressed by the name drop. Dr. Chris Ferrie, who is currently between jobs, together with a team of his own children stumbled upon the discovery, which was recently published in Nature Communications*.

Image credit: Ingrid Kallick

The spherical theory of cow had stood unchallenged for over 50 years, and even longer if a Russian physicist is reading this. The spherical cow theory led to many discoveries also based on O(3) symmetries. However, spherical cows have not proven practically useful from a technological perspective. “Spherical cows are prone to natural environmental errors, whereas our discovery digitizes the symmetry of cow,” Ferrie said.

Just as the digital computer has revolutionized computing technology, this new digital cow model could revolutionize innovation disrupting cross-industry ecosystems, or something.

Lead author Maxwell Ferrie already has far-reaching applications for the result. “I like dinosaurs,” he said. Notwithstanding these future aspirations, the team is sure to be milking this new theory for all it’s worth.

* Not really, but this dumping ground for failed hypesearch has a bar so low you might as well believe it.

No one is going to take you seriously

I make jokes. I do science. I make jokes while doing science.

Recently, at the Australian Institute of Physics Congress I presented this poster:

I think it was generally well received. Of course it got lots of double takes and laughs, but was it a good scientific poster? One of my senior colleagues was of mixed minds, eventually concluding with some familiar life advice:

Yes, I admit it is funny. But, eventually it will catch up with you. No one is going to take you seriously. You will not be seen as a serious scientist.

Good—because I am not a serious scientist. I am a (hopefully) humorous scientist, but a scientist nonetheless.

I’m going to get straight to the point with my own advice: avoid serious scientists at all costs. They are either psychopaths or sycophants. I can’t find it in me to be either. So I’ll continue doing science, and having a bit of fun while I’m at it. You only science once, right?


How quantum mechanics turned me into a Bayesian

A few elder-statesmen of quantum theory gathered together while a handful of students listened in eagerly. Paraphrasing, one of them—quite seriously—said, “I don’t think any of the interpretations are logically consistent… but there is this ‘transactional interpretation’, where influences come from the future, that might be the only consistent one.” The students nodded their heads in agreement—I walked away.

Bayesianism is (some would say) a radical alternative philosophy and practice for both understanding probability and performing statistical analysis. So, like all young contrarian students of science, I was intrigued when I first found Bayesian probability. But where is the fun in blaming this on my own faults? I’m going to blame someone else—I’m going to blame it on Chris Fuchs.

On two occasions the Perimeter Institute for Theoretical Physics (PI) hosted lectures on the Foundations of Quantum Mechanics. I was lucky enough to be a graduate student in Waterloo at the time. The first, in 2007, was my first taste of the field. It was exciting to hear the experts at the forefront speaking about deep implications for physics and, indeed, even the philosophy of science itself. I knew then that this was the area I wanted to work in.

However, I quickly became disillusioned. The literature was plagued by lazy physicists posing as armchair philosophers. There was no interest in real problems—only the pandering of borderline pseudoscience. It’s no wonder—why bother doing hard work and difficult mathematics when peddling quantum mysticism is what gets you press?

I stayed, though, because there were several researchers at PI who seemed interested in solving real, technical problems, and they were doing so using techniques from another field I had already worked in: Quantum Information Theory. I learned an immense amount from Robin Blume-Kohout, Rob Spekkens and Lucien Hardy while there, but the one who left a lasting impression was Chris Fuchs.

Before we get to Fuchs, though, let’s back up for a moment: just what is this Quantum Foundations thing, and what has it got to do with Bayesianism? As you know, quantum theory dictates that the world is uncertain. That is, as a scientific theory, it makes only probabilistic predictions. Many of the philosophical problems and misunderstandings of quantum theory can be traced back to this fact. Thus, if one really wants to understand quantum theory, one ought to understand probability first. Easy, right?

Nope. As it turns out, more people argue about how to interpret the seemingly simple and everyday concept of probability than about our most sophisticated and complex physical theories. Generally speaking, there are two camps in the interpretations of probability: frequentists and Bayesians. As noted, every student begins as one of the former. It was in 2010 when my conversion to the latter was complete.

Summary of the interpretations of quantum theory.

In 2010, PI hosted its second course on the foundations of quantum theory. This time around I had a few years of experience under my belt and my bullshit detectors were on high alert. My final assignment was to summarize the course, as shown above. The only lectures that didn’t leave me disappointed were Chris Fuchs’. Because I had been reading up on Bayesian probability anyway, his “Quantum Bayesian” interpretation of quantum theory just clicked.

And it wasn’t just about philosophy. Concurrently, I was taking a great course on Stochastic Processes from Matt Scott. Most of this field takes an objective (frequentist) view of probability. Matt was patient with my constant questions on how to phrase the concepts in terms of the subjective Bayesian view. I was starting to feel a bit overwhelmed with the burden of translating everything to the new framework… then it happened.


The assignment question was as follows: show that \int_{-\infty}^\infty p(x_2,x_1;t_2-t_1)\, dx_1 = p(x_2;t_2), the simplest case of the Chapman-Kolmogorov equation. What you were supposed to do is use the Markov property, p(x_2,x_1;t_2-t_1) = p(x_2|x_1;t_2-t_1)p(x_1;t_1), and integrate. It was a straightforward, but tedious, calculation. Here is what I wrote: first, p(x_2,x_1;t_2-t_1) = p(x_1|x_2;t_1-t_2)p(x_2;t_2). Since p(x_1|x_2;t_1-t_2) is a probability, it integrates to 1. Done.

This was seen as unphysical because t_1-t_2 is a negative time. So what? I thought. Probabilities are not physical; they are subjective inferences. If I want to consider negative time to help me do my calculation, so be it. After all, I considered negative money to get me into university. But what I couldn’t believe is how difficult it was to convince others the solution was correct. It was at that moment I realized how powerful a slight change of view can be. I was a Bayesian.
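For the curious, the marginalization itself is easy to check numerically. Here is a sketch using Wiener-process (Brownian motion) densities as the concrete example; the choice of process is my own, not the assignment’s. For this process p(x;t) is a zero-mean Gaussian with variance t and the transition density p(x_2|x_1;τ) is a Gaussian centred at x_1 with variance τ:

```python
import numpy as np

def gauss(x, mean, var):
    """Gaussian density with the given mean and variance."""
    return np.exp(-(x - mean)**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

t1, t2, x2 = 1.0, 3.0, 0.7
x1 = np.linspace(-30.0, 30.0, 200001)
dx = x1[1] - x1[0]

# joint density: p(x2, x1; t2 - t1) = p(x2 | x1; t2 - t1) p(x1; t1)
joint = gauss(x2, x1, t2 - t1) * gauss(x1, 0.0, t1)

lhs = joint.sum() * dx      # integrate out x1 (Chapman-Kolmogorov)
rhs = gauss(x2, 0.0, t2)    # p(x2; t2)
print(lhs, rhs)             # the two sides agree
```

The tedious route grinds through the Gaussian integral by hand; the Bayesian shortcut simply notes that p(x_1|x_2;t_1-t_2) is itself a normalised density in x_1.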