New papers dance!

Two new papers were recently posted on the arXiv with my first two official PhD students since becoming a faculty member! The earlier paper is titled Efficient online quantum state estimation using a matrix-exponentiated gradient method by Akram Youssry and the more recent paper is Minimax quantum state estimation under Bregman divergence by Maria Quadeer. Both papers are co-authored by Marco Tomamichel and are on the topic of quantum tomography. If you want an expert’s summary of each, look no further than the abstracts. Here, I want to give a slightly more popular summary of the work.

Efficient online quantum state estimation using a matrix-exponentiated gradient method

This work is about a practical algorithm for online quantum tomography. Let’s unpack that. First, the term work. Akram did most of that. Algorithm can be understood to be synonymous with method or approach. It’s just a way, among many possibilities, to do a thing. The thing is called quantum tomography. It’s online because it works on-the-fly as opposed to after-the-fact.

Quantum tomography refers to the problem of assigning a description to physical system that is consistent with the laws of quantum physics. The context of the problem is one of data analysis. It is assumed that experiments on this to-be-determine physical system will be made and the results of measurements are all that will be available. From those measurement results, one needs to assign a mathematical object to the physical system, called the quantum state. So, another phrase for quantum tomography is quantum state estimation.

The laws of quantum physics are painfully abstract and tricky to deal with. Usually, then, quantum state estimation proceeds in two steps: first, get a crude idea of what’s going on, and then find something nearby which satisfies the quantum constraints. The new method we propose automatically satisfies the quantum constraints and is thus more efficient. Akram proved this and performed many simulations of the algorithm doing its thing.

Minimax quantum state estimation under Bregman divergence

This work is more theoretical. You might call it mathematical quantum statistics… quantum mathematical statistics? It doesn’t yet have a name. Anyway, it definitely has those three things in it. The topic is quantum tomography again, but the focus is different. Whereas for the above paper the problem was to devise an algorithm that works fast, the goal here was to understand what the best algorithm can achieve (independent of how fast it might be).

Work along these lines in the past considered a single figure of merit, the thing the defines what “best” means. In this work Maria looked at general figures of merit called Bregman divergences. She proved several theorems about the optimal algorithm and the optimal measurement strategy. For the smallest quantum system, a qubit, a complete answer was worked out in concrete detail.

Both Maria and Akram are presenting their work next week at AQIS 2018 in Nagoya, Japan.

One is the loneliest prime number

You can’t prove 1 is, or is not, prime. You have the freedom to choose whether to include 1 as a prime or not and this choice is either guided by convenience or credulity.

I occasionally get some cruel and bitter criticism from an odd source. I’m putting my response here for two reasons: (1) so I that I can simply refer them to it and not have to repeat myself or engage in the equally impersonal displeasure of internet arguments, and (2) I think there is something interesting to be learned about mathematics, logic, and knowledge more generally.

It all started when I wrote a very controversial book about an extremely taboo topic: mathematics. In my book ABCs of Mathematics, “P is for Prime”. The short, child-friendly description I gave for this was:

A prime number is only divisible by 1 and itself.

I thought I did a pretty good job of reducing the concept and syllables down to a level palatable by a young reader. Oh, boy, was I wrong. Enter: the angriest group of people I have met on the internet.

You see, by the given definition, I had to include 1 as a prime number since, as we should all agree, it is divisible only by 1 and itself.

Big mistake. Because, apparently, it has been drilled into people’s heads that this is a grave error, a misconception that can eventually lead young impressionable minds to a life of crime and possibly even death! It might even end up on a list of banned books!

By a vast majority, people love the book. I am generally happy with the reponse. The baby books I write are not for everyone—I get that. And I do try to take advice from all the feedback I receive on my books. There is always room for improvement. But the intense emotions some people have with the idea of 1 being a prime number is truly perplexing. Here are some examples:

I actually love the book, but there is a big mistake. The number 1 is not a prime number! The book should not be sold like this and needs to be reprinted.

and

1 IS NOT PRIME! How could a supposed math book have an error like this in it? I am disgusted!

Yikes. So what gives? Is 1 prime, or not? The answer is: that’s not a valid question.

Let me explain.

First, let’s look at a typical definition. Compare to, for example, Wikipedia’s entry on prime numbers:

A prime number (or a prime) is a natural number greater than 1 that cannot be formed by multiplying two smaller natural numbers.

Much more precise—no denying that. It’s grammatically correct, but probably hard to parse. I wanted to avoid negative definitions as much as I could in my books. But that’s beside the point. The reason 1 is not a prime is that the definition of prime itself is contorted to exclude it!

OK, so why is that? Well, the answer is probably not as satisfying as you might like: convenience. By excluding 1 as prime, one can state other theorems more concisely. Take the Fundamental Theorem of Arithmetic, for example:

Every integer greater than 1 either is a prime number itself or can be represented as the product of prime numbers and that, moreover, this representation is unique, up to (except for) the order of the factors.

Now, this statement would not be true if 1 were a prime since, for example, 6 = 2 × 3 but also 6 = 2 × 3 × 1 and also 6 = 2 × 3 × 1 × 1, etc. That is, if 1 were prime, the representation would not be unique and the theorem would be false.

However, if we do chose to include 1 as a prime number, all is not lost. Then the Fundamental Theorem of Arithmetic would still be true if it were stated as:

Every integer is a prime number itself or can be represented as the product of prime numbers and that, moreover, this representation is unique, up to (except for) the order of the factors and the number of 1’s.

Which version do you prefer? In either case, both the definition and theorem treat 1 as a special number. I’d argue that in this context, the number 1 is more of an annoyance that gets in the way of the deeper concept behind the theorem. But in mathematics you must be precise with your language. And so 1 must be dealt with as an awkward special case no matter which way you slice it.

So, is 1 prime, or not? Well, it depends on how you define it. But in the end it doesn’t really matter, so long as you are consistent. And understanding that is a much bigger lesson than memorizing some fact you were told in grade school.

The definition given in ABCs of Mathematics is not wrong” any more than all of the other simplifications and analogies I have made are “wrong”. But, in case you were wondering, the second printing will be modified with the hope that everyone can enjoy the book. Even the angry people on the internet deserve to be happy.

Why are there so many symbols in math?

“Mathematics is the language of the universe.” — every science popularizer ever

I am a mathematician. In fact, I have a PhD in mathematics. But, I am terrible at arithmetic. Confused? I certainly would have been if a self-proclaimed mathematician told me that 15 years ago.

The answer to this riddle is simple: math is not numbers. Whenever a glimpse of my research is seen by nearly anyone but another mathematician, they ask where the numbers are. It’s just a bunch of gibberish symbols, they say.

IMG_20171212_090428
The whiteboard in my office on 12 December 2017.

Well, they are right. Without speaking the language, it is just gibberish. But why—why all these symbols?

The symbols are necessary because communicating the ideas requires it. A simple analogy is common human language.

Mandarin Chinese, for example, has many more like-sounding syllables than English. This has led to a great number of visual puns, which have become a large part of Chinese culture. For example, the phrase 福到了(“fortune has arrived”) sounds the same as 福倒了(“fortune is upside down”). Often you will see the character 福 (“fortune”, fú, which you pronounce as ‘foo’ with an ascending pitch) upside down. While, like most puns, this has no literal meaning, it denotes fortune has arrived.

Fudao
Fudao, CC BY-SA 3.0, https://en.wikipedia.org/w/index.php?curid=30725479

Not laughing? OK, well, not even English jokes are funny when they have to be explained, but you get the idea. This pun just doesn’t translate to English. (Amusingly, there is also no simple common word for pun in Chinese.)

The point here is that upside down 福, with its intended emotional response, is not something you can even convey in English. The same is true in mathematics. Ideas can be explained in long-winded and confusing English sentences, but it is much easier if symbols are used.

And, there really is a sense in which the symbols are necessary. Much like the example of 福, most mathematicians use symbols in a way that is just impossible to translate to English, or any other language, without losing most of the meaning.

Here is a small example. In the picture above you will see p(x|θ). First, why θ? (theta, the eighth letter of the Greek alphabet, by the way). That’s just convention—mathematicians love Greek letters. So, you could replace all the θ’s by another symbol and the meaning wouldn’t change. It’s like the difference between writing Chinese using characters or pinyin: 拼音 = pīnyīn.

You might think that it is weird to mix symbols, such as Roman and Greek, but it now very common in many languages, particularly in online conversations. For example, Chinese write 三Q to mean “thank you”, because 三 is 3 and, in English, 3Q sounds like ‘thank you”. In English, and probably all languages now, emojis are mixed with the usual characters to great effect. You could easily write, “Have a nice day. By the way, my mood is happy and I am trying to convey warmth while saying this.” But, “Have a nice day :)” is much easier, and actually better at conveying the message.

OK, so we are cool with Greek letters now, how about  p(x|θ)? That turns out to be easy to translate—it means “the probability of x given θ.” Unfortunately, much like any statement, context is everything. In this case, not even a mathematician could tell you exactly what p(x|θ) means since they have not been told what x or θ means. It like saying “Bob went to place to get thing that she asked for.” An English speaker recognises this as a grammatically correct sentence, but who is “she”, what is the “thing”, and what is the “place”? No one can know without context.

What the English speaker knows is that (probably) a man, named Bob, went to store to purchase something for a woman, whose name we don’t know. The amazing thing is that many more sentences could follow this and an English speaker could easily understand without the context. Have you ever read or listened to a story in which the characters are never named or described? You probably filled in your own context to make the story understandable for you. Maybe that invented context is fluid and changes as you hear more of the story.

The important point is that such actions are not taught. They come from experience—from being immersed in the language and a culture built from it. The same is true in mathematics. A mathematician with experience in probability theory could follow most of what is written on that whiteboard, or at least get the gist of it, without knowing the context. This isn’t something innate or magical—it’s just experience.

Daily activities to promote mathematical fluency in kids

An outline of a day in my life of practicing mathematical literacy. These are activities I do with all my children, ages 0, 3, 5 and 7.

If you perform an internet search on some topic of interest, say “math puzzles for kids”, you end up with literally millions of pages. It can be overwhelming. But, just pick one and go! Don’t over-prepare or have high expectations or push too hard—the vast majority of experiments fail. In fact, failure is one of the most important ways we learn.

Below is an outline of a day in my life of practicing mathematical literacy. These are activities I do with all my children, ages 0, 3, 5 and 7. I may spend longer on the more complex information or tasks with my 7 year old, but I don’t shield my 3 year old completely from it.

Numeracy

I try to constantly be aware of when I am internalizing numbers or mathematical concepts and then encourage my children to participate. These are all things I usually do silently and unconsciously, but now ask my children. Just this morning I have done the following: I asked the children to read the clock; add change; discuss routes on our daily errand run; count and weigh fruits and vegetables; ask how long is left in a video; set a timer on my phone; ask how many crossings are required to tie a shoelace; and so on.

Puzzles

There are many puzzles and games you can play that can promote and improve mathematical thinking which are branded or advertised as math games. For example, today we played the following games: Tic-tac-toe, Dots and Boxes, made mazes, played some Lego (classic blocks), and built a dodecahedron with a magnetic construction set.

Books

Attempting to connect young minds with abstract ideas is relatively new. Our bookshelf contains: Introductory Calculus For Infants by Omi M. Inouye; Non-Euclidean Geometry for Babies and The Pythagorean Theorem for Babies by Fred Carlson; and a great deal of children’s and adult reference books and encyclopedias. Of course, some nights we can’t get through enough Harry Potter and the math has to wait.

Videos

YouTube is a blessing and a curse in our house. We don’t generally let the kids watch without supervision as the value seems to be quite low without constant feedback and discussion. Today we watched a few episodes of Amoeba Sisters and I showed them some MinutePhysics, which is directed more toward adults. I keep those sessions short as the language is often too complex.

Drawing

Drawing and coloring are great ways to relax and have many cognitive benefits. Sometimes I need to give them something to imitate by joining in and other times, like today, I just put a blank sheet and some markers out. My 3 year old has recently graduated from scribbling to drawing discernible faces and intentional shapes. The other day, my 7 year old found Art for Kids Hub, a fun and engaging set of drawing tutorials which has boosted her confidence quite a bit.

Coding

Getting children into computer programming has received a lot of attention in the past couple of years. Each of my 3, 5 and 7 years olds play the games and go through the exercises on Code.org. I find it very helpful when they work simultaneously or in a pair. We didn’t do any coding today on the computer, but we did play Robot Turtles, a board game meant to teach coding skills.

We also went to the park and ran around in circles screaming, for no reason in particular.

Phew! Even when I write it all out, it looks like a lot. But, it only adds up to a few hours spread over an entire day—time I would be spending with them anyway. I try to make life rewarding and engaging not only for them, but for me as well.

What does it mean to excel at math?

“In mathematics you don’t understand things. You just get used to them.” ― John von Neumann

John von Neumann made important scientific discoveries in physics, computer science, statistics, economics, and mathematics itself. He was, by all accounts, a genius. Yet, here he is saying he “just got used” to mathematics. While this was probably a tongue-in-cheek reply to a friend, there is some truth to it. Mathematics is a language and anyone can eventually learn to speak it.

Indeed,balls_to_sphere mathematics is the language by which scientists of all fields communicate—from philosophy to physics. And by mathematics, I don’t mean numbers. Scientists communicate ideas through mental pictures which are often represented by symbols invented just for that purpose. Here is an example: think about a ball. Maybe it is a baseball, or a basketball, or—if you are in Europe—a socc…err… football. Maybe it is the Earth or the Sun. Now try to get rid of the details: the stitching, the colors, the size. What is left? A sphere. You just performed the process of abstraction. A sphere is an idea, a mental image that you can’t touch—it doesn’t exist!

sphere_to_ballWhy is this important, anyway? Well, if I can prove things about spheres, then that ought to apply to any ball in the real world. So, formulas for area and volume, for example, equally apply to baseballs, basketballs, soccer balls, and any other ball. Mathematics is a very powerful way of answering infinitely many questions at once!

Now, it is said that to become an expert at anything, you need 10,000 hours of practice. While not a hard-and-fast rule, it seems to work out in terms of acquiring modern language—10,000 hours probably works out to mid-to-late teens for an adept student. Usually, we don’t start practicing real mathematics until well after we have mastered our first language, in late high-school or college. Why not start those 10,000 hours now with your children?

Sounds great, but where do you start? The bad news is that there are no simple rules. The good news is that it doesn’t really matter where you start. With your children, you could practice numeracy, practice puzzles and games, read books, watch science videos, try to code, draw pictures, or just sit in a quiet room and think. As you do these things, encourage generalization and abstraction. Ask questions and let your child ask questions. The correct answers are not important—it is the process that counts.

I was asked recently to share some tips for parents who want their kids to excel at math and do well in the classroom later on. The trouble is, doing well in the classroom—that is, doing well on standardized tests—doesn’t necessarily correlate with understanding the language of mathematics. If you want to do well on standardized tests, then just practice standardized tests. However, if you want your kids to have the powerful tools of abstraction​ at their disposal and possibly also do well on tests, then teach them the language of mathematics.

The power of simulation: birthday paradox

The birthday paradox goes… in a room of 23 people there is a 50-50 chance that two of them share a birthday.

OK, so the first step in introducing a paradox is to explain why it is a paradox in the first place. One might think that for each person, there is 1/365 chance of another person having the same birthday as them. Indeed, I can think of only one other person I’ve met that has the same birthday as meand he is my twin brother! Since I’ve met far more than 23 people, how can this be true?

This reasoning is flawed for several reasons, the first of which is that the question wasn’t asking about if there was another person in the room with a specific birthdayany pair of people (or more!) can share a birthday to increase the chances of the statement being true.

The complete answer gets heavy into the math, but I want to show you how to convince yourself it is true by simulating the experiment. Simulation is programming a computer or model to act as if the real thing was happening. Usually, you set this up so that the cost of simulation is much less than doing the actual thing. For example, putting a model airplane wing in a wind tunnel is a simulation. I’ve simulate the birthday paradox in a computer programming language called Python and this post is available in notebook-style here. Indeed, this is much easier than being in a room with 23 people.

Below I will not present the code (again, that’s over here), but I  will describe how the simulation works and present the results.

The simulation

Call the number of people we need to ask before we get a repeated birthday n. This is what is called a random variable because its value is not known and may change due to conditions we have no control over (like who happens to be in the room).

Now we simulate an experiment realising a value for n as follows.

  1. Pick a random person and ask their birthday.
  2. Check to see if someone else has given you that answer.
  3. Repeat step 1 and 2 until a birthday is said twice.
  4. Count the number of people that were asked and call that n.

Getting to step 4 constitutes a single experiment. The number that comes out may be n = 2 or n = 100. It all depends on who is in the room. So we repeat all the steps many many times and look at how the numbers fall. The more times we repeat, the more data we obtain and the better our understanding of what’s happening.

Here is what it looks like when we run the experiment one million times.

exponemil
Simulating the birthday paradox. On the horizontal axis is n, the number of people we needed to ask before a repeated birthday was found. We did the experiment one million times and tallied the results.

So what do all those numbers mean? Well, let’s look at how many times n = 2 occurred, for example. In these one million trials, the result 2 occurred 2679 times, which is relatively 0.2679%. Note that this is close to 1/365 ≈ 0.274%, which is expected since the probability that the second person has the same as the first is exactly 1/365. So each number of occurrences divided by one million is approximately the probability that we would see that number in a single experiment.

We can then plot the same data considering the vertical axis the probability of needing to see n people before a repeated birthday.

prob
Same as the previous plot but now each bar is interpreted as a probability.

Adding up the value of each of these bars sums to 100%. This is because one of the values must occur when we do an experiment. OK, so now we can just add up these probabilities starting at n = 2 and increasing until we get to 50%. Visually, it is the number which splits the coloured area above into two equal parts. That number will be the number of people we need to meet to have a 50-50 shot at getting a repeated birthday. Can you guess what it will be?

Drum roll… 23! Tada! The birthday paradox simulated and solved by simulation!

But, wait! There’s more.

What about those leap year babies? In fact, isn’t the assumption that birthdays are equally distributed wrong? If we actually tried this experiment out in real life, would we get 23 or some other number?

Happily, we can test this hypothesis with real data! At least for US births, you can find the data over at fivethirteight’s github page. Here is what the actual distribution looks like.

dist_birth.png
Distribution of births in US from 1994-2014, by day of year.

Perhaps by eye it doesn’t look too uniform. You can clearly see 25 Dec and 31 Dec have massive dips. Much has been written about this and many beautiful visualizations are out there. But, our question is whether this has an effect on the birthday paradox. Perhaps the fact that not many people are born on 25 Dec means it is easy to find a shared birthday on the remaining days, for example. Let’s test this hypothesis by simulating the experiment with the real distribution of birthdays.

To do this, we perform the same 4 steps as above, but randomly sampling answers from the actual distribution of birthdays. The result of another one million experiments is plotted below.

truehist.png
Simulating the birthday paradox on the true distribution of births. On the horizontal axis is n, the number of people we needed to ask before a repeated birthday was found. We perform the experiment one million times and tallied the results.

And the answer is the same! The birthday paradox persists with the actual distribution of birthdays.

Nerd sniping

The above discussion is very good evidence that the birthday paradox is robust to the actual distribution of births. However, it does not constitute a mathematical proof. An experiment can only provide evidence. So I will end this with a technical question for those mathematical curiosos out there. (What I am about to do is also called Nerd Sniping.)

Here is the broad problem: quantify the above observation. I think there is more than one question here. For example, it should be possible to bound the 50-50 threshold as a function of the deviation from a uniform distribution.

(Cover image credit: Ed g2s, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=303792)