eatmore's blog

By eatmore, history, 17 months ago, In English

Here I want to tell my personal opinion of the Microsoft Q# coding contest. First, I'd like to thank Microsoft Quantum team for making a great contest in an unusual format. All the problems were interesting (except D problems ‒ more about them below), and I'm looking forward to similar contests in the future. However, there is a number of issues that I'd like to see discussed and hopefully fixed.

The tools

The first problem with Q# compiler that everyone trying to use it will see is that it is SLOW. On my machine, compiling a trivial Hello World program takes about 45 seconds. This is a well known problem, reported both on GitHub and even here on Codeforces, and frankly it's a PITA to wait so much after every fix to know if your code still compiles (or still doesn't).

But this is only a tip of the iceberg. Before the contest, I decided to read the source code of the compiler and the libraries to learn more about them (unfortunately, the documentation is not comprehensive). And the code I saw was not good: of poor quality, overly complicated and inefficient. Everywhere I looked, there were bugs. Overall, I reported 8 issues before and during the contest, but given that I only looked through a fraction of all code, I'm sure there are many more.

Before looking at the code, I thought that they would parse the entire file using either a parser automatically generated from a grammar, or a manually created one. But actually it was very different. Their compiler is designed for incremental parsing, that is, to re-parse a file more efficiently if only a small part of it changes. To that end, they first split the file into lines, and then they split the lines into "code fragments" (simple statements and parts of compound statements) by scanning the text for occurrences of the characters {, } and ;. Because these characters can also appear in string literals and comments, they try to identify those as well, but this code has a bug which makes it not work with nested string literals. Then they parse each fragment using a parser made using FParsec library, but it is written in such a way that it would re-parse many subexpression multiple times. For example, when parsing bracketed constructs (such as tuples), they would first scan the code to find a closing parenthesis, and then they would apply a parser to a substring between the current position and the closing parenthesis. Also, when parsing an expression, they at some point try to parse it as a function call (which would first parse anything that can be used in place of a function name, such as a tuple or an array indexing expression), and then, if there is no function call, they will parse the same subexpression again (not looking for a function call this time).

Overall, I think that the complexity of their parser overrides any benefit from it being able to parse incrementally (which it doesn't do anyway when compiling from a command line). Another problem is that they "compile" Q# programs by converting them into C#. As a result, it is not possible to use C# keywords in Q# programs (that restriction is actually not necessary, because in C# it is possible to use keywords as identifiers by prefixing them with @, but perhaps the authors of Q# compiler don't know that). Also, if you name your operations or variables in a certain way, Q# compiler may produce C# code that doesn't compile or has unexpected semantics.

Quantum classification

Quantum classification, at least in the way it is implemented in Q#, is a joke. The systems that are actually used for classification, such as neural networks and decision trees, have a very useful property: with a sufficiently big model and enough training, they can learn to approximate an arbitrary function to an arbitrary precision. With quantum models, this is not possible. The set of functions that a quantum model can learn is very limited: specifically, quantum models can only recognize classes of the form $$$q(\mathbf{f})>0$$$, where $$$q$$$ is a quadratic form (a homogeneous polynomial of degree two) and $$$\mathbf{f}$$$ is a vector of features. Essentially, a quantum classifier is exactly as powerful as a linear classifier with threshold zero applied to all pairwise products of features (the proof that it is possible to make a quantum model that corresponds to every quadratic form is not obvious and is left as an exercise to the reader).

Classical preprocessing helps, but only a little. Adding constant elements (method 1) allows adding linear and constant terms (so now it can recognize the classes of the form $$$q(\mathbf{f})+\mathbf{a}\cdot\mathbf{f}+b>0$$$), which makes this method the most useful. Conversely, method 2 is completely useless, because every quadratic form of a tensor product of a feature vector and a constant vector is equivalent to a quadratic form of the feature vector alone. Method 3 essentially allows using degree-4 polynomials instead of degree-2 ones, but this is not very useful without constant terms, and method 4 gives the same as method 1 plus some pairwise products of features (but there is no way to choose which).

From all this, one can notice that instead of training a quantum model it may be more efficient to train an above mentioned linear classifier and then compute an equivalent model. But I didn't do this. During the contest, I tried to do normal training (using TrainSequentialClassifier), but it took too long so I instead made all the models by hand. The process is simple: first, use matplotlib to show the points on the screen, and manually find a description of the class that should be recognized. Then, convert it to a quantum model. This is not difficult: I only used Y rotations, and they are equivalent to rotations of a 2D vector. One of the aspects that is not mentioned in the documentation is that it is the last qubit that is measured (in Z basis) to get the result. The measurement computes the sum of squares of amplitudes of states with the last qubit having value 0 and compares it with the sum of squares of amplitudes of states with the last qubit having value 1. When making a model, I used three tricks. The first trick is to use rotations by $$$\pi$$$ to swap states. By applying multiple controlled rotations, it is possible to permute the states in any way, but I used classical preprocessing to arrange the states to not require many swaps. The second: to make a sum of two numbers, I used rotation by $$$\frac{\pi}{2}$$$ (essentially, a Hadamard gate). From two numbers, it produces their sum and their difference (up to a constant). Then, to get rid of the difference, I used the third trick: when performing a rotation by $$$\frac{\pi}{2}$$$ on a pair of states with amplitudes $$$(x,0)$$$, the result is two states with equal amplitudes, which cancel each other. From this: having constants (from classical preprocessing), sums, being able to discard numbers, and comparing sums of squares, it is possible to solve all five D problems.

As I said, I am looking forward to future Q# contests. I hope that the next contest will have harder problems (29 people solving every problem is too many), better tools and no quantum classification problems.

 
 
 
 
  • Vote: I like it
  • +264
  • Vote: I do not like it

»
17 months ago, # |
  Vote: I like it +11 Vote: I do not like it

Maybe it'd be a good idea to make two contests, one educational (like this, where you train that "quantum thinking" skills) and one with "real" problems like from regular CF contests, that require building quantum algorithms?

»
17 months ago, # |
  Vote: I like it +5 Vote: I do not like it

In the graphical representation they show for an example classifier, isn't it the first qubit (index 0) that is measured?

  • »
    »
    17 months ago, # ^ |
      Vote: I like it +11 Vote: I do not like it

    Everywhere I looked, there were bugs.

    Their documentation is no better than their code. I see some other problems with that illustration:

    • They use X and Z rotations, but you'd actually want to use Y rotations exclusively because they keep the amplitudes real.
    • The last two rotations are redundant because they don't affect the qubit that is measured.

    Here is the code that is doing the measurement. For some reason, they are writing this instead of just M(Tail(qs)).

    • »
      »
      »
      17 months ago, # ^ |
        Vote: I like it +11 Vote: I do not like it

      I see. Well, since it's the last qubit that is actually measured, at least the last two rotations aren't redundant anymore.

  • »
    »
    17 months ago, # ^ |
    Rev. 2   Vote: I like it +19 Vote: I do not like it

    The graphical representation is wrong, the last qubit is measured.

    Otherwise the last two rotations would be meaningless, and indeed they seem meaningless from the graphical representation but they're actually not.

    • »
      »
      »
      17 months ago, # ^ |
        Vote: I like it +5 Vote: I do not like it

      And here I was, wondering what those last two rotations were for. :)

»
17 months ago, # |
  Vote: I like it +20 Vote: I do not like it

Thank you for the feedback. I will pass it along to the team working on the compiler and the libraries.

P.S. 29 seems just about right to me :-)

  • »
    »
    17 months ago, # ^ |
      Vote: I like it +26 Vote: I do not like it

    I'd wager the number of full solves would be significantly higher if the QML documentation was better. It was extremely frustrating; you'd have to poke around the source code to find information the contest expects us to have.