Nisiyama_Suzune's blog

By Nisiyama_Suzune, 6 months ago, In English,

If you've ever taken some lessons on competitive programming, chances are that you have already heard about one of the most famous formula: the Möbius inversion. This article is aimed to provide some basic insight on what is the Möbius inversion, as well as how to apply it in various programming tasks.


If you are not familiar with the linear sieve and multiplicative functions, it is recommended that you read about them first here.

I will introduce some frequently used notations and lemmas first.


  1. [P] refers to the boolean expression, i.e. [P] = 1 when P is true, and 0 otherwise.
  2. x refers to rounding x down to the nearest integer. Thus refers to the integer division.
  3. d|n means that d can divide n (without a remainder).

The following functions are all multiplicative functions, where p is a prime number and k is a positive integer.

  1. The constant function I(pk) = 1.
  2. The identity function Id(pk) = pk.
  3. The power function Ida(pk) = pak, where a is constant.
  4. The unit function .
  5. The divisor function , denoting the sum of the a-th powers of all the positive divisors of the number.
  6. The Möbius function μ(pk) = [k = 0] - [k = 1].
  7. The Euler's totient function φ(pk) = pk - pk - 1.


I have some my unofficial names for these frequently used conclusions. If you happen to know any more commonly used name for them, you are more than welcome to tell me.

  • ( The integer division lemma ) For every positive integer p, q and r, .

Proof: We know that . Hence . Since the fraction part of cannot exceed , we achieve the conclusion.

  • ( The harmonic lemma ) Consider the harmonic sequence on integer division .
  1. The sequence is non-increasing, and there are at most different elements.
  2. The sum of the sequence is approximate to .

Proof: Denote . For every i not exceeding , there will be no more than values in the domain of d(i), so there can be at most different values for d(i). For the rest part of i greater than , we can infer that and is a positive integer, so there can be at most different values for d(i). Therefore we know that there are at most different elements in the sequence.

Since the Euler–Mascheroni constant states that , we know that , so the sum of the original sequence is approximate to .

Moreover, it is actually possible to exploit the property that the sequence has at most different elements, and write a loop that runs in complexity to iterate through every possible value of d(i), using the fact that the greatest integer x satisfying d(x) = d(i) is . The piece of code below demonstrates one way to program it.

for (int i = 1, la; i <= n; i = la + 1) {
	la = n / (n / i);
	//n / x yields the same value for i <= x <= la.
  • ( The transitive property of multiplicative functions )
  1. If both f(x) and g(x) are multiplicative, then h(x) = f(x)g(x) is also multiplicative.
  2. If both f(x) and g(x) are multiplicative, then their Dirichlet convolution is also multiplicative. Specifically, if f(x) is multiplicative, is also multiplicative.

The proof is more mathematical, which I will skip here.

What is the Möbius inversion?

According to Wikipedia, the Möbius inversion states that:

If for every positive integer n, then , where μ(x) is the Möbius function, which is multiplicative and satisfies f(1) = 1, f(p) =  - 1, f(pk) = 0 for any prime number p and any integer k ≥ 2. ( It is worth noting that you can pre-compute all values of the Möbius function from 1 to n using the linear sieve. )

However, the definition alone does not mean much (unless the problem statement explicitly gives you something like . In that case, well...). There is one important property that is probably more useful than the definition:

In order to prove this property, we have to use the transitive property of multiplicative functions to show that is multiplicative. After that, we can see f(1) = 1 and f(pk) = 0, which means f(x) = ε(x). Now you may ask: why is this property important? I think it is best to show the reasons in some basic examples below.

Example 1. Find out the number of co-prime pairs of integer (x, y) in range [1, n].

Solution 1. Notice that the question is the same as asking you the value of

Now apply the Möbius inversion on [gcd(i, j)] = 1, we have

which is the same as

Notice that [d|gcd(i, j)] = [d|i][d|j]. Therefore

We can change the order of summing things up, so

We know that . Thus

Then we can simply loop through 1 to n to compute the formula. While we can optimize this loop to using the harmonic lemma, we still have to use the linear sieve to pre-compute the values of μ(d). Therefore, the overall complexity of the algorithm is O(n).

Example 2. Find out the sum of gcd(x, y) for every pair of integer (x, y) in range [1, n] (gcd(x, y) means the greatest common divisor of x and y).

Solution 2. Similar as above, notice that the question is the same as asking you the value of

Let k = gcd(i, j). We can then loop for k first.

Now assume i = ak, j = bk, and then

It can be observed that the last part of the formula is the same as the one in example 1. Applying the same procedure, we have

According to the harmonic lemma, . Therefore, if we compute the sum above with brute force, the overall complexity will be .

This is, however, not the best complexity we can achieve with the Möbius inversion. Now let l = kd, and we can loop for l first.

Applying the transitive property of multiplicative functions, we can see that is multiplicative. Moreover, we have g(pk) = pk - pk - 1. If you have somewhat studied about the number theory, you may figure out that g(l) is actually the Euler's totient function. Regardless of whether you know about the coincidence or not, g(l) can be pre-computed with the linear sieve, and , which can be simply computed in O(n) complexity.

Example 3. Find out the sum of lcm(x, y) for every pair of integer (x, y) in range [1, n] (lcm(x, y) means the least common multiple of x and y).

Solution 3. Well, we should be pretty familiar with the techniques by now. First of all, the answer should be

Since , let k = gcd(i, j). We can then loop for k first.

Now assume i = ak, j = bk, and then

Now let's proceed to assume a = dp, b = dq, then

Now notice , and we have

Now following the example above, let l = kd, and then

Notice that is also multiplicative, and g(pk) = pk - pk + 1. We can pre-compute the values of g(l) via the linear sieve, and . The overall complexity will be O(n).

Practice Problems

Simple Sum

(Thanks _threat_ for showing me this one!)




Sky Code



It is known that all examples above can be further optimized to . The exact technique will (probably) be introduced in the upcoming article.

Update: The article introducing the optimization is ready here.

Finally, thanks Tommyr7 for his effort in the article!

  • Vote: I like it  
  • +202
  • Vote: I do not like it  

6 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Excellent tutorial. If you don't mind can you just write down in 1-2 sentences how to precompute the Mobius function using linear sieve ?

  • »
    6 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it
        m[1] = 1;
        for (int i = 2; i < MAXN; ++i) {
            if (!lp[i]) for (int j = i; j < MAXN; j += i)
                if (!lp[j]) lp[j] = i;
            m[i] = [](int x) {
                int cnt = 0;
                while (x > 1) {
                    int k = 0, d = lp[x];
                    while (x % d == 0) {
                        x /= d;
                        if (k > 1) return 0;
                if (cnt & 1) return -1;
                return 1;
5 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

Can you explain how to use Mobius inversion in The GCD problem link given here. I am having problem in using the theory given the that problem.

  • »
    4 weeks ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Step 1: if and only if , and . Using this we can reduce the problem of finding 0 < x ≤ b, 0 < y ≤ d such that to the problem of example 1.

    Step 2: The sum over the interval [a, b] × [c, d] is the sum over [1, b] × [1, d] minus [1, a - 1] × [1, d] minus [1, b] × [1, c - 1] plus [1, a - 1] × [1, c - 1].