### box's blog

By box, 15 months ago, Consider the following problem (general form of original Min25 sieve):

We are given some multiplicative function $f(n)$, described at prime powers by $f(p^k)=g(p,k)$, where $g(p,1)$ is a polynomial of low degree, and we can calculate $g(p,k)$ for any singular $p$ and $k$ in $O(1)$.

How do we calculate the prefix sum of $f(n)$ at the distinct positive integer values of $\lfloor\frac Nk\rfloor$, where $N\le 10^{10}$? Clearly, brute force or regular sieves are not going to cut it here.

Prerequisite: For any integer $N$, for all integers $1\le k\le N$ there are at most $2\sqrt N$ distinct values of $\lfloor\frac Nk\rfloor$. Call these $2\sqrt N$ such values the blocks of $N$. (A note on notation: In the following text, $\frac Nk$ will be used to describe regular division, and $N/k$ will be used to denote floored division.)

Proof

Our workflow is going to be similar to this:

1. We iterate over each nonzero exponent $T$ of $g(n,1)$. We first use the prefix sum of $x^T$ to estimate the contribution of $x^T$ to each prefix sum.
2. Then, we sieve away composite values to ensure that our new prefix sum only counts prime $n$.
3. After summing the result of the step 3 sieve over all $T$, we have obtained the prefix sum of $f(n)[n\in\mathbb P]$, where $\mathbb P$ is the set of primes.
4. Then, we sieve back composite values to restore a correct answer.

The first step is done trivially by an interpolation of your choice.

The second step asks us to evaluate the following function at the blocks of $N$:

$S(n)=\sum\limits_{i=1}^n[i\in\mathbb P]i^T$

Let $p_1,p_2,\dots$ be the sequence of primes, $p_0=1$, and $\mathbb P(i)$ be the minimum prime factor of $i$.

Consider the following, modified, version of $S$, which "simulates" sieving away primes:

$S(n,j)=\sum\limits_{i=1}^n[i\in\mathbb P\lor\mathbb P(i)>p_j]i^T$

As $j$ tends to infinity, each $n$ satisfies $S(n)=S(n,j)$. In particular, when $p_j^2>n$, we have $S(n)=S(n,j)$, because if $x$ is composite its minimum prime factor must not be greater than $\sqrt x$! This means that we don't need to check all primes up to $N$, but instead only up to $\sqrt N$. Now, we have the edge case:

$S(n,0)=\sum\limits_{i=2}^ni^T$

The problem becomes: how do we convert $S(n,j-1)$ evaluated at the blocks of $N$ to $S(n,j)$ evaluated at the blocks of $N$? We effectively need to sieve away the composites with a minimum prime factor of exactly $p_j$.

In fact, the integers with a minimum prime factor of exactly $p_j$ are part of $p_j^TS(n/p_j,j-1)$, because this fixes at least one power of $p_j$, but unfortunately this includes primes less than $p_j$. So we simply subtract away them, and get what we need to sieve away: $p_j^T(S(n/p_j,j-1)-S(p_{j-1},j-1))$.

This gives us an recurrence:

$S(n,j)=S(n,j-1)-p_j^T(S(n/p_j,j-1)-S(p_{j-1},j-1))$

Note that $S(p_{j-1},j-1)$ is simply the sum of the $T$-th powers of the first $j-1$ primes, so this can be precalculated. We replace this with $W_{j-1}$:

$S(n,j)=S(n,j-1)-p_j^T(S(n/p_j,j-1)-W_{j-1})$

It is now apparent that the values of $S(n,j)$ at blocks of $N$ only depend on values of $S(n,j-1)$ at blocks of $N$; hence we maintain $S(n,j)$ at blocks of $N$ as we slowly increase $j$ until $p_j^2>N$.

What is the time complexity of this?

Observe that for some $j$, we only need to update values of $n$ where $p_j^2<n$, because at other positions, $S(n,j-1)$ is already a correct estimate of $S(n)$. (Without this optimization, the time complexity becomes $O(N/\log N)$.)

This means that the time complexity of this part is proportional to the sum of the number of primes less than $n$ for each $n$ in the blocks of $N$. Here it suffices to consider only the $n>\sqrt N$, as brute force works with the other half.

We have:

$\sum\limits_{k=1}^{\sqrt N}\pi(\sqrt{N/k})=\sum\limits_{k=1}^{\sqrt N}O(\frac{\sqrt{\frac Nk}}{\log\frac Nk})$
$=O(\frac{\int_{1}^{\sqrt N}\sqrt{\frac Nk}dk}{\log N})=O(\frac{N^{3/4}}{\log N})$

Now, we move on to the fourth step.

To sieve composite values back, we consider "reversing" the process we did in the second step. This suggests that we still use a function sieving based on the minimum prime factor. Define

$R(n,j)=\sum\limits_{i=2}^n[i\in\mathbb P\lor\mathbb P(i)\ge p_j]f(n)$

Our base case is now

$R(n,\pi(\sqrt n)+1)=S(n)$

Our problem becomes to transition from $j+1$ to $j$; that is, to sieve back in integers with a minimum prime factor of $p_j$.

The composite number in question, $x$, could fall into two categories.

1. $x$ may have prime factors that are not $p_j$; in other words, $x=p_j^cy$ for some $1\le c$ and $p_j<\mathbb P(y)$.
2. $x$ is a non-prime power of $p_j$; in other words, $x=p_j^c$ for some $1<c$. Note that, if $c=1$, then $x$ is not composite.

We count the contribution of all $c$ where $p_j^c\le n$. For the second type, we may count by evaluating $g(p_j,c)$ with brute force, so we care about deriving the value of the first type.

We must select $y$ such that $1\le y\le n/p_j^c$ and $\mathbb P(y)>p_j$. Like our calculation for the second step, this quantity is $R(n/p_j^c,j+1)-\text{included primes}=R(n/p_j^c,j+1)-S(p_j)$. If we aren't careful in our enumeration of $c$, it might be that $p_j>n/p_j^c$, so we take $\max(0,R(n/p_j^c)-S(p_j))$.

Remember that $f$ is multiplicative, so to account for the power $p_j^c$, we just multiply this by $g(p_j,c)$.

To summarize what we are sieving in:

$\sum\limits_{c=1,p_m^c\le n}g(p_j,c)\max(0,R(n/p_j^c)-S(p_j))+\sum\limits_{c=2,p_m^c\le n}g(p_j,c)$

Observe that for the first summation, $c$ contributes if and only if $p_m^{c+1}\le n$, so we can rewrite the summation and remove the maximum operand — to get a full recurrence:

$R(n,j)=R(n,j+1)+\sum\limits_{c=1,p_m^{c+1}\le n}g(p_j,c)(R(n/p_j^c)-S(p_j))+g(p_j,c+1)$

Iterating $j$ from $\pi(\sqrt n)+1$ to $0$, our answers are at $R(n,0)$.

The analysis of the time complexity of this part is practically the same as that of the first part.

## Example: SPOJ ASSIEVE

(Yes, in fact this entire blog post was to advertise this problem of mine.)

First let's weaken the problem a bit: How do we calculate

$g(n,k,r)=\sum\limits_{i=1}^ni^k[L(i)\equiv r\pmod 4]$

where $L(i)$ is the integer log function, equal to the sum of the prime divisors of $i$, counting repetition?

At first it is not obvious at all how this is a multiplicative function. However, the Min25 sieve is not bound to functions in the integers. It works perfectly fine for polynomials too!

Consider the multiplicative function $\mathfrak L(i)$, defined by $\mathfrak L(p^c)=p^{ck}x^{(pc)\bmod 4}$, operating on $\mathbb Z/p\mathbb Z[x]/(x^4-1)$. This happens to be the cyclical convolution ring of polynomials of degree $4$ — in other words, convolution here is defined so that $x^{i+j}$ overflows to $x^{(i+j)\bmod 4}$. This is exactly what we need — addition modulo $4$!

Unfortunately, this function does not seem to satisfy an important requirement of the sieve: $p^kx^{p\bmod 4}$ is not a polynomial, nor a fully multiplicative function. A first instinct would be to do FFT/NTT — but this does not help either.

The solution would be to completely circumvent the first step. Instead, we take the true meaning of the function $p^kx^{p\mod 4}$. To find the prefix sum of this at $n$, what we really want is for each $z=0,1,2,3$, the sum of the $k$-th powers of all primes less than $n$, where said prime modulo $4$ is equivalent to $z$.

Surprisingly, this is doable.

To formalize, we need, at the blocks of $N$:

$z=0,1,2,3:\sum\limits_{i=1}^{n}[i\in\mathbb P\land i\equiv z\pmod 4]i^k$

Observe that for $z=0$ the answer is always $0$, and for $z=2$ the answer is $[2\le n]2^k$. Consider how to sieve away the answers for $z=1$ and $z=3$ in parallel.

Let us recall the recurrence for the second step of the sieve:

$S(n,j)=S(n,j-1)-p_j^T(S(n/p_j,j-1)-W_{j-1})$

In order to incorporate $z$, we need to augment this into a polynomial, with nonzero coefficients of $x^1$ and $x^3$. Let the current prime in question be $p=p_j$, and let the previous prime be $p'=p_{j-1}$. We want

$[x^{z|z\in\{1,3\}}]S(n,j)=\sum\limits_{i=1}^n[i\in\mathbb P\lor\mathbb P(i)>p_j][i\equiv z\pmod 4]i^k$

Remember that from $p'$ to $p$, we sieve away the composites with a minimum prime factor of $p$, so these composites can all be expressed in the form $pd$. Now, we can do casework on $p$ modulo $4$. In the following, $p,pd,d$ are all expressed modulo $4$.

1. If $p=1$, then:
1. For $pd=1$, we must have $d=1$. Hence we seive away $p^k[x^1](S(n/p_j,j-1)-W_{j-1})$ from $[x^1]S(n,j)$.
2. For $pd=3$, we must have $d=3$. Hence we seive away $p^k[x^3](S(n/p_j,j-1)-W_{j-1})$ from $[x^3]S(n,j)$.
2. If $p=3$, then:
1. For $pd=1$, we must have $d=3$. Hence we seive away $p^k[x^3](S(n/p_j,j-1)-W_{j-1})$ from $[x^1]S(n,j)$.
2. For $pd=3$, we must have $d=1$. Hence we seive away $p^k[x^1](S(n/p_j,j-1)-W_{j-1})$ from $[x^3]S(n,j)$.

To calculate $S(n,0)$, use any polynomial interpolation algorithm.

This recurrence now allows us to successfully complete the second step of sieving. The fourth step is virtually the same, except we operate in $\mathbb Z/p\mathbb Z[x]/(x^4-1)$.

Notice that the given function at primes has not changed in the original problem. We only need to modify the multiplication operation in the fourth step to complete our solution.

I hope you enjoyed this blog! If you have any interesting problems that can be solved with this sieve, let me know.

Have a nice day :) Comments (8)
 » Is this a variation on the derivation of prime-counting (Meissel-Lehmer) algorithm?
•  » » I guess it counts as an extension. Meissel-Lehmer is a closely related algorithm that doesn't need step 3 and step 4, making it more amenable to optimizations.
•  » » » I never worked through the math myself, I just used someone else's implementation. Eventually I will go through rhe math.
 » It's a standard trick in this type of number theory harmonic-lemma algorithm to use the brute-force technique all the way up to $\tilde{O}(n^{2/3})$ instead of only up to $\sqrt{n}$. This results in a slightly better time complexity, $\tilde{O}(n^{2/3})$ instead of $\tilde{O}(n^{3/4})$.
•  » » Oh yeah I saw this in Project Euler solutions. I assume it is this? https://www.ams.org/journals/mcom/1985-44-170/S0025-5718-1985-0777285-5/home.html
 » Who is this Min25 guy anyway? They're like a modern Prometheus, bringing the most arcane NT-algorithms into our measly CP realm. Yet I've never heard anything about them except their nickname.
•  » »
 » The idea of using two different functions that are only equal at prime arguments is brilliant!Also there is a simpler description of step 2 that shows how to solve this problem for any number instead of just 4.What we need is to find $\sum f(p a_i)$ given $p$ and $\sum f(a_i)$. In our case, $f(n)[x] = n^k x^n$ -- a polynomial in $x$.But then $f(p a)[x] = (p a)^k x^{p a} = p^k (a^k (x^p)^a) = p^k f(a)[x^p]$ which is linear at can be applied to the whole sum.