[Tutorial] An interesting counting problem related to square product

The statement:

Given three integers $$$n, k, p$$$, $$$(1 \leq k \leq n < p)$$$.

Count the number of array $$$a[]$$$ of size $$$k$$$ that satisfied

$$$1 \leq a_1 < a_2 < \dots < a_k \leq n$$$
$$$a_i \times a_j$$$ is perfect square $$$\forall 1 \leq i < j \leq k$$$

Since the number can be big, output it under modulo $$$p$$$.

For convenient, you can assume $$$p$$$ is a large constant prime $$$10^9 + 7$$$

Notice that in this blog, we will solve in $$$O(n)$$$ for general $$$k$$$ from the original task

For harder variants you can see in this blog [Variants] An interesting counting problem related to square product

Solution for k = 1

The answer just simply be $$$n$$$

Solution for k = 2

Problem

Given $$$n (1 \leq n \leq 10^6)$$$, count the number of pair $$$(a, b)$$$ that satisfied

$$$1 \leq a < b \leq n$$$
$$$a \times b$$$ is a perfect square.

Since the number can be big, output it under modulo $$$10^9 + 7$$$.

You can submit here: https://lqdoj.edu.vn/problem/aicprtsp2.

Examples

Example 1

Input 1:

Output 1:

Explanation 1:

There are no satisfied integer pair $$$(a, b)$$$ that $$$1 \leq a < b \leq 1$$$

Example 2

Input 2:

Output 2:

Explanation 2:

There are $$$4$$$ satisfied pairs: {$$$1, 4$$$}, {$$$1, 9$$$}, {$$$2, 8$$$}, {$$$4, 9$$$}.

Example 3

Input 3:

Output 3:

Explanation 3:

There are $$$16$$$ satisfied pairs: {$$$1, 4$$$}, {$$$1, 9$$$}, {$$$1, 16$$$}, {$$$1, 25$$$}, {$$$2, 8$$$}, {$$$2, 18$$$}, {$$$3, 12$$$}, {$$$4, 9$$$}, {$$$4, 16$$$}, {$$$4, 25$$$}, {$$$5, 20$$$}, {$$$6, 24$$$}, {$$$8, 18$$$}, {$$$9, 16$$$}, {$$$9, 25$$$}, {$$$16, 25$$$}.

Algorithm

We need to count the number of pair $$$(a, b)$$$ that $$$1 \leq a < b \leq n$$$ and $$$a \times b$$$ is perfect square.

Every positive integer $$$x$$$ can be represent uniquely as $$$x = u \times p^2$$$ for some positive integer $$$u, p$$$ and $$$u$$$ as small as possible ($$$u$$$ is squarefree number).

Let represent $$$x = u \times p^2$$$ and $$$y = v \times q^2$$$ (still, minimum $$$u$$$, $$$v$$$ ofcourse).

We can easily proove that $$$x \times y$$$ is a perfect square if and if only $$$u = v$$$.

So for a fixed squarefree number $$$u$$$. You just need to count the number of ways to choose $$$p^2$$$.

The answer will be the sum of such ways for each fixed $$$u$$$.

Illustration

Squarefree Sieve 10x10 Table | Iteration 0

Squarefree Sieve 10x10 Table | Iteration 1

Squarefree Sieve 10x10 Table | Iteration 2

Squarefree Sieve 10x10 Table | Iteration 3

Squarefree Sieve 10x10 Table | Iteration 5

Squarefree Sieve 10x10 Table | Iteration 6

Squarefree Sieve 10x10 Table | Iteration 7

Squarefree Sieve 10x10 Table | Iteration 10

Squarefree Sieve 10x10 Table | Iteration 11

Squarefree Sieve 10x10 Table | Iteration 13

Squarefree Sieve 10x10 Table | Iteration 14

Squarefree Sieve 10x10 Table | Iteration 15

Squarefree Sieve 10x10 Table | Iteration 17

Squarefree Sieve 10x10 Table | Iteration 19

Squarefree Sieve 10x10 Table | Iteration 21

Squarefree Sieve 10x10 Table | Iteration 22

Squarefree Sieve 10x10 Table | Iteration 23

Squarefree Sieve 10x10 Table | Iteration 100

Implementation

Implementation using factorization

vector<int> prime; /// prime list
vector<int> lpf;   /// Lowest prime factor, lpf[x] is smallest prime divisor of x
void sieve(int lim = LIM) /// O(n)
{
    prime.assign(1, 2);
    lpf.assign(lim + 1, 2);

    lpf[1] = 1; /// For easier calculation but can cause inf loops
    for (int i = 3; i <= lim; i += 2) {
        if (lpf[i] == 2) prime.push_back(lpf[i] = i);
        for (int j = 0; j < sz(prime) && prime[j] <= lpf[i] && prime[j] * i <= lim; ++j)
            lpf[prime[j] * i] = prime[j];
    }
}

/// mask(x) is smallest positive number that mask(x) * x is a perfect square
int getMask(int x) /// O(log n)
{
    int mask = 1;
    while (x > 1) {
        int p = lpf[x], f = 0;
        do x /= p, f++; while (p == lpf[x]);
        if (f & 1) mask *= p; /// if current power is odd, we mutiple mask with current prime
    }
    return mask;
}

int cnt[LIM];
int magic(int n) /// O(n log max(a))
{
    memset(cnt, 0, sizeof(cnt[0]) * (n + 1));

    ll res = 0;
    for (int a = 1; a <= n; ++a) /// Check all cases of a
        res += cnt[getMask(a)]++;

    res %= MOD;
    return res;
}

int main() /// O(n log max(a))
{
    int n;
    cin >> n;
    sieve(n + 500);
    cout << magic(n);    
    return 0;
}

Implementation 1

int solve(int n)
{
    memset(is_squarefree, true, sizeof(is_squarefree[0]) * (n + 1));

    long long res = 0;
    for (int i = 1, j; i <= n; ++i) if (is_squarefree[i]) 
    {
        for (j = 1; i * j * j <= n; ++j)
            is_squarefree[i * j * j] = false;

        res += 1LL * (j - 1) * (j - 2) / 2;
    }

    res %= MOD;
    return res;
}

Implementation 2

int solve(int n)
{
    memset(is_squarefree, true, sizeof(is_squarefree[0]) * (n + 1));

    long long res = 0;
    for (int i = 1; i <= n; ++i) if (is_squarefree[i]) 
    {
        for (int j = 1; i * j * j <= n; ++j)
        {
            is_squarefree[i * j * j] = false;
            res += j - 1;
        }
    }

    res %= MOD;
    return res;
}

Implementation related to Möbius function

/// Linear Sieve
vector<bool> isPrime; /// Characteristic function = A010051
vector<int> prime;    /// Prime list              = A000040
vector<int> lpf;      /// Lowest prime factor     = A020639
vector<int> mu;       /// Mobius                  = A008683
vector<int> phi;      /// Euler totient phi       = A000010
void sieve(int n)
{
    if (n < 1) return ;
    /// Extension part
    mu.assign(n + 1, 1);
    phi.assign(n + 1, 1);
    /// Main part
    prime.clear();
    lpf.assign(n + 1, 0);
    isPrime.assign(n + 1, true);
    isPrime[0] = isPrime[1] = false;
    for (int x = 2; x <= n; ++x)
    {
        if (isPrime[x]) /// Func[Prime]
        {
            lpf[x] = x;
             mu[x] = -1;
            phi[x] = x - 1;
            prime.push_back(x);
        }
 
        for (int p : prime) /// Func[Prime * X] <- Func[Prime]
        {
            if (p > lpf[x] || x * p > n) break;
            isPrime[x * p] = 0;
            lpf[x * p] = p;
            if (lpf[x] == p)
            {
                 mu[x * p] = 0;
                phi[x * p] = phi[x] * p;
            }
            else 
            {
                 mu[x * p] = -mu[x];
                phi[x * p] = phi[x] * phi[p];
            }
        }
    }
}

long long solve(int n)
{
    sieve(n + 1);
    long long res = 0;
    for (int x = 1; x <= n; ++x) if (mu[x]) 
    {
        int t = sqrt(n / x);
        res += 1LL * t * (t - 1) / 2;
    }

    res %= MOD;
    return res;
}

Complexity

So about the complexity....

For the implementation using factorization, it is $$$O(n \log n)$$$.

Hint 1

Hint 2

Proof

Bonus

For the 2 implementations below, the complexity is linear.

Hint 1

Hint 2

Hint 3

Hint 4

Proof

For the last implementation, the complexity is Linear

Hint 1

Hint 2

Proof

Solution for general k

Problem

Given $$$k, n (1 \leq k \leq n \leq 10^6)$$$, count the number of array $$$a[]$$$ of size $$$k$$$ that satisfied

$$$1 \leq a_1 < a_2 < \dots < a_k \leq n$$$
$$$a_i \times a_j$$$ is perfect square $$$\forall 1 \leq i < j \leq k$$$

Since the number can be big, output it under modulo $$$10^9 + 7$$$.

You can submit here: https://lqdoj.edu.vn/problem/aicprtspk.

For $$$k = 3$$$, you can subimit here: https://oj.vnoi.info/problem/dhbb2020_square.

Examples

Example 1

Input 1:

2 1

Output 1:

Explanation 1:

There are $$$2$$$ satisfied array of size $$$1$$$: {$$$1$$$}, {$$$2$$$}.

Example 2

Input 2:

10 2

Output 2:

Explanation 2:

There are $$$4$$$ satisfied array of size $$$2$$$: {$$$1, 4$$$}, {$$$1, 9$$$}, {$$$2, 8$$$}, {$$$4, 9$$$}.

Example 3

Input 3:

27 3

Output 3:

Explanation 3:

There are $$$12$$$ satisfied array of size $$$3$$$: {$$$1, 4, 9$$$}, {$$$1, 4, 16$$$}, {$$$1, 4, 25$$$}, {$$$1, 9, 16$$$}, {$$$1, 9, 25$$$}, {$$$1, 16, 25$$$}, {$$$2, 8, 18$$$}, {$$$3, 12, 27$$$}, {$$$4, 9, 16$$$}, {$$$4, 9, 25$$$}, {$$$4, 16, 25$$$}, {$$$9, 16, 25$$$}.

Using the same logic above, we can easily solve the problem.

Now you face up with familliar binomial coefficient problem

This implementation here is using the assumption of $$$p$$$ prime and $$$p > max(n, k)$$$

You can still solve the problem for squarefree $$$p$$$ using lucas and CRT

Yet just let things simple as we only focus on the counting problem, we will assume $$$p$$$ is a large constant prime.

O(n) solution

const int LIM = 1e7 + 17;
const int MOD = 1e9 + 7;

int fact[LIM + 1]; /// factorial:         fact[n] = n!
int invs[LIM + 1]; /// inverse modular:   invs[n] = n^(-1)
int tcaf[LIM + 1]; /// inverse factorial: tcaf[n] = (n!)^(-1)
void precal_nck(int n = LIM)
{
    fact[0] = fact[1] = 1;
    invs[0] = invs[1] = 1;
    tcaf[0] = tcaf[1] = 1;
    for (int i = 2; i <= n; ++i)
    {
        fact[i] = (1LL * fact[i - 1] * i) % MOD;
        invs[i] = MOD - 1LL * (MOD / i) * invs[MOD % i] % MOD;
        tcaf[i] = (1LL * tcaf[i - 1] * invs[i]) % MOD;
    }
}

int nck(int n, int k)
{
    k = min(k, n - k);
    if (k < 0) return 0;

    long long res = fact[n];
    res *= tcaf[k];         res %= MOD;
    res *= tcaf[n - k];     res %= MOD;
    return res;
}

int solve(int n, int k)
{
    memset(is_squarefree, true, sizeof(is_squarefree[0]) * (n + 1));
    precal_nck(n);

    long long res = 0;
    for (int i = 1, j; i <= n; ++i) if (is_squarefree[i]) 
    {
        for (j = 1; i * j * j <= n; ++j)
            is_squarefree[i * j * j] = false;

        res += nck(j - 1, k);
    }

    res %= MOD;
    return res;
}

Extra Tasks

These are harder variants, and generalization from the original problem. You can see more detail here

A: Can we also use phi function or something similar to solve for $$$k = 2$$$ in $$$O(\sqrt{n})$$$ or faster ?

B: Can we also use phi function or something similar to solve for general $$$k$$$ in $$$O(\sqrt{n})$$$ or faster ?

C: Can we also solve the problem where there can be duplicate: $$$a_i \leq a_j\ (\forall\ i < j)$$$ and no longer $$$a_i < a_j (\forall\ i < j)$$$ ?

D: Can we solve the problem where there is no restriction between $$$k, n, p$$$ ?

E: Can we solve for negative integers, whereas $$$-n \leq a_1 < a_2 < \dots < a_k \leq n$$$ ?

F: Can we solve for a specific range, whereas $$$L \leq a_1 < a_2 < \dots < a_k \leq R$$$ ?

G: Can we solve for cube product $$$a_i \times a_j \times a_k$$$ effectively ?

H: Can we solve if it is given $$$n$$$ and queries for $$$k$$$ ?

I: Can we solve if it is given $$$k$$$ and queries for $$$n$$$ ?

J: Can we also solve the problem where there are no order: Just simply $$$1 \leq a_i \leq n$$$ ?

K: Can we also solve the problem where there are no order: Just simply $$$0 \leq a_i \leq n$$$ ?

M: Can we solve for $$$q$$$-product $$$a_{i_1} \times a_{i_2} \times \dots \times a_{i_q} = x^q$$$ (for given constant $$$q$$$) ?

N: Given $$$0 \leq \delta \leq n$$$, can we also solve the problem when $$$1 \leq a_1 \leq a_1 + \delta + \leq a_2 \leq a_2 + \delta \leq \dots \leq a_k \leq n$$$ ?

O: What if the condition is just two nearby elements and not all pairs. Or you can say $$$a_i \times a_{i+1} \forall 1 \leq i < n$$$ is a perfect square ?

Comments (11)

Write comment?

SPyofcode

2 years ago, # |

-10

Auto comment: topic has been updated by SPyofcode (previous revision, new revision, compare).

→ Reply

DMCS

+41

Go get a life pls

ExistentialCrisis

2 years ago, # ^ |

agreed

Lihwy

SPyofgame Master when?

← Rev. 3 →

-6

clyring

← Rev. 2 →

+19

I've had only a quick look and haven't checked your references yet. Your definitions of $$$f$$$ and $$$g$$$ seem off. As defined, $$$f(49) = \mathrm{size} \{(1, 7)\} = 1$$$ and $$$g(49) = \mathrm{size} \{(1, 7), (49, 49)\} = 2$$$, and $$$g(n) = f(n) + 1$$$, not $$$f(n) - 1$$$. But from the rest of the post it seems likely intended that $$$f(49) = 6$$$ and $$$g(49) = 7$$$. Then in the solution derivation it seems that it should be that $$$\overset{n}{\underset{p=1}{\Large \Sigma}} f(p) = -n + \overset{n}{\underset{p=1}{\Large \Sigma}} g(p)$$$ rather than what is written.

It possible to solve this problem in $$$O(\sqrt{n} \log{\log{n}})$$$ for any $$$k$$$ using a similar idea as that for $$$k = 2$$$. Here's a quick sketch of how to generalize your approach to any $$$k$$$:

Spoiler

It should also be possible to perform the transpose of the Möbius inversion on the sequence $$$(d \mapsto \lfloor \frac{n}{d^2} \rfloor)$$$ and use prefix sums on the sequence $$$s_k$$$ to calculate the answer, which I would intuitively expect to work in $$$\tilde{O}(n^{1/3})$$$. (That transpose looks similar to the DP described in this blog.)

The solution is great. But I still find it hard to understand that the part using Dirichlet Convolution. Can you eleborated more ?

I'm not sure which part you're referring to. Here are my guesses:

Regarding the extra tasks:

Great solution. But can I ask that why $$$p \approx n^{2/5}$$$ is also complicated ? Is it for general $$$k$$$ but with $$$O(n^{2/5})$$$ solution ?

← Rev. 4 →

+11

It's an issue with claiming $$$O(n^{2/5})$$$ indeed. That's not enough time to naively pre-calculate the factorials and inverse factorials to make calculating the necessary binomial coefficients fast, and how messy the fallback strategies are depends on how large $$$k$$$ is.

If $$$k < n^{1/15}$$$, then a simple $$$O(k)$$$ loop for each of the $$$O(n^{1/3})$$$ necessary binomial coefficients is OK.
A standard polynomial multi-point evaluation will take around $$$O(\max(n^{1/3}, k) \cdot (\log{k})^2)$$$ time*, which is OK if $$$k < n^{2/5} (\log{n})^{-2}$$$.
With some clever small-step big-step polynomial multipoint shenanigans, the necessary factorials can be precalculated in $$$O(n^{3/8} \sqrt{\log{n}})$$$. Specifically, choose some integer step size $$$t$$$. Then, calculate the polynomial $$$Q(x) = \prod_{i=0}^{t-1} x - i$$$, and evaluate it at the positive multiples of $$$t$$$ up to $$$\sqrt{n}$$$. Then we can take prefix products to get the factorials at multiples of $$$t$$$. This takes around $$$\frac{\sqrt{n}}{t} (\log{t})^2$$$ time, so we may as well naively pre-calculate all factorials up to $$$\frac{\sqrt{n}}{t} (\log{t})^2$$$ as well, and only around $$$\frac{t^2}{(\log{t})^4}$$$ of the factorials that we will need are not in that range. Each one of these now only takes around $$$t$$$ multiplications to calculate, since the factorials at multiples of $$$t$$$ are already calculated. So the total cost is around $$$\frac{\sqrt{n}}{t} (\log{t})^2 + \frac{t^3}{(\log{t})^4}$$$, which is minimized when $$$t$$$ is of order $$$n^{1/8} (\log{n})^{3/2}$$$.

*Actually, it's a bit worse than this with FFT-unfriendly primes. But not by enough to really matter for this discussion.

-18

No offensive :( I just want to say that I did not expect it is that hard when $$$p \approx n^{2/5}$$$.

#	User	Rating
1	ecnerwala	3648
2	Benq	3580
3	orzdevinwang	3570
4	cnnfls_csy	3569
5	Geothermal	3568
6	tourist	3565
7	maroonrk	3530
8	Radewoosh	3520
9	Um_nik	3481
10	jiangly	3467

#	User	Contrib.
1	maomao90	174
2	adamant	164
2	awoo	164
4	TheScrasse	160
5	nor	159
6	maroonrk	156
7	-is-this-fft-	150
8	SecondThread	148
9	pajenegod	145
10	BledDest	144

SPyofcode's blog

The statement:

Solution for k = 1

Solution for k = 2

Problem

Examples

Algorithm

Illustration

Implementation

Complexity

Solution for general k

Problem

Examples

Extra Tasks