z_e_z_e's blog

By z_e_z_e, history, 8 days ago, In English

hello, I want to ask you a question , what is the real time complexity of seive? is it nlog(log(n)) or sqrt(n)log(log(n)) , and why? and whish loop of this code is cost that time complexity ? 1- for(int i=2; i<=n; i++) { 2- if(isprime[i]) 3- { for(int j=i*i; j<=n; j+=i) isprime[j]=0; } }

thanks alot.

 
 
 
 
  • Vote: I like it
  • -17
  • Vote: I do not like it

»
8 days ago, # |
  Vote: I like it +10 Vote: I do not like it

1) Depends on how you write it. If it's like this:

code

it's $$$O(N \log \log N)$$$. If you remove the if (!prime[i]) continue;, it roughly doubles the time it takes, although the complexity is $$$O(N \log N)$$$, because the compiler can optimize a lot (I guess). I have never seen, nor heard of, a valid sieve that runs in $$$O(\sqrt{N} \log \log N)$$$, and it seems like a random statement to me. It takes $$$O(\sqrt{N})$$$ to check if a number is prime (by checking all numbers $$$n$$$ with $$$n \cdot n \leq N$$$), which can make answering queries about whether a number is prime or not $$$O(Q \sqrt{N})$$$.

Also, the j loop takes the longest, simply because it's the innermost loop. And no, you cannot optimize this algorithm any more to check primality of all numbers less than $$$N$$$ in substantially less time than $$$O(N \log \log N)$$$ (or $$$O(N)$$$ with this linear sieve, although it has a worse constant factor, so in my experience it doesn't do much better than the sieve above).

Finally, an interesting note -- Using a vector<bool> works about $$$20\%$$$ faster in my testing than using a bitset<size>.

  • »
    »
    8 days ago, # ^ |
      Vote: I like it +21 Vote: I do not like it

    First of all, the complexity of the code without if (!prime[i]) continue; is $$$O(n \log n)$$$ not because of the compiler optimizes it but because of it is bounded by the harmonic sum series. (Sidenote: Even if the compiler optimizes it by a lot, the big O notation wouldn't take that into account anyways). Harmonic series sum is $$$\frac{1}{1}+\frac{1}{2}+\frac{1}{3} \dots \frac{1}{\infty}$$$. Then we can use the standard proof of why the harmonic series is divergent to answer this question too.

    • »
      »
      »
      7 days ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      I know about the harmonic sum series, that's why it's $$$O(N \log N)$$$. I meant that I tested my code with $$$N = 5 \cdot 10^7$$$, and the $$$N \log \log N$$$ version took 0.22s, compared to 0.47s without the if (!prime[i]) continue;. With the big O notation, the former should have about $$$N \ln \ln N \approx 1.44 \cdot 10^8$$$ operations, while the latter should use $$$N \ln N \approx 8.86 \cdot 10^8$$$ operations, which is a factor of more than 6, a whole lot more of a difference than the factor of two that I saw when I tested it. Do you have a better explanation for this than that the compiler can probably optimize it, because that would be greatly appreciated.

      • »
        »
        »
        »
        7 days ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        What were your compilation flags? Do you have optimizations turned on? If so, you should try turning them off and see the difference.

      • »
        »
        »
        »
        7 days ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I just tested it https://ideone.com/lo9wqE and it looks like that its around 5.5x faster which is around 6.

»
8 days ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

What if the code is like this:

for(int i=2;i<=sqrt(n);i++) { if(prime[i]) { for(int j=(i*i);j<=n;j++) prime[j]=false; } }

The j loop iterates like: n/2, n/3, n/5, n/7... So, the series has an upper-bound of n(log(log(n))). Why we are not considering the i loop, which goes from 2 to sqrt(n)? I am really confused about it, can someone explain it. Thanks. Sorry if I said something incorrect.

  • »
    »
    8 days ago, # ^ |
    Rev. 3   Vote: I like it +5 Vote: I do not like it

    Your code should be something like this instead:

    for (int i = 2; i * i <= n; ++i) {
        if (prime[i]) {
            for (int j = i * i; j <= n; j += i) {
                prime[j] = false;
            }
        }
    }
    

    The number of iterations will be roughly

    $$$\sum_{1 \le i \le \sqrt{n}, i \text{ prime}} \frac{n - i^2}{i}$$$

    We can use the following two things carefully for a rough analysis: if $$$\pi(n)$$$ is the number of primes $$$\le n$$$ and $$$p_n$$$ is the $$$n^\mathrm{th}$$$ prime, then we have $$$\pi(n) \approx \frac{n}{\log n}$$$ and $$$p_n \approx n \log n$$$ (ignoring the error terms).

    The second term sums up to roughly $$$\frac{n}{2 \log n}$$$. The first term sums up to roughly $$$n \log \log \sqrt{n}$$$, which is $$$n \log \log n - n$$$. So the overall summation is something like $$$n \log \log n - n - \frac{n}{2 \log n}$$$. (Note that there might be some error terms of the order of the two terms that are subtracted here that I'm missing out on since this is a very rough analysis; the main point of this computation is to show that it's asymptotically the same thing as the other sieve). Hence this might be faster than the usual sieve in terms of constant factor.

    • »
      »
      »
      8 days ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Thank You. But, the code, it will be an infinite loop if I am right. And also, we don't need to consider for 1 too, considering it will mark every number as not prime in the inner loop. So maybe, it will not work correctly if I'm right.

      https://www.geeksforgeeks.org/how-is-the-time-complexity-of-sieve-of-eratosthenes-is-nloglogn/

      This article explained how the upper bound of the series is nlog(log(n)), but ignored the outer loop, that is for(int i=2;i<=sqrt(n);i++), it's time complexity is sqrt(n) right. So, the overall complexity should be sqrt(n) * nlog(log(n)). Am, I doing something wrong?

      • »
        »
        »
        »
        8 days ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        No, it won't be an infinite loop (maybe you were looking at the previous revision, I updated it a few minutes ago). Also since prime[0] and prime[1] are already marked as false before running this loop, it shouldn't have been an issue in the first place either.

        No, it doesn't ignore the outer loop (at least my analysis, idk about GFG). The number of iterations of the inner loop is what the bottleneck is (and each iteration of the outer loop takes O(1) time). Here's what one iteration of the outer loop does:

        1. Check if $$$i$$$ is prime. This takes $$$O(1)$$$ time.
        2. If $$$i$$$ is prime, run a loop for $$$\frac{n - i^2}{i}$$$ iterations. This takes $$$O\left(\frac{n - i^2}{i}\right)$$$ time.

        So taking the sum over all the iterations of the outer loop, you get a contribution of $$$O(\sqrt{n})$$$ from the first part, and what I computed in the previous comment from the second part (which is much more significant than the contribution of the first part, which is why I skipped it).

        I hope this clears up the confusion.

        • »
          »
          »
          »
          »
          8 days ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Oh, yes. It's correct now, it was not updated when I saw. Thanks for clearing my confusion.

        • »
          »
          »
          »
          »
          6 days ago, # ^ |
          Rev. 3   Vote: I like it 0 Vote: I do not like it

          thanks alot for this explanation , i understand from your explanation that if i considered the outer loop and the inner loop the time complexity of all code will be nlog(log(sqrt(n))), and if I make the outer loop start from 2 to n the time copmlexity will be nlog(log(n)), is it true?

          • »
            »
            »
            »
            »
            »
            6 days ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            The complexities are the same asymptotically, there's only a constant factor difference (well not even that, the percentage by which the sqrt one is faster than the normal one tends to 0 as $$$n$$$ tends to $$$\infty$$$). But for practical purposes it's kind of better.