Hello codeforces! Try to solve both of the problems from https://codeforces.com/contestInvitation/fcf424afaf5b7a5d01aebf1a908d6ef3589e4fb3. good luck

Solution:

This is a simple dp problem(very simple), but with a high constraints. Standard knapsack runtime is O(NW), but we can optimize it to run in O(NW/32) using bitset.

```
#include<bits/stdc++.h>
using namespace std;
const int W=2e5;
bitset<W>b;
signed main(){
int n,w;
scanf("%d %d",&n,&w);
b[0]=1;
while(n--){
int a;
scanf("%d",&a);
b|=(b<<a);
}
if(b[w])printf("YES\n");
else printf("NO\n");
}
```

How does it work? `b[k]`

contains 0, if it's not possible to get sum `k`

, and 1, if it's possible.

At start, we set `b[0]`

to 1(because we can get sum 0). Next, for each item we left-shift out bitset by a[i]. `new_b=b<<a[i]`

After this move, new bitset contains information about what can we get if we will take current item, but it ignores all previous moves(when we didn't take current item). To fix this, we need to connect two bitsets in one using bitwise or. `new_new_b=new_b|b`

To do this fast, we just write `b|=(b<<a)`

Why this works fast? We do N operations of shifting W elements. But bitset works as a long binary number, which constructs of a many 32bit integers. So we don't shift W numbers, we shift only W/32. this is why it work's so fast

But this solution runs in 1s, which is too much. How to improve the improvement? Add some useful pragmas:

```
#include<bits/stdc++.h>
using namespace std;
// very useful pragmas
#pragma GCC optimize("O3")
#pragma GCC optimize("unroll-loops")
#pragma GCC target("avx2")
const int W=2e5;
bitset<W>b;
signed main(){
int n,w;
scanf("%d %d",&n,&w);
b[0]=1;
while(n--){
int a;
scanf("%d",&a);
b|=(b<<a);
}
if(b[w])printf("YES\n");
else printf("NO\n");
}
```

Now it runs in ~700ms

P.S This was our first ever problem created on polygon, so we failed it a little with bad tests. Sorry, we will improve ourself for the future!

Also thanks for a great callback!

how is the second problem possible to solve?

Hello! I am going to post an editorial soon.

Hi, i was wondering if it was intended to kill FFT solutions for A2 or is just a bad implementation from me.

Your bad :(

Well, i couldn't pass FFT during contest so i used an interesting trick based on the fact that $$$Sum(A_i)$$$ is bounded to 2e5.

Consider all elements are distinct, we can have at most $$$sqrt(MAX)$$$ distinct elements. Now, imagine we want to insert a new element X with frequency F.

I claim that we can decompose F into at most $$$logF+1$$$ factors such that every number between 1 and F can be formed by adding up some non-empty subset of those factors.

That way you can do the basic knapsack with $$$sqrt(M)*log(N)$$$ elements instead of $$$2e5$$$.

I'm not too sure of how strong the test cases are but it ran in 46 ms.

check the tutorial :D

Ok so isnt it kinda stupid to not be able to see other's submissions, since its obviously some trick? I feel like would be much more productive otherwise

For 2nd task memory limit is quite tough, using long long was giving mle and int was giving ac.

Auto comment: topic has been updated by PyPcDeV (previous revision, new revision, compare).$$$\mathcal O(n \sqrt S)$$$ solution where $$$S = a_1 + \dots + a_n$$$. Passes A2 in 46 ms.

Thanks!

Because of condition $$$a_1 + a_2 + \dots + a_n \le 2 \cdot 10^5$$$ it's possible to solve this problem in $$$O(\frac{w \cdot \sqrt{2 \cdot 10^5}}{32})$$$.

If we have $$$3$$$ or more items with equal value (say $$$x$$$), we can remove $$$2$$$ of this items and instead of them add one item with value $$$2x$$$. Let's do all this possible exchanges. Now we have $$$\le 2$$$ items of each value. And values sum is still $$$\le 2 \cdot 10^5$$$, because exchanges don't change total sum. It's easy to prove then that items count is $$$O(\sqrt{2 \cdot 10^5})$$$. And we just do simple knapsack with bitset on this items.

Source code

Thanks!

Are there any sources that mention this trick? Also why do we need >=3 items and not >=2 items when trying to merge?

Merging >= 2 won't work in cases like the following: we have 4 copies of an item $$$x$$$. 2 pairs of them merge into $$$2x$$$, then the 2 $$$2x$$$ items merge into one $$$4x$$$, which prevents us from getting sums such as $$$3x$$$. Merging when we have >= 3 fixes that issue because we leave at least one item of size $$$x$$$ behind to make smaller denominations.

Oh, it is somewhat like binary bits. If we merge >=2 pairs, then we won't have the option to set this bit to 1 (if number of items is even). Thanks, I get it now.

Because we may want to take only one $$$x$$$ in answer, so we need to leave one of them in our set. But when we want to take two $$$x$$$ in answer, we can imagine that we take item with $$$2x$$$ instead.

I saw this idea in this comment.

Btw, your code doesn't work for W = 2e5 because bitsets start indexing at 0. And it's also pretty tight on TL for N, M = 2e5 and all Ai = 1.