UNIQUE VISION Programming Contest 2022（AtCoder Beginner Contest 248） Announcement

2 years ago, # |

It took some WAs for D, but I've finally realized that an extra log N factor is TLE territory.

→ Reply

semicolonised

2 years ago, # ^ |

I got innumerable TLEs, and most of the time it TLEd on only 1 test case. I used lower_bound and then implemented my own binary search function, but neither worked. Don't know how to improve my solution further :)

→ Reply

AsishKumar

2 years ago, # ^ |

I got AC on problem D. Basically, I created a map<int, vector<int>> and stored the indices of every element's occurrences in that map. Then, for each query, I took lower_bound of l and upper_bound of r on the vector stored for that element in the map.

→ Reply

notwatermango

2 years ago, # |

how to represent line in E ?

→ Reply

2 years ago, # ^ |

I represented a line as the set of all points (in the input) it passes through

→ Reply

2_3_3

2 years ago, # ^ |

+11

my sol: let $$$vis_{i,j}$$$ be 1 if the line through points $$$i$$$ and $$$j$$$ has already been counted, and 0 otherwise. Then, whenever we process a new line, we can simply mark all pairs of points on it.

sample code

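	// n, k (the threshold from the statement), vis, ans and coline() are assumed to be defined elsewhere
	for(int i=1;i<=n;i++)
		for(int j=i+1;j<=n;j++){
			if(vis[i][j])continue; // this line was already handled
			vector<int>p; // all points on the line through point i and point j
			p.push_back(i);
			p.push_back(j);
			// a smaller index on this line would already have set vis[i][j], so start from j+1
			for(int t=j+1;t<=n;t++)
				if(coline(i,j,t))
					p.push_back(t);
			for(size_t t1=0;t1<p.size();t1++)
				for(size_t t2=t1+1;t2<p.size();t2++)
					vis[p[t1]][p[t2]]=1;
			if((int)p.size()>=k)
				ans++;
		}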

→ Reply

jainmilind

2 years ago, # ^ |

+22

I made a set<array<int64_t, 3>> and stored as form $$$a x + b y + c = 0$$$ reduced to lowest form using $$$gcd(a, gcd(b, c))$$$

and stored both $$$a,b,c$$$ and $$$-a,-b,-c$$$ in set

and printed size of set / 2 as answer
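A minimal sketch of this idea, with two assumptions that are not from the comment above: the sign of $$$(a, b, c)$$$ is normalized so that each line is stored once (instead of storing both signs and halving), and coordinates are small enough that $$$a x + b y$$$ fits in 64 bits. The names `normalize_line` and `count_lines` are hypothetical, and the special case of the problem where the answer is infinite is not handled.

```cpp
#include <array>
#include <cstdint>
#include <numeric>
#include <set>
#include <utility>
#include <vector>

using Line = std::array<int64_t, 3>;

// Canonical (a, b, c) with a*x + b*y + c = 0 for the line through two distinct points.
Line normalize_line(int64_t x1, int64_t y1, int64_t x2, int64_t y2) {
    int64_t a = y2 - y1;
    int64_t b = x1 - x2;
    int64_t c = -(a * x1 + b * y1);
    // std::gcd uses absolute values; g > 0 because the points are distinct.
    const int64_t g = std::gcd(std::gcd(a, b), c);
    a /= g; b /= g; c /= g;
    // Fix the sign so that (a, b, c) and (-a, -b, -c) map to the same triple.
    if (a < 0 || (a == 0 && b < 0)) { a = -a; b = -b; c = -c; }
    return {a, b, c};
}

// Counts distinct lines passing through at least k of the given distinct points (k >= 2), in O(N^3 log N).
int count_lines(const std::vector<std::pair<int64_t, int64_t>>& pts, int k) {
    std::set<Line> lines;
    const int n = static_cast<int>(pts.size());
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            const Line l = normalize_line(pts[i].first, pts[i].second,
                                          pts[j].first, pts[j].second);
            int on_line = 0;
            for (const auto& [x, y] : pts)
                if (l[0] * x + l[1] * y + l[2] == 0) ++on_line;
            if (on_line >= k) lines.insert(l);
        }
    }
    return static_cast<int>(lines.size());
}
```

Because the triple is canonical, the set removes duplicates on its own, which is one way to "not count the same line more than once".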

→ Reply

ussef_abdallah

2 years ago, # ^ |

← Rev. 2 →

We know that the equation of a line is $$$y = mx + c$$$, where $$$m$$$ is the slope of the line and $$$c$$$ is its y-intercept. To avoid working with floating-point values, let's manipulate this equation a bit. With $$$m = \frac{\Delta{y}}{\Delta{x}}$$$ we have $$$y = \frac{\Delta{y}}{\Delta{x}} x + c \iff y\Delta{x} = x \Delta{y} + c \Delta{x}$$$. Now, we will represent each line with the triplet $$$(\Delta{x},\Delta{y}, c\Delta{x})$$$, where $$$c\Delta{x} = y\Delta{x} - x\Delta{y}$$$ for any point $$$(x, y)$$$ on the line. To avoid the same slope being represented by pairs that differ by a constant factor, we first compute $$$g = gcd(\Delta{x},\Delta{y})$$$ once and then re-assign $$$\Delta{x} = \frac{\Delta{x}}{g}$$$ and, similarly, $$$\Delta{y} = \frac{\Delta{y}}{g}$$$. With this representation, we can easily check if a point lies on a line. My submission here.

→ Reply

spookywooky

2 years ago, # |

+22

"Be careful not to count the same line more than once."

Haha, super funny. Can you at least explain how that works?

→ Reply

2 years ago, # ^ |

← Rev. 2 →

I represented a line as the set of all points (in the input) it passes through. There are O(N^2) lines that pass through at least 2 points, and each line has size O(N), so you can explicitly enumerate all such lines in this representation.

(optimization: instead of "set of points" you can use "sorted list of indices of points")

→ Reply

2 years ago, # ^ |

A line can be defined by its endpoints in the input; using a set, you can store all valid lines as pairs of points.

→ Reply

2 years ago, # |

The idea for D: Range Count Query is essentially the same as CF Div3D: Distinct Character Queries

For each number in $$$[1, N]$$$, keep a vector denoting the indices where this element is present. If you fill it from left to right, this vector would be sorted. The answer for each query would then be $$$upper\_bound(R) - lower\_bound(L)$$$ on this position vector.

You might think that the complexity might jump to $$$O(n^2)$$$ if you keep a vector for each element, because a single element can occur $$$n$$$ times. The answer to that question is the same as for: while creating an adjacency list of a graph, why doesn't the time complexity jump to $$$O(n^2)$$$? Each index is stored in exactly one vector, so the total size of all vectors is $$$n$$$.

Time Complexity : $$$O(n + q \log{n})$$$

Code

#include <bits/stdc++.h>
using namespace std;

void solve() {
    int n; cin >> n;
    vector<vector<int>> adj(n + 1);
    for(int ind = 1; ind <= n; ind++) {
        int ele; cin >> ele;
        adj[ele].push_back(ind);
    }

    int q; cin >> q;
    for(int i = 0; i < q; i++) {
        int left, right, x;
        cin >> left >> right >> x;
        auto start_itr = lower_bound(adj[x].begin(), adj[x].end(), left);
        auto end_itr = upper_bound(adj[x].begin(), adj[x].end(), right);
        cout << end_itr - start_itr << "\n";
    }
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    solve();
    return 0;
}

...

→ Reply

oversolver

2 years ago, # ^ |

or same as Static Range Frequency

→ Reply

2 years ago, # ^ |

← Rev. 2 →

I think there are significant differences both ways. For Range Count Query, I constructed the frequency dictionary of A[0:i] for each i and answered each query offline. But this doesn't work if operations can change A. And I think it's a legitimately different algorithm from yours because the time complexity is O(N + Q log Q).

Likewise you can solve Distinct Character Queries with a segment tree for each letter of the alphabet, but it doesn't generalize to large alphabets.

→ Reply

2 years ago, # ^ |

Yes, of course. I didn't mean that the problems are same, I meant that if you've read the editorial of Distinct Character Queries, you'd instantly get the idea for solving Range Count Query (as the former contained the trick of maintaining sorted indices of positions and binary searching).

RCQ is a subset of DCQ but not the other way round, due to update queries.

→ Reply

2 years ago, # ^ |

Actually I also meant that RCQ is not a subset of DCQ, since the "use 26 segment trees" solution to DCQ cannot be generalized to larger A easily (I think?)

You are right that the editorial solution to DCQ is easier to generalize though

→ Reply

llc5pg

2 years ago, # ^ |

I just learnt that O(n + qlogn) is not the fastest solution. By treating the queries offline, we can achieve O(n + q) instead.

→ Reply

2 years ago, # ^ |

But won't you need to sort the queries for offline processing, thereby introducing a $$$\log{q}$$$ factor?

→ Reply

llc5pg

2 years ago, # ^ |

Here is how. This is not my submission. I just saw the author explained how it can be done: Submission
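One way the offline idea can avoid sorting, as a sketch under my own assumptions (it is not necessarily what the linked submission does): bucket each query at positions `left - 1` and `right`, sweep the array once while maintaining a running count per value, and take the difference of the two prefix counts. The names `Query` and `answer_offline` are hypothetical, and values as well as queried `x` are assumed to lie in $$$[1, N]$$$.

```cpp
#include <utility>
#include <vector>

struct Query {
    int left, right, x;  // 1-indexed inclusive range and the value to count
};

// Answers all queries in O(n + q) total, without sorting them.
std::vector<int> answer_offline(const std::vector<int>& a, const std::vector<Query>& queries) {
    const int n = static_cast<int>(a.size());
    // events[p] holds (query index, sign): add sign * (occurrences of x in a[1..p]).
    std::vector<std::vector<std::pair<int, int>>> events(n + 1);
    for (int i = 0; i < static_cast<int>(queries.size()); ++i) {
        events[queries[i].left - 1].push_back({i, -1});
        events[queries[i].right].push_back({i, +1});
    }
    std::vector<int> cnt(n + 1, 0);
    std::vector<int> ans(queries.size(), 0);
    for (int p = 0; p <= n; ++p) {
        if (p >= 1) ++cnt[a[p - 1]];  // a is stored 0-indexed; p is the 1-indexed prefix length
        for (const auto& [qi, sign] : events[p])
            ans[qi] += sign * cnt[queries[qi].x];
    }
    return ans;
}
```

Each query contributes exactly two events, so the sweep touches $$$n + 2q$$$ items in total.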

→ Reply

Yatin_Kwatra

2 years ago, # |

+14

How to solve G?

→ Reply

qiuzx

2 years ago, # ^ |

← Rev. 2 →

Let $$$x$$$ be the gcd of the numbers on the path. We first calculate the sum of $$$k$$$ for all possible $$$x$$$ with only the constraint that all numbers should be a multiple of $$$x$$$ (which means that the gcd might be $$$dx(d>1)$$$ instead of $$$x$$$, but we will deal with that later).

We can now mark every node whose number is a multiple of $$$x$$$ as passable, which means the path we take can only pass through passable nodes. We can calculate the answer of each connected component (it's easy to see that they are trees) separately because we can never move from one component to another.

The contribution of two nodes $$$i,j$$$ will be $$$dep_i+dep_j-2dep_l+1$$$, where $$$l$$$ is the LCA of $$$i$$$ and $$$j$$$. We fix each vertex in turn as $$$l$$$ and can calculate the answer with simple math. Then we add the answers together for all connected components to get the answer of the current problem.

Now we have the answer when the $$$gcd$$$ is a multiple of $$$x$$$, but we need the answer when $$$gcd=x$$$. Let $$$f_x$$$ be that answer; then we should subtract from the current answer the total $$$f$$$ over all multiples $$$dx(d>1)$$$ of $$$x$$$, because the paths whose $$$gcd=dx(d>1)$$$ are counted exactly once in those values.

Then we get an $$$O(n\omega(n))$$$ ($$$\omega(n)$$$ denotes the maximum number of divisors of a number below $$$n$$$; when $$$n=10^5$$$, it's about $$$100$$$) solution if carefully implemented.
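The subtraction step can be sketched as follows. This is only an illustration: the function name `exact_gcd_totals` and the array layout are hypothetical, the per-component tree computation that produces the input is not shown, and any modular reduction is omitted.

```cpp
#include <cstdint>
#include <vector>

// g[x] = total over paths whose numbers are all multiples of x (index 0 is unused).
// Returns f, where f[x] = total over paths whose gcd is exactly x.
std::vector<int64_t> exact_gcd_totals(const std::vector<int64_t>& g) {
    std::vector<int64_t> f(g);
    const int limit = static_cast<int>(g.size()) - 1;
    // Go from large x to small x so that f[m] is already final for every multiple m of x.
    for (int x = limit; x >= 1; --x)
        for (int m = 2 * x; m <= limit; m += x)
            f[x] -= f[m];
    return f;
}
```

The double loop performs a harmonic-sum number of steps, so this part is cheap compared with building `g`.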

→ Reply

itachi_fam

2 years ago, # |

how to solve c without dp?

→ Reply

ElectroMaster3

2 years ago, # ^ |

← Rev. 2 →

I am not sure, but I think it's only solvable using DP. The reason is that it has small constraints, so I think it is meant to be solved by DP. Also, the sum is <=K, so I think DP is needed. If it was =K then I'd use stars and bars.

→ Reply

2 years ago, # ^ |

Can't you apply stars and bars for all 1 <= i <= k?

→ Reply

ElectroMaster3

2 years ago, # ^ |

I think you can. The thing is that you should try to put it in a form with lower bound 0 (0<=). It is a very common trick, I believe. Let's say that you need a+b+c=n with 1<=a,b,c<=n. Then what you need to do is rewrite it as (a+1)+(b+1)+(c+1)=n, so that the new variables satisfy 0<=a,b,c<=n-1. Then the solvable form is a+b+c=n-3. Please correct me if I am wrong, since I am not that good at math.
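A worked instance of the shift described above (a sketch of the standard stars-and-bars count, not taken from the thread): substituting $$$a = a' + 1$$$, $$$b = b' + 1$$$, $$$c = c' + 1$$$ with $$$a', b', c' \ge 0$$$ turns $$$a + b + c = n$$$ into $$$a' + b' + c' = n - 3$$$, which has $$$\binom{(n-3)+2}{2} = \binom{n-1}{2}$$$ solutions. For $$$n = 5$$$ this gives $$$\binom{4}{2} = 6$$$: the three orderings of $$$(1,1,3)$$$ plus the three orderings of $$$(1,2,2)$$$.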

→ Reply

2 years ago, # ^ |

← Rev. 2 →

I actually meant this:

Since K<=2500, you can apply stars and bars to count the sequences with total sum 'X' for every N <= X <= K and add those counts up, right?

should be something like O(K*(N+K))

→ Reply

ElectroMaster3

2 years ago, # ^ |

← Rev. 2 →

Oh, I'm not sure. You see, you need a modular inverse to count it, right? Because we need to take the answer modulo a number and there is a division involved in the stars and bars formula, right? (I haven't tried it myself, so I'm not sure whether it's possible or not. But I think it's quite hard, don't you think?)

→ Reply

2 years ago, # ^ |

Yeah, will try this approach. It should be possible with complementary counting.

→ Reply

2 years ago, # ^ |

← Rev. 3 →

I think you're trying to solve for

a1 + a2 + ... + an = k, where ai>=M, but the problem is actually ai<=M (which I guess can be done with complementary counting). Is it possible to do it without complementary counting?

UPD: For those wondering about the PIE complementary counting method:

Inclusion/Exclusion in O(KlogN)

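int calc(int a, int b, int lo=1, int hi=n){
    // Distribute 'a' candies to 'b' children such that each child gets at least 'lo' and at most 'hi' candies.
    // Assumes comb(i,j) runs in O(logN) and returns 0 when i<0 or j>i; modular reduction is left to comb and the caller.
    a-=lo*b;            // shift every child down by 'lo', so each now gets 0..hi-lo
    int w=hi-lo+1;      // a child violates the upper bound once it holds at least w candies
    a+=b-1;             // stars and bars: 'a' stars plus b-1 bars
    int tot = comb(a,b-1);
    for(int i = 1; i <= b; i++)   // inclusion-exclusion over i children forced to violate
        tot+=(i%2?-1:1)*comb(b,i)*comb(a-i*w,b-1);
    return tot;
}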
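For the "sum at most K" version discussed in this thread, here is a self-contained sketch of the same inclusion-exclusion, under assumptions of my own: the modulus 998244353, the helper names `power_mod` and `count_sequences`, and the extra slack variable that absorbs the unused part of K are all illustrative and not taken from the comment above.

```cpp
#include <cstdint>
#include <vector>

constexpr int64_t MOD = 998244353;  // assumed modulus

int64_t power_mod(int64_t base, int64_t exp) {
    int64_t result = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) result = result * base % MOD;
        base = base * base % MOD;
        exp >>= 1;
    }
    return result;
}

// Counts sequences A_1..A_n with 1 <= A_i <= m and A_1 + ... + A_n <= k, modulo MOD.
int64_t count_sequences(int n, int m, int k) {
    const int s = k - n;  // budget left after giving 1 to every position
    if (s < 0) return 0;
    // Factorials up to k = s + n cover every binomial used below.
    std::vector<int64_t> fact(k + 1), inv_fact(k + 1);
    fact[0] = 1;
    for (int i = 1; i <= k; ++i) fact[i] = fact[i - 1] * i % MOD;
    inv_fact[k] = power_mod(fact[k], MOD - 2);
    for (int i = k; i >= 1; --i) inv_fact[i - 1] = inv_fact[i] * i % MOD;
    auto comb = [&](int64_t a, int64_t b) -> int64_t {
        if (b < 0 || a < b) return 0;
        return fact[a] * inv_fact[b] % MOD * inv_fact[a - b] % MOD;
    };
    int64_t total = 0;
    for (int i = 0; i <= n; ++i) {
        // Force i chosen positions to exceed m - 1 (after the shift); each consumes m extra units.
        const int64_t rest = s - static_cast<int64_t>(i) * m;
        if (rest < 0) break;
        // n shifted variables plus one slack variable sum to rest: C(rest + n, n) ways.
        const int64_t term = comb(n, i) * comb(rest + n, n) % MOD;
        total = (i % 2 ? total - term + MOD : total + term) % MOD;
    }
    return total;
}
```

The slack variable replaces the outer loop over every total X from N to K, so only about N binomials are evaluated after the factorial tables are built.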

→ Reply

ivatopuria

2 years ago, # ^ |

← Rev. 3 →

yes we can, with stars and bars + complementary counting (count the number of combinations with at least one variable (box) being $$$>M$$$). here is submission which uses that technique.

time complexity is $$$O(n \log(mod))$$$ but can be optimized to $$$O(n)$$$ if we precalculate modular inverses.

it's strange that they didn't mention this approach in the editorial

→ Reply

Penguin07

2 years ago, # ^ |

Bonus 2 in the editorial uses formal power series in O(k).

→ Reply

llc5pg

2 years ago, # ^ |

I did not want to use DP, so I wrote the answer using memoized recursion. Submission

→ Reply

Penguin07

2 years ago, # |

In problem E I represent a line with (delta_y, delta_x, minimum_index_of_point_on_this_line) if the line isn't vertical and (1, 0, x_coordinate) if the line is vertical. submission

→ Reply

Misuki

2 years ago, # ^ |

+14

Actually we can uniquely determine a line by the two smallest indices of points that are on the line, and to implement it just allocate a 2D bool array. I find it easier to code :)

→ Reply

Skeef79

2 years ago, # |

Simpler version of problem G: 990G - GCD Counting (it uses the same idea, but instead of counting the sum of all path lengths you just need to count the number of paths)

→ Reply

SummerSky

2 years ago, # |

← Rev. 2 →

+19

Would anyone like to share some different solutions to problem F, other than the one in the editorial? I find it difficult to determine the state of dp.

UPD1, I don't understand how to find that there could only be two states, state0 and state1, as mentioned in the editorials.

→ Reply

2 years ago, # ^ |

← Rev. 2 →

+30

For a smoother experience, read this gist which has embedded images.

Solving this problem requires a deep understanding of how Subset Sum DP works. In fact, with a proper abstraction, you can almost convert it to a Subset-Sum DP problem.

In the subset-sum problem, we have a zero-indexed array of $$$n$$$ elements. To simplify the mathematical notation, for each node $$$x$$$, define the node vertically below it as $$$x^*$$$. If we capture each $$$x$$$ and $$$x^*$$$ in a rectangular box, we will get an array of $$$n$$$ blocks (let's say, zero-indexed), and we need to perform some operations on these blocks. This is the first level of abstraction which makes the numbers in subset sum DP analogous to blocks in this problem.

Next, to figure out the DP definition, recall that in the subset-sum problem, even though we are supposed to find the number of subsets of the entire array with sum exactly equal to $$$k$$$, we define $$$dp[i][j]$$$ as the number of subsets of the prefix $$$[0, \dots i]$$$ with sum equal to $$$j$$$, for all $$$j \leq k$$$. Why is this so? It's because $$$j_1 + j_2$$$ can become equal to $$$k$$$. In fact, for each DP problem, all states can be broadly classified into 3 categories,

Directly useful
Indirectly useful
Hopeless

So, any subset with sum equal to $$$k$$$ is directly useful, subsets with sum less than $$$k$$$ are indirectly useful and subsets with sum greater than $$$k$$$ are hopeless.
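To make the three categories concrete, here is a minimal sketch of the counting version of subset sum. It uses a rolling one-dimensional table instead of $$$dp[i][j]$$$, and the function name `count_subsets_with_sum` is hypothetical. Sums above $$$k$$$ (the hopeless states) are simply never stored, while every sum below $$$k$$$ (the indirectly useful states) is kept because a later element may complete it.

```cpp
#include <cstdint>
#include <vector>

// Number of subsets of `a` (non-negative integers) whose sum is exactly k, for k >= 0.
int64_t count_subsets_with_sum(const std::vector<int>& a, int k) {
    std::vector<int64_t> dp(k + 1, 0);  // dp[j] = subsets of the processed prefix with sum j
    dp[0] = 1;                          // the empty subset
    for (int value : a) {
        // Iterate j downwards so each element is taken at most once.
        for (int j = k; j >= value; --j)
            dp[j] += dp[j - value];     // "take" adds to the "leave" count already in dp[j]
    }
    return dp[k];                       // only the directly useful state is reported
}
```

For example, with elements 1, 2, 3 and $$$k = 3$$$ the two subsets $$$\{3\}$$$ and $$$\{1, 2\}$$$ are counted.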

Recall that, in the Knapsack problem, we also maintain another dimension of DP capturing the total weight we've already used so far. Since we need to delete some number of edges from these blocks, let's define $$$dp[i][j]$$$ as the number of ways to delete exactly $$$j$$$ edges from blocks $$$[0, \dots i]$$$ such that the remaining graph is connected. Our answer would then be $$$dp[n - 1][j]$$$ for all $$$j$$$. This DP definition is incomplete, but in a contest, you're most likely to come up with this definition on the first try.

So, what's the flaw with this DP definition? It only captures states which are directly useful. Just as $$$j_1 + j_2$$$ could have become equal to $$$k$$$, in this case we could have 2 disconnected graphs which would later combine to create a connected one.

Now we know that we are missing indirectly useful states in our DP table, but we are not sure which ones exactly. Whenever you face this dilemma, just start the algorithm from the $$$0^{th}$$$ element, and you'll quickly realize what you're missing.

Notice that each block can contain at most 3 edges. Label them as top, bot and mid edges.

In subset sum DP, the elements are processed in an online fashion. Meaning, we first compute our answer assuming that we only have the prefix $$$[0, \dots i]$$$ and we investigate the transitions when we introduce the $$$i^{th}$$$ element. We then make one of 2 choices for the $$$i^{th}$$$ element: Take or Leave. So, let's start with the first block. It only has a mid edge. We can either take it or leave it, which leaves us with 2 possibilities. Notice that our current DP does not handle the leave case, because it makes the graph disconnected. But we can see that it's a valid choice since the adjacent block can make it connected. This is our first hint on the new state that we need to introduce.

Now, consider the second block. It has 3 edges: top, bot and mid. We can take or leave each edge independently, so there are 8 possibilities. Let us list all $$$2 \cdot 8$$$ possibilities for both blocks combined. (The first column of the result assumes that we kept the mid edge of the $$$0^{th}$$$ block and the second column assumes that we ignored it).

Can you spot the directly useful, indirectly useful and hopeless states from this graph? All states with one connected component are directly useful, and all the states where at least one connected component is isolated from the incoming block are hopeless. Since that incoming block can only interact with the terminal $$$x$$$ and $$$x^*$$$, we can conclude that any element which is a part of a connected component not containing $$$x$$$ or $$$x^*$$$ is hopeless, because no matter what you do later, you cannot make the entire graph connected. Hence, for a state to be not hopeless, it should have no connected component not containing $$$x$$$ or $$$x^*$$$.