Anthony_Smith's blog

By Anthony_Smith, history, 21 month(s) ago, In English

Problem Summary: There are N houses, each containing some number of children, and K of these N houses will contain a school. The distance between house i and house j is abs(i - j). Each child goes to the school nearest to their house. Find the minimum possible total distance.

Naive DP: Let dp[i][j] = the min total distance for houses 0 to i, where there are exactly j schools in those houses, and the last house (house i) is guaranteed to be a school.

Then, the transition is dp[i][j] = min over k < i of (dp[k][j - 1] + cost of children in houses k + 1 to i). Finding the "cost of children in houses k + 1 to i" is easy, since we know there is a school at both house k and house i: if m = (k + i) / 2, then houses k + 1 to m go to the school at house k, and houses m + 1 to i go to the school at house i. Below is code for finding the cost of children in houses a + 1 to b (given schools at houses a and b):

int m = (a + b) / 2; // houses a + 1..m go to house a, houses m + 1..b go to house b
long long total_dist = 0;
for (int i = a + 1; i <= m; ++i) total_dist += (long long)(i - a) * childs_at[i];
for (int i = m + 1; i <= b; ++i) total_dist += (long long)(b - i) * childs_at[i];

This loop is O(b - a), but we can make it O(1). Use prefix sums over the childs_at[] array, and also keep another array child_val[] that stores i * childs_at[i] for each index i (and prefix sums over that array). Then,

total_dist += child_val_sum(a + 1, m) - a * childs_at_sum(a + 1, m);
total_dist += b * childs_at_sum(m + 1, b) - child_val_sum(m + 1, b);

So this gives an $$$O(N^2K)$$$ solution.
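For concreteness, here is a sketch of the full $$$O(N^2K)$$$ DP described above (my own code, not from any editorial; it assumes 0-indexed houses and finishes by letting all houses to the right of the last school walk left to it):

```cpp
#include <vector>
#include <climits>
#include <algorithm>
using namespace std;

// dp[i][j] = min total distance for houses 0..i using j schools, with a
// school at house i; remaining houses i+1..n-1 walk left to that last school.
long long min_total_distance(const vector<long long>& childs_at, int K) {
    int n = childs_at.size();
    vector<long long> pc(n + 1, 0), pv(n + 1, 0);
    for (int i = 0; i < n; ++i) {
        pc[i + 1] = pc[i] + childs_at[i];                 // prefix sums of childs_at[i]
        pv[i + 1] = pv[i] + (long long)i * childs_at[i];  // prefix sums of i * childs_at[i]
    }
    auto csum = [&](int l, int r) { return l > r ? 0LL : pc[r + 1] - pc[l]; };
    auto vsum = [&](int l, int r) { return l > r ? 0LL : pv[r + 1] - pv[l]; };
    // cost of children in houses a+1..b, with schools at both a and b
    auto cost = [&](int a, int b) {
        int m = (a + b) / 2;
        return vsum(a + 1, m) - (long long)a * csum(a + 1, m)
             + (long long)b * csum(m + 1, b) - vsum(m + 1, b);
    };
    const long long INF = LLONG_MAX / 4;
    vector<vector<long long>> dp(n, vector<long long>(K + 1, INF));
    for (int i = 0; i < n; ++i) // base: first school at house i, houses left of it walk right
        dp[i][1] = (long long)i * csum(0, i - 1) - vsum(0, i - 1);
    for (int j = 2; j <= K; ++j)
        for (int i = 0; i < n; ++i)
            for (int k = 0; k < i; ++k)
                if (dp[k][j - 1] < INF)
                    dp[i][j] = min(dp[i][j], dp[k][j - 1] + cost(k, i));
    long long ans = INF;
    for (int i = 0; i < n; ++i)
        if (dp[i][K] < INF)
            ans = min(ans, dp[i][K] + vsum(i + 1, n - 1) - (long long)i * csum(i + 1, n - 1));
    return ans;
}
```

The three nested loops are the $$$O(N^2K)$$$ part; everything else is the O(1) cost evaluation via the two prefix-sum arrays.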

However, I'm not sure how to use the Convex Hull Trick to optimize this DP. The transition I currently have is dp[i][j] = min(dp[k][j - 1] + cost(k + 1, i)), with cost expanded into the formulas above. The main obstacle is the variable m = (i + k) / 2: I tried splitting all the subarray sums into prefix sums so the transition could be written as a linear function of the candidate k, but some of those prefix sums are evaluated at m, and I don't see how to express a term depending on m = (i + k) / 2 as a linear function of k alone.

Sorry if this is convoluted. Could someone confirm whether my logic is correct, or whether I'm missing something in this problem?

Note: I know that there's something called the "Divide and Conquer Optimization" that can be used to solve this problem. However, I want to try solving this with Convex Hull Trick for educational purposes. On this post at the USACO Guide Forum, it is said that CHT can be used to solve this problem: USACO Guide Link.


By Anthony_Smith, history, 23 months ago, In English

The well-known Parquet Problem asks the following: Given an N x M board, find the number of ways to tile it using only 1 x 2 or 2 x 1 tiles. We can solve this with various DP formulations that result in various time complexities, but one thing all these methods have in common is the use of the "DP on Broken Profile" method, that is, using a bitmask to represent the state of a certain set of cells (usually a row or column).

My question is: Can someone fully explain the DP state for the $$$O(2^NNM)$$$ DP-on-Broken-Profile approach to this problem, including how the base cases fit conceptually with the overall process? (As for the transitions, I'm (possibly wrongly) assuming that once I fully understand the DP-state, the transitions will be trivial)

The only explanation I could find for this approach comes from USACO Guide. However, the explanation is confusing and seems incomplete.

My understanding of their DP-state is that $$$dp[i][j][mask]$$$ = the number of ways to tile the grid so that columns 0 to $$$j - 2$$$ are completely filled. Meanwhile, column $$$j - 1$$$ may not be completely filled, but must be filled at least in rows 0 to $$$i$$$. And $$$mask$$$ records whether rows 0 to $$$i$$$ of column $$$j$$$, and rows $$$i + 1$$$ to $$$N - 1$$$ of column $$$j - 1$$$, are filled. Below is their explanation:

This explanation fits with the picture. But, when I actually run their provided program, I get DP-values that don't seem to agree with my understanding. Mainly, I don't understand what the DP-state means conceptually when $$$j = 0$$$. For instance, one value we get is that $$$dp[1][0][0011] = 1$$$. Based on my understanding, this basically represents the number of ways to tile only the leftmost-column, and only the bottommost two cells, so $$$1$$$ does make sense here.

But $$$dp[1][0][1100] = 0$$$! Shouldn't this also equal $$$1$$$, since it represents the number of ways to tile only the top two squares of the leftmost column?
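For reference, here is a broken-profile tiling counter in the cell-by-cell "rolling mask" formulation (my own sketch, an equivalent but differently laid-out state from the USACO Guide's $$$dp[i][j][mask]$$$): cells are scanned in column-major order, and the mask records which of the next N cells are already covered by a previously placed domino.

```cpp
#include <vector>
#include <algorithm>
using namespace std;

// Count domino tilings of an n-by-m grid with a cell-by-cell broken profile.
// Mask bit i says whether the cell i positions ahead of the current one
// (in column-major order) is already covered by a placed domino.
long long count_tilings(int n, int m) {
    if (n > m) swap(n, m); // keep the mask over the shorter side
    vector<long long> dp(1 << n, 0), ndp(1 << n, 0);
    dp[0] = 1;
    for (int col = 0; col < m; ++col)
        for (int row = 0; row < n; ++row) {
            fill(ndp.begin(), ndp.end(), 0);
            for (int mask = 0; mask < (1 << n); ++mask) {
                if (!dp[mask]) continue;
                if (mask & 1) {
                    ndp[mask >> 1] += dp[mask]; // current cell already covered
                } else {
                    // horizontal domino reaching into the next column
                    ndp[(mask >> 1) | (1 << (n - 1))] += dp[mask];
                    // vertical domino covering this cell and the one below it
                    if (row + 1 < n && !(mask & 2))
                        ndp[(mask >> 1) | 1] += dp[mask];
                }
            }
            swap(dp, ndp);
        }
    return dp[0]; // no domino may stick out past the last column
}
```

This runs in the stated $$$O(2^N \cdot N \cdot M)$$$; states whose mask is still nonzero after the last cell (a horizontal domino hanging off the board) are simply never counted.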


By Anthony_Smith, history, 2 years ago, In English

UPDATE: I actually managed to hack the Rust code with this testcase. It is the same as CSES test case #27, except that the order of the first two edges is swapped. The code fails if it sees the 1 2 -100 edge after the 1 3 1 edge, since that causes it to traverse the 1 2 -100 edge first (it traverses edges in reverse order of how they appear in the input), and apparently that is enough to make it output 3 instead of the correct answer, -1. According to the CSES guidelines, all submissions will be regraded. I wonder how this'll turn out!

Problem Summary: Given a directed and weighted graph, find the longest path from node 1 to node N, given that edges can be traversed multiple times. If there is an arbitrarily long path, then print -1. The problem link is here.

My method to solve this problem came straight from the CSES Book: Use the Bellman-Ford Algorithm to determine which nodes (if any) are in a positive cycle. If any positive cycle is reachable from both node 1 and node N, then return -1. Otherwise, just return the longest distance from node 1 to node N. Another method to solve this problem is by using a topological sort and finding the longest distance from node 1 to node N that way.
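A sketch of that Bellman-Ford approach (my own code, not the Rust solution; nodes are 1-indexed and edges are (a, b, w) triples):

```cpp
#include <vector>
#include <array>
#include <queue>
#include <climits>
using namespace std;

// Longest path from node 1 to node n; -1 if a positive cycle lies on some
// route from 1 to n. Run n rounds of relaxation: any node still relaxed in
// the final round is influenced by a positive cycle reachable from node 1,
// so we then check whether such a node can reach n.
long long high_score(int n, const vector<array<long long, 3>>& edges) {
    const long long NEG = LLONG_MIN / 4;
    vector<long long> dist(n + 1, NEG);
    dist[1] = 0;
    vector<int> tainted;
    for (int round = 0; round < n; ++round)
        for (const auto& e : edges) {
            int a = e[0], b = e[1];
            long long w = e[2];
            if (dist[a] == NEG) continue; // not yet reachable from node 1
            if (dist[a] + w > dist[b]) {
                dist[b] = dist[a] + w;
                if (round == n - 1) tainted.push_back(b);
            }
        }
    // BFS: can any tainted node reach node n?
    vector<vector<int>> adj(n + 1);
    for (const auto& e : edges) adj[e[0]].push_back(e[1]);
    vector<char> vis(n + 1, 0);
    queue<int> q;
    for (int v : tainted) if (!vis[v]) { vis[v] = 1; q.push(v); }
    while (!q.empty()) {
        int v = q.front(); q.pop();
        if (v == n) return -1;
        for (int u : adj[v]) if (!vis[u]) { vis[u] = 1; q.push(u); }
    }
    return dist[n];
}
```

Note that a positive cycle that cannot reach node n may leave garbage distances on its own nodes, but those values can never propagate to n, so dist[n] stays correct.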

However, the top-performing code for this problem comes from Rust, and seems to use a completely different method. Here is the code. My main language is C++ so I can kind of interpret some parts of the code. The program seems to keep extra variables first_edge and first_reverse_edge to access edges, instead of using an adjacency list. The function mark_useful_edges() seems to set all edges that can reach node N as useful. But I have no clue how the search() function works.

Can someone interpret this code, or reveal the algorithm or logic the code-writer used?

Note: I understand that simply caring about best performance is a narrow-minded way of learning algorithms. I am interested in this solution because it seems to be a simple way that I have never seen before, and still solves the problem quickly. If this code involves a new technique, I want to learn it.


By Anthony_Smith, history, 2 years ago, In English

I recently came across Problem 95E. The problem pretty much asks "Given a set of coins whose values sum to $$$N$$$, find for each number from $$$1$$$ to $$$N$$$ the minimum number of coins needed to create that total sum".

I know of a way to do this with an $$$O(\log N)$$$ factor (if a value $$$w$$$ occurs $$$o$$$ times, split it into $$$O(\log o)$$$ items $$$\{w, 2w, 4w, \ldots, 2^{\lfloor \log_2 o \rfloor} w\}$$$, with the last piece shrunk so the piece counts sum to $$$o$$$, then do a regular knapsack), but I also want to learn how to do it with an $$$O(\sqrt N)$$$ factor. The general idea I've learned is that there are at most $$$O(\sqrt N)$$$ distinct coin values, and that there exists some way to process all coins of one fixed coin-value in $$$O(N)$$$. But I don't know how to process all occurrences of a fixed coin-value in $$$O(N)$$$. I have read this blog, which uses the approach of finding for each total sum the minimum number of the current coin needed to create it. But the blog is for 0/1 Knapsack only, while here I want to find the minimum number of coins needed for each total sum.
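To make the $$$\log$$$-factor version concrete, here is a sketch (my own code; `cnt` maps each coin value to its multiplicity, and all coin values sum to N):

```cpp
#include <vector>
#include <map>
#include <climits>
#include <algorithm>
using namespace std;

// Bounded knapsack with binary splitting: split c copies of value v into
// pieces of sizes 1, 2, 4, ..., remainder, then run a 0/1 min-count knapsack
// over the pieces. Any count 0..c is representable, so no answers are lost.
vector<long long> min_coins(int N, const map<int, int>& cnt) {
    const long long INF = LLONG_MAX / 2;
    vector<long long> dp(N + 1, INF);
    dp[0] = 0;
    for (auto [v, c] : cnt) {
        for (int take = 1; c > 0; take *= 2) {
            int t = min(take, c); // one piece worth t coins of value v
            c -= t;
            long long w = (long long)v * t;
            for (long long s = N; s >= w; --s) // 0/1 pass, right to left
                if (dp[s - w] != INF)
                    dp[s] = min(dp[s], dp[s - w] + t);
        }
    }
    return dp; // dp[s] = min #coins with total value exactly s (INF if none)
}
```

This is the $$$O(N \log N)$$$-flavored baseline the $$$O(N\sqrt N)$$$ method should beat.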

Can someone provide a detailed explanation/implementation of the $$$O(N\sqrt N)$$$ Knapsack Algorithm? Thanks!


By Anthony_Smith, history, 2 years ago, In English

An Aho-Corasick Automaton can solve the problem, "Given some text T and a set of patterns P, find every occurrence (every index) of every pattern in T", in O(total number of occurrences + |T| + total length of all patterns) time. My question is, why is the Aho-Corasick Automaton necessary when we can solve this problem with Suffix Arrays too? Are suffix arrays too slow in competition problems? Or am I simply misunderstanding the inner workings of these concepts?

Valid (I think) method using Suffix Arrays: Construct a suffix array over T in O(|T|log|T|) time. Then, for each pattern, we can find all its occurrences in O(pattern length * log|T| + number of occurrences) time by binary searching through the suffix array (see CSES Finding Patterns), then looping through all matching suffixes and collecting their starting indices (the indices where the pattern occurs). So the final time complexity is O(|T|log|T| + total pattern length * log|T| + total number of occurrences). This is pretty much the Aho-Corasick Automaton's time complexity multiplied by log|T|, which doesn't seem like a big difference to me.
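As a sanity check on that method, here is a sketch (my own code; the build is the simple $$$O(|T|\log^2|T|)$$$ sort-based one, which is enough to illustrate the binary search — a radix-sort build gets the stated $$$O(|T|\log|T|)$$$):

```cpp
#include <string>
#include <vector>
#include <numeric>
#include <algorithm>
using namespace std;

// Sort-based suffix array: repeatedly sort suffixes by (rank, rank k ahead).
vector<int> build_sa(const string& s) {
    int n = s.size();
    vector<int> sa(n), rk(n), tmp(n);
    iota(sa.begin(), sa.end(), 0);
    for (int i = 0; i < n; ++i) rk[i] = s[i];
    for (int k = 1; ; k *= 2) {
        auto cmp = [&](int a, int b) {
            if (rk[a] != rk[b]) return rk[a] < rk[b];
            int ra = a + k < n ? rk[a + k] : -1;
            int rb = b + k < n ? rk[b + k] : -1;
            return ra < rb;
        };
        sort(sa.begin(), sa.end(), cmp);
        tmp[sa[0]] = 0;
        for (int i = 1; i < n; ++i)
            tmp[sa[i]] = tmp[sa[i - 1]] + (cmp(sa[i - 1], sa[i]) ? 1 : 0);
        rk = tmp;
        if (rk[sa[n - 1]] == n - 1) break; // all ranks distinct: done
    }
    return sa;
}

// All starting indices of pattern p in text t (in suffix-array order):
// binary search for the first suffix >= p, then scan the matching range.
vector<int> occurrences(const string& t, const vector<int>& sa, const string& p) {
    auto less_than = [&](int suf, const string& pat) {
        return t.compare(suf, pat.size(), pat) < 0; // compare p-length prefix
    };
    auto lo = lower_bound(sa.begin(), sa.end(), p, less_than);
    auto hi = lo;
    while (hi != sa.end() && t.compare(*hi, p.size(), p) == 0) ++hi;
    return vector<int>(lo, hi);
}
```

The scan over the matching range costs O(|p|) per occurrence here; collecting only the indices, as the post assumes, brings that term down to O(number of occurrences).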


By Anthony_Smith, history, 2 years ago, In English

Here's a problem I recently came across.

Problem Summary: You are given an N * M grid, where N * M <= 40 (1 <= N, M <= 40). Initially, each square of the grid contains a spider. Then, suddenly, every spider either moves one square up, down, left, or right, or stays still (and spiders cannot move outside the grid). A square may hold any number of spiders. Find the maximum possible number of spider-free cells after this move.

Solutions I've found online (including the official editorial) generally say something like the following:

  • WLOG assume N <= M. Then N <= 6 (since N * M <= 40).

  • Let dp[n][m1][m2] = the minimum number of occupied cells in rows 0..n, such that row n can be described by the bitmask m1, and row n + 1 can be described by the bitmask m2.

  • "To make a transition from D[k-1][..][..] we can iterate over all possible masks for the k-1-st column, check whether we can distribute spiders in kth column knowing the masks for k+1-st and k-1-st columns and find the minimal value of D[k-1][..][pmask] for all such masks." So this makes the time complexity O(N * 2^(3M)) = O(N * 8^M).

I don't understand the explanation for transitions. I think that what it's saying is that for dp[n-1][m1][m2] you can transition to dp[n][m2][..], where you loop over the state of the bitmask in row n + 1. But my questions are:

  • Can someone confirm if my current understanding of the transitions is correct? Is there a better way to formulate this DP?

  • How do we check if a certain pair of masks works?


By Anthony_Smith, history, 2 years ago, In English

I recently came across this problem. There is no official editorial (except for a few short comments in this thread), and I have not been able to find an explanation I understand.

Problem Summary: You are given N (1 <= N <= 24) tennis balls, each located at a distinct lattice point, as well as a tennis basket also located at a lattice point distinct from all other lattice points. At one time, you can hold at most 2 tennis balls. Find the minimum total squared distance it'll take to put all tennis balls in the basket, if you start at the basket. Also:

  • Every lattice point is from (-100, -100) to (100, 100)

  • The time limit is an unusually high 4 seconds

An $$$O(2^N \cdot N^2)$$$ solution is obvious: do a subset/bitmask DP over all N tennis balls. To find dp[i], simply test every individual set bit of the number i, as well as every pair of set bits of i (representing taking one or two balls on the previous trip), and transition from that previous dp-state. There are $$$2^N$$$ states and $$$O(N^2)$$$ pairs of set bits per state, so the overall complexity is $$$O(2^N \cdot N^2)$$$.

But N = 24, so this times out. In the comments, they mention a $$$O(2^N * N)$$$ solution, where for each dp[i] you only need to test N pairs of tennis balls for transitions. Specifically, they say that for dp[i], we only need to test pairs that contain the first set bit in the number i, which reduces the number of pairs, and therefore the number of transitions, to $$$O(N)$$$. Apparently, this approach still manages to cover all possibilities for transitions, but I cannot understand why.
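Here is a sketch of that $$$O(2^N \cdot N)$$$ DP (my own code; cost1[a] is the squared-distance cost of a round trip basket -> ball a -> basket, and cost2[a][b] of basket -> a -> b -> basket, precomputed as the cheaper of the two visiting orders):

```cpp
#include <vector>
#include <climits>
#include <algorithm>
using namespace std;

// dp[mask] = min cost to collect exactly the balls in mask, starting and
// ending at the basket. Trips are unordered, so we may assume the trip that
// collects the lowest-indexed ball of mask happens last; pairing only that
// bit with every other bit keeps every optimal solution reachable.
long long min_total_cost(int n, const vector<long long>& cost1,
                         const vector<vector<long long>>& cost2) {
    vector<long long> dp(1 << n, LLONG_MAX / 2);
    dp[0] = 0;
    for (int mask = 1; mask < (1 << n); ++mask) {
        int a = __builtin_ctz(mask); // lowest-indexed ball still uncollected
        dp[mask] = dp[mask ^ (1 << a)] + cost1[a]; // last trip takes a alone
        for (int b = a + 1; b < n; ++b)
            if (mask >> b & 1)
                dp[mask] = min(dp[mask],
                               dp[mask ^ (1 << a) ^ (1 << b)] + cost2[a][b]);
    }
    return dp[(1 << n) - 1];
}
```

The intuition for why fixing the lowest set bit is safe: every ball in mask must be picked up by some trip, the trips commute, so reordering them to put ball a's trip last changes nothing.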

Can someone provide some insights/hints to this problem's solution?


By Anthony_Smith, history, 2 years ago, In English

Consider the following problem:

There are N cells in a row, and cell i must end up colored with color C_i. In one step, you may choose any subarray of the N cells and color each of its cells with its required color; this step costs (number of distinct required colors in that subarray)^2. Find the minimum total cost to color all N cells.

I was able to quickly think of a simple O(N^2) DP solution: Let dp[n] = the min cost to color cells 0 to n. Then dp[n] transitions from all dp[<n]: O(N) transitions for each of O(N) states.
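That quadratic DP can be sketched as follows (my own code; it maintains the distinct-color count incrementally while the left endpoint of the last painted subarray moves left):

```cpp
#include <vector>
#include <unordered_map>
#include <climits>
#include <algorithm>
using namespace std;

// dp[i] = min cost to color cells 0..i-1; the last step paints c[j..i-1].
// Overlapping paints can always be shrunk into a partition without raising
// any step's distinct count, so partitions suffice.
long long min_cost(const vector<int>& c) {
    int n = c.size();
    vector<long long> dp(n + 1, LLONG_MAX / 2);
    dp[0] = 0;
    for (int i = 1; i <= n; ++i) {
        unordered_map<int, int> freq; // color frequencies in c[j..i-1]
        long long distinct = 0;
        for (int j = i - 1; j >= 0; --j) {
            if (freq[c[j]]++ == 0) ++distinct;
            dp[i] = min(dp[i], dp[j] + distinct * distinct);
        }
    }
    return dp[n];
}
```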

However, in this problem, N <= 50,000 (and C_i <= 50,000 as well, but I'm not sure why this matters). This, combined with the fact that the cost function is the square of the number of distinct colors, makes me think there is some sort of sqrt-bash solution. Also, dp[n] is monotonically increasing, but I'm not sure how this helps...

Can anyone provide some hints/a solution to this problem?


By Anthony_Smith, history, 2 years ago, In English

The problem of finding the Longest Bitonic Subsequence (the longest subsequence that either first decreases then increases, or first increases then decreases) is well known. We can just do DP: for each prefix keep the longest increasing and longest decreasing subsequence, and likewise for each suffix. Then loop through all possible "middle" indices (indices where the sequence switches from increasing to decreasing or the other way around) and combine. The complexity would be O(NlogN) (O(NlogN) for the DPs, then O(N) to test the N candidate middles), I think.
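A sketch of that prefix/suffix DP (my own code; it handles the increase-then-decrease orientation, and running it again on the negated array covers decrease-then-increase):

```cpp
#include <vector>
#include <algorithm>
using namespace std;

// inc[i] = LIS ending at i; dec[i] = longest strictly decreasing subsequence
// starting at i. The longest "up then down" subsequence peaks at some i with
// length inc[i] + dec[i] - 1 (the peak is counted once).
int longest_bitonic(const vector<int>& a) {
    int n = a.size();
    vector<int> inc(n), dec(n);
    vector<int> tails; // patience-sorting tails for O(N log N) LIS
    for (int i = 0; i < n; ++i) {
        auto it = lower_bound(tails.begin(), tails.end(), a[i]);
        inc[i] = (it - tails.begin()) + 1;
        if (it == tails.end()) tails.push_back(a[i]); else *it = a[i];
    }
    tails.clear();
    for (int i = n - 1; i >= 0; --i) { // LIS scanned right-to-left = LDS from i
        auto it = lower_bound(tails.begin(), tails.end(), a[i]);
        dec[i] = (it - tails.begin()) + 1;
        if (it == tails.end()) tails.push_back(a[i]); else *it = a[i];
    }
    int best = 0;
    for (int i = 0; i < n; ++i) best = max(best, inc[i] + dec[i] - 1);
    return best;
}
```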

But I was wondering: what if we are given queries that ask for the LBS of a subarray? As far as I know, this has not appeared in any contest I can find online. I thought we could use a Segment Tree for this, but I don't know what we'd need to maintain in the nodes. I think we would need to maintain the LIS and LDS (longest increasing/decreasing subsequences) ending at each index, as well as the longest bitonic subsequence ending at each index. But this would make queries O(N), which I don't think is good enough. Is there another way to do this?

EDIT: Actually, queries would be O(N^2), I think. My idea is that, when merging a left and a right subarray, for each number in the right subarray we loop over all indices in the left subarray to see if that number can extend a sequence ending there.
