Help on optimizing a naive DP solution

→ Pay attention

Before contest
2023 Post World Finals Online ICPC Challenge powered by Huawei
11:59:23
Register now »

→ Top rated

#	User	Rating
1	tourist	3690
2	jiangly	3647
3	Benq	3581
4	orzdevinwang	3570
5	Geothermal	3569
5	cnnfls_csy	3569
7	Radewoosh	3509
8	ecnerwala	3486
9	jqdai0815	3474
10	gyh20	3447

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	maomao90	174
2	awoo	165
3	adamant	161
4	TheScrasse	160
5	nor	158
6	maroonrk	156
7	-is-this-fft-	152
8	orz	146
9	SecondThread	145
9	pajenegod	145

View all →

→ Find user

→ Recent actions

Detailed →

Anthony_Smith's blog

Help on optimizing a naive DP solution

By Anthony_Smith, history, 2 years ago, In English

Consider the following problem:

There are N cells in a row, each of which must be colored with a color C_i. In one step, you can choose to color any subarray of the N cells with their corresponding colors, but this has a cost equivalent to (number of distinct colors in that subarray)^2. Find the minimum cost to color all N cells.

I was able to quickly think of a simple O(N^2) DP solution: Let dp[n] = the cost to color cells 0 to n. Then, dp[n] transitions from all dp[<n]; O(N) transitions for O(N) states.

However, in this problem, N <= 50,000 (and C_i also <= 50000, but I'm not sure why this matters). This combined with the fact that the cost function is the square of the number of distinct colors makes me think that there is some sort of sqrt-bash solution. Also, dp[n] is monotonically increasing, but I'm not sure how this helps...

Can anyone provide some hints/a solution to this problem?

problem, dp, sqrt

Anthony_Smith
2 years ago
9

Comments (9)

Write comment?

barun511

2 years ago, # |

I'm a little puzzled, do you also have to minimize steps? Otherwise it seems to me that just picking each subarray of size 1 gives n cost, and we can't seem to do better.

→ Reply

DeltaPavonis

2 years ago, # ^ |

Well, consider something like {1 3 1 3 1 3 1 3 1 3}. Clearly, using only one step to color all cells costs less than coloring each cell individually. (4 < 10)

→ Reply

barun511

2 years ago, # ^ |

That's true — my mistake. Thanks for spotting this :)

→ Reply

keyboard_virtuoso

2 years ago, # |

← Rev. 2 →

-8

Do you get TLE? I think that there is chance that this solution can get AC.

You said that dp[n] you get by consider dp[<n]... but if dp[i..n] have 7 colors and dp[i-1..n] have 7 colors... then there is no reason to consider dp[i] for the solution. Do you think you can exploit that?

→ Reply

nskybytskyi

2 years ago, # |

← Rev. 2 →

In your dp you are trying to answer the following question for each position:

should I append the current element to the last range/operation, or create a new one?

Appending is "profitable" only if the last range already contains the current color, or else $$$(c + 1)^2 > c^2 + 1^2$$$, where $$$c$$$ is the number of distinct colors in the last range. Hence you should append for free iff the last range starts at or before $$$\text{prev}(i)$$$, the previous occurrence of color $$$c_i$$$.

Let's say dp[i][j] is the best we can do on a prefix of length $$$i$$$ with the last segment starting at position $$$j$$$. This dp should be easy to update with a Fenwick tree, as the only operation appears to be adding $$$1$$$ on range $$$(\text{prev}(i), i]$$$.

You can probably even get rid of the induced extra log factor in the complexity, but I'm too lazy.

→ Reply

nskybytskyi

2 years ago, # ^ |

← Rev. 3 →

Uh oh, I was wrong, we should increment $$$(\text{prev}(i), i]$$$ by a linear function $$$2 (i - x) + 1 = -2x + (2i + 1)$$$ of position $$$x$$$.

This is harder but also doable. You will probably need a Fenwick tree that stores its values as a linear function $$$a x + b$$$ (i.e. two trees: one for $$$a$$$ and one for $$$b$$$).

→ Reply

nskybytskyi

2 years ago, # ^ |

Oh wait, it is not linear, as it depends on the number of different colors in that interval. Okay, I give up :(

→ Reply

Iftekhar_Hakim_K

2 years ago, # |

+31

Maximum answer can be $$$N$$$ (We can make $$$N$$$ sub-arrays with $$$1$$$ element in each). For the same reason, we will never make a sub-array with more than $$$\sqrt{N}$$$ distinct colors. That leaves us with $$$\sqrt{N}$$$ transitions from each state.

→ Reply

nskybytskyi

2 years ago, # |

← Rev. 2 →

+18

Alright, your sqrt ideas were correct from the beginning. Specifically, as there exists a trivial solution with cost $$$n$$$, an optimal solution cannot contain an interval with $$$c > \sqrt{n}$$$ different colors.

Therefore, each of $$$O(n)$$$ states only has $$$O(\sqrt{n})$$$ meaningful transitions. Namely, we should loop over $$$c$$$ from $$$1$$$ to $$$\sqrt{n}$$$ and consider the transition corresponding to the longest interval with $$$c$$$ colors.

Left endpoints of these $$$\sqrt{n}$$$ intervals should be easy to maintain efficiently with the use of $$$\text{prev}(c_i)$$$ introduced above. Looks like $$$i$$$ should be added for $$$c = 1$$$ and the one after $$$\text{prev}(c_i)$$$ should be deleted.

Upd: similar to above, didn't see it while writing.

→ Reply