upobir's blog

By upobir, 3 years ago, In English

I learned some stuff about quadrangle inequality recently. I wanted to collect those in one place, so here goes.

Note: I use $$$A \prec B$$$ to denote that $$$A$$$ is strictly more optimal than $$$B$$$. $$$\preceq$$$ is self-explanatory. The actual order depends on the problem: in some cases it's $$$<$$$, in others it's $$$>$$$. All arrays are 1-indexed and have length $$$n$$$ (if unspecified). Also, for ease of following the proofs, I have tried to maintain $$$a \leq b \leq c \leq d$$$ wherever those letters are used. By the notation $$$[l,r]$$$ I mean the subarray consisting of the indices $$$i$$$ with $$$l \leq i \leq r$$$.

Quadrangle Inequality: Consider a real-valued function $$$C(l, r)$$$ which is defined for $$$1 \leq l \leq r \leq n$$$. Let us call this function the cost function. The cost function is said to satisfy the quadrangle inequality iff $$$C(a, c)+C(b, d) \preceq C(a, d) + C(b, c)$$$ for all $$$a \leq b \leq c \leq d$$$. Intuitively this implies that for $$$C()$$$, "wider is worse". Let us call the cost function Quadrangle Friendly (or QF in short) iff it satisfies the quadrangle inequality.
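
A standard concrete example, added here for illustration: take non-negative weights $$$w_1, \cdots, w_n$$$ with prefix sums $$$S_i = w_1 + \cdots + w_i$$$, let $$$C(l, r) = (S_r - S_{l-1})^2$$$, and let optimal mean minimal (so $$$\preceq$$$ is $$$\leq$$$). For $$$a \leq b \leq c \leq d$$$ write $$$x = S_{b-1} - S_{a-1}$$$, $$$y = S_c - S_{b-1}$$$, $$$z = S_d - S_c$$$ (all $$$\geq 0$$$). Then
$$$C(a, d) + C(b, c) - C(a, c) - C(b, d) = (x+y+z)^2 + y^2 - (x+y)^2 - (y+z)^2 = 2xz \geq 0,$$$
so this cost function is QF.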

Let us also define the function $$$opt(r)=$$$ rightmost $$$l$$$ such that $$$C(l, r)$$$ is optimum where $$$1 \leq l \leq r \leq n$$$.

Fact 1) If the cost function is QF, then $$$opt()$$$ is non-decreasing.
Proof: Suppose for some $$$1 \leq c < d \leq n$$$ we have $$$b = opt(c) > opt(d) = a$$$ (so $$$a < b \leq c < d$$$). Then by the definition of $$$opt()$$$ we have $$$C(b, c) \preceq C(a, c)$$$ and $$$C(a, d) \prec C(b, d)$$$. Adding these two we get $$$C(a, d) + C(b, c) \prec C(a, c) + C(b, d)$$$, a contradiction with the quadrangle inequality. □

Remark: It is not essential that $$$opt(r)$$$ be the rightmost optimal $$$l$$$; picking the rightmost is only used to make the function well defined. In fact, the proof can be imitated to show the same fact if $$$opt(r)$$$ is defined as the leftmost optimal $$$l$$$ instead.

Array Partition DP: Suppose we want to partition an array into $$$k$$$ non-empty subarrays, and if the subarrays are $$$[l_1, r_1], \cdots, [l_k, r_k]$$$ ($$$r_i + 1 = l_{i+1}$$$ and $$$l_1 = 1, r_k = n$$$) then the cost of the partition is $$$C(l_1, r_1) + \cdots + C(l_k, r_k)$$$. To find the optimal partition cost, we perform a dp where $$$dp[i][k] = $$$ optimal cost to partition the subarray $$$[1, i]$$$ into $$$k$$$ subarrays. Then, we have
$$$dp[0][0] = 0$$$
$$$dp[i][0] = dp[0][k] = \infty $$$ ($$$i, k \geq 1$$$)
$$$dp[i][k] = \min \limits_{1 \leq j \leq i} \{ dp[j-1][k-1] + C(j, i) \} $$$ ($$$i,k \geq 1$$$)
Note that the $$$\min$$$ and $$$\infty$$$ are all in terms of $$$\prec$$$.
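
For concreteness, here is a minimal sketch of this recurrence for the minimization case (so $$$\preceq$$$ is $$$\leq$$$), running in $$$O(n^2 k)$$$. The names (`partitionDP`, `cost`) are mine, and `cost(l, r)` is just the illustrative squared-sum cost from above; substitute whatever $$$C(l, r)$$$ your problem gives you, ideally computable in $$$O(1)$$$.

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = 1e18;

vector<long long> pre;                  // prefix sums of non-negative weights w[1..n]
long long cost(int l, int r) {          // example QF cost: (w_l + ... + w_r)^2
    long long s = pre[r] - pre[l - 1];
    return s * s;
}

// dp[i][k] = optimal cost to split [1, i] into k non-empty subarrays
long long partitionDP(int n, int K, const vector<long long>& w) {   // w is 1-indexed
    pre.assign(n + 1, 0);
    for (int i = 1; i <= n; i++) pre[i] = pre[i - 1] + w[i];
    vector<vector<long long>> dp(n + 1, vector<long long>(K + 1, INF));
    dp[0][0] = 0;
    for (int k = 1; k <= K; k++)
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= i; j++)                 // last subarray is [j, i]
                if (dp[j - 1][k - 1] < INF)
                    dp[i][k] = min(dp[i][k], dp[j - 1][k - 1] + cost(j, i));
    return dp[n][K];                    // INF if no valid partition (K > n)
}
```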

Fact 2) Let us define $$$f_k(l, r) = dp[l-1][k-1] + C(l, r)$$$ for $$$k \geq 1$$$ and $$$1 \leq l \leq r$$$. If $$$C()$$$ is QF, then $$$f_k$$$ is QF.
Proof: Let $$$a \leq b \leq c \leq d$$$. Then
$$$f_k(a, c) + f_k(b, d)$$$
$$$= dp[a-1][k-1] + dp[b-1][k-1] + C(a, c) + C(b, d)$$$
$$$\preceq dp[a-1][k-1] + dp[b-1][k-1] + C(a, d) + C(b, c) $$$
$$$= f_k(a, d) + f_k(b, c)$$$. □

Remark: So when we compute $$$dp[i][k]$$$, what we want is $$$opt(i)$$$ for the function $$$f_{k}$$$. And when $$$C()$$$ is QF, $$$f_{k}$$$ is also QF, which tells us that $$$opt(i)$$$ is non-decreasing, so we can apply the Divide and Conquer trick. Another interesting fact is that if we build the partition right to left instead of left to right, i.e. $$$dp'[i][k] = $$$ optimal cost to divide $$$[i, n]$$$ into $$$k$$$ subarrays, then the corresponding $$$f'_k(l, r) = dp'[r+1][k-1] + C(l, r)$$$ is also QF. The proof is almost the same as the one given above.
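
To make the Divide and Conquer trick concrete, here is a minimal sketch for one layer of the dp above (minimization, same illustrative `cost` and made-up names as before); `prev` holds $$$dp[\cdot][k-1]$$$ and `cur` holds $$$dp[\cdot][k]$$$. Since $$$opt(i)$$$ is non-decreasing, solving the middle position first lets us split the candidate range, giving $$$O(n \log n)$$$ per layer and $$$O(k n \log n)$$$ overall.

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = 1e18;

vector<long long> pre;                  // prefix sums of non-negative weights w[1..n]
long long cost(int l, int r) {          // example QF cost: (w_l + ... + w_r)^2
    long long s = pre[r] - pre[l - 1];
    return s * s;
}

// computes cur[i] = min over j of prev[j-1] + cost(j, i) for all i in [lo, hi],
// knowing that opt(i) lies in [optLo, optHi]
void solve(int lo, int hi, int optLo, int optHi,
           const vector<long long>& prev, vector<long long>& cur) {
    if (lo > hi) return;
    int mid = (lo + hi) / 2, best = optLo;
    long long bestVal = INF;
    for (int j = optLo; j <= min(mid, optHi); j++) {     // candidate: last subarray is [j, mid]
        if (prev[j - 1] >= INF) continue;                // [1, j-1] has no (k-1)-partition
        long long val = prev[j - 1] + cost(j, mid);
        if (val < bestVal) { bestVal = val; best = j; }  // '<' keeps the leftmost optimum
    }
    cur[mid] = bestVal;
    // opt is non-decreasing (facts 1 and 2), so the opt range splits around 'best'
    solve(lo, mid - 1, optLo, best, prev, cur);
    solve(mid + 1, hi, best, optHi, prev, cur);
}

long long partitionDP(int n, int K, const vector<long long>& w) {   // w is 1-indexed
    pre.assign(n + 1, 0);
    for (int i = 1; i <= n; i++) pre[i] = pre[i - 1] + w[i];
    vector<long long> prev(n + 1, INF), cur(n + 1, INF);
    prev[0] = 0;                                         // dp[0][0] = 0
    for (int k = 1; k <= K; k++) {
        cur.assign(n + 1, INF);
        solve(1, n, 1, n, prev, cur);
        swap(prev, cur);
    }
    return prev[n];                                      // dp[n][K]
}
```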

Fact 3) Let us define $$$opt_k(r)$$$ for $$$f_k$$$ i.e. $$$opt_k(r) = $$$ the rightmost $$$l$$$ such that $$$f_k(l, r)$$$ is optimum. If $$$C()$$$ is QF then $$$opt_k(r) \leq opt_{k+1}(r)$$$ ($$$k \geq 1$$$).
Proof: If $$$k = 1$$$, the fact is true by default (since $$$opt_1(r) = 1$$$: $$$dp[l-1][0]$$$ is finite only for $$$l = 1$$$). Suppose $$$k > 1$$$ and $$$opt_k(r) > opt_{k+1}(r)$$$. That means we can find an optimal partition for $$$dp[r][k]$$$: $$$[a_k, d_k], \cdots, [a_1, d_1]$$$ and an optimal partition for $$$dp[r][k+1]$$$: $$$[b_{k+1}, c_{k+1}], \cdots, [b_1, c_1]$$$ where $$$a_1 = opt_k(r) > opt_{k+1}(r) = b_1$$$. Note that the subarrays are numbered in reverse order (the first interval is the rightmost); this makes the proof a bit easier to write.

Let $$$i$$$ be the minimum index such that $$$a_i \leq b_i$$$. There is at least one such $$$i$$$ as $$$a_k < b_k$$$ (both partitions start at $$$1$$$, so $$$a_k = 1$$$ while $$$b_k > b_{k+1} = 1$$$), and $$$i > 1$$$ as $$$a_1 > b_1$$$. Then $$$a_{i-1} > b_{i-1} \implies d_{i} > c_{i}$$$ (since $$$d_i = a_{i-1} - 1$$$ and $$$c_i = b_{i-1} - 1$$$). In particular $$$a_i \leq b_i \leq c_i \leq d_i$$$. Intuitively, what we just did is choose an $$$i$$$ such that $$$[b_i, c_i]$$$ is inside $$$[a_i, d_i]$$$.

Now we'll create a contradiction with 3 inequalities. First, consider the partition $$$[b_{k+1}, c_{k+1}], \cdots, [b_{i+1}, c_{i+1}], [b_i, d_i], [a_{i-1}, d_{i-1}], \cdots, [a_1, d_1]$$$. This is a partition into $$$k+1$$$ subarrays whose rightmost interval starts at $$$a_1 > b_1 = opt_{k+1}(r)$$$, so its cost is strictly worse than that of the optimal partition $$$[b_{k+1}, c_{k+1}], \cdots, [b_1, c_1]$$$ (due to the definition of $$$b_1 = opt_{k+1}(r)$$$). Cancelling the common terms, we infer that $$$C(b_i, c_i) + \cdots + C(b_1, c_1) \prec C(b_i, d_i) + C(a_{i-1}, d_{i-1}) + \cdots + C(a_1, d_1)$$$.

Second, by the quadrangle inequality, $$$C(a_i, c_i) - C(b_i, c_i) \preceq C(a_i, d_i) - C(b_i, d_i)$$$. Adding this to the inequality of the last paragraph gives $$$C(a_i, c_i) + C(b_{i-1}, c_{i-1}) + \cdots + C(b_1, c_1) \prec C(a_i, d_i) + C(a_{i-1}, d_{i-1}) + \cdots + C(a_1, d_1)$$$.

Third and finally, the last inequality implies that the partition $$$[a_k, d_k], \cdots, [a_{i+1}, d_{i+1}], [a_i, c_i], [b_{i-1}, c_{i-1}], \cdots, [b_1, c_1]$$$ has a strictly better total cost than $$$[a_k, d_k], \cdots, [a_1, d_1]$$$, a contradiction (the latter was an optimal partition for $$$dp[r][k]$$$). So $$$opt_k(r) \leq opt_{k+1}(r)$$$. □

Remark: By the last two facts, $$$opt_{k-1}(r) \leq opt_k(r) \leq opt_k(r+1)$$$. This lets us apply Knuth's optimization to the partition dp too.
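
Here is a sketch of the resulting Knuth-style speedup of the same partition dp (minimization, same illustrative `cost` and made-up names). The candidate range for $$$j$$$ is squeezed into $$$[opt_{k-1}(i), opt_k(i+1)]$$$, and the total number of inner iterations telescopes to $$$O(n^2 + nk)$$$. Note the tie-breaking: we keep the rightmost optimum, matching the definition used in fact 3.

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = 1e18;

vector<long long> pre;                  // prefix sums of non-negative weights w[1..n]
long long cost(int l, int r) {          // example QF cost: (w_l + ... + w_r)^2
    long long s = pre[r] - pre[l - 1];
    return s * s;
}

// partition dp with opt_{k-1}(i) <= opt_k(i) <= opt_k(i+1)
long long partitionDP(int n, int K, const vector<long long>& w) {   // w is 1-indexed
    pre.assign(n + 1, 0);
    for (int i = 1; i <= n; i++) pre[i] = pre[i - 1] + w[i];
    vector<vector<long long>> dp(n + 2, vector<long long>(K + 1, INF));
    vector<vector<int>> opt(n + 2, vector<int>(K + 1, 1));
    dp[0][0] = 0;
    for (int k = 1; k <= K; k++) {
        opt[n + 1][k] = n;                               // sentinel: no upper bound for i = n
        for (int i = n; i >= 1; i--) {
            int lo = (k == 1 ? 1 : opt[i][k - 1]);
            int hi = min(i, opt[i + 1][k]);
            long long best = INF; int bestj = lo;
            for (int j = lo; j <= hi; j++) {             // last subarray is [j, i]
                if (dp[j - 1][k - 1] >= INF) continue;
                long long val = dp[j - 1][k - 1] + cost(j, i);
                if (val <= best) { best = val; bestj = j; }  // '<=' keeps the rightmost optimum
            }
            dp[i][k] = best;
            opt[i][k] = bestj;
        }
    }
    return dp[n][K];
}
```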

Monotonicity: The cost function $$$C()$$$ is said to be monotonous iff for $$$a \leq b \leq c \leq d$$$, we have $$$C(b, c) \preceq C(a, d).$$$

Note that monotonicity is not implied by the quadrangle inequality; we can come up with cost functions that are QF but not monotonous. For example, with minimization (so $$$\preceq$$$ is $$$\leq$$$), $$$C(l, r) = -(r - l)$$$ satisfies the quadrangle inequality with equality, yet wider intervals get smaller cost, so it is not monotonous.

Array Merging DP: Suppose we have to build an array whose elements initially sit in length-1 subarrays $$$[i, i]$$$ (i.e. the $$$i$$$-th entry alone). At any point, we can merge two adjacent subarrays $$$[p, q]$$$ and $$$[q+1, r]$$$ to form the subarray $$$[p, r]$$$, incurring a cost $$$C(p, r)$$$. We want the optimal total cost to merge the single-element subarrays into the full array. So we define $$$dp[l][r] = $$$ the optimal cost to merge single-element subarrays to form the subarray $$$[l, r]$$$. Then
$$$dp[l][l] = C(l, l)$$$
$$$dp[l][r] = C(l, r) + \min \limits_{l \leq m < r} \{ dp[l][m] + dp[m+1][r]\} $$$ ($$$l < r$$$)

Note that even though we define $$$dp[l][l] = C(l, l)$$$, this has no overall effect, since in the end we can subtract all $$$C(i, i)$$$ from $$$dp[1][n]$$$ to negate the contributions. Also, we can redefine the problem to incur cost $$$C(p, q) + C(q+1, r)$$$ when merging $$$[p, q]$$$ and $$$[q+1, r]$$$. The problem is still equivalent, but our DP would have an extra $$$C(1, n)$$$. And, sometimes problems are formulated in terms of splitting the array instead of merging the arrays.
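
As a baseline, here is a minimal sketch of this recurrence for minimization, in $$$O(n^3)$$$. The names are mine, and the `cost` used here, $$$C(l, r) = w_l + \cdots + w_r$$$ for non-negative $$$w_i$$$ (the classic stone-merging cost), is just one example of a cost that is both QF and monotonous.

```cpp
#include <bits/stdc++.h>
using namespace std;

vector<long long> pre;                  // prefix sums of non-negative weights w[1..n]
long long cost(int l, int r) {          // example QF + monotonous cost: w_l + ... + w_r
    return pre[r] - pre[l - 1];
}

// dp[l][r] = optimal cost to merge the single elements of [l, r] into one subarray
long long mergeDP(int n, const vector<long long>& w) {   // w is 1-indexed
    pre.assign(n + 1, 0);
    for (int i = 1; i <= n; i++) pre[i] = pre[i - 1] + w[i];
    vector<vector<long long>> dp(n + 1, vector<long long>(n + 1, 0));
    for (int l = 1; l <= n; l++) dp[l][l] = cost(l, l);
    for (int len = 2; len <= n; len++)
        for (int l = 1, r = len; r <= n; l++, r++) {
            long long best = LLONG_MAX;
            for (int m = l; m < r; m++)                  // last merge joins [l, m] and [m+1, r]
                best = min(best, dp[l][m] + dp[m + 1][r]);
            dp[l][r] = cost(l, r) + best;
        }
    return dp[1][n];
}
```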

Fact 4) If $$$C()$$$ is QF and monotonous, then $$$dp[][]$$$ is also QF.
Proof: We'll use induction on $$$d-a$$$, where $$$a \leq b \leq c \leq d$$$. If $$$a = b$$$ or $$$c = d$$$ then the inequality $$$dp[a][c] + dp[b][d] \preceq dp[a][d] + dp[b][c]$$$ becomes an equality and is true; this handles the base cases. Now suppose we have fixed $$$a, b, c, d$$$ and have proved the fact for smaller $$$d-a$$$. Let $$$a \leq d' < d$$$ be the splitting point of $$$dp[a][d]$$$, i.e. $$$dp[a][d] = C(a, d) + dp[a][d'] + dp[d'+1][d]$$$. There are two cases.

First case, $$$d' < b$$$ or $$$c \leq d'$$$, i.e. the subarray $$$[b, c]$$$ is contained in one of the two parts. WLOG $$$c \leq d'$$$ (the other case is similar). Then $$$dp[a][d] + dp[b][c] = C(a, d) + dp[a][d'] + dp[d'+1][d] + dp[b][c]$$$
$$$\succeq C(a, d) + dp[a][c] + dp[b][d'] + dp[d'+1][d]$$$     (Since $$$a \leq b \leq c \leq d'$$$ and inductive hypothesis)
$$$\succeq C(b, d) + dp[a][c] + dp[b][d'] + dp[d'+1][d]$$$     (monotonicity)
$$$\succeq dp[a][c] + dp[b][d]$$$     (definition of $$$dp[b][d]$$$)

Second case, $$$b \leq d' < c$$$, i.e. the splitting point is inside $$$[b, c]$$$. Note that in this case $$$b \neq c$$$. Let us define $$$b \leq c' < c$$$ to be the splitting point of $$$dp[b][c]$$$, i.e. $$$dp[b][c] = C(b, c) + dp[b][c'] + dp[c'+1][c]$$$. WLOG $$$c' \leq d'$$$ (the other case is similar). Then $$$dp[a][d] + dp[b][c] = C(a, d) + C(b, c) + dp[a][d'] + dp[d'+1][d] + dp[b][c'] + dp[c'+1][c]$$$
$$$\succeq C(a, c) + C(b, d) + dp[a][d'] + dp[d'+1][d] + dp[b][c'] + dp[c'+1][c]$$$    ($$$C()$$$ is QF)
$$$\succeq C(a, c) + C(b, d) + dp[a][c'] + dp[b][d'] + dp[d'+1][d] + dp[c'+1][c]$$$    ($$$ a\leq b \leq c' \leq d'$$$ and inductive hypothesis)
$$$\succeq C(a, c) + dp[a][c'] + dp[c'+1][c] + C(b, d) + dp[b][d'] + dp[d'+1][d]$$$     (rearranging)
$$$\succeq dp[a][c] + dp[b][d]$$$     (definition of $$$dp[a][c]$$$ and $$$dp[b][d]$$$) □

Remark: Monotonicity is actually required for the last fact: it is possible to define $$$C()$$$ that is QF but not monotonous, for which the corresponding $$$dp[][]$$$ is not QF. Also note that the function $$$f(l, r) = dp[l][r] - C(l, r)$$$ is also QF if $$$C()$$$ is QF and monotonous. This can't be inferred directly from the last fact, but we can mimic its proof.

Fact 5) Define $$$opt(l, r) = $$$ the rightmost $$$m$$$ such that $$$dp[l][r] = C(l, r) + dp[l][m] + dp[m+1][r]$$$. If $$$C()$$$ is QF and monotonous, then $$$opt(l-1, r) \leq opt(l, r) \leq opt(l, r+1)$$$.
Proof: If we fix $$$l$$$, then $$$f_l(r) = dp[l][r] - C(l, r) = \min \limits_{l \leq m < r} \{ dp[l][m] + dp[m+1][r] \}$$$. This has the same shape as the dp of fact 2, with $$$dp[l][m]$$$ in the role of $$$dp[j-1][k-1]$$$ and $$$dp[m+1][r]$$$ (QF due to fact 4) in the role of $$$C(j, i)$$$, so the corresponding $$$opt$$$ is non-decreasing in $$$r$$$, i.e. $$$opt(l, r) \leq opt(l, r+1)$$$ (the subtracted $$$C(l, r)$$$ does not depend on $$$m$$$, so it has no effect on $$$opt(l, r)$$$). And if we fix $$$r$$$, then $$$g_r(l) = dp[l][r] - C(l, r)$$$ can be handled the same way using the remark of fact 2, giving $$$opt(l-1, r) \leq opt(l, r)$$$. □

Remark: Due to the last fact, we can apply Knuth's optimization to the array merging DP as well.
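
A sketch of Knuth's optimization applied to the merging dp above (same illustrative setup as the $$$O(n^3)$$$ baseline). Restricting the split point to $$$[opt(l, r-1), opt(l+1, r)]$$$ and keeping the rightmost optimum brings the total work down to $$$O(n^2)$$$ by the usual telescoping argument.

```cpp
#include <bits/stdc++.h>
using namespace std;

vector<long long> pre;                  // prefix sums of non-negative weights w[1..n]
long long cost(int l, int r) {          // example QF + monotonous cost: w_l + ... + w_r
    return pre[r] - pre[l - 1];
}

// merging dp with the split point restricted to [opt(l, r-1), opt(l+1, r)]
long long mergeDP(int n, const vector<long long>& w) {   // w is 1-indexed
    pre.assign(n + 1, 0);
    for (int i = 1; i <= n; i++) pre[i] = pre[i - 1] + w[i];
    vector<vector<long long>> dp(n + 1, vector<long long>(n + 1, 0));
    vector<vector<int>> opt(n + 1, vector<int>(n + 1, 0));
    for (int l = 1; l <= n; l++) {
        dp[l][l] = cost(l, l);
        opt[l][l] = l;                                   // convenient base value for the bounds
    }
    for (int len = 2; len <= n; len++)
        for (int l = 1, r = len; r <= n; l++, r++) {
            long long best = LLONG_MAX; int bestm = l;
            for (int m = opt[l][r - 1]; m <= min(r - 1, opt[l + 1][r]); m++) {
                long long val = dp[l][m] + dp[m + 1][r];
                if (val <= best) { best = val; bestm = m; }  // '<=' keeps the rightmost optimum
            }
            dp[l][r] = cost(l, r) + best;
            opt[l][r] = bestm;
        }
    return dp[1][n];
}
```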

A final thing to remember is that these facts are not "if and only if" statements, i.e. they can hold for non-QF cost functions too.

Let me know if there are any mistakes.

»
3 years ago, # |

Thanks! I learnt a lot from your blog.

»
3 years ago, # |

In the proof of fact 5, what does it mean for $$$f_l$$$, a one-argument function, to satisfy the hypothesis of fact 2 and be QF?

  • »
    »
    3 years ago, # ^ |

    Sorry, the last proof is a little thin on details. What was meant was that $$$f_l$$$ represents $$$dp[l][m] + dp[m+1][r]$$$; here $$$l$$$ is fixed, and $$$m+1$$$ and $$$r$$$ correspond to the $$$l$$$, $$$r$$$ of fact (2).

    • »
      »
      »
      3 years ago, # ^ |

      Thanks for the clarification. Just to fill in some details:

      We fix $$$l$$$ and define $$$h_l(m, r) = C(l,r)+dp[l][m-1]+dp[m][r]$$$ $$$(l + 1 \leq m \leq r \leq n)$$$ and we can show $$$h_l$$$ is QF in a way similar to fact 2:

      Suppose $$$l+1 \leq a \leq b \leq c \leq d \leq n$$$, we have

      $$$h_l(a,c) + h_l(b,d)$$$
      $$$= C(l,c) + dp[l][a-1] + dp[a][c] + C(l,d) + dp[l][b-1] + dp[b][d]$$$
      $$$\preceq dp[a][d] + dp[b][c] + C(l,c) + dp[l][a-1] + C(l,d) + dp[l][b-1]$$$    (dp is QF)
      $$$= h_l(a,d) + h_l(b,c)$$$

      Hence $$$opt_l(r) := opt(l,r)$$$ is non-decreasing by fact 1, with lower bound $$$l+1$$$ instead of $$$1$$$. The case for fixing $$$r$$$ is similar.

      It seems we don't even need $$$dp[l][r]−C(l,r)$$$?

»
16 months ago, # |

What do you mean by "more optimal"?