### Anai's blog

By Anai, history, 10 months ago, ,  #dp, Comments (8)
•  » » This trick relies on a simple fact sometimes known as the "Lagrange sufficiency theorem":Let $X$ be a set, $f : X \to \mathbb R$ a 'value function', $h : X \to \mathbb R$ a 'constraint function', and $b \in \mathbb R$ our specific constraint. Suppose $\lambda \in \mathbb R$ and $x \in X$ are such that $x$ maximises $L(x,\lambda) = f(x) - \lambda(h(x)-b)$. Moreover, suppose $h(x)=b$. Then $x$ maximises $f(x)$ subject to the constraint $h(x)=b$.(The proof is trivial: just observe that $L = f$ when $h(x)=b$.)In the dp problems, $X$ is something like "all partitions of the sequence 1..n into contiguous subsequences", $f$ is what it is, $h$ is "number of parts of the partition", and $b$ is k.The general situation where the Lagrange sufficiency theorem is useful is where $X$ is 'nice', so that we can optimise $L(-, \lambda)$ on $X$ (using a one-dimensional dp maybe), but the set $\langle x \in X \mid h(x)=b \rangle$ is less nice, so that we cannot optimise on it directly (or maybe only in quadratic time). In general you will find that, letting $x(\lambda)$ maximise $L(-,\lambda)$, $h(x(\lambda))$ is monotonically decreasing in $\lambda$ (let's not worry about ambiguity in choice of $x(\lambda)$). So you should be able to find a $\lambda$ with $h(x(\lambda)) = b$ using binary search; binary search is fast so this method can be good. It might be that no $\lambda$ with $h(x(\lambda))=b$ exists, in which case this method isn't helpful. Convexity comes into play to ensure that $\lambda$ exists, and this is where things become a bit technical.The Wikipedia article only seems to deal with the case where $X$ is a manifold (i.e. a 'continuous' space), and $f, h$ are differentiable. I think the reason for this is that instead of having to worry about global extremisation and existence of $\lambda$, you just have to look for (constrained) stationary points, which is simpler (it's local). But if you want a sufficient condition for when your solution is the constrained maximum, then you need the "sufficiency theorem". (Of course, derivatives and stationarity aren't going to help you in discrete dp problems...)Not sure if this is what you were asking for, but I hope it helps.
 » I think there are two typos. The summation in the definition of the second dp should be over $v_k$ and it should probably be $max(x_t - y_{t + 1} + 1, 0)^2$ in the last dp.