2018 ICPC Mid-Central Random Manhattan Distance a.k.a how do you do this geometry problem involving probabilities and trapezoids?

#	User	Rating
1	ecnerwala	3648
2	Benq	3580
3	orzdevinwang	3570
4	cnnfls_csy	3569
5	Geothermal	3568
6	tourist	3565
7	maroonrk	3530
8	Radewoosh	3520
9	Um_nik	3481
10	jiangly	3467

#	User	Contrib.
1	maomao90	174
2	awoo	164
2	adamant	164
4	TheScrasse	159
4	nor	159
6	maroonrk	156
7	-is-this-fft-	150
8	SecondThread	147
9	orz	146
10	pajenegod	145

Problem: https://open.kattis.com/problems/randommanhattan

Official Solution:

/*

Within a zone:  Z = (l^2+3lr+r^2)w^3/15

Cx(i) = x0 + (l+2r)/(3(l+r))w

*/
#include <iostream>
#include <algorithm>
#include <iomanip>
#include <cstdlib>
#include <vector>
#include <math.h>
#include <tuple>
using namespace std ;
using ll = long long ;
using t3 = tuple<double, double, double> ;
int main() {
   cout << setprecision(15) ;
   int N{0} ;
   cin >> N ;
   vector<ll> xs(N), ys(N) ;
   for (int i=0; i<N; i++)
      cin >> xs[i] >> ys[i] ;
   double r {} ;
   for (int outer=0; outer<2; outer++) {
      int hix0 = min_element(xs.begin(), xs.end()) - xs.begin() ;
      int lox0 { hix0 } ;
      auto ht=[&](ll x, int p0, int p1) -> double {
         if (x == xs[p0])
            return ys[p0] ;
         if (x == xs[p1])
            return ys[p1] ;
         return ys[p0]+(x-xs[p0])*(ys[p1]-ys[p0])/(double)(xs[p1]-xs[p0]) ;
      } ;
      vector<t3> zones ;
      while (1) {
         int hix1 { (hix0 + 1) % N } ;
         int lox1 { (lox0 + N - 1) % N } ;
         ll x0 = max(xs[hix0], xs[lox0]) ;
         if (x0 == xs[hix1]) {
            hix0 = hix1 ;
            continue ;
         }
         if (x0 == xs[lox1]) {
            lox0 = lox1 ;
            continue ;
         }
         ll x1 { min(xs[hix1], xs[lox1]) } ;
         if (x1 < x0)
            break ;
         if (x1 == x0)
            throw "Failed while building zones" ;
         double lft { ht(x0, hix0, hix1) - ht(x0, lox0, lox1) } ;
         double rgt { ht(x1, hix0, hix1) - ht(x1, lox0, lox1) } ;
         zones.push_back(make_tuple(lft, rgt, (double)(x1-x0))) ;
//  cout << "Adding tuple xs " << x0 << " " << x1 << " heights " << lft << " " << rgt << endl ;
         if (x1 == xs[hix1])
            hix0 = hix1 ;
         if (x1 == xs[lox1])
            lox0 = lox1 ;
      }
      double s {} ;
      double cxa {} ;
      double sa {} ;
      double a2 {} ;
      double x0 {} ;
      for (int i=0; i<zones.size(); i++) {
         t3 &z = zones[i] ;
         double lft { get<0>(z) }, rgt { get<1>(z) }, w { get<2>(z) } ;
         s += (lft*lft+3*lft*rgt+rgt*rgt)*w*w*w/15 ;
         double ta { (lft+rgt)*w/2 } ;
         double cx { x0 + (lft+2*rgt)/(3*(lft+rgt))*w } ;
         if (i)
            s += 2 * (cx - cxa / sa) * sa * ta ;
         a2 += ta*(2*sa+ta) ;
         sa += ta ;
         cxa += cx * ta ;
         x0 += w ;
      }
      r += s / a2 ;
      swap(xs, ys) ;
      reverse(xs.begin(), xs.end()) ;
      reverse(ys.begin(), ys.end()) ;
   }
   cout << r << endl ;
}

I understand that Cx is the center of a trapezoid but I don't know how the rest of the code works. I know they do 2 passes and swap x and y coordinates, but I don't know how they got the equation for Z (why the 15? does it involve integration?), why they're using the (current area) * (2 * previously accumulated area + current area), why they multiple the trapezoid center by the trapezoid area to get cxa, etc..

Comments (1)

Show archived | Write comment?

meme

4 years ago, # |

This solution needs some amount of understanding of expectation (especially the contribution technique) and center of mass or centroid. First, $$$E(|x_1 - x_2| + |y_1 - y_2|) = E(|x_1 - x_2|) + E(|y_1 - y_2|)$$$ (since expectation is linear), so we can solve for the two individually separately and add them up. The formulas you noticed in the solution are obtained from the formula for the centroids of geometrical shapes, and yes, they are obtained by integration (and some more formulas, such as using weighted mean for finding the centroid of composite shapes).

We are now interested in finding $$$E(|x_1 - x_2|)$$$. If we add the restriction that $$$x_1 \ge x_2$$$, we then need to find $$$E(x_1 - x_2) = E(x_1) - E(x_2)$$$. Awesome. Imagine having a similar problem, but given only a finite number of points $$$p_1, p_2, \cdots, p_n$$$ where the x-coordinates of these points are in non-decreasing order. To solve for $$$E(|x_1 - x_2|)$$$, we note that it is equal to $$$\frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n |x_i-x_j|$$$. Why the $$$n^2$$$? It is the total size, or the size of the sample space — the number of ways of choosing a pair of points.To calculate this expression, we process points in increasing order of x-coordinates, and add individual contributions to the above expression. The contribution of the $$$i$$$-th point is $$$\frac{2}{n^2}\left((i-1)x_i - \sum_{j=1}^{i-1}x_j\right)$$$. There is a $$$2$$$ because we want to count both possibilities (first bigger than second and second bigger than first). So, at each point, we can maintain the sum of x-coordinates and add this contribution term to get the expected value of difference of x-coordinates.

In your problem, we apply the same idea — we take the original polygon, divide it into a finite number of sections, use the idea of centroids, and reduce the given problem to that involving only a finite number of points. Instead of $$$n^2$$$ above, we have the size of the sample as $$$area^2$$$. Now, here is how the sections are created (the red lines are supposed to be vertical):

You might begin to notice that each section lies between two red lines and is a trapezoid. Let there be $$$k$$$ such sections. Now, we process all sections in increasing order of x-coordinate. If we're at the $$$i$$$-th section, we need to add the contribution of terms where

a point belonging to this section is paired with a point belonging to preceding sections, and the contribution of terms, and
where a point in this section is paired with a point in the same section.

It is easy to see why counting this way would take into account all pairs of points in the polygon. To calculate the first one, here is the exciting part: recall that the centroid gives you $$$E(x)$$$ in a shape. So, to calculate $$$E(x_1 - x_2)$$$ in a union of two shapes, where $$$x_1$$$ is in the first shape and $$$x_2$$$ is in the other shape, we can subtract the x-coordinates of their centroids. That brings us to the final solution. When we are looping over the sections, instead of sum of $$$x$$$ in the finite case, we maintain the x-coordinate of the centroid of the previous sections (their combined shape), we subtract it from the x-coordinate of the centroid of the current section, and use that to get the contribution. Here, the contribution is actually $$$\frac{\rm area_{current} area_{old}}{\rm area^2}(x_{\rm centroid, current} - x_{\rm centroid, old})$$$. Now, once we are done with processing for this section, we would also like to update the centroid of the current shape. This would become $$$x_{\rm centroid, new} = \frac{x_{\rm centroid, old} {\rm area_{old}} + x_{\rm centroid, current} {\rm area_{current}}}{\rm area_{old}+\rm area_{current}}$$$. And also remember to add the current area to the old area to get the new area.

The second one is obtained using integration... please try to find $$$E(|x_1 - x_2|)$$$ for a trapezoid given its parameters (length of two non-parallel sides and height) on pen and paper using the contribution technique, but on differentials instead.

→ Reply

anduturacila6's blog