felipeblassioli's blog

By felipeblassioli, 9 years ago, In English

DISCLAIMER: I really want to understand why this solution (which I think is quite reasonable) is wrong. If this is a nonsensical solution, please say so. I am not looking for a AC solution, I'd like to prove that this one is wrong, but i've been unable to do so..


Problem Statement

The statement is from timus 1588 — Jamaica:

Each two cities are to be connected by a road going along a straight line. One road may connect several cities if they lie on the same straight line. Jamaican economists are sure that the new network will minimize transportation expenses. In order to estimate the cost of the project, it is required to determine the total length of the roads to be constructed, and this is the job for Andrey.


Solution to be proven wrong

So, we have N points and we want the sum of a subset of the lines.

Consider all point combinations, this give all lines. A line is described by its slope and its point of intersection with the Y axis. The ideas is simply to find the biggest segment in each line.

Some code:

from fractions import Fraction # available since python 2.6 . Irreducible fractions, very slow
...
        for a,b in combinations(pts,2):
		dx,dy = a[0]-b[0], a[1]-b[1]
		dist = _dist(a,b)
                # Vertical, horizontal and default line
		if a[1] == b[1]: 
			line = (0, a[1])
		elif a[0] == b[0]: 
			line = (INF, a[0])
		else:
                        # irreducible fraction Fraction(8,2) = 4/1
                        # M = dy/dx and B = Y - MX
			M = Fraction(dy,dx)
                        # line is tuple of integers. No division is done to avoid precision errors.
			B = Fraction(dx*b[1] - (dy*b[0]), dx)
			line = ((M.numerator,M.denominator, (B.numerator,B.denominator))

                # Find biggest line
		if line not in lines:
		    lines[line] = dist
		elif lines[line] <= dist:
		    lines[line] = dist

Here's the example input: (correct answer is 412)

0 0
0 100
100 0
50 50

And a verbose output:

(A,B) DIST , where (A,B) describes segment AB

((0, 0), (100, 0)) 100.0
((0, 0), (50, 50)) 70.7106781187
((0, 0), (0, 100)) 100.0
((0, 100), (100, 0)) 141.421356237
412

What may be wrong

Why is it wrong? I couldn't find a test in which this algorithm fail.

  • Is the general idea Faulty? I'm quite 'preocupied' because the problem's discussion said that the "best algo" for this problem was O(N^2 log N) or O(N^2 log N^2) or something like that, but in my solution I don't see a O(log X) factor hah (i.e I could be overlooking some case). In fact, both log(N) solutions mentioned sort (sort by angle or sort by distance), but I didn`t find a reason to sort..

  • Is it a precision problem? Or a rounding problem? I've used python's Fraction to (try) avoid precision problems and math.round(X + 0.5) to print the answer. Is that not enough?

OBS: Also I'm aware of some O(N^3) AC solutions, such as the one I put at the bottom of this post. But I didn't find they interesting..


I think my solution is quite similar to one mentioned in the problem's discussion:

  1. Sort n^2 lines by distance.
  2. Iterate from largest to smallest.
  3. We have to add line to the answer, if we not have bigger line in set, covering this. We can easily check it with set.

Where did I go wrong?


EDIT: As the user @ffao said, my problem could be anywhere in my code, so here's the whole thing.

Maybe it's a precision problem while printing the answer (i.e floor( X + 0.5) not enought) I don't think so, I kind of tested with some 'round problematic answers'.

I really suspect that has got something todo with the general idea of the solution and that I'm overlooking some sort of case, unfortunately haha

Here`s the whole code:

def main():
	from sys import stdin
	from itertools import izip, combinations
	from math import sqrt, floor
	from fractions import Fraction

	tkns = iter(map(int,stdin.read().strip().split()))
	n = tkns.next()

	pts = [ (x,y) for x,y in izip(tkns, tkns) ]
	_dist = lambda A,B: sqrt((A[0]-B[0])*(A[0]-B[0]) + (A[1]-B[1])*(A[1]-B[1]))
	lines = {}
	INF = 10000005

        for a,b in combinations(pts,2):
		dx,dy = a[0]-b[0], a[1]-b[1]
		dist = _dist(a,b)
		if a[1] == b[1]: 
			line = (0, a[1])
		elif a[0] == b[0]: 
			line = (INF, a[0])
		else:
			M = Fraction(dy,dx)
			B = Fraction(dx*b[1] - (dy*b[0]), dx)
			line = ((M.numerator,M.denominator), (B.numerator,B.denominator))
			#M = float(dy)/dx
			#line = (M, b[1] - M*b[0])

		if line not in lines:
		    lines[line] = dist
		elif lines[line] <= dist:
		    lines[line] = dist

	print int(floor(sum(lines.values())+0.5))

Also, here's an AC solution which I think is O(n^3): (most timus1588 solutions I googled were analogous to this straightforward one)

#include <bits/stdc++.h>
using namespace std;
typedef long long ll;

int x[303], y[303];
int dist(int i, int j) {
	return (x[i] - x[j]) * (x[i] - x[j]) + (y[i] - y[j]) * (y[i] - y[j]);
}

int cross(int i, int j, int k) {
	return (x[j] - x[i]) * (y[k] - y[i]) - (y[j] - y[i]) * (x[k] - x[i]);
}

map<vector<int>, int> ss;
int main() {
	int i, j, k, n;
	scanf("%d", &n);
	for(i = 0; i < n; i++) {
		scanf("%d %d", &x[i], &y[i]);
	}
	for(i = 0; i < n; i++)
		for(j = i + 1; j < n; j++) {
			vector<int> v;
			for(k = 0; k < n; k++)
				if(k == i || k == j || cross(i, j, k) == 0)
					v.push_back(k);
			ss[v] = max(ss[v], dist(i, j));
		}
	double tot = 0;
	for(auto &e : ss)
		tot += sqrt(e.second);
	printf("%.0f\n", tot);
}
  • Vote: I like it
  • +1
  • Vote: I do not like it

| Write comment?
»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Since you are using python, a much better way to store a line would be by representing it as Ax + By + C = 0 and using a dictionary to store the points as values on the basis of keys tuple (A,B,C), where A,B,C should have no common factor apart from 1. This will avoid the floating point errors , that's where I think you are going wrong.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Thank you for the answer.

    How is that diferent from using a tuple (M,B), where M,B are also integers? I mean, after finding A,B,C I still have to guarantee there is no common factor, right?

    Isn't that the same I did with M,B by using Fraction? (there's never a division)

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Your explanation of your solution was pretty good (thanks for that), but why didn't you post your own working code? Maybe there is a mistake there you didn't copy here, or maybe there is a mistake in a different place of the program.