ikk's blog

By ikk, 12 years ago, In English

The closest pair of points problem is a well-known problem in computational geometry (it is also on SPOJ), and several efficient algorithms to solve it exist (divide and conquer, line sweep, randomization, etc.). However, I thought it would be a bit tedious to implement them, and decided to solve it naïvely.

The naïve algorithm is as follows: first, sort the points along the x axis; then, for each pair of points whose difference in x coordinates is less than the smallest distance found so far, compute the distance between them. This somewhat heuristic algorithm fails when all of the x coordinates of the points are the same, in which case the running time is O(n²).
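
A minimal sketch of this naïve scan, with function and variable names of my own choosing (this is an illustration, not my actual submission):

    #include <algorithm>
    #include <cmath>
    #include <vector>
    using namespace std;

    // Sort by x, then only compare pairs whose x-difference is still smaller
    // than the best (squared) distance found so far.  Assumes n >= 2.
    double closest_pair(vector<pair<double, double>> pts) {
        sort(pts.begin(), pts.end());              // sort by x (then y)
        double best = 1e18;                        // squared distance of best pair
        for (size_t i = 0; i < pts.size(); ++i) {
            for (size_t j = i + 1; j < pts.size(); ++j) {
                double dx = pts[j].first - pts[i].first;
                if (dx * dx >= best) break;        // later points are even farther in x
                double dy = pts[j].second - pts[i].second;
                best = min(best, dx * dx + dy * dy);
            }
        }
        return sqrt(best);
    }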

What I did to get an AC at SPOJ was to pick a random angle θ and rotate the points around the origin by θ at the beginning of the algorithm. This (I hope) makes the heuristic work, and indeed I got an AC.
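
The rotation itself is only a few lines; a sketch under the same assumptions:

    #include <cmath>
    #include <random>
    #include <utility>
    #include <vector>
    using namespace std;

    // Rotate every point around the origin by a random angle θ before the scan.
    // Note: M_PI is POSIX rather than strict ISO C++; acos(-1.0) also works.
    void random_rotate(vector<pair<double, double>>& pts, mt19937& rng) {
        uniform_real_distribution<double> dist(0.0, 2 * M_PI);
        double theta = dist(rng);
        double c = cos(theta), s = sin(theta);
        for (auto& p : pts) {
            double x = p.first, y = p.second;
            p = {x * c - y * s, x * s + y * c};
        }
    }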

The problem is the complexity of this randomized algorithm. Some say its average running time is O(n log n). How do you prove this?


By ikk, 12 years ago, In English

How would you solve #101 Problem C if n ≤ 50000?

A typical solution for this involves inserting an element at the aᵢ-th position of a sequence, n times in total.  This is the bottleneck of the overall solution, because it takes O(n²) time if the sequence is implemented as an array or a linked list.  This obviously doesn't work if n ≤ 50000.
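
Concretely, the bottleneck looks like this with an array-backed sequence (assuming 0-based positions and, hypothetically, that the i-th inserted value is i + 1; the exact values depend on the problem):

    #include <vector>
    using namespace std;

    // Each insert shifts O(n) elements, so building the sequence is O(n²).
    vector<int> build_sequence(const vector<int>& a) {
        vector<int> seq;
        for (int i = 0; i < (int)a.size(); ++i)
            seq.insert(seq.begin() + a[i], i + 1);  // value i+1 at position a[i]
        return seq;
    }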

In the solution submitted by hex539, the sequence is implemented as a custom balanced binary search tree (BST).  The running time of the bottleneck is reduced to O(n log n), and thus this works even if n ≤ 50000.
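
This is not hex539's code, but for illustration, GCC's non-standard __gnu_cxx::rope gives roughly O(log n) positional insertion without hand-writing a tree:

    #include <ext/rope>
    #include <vector>
    using namespace std;
    using namespace __gnu_cxx;

    // Same loop as the naive version, but each insertion is roughly O(log n).
    vector<int> build_sequence_fast(const vector<int>& a) {
        rope<int> seq;
        for (int i = 0; i < (int)a.size(); ++i)
            seq.insert(a[i], i + 1);                // value i+1 at position a[i]
        return vector<int>(seq.begin(), seq.end());
    }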

The problem with a balanced BST (if any) is that implementing one yourself is tedious and error-prone.  So the question is how you would solve the problem if n ≤ 50000 without using a balanced BST, preferably still reducing the bottleneck to O(n log n) time.


By ikk, 12 years ago, In English

The following fact may be elementary and obvious to some people, but I had been confused about this topic for some time, so I wrote this blog post.

The definition of the big O notation is usually given for one-variable functions in introductory textbooks, so when I talked or read about graph algorithms, I used notations like O(|V| + |E|) without really understanding their meaning.

Adapting the definition for one-variable functions, I get this definition for multi-variable functions, say f, g: ℕ² → ℝ:

f(x, y) = O(g(x, y)) iff ∃C, D > 0 s.t. ||(x, y)|| ≥ D ⇒ |f(x, y)| ≤ C|g(x, y)|.
The question is what norm we are talking about.  Here are some norms we're used to:
  • ||(x, y)||₁ = |x| + |y| (Manhattan)
  • ||(x, y)||₂ = √(x² + y²) (Euclidean)
  • ||(x, y)||∞ = max{|x|, |y|} (infinity)

Of course, unless otherwise stated, "norm" means the Euclidean norm. But the other two norms seem simpler: in computer science x and y are in most cases integers, so why do you have to bother with irrational numbers?

As it turns out, the three norms above are equivalent in terms of defining the meaning of O(g). Suppose f = O(g) in terms of the Manhattan norm, i.e., ∃C, D > 0 s.t. ||(x, y)||₁ ≥ D ⇒ |f(x, y)| ≤ C|g(x, y)|.  The set of points {(x, y): ||(x, y)||₁ ≥ D} is the region outside a square enclosing the origin, and if we replace the norm with the Euclidean norm, it becomes the region outside a circle.  So if we take D′ so that the circle {(x, y): ||(x, y)||₂ = D′} encloses the square {(x, y): ||(x, y)||₁ = D}, we have ||(x, y)||₂ ≥ D′ ⇒ |f(x, y)| ≤ C|g(x, y)|, and this means f = O(g) in terms of the Euclidean norm, too.  The rest of the proof goes similarly.
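
One concrete way to see this equivalence (a standard fact, stated here for completeness) is the chain of inequalities

||(x, y)||∞ ≤ ||(x, y)||₂ ≤ ||(x, y)||₁ ≤ 2||(x, y)||∞,

which converts a threshold D for one norm into a threshold for any other at the cost of a factor of at most 2, leaving the constant C, and hence the class O(g), unchanged.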

EDIT: I've just read in a textbook that all norms on a finite-dimensional vector space induce the same topology.  I'd like to know if this is relevant to what I've written.


By ikk, 12 years ago, In English

Some resources found on the Web state that the running time of the bipartite matching algorithm (as found in UL: Undefined's solution) is O(|V|(|V| + |E|)).  I suspect, however, that there exists a better upper bound.  In Problem H of the Saratov Regional, the number of vertices can be as large as .  A Θ(|V|(|V| + |E|)) algorithm usually gives TLE for graphs of this size, yet many participants got ACs by submitting exactly that algorithm.

I suspect the running time is actually determined by the smaller of the number of vertices on the left and the number of vertices on the right.  Am I wrong?  I couldn't prove or disprove it myself.
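
For reference, here is a self-contained sketch of the augmenting-path (Kuhn's) algorithm I mean (not UL: Undefined's actual code, and all names are mine). One DFS is run per vertex of the iterated side, which is why iterating over the smaller side seems relevant:

    #include <vector>
    using namespace std;

    // Augmenting-path (Kuhn's) algorithm.  One DFS per left vertex gives
    // O(|L|·(|V| + |E|)); swapping the sides iterates over the smaller one.
    struct BipartiteMatcher {
        vector<vector<int>> adj;   // adj[u]: right neighbours of left vertex u
        vector<int> match_r;       // match_r[v]: left match of right v, or -1
        vector<bool> used;

        bool try_kuhn(int u) {
            for (int v : adj[u]) {
                if (used[v]) continue;
                used[v] = true;
                if (match_r[v] == -1 || try_kuhn(match_r[v])) {
                    match_r[v] = u;
                    return true;
                }
            }
            return false;
        }

        int max_matching(int n_left, int n_right) {  // adj must be filled first
            match_r.assign(n_right, -1);
            int res = 0;
            for (int u = 0; u < n_left; ++u) {
                used.assign(n_right, false);
                if (try_kuhn(u)) ++res;
            }
            return res;
        }
    };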


By ikk, 13 years ago, In English

The rank of the Tutte matrix T of a graph G is equal to twice the size of a maximum matching of G.  There exists a randomized algorithm based on this fact for the maximum matching problem on a general graph G.  That is, repeatedly replace T's indeterminates with random real numbers and compute the rank of T, and take the size of the maximum matching to be the maximum observed rank / 2.  This works because the rank of T cannot increase under such substitutions.
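
For illustration, here is a minimal sketch of that algorithm. One caveat: instead of random reals it substitutes random values modulo a prime, a common variant that sidesteps floating-point rank computation; the modulus and all names are my choices, and this is not the code linked below:

    #include <random>
    #include <utility>
    #include <vector>
    using namespace std;

    const long long P = 1000000007LL;  // prime modulus (my choice)

    long long pow_mod(long long b, long long e) {
        long long r = 1;
        for (b %= P; e > 0; e >>= 1, b = b * b % P)
            if (e & 1) r = r * b % P;
        return r;
    }

    // Rank of an n x n matrix over F_P by Gaussian elimination.
    int rank_mod_p(vector<vector<long long>> m) {
        int n = m.size(), rank = 0;
        for (int col = 0; col < n && rank < n; ++col) {
            int pivot = -1;
            for (int row = rank; row < n; ++row)
                if (m[row][col] != 0) { pivot = row; break; }
            if (pivot == -1) continue;
            swap(m[rank], m[pivot]);
            long long inv = pow_mod(m[rank][col], P - 2);  // modular inverse
            for (int row = rank + 1; row < n; ++row) {
                long long f = m[row][col] * inv % P;
                for (int c = col; c < n; ++c)
                    m[row][c] = ((m[row][c] - f * m[rank][c]) % P + P) % P;
            }
            ++rank;
        }
        return rank;
    }

    // Tutte matrix with one random value per edge (simple graph assumed):
    // T[i][j] = x, T[j][i] = -x for each edge (i, j).
    int matching_size(int n, const vector<pair<int, int>>& edges, mt19937_64& rng) {
        vector<vector<long long>> t(n, vector<long long>(n, 0));
        for (const auto& e : edges) {
            long long x = rng() % (P - 1) + 1;   // nonzero random value
            t[e.first][e.second] = x;
            t[e.second][e.first] = P - x;        // -x mod P
        }
        return rank_mod_p(t) / 2;                // matching size, w.h.p.
    }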

But the algorithm didn't work for me.  In particular, the graph with the adjacency matrix at the bottom of this code has a maximum matching of size 7, but the code reports a matching of size 8.

What mistake did I make?
