buGMaster's blog

By buGMaster, 9 years ago, In English

Hi again...
I studied the tutorial about the Data Structure by PrinceOfPersia. I had a problem with Suffix Array (the deterministic version), and I couldn't get it properly (or maybe it has bugs in it!). So I asked for help via comment, but no one answered (even the author). Then I asked the PrinceOfPersia (the author of tutorial) for double checking and give me more description about it by private message. Because he was busy and don't have enough time to help, I decided to make a blog for it.

Here's the the deterministic version of Suffix Array (used in the tutorial):

/*
Suffix array O(n lg^2 n)
LCP table O(n)
*/
#include <cstdio>
#include <algorithm>
#include <cstring>

using namespace std;

#define REP(i, n) for (int i = 0; i < (int)(n); ++i)

namespace SuffixArray
{
	const int MAXN = 1 << 21;
	char * S;
	int N, gap;
	int sa[MAXN], pos[MAXN], tmp[MAXN], lcp[MAXN];

	bool sufCmp(int i, int j)
	{
		if (pos[i] != pos[j])
			return pos[i] < pos[j];
		i += gap;
		j += gap;
		return (i < N && j < N) ? pos[i] < pos[j] : i > j;
	}

	void buildSA()
	{
		N = strlen(S);
		REP(i, N) sa[i] = i, pos[i] = S[i];
		for (gap = 1;; gap *= 2)
		{
			sort(sa, sa + N, sufCmp);
			REP(i, N - 1) tmp[i + 1] = tmp[i] + sufCmp(sa[i], sa[i + 1]);
			REP(i, N) pos[sa[i]] = tmp[i];
			if (tmp[N - 1] == N - 1) break;
		}
	}

	void buildLCP()
	{
		for (int i = 0, k = 0; i < N; ++i) if (pos[i] != N - 1)
		{
			for (int j = sa[pos[i] + 1]; S[i + k] == S[j + k];)
			++k;
			lcp[pos[i]] = k;
			if (k)--k;
		}
	}
} // end namespace SuffixArray

I don't get how the code works. Could you please give me some more clear description about it?
What does "tmp" store? It seems it contains something like [0,1,...,N-1], doesn't it?!
What about "pos"? What's the initialization of "tmp"?
Any help would be appreciated...

Full text and comments »

  • Vote: I like it
  • +6
  • Vote: I do not like it

By buGMaster, 9 years ago, In English

Hi again...
I need a tool which parses a source code (especially a c++ code) and generates a new one, that's all the unused parts of the original source code (such as includes, macros, functions, variables, ...) is removed in it.
I think, this tool helps in reducing compile time and run time; also makes the code more cleaner which makes debugging easier.
Is there such a tool? If no, how can I produce this tool? (I mean, how can I recognize unused parts of a my code?)
Any help would be appreciated!

Full text and comments »

  • Vote: I like it
  • +3
  • Vote: I do not like it

By buGMaster, 9 years ago, In English

Hi,
I've planned to concentrate on speed up solving problems A, B and C recently. So I need to solve just these problems, not all the problems of a round and evaluate my performance according to the scores (for each problem) that I've taken.
In order to do this, I thought about something like virtual participation; But I noticed that, I can have at most 1 virtual participation for each round, and because of I will have the same practice plan for D and E problems in the future, I don't want to do that!
Is there any other options for me (on Codeforces or other websites) to do this kind of practice?
What about a tool for calculating the scores (Score Calculator)? Is there?

Thanks..

Full text and comments »

  • Vote: I like it
  • -3
  • Vote: I do not like it

By buGMaster, 11 years ago, In English

I've read this problem in the USACO Training Pages. Here's the problem description...

"**You have just won a contest where the prize is a free vacation in Canada. You must travel via air, and the cities are ordered from east to west. In addition, according to the rules, you must start at the further city west, travel only east until you reach the furthest city east, and then fly only west until you reach your starting location. In addition, you may visit no city more than once (except the starting city, of course).** Given the order of the cities, with the flights that can be done (you can only fly between certain cities, and just because you can fly from city A to city B does not mean you can fly the other direction), calculate the maximum number of cities you can visit."

USACO, itself suggests this solution to solve it:

"**However, if, instead of trying to find the path as described, it is found a different manner, then the number of states greatly decreases. Imagine having two travelers who start in the western most city. The travelers take turns traveling east, where the next traveler to move is always the western-most, but the travelers may never be at the same city, unless it is either the first or the last city. However, one of the traveler is only allowed to make "reverse flights," where he can travel from city A to city B if and only if there is a flight from city B to city A.** It's not too difficult to see that the paths of the two travelers can be combined to create a round-trip, by taking the normal traveler's path to the eastern-most city, and then taking the reverse of the other traveler's path back to the western-most city. Also, when traveler x is moved, you know that the traveler y has not yet visited any city east of traveler x except the city traveler y is current at, as otherwise traveler y must have moved once while x was west of y. Thus, the two traveler's paths are disjoint. Why this algorithm might yield the maximum number of cities is left as an exercise."

But I didn't get it!
Please describe this solution, or suggest another solution to solve.
I'm waiting for your answer. Thanks for your help!

Full text and comments »

  • Vote: I like it
  • +2
  • Vote: I do not like it

By buGMaster, 11 years ago, In English

I've used this recursive relation to solve this classic problem. Here it is:

if (I == n || J == m)
        dp[I][J] = 0;
else if (x[I] == y[J])
        dp[I][J] = dp[I+1][J+1] + 1;
else
        dp[I][J] = max (dp[I+1][J], dp[I][J+1]);

but I've seen the USACO TEXT about Dynamic Programming, that it used this pseudocode for this problem:

   # the tail of the second sequence is empty
 1   for element = 1 to length1
 2     length[element, length2+1] = 0

    # the tail of the first sequence has one element
 3   matchelem = 0
 4   for element = length2 to 1
 5     if list1[length1] = list2[element]
 6       matchelem = 1
 7     length[length1,element] = nmatchlen

    # loop over the beginning of the tail of the first sequence
 8   for loc = length1-1 to 1
 9     maxlen = 0
10     for element = length2 to 1

    # longest common subsequence doesn't include first element
11       if length[loc+1,element] > maxlen
12         maxlen = length[loc+1,element]

    # longest common subsequence includes first element
13       if list1[loc] = list2[element] &&
14                       length[loc+1,element+1]+1 > maxlen
15           maxlen = length[loc,element+1] + 1

16     length[loc,element] = maxlen

It it a bit different with my solution. Instead of dp[I][J] = max (dp[I+1][J], dp[I][J+1]); , it's used below code.

    # longest common subsequence doesn't include first element
11       if length[loc+1,element] > maxlen
12         maxlen = length[loc+1,element]

Why this solution just checks length[loc+1,element] and doesn't check length[loc,element+1]? Does this solution guarantee to find the correct answer?

Please guide me to get this point! Thanks for you help...

Full text and comments »

  • Vote: I like it
  • +4
  • Vote: I do not like it

By buGMaster, 11 years ago, In English

I've confused with the code that is written in the TEXT Dynamic Programming section of USACO Training about a classical problem (Finding Maximum Decreasing Subsequence). This is Article Link. Please help me to get it!

Here's the code:

 1  #include <stdio.h>
 2  #define MAXN 200000
 3  main () {
 4      FILE *in, *out;
 5      long num[MAXN], bestrun[MAXN];
 6      long n, i, j, highestrun = 0;
 7      in = fopen ("input.txt", "r");
 8      out = fopen ("output.txt", "w");
 9      fscanf(in, "%ld", &n);
10      for (i = 0; i < n; i++) fscanf(in, "%ld", &num[i]);
11      bestrun[0] = num[n-1];
12      highestrun = 1;
13      for (i = n-1-1; i >= 0; i--) {
14          if (num[i] < bestrun[0]) {
15            bestrun[0] = num[i];
16            continue;
17        }
18        for (j = highestrun - 1; j >= 0; j--) {
19            if (num[i] > bestrun[j]) {
20                if (j == highestrun - 1 || num[i] < bestrun[j+1]){
21                    bestrun[++j] = num[i];
22                    if (j == highestrun) highestrun++;
23                    break;
24                }
25            }
26        }
27      }
28      printf("best is %d\n", highestrun);
29      exit(0);
30  }

I have 3 problems with it:

1) What exactly lines 14-17 do? For example for the sequence 10, 2, 8, 9, 4, 6, 3 , the result of the that code is 4 but it's subsequence is 10, 8, 4, 2 that it's wrong, because the element 2 in original sequence is before 8 and 4 but in the resulting subsequence is after 8 and 4!

2) Consider the sequence 5, 10, 8, 9, 4, 6, 3. According to above code, the length of the maximum decreasing subsequence is 4 and this subsequence is 10, 5, 4, 3. But in this subsequence opposite of the original sequence 5 is after 10.

3) Is it necessary to check num[i] < bestrun[j+1] condition in inner loop? I think it's satisfied before by num[i] > bestrun[j] condition.

I'm waiting for you helpful answers.
Thanks for your help!

Full text and comments »

  • Vote: I like it
  • +2
  • Vote: I do not like it