MciprianM's blog

By MciprianM, 14 years ago, In English
A couple of weeks ago one of my colleagues at college asked me if I could find the solution to this problem that was in a Microsoft interview. I thought about it and later that night I had a couple of solutions including the O(NlgN)  O(lgN) one (reading the sequences is counted out ) that was required for it. Next morning I also had an implementation in C++ of the solution which worked fine. Now if you want to take a shot at it, the problem statement goes like this:

You are given two sorted sequences of 32-bit integer numbers. Find the median of the sequence obtained by concatening the two sequences.


I will later post the solutions I found, because I like the sort of trade-off that appears whilst you improve your solution's execution time. But I don't want to spoil all the fun. So, for now, I will leave you the time to solve it. If you want to step up, you can e-mail me the solutions at [email protected] and I will post the names of the best 10 solvers in the solution post.

P.S.: Regarding round #17 and #18, I was very busy with the exams and I didn't have the time to participate at them. But I took my shot at round #20 and came out 9th and hopefully I will be free to take my shot at round #19.

UPD1. Thanks to fcdkbear for pointing out that I requested an O(NlgN) algorithm.
UPD2. The two sequences are stored in arrays, so you have random acces to the elements.

  • Vote: I like it
  • +2
  • Vote: I do not like it

14 years ago, # |
  Vote: I like it 0 Vote: I do not like it
// Preudo code: O(length1 + length2)


int next(const Seq& s1, Seq::iterator& it1, const Seq& s2, Seq::iterator& it2) {
int res = 0;
if (it1 == s1.end()) {
res = *it2;
it2++;
return res;
}
if (it2 == s2.end()) {
res = *it1;
it1++;
return res;
}
if (*it1 < *it2) {
res = *it1;
it1++;
return res;
} else {
res = *it2;
it2++;
return res;
}
}

int med(const Seq& s1, const Seq& s2) {
Seq::iterator it1 = s1.begin();
Seq::iterator m1 = s1.begin();
Seq::iterator it2 = s2.begin();
Seq::iterator m2 = s2.begin();
int res = 0;
bool flag = true;
while (it1 != s1.end() || it2 != s2.end()) {
next(s1, it1, s2, it2);
if (flag) {
res = next(s1, m1, s2, m2);
}
flag = !flag;
}
return res;
}

  • 14 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it
    Mine is O(lg^2 N) one.
    so you have 2 array of integers and they are sorted.
    #define size= total size of array 1 and array 2
    let's assume tot=sorted array after you combine array 1 and array 2
    function Occ(x) = total occurrence x in array 1 and array 2( you can do binary search or use STL lower_bound)
    1. if median is odd, the index of median in tot is (size+1)/2, so you do binary search.
        you get the smallest "answer" for Occ(answer)>=(size+1)/2 ->index of median. answer=median itself
    2. if median is even, the index of median in tot are size/2 and size/2+1; where median=(tot[size/2]+tot[size/2+1])/2
       you do binary search two times,
      you get the smallest "answer1" for Occ(answer1)>=size/2
      you get the smallest "answer2" for Occ(answer2)>=size/2+1
      thus median=(answer1+answer2)/2
    Correct me if I'm wrong thx :D
    • 14 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it
      ow.. sorry for that
    • 14 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Where wiil you store the values of Occ(x)? X may be as large as 2^32-1.

      And, as i understand, Occ(x)=total ototal occurrence of numbers less than or equal to x in array 1 and array 2. Am i right?

      • 14 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it
        Sorry. I wanted to ask for an O(lgN) algorithm. I updated the text.
      • 14 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it
        No. you misunderstood me. So Occ(x) is a function
        Occ(x)=lower_bound(array1,array1+size1,number x)+
                     lower_bound(array2,array2+size2,number x)
        thus you get total occurence of x in both array.
        That's why you need lg N again and total complexity become O(lg^2(N))
        • 14 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          I think, your algo will be incorrect if there will be some equal numbers. For example, array1[]={4,4,4} and array2[]={4,4}.

          But I think, we can modify your algo and get the right answer. We can find the number of elemets, that are strictly less than x and number of elements, that are strictly bigger than x(in both arrays). So, we could know the positions of X in the final sorted sequence. We can find the number, which position is size/2(and, maybe, size/2+1), so we will know the answer. 

          • 14 years ago, # ^ |
              Vote: I like it 0 Vote: I do not like it
            hm.. for your case, I still can get it right with my way.
            Occ(4)=5( 3 in array 1 and 2 in array 2), which is bigger than 3(index of median) So median is 4.
            you are doing binary search, so when 4 is valid, you try to find smaller value  so that Occ(x)>=index of median, let's say you try 3,Occ(3) is 0, thus 4 is the smallest number for median.
            • 14 years ago, # ^ |
                Vote: I like it 0 Vote: I do not like it
              So we need to find lower_bound(array1,array1+size1,numberX+1), because lower_bound(array1,array1+3,4) -array1 in my case is equal to 0.
              • 14 years ago, # ^ |
                  Vote: I like it 0 Vote: I do not like it
                ups.. yeah sorry for confusing you, you can use upper_bound of 4 then :)
                but yeah basically it's the same
    • 14 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it
      Solution of this problem strongly based on the meaning of the word "Sequence". I've used the general definition of a sequence. My sequence can:
      1) get current element
      2) move to the next element
      3) ask is the sequence has more elements
      You solution uses random access to is't elements, so it operates with another meaning of the "sequence" word.
      • 14 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it
        Sequence as in CLR :P
        I believe you can take in this problem the sequence as an array of numbers. Like:

        int a[65536];

        in C or C++. So you can have random acces to the elements. Sorry for the misunderstanding. :(
14 years ago, # |
  Vote: I like it 0 Vote: I do not like it
If you post the solutions at comments you might spoil other's fun :D