GRAYRHINO's blog

By GRAYRHINO, history, 7 weeks ago, In English

Given an array of integers (duplicates are possible), find the minimum number of swaps to sort the array. For [3,9,2,4,2] it should be 2 : [3, 9, 2, 4, 2] -> [2, 9, 3, 4, 2] -> [2, 2, 3, 4, 9]

If the elements are distinct then there's a time complexity O(nlog(n)) and space complexity O(n) solution here. However, I couldn't find any satisfactory proof for solutions where array contains duplicates. So, I have two questions:

  1. What is the most optimal solution when all the elements are distinct?
  2. Any algorithm when the array contains duplicates?

This seems to be very common questions in top tech interviews. Your ideas and suggestions (possibly code) would be much appreciated.

 
 
 
 
  • Vote: I like it
  • +91
  • Vote: I do not like it

»
7 weeks ago, # |
  Vote: I like it -16 Vote: I do not like it

It's with the same basic principle, let's take the i-th element in the array, we have to swap it (at some point) with the j-th element if, j < i and the i-th element is smaller, or j > i and the j-th element is smaller. this many swaps is necessary as there are at least this many inversions, and we can also see that it is sufficient, so we need to calculate this amount. doing so can be easily done using segment trees, in O(n.log(n)) time and O(n) memory.

  • »
    »
    7 weeks ago, # ^ |
      Vote: I like it +31 Vote: I do not like it

    You are counting no of inversions and claiming that answer is no of inversions. It is wrong.
    He never said you can swap only adjacent elements.

    In fact answer of this problem is always less than n. While no of inversions can be O(n^2).

    • »
      »
      »
      7 weeks ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Thanks for highlighting that. Before posting the question here, I have seen various solutions but almost all of them make some assumptions such as elements are distinct, only adjacent swaps are allowed, etc. While all those assumptions are valid and there's something to learn from them, the reason I posted here is to get feedback for the worst case that is when duplicates are allowed and you can swap any two elements. Hope we can get some satisfactory explanation.

      • »
        »
        »
        »
        7 weeks ago, # ^ |
          Vote: I like it +13 Vote: I do not like it

        The problem seems interesting but IDK how to solve it.

        Lebossle wrote about this problem -
        "you can model it as a graph where nodes are groups of equal elements and edges say where they want to go, which ends up being a circulation (degree in = degree out). The problem is thus decomposition into maximum amount of cycles. Dual is minimum amount of edges to cut to make it a DAG. Not sure how to move from here, I'll think if simplex translates into something fast, like a flow, or maybe it's matroid intersection which I still need to learn"

        • »
          »
          »
          »
          »
          7 weeks ago, # ^ |
            Vote: I like it +10 Vote: I do not like it

          Minimum number of edges to cut to make a Graph a DAG is known as the Feedback Arc Set Problem. This is known to be NP-Complete.

»
7 weeks ago, # |
Rev. 2   Vote: I like it +16 Vote: I do not like it

Cycle Sort — Read the second point please. UPD: It is wrong

  • »
    »
    7 weeks ago, # ^ |
    Rev. 5   Vote: I like it 0 Vote: I do not like it

    It's wrong. It goes the correct answer only for a stable sort. Basically, it assumes if i<j and a[i]==a[j] then ith element should be before j or maybe it fails on this as well. It never ensures that result it will give is minimum.

    In the linked pseudo code.
    First comment out writes+=1 in array[pos], item = item, array[pos];writes += 1. Then it will return no of swaps required instead of no of writes operations.

    Then consider 2 4 5 1 6 3 it will return 4 as an answer which is correct. But for 2 3 5 1 6 3 it returns 5 but the correct answer is 4.

    Upd: It minimizes no of writes which is different from no of swaps.

  • »
    »
    7 weeks ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Your suggested solution gives 4 for [3,9,2,4,2] and 3 for [5, 1, 3, 2] while for both of them the answer is 2. Am I missing something?

»
7 weeks ago, # |
  Vote: I like it +19 Vote: I do not like it

Hello, can you give a link to the task?

  • »
    »
    7 weeks ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Unfortunately, I do not have any links. I read about it from someone's interview experience at Google. The question is intentionally vague and obviously you can make assumptions and solve it for special cases as discussed above. To test your limits the interviewer will make it difficult after each correct solution.

»
7 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

Pls, provide a link to the problem.

»
7 weeks ago, # |
  Vote: I like it -16 Vote: I do not like it

If you really need ANY solution for duplicates: just make an array of indexes (basically it is 0, 1, .. n-1). Then you can swap adjacent indexes of this array in original one and check if the original array is sorted after it or not. After that you take next permutation of your array of indexes. This solution works for N!. If you have any doubts you can ask me.