faresbasbas's blog

By faresbasbas, history, 4 years ago, In English

Recently I've read about minimum number of needed edges to construct SCC from a DAG. And it was the MAX(x,y) where x is the number vertices with 0 in-degree and y is the number of vertices with 0 out-degree. But i am wondering how to find such combination of edges in O(N) time. Thanks in advance :)

edit : solved. thank you very much for all people who helped me :)

  • Vote: I like it
  • +55
  • Vote: I do not like it

»
4 years ago, # |
Rev. 3   Vote: I like it +6 Vote: I do not like it

Let $$$s_1, s_2, ..., s_n$$$ will be $$$0$$$ in-degree vertices and $$$e_1, e_2, ..., e_m$$$ vertices with $$$0$$$ out-degree. Assume that $$$m \geq n$$$ (if it's wrong we can, for example, change direction of all edges).

Notice, that out-degree of every $$$e_i$$$ vertex should be at least one after adding edges. But we can add only $$$m$$$ edges. So, every edge should begins in one of $$$e$$$ vertices and out-degree of every vertex from $$$e$$$ will be exactly $$$1$$$. Let's build these edges.

Run dfs from $$$s_1$$$ and find all $$$0$$$ out-degree vertices, which achievable from $$$s_1$$$. Let's call these vertices $$$c_1, c_2, ..., c_k$$$. Now add edges $$$(c_1, c_2)$$$, $$$(c_2, c_3), ..., (c_{k-1}, c_{k})$$$ and $$$(c_k, s_2)$$$. Then repeat this process with other vertices from $$$s$$$. On every step we shouldn't touch vertices from $$$e$$$, which were visited on previous steps. And don't forget about edge $$$(c_{nk_n}, s_1)$$$ in the end.

Total complexity is $$$O(N+M)$$$.

  • »
    »
    4 years ago, # ^ |
    Rev. 5   Vote: I like it +10 Vote: I do not like it

    Ok it looks very clear for me thank you very much except this part (cnkn,s1) which is the last edge, but how can we deal with it if we had multiple DAGs in the same graph?

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 2   Vote: I like it +16 Vote: I do not like it

      It also works with multiple DAGs in the same graph. After adding edges there will be cycle $$$(s_1, c_{11}), (c_{11}, c_{12}), ... (c_{1k_{1}-1}, c_{1k_1}), (c_{1k_1}, s_2), (s_2, c_{21}), (c_{21}, c_{22}), ...,$$$ $$$(c_{2k_{2}-1}, c_{2k_2}), (c_{2k_2}, s_3), ..., (c_{(n-1)k_{n-1}}, s_n), ... , (c_{nk_{n}}, s_1)$$$. This cycle covers all $$$0$$$-in degree verteces, from which we can reach any vertex.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +22 Vote: I do not like it

        What's your response for Radewoosh's counter example?

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it +5 Vote: I do not like it

          His counter example works :)

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
              Vote: I like it +5 Vote: I do not like it

            why ? won't you connect 3 -> 4 , 4 -> 1 which is wrong ?

            • »
              »
              »
              »
              »
              »
              »
              4 years ago, # ^ |
                Vote: I like it +13 Vote: I do not like it

              There are will be 2 CSS: (2) and (1, 3, 4)

              • »
                »
                »
                »
                »
                »
                »
                »
                4 years ago, # ^ |
                Rev. 2   Vote: I like it +5 Vote: I do not like it

                each vertex should be in a different SCC in the example 1,3,4 can't reach each other out only 1 can reach 3,4 but 3,4 can't reach anything

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it +13 Vote: I do not like it

                   Graph after adding 3->4 and 4->1 edges.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it +5 Vote: I do not like it

                  i mean it's still not a SCC but i need to be an SCC

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it +5 Vote: I do not like it

                  the solution of this example should be 4 -> 1 , 3 -> 2 while your idea's logic say the answer is 3 -> 4 , 4 -> 1 which is wrong

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it +10 Vote: I do not like it

                  I meant that Radewoosh's counter example works against my solution. So, this approach doesn`t work.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  ohh ok i misunderstoond you sorry :)

                  but i appreciate you helping me :)

                  thank you very much

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +34 Vote: I do not like it

    And what do you propose we do if all vertices reachable from $$$s_i$$$ were previously visited?

  • »
    »
    4 years ago, # ^ |
    Rev. 3   Vote: I like it +53 Vote: I do not like it

    n=4, m=4, edges:

    1 3
    1 4
    2 3
    2 4
    
»
4 years ago, # |
  Vote: I like it +3 Vote: I do not like it

Any other ideas of how to make it work :)?

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +13 Vote: I do not like it

    I have a solution which I think should work. Let
    Number of nodes with 0 in-degree = x
    Number of nodes with 0 out-degree = y
    I am considering y $$$\geq$$$ x. If y $$$<$$$ x then we can reverse the edges.

    First, colour all the x nodes with colours from 0 to x-1. Now do a dfs from each of these x nodes. While doing dfs from ith node, colour all nodes you visit with the colour of the root node (the node with 0 in-degree from which we are running dfs currently). In dfs, whenever you encounter a node which has already been coloured, simply return as it is guaranteed that all nodes in it's subtree are already coloured. Now our graph will have these properties-

    Here,
    root node of ith colour is node which has colour i and in-degree 0 and,
    leaf nodes of ith colour are nodes which have colour i and out-degree 0.

    1. Each node will be coloured by some colour c.
    2. There will be atleast 1 node of each colour.

    Now let us consider all colours c such that there exist atleast 1 node with colour c and out-degree = 0. Let there are p such colours f1, f2....fp. Also, select one leaf node of each of these p colours. Now add edges like this-
    leaf(f1) -> root(f2), leaf(f2) -> root(f3), .... leaf(fp) -> root(f1). Total p edges would be added like this.

    Now, there are x-p colours left which do not have any leaf. Let these be c1, c2 ... cx-p. We have y-p leaf nodes left. Now out of these y-p leaves, choose any x-p leaves and let those be l1, l2....lx-p. Add edges like this-
    l1 -> root(c1), l2 -> root(c2) ... lx-p -> root(cx-p). Total x-p edges will be added like this.

    Total edges added till now = p+x-p = x.

    We still have y-x leaves left. Add edges from these leaves to root of any colour. So total y edges would be added. It can easily be shown that we can reach from any node of colour fi to any node of colour fj, from any node of colour fi to any node of colour cj and from any node of colour ci to any node of colour cj.

    I do feel that it might fail on a case like this-
    We have to go from some node of colour ci to some node of colour fj. It is possible in my algorithm that from each node of colour ci, I am only able to reach a leaf node of colour fj which is connected to a root node of colour ck and thus I won't be able to reach every node of colour fk. In this case, I will choose a colour k such that it has a leaf node connected to root(fi) and at least 1 leaf node connected to root(cj). I will swap these two leaf nodes and the problem would be solved.

    It is a very long solution, I am sorry for any typos in between or any mistake. I think this should solve your problem in linear time. Please correct me if I am wrong somewhere.

»
4 years ago, # |
  Vote: I like it +13 Vote: I do not like it

Hello, this is known as the strong connectivity augmentation problem. You can find a solution described in this paper.

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    Actually, I've read this paper more than once but the idea isn't very clear to me. Also I tried to just implement it, And it didn't work, probably there is something I missed, But I am not able to figure it out.

»
4 years ago, # |
Rev. 2   Vote: I like it +21 Vote: I do not like it

There is a pretty nice and neat algorithm for this.

The algorithm implemented in Python

The basic idea of the algorithm is to greedily try to match sources with sinks using dfs. Then when no more matches can be found, the sources/sinks that never found a match must be able to reach/be reachable by some node in matched_sinks/matched_sources. You can then strongly connect everything using $$$\text{max}(\text{size}(\text{sources}), \text{size}(\text{sinks}))$$$ edges.

Properties of the greedy matching

The process of greedily matching source and sinks has the following properties:

  • matched_sources[i] can reach matched_sinks[i]
  • Every source in never_matched_sources can reach at least one sink in matched_sinks.
  • Every sink in never_matched_sinks can be reached by at least one source in matched_sources.

How to add the edges

To make the DAG strongly connected.

  1. Add edges to create one big SCC out of all of the matched sources and sinks. (Note that from the properties of the greedy matching, all of the unmatched sources can now reach this SCC, and the unmatched sinks can all be reached from this SCC.)
  2. Start adding edges to strongly connect pairs of unmatched sources and sinks. If there are more unmatched sources than sinks, or vice versa, then just strongly connect the excess to any node in the SCC.
  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    so as i did understand

    firstly:

    for any source i will try to connect it to exactly one sink which it can reach if not i will add it to never_matched_sources

    secondly:

    i will find all of the unmatched sinks and put them in never_matched_sinks

    thirdly:

    strongly connect the never_matched_sinks to the never_matched_sources

    fourthly:

    i have to strongly connect the rest

    is it this ?

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      "firstly: for any source i will try to connect it to exactly one sink which it can reach if not i will add it to never_matched_sources"

      Note that the dfs never visits the same node twice. So the formulation should be "it can reach using unvisited nodes"

      "thirdly: strongly connect the never_matched_sinks to the never_matched_sources"

      I start by strongly connecting all of the matched sources and sinks into one big SCC. Think of it like making one big cycle.

          ret =  [(0, P - 1)]
          ret += ((i, i - 1) for i in range(1, P))
      

      Once I've made the big SCC, I then strongly connect the unmatched sources and sinks to it.

          ret += ((i, i) for i in range(P, K))
          ret += ((0, i) for i in range(K, N))
          ret += ((i, 0) for i in range(K, M))
      
      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Yeah it's very clear to me i did implement it and it looks to work for single component graph, but can you solution be generalized for several DAGs in the graph ?

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it +10 Vote: I do not like it

          Still not sure what you are talking about. What do you even mean by single component graph when it comes to directed graphs? The algorithm works for any DAG.

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            take this for example

            n = 4 , m = 2

            1 2

            2 3

            you will connect 3 -> 1 then you will have 4 as a sink and source at the same time and you can add only 1 more edge what would you do in this case ?

            • »
              »
              »
              »
              »
              »
              »
              4 years ago, # ^ |
              Rev. 3   Vote: I like it +5 Vote: I do not like it

              For that graph matched_sources will become [1,4] and matched_sinks will become [3,4] (note this means (source) 1 is matched with (sink) 3, and (source) 4 is matched with (sink) 4, so 4 is matched with itself). There are no unmatched sources or sinks.

              To create one big SCC out of the matched sources and sinks, I run this code

                  ret =  [(0, P - 1)]
                  ret += ((i, i - 1) for i in range(1, P))
              

              which connects 3->4 and 4->1. It does not connect 3->1. Think of this process as making one big cycle.

              • »
                »
                »
                »
                »
                »
                »
                »
                4 years ago, # ^ |
                  Vote: I like it 0 Vote: I do not like it

                ohh ok

              • »
                »
                »
                »
                »
                »
                »
                »
                4 years ago, # ^ |
                  Vote: I like it +5 Vote: I do not like it

                is it possible to explain the last 4 loops cause i am not used to python :)

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it +5 Vote: I do not like it
                  ret =  [(0, P - 1)]
                  ret += ((i, i - 1) for i in range(1, P))
                  

                  this is the same as

                  ret =  [(0, P - 1)]
                  for i in range(1, P):
                    ret.append((i, i - 1))
                  

                  which in C++ would look something like

                  vector<pair<int,int>> ret{{0, P - 1}};
                  for (int i = 1; i < P; ++i)
                    ret.push_back({i, i - 1});
                  
                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it +5 Vote: I do not like it

                  ohh yeah i just got it all right now thank you very much :))

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
              Vote: I like it +5 Vote: I do not like it

            is my point clear to you ?

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    Also how would you deal with it if the graph is made of multiple DAGs ?

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      The definition of DAG does not require the nodes to be connected. For example a graph with no edges is technically a DAG. So multiple DAGs is still a DAG.

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by faresbasbas (previous revision, new revision, compare).