The problem statement is simple:

**Given a undirected unweighted graph with $$$V$$$ vertex and $$$E$$$ edges, $$$M$$$ ($$$1\leq M\leq V$$$) of the vertex are special vertex. For every vertex, find the $$$K$$$-th nearest special vertex to it.**

For $$$K=1$$$, I realize that this is just a BFS, starting from special vertex can get $$$\mathcal{O}(V+E)$$$ complexity.

But when $$$K>1$$$, I have no idea how to optimize the BFS that can get better complexity that brute force $$$\mathcal{O}(V(V+E))$$$

I am wondering is there any algorithm to find out the answer more efficiency, such as $$$\mathcal{O}(K(V+E))$$$ or $$$\mathcal{O}(V+E)$$$.

Thanks!

Auto comment: topic has been updated by practice080 (previous revision, new revision, compare).Auto comment: topic has been updated by practice080 (previous revision, new revision, compare).Learn about multi-source BFS https://www.geeksforgeeks.org/multi-source-shortest-path-in-unweighted-graph/

I think this will solve your problem

Hi 007bond007, thanks for your reply. But the link you shared is talking about the case when $$$K=1$$$ that I already mentioned in the post, which can solve in $$$\mathcal{O}(V+E)$$$. What I am wondering is the cases when $$$K>1$$$.

I believe you can do something like this to solve it in $$$\mathcal{O}(K(V+E))$$$.

Let's call $$$\operatorname{ans}(u, i), 0 \leq i < K$$$ the answers for each vertex (the answer can be the $$$i$$$-th closest source and its distance). You will start doing a multi-source bfs, only that the queue will store a pair $$$(u, s)$$$ of node $$$u$$$ and source $$$s$$$. For each pair $$$(u, s)$$$ in the queue, iterate over the adjacent vertices $$$v$$$ and check what's the first $$$i$$$ for which you have not found the answer to that vertex. If $$$i < k$$$ and $$$s$$$ is not in the already found closest $$$i$$$ vertices (which can be checked knowing the distance from the last one is smaller than that from $$$s$$$), the answer for $$$(v, i)$$$ has to be $$$s$$$, and push $$$(v, s)$$$ into the queue.

Why is this code correct? Well, we need to prove that you will get to vertex $$$u$$$ sooner from sources that are closer (to obtain a sorted answer) and that any of the first $$$k$$$ sources will reach that vertex. The first thing is easy, because you are processing the vertices in the queue in order of distance to the source. Therefore, if no path gets stopped midways, the vertices that are closer will be set as answers the first. And for the other part, we are only stopping a path when we have already visited that vertex through a path starting from this source, in which case the current path cannot be faster; or when we have already the answers for the first $$$k$$$ vertices, in which case any of those first $$$k$$$ sources will be closer than the current one of the nodes beyond this node.

Complexity? Each node is pushed with at most $$$k$$$ sources. $$$\mathcal{O}(K(V+E))$$$.

Hi edsa, thanks for the very detailed and nice answer.

But I have a question. How do we check that "$$$s$$$ is not in the already found closest $$$i$$$ vertices" by "checked knowing the distance from the last one is smaller than that from $$$s$$$". I think I might misunderstand this line, could you explain this line? Because before we processing the pair ($$$u$$$, $$$s$$$), the vertex $$$v$$$ might have already reached from $$$s$$$. In this case, although the current path is smaller than distance of the last source reach $$$v$$$, it still can't indicate that $$$s$$$ is not in the first $$$i$$$ vertices.

In this case, the $$$s_1$$$ is already reached $$$w$$$, then $$$s_2$$$ reach $$$w$$$, but when we visit $$$w$$$ from $$$v$$$, the distance from the last one(from $$$v_2$$$) is shorter than current path.

(sorry for the very ugly drawing)In the case you are drawing, if you did a multi-source bfs, $$$w$$$ would be reached from $$$s_2$$$ before. Then, it would be reached from $$$s_1$$$ and the distance would be higher than $$$1$$$, which was the latest. Hence, that would mean that $$$w$$$ wasn't visited from $$$s_1$$$, because otherwise the largest distance would be at least the same.

Isn't the $$$w$$$ will reached from $$$s_1$$$ with distance 1? (Using the long edge bottom)

And after that $$$v$$$ will reached from $$$s_1$$$ again, how do we know that the $$$w$$$ is already reached from $$$s_1$$$ before?

I didn't understand that was an edge. Look at the 5 nodes between those two, number then from $$$0$$$ to $$$w-1$$$.

$$$0$$$ and $$$w$$$ are visited with distance $$$1$$$. $$$1$$$ and $$$w-1$$$ are visited with distance $$$2$$$ $$$3$$$ and $$$w-2$$$ are visited with distance $$$3$$$ $$$4 = w-3$$$ is visited with distance $$$4$$$.

How do we know that after visiting $$$4 = w-3$$$ we won't visit $$$3$$$ and $$$w-4$$$? The distance to those vertices would be $$$5$$$, which is indeed greater than the last known distance to that node. We could try to check if the last node was $$$s_1$$$ but maybe the same vertices are visited by another source with distance also $$$4$$$. The simple solution of maintaining the best distances and querying adds $$$\log(k)$$$ to the solution. The other solution is to store for each node the edges that led to this node, needing extra $$$\mathcal{O}(E)$$$ space.

For the last sentence, how does storing the edges led to the vertex helps?

Different sourcescan reach the vertex bysame edge(but maybe different distance)I had to think about this and I'm not sure that this can be done this way, but I'll try.

If you store which edges led to this vertex with distance one less, you can do the following:

Whenever you push a node to the queue from another node, loop through the vertices that have already led to that vertex (from last to first) and add those nodes with the current distance + 2 and your this same node with the distance from that vertex sources to to a temporary queue that you'll add to the queue later. When processing each vertex, you can skip the nodes you already saw.

But the idea is a bit more refined. If you push the sources into the queue in order $$$s_0$$$, $$$s_1$$$, ... $$$s_S$$$, then all the nodes at distance $$$4$$$ from source $$$s_0$$$ will be processed before all nodes at distance $$$4$$$ from source $$$s_1$$$ and so on. Whenever you are going to do what I said in the previous paragraph, you will check if the last edge is from your same source, in which case you can omit all the vertices that come from your same source and only loop through interesting vertices. If this is the case, for each vertex you will only push yourself with the distance to that vertex source + 2.

I would have to program all this to be sure that it works and I feel like I'm missing things because I've not put to much time into thinking this. I'm sorry for not being of more help!

Thanks edsa for helping! This give me some direction for solving this problem.

Sorry, I misunderstood the question :))

Auto comment: topic has been updated by practice080 (previous revision, new revision, compare).Auto comment: topic has been updated by practice080 (previous revision, new revision, compare).