Minima/maxima over all fixed-size arrays (multi-dimensional)

Правка en2, от bicsi, 2017-08-13 13:53:14

This article will be presenting a rather classical problem that can be solved using deques, along with an extension that allows you to solve the problem in its more general multi-dimensional case. I have decided to write this article after this discussion on 2D range-minimum query.

The article will be mainly based on this following problem:

You are given an array of numbers A[] of size n and a number k ≤ n. Find the minimum value for each continuous subarray of size k.

We will be now focusing on the linear-time solution to this problem.

Solution:

Consider sweeping from left to right through the array. At every moment we keep a list of "candidates" for minimum values throughout the process. That means that at each moment, you have to add one element to the list and (potentially) remove one element from the list.

The key observation is that, during the sweep line process, we find two values A[i] and A[j] which have i < j and A[i] ≥ A[j], then we can safely discard A[i]. That is because, intuitively, A[j] will continue to "live" in our sweep line more than A[i], and we will never prefer A[i] instead of A[j].

We should now consider pruning all the "useless" values ("useless" as in the statement above). It is easy to see now that doing this will lead to a strictly increasing list of candidates (why?). In this case, the minimum will always be the first element (O(1) query).

In order to insert an element to the back of the pruned candidate list, we will do a stack-like approach of removing all elements that are greater than it, and to erase on element, we just pop the front of the list (if it is not already removed).

This is a well-known approach for finding minima over fixed-size continuous subarrays. I will now present an extensions that allows you to do the same trick in matrices and even multi-dimensional arrays.

The multi-dimensional extension

Problem (2D):

You are given an matrix of numbers A[][] of size n × m and two numbers k ≤ n, l ≤ m. Find the minimum value for each continuous submatrix of size k × l.

Solution:

Consider the matrix as a list of rows. For each row vector of A, use the 1D algorithm to compute the minimum value over all l-length subarrays, and store them in ColMin[][] (obviously, ColMin[][] is now a n × (m - l + 1)-sized matrix).

Now, consider the new matrix as a list of columns. For each column vector of ColMin, use the algorithm to compute the minimum value over all k-length subarrays, and store them in Ans[][] (of size (n - k + 1) × (m - l + 1)).

The Ans[][] is the solution to our problem.

The following picture shows the intutition behind how it works for computing Ans[1][1] for n = 5, m = 7, k = 3, l = 4

The pseudocode is as follows:

def solve_2d(M, k, l):
  column_minima = {} # empty list
  for each row in M.rows:
    # We suppose we have the algorithm that solves
    # the 1D problem
    min_row = solve_1d(row, l)
    column_minima.append_row(min_row)
  
  ans = {}
  for each col in column_minima.cols:
    min_col = solve_1d(col, k)
    ans.append_col(min_col)
  
  return ans

Note that the pseudocode is (deliberately) hiding some extra complexity of extracting rows / columns and adapting the 1D algorithm to the 2D problem, in order to make the understanding of the solution clearer.

The total complexity of the algorithm can be easily deduced to be O(n * m)

Multi-dimensional case analysis

The solution can be extended to an arbitrary order of dimensions. For a d-dimensional matrix of size s1, s2, ..., sd, the time-complexity of the problem is O(d * s1 * ... * sd), and the memory complexity is O(s1 * ... * sd). This is much better than other algorithms that do the same thing on non-fixed size submatrices (e.g. multi-dimensional RMQ has O(s1 * ... * sd * log(s1) * ... * log(sd)) time and memory complexity).

Finding the best k minima

The deque approach itself is limited in the sense that it allows you to find only the minimum value over the ranges. But what happens if you want to calculate more that one minimum? We will discuss an approach that I used during a national ACM-style contest where we were able to calculate the best 2 minima, and then argue that you can extend to an arbitrary number of minimum values.

In order to store the lowest 2 values, we will do the following:

Keep 2 deques, namely D1 and D2. Do a similar algorithm of "stack-like popping" on D1 when you add a new element, but instead of discarding elements from D1 when popping, transfer them down to D2 and "stack-like pop" it.

It is easy to see why the lowest 2 elements will always be in one of the two deques. Moreover, there are only 2 cases for the lowest two elements: they are either the first two elements of D1, or the first elements of D1 and D2 subsequently. Checking the case should be an easy thing to do.

The extension to an arbitrary number of minima is, however, not so great, in the sense that the complexity of this approach becomes O(n * k2) for a n-sized array, currently bottlenecked by the number of elements you have to consider in order to find the first k minima. [Maybe you can come up with a cleverer way of doing that?]

Useful links

This is the problem I referred to above: http://www.infoarena.ro/problema/smax. I recommend trying to think it through and implementing it, and translating the statement via Google Translate or equivalent.

Теги rmq, deque, 2d

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en2 Английский bicsi 2017-08-13 13:53:14 1716 Tiny change: 'blema/smax\n\nI recommen' -> 'blema/smax. I recommen'
en1 Английский bicsi 2017-08-12 16:45:17 4141 Initial revision (published)