2D Row Update in Boolean Matrix

Revision en7, by OrangeFFT, 2021-04-15 01:59:49

Hey everyone!

I am working on a project where I need to implement some operations efficiently. Unfortunately, I couldn't find any problem similar to this one, so I'll describe it.

The problem

Assume we are given an $$$N \times N$$$ matrix containing only zeroes and ones. I'm interested in a way, possibly a data structure built over the matrix, that can answer queries and perform updates of these types:

  1. Q j: find the number of the first (bottom-most) row that has a $$$1$$$ on the $$$j$$$-th column of the matrix, or report that there are no 1s on that column ($$$1 \le j \le N$$$).
  2. U i: clear the $$$i$$$-th row, i.e. completely fill it with zeroes ($$$1 \le i \le N$$$).

Although I'd also be happy to know a more general way to do so, in case that helps in my particular scenario it is guaranteed that all the queries are given in a non-decreasing order of columns (i.e. when a query is made about the first position of a $$$1$$$ in the $$$j$$$-th column, no query will be made later referring to columns $$$1..j-1$$$, even though multiple consecutive queries can be made for the same column). Also, no updates will try to clear the same row twice.

Example

So, here's an example of a possible sequence of queries/updates over the following $$$4 \times 4$$$ matrix (rows are numbered from $$$1$$$ to $$$4$$$ bottom-up and columns are numbered from $$$1$$$ to $$$4$$$ from left to right):

\begin{array}{|c|c|c|c|} \hline 1 & 1 & 0 & 0 \cr \hline 0 & 1 & 0 & 0 \cr \hline 0 & 0 & 1 & 0 \cr \hline 0 & 1 & 1 & 1 \cr \hline \end{array}

  1. Q 1: the first query asks us to find the bottom-most row containing a $$$1$$$ in the first column, so we report $$$4$$$.
  2. U 4: the fourth row is cleared and so the new updated matrix is this:

\begin{array}{|c|c|c|c|} \hline 0 & 0 & 0 & 0 \cr \hline 0 & 1 & 0 & 0 \cr \hline 0 & 0 & 1 & 0 \cr \hline 0 & 1 & 1 & 1 \cr \hline \end{array}

  1. Q 2: the bottom-most $$$1$$$ on column $$$2$$$ is on row $$$1$$$, so we report $$$1$$$.
  2. U 1: row $$$1$$$ is cleared and the new updated matrix becomes:

\begin{array}{|c|c|c|c|} \hline 0 & 0 & 0 & 0 \cr \hline 0 & 1 & 0 & 0 \cr \hline 0 & 0 & 1 & 0 \cr \hline 0 & 0 & 0 & 0 \cr \hline \end{array}

  1. Q 2: the bottom-most $$$1$$$ on column $$$2$$$ is on row $$$3$$$ now, so we report $$$3$$$.
  2. U 3: row $$$3$$$ gets erased:

\begin{array}{|c|c|c|c|} \hline 0 & 0 & 0 & 0 \cr \hline 0 & 0 & 0 & 0 \cr \hline 0 & 0 & 1 & 0 \cr \hline 0 & 0 & 0 & 0 \cr \hline \end{array}

  1. Q 3: we report $$$2$$$.
  2. U 2: the entire matrix becomes empty:

\begin{array}{|c|c|c|c|} \hline 0 & 0 & 0 & 0 \cr \hline 0 & 0 & 0 & 0 \cr \hline 0 & 0 & 0 & 0 \cr \hline 0 & 0 & 0 & 0 \cr \hline \end{array}

  1. Q 3: since there are no ones in column $$$3$$$, we report that fact.

Discussion

Can anyone think of a data structure that can be built in a time nearly proportional to the matrix size (something like between $$$\mathcal{O}(N^2)$$$ and $$$\mathcal{O}(N^2 \log^2 N)$$$ or so) that can support these operations online in $$$\mathcal{O}(\log N)$$$ or $$$\mathcal{O}(\log^2 N)$$$ per query/update?

I first thought of 2D segment trees/Fenwick trees, but the notion of 1D lazy propagation doesn't seem to extend to higher dimensions (although it is possible to perform range sum queries + range add updates or even support operations like $$$(+,+)$$$, $$$(\times, \times)$$$, $$$(\wedge, \wedge)$$$, $$$(\vee, \vee)$$$ in $$$\mathcal{O}(\log^2 N)$$$ per query/update). I thought of somehow implementing the notion of clearing a row using range sum updates and binary searching the first $$$1$$$ in a column with range sum queries, possibly maintaining multiple data structures at the same time for crossing results about ones and zeroes, but it seems a bit hard to model the problem using only range sums (but perhaps it is possible nonetheless).

Also, as far as I know a quad-tree takes $$$\mathcal{O}(N)$$$ time to clear an entire row (which is precisely the kind of operation required in this case).

Now, even though these operations may seem difficult to implement for the general case, this problem is very specific and has some interesting properties we could possibly take advantage of:

  • an update will simply clear an entire row (it seems just like "removing" it), we don't need to support range updates to arbitrary intervals.
  • given the presented assumptions, the number of updates is guaranteed to be $$$\mathcal{O}(N)$$$ given that no row will get cleared twice. Because of that, after all the updates are done we theoretically will have "visited" only $$$\mathcal{O}(N)$$$ ones in the matrix, one per each row, regardless of the density/sparsity of the original matrix (even if it started as completely filled with ones), so it seems that it may be possible to have something from $$$\mathcal{O}(N)$$$ to $$$\mathcal{O}(N \log^2 N)$$$ time for $$$N$$$ updates in total (instead of $$$\mathcal{O}(N^2)$$$ for $$$N$$$ updates in total).
  • a query will only refer to a specific column (and not to arbitrary subrectangles in the matrix) and, as said, the queries are made in non-decreasing order of columns.
  • we're only dealing with zeroes and ones, not with arbitrary integers, so for instance these queries/updates are all equivalent:

    • queries (enabling a binary search): range sum in column, range OR in column, range max in column, ...
    • updates: set entire row to $$$0$$$, multiply every element in row by $$$0$$$, perform an AND between each element in the row and $$$0$$$, update each element $$$x$$$ in the row to $$$\min(0,x)$$$, ...

Does anyone have an idea on how to solve this online in something close to $$$\mathcal{O}(\log N)$$$ or $$$\mathcal{O}(\log^2 N)$$$ per query/update (possibly amortized)? :)

Thank you!

Tags rsq, 2d, 2d arrays, 2d binary indexed tree, 2d segment tree, boolean-matrix, range query, range update, quad-tree, data structures, online query

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en7 English OrangeFFT 2021-04-15 01:59:49 2 changed variable capitalization to match the previously defined one
en6 English OrangeFFT 2021-04-13 14:37:04 588 added the observation that there are at most O(N) updates + added tag "online query"
en5 English OrangeFFT 2021-04-13 03:31:09 21 added "possibly amortized"
en4 English OrangeFFT 2021-04-13 03:28:10 32 removed statement about performing max/min queries/updates in 2D
en3 English OrangeFFT 2021-04-13 03:07:35 0 (published)
en2 English OrangeFFT 2021-04-13 03:06:38 58 added another possibility for query/update runtime
en1 English OrangeFFT 2021-04-13 02:50:38 5078 Initial revision (saved to drafts)