I was thinking about the following problem (I don't know if it's out there already) and wanted to know how people here would approach it.

You are given a N x M matrix and Q (say these values are <= 1000) transpose operations to execute each of which is a square sub-matrix to be transposed. Print the matrix after executing all operations.

Can we do better than brute force? My thoughts so far are: store the matrix as a list of diagonals. Transposing a sub-matrix is the same as reversing contiguous sub-arrays of some diagonals. The reversing can be done efficiently using lazy propagation but we still have to iterate over the diagonals of the sub-matrix. Can we do better?

O(N * Q * logN) — the algorithm you described should be able to work in time, thou you may want to use Treaps (not segment trees) for reversing.