Looping through numpy arrays (e.g. moving/rolling window)

Numpy is the cornerstone of matrix based calculations in QGIS (and elsewhere). It does wonders with raster data (unless it hits the limit of available live memory…).

A recurrent problem with Numpy is the implementation of various looping routines, such as the sliding window which is frequently used in image filtering and other approaches focused on cell neighbourhood. Below is the illustration of the problem: for each cell the window needs to query a specified neighbourhood (square, circular or other).

Now, there exists a solution in the form of “stride tricks” within the Numpy library. However it is really cryptic (the SciPy cookbook uses the “devious” epithet); after all the very name “tricks” implies it being … tricky. Numpy documentation actually discourages its use. We can do it in a more understandable manner – which also means more configurable – and without any particular computational overhead.

Let’s first examine the problem. Without Numpy we would need four nested loops: two for traversing the matrix and two for the analysed window.

We can immediately see that the level 2 loop can be easily replaced with a Numpy window: `win = matrix[y-1: y+2, x-1: x+2]`. Easy enough … but this is not the ideal approach. The window has 9 cells, while the matrix would have millions: shouldn’t we focus on the level 1 loop instead?

To avoid the level 1 loop we need to move the entire matrix, as when overlapping two sheets of paper. We’re now working on offsets between the two – which is the only tricky part. The (0,0) cell of the window now corresponds to all cells in the matrix that are on the upper left relative to analysed cells – which simply means shifting the entire matrix to the upper left. In order to move the matrix in that direction, we need to calculate offsets in relation to the central point. That would be `window_index - window_radius` , which for the upper left cell gives (-1, -1) (see below).

The only problem resides in handling these offsets on the level of the entire matrix. We need to extract matrix views that do not spill over the borders AND that match each other in size and shape. The overlap issue is represented on the illustration below.

This problem can be avoided with padding: adding extra row(s) and column(s) around the edges. The depth of the padded area should correspond to the maximum possible offset. The solution is now as follows:

And we’re fine!