What does 'spatial relationship' mean in a CNN?

The relationship between convolution in mathematics and convolution in a CNN

Using the notation from the Wikipedia page, the convolution in a CNN plays the role of the kernel g: we learn its weights so that it extracts the information we need, and then we may apply an activation function.
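For concreteness, here is a minimal sketch (in PyTorch, which is just one possible framework and not something the answer above assumes) of a convolutional layer whose weight tensor plays the role of g, followed by an activation:

```python
import torch
import torch.nn as nn

# The layer's weight tensor plays the role of the kernel g;
# its values are learned during training.
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, bias=False)

x = torch.randn(1, 1, 5, 5)   # a toy 5x5 single-channel input
y = torch.relu(conv(x))       # convolve, then apply an activation function
print(conv.weight.shape)      # torch.Size([1, 1, 3, 3]) -- the learned kernel g
```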

On the Wikipedia page, the discrete convolution is defined as

$$(f * g)[n] = \sum_{m=-\infty}^{\infty} f[m]\, g[n-m]$$

Suppose a is the function f and b is the convolution kernel g. As a concrete example, let a[n] take the values 1, 0.5, 1, 1 at n = 0, 1, 2, 3 and b[n] take the values 1, 0.5, 0.25 at n = 0, 1, 2, with both functions equal to zero everywhere else.

To evaluate this, we first flip the kernel about the vertical axis; this is what the n − m in the equation does. Then we compute the sum for each value of n. As n changes, the original function a[m] does not move, while the flipped kernel b[n − m] is shifted accordingly. Starting from n = 0:

$$c[0] = \sum_m a[m]\, b[0-m] = 0 \cdot 0.25 + 0 \cdot 0.5 + 1 \cdot 1 + 0.5 \cdot 0 + 1 \cdot 0 + 1 \cdot 0 = 1$$

$$c[1] = \sum_m a[m]\, b[1-m] = 0 \cdot 0.25 + 1 \cdot 0.5 + 0.5 \cdot 1 + 1 \cdot 0 + 1 \cdot 0 = 1$$

$$c[2] = \sum_m a[m]\, b[2-m] = 1 \cdot 0.25 + 0.5 \cdot 0.5 + 1 \cdot 1 + 1 \cdot 0 = 1.5$$

$$c[3] = \sum_m a[m]\, b[3-m] = 1 \cdot 0 + 0.5 \cdot 0.25 + 1 \cdot 0.5 + 1 \cdot 1 = 1.625$$

$$c[4] = \sum_m a[m]\, b[4-m] = 1 \cdot 0 + 0.5 \cdot 0 + 1 \cdot 0.25 + 1 \cdot 0.5 + 0 \cdot 1 = 0.75$$

$$c[5] = \sum_m a[m]\, b[5-m] = 1 \cdot 0 + 0.5 \cdot 0 + 1 \cdot 0 + 1 \cdot 0.25 + 0 \cdot 0.5 + 0 \cdot 1 = 0.25$$

As you can see, this gives exactly the sequence c[n] = (1, 1, 1.5, 1.625, 0.75, 0.25): we slid the flipped function b[n] across the function a[n].
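If you want to double-check these sums, here is a minimal sketch in Python/NumPy, assuming the example values of a[n] and b[n] stated above:

```python
import numpy as np

# Example values assumed above (both sequences are zero outside the listed range).
a = [1.0, 0.5, 1.0, 1.0]   # a[0], a[1], a[2], a[3]
b = [1.0, 0.5, 0.25]       # b[0], b[1], b[2]

def at(seq, idx):
    """Value of a finite sequence at idx, treated as zero outside its support."""
    return seq[idx] if 0 <= idx < len(seq) else 0.0

# Literal flip-and-slide sum: c[n] = sum_m a[m] * b[n - m]
c = [sum(at(a, m) * at(b, n - m) for m in range(len(a)))
     for n in range(len(a) + len(b) - 1)]
print(c)                               # [1.0, 1.0, 1.5, 1.625, 0.75, 0.25]

# NumPy's convolve computes exactly the same sum.
print(np.convolve(a, b, mode="full"))  # [1.    1.    1.5   1.625 0.75  0.25 ]
```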

For example, if we have the matrix in green

with the convolution filter

Then the resulting operation is an element-wise multiplication followed by a summation of the terms, as shown below. Much like on the Wikipedia page, this kernel (the orange matrix) g is shifted across the entire function (the green matrix) f.
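As a sketch of that sliding multiply-and-sum in code (the input and kernel values below are stand-ins for the green and orange matrices and may not match the original figure):

```python
import numpy as np

def slide_kernel(image, kernel):
    """Slide `kernel` over `image` and sum the element-wise products
    at each position (no flip, 'valid'-style output)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Stand-in values for the green input matrix and the orange kernel.
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]], dtype=float)
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]], dtype=float)

print(slide_kernel(image, kernel))
# [[4. 3. 4.]
#  [2. 4. 3.]
#  [2. 3. 4.]]
```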

The image is taken from the link that @Hobbes referenced. You will notice that the kernel g is not flipped, as it was in the explicit computation of the convolution above. This is a matter of notation, as @Media points out: strictly speaking, this operation should be called cross-correlation. Computationally, however, the difference does not affect the performance of the algorithm, because the kernel's weights are learned; adding the flip would simply make the algorithm learn the same weights in mirrored cells of the kernel. So we can omit the flip.
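To convince yourself that the flip is only a relabeling of the kernel's cells, here is a small check using SciPy (not part of the original answer): true convolution of an image with a kernel gives the same result as cross-correlation with the flipped kernel.

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

rng = np.random.default_rng(0)
image = rng.random((5, 5))
kernel = rng.random((3, 3))

# True convolution flips the kernel; cross-correlation slides it as-is.
conv = convolve2d(image, kernel, mode="valid")
corr_with_flipped = correlate2d(image, np.flip(kernel), mode="valid")

# Identical results: flipping just moves each weight to a mirrored cell.
print(np.allclose(conv, corr_with_flipped))  # True
```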