Avisynth’s convolution stuff explained

Ever since I started working with Avisynth, I’ve been impressed by people who come up with convolution matrices: lots of numbers, and I had no idea what they meant. Reading the help pages on avisynth.org didn’t help. For everyone who wants to know how this works, here’s a little guide (thanks to Kuukunen for explaining).

Alright. First of all, some basics: GeneralConvolution takes an RGB clip. RGB has about 16.7 million colours (256 levels per channel, 0 to 255, for each of red, green and blue, and 256*256*256 = 16777216). The convolution processes each channel separately, so you don’t even need to care about that.

Our first test movie is 5×1 pixels in resolution. Let’s assume the pixel values 0 0 255 0 0, which means the first 2 px are black, then comes a white px, then 2 more black ones. In short: a black movie with a white dot in the middle.

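Normally we can decide between a 3×3 and a 5×5 matrix. For simplicity we start with a one-dimensional 3×1 matrix. A 3×1 matrix would be 0 0 1, a 5×1 matrix would be 0 0 0 0 1. Every value in the matrix stands for one pixel: the middle value for the current pixel, the other values for its neighbours. Each pixel is multiplied by its matrix value, the products are added up, and this is repeated for every pixel in the movie. At the edges the outermost px is mirrored. So you can look at it this way: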

Current pixel | Pixels used by matrix          | Matrix | Calculation           | Result
--------------+--------------------------------+--------+-----------------------+-------
0             | 0 0 0   (first 0 is mirrored)  | 0 0 1  | 0*0 + 0*0 + 0*1       | 0
0             | 0 0 255                        | 0 0 1  | 0*0 + 0*0 + 255*1     | 255
255           | 0 255 0                        | 0 0 1  | 0*0 + 255*0 + 0*1     | 0
0             | 255 0 0                        | 0 0 1  | 255*0 + 0*0 + 0*1     | 0
0             | 0 0 0   (last 0 is mirrored)   | 0 0 1  | 0*0 + 0*0 + 0*1       | 0
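If you want to check the table yourself, here is a minimal Python sketch (plain Python, not Avisynth code). The function name convolve_1d is made up for this example, and it assumes that "mirroring" at the edges simply means repeating the outermost pixel, which is what the table does.

```python
def convolve_1d(pixels, kernel):
    """Apply a 3-value matrix to one row of pixels.

    Assumption for this sketch: the outermost pixel is repeated once on
    each side, as in the "mirrored" rows of the table.
    """
    padded = [pixels[0]] + list(pixels) + [pixels[-1]]
    return [
        # padded[i:i + 3] is (left neighbour, current pixel, right neighbour)
        sum(p * k for p, k in zip(padded[i:i + 3], kernel))
        for i in range(len(pixels))
    ]


row = [0, 0, 255, 0, 0]
shifted_left = convolve_1d(row, [0, 0, 1])
shifted_right = convolve_1d(row, [1, 0, 0])
print(shifted_left)   # [0, 255, 0, 0, 0]
print(shifted_right)  # [0, 0, 0, 255, 0]
```

The matrix 0 0 1 copies each pixel’s right neighbour into its place, which is why the dot ends up one px further left.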

As you can see, the result of our matrix is that the white dot was simply shifted 1 px to the left. Likewise, 1 0 0 would shift the image to the right. The matrix can hold any values, negative or positive. I guess the most commonly used ones are 0, -1 and some positive value. We’ll come to -1 later.

Anyway, you’ve just seen how shifting via convolution works, so let me show you how to blur. You can do a simple blur with a matrix of 1 1 1. With this matrix, however, we have to talk about something else, called the „divisor“: the sum of the products is divided by it. Basically, all values in your matrix are summed up to make the divisor. So a matrix of 1 1 1 means a divisor of 3. A matrix of 2 4 2 would make a divisor of 8. A matrix of -1 2 -1 would mean a divisor of 0 (here the automatic detection of the divisor fails). Let’s take a look at a table which shows what the blur does, still using 0 0 255 0 0 (our 5×1 px white dot video):

Current pixel | Pixels used by matrix          | Matrix (div=3) | Calculation                 | Result
--------------+--------------------------------+----------------+-----------------------------+-------
0             | 0 0 0   (first 0 is mirrored)  | 1 1 1          | (0*1 + 0*1 + 0*1) / 3       | 0
0             | 0 0 255                        | 1 1 1          | (0*1 + 0*1 + 255*1) / 3     | 85
255           | 0 255 0                        | 1 1 1          | (0*1 + 255*1 + 0*1) / 3     | 85
0             | 255 0 0                        | 1 1 1          | (255*1 + 0*1 + 0*1) / 3     | 85
0             | 0 0 0   (last 0 is mirrored)   | 1 1 1          | (0*1 + 0*1 + 0*1) / 3       | 0

So you can see that instead of 0 0 255 0 0 we now have 0 85 85 85 0. That means the value of the white dot was spread 1 px to the left and right and reduced, which is what we call blurring. Not that difficult, is it? By the way, this should be the same as Blur(1.58), i.e. Blur with its maximum value.
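The divisor rule and the blur table can be sketched in Python like this. The names kernel_sum and blur_1d are hypothetical, the rounding is just Python’s round(), and raising an error for a sum of 0 is a choice made for this sketch only; it says nothing about what Avisynth itself does in that case.

```python
def kernel_sum(kernel):
    """The automatic divisor described above: the sum of all matrix values."""
    return sum(kernel)


def blur_1d(pixels, kernel, divisor=None):
    """Convolve one row and divide every result by the divisor."""
    if divisor is None:
        divisor = kernel_sum(kernel)
    if divisor == 0:
        # Choice of this sketch: refuse to guess when the sum is 0.
        raise ValueError("matrix sums to 0, pass a divisor explicitly")
    # Assumption: edges repeat the outermost pixel, as in the tables.
    padded = [pixels[0]] + list(pixels) + [pixels[-1]]
    return [
        round(sum(p * k for p, k in zip(padded[i:i + 3], kernel)) / divisor)
        for i in range(len(pixels))
    ]


print(kernel_sum([1, 1, 1]))    # 3
print(kernel_sum([2, 4, 2]))    # 8
print(kernel_sum([-1, 2, -1]))  # 0
blurred = blur_1d([0, 0, 255, 0, 0], [1, 1, 1])
print(blurred)                  # [0, 85, 85, 85, 0]
```

Dividing by the sum keeps flat areas at their original brightness: three pixels of value 100 give 300 / 3 = 100 again.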

Now let’s take a look at a more complicated example. We want to sharpen (the „unsharp mask“ look), which is also possible with a convolution. Put simply, you’d use a matrix like -1 4 -1. Its values sum to 2, so the divisor is 2. Sharpening a single white dot doesn’t make much sense, though, so we’re going to use another „video“. This time: 9×1 px resolution, represented by 100 100 150 200 200 200 150 100 100. Let’s look at the following table:

Current pixel | Pixels used by matrix              | Matrix (div=2) | Calculation                            | Result
--------------+------------------------------------+----------------+----------------------------------------+-------
100           | 100 100 100  (first is mirrored)   | -1 4 -1        | ((100*-1) + (100*4) + (100*-1)) / 2    | 100
100           | 100 100 150                        | -1 4 -1        | ((100*-1) + (100*4) + (150*-1)) / 2    | 75
150           | 100 150 200                        | -1 4 -1        | ((100*-1) + (150*4) + (200*-1)) / 2    | 150
200           | 150 200 200                        | -1 4 -1        | ((150*-1) + (200*4) + (200*-1)) / 2    | 225
200           | 200 200 200                        | -1 4 -1        | ((200*-1) + (200*4) + (200*-1)) / 2    | 200
200           | 200 200 150                        | -1 4 -1        | ((200*-1) + (200*4) + (150*-1)) / 2    | 225
150           | 200 150 100                        | -1 4 -1        | ((200*-1) + (150*4) + (100*-1)) / 2    | 150
100           | 150 100 100                        | -1 4 -1        | ((150*-1) + (100*4) + (100*-1)) / 2    | 75
100           | 100 100 100  (last is mirrored)    | -1 4 -1        | ((100*-1) + (100*4) + (100*-1)) / 2    | 100
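One way to see why this matrix sharpens: -1 4 -1 with divisor 2 is the same as taking the original pixel and adding half of (2*current - left - right), i.e. how much the pixel sticks out from its neighbours. The Python sketch below (hypothetical function name sharpen_1d, edges repeated as in the table) computes the table both ways and checks that they agree.

```python
def sharpen_1d(pixels):
    """Sharpen one row with the matrix -1 4 -1 and divisor 2."""
    # Assumption: edges repeat the outermost pixel, as in the table.
    padded = [pixels[0]] + list(pixels) + [pixels[-1]]
    result = []
    for i, current in enumerate(pixels):
        left, right = padded[i], padded[i + 2]
        direct = (-left + 4 * current - right) / 2
        # Same thing, written as "original + detail":
        detail = (2 * current - left - right) / 2
        assert direct == current + detail
        result.append(round(direct))
    return result


row = [100, 100, 150, 200, 200, 200, 150, 100, 100]
sharpened = sharpen_1d(row)
print(sharpened)  # [100, 75, 150, 225, 200, 225, 150, 75, 100]
```

Where the row is flat, or where a pixel sits exactly halfway between its neighbours (the two 150s), the detail term is 0 and the pixel stays as it is.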

So, you can see that we get

100 75 150 225 200 225 150 75 100

from

100 100 150 200 200 200 150 100 100

which results in a sharpened image. The plateau in the middle stays at 200, but the pixels at its border get brighter (from 200 to 225, so more white), and the pixels just outside the 150s get darker (from 100 to 75, so more black). You can easily see what this does in an image on Wikipedia. Just take a look at:

http://en.wikipedia.org/wiki/File:Usm-unsharp-mask.png

Can you see the black and the white bands in the bottom part of the picture, which look like shadows? That’s exactly what we did here, and that’s exactly what unsharp masking is about. The „white“ stuff is also called „halos“, those nasty things everyone tries to remove.

However, we haven’t finished understanding the matrices yet. What we did so far was sharpening, shifting and blurring. Now I’ll give an example of edge detection, and then we’ll have to talk about dimensions.

Video: 100 100 100 200 200 200 200 100 100 100
Matrix: 0 -1 1
Divisor: 1 (so.. none)
Result: 0 0 100 0 0 0 -100 0 0 0

Values above 255 are clipped to 255.
Values below 0 are clipped to 0, so the -100 becomes 0.

As you can see from the result, we’ve got a black image with just the edge selected. However, this only detects the left edge (the point is that edges in the other direction fall below 0 and get clipped away). For both directions you have to use a 2-dimensional matrix or mt_edge, for example.
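The edge example, including the clipping, can be reproduced with a short Python sketch. The function name edge_1d is made up, edges repeat the outermost pixel as before, and the flipped matrix 0 1 -1 is only added here as an illustration of where the other edge went.

```python
def edge_1d(pixels, kernel):
    """Return (raw, clipped) results of a 3-value matrix with divisor 1."""
    # Assumption: edges repeat the outermost pixel.
    padded = [pixels[0]] + list(pixels) + [pixels[-1]]
    raw = [
        sum(p * k for p, k in zip(padded[i:i + 3], kernel))
        for i in range(len(pixels))
    ]
    # Clip to the valid range: below 0 becomes 0, above 255 becomes 255.
    clipped = [min(max(value, 0), 255) for value in raw]
    return raw, clipped


video = [100, 100, 100, 200, 200, 200, 200, 100, 100, 100]

raw, clipped = edge_1d(video, [0, -1, 1])
print(raw)      # [0, 0, 100, 0, 0, 0, -100, 0, 0, 0]
print(clipped)  # [0, 0, 100, 0, 0, 0, 0, 0, 0, 0]

# Illustration only: the flipped matrix picks up the other edge instead.
_, clipped_flipped = edge_1d(video, [0, 1, -1])
print(clipped_flipped)  # [0, 0, 0, 0, 0, 0, 100, 0, 0, 0]
```

The -100 in the raw result is the second edge. It is there in the maths, but it is gone once the values are clipped to 0.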

Now, let’s talk about dimensions. This is where it gets difficult. With our 5×1 videos we only dealt with 1-dimensional matrices. However, real video is 2D, so we want 2-dimensional matrices. So let’s take a look at a 2D video with a white dot in the middle:

0 0 0 0 0
0 0 0 0 0
0 0 255 0 0
0 0 0 0 0
0 0 0 0 0

That’s a 5×5 video. A 3×3 matrix would look like this:

1 1 1
1 1 1
1 1 1

This does the same as our 1D blur, just in both directions (the values sum to 9, so the divisor is 9). You could change the shape of the matrix to produce a weaker blur, for example like this:

0 1 0
1 1 1
0 1 0

If you run this twice, the result of the first pass should look a bit blocky; after the second pass it will be as strong as the original blur, but it will look a bit different. So you can really do a lot with this. However, keep in mind that while in 1D the matrix was applied to the left, current and right px, it’s different now: it’s applied to all the surrounding px.
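To see the difference between the two blur shapes, here is a small 2D version of the same idea in Python. The function name convolve_2d is hypothetical; it assumes the divisor is the sum of the matrix, that edges repeat the outermost pixel, and that results are rounded with round() and clipped to 0..255.

```python
def convolve_2d(image, kernel):
    """Apply a 3x3 matrix to a 2D image given as a list of rows."""
    height, width = len(image), len(image[0])
    divisor = sum(sum(row) for row in kernel)  # assumed to be non-zero

    def pixel(y, x):
        # Assumption: coordinates outside the image use the nearest edge pixel.
        return image[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            total = sum(
                kernel[ky][kx] * pixel(y + ky - 1, x + kx - 1)
                for ky in range(3)
                for kx in range(3)
            )
            out_row.append(min(max(round(total / divisor), 0), 255))
        result.append(out_row)
    return result


dot = [[0] * 5 for _ in range(5)]
dot[2][2] = 255

box = convolve_2d(dot, [[1, 1, 1], [1, 1, 1], [1, 1, 1]])
cross = convolve_2d(dot, [[0, 1, 0], [1, 1, 1], [0, 1, 0]])
for row in box:
    print(row)
for row in cross:
    print(row)
```

With the full matrix the dot becomes a 3×3 square of 28 (255 / 9, rounded). With the cross-shaped matrix it becomes a plus sign of 51 (255 / 5), and the four diagonal neighbours stay black.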

For example:

-1 -1 -1
-1 12 -1
-1 -1 -1

That’d be our unsharp mask in 2D (the values sum to 4, so the divisor is 4). The matrix is applied to ALL surrounding px. That means: not just left and right, and not just top, bottom, left and right, but also top left, top right, bottom left and bottom right.
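As a rough check of how much the diagonal neighbours matter, the Python sketch below applies this matrix to a tiny made-up image with a vertical step from 100 to 200. The function name sharpen_2d is hypothetical, and the sketch assumes a divisor of 4 (the sum of the matrix), repeated edge pixels and clipping to 0..255.

```python
SHARPEN = [[-1, -1, -1],
           [-1, 12, -1],
           [-1, -1, -1]]


def sharpen_2d(image, divisor=4):
    """Apply the 3x3 sharpening matrix above to a list of pixel rows."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    # Assumption: outside the image, reuse the nearest edge pixel.
                    yy = min(max(y + ky - 1, 0), height - 1)
                    xx = min(max(x + kx - 1, 0), width - 1)
                    total += SHARPEN[ky][kx] * image[yy][xx]
            out_row.append(min(max(round(total / divisor), 0), 255))
        result.append(out_row)
    return result


step = [[100, 100, 200, 200] for _ in range(3)]
sharpened_step = sharpen_2d(step)
for row in sharpened_step:
    print(row)  # [100, 25, 255, 200]
```

Each pixel next to the step has three neighbours on the other side (top, middle and bottom of that column), so the effect is much stronger than in 1D: the dark side drops to 25 and the bright side would reach 275, which is clipped to 255.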

Apart from 2D convolution there’s also 3D convolution (there is a filter made for that). It works temporally (using the previous frame), which „might“ be useful in very slow motion to improve things, though in high motion it will create ghosting and thus might not be very useful.

I hope this article helps other people to understand this better.
