Core Image Kernels
From QuartzCompositions.com the central source for Quartz Composer :: wiki
Core Image Kernel Programming in Quartz Composer
Getting Started
Quartz Composer is a great environment for experimenting with Core Image kernels, which are written in a subset of the OpenGL Shading Language (GLSL, sometimes called GL-slang) with a few Apple-specific extensions. There are a few resources on the web that offer a brief introduction, such as [Apple's guide (http://developer.apple.com/documentation/GraphicsImaging/Conceptual/CoreImaging/ci_plugins/chapter_4_section_3.html)] (be sure to read the page on [Apple-specific language issues (http://developer.apple.com/documentation/GraphicsImaging/Conceptual/CoreImaging/ci_gslang_ext/chapter_6_section_1.html#//apple_ref/doc/uid/TP30001185-CH206-TPXREF101)]) and the ARB's documentation on the official [OpenGL Shading Language (http://oss.sgi.com/projects/ogl-sample/registry/ARB/GLSLangSpec.Full.1.10.59.pdf)], which has a nice function reference about halfway through.
Quartz Composer, in conjunction with Core Image, will compile and run the kernels live in any composition as long as all the image inputs are specified and the output leads to a Renderer. That means that, as you type, the compiler is constantly running and trying to display any changes. This can be a blessing and a curse... experiment wildly, but save often!
Programming a Kernel
Core Image kernels can have any number of image inputs (variables of type "sampler" in the code), but only one image output (the "vec4" return value of the kernel). For each output pixel, it is the kernel's job to sample any input pixels it needs, combine them with any other input parameters, do any calculation, and finally return a single vec4. (The preceding sentence contains an important detail: the kernel is called once for each **output** pixel, not for each input pixel, so forget about fast kernels that produce histograms, general Hough transforms, or other applications that use the output image as an accumulator.)
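As a minimal sketch of the "many inputs, one output" shape described above, here is a kernel with two image inputs and one float parameter. The names crossFade, imageA, imageB, and amount are made up for illustration; mix() is the standard GLSL linear interpolation function.

```glsl
// Hypothetical example: blend two input images into one output pixel.
// amount = 0.0 gives imageA, amount = 1.0 gives imageB.
kernel vec4 crossFade(sampler imageA, sampler imageB, float amount)
{
    // Each sampler is read at the current output point, expressed in
    // that sampler's own coordinate space.
    vec4 a = sample(imageA, samplerCoord(imageA));
    vec4 b = sample(imageB, samplerCoord(imageB));

    // The single vec4 returned here becomes the output pixel.
    return mix(a, b, amount);
}
```

Both image inputs must be connected before Quartz Composer will render anything, since the kernel runs only when every sampler has a source.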
The best way to learn, though, is to experiment. The most fun way I've found to experiment is with a video source. Open Quartz Composer, hit "Cancel" on the pre-fab composition box, and select "New". Then drag out a Video Source patch, a Core Image Kernel patch, and a Billboard patch. Drag the outputs to the inputs, and you've got a basic filter chain. Now click on the kernel patch, open the inspector, and choose the second panel. There's your code. You can add a second image input by adding another "sampler" input parameter, or add a setting with a "float" or "__color" input parameter. Once they are added to the code, and if the code compiles cleanly, they will immediately appear on the composition patch for hooking up. You can hook up any of the parameters to the output of another patch, or set them manually in the third pane of the inspector. Alternatively, you can "publish" the inputs, then go to the preview window, press Command-T, and see the settings on a sheet overlay.
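To see extra inputs appear on the patch, you could paste in something like the following sketch. The kernel name and the parameters tintColor and strength are invented for this example; the color type is spelled "__color" in the Core Image Kernel Language. It assumes an opaque tint color.

```glsl
// Hypothetical example: one image input, one color input, one float input.
// After a clean compile, the patch should show "tintColor" and "strength"
// as new input ports alongside "image".
kernel vec4 tintImage(sampler image, __color tintColor, float strength)
{
    vec4 pixel = sample(image, samplerCoord(image));

    // Component-wise multiply: each channel is scaled by the tint's channel.
    vec4 tinted = pixel * tintColor;

    // strength = 0.0 leaves the image alone, 1.0 applies the full tint.
    return mix(pixel, tinted, strength);
}
```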
Now pick up your favorite image processing textbook, go find fun algorithms online, or imagine some new effects. The language you code them in is extremely similar to C. The biggest difference is the vector operations, which operate on **each** member of a vector. For example, the following code:
vec4 pixels = vec4(0.5, 0.25, 0.5, 0.5);
pixels *= 2.0;
will result in "pixels" having the value (1.0, 0.5, 1.0, 1.0). Similar operations are possible on vec2 values, which are usually used for (x, y) coordinates, while vec4 values are usually used for (r, g, b, a) pixels, with "a" being the alpha (transparency) channel.
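Here is a small sketch of vector math on a real pixel: inverting the color channels in one line. The kernel name is made up. Core Image hands kernels premultiplied pixels, so this version uses the built-in unpremultiply() and premultiply() functions around the arithmetic.

```glsl
// Hypothetical example: invert an image using component-wise vector math.
kernel vec4 invertColor(sampler image)
{
    // Undo the alpha premultiplication before doing color math.
    vec4 pixel = unpremultiply(sample(image, samplerCoord(image)));

    // One vector subtraction inverts r, g, and b together.
    // vec3(1.0) is shorthand for vec3(1.0, 1.0, 1.0).
    pixel.rgb = vec3(1.0) - pixel.rgb;

    // Restore premultiplied form for the rest of the filter chain.
    return premultiply(pixel);
}
```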
Coordinate Spaces
One aspect of Core Image kernels that is not immediately apparent, and can take a little experimentation to get right, is that coordinates can be expressed in several different coordinate spaces. The following explanation is paraphrased from a post by John Harper on the quartzcomposer-dev list. (Note that besides the coordinate spaces listed below, QC output appears to be in a third coordinate space, with (0,0) at the center of the output window and x running from -1.0 to 1.0 across its width.)
- Working Coordinates
- This is the coordinate system that images exist in. You can think of it as an infinitely large board on which you pin images. In Objective-C/C++, you would use the -imageByApplyingTransform: method or the CIAffineTransform filter to move images around in this space. In this coordinate space, 1 unit in any direction is 1 pixel.
- Sampler Coordinates
- These are only seen inside kernels. Each "sampler foo" variable (i.e. each texture when rendering with OpenGL) represents some finite subregion of the infinite working space. Point (0,0) is the bottom left corner of the physical texture, and one unit is one texel.
This means that whenever you use the sample() function in a kernel, the point you specify must be in the coordinate space of that sampler object, not in the working coordinate space. (The actual pixels in the texture may be flipped vertically, have a non-zero translation when rendering tiled, or have an odd scale factor if the engine moved an affine transform around.)
- destCoord()
- The current output point, in the working coordinate space.
- samplerTransform(x, p)
- The working space point p, mapped into the coordinate space of sampler x.
- samplerCoord(x)
- The current output point, mapped into the coordinate space of sampler x. Equivalent to samplerTransform(x, destCoord()).
If you want to forget all these details, create a new sampling function, then just use working space coordinates everywhere:
vec4 sampleWorking (sampler x, vec2 p) {
    return sample (x, samplerTransform (x, p));
}
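As a sketch of how these pieces fit together, the kernel below slides an image sideways or up and down by doing all of its arithmetic in working space and converting to sampler space only at the moment of sampling. The kernel name and the dx and dy parameters are invented for illustration.

```glsl
// Hypothetical example: translate an image by (dx, dy) working-space pixels.
kernel vec4 shiftImage(sampler image, float dx, float dy)
{
    // destCoord() is the output point in working space. To move the
    // picture by +dx, each output pixel reads from dx to its left.
    vec2 source = destCoord() - vec2(dx, dy);

    // Map the working-space point into this sampler's space, then sample.
    return sample(image, samplerTransform(image, source));
}
```

With dx and dy both 0.0 this reduces to sampling at samplerCoord(image), since samplerCoord(x) is equivalent to samplerTransform(x, destCoord()).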
Examples
Here is an example of a kernel that does nothing at all:
kernel vec4 doNothing(sampler image) {
    return sample(image, samplerCoord(image));
}
The "sample" function takes a sampler and a vec2 giving an (x,y) position in that sampler's coordinate space, and returns the vec4 which represents the pixel at that location. Pixel components are usually returned as floats in the range of 0.0 to 1.0. Unless you want your results auto-normalized later, you should usually also return values only in that range. Two exceptions to the 0.0-1.0 range are floating point image inputs (such as ones that can use high dynamic range) and images that have been modified for color matching purposes, which can cause values to fall just outside the normal range.
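One way to keep results inside the 0.0 to 1.0 range is to clamp them yourself before returning. The following sketch brightens an image and clamps the result; the kernel name and the amount parameter are made up, and it assumes an opaque input image (alpha of 1.0) so that premultiplied alpha can be ignored.

```glsl
// Hypothetical example: scale brightness, keeping channels within 0.0-1.0.
// Assumes an opaque input, so the color channels can be scaled directly.
kernel vec4 brighten(sampler image, float amount)
{
    vec4 pixel = sample(image, samplerCoord(image));

    // clamp() works component-wise on the vec3, like other vector math.
    pixel.rgb = clamp(pixel.rgb * amount, 0.0, 1.0);

    return pixel;
}
```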
Here is an example that averages a pixel with the pixels on its left and right, to create the cheapest blur possible:
kernel vec4 average3(sampler image) {
    // Start from the working-space point, which is what samplerTransform expects.
    vec2 xy = destCoord();
    return (sample(image, samplerTransform(image, xy + vec2(-1.0, 0.0))) +
            sample(image, samplerTransform(image, xy)) +
            sample(image, samplerTransform(image, xy + vec2(1.0, 0.0)))) / 3.0;
}
This illustrates a local variable, some basic math, and sampling multiple input pixels for a given output. Thanks to John Harper at Apple for pointing out the suggested use of samplerTransform. You see, it's possible that 1 pixel is not the same as moving 1.0 in the X or Y direction if the image has been scaled. samplerTransform takes care of that for you.
Okay, now for something substantial:
// 3x3 Sobel weights. accumV uses them as written, accumH uses the transpose.
const float kern00 = -1.0;
const float kern01 = -2.0;
const float kern02 = -1.0;
const float kern10 = 0.0;
const float kern11 = 0.0;
const float kern12 = 0.0;
const float kern20 = 1.0;
const float kern21 = 2.0;
const float kern22 = 1.0;
// Returns the red channel at xy + off (both in sampler coordinates).
float getMonoValue(sampler image, const vec2 xy, const vec2 off)
{
    return sample(image, xy + off).r;
}
kernel vec4 sobelFilter(sampler image)
{
    float accumV = 0.0;
    float accumH = 0.0;
    vec2 xy = samplerCoord(image);
    float pixel;
    // Row at y - 1
    pixel = getMonoValue(image, xy, vec2(-1.0, -1.0));
    accumV += pixel*kern00;
    accumH += pixel*kern00;
    pixel = getMonoValue(image, xy, vec2( 0.0, -1.0));
    accumV += pixel*kern01;
    accumH += pixel*kern10;
    pixel = getMonoValue(image, xy, vec2( 1.0, -1.0));
    accumV += pixel*kern02;
    accumH += pixel*kern20;
    // Row at y
    pixel = getMonoValue(image, xy, vec2(-1.0, 0.0));
    accumV += pixel*kern10;
    accumH += pixel*kern01;
    pixel = getMonoValue(image, xy, vec2( 0.0, 0.0));
    accumV += pixel*kern11;
    accumH += pixel*kern11;
    pixel = getMonoValue(image, xy, vec2( 1.0, 0.0));
    accumV += pixel*kern12;
    accumH += pixel*kern21;
    // Row at y + 1
    pixel = getMonoValue(image, xy, vec2(-1.0, 1.0));
    accumV += pixel*kern20;
    accumH += pixel*kern02;
    pixel = getMonoValue(image, xy, vec2( 0.0, 1.0));
    accumV += pixel*kern21;
    accumH += pixel*kern12;
    pixel = getMonoValue(image, xy, vec2( 1.0, 1.0));
    accumV += pixel*kern22;
    accumH += pixel*kern22;
    // Gradient magnitude, written to all three color channels.
    float val = sqrt(accumH * accumH + accumV * accumV);
    return vec4(val, val, val, 1.0);
}
(The astute reader will note that I use my coordinate spaces improperly. Sorry about that. I haven't updated my code since I learned about them myself, so the above will work only on an unscaled, untiled image. I'll correct the example when I correct it in my own code.) The above code produces the magnitude of the "edge" at a given location for the "red" channel of the image. It assumes the image has been pre-converted to monochrome so that r = g = b, which cuts the calculations to one third by dealing with only one channel. It then applies a 3x3 "Sobel" edge detection kernel horizontally and vertically, in an operation called "convolution" (where each pixel in the neighborhood is multiplied by the corresponding weight in the kernel, then all of the products are added together).
This example shows constant variables, function calls, and simple math, and it performs a genuinely useful image processing operation.
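For what it's worth, here is one way the coordinate-space caveat could be addressed, following the destCoord() plus samplerTransform() pattern from the Coordinate Spaces section. This is only a sketch, not the author's updated code: the names redAt and sobelWorkingSpace and the per-sample variable names are invented, and it keeps the same assumption of a monochrome input with r = g = b.

```glsl
// Hypothetical helper: red channel at working-space point p.
float redAt(sampler image, vec2 p)
{
    return sample(image, samplerTransform(image, p)).r;
}

// Sobel edge magnitude with all offsets applied in working space.
// Variable names encode the offset: first letter is x, second is y,
// with n = -1, z = 0, p = +1 (so "np" is the sample at (-1, +1)).
kernel vec4 sobelWorkingSpace(sampler image)
{
    vec2 d = destCoord();

    float nn = redAt(image, d + vec2(-1.0, -1.0));
    float zn = redAt(image, d + vec2( 0.0, -1.0));
    float pn = redAt(image, d + vec2( 1.0, -1.0));
    float nz = redAt(image, d + vec2(-1.0,  0.0));
    float pz = redAt(image, d + vec2( 1.0,  0.0));
    float np = redAt(image, d + vec2(-1.0,  1.0));
    float zp = redAt(image, d + vec2( 0.0,  1.0));
    float pp = redAt(image, d + vec2( 1.0,  1.0));

    // Right column minus left column, weighted 1-2-1.
    float accumH = (pn + 2.0 * pz + pp) - (nn + 2.0 * nz + np);
    // Row at y + 1 minus row at y - 1, weighted 1-2-1.
    float accumV = (np + 2.0 * zp + pp) - (nn + 2.0 * zn + pn);

    float val = sqrt(accumH * accumH + accumV * accumV);
    return vec4(val, val, val, 1.0);
}
```

The center sample is skipped because its weight is zero in both directions, which also saves one texture read per output pixel.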
Tips and Tricks
- Not all operators can handle vectors. The ternary operator (the "? :" operator) will NOT produce correct results for vector values. In Apple's Core Image, you can use "compare" instead, which has the form compare(arg, vec1, vec2). It returns the component from vec1 if the corresponding component of arg is less than zero, and otherwise the component from vec2. Note that the choice is made component by component: you are not choosing an entire vector.
- You cannot make a "for" loop or an "if" statement depend on any input values. This makes some effects difficult to make user-configurable, but is often less limiting than you would think. The ternary operator still works for calculation. Loop unrolling in general is often more efficient, and some loops with user-specified iteration counts can be defined within Quartz Composer instead of inside the kernel itself.
- Since Quartz Composer is still a new program and somewhat buggy, it can easily crash while displaying a live video tooltip, or sometimes simply when you change a character in the code and the output re-renders. A handy tip I use is to put a stray "a" in the code to cause a syntax error, then edit the code, save, and only then remove the "a" and let it render. Nothing is more frustrating than having to write an entire function twice from memory.
- Although a vec4's components can be addressed by number, when dealing with pixels it is often much more natural to use the "r", "g", "b", and "a" designations for positions 0, 1, 2, and 3 respectively (e.g. pixel.r).
- Similarly, a vec2's components can be referenced using x and y (e.g. point.x or point.y).
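The "compare" tip above can be sketched as a simple threshold kernel, where a data-dependent choice is made without "if" or a vector ternary. The kernel name and the level parameter are invented for this example, and it assumes an opaque input image.

```glsl
// Hypothetical example: per-channel threshold using compare().
// Each of r, g, and b becomes 0.0 if it is below "level", else 1.0.
kernel vec4 thresholdChannels(sampler image, float level)
{
    vec4 pixel = sample(image, samplerCoord(image));

    // (pixel.rgb - level) is negative exactly where a channel is below
    // the threshold, so compare() picks 0.0 there and 1.0 elsewhere.
    vec3 result = compare(pixel.rgb - vec3(level), vec3(0.0), vec3(1.0));

    return vec4(result, 1.0);
}
```

Because the choice is made per component, a pixel such as (0.9, 0.2, 0.6) with level 0.5 comes out as (1.0, 0.0, 1.0) rather than all black or all white.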
Sample Core Image Compositions
- http://www.samkass.com/blog - Not to overly push my own site, but I've put a lot of Core Image kernels up there. While it's not the cleanest code (it's all very experimental), it will probably give you a good idea of what GL-slang is all about.
- http://www.subradar.net/~alex - Game of Life AI simulation written as a Core Image kernel.

