www.spiroprojects.com
This tutorial discusses how to use
MATLAB for image processing. Some familiarity with MATLAB is assumed (you
should know how to use matrices and write an M-file).
It is helpful to have the MATLAB
Image Processing Toolbox, but fortunately, no toolboxes are needed for most
operations. Commands requiring the Image Toolbox are indicated with.
There are five types of images in
MATLAB.
- Grayscale. A grayscale image M pixels tall and N pixels wide is represented as a matrix of double datatype of size M×N. Element values (e.g., MyImage(m,n)) denote the pixel grayscale intensities in [0,1] with 0=black and 1=white.
- Truecolor RGB. A truecolor red-green-blue (RGB) image is represented as a three-dimensional M×N×3 double matrix. Each pixel has red, green, blue components along the third dimension with values in [0,1], for example, the color components of pixel (m,n) are MyImage(m,n,1) = red, MyImage(m,n,2) = green, MyImage(m,n,3) = blue.
- Indexed. Indexed (paletted) images are represented with an index matrix of size M×N and a colormap matrix of size K×3. The colormap holds all colors used in the image and the index matrix represents the pixels by referring to colors in the colormap. For example, if the 22nd color is magenta MyColormap(22,:) = [1,0,1], then MyImage(m,n) = 22 is a magenta-colored pixel.
- Binary. A binary image is represented by an M×N logical matrix where pixel values are 1 (true) or 0 (false).
- uint8. This type uses less memory and some operations compute faster than with double types. For simplicity, this tutorial does not discuss uint8 further.
Grayscale is usually the preferred
format for image processing. In cases requiring color, an RGB color image can
be decomposed and handled as three separate grayscale images. Indexed images
must be converted to grayscale or RGB for most operations.
MATLAB can read and write images
with the imread
and imwrite commands.
Although a fair number of file formats are supported, some are not. Use imformats to see what your installation supports:
When reading images, an unfortunate
problem is that imread
returns the image data in uint8 datatype, which must be converted to double and
rescaled before use. So instead of calling imread directly, I use the following M-file function to read and
convert images:
Right-click and save getimage.m to
use this M-function. If image baboon.png is in the current directory (or somewhere in the MATLAB
search path), you can read it with MyImage =
getimage('baboon.png'). You can also use partial paths,
for example if the image is in <current
directory>/images/ with getimage('images/baboon.png').
To write a grayscale or RGB image,
use
imwrite(MyImage,'myimage.png');
Take care that MyImage is a double matrix with elements in [0,1]—if improperly
scaled, the saved file will probably be blank.
When writing image files, I highly
recommend using the PNG file format. This format is a reliable choice since it
is lossless, supports truecolor RGB, and compresses pretty well. Use other
formats with caution.
For example, image signal power is
used in computing signal-to-noise ratio (SNR) and peak signal-to-noise ratio
(PSNR). Given clean image uclean and noise-contaminated image u,
% Compute SNR
snr
= -10*log10( sum((uclean(:) - u(:)).^2) / sum(uclean(:).^2) );
% Compute PSNR (where the maximum possible value of uclean is 1)
psnr
= -10*log10( mean((uclean(:) - u(:)).^2) );
Be careful with norm: the behavior is norm(v) on vector v computes sqrt(sum(v.^2)), but norm(A) on matrix A computes the induced L2 matrix norm,
norm(A)
= sqrt(max(eig(A'*A))) gaah!
So norm(A) is certainly not sqrt(sum(A(:).^2)). It is nevertheless an easy mistake to use norm(A) where it should have been norm(A(:)).
Linear filtering is the cornerstone
technique of signal processing. To briefly introduce, a linear filter is an
operation where at every pixel xm,n of an image, a linear
function is evaluated on the pixel and its neighbors to compute a new pixel
value ym,n.
A linear filter in two dimensions
has the general form
ym,n = ∑j∑k hj,k xm−j,n−k
where x is the input, y
is the output, and h is the filter impulse response. Different
choices of h lead to filters that smooth, sharpen, and detect edges, to
name a few applications. The right-hand side of the above equation is denoted
concisely as h∗x and is called the “convolution of h
and x.”
Two-dimensional linear filtering is
implemented in MATLAB with conv2. Unfortunately, conv2 can only handle filtering near the image boundaries by
zero-padding, which means that filtering results are usually inappropriate for
pixels close to the boundary. To work around this, we can pad the input image
and use the 'valid' option when calling conv2.
A 2D filter h is said to be separable if it can be expressed as
the outer product of two 1D filters h1 and h2,
that is, h = h1(:)*h2(:)'. It is faster to pass h1 and h2
than h, as is done above for the moving
average window and the Gaussian filter. In fact, the Sobel filters hx and hy
are also separable—what are h1 and h2?
Spatial-domain filtering with conv2 is easily a computaionally expensive operation. For a K×K
filter on an M×N image, conv2 costs O(MNK2) additions and
multiplications, or O(N4) supposing M∼N∼K.
For large filters, filtering in the
Fourier domain is faster since the computational cost is reduced to O(N2 log N).
Using the convolution-multiplication property of the Fourier transform, the
convolution is equivalently computed by
% Compute y = h*x with periodic boundary extension
[k1,k2]
= size(h);
hpad
= zeros(size(x));
hpad([end+1-floor(k1/2):end,1:ceil(k1/2)],
...
[end+1-floor(k2/2):end,1:ceil(k2/2)]) = h;
y
= real(ifft2(fft2(hpad).*fft2(x)));
The result is equivalent to conv2padded(x,h) except near the boundary, where the above computation uses
periodic boundary extension.
Fourier-based filtering can also be
done with symmetric boundary extension by reflecting the input in each
direction:
% Compute y = h*x with symmetric boundary extension
xSym
= [x,fliplr(x)]; % Symmetrize horizontally
xSym
= [xSym;flipud(xSym)]; % Symmetrize vertically
[k1,k2]
= size(h);
hpad
= zeros(size(xSym));
hpad([end+1-floor(k1/2):end,1:ceil(k1/2)],
...
[end+1-floor(k2/2):end,1:ceil(k2/2)]) = h;
y
= real(ifft2(fft2(hpad).*fft2(xSym)));
y
= y(1:size(y,1)/2,1:size(y,2)/2);
(Note: An even more efficient method
is FFT overlap-add filtering. The Signal Processing Toolbox implements FFT
overlap-add in one-dimension in fftfilt.)
A nonlinear filter is an operation
where each filtered pixel ym,n is a nonlinear function of xm,n
and its neighbors. Here we briefly discuss a few types of nonlinear filters.
If you have the Image Toolbox, order
statistic filters can be performed with ordfilt2 and medfilt2. An order statistic filter sorts the pixel values over a
neighborhood and selects the kth largest value. The min, max, and median
filters are special cases.
If you have the Image Toolbox, bwmorph implements various morphological operations on binary
images, like erosion, dilation, open, close, and skeleton. There are also
commands available for morphology on grayscale images: imerode, imdilate and imtophat, among others.
Occasionally we want to use a new
filter that MATLAB does not have. The code below is a template for implementing
filters.
[M,N]
= size(x);
y
= zeros(size(x));
r
= 1; %
Adjust for desired window size
for n = 1+r:N-r
for m =
1+r:M-r
% Extract a
window of size (2r+1)x(2r+1) around (m,n)
w = x(m+(-r:r),n+(-r:r));
% ... write
the filter here ...
y(m,n) = result;
end
end
(Note: A frequent misguided claim is
that loops in MATLAB are slow and should be avoided. This was once true, back
in MATLAB 5 and earlier, but loops in modern versions are reasonably
fast.)
For example, the alpha-trimmed mean
filter ignores the d/2 lowest and d/2 highest values in the
window, and averages the remaining (2r+1)2−d values.
The filter is a balance between a median filter and a mean filter. The
alpha-trimmed mean filter can be implemented in the template as
%
The alpha-trimmed mean filter
w
= sort(w(:));
y(m,n)
= mean(w(1+d/2:end-d/2)); % Compute the
result y(m,n)
As another example, the bilateral
filter is
ym,n =
|
∑j,k
hj,k,m,n xm−j,n−k
|
∑j,k hj,k,m,n
|
where
hj,k,m,n = e−(j2+k2)/(2σs2)
e−(xm−j,n−k − xm,n)2/(2σd2).
The bilateral filter can be
implemented as
% The bilateral filter
[k,j]
= meshgrid(-r:r,-r:r);
h
= exp( -(j.^2 + k.^2)/(2*sigma_s^2) ) .* ...
exp( -(w - w(r+1,r+1)).^2/(2*sigma_d^2) );
y(m,n)
= h(:)'*w(:) / sum(h(:));
If you don't have the Image Toolbox,
the template can be used to write substitutes for missing filters, though they
will not be as fast as the Image Toolbox implementations.
Filter
|
Code
|
medfilt2
|
y(m,n) = median(w(:));
|
ordfilt2
|
w = sort(w(:));
y(m,n) = w(k); % Select the kth largest element |
imdilate
|
% Define a structure
element as a (2r+1)x(2r+1) array
SE = [0,1,0;1,1,1;0,1,0]; y(m,n) = max(w(SE)); |
Sign up here with your email
ConversionConversion EmoticonEmoticon