Image Segmentation using MATLAB

Segmentation partitions an image into distinct regions, each containing pixels with similar attributes. To be meaningful and useful for image analysis and interpretation, the regions should strongly relate to depicted objects or features of interest. Meaningful segmentation is the first step from low-level image processing, which transforms a greyscale or colour image into one or more other images, to high-level image description in terms of features, objects, and scenes. The success of image analysis depends on the reliability of segmentation, but accurate partitioning of an image is generally a very challenging problem.
Segmentation techniques are either contextual or non-contextual. The latter take no account of spatial relationships between features in an image and group pixels together on the basis of some global attribute, e.g. grey level or colour. Contextual techniques additionally exploit these relationships, e.g. group together pixels with similar grey levels and close spatial locations.

Non-contextual thresholding

Thresholding is the simplest non-contextual segmentation technique. With a single threshold, it transforms a greyscale or colour image into a binary image considered as a binary region map. The binary map contains two possibly disjoint regions, one containing pixels with input values smaller than the threshold and the other containing pixels with input values at or above the threshold. The former and latter regions are usually labelled with zero (0) and non-zero (1) labels, respectively. The segmentation depends on the image property being thresholded and on how the threshold is chosen.
Generally, non-contextual thresholding may involve two or more thresholds and produce more than two types of regions, such that the ranges of input image signals related to each region type are separated by thresholds. The central question of thresholding is how to determine the threshold value automatically.

Simple thresholding

The most common image property to threshold is pixel grey level: g(x,y) = 0 if f(x,y) < T and g(x,y) = 1 if f(x,y) ≥ T, where T is the threshold. Using two thresholds, T1 < T2, a range of grey levels related to region 1 can be defined: g(x,y) = 0 if f(x,y) < T1 OR f(x,y) > T2 and g(x,y) = 1 if T1 ≤ f(x,y) ≤ T2.
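The two rules above can be sketched in a few lines of plain Python. This is only an illustration: the function names, the nested-list image representation, and the sample grey levels are assumptions made for the example, not part of the original text.

```python
def threshold(image, t):
    """Binary map: 1 where the grey level is at or above t, else 0."""
    return [[1 if f >= t else 0 for f in row] for row in image]


def band_threshold(image, t1, t2):
    """Binary map: 1 where t1 <= grey level <= t2, else 0."""
    if t1 >= t2:
        raise ValueError("expected t1 < t2")
    return [[1 if t1 <= f <= t2 else 0 for f in row] for row in image]


# Hypothetical 3x4 greyscale image with levels in 0..255.
image = [
    [12, 40, 200, 220],
    [15, 130, 140, 210],
    [10, 20, 125, 135],
]

binary = threshold(image, 128)          # single threshold T = 128
band = band_threshold(image, 100, 150)  # two thresholds T1 = 100, T2 = 150
```

The threshold values here are picked by hand; choosing them automatically is the harder problem discussed next.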

The main problems are whether it is possible at all and, if so, how to choose an adequate threshold or set of thresholds to separate one or more desired objects from their background. In many practical cases simple thresholding is unable to segment the objects of interest, as shown in the above images.
A general approach to thresholding is based on the assumption that images are multimodal, that is, different objects of interest relate to distinct peaks (or modes) of the 1D signal histogram. The thresholds have to optimally separate these peaks in spite of typical overlaps between the signal ranges corresponding to individual peaks. A threshold in the valley between two overlapping peaks separates their main bodies but inevitably falsely detects or rejects some pixels with intermediate signals. The optimal threshold that minimises the expected numbers of false detections and rejections may not coincide with the lowest point in the valley between two overlapping peaks.
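To make the idea of false detections and rejections concrete, the following Python sketch counts both kinds of error for every candidate threshold and keeps the one with the fewest. It assumes that labelled samples of object (dark) and background (bright) grey levels are available, which is an assumption made purely for illustration; the function names and sample values are hypothetical as well.

```python
def count_errors(object_levels, background_levels, t):
    """Total errors when levels < t are called object and levels >= t background."""
    false_rejections = sum(1 for f in object_levels if f >= t)
    false_detections = sum(1 for f in background_levels if f < t)
    return false_detections + false_rejections


def min_error_threshold(object_levels, background_levels, levels=256):
    """Exhaustively pick the threshold with the fewest total errors.

    Ties are resolved in favour of the smallest threshold.
    """
    return min(range(levels + 1),
               key=lambda t: count_errors(object_levels, background_levels, t))


# Hypothetical samples whose ranges overlap (90 is a bright object pixel,
# 80 is a dark background pixel).
object_levels = [20, 30, 35, 40, 90]
background_levels = [80, 150, 160, 170, 180]

best = min_error_threshold(object_levels, background_levels)
errors = count_errors(object_levels, background_levels, best)
```

Because the two ranges overlap, no threshold reaches zero errors; the search only finds the least bad compromise for these samples.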

Adaptive thresholding

Since the threshold separates the background from the object, the adaptive separation may take account of empirical probability distributions of object (e.g. dark) and background (bright) pixels. Such a threshold has to equalise two kinds of expected errors: assigning a background pixel to the object and assigning an object pixel to the background. More complex adaptive thresholding techniques use a spatially varying threshold to compensate for local spatial context effects (such a spatially varying threshold can be thought of as a background normalisation).
A simple iterative adaptation of the threshold is based on successive refinement of the estimated peak positions. It assumes that (i) each peak coincides with the mean grey level of all pixels that relate to that peak and (ii) the pixel probability decreases monotonically with the absolute difference between the pixel and peak values, both for the object and the background peak. The classification of the object and background pixels is done at each iteration j by using the threshold Tj found at the previous iteration. Thus, at iteration j, each grey level f(x,y) is first assigned to the object or background class (region) if f(x,y) ≤ Tj or f(x,y) > Tj, respectively. Then the new threshold is computed as Tj+1 = 0.5(μj,ob + μj,bg), where μj,ob and μj,bg denote the mean grey level at iteration j for the found object and background pixels, respectively.
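The iteration described above can be sketched in Python as follows. The starting threshold (the global mean grey level), the stopping tolerance, and the iteration cap are assumptions of this sketch, since the text does not specify them; object pixels are taken to be the darker class.

```python
def iterative_threshold(levels, t0=None, tol=0.5, max_iter=100):
    """Refine a threshold as the midpoint of the object and background means.

    levels: flat list of grey levels; pixels with level <= t count as object.
    """
    t = sum(levels) / len(levels) if t0 is None else t0
    for _ in range(max_iter):
        obj = [f for f in levels if f <= t]
        bg = [f for f in levels if f > t]
        if not obj or not bg:
            break  # one class is empty, so the threshold cannot be refined
        t_next = 0.5 * (sum(obj) / len(obj) + sum(bg) / len(bg))
        if abs(t_next - t) < tol:
            return t_next  # the threshold has stopped moving
        t = t_next
    return t


# Hypothetical grey levels forming two well-separated groups.
t_balanced = iterative_threshold([10, 12, 14, 16, 200, 202, 204, 206])

# A skewed case: the global mean starts far from the final threshold.
t_skewed = iterative_threshold([0, 0, 0, 0, 0, 0, 100, 110])
```

In the skewed case the first estimate is pulled towards the larger class, and the update moves it to the midpoint between the two class means.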

Colour thresholding

Colour segmentation may be more accurate than greyscale segmentation because more information is available at the pixel level. The standard Red-Green-Blue (RGB) colour representation has strongly interrelated colour components, and a number of other colour systems (e.g. HSI: Hue-Saturation-Intensity) have been designed in order to exclude redundancy, determine actual object / background colours irrespective of illumination, and obtain more stable segmentation. An example below shows that colour thresholding can focus on an object of interest much better than its greyscale analogue.

Segmentation of colour images involves a partitioning of the colour space, i.e. RGB or HSI space. One simple approach is based on some reference (or dominant) colour (R0, G0, B0) and thresholding of Cartesian distances to it from every pixel colour f(x,y) = (R(x,y), G(x,y), B(x,y)):

d(x,y) = sqrt((R(x,y) − R0)² + (G(x,y) − G0)² + (B(x,y) − B0)²); g(x,y) = 1 if d(x,y) ≤ dmax and g(x,y) = 0 if d(x,y) > dmax,

where g(x,y) is the binary region map after thresholding and dmax denotes the distance threshold. This thresholding rule defines a sphere in RGB space, centred on the reference colour. All pixels inside or on the sphere belong to the region indexed with 1 and all other pixels are in the region 0.
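A minimal Python sketch of this spherical decision rule is given below. The name `d_max` for the distance threshold, the function name, and the sample colours are assumptions made for the example; squared distances are compared so that no square root is needed.

```python
def colour_distance_threshold(image, reference, d_max):
    """Binary map: 1 where the pixel colour lies inside or on the sphere of
    radius d_max centred on the reference colour in RGB space, else 0."""
    def squared_distance(pixel):
        return sum((c - c0) ** 2 for c, c0 in zip(pixel, reference))

    limit = d_max ** 2
    return [[1 if squared_distance(pixel) <= limit else 0 for pixel in row]
            for row in image]


# Hypothetical 2x2 RGB image and a reddish reference colour (R0, G0, B0).
image = [
    [(210, 40, 20), (20, 200, 30)],
    [(180, 60, 60), (255, 255, 255)],
]
region = colour_distance_threshold(image, reference=(200, 30, 30), d_max=60)
```

Only the two reddish pixels fall within distance 60 of the reference colour, so they form region 1.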
Alternatively, the decision surface can be ellipsoidal if independent distance thresholds are specified for the R, G, and B components. Generally, colour segmentation, just like greyscale segmentation, may be based on the analysis of 3D colour histograms or their more convenient 2D projections. A colour histogram is built by partitioning the colour space into a fixed number of bins such that the colours within each bin are considered as the same colour. One example is an RGB colour space partitioned into 11×11×11 bins.
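Building such a histogram can be sketched in Python as follows. The sketch assumes 8-bit RGB components and uniform bins, neither of which is stated in the text; the function name and sample pixels are hypothetical.

```python
from collections import Counter


def colour_histogram(pixels, bins=11, levels=256):
    """Count pixels per (r, g, b) bin of a uniformly partitioned RGB cube."""
    def bin_index(value):
        # Maps component values 0..levels-1 onto bin indices 0..bins-1.
        return value * bins // levels

    return Counter((bin_index(r), bin_index(g), bin_index(b))
                   for r, g, b in pixels)


# Hypothetical pixels: two near-black, two near-white, one mid-grey.
pixels = [(0, 0, 0), (10, 5, 3), (255, 255, 255), (250, 251, 252),
          (128, 128, 128)]
hist = colour_histogram(pixels)
```

Colours that fall into the same bin, such as the two near-black pixels, are counted as one colour, which is exactly the simplification the histogram relies on.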

If a chosen colour space separates colourless intensity values from intensity-independent colour components (such as hue and saturation or normalised red / blue colours), colour segmentation can be based on a few pre-selected colours, e.g. on the eight primary colours (black, red, green, blue, yellow, cyan, magenta, white). An example below shows a digitised picture of Rembrandt's canvas "Doctor Nicolaes Tulp's Demonstration of the Anatomy of the Arm" (1632; Mauritshuis Museum, The Hague, The Netherlands), its 8-bin histogram of the primary colours, and the corresponding colour regions:
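As a rough illustration of classifying pixels by pre-selected colours (not the procedure used for the painting example), the Python sketch below assigns each pixel to the nearest of the eight primary colours and builds the 8-bin histogram. It works directly on 8-bit RGB values, which is a simplifying assumption: the text suggests a colour space that separates intensity from colour, and a real implementation would likely classify in such a space instead. All names are hypothetical.

```python
from collections import Counter

# The eight primary colours as corners of the RGB cube (0 = low, 1 = high).
PRIMARIES = {
    (0, 0, 0): "black", (1, 0, 0): "red", (0, 1, 0): "green",
    (0, 0, 1): "blue", (1, 1, 0): "yellow", (0, 1, 1): "cyan",
    (1, 0, 1): "magenta", (1, 1, 1): "white",
}


def nearest_primary(pixel, mid=128):
    """Name of the RGB cube corner closest to an 8-bit (r, g, b) pixel.

    Thresholding each component at the midpoint selects the nearest corner.
    """
    return PRIMARIES[tuple(1 if c >= mid else 0 for c in pixel)]


def primary_histogram(pixels):
    """8-bin histogram: number of pixels assigned to each primary colour."""
    return Counter(nearest_primary(p) for p in pixels)


# Hypothetical pixels.
pixels = [(250, 240, 10), (10, 20, 30), (200, 30, 220), (5, 5, 5)]
hist = primary_histogram(pixels)
```

The region map follows directly from the same classification: every pixel carries the label of its primary colour.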

The more efficient adaptive thresholding of greyscale images can be extended to colour images, too, by replacing the mean grey level of each region with its mean colour, e.g. an RGB vector of the mean component values.
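One possible reading of this extension is sketched below in Python: each pixel is assigned to the class whose mean colour is nearer, and the two mean colours are then recomputed until they stop changing. This amounts to a two-class k-means clustering, which is named here as an interpretation rather than the method the text prescribes; the initial mean colours, the label convention (1 = object), and all names are assumptions of the sketch.

```python
def mean_colour(pixels):
    """Component-wise mean of a non-empty list of (r, g, b) pixels."""
    n = len(pixels)
    return tuple(sum(p[k] for p in pixels) / n for k in range(3))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def adaptive_colour_split(pixels, mean_ob, mean_bg, max_iter=100):
    """Iteratively split pixels into object (1) and background (0) classes.

    Returns (labels, mean_ob, mean_bg) after the mean colours stop changing.
    """
    labels = []
    for _ in range(max_iter):
        labels = [1 if squared_distance(p, mean_ob) <= squared_distance(p, mean_bg)
                  else 0 for p in pixels]
        ob = [p for p, lab in zip(pixels, labels) if lab == 1]
        bg = [p for p, lab in zip(pixels, labels) if lab == 0]
        if not ob or not bg:
            break  # one class is empty, so the means cannot be updated
        new_ob, new_bg = mean_colour(ob), mean_colour(bg)
        if new_ob == mean_ob and new_bg == mean_bg:
            break  # converged
        mean_ob, mean_bg = new_ob, new_bg
    return labels, mean_ob, mean_bg


# Hypothetical pixels: two reddish, two bluish; initial guesses are pure red/blue.
pixels = [(200, 20, 20), (210, 30, 25), (20, 20, 200), (30, 25, 210)]
labels, mean_ob, mean_bg = adaptive_colour_split(pixels, (255, 0, 0), (0, 0, 255))
```

As with the greyscale iteration, the result depends on the initial estimates, so they should be chosen close to the expected object and background colours.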