Tuesday, October 11, 2011

Image Compression



1.       Objective :  Reduce the amount of data required to represent an image.
2.       It is both an art and a science.
3.       It is a useful and commercially successful technology in digital image processing (DIP).
4.       Need: Reduce the memory and help in increased data transfer per second.
5.       Benefits of compression :
a.       A 2-hour movie stored without compression requires about 27 dual-layer DVDs of 8.5 GB capacity each.
b.       The time required to transmit a small 128x128x24-bit full-colour image ranges from about 7.0 secs over a 56 kbps modem to 0.03 secs over a 12 Mbps broadband connection.
6.       Compression can reduce transmission time by a factor of 2 to 100 or more.
7.       Other areas : televideo conferencing, remote sensing, document and medical imaging, FAX.
8.       We study only the most frequently used compression techniques and some industry standards that make them useful.
9.       Data compression is the process of reducing the amount of data required to represent  a given quantity of information.
10.   Data and information are not the same thing.
11.   Data are the means by which the information is conveyed.
12.   Various amounts of data can be used to represent the same amount of information.
13.   Redundant data : representations that contain irrelevant and/or repeated information.
14.   Relative data redundancy R is obtained from the compression ratio C ( = b/b’ ) as R = 1 – 1/C, where b and b’ are the number of bits used to represent the same image before and after compression, respectively.
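These two definitions can be checked with a small sketch (the function name and example bit counts are hypothetical):

```python
def compression_stats(b, b_prime):
    """Compression ratio C = b/b' and relative redundancy R = 1 - 1/C,
    where b and b' are the bit counts before and after compression."""
    C = b / b_prime
    R = 1 - 1 / C
    return C, R

# Hypothetical example: a 256x256 8-bit image compressed to 65,536 bits.
C, R = compression_stats(256 * 256 * 8, 65536)
print(C, R)  # C = 8.0, so 87.5% of the original representation was redundant
```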
15.   Data redundancy in 2D intensity arrays is :
a.       Coding redundancy : The 8-bit codes that are used to represent the intensities in most 2-D intensity arrays contain more bits than are needed to represent the intensities.
b.       Spatial and temporal redundancy : Pixels of most 2D arrays are spatially correlated. Also in a video, the pixels are temporally correlated.
c.       Irrelevant information :  Most 2D arrays contain information that is ignored by the human visual system.
16.   Coding redundancy is present when the codes assigned to a set of events (such as intensity values) do not take full advantage of the probability of the events.
17.   Coding redundancy is almost always present when the intensities of an image are represented using a natural binary code. The reason is that most images are composed of objects that have a regular and somewhat predictable morphology (shape) and reflectance, and are sampled so that the objects being depicted are much larger than the picture elements.
18.   The natural consequence is that, for most images, certain intensities are more probable than others.
19.   A natural binary encoding assigns the same number of bits to both the most and the least probable values, failing to minimize the average code length and resulting in coding redundancy.
20.   Compression results from assigning fewer bits to the more probable intensity values than to the less probable ones. In the resulting ‘variable-length code’, the image’s most probable intensity is assigned a 1-bit code word while the least probable intensity is assigned a 3-bit code word. Note that the best fixed-length code that can be assigned to the intensities of the image in Eg 8-1 is the natural 2-bit counting sequence, but the resulting compression is 4:1, about 10% less than the 4.42:1 compression of the variable-length code.
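The 4.42:1 figure can be verified directly. A sketch using four intensity probabilities and code lengths as in the example referenced above (the specific values are assumed from the standard textbook example):

```python
# Probabilities of the four intensities present in the image, and the
# number of bits the variable-length code assigns to each (assumed values).
probs   = [0.47, 0.25, 0.25, 0.03]
lengths = [1, 2, 3, 3]

l_avg = sum(p * n for p, n in zip(probs, lengths))  # average bits per pixel
ratio = 8 / l_avg                                   # vs. the natural 8-bit code
print(round(l_avg, 2), round(ratio, 2))             # 1.81 bits/pixel, ~4.42:1
```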
21.   Spatial and temporal redundancy: an image may be incompressible by variable-length coding alone (all intensity levels equally probable), yet observation may reveal spatial redundancy that can be eliminated by representing the image as a sequence of run-length pairs, where each pair specifies the start of a new intensity and the number of consecutive pixels that have that intensity.
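A minimal sketch of run-length coding and its exact inverse (function names are mine):

```python
def rle_encode(row):
    """Encode a row of pixels as (intensity, run length) pairs."""
    pairs = []
    for v in row:
        if pairs and pairs[-1][0] == v:
            pairs[-1][1] += 1            # extend the current run
        else:
            pairs.append([v, 1])         # start a new run
    return [tuple(p) for p in pairs]

def rle_decode(pairs):
    """Exact inverse of rle_encode: expand each run back into pixels."""
    return [v for v, n in pairs for _ in range(n)]

row = [255, 255, 255, 0, 0, 128]
print(rle_encode(row))                   # [(255, 3), (0, 2), (128, 1)]
```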
22.   In most images , pixels are correlated spatially (in both x and y) and in time(t) when the image is part of a video sequence.
23.   To reduce the redundancy associated with spatially and temporally correlated pixels , a 2-D intensity array must be transformed into a more efficient but usually ‘non-visual’ representation.
24.   When a 2-D intensity array is converted to run-lengths, or the differences between adjacent pixels are used instead, the transformation is called a mapping.
25.   A mapping is reversible if the pixels of the original 2-D intensity array can be reconstructed without error from the transformed data set; otherwise the mapping is said to be irreversible.
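The difference mapping just mentioned is easy to show reversible; a minimal sketch (function names are mine):

```python
def forward_map(row):
    """First pixel followed by differences between adjacent pixels."""
    return [row[0]] + [b - a for a, b in zip(row, row[1:])]

def inverse_map(diffs):
    """Reverse the mapping by cumulatively summing the differences."""
    out = [diffs[0]]
    for d in diffs[1:]:
        out.append(out[-1] + d)
    return out

row = [100, 102, 103, 103, 101]
print(forward_map(row))   # [100, 2, 1, 0, -2] -- small, highly compressible values
```

Because spatially correlated pixels change slowly, the differences cluster near zero and are cheaper to code than the raw intensities.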
26.   Irrelevant information: compression by removing ‘superfluous’ data from the set. E.g., a homogeneous gray image can be represented by its average intensity alone – a single 8-bit value.
27.   Whether or not this information should be preserved is application dependent. If the information is important (like digital X-ray archive), it should not be omitted; otherwise , the information is redundant and can be excluded for the sake of compression performance.
28.   How do we decide the number of bits actually needed to represent the information in an image? Information theory helps.
29.   Information theory : Generation of information can be modeled as a probabilistic process that can be measured in a manner that agrees with intuition.


30.   A random event E with probability P(E) contains I(E) units of information where I(E) = -log P(E).
31.   If an event occurs always P(E) = 1 and hence no information is attached to it.
32.   The base of the logarithm decides the units used to measure the information. If base = 2 and P(E) = 0.5, I(E) = 1 bit. Meaning : 1 bit is the amount of information conveyed when one of two possible equally likely events occurs.
33.   The entropy of the intensity source: H~ = – Σ (k = 0 to L–1) pr(rk) log2 pr(rk), where pr(rk) is the probability of intensity level rk.
34.   The amount of entropy and thus information in an image is far from intuitive.
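The entropy formula above can be evaluated directly; two equally likely events carry exactly 1 bit per symbol, matching the 1-bit example earlier:

```python
import math

def entropy(probs):
    """First-order entropy H = -sum p*log2(p), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit/symbol
print(entropy([0.25] * 4))   # 2.0 bits/symbol
print(entropy([1.0]))        # 0.0 -- a certain event carries no information
```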
35.   Shannon’s first theorem (noiseless coding theorem):
lim (n → ∞) [Lavg,n / n] = H, where Lavg,n is the average number of code symbols required to represent all n-symbol groups.
36.   Fidelity criteria: removal of “irrelevant visual” information involves a loss of real or quantitative image information. How can this loss be quantified? There are two ways:
a.       Objective fidelity criteria
b.       Subjective fidelity criteria
37.   Objective fidelity criteria: the information loss is expressed as a mathematical function of the input and output of a compression process, e.g. the RMS error between the two images. This is a simple and convenient way to evaluate information loss, but it is not always meaningful to a human observer.
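A sketch of the RMS-error criterion for two images flattened into pixel lists (names are mine):

```python
import math

def rms_error(f, g):
    """Root-mean-square error between original f and decompressed g."""
    n = len(f)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) / n)

print(rms_error([10, 20, 30], [10, 20, 30]))  # 0.0 for identical images
```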
38.   Subjective fidelity criteria: measuring image quality by presenting a decompressed image to a cross-section of viewers and averaging their evaluations.
39.   Subjective evaluations can be as follows : { -3,-2,-1,0,1,2,3} for {much worse, worse, slightly worse, same, slightly better, better, much better} respectively.
40.   Care must be taken when interpreting the two criteria, because they need not agree: an image with a low RMS error can still look artificial to a viewer.
41.   Image compression models: the model of an image compression and decompression system consists of two distinct functional components: an encoder and a decoder. These can be implemented in hardware and/or software.
42.   Codec :  A device/program capable of both encoding and decoding.
43.   Compression process  has three independent operations : mapping , quantizing and symbol coding.
a.       Mapping is a reversible process and transforms f(x,y) to a non-visual format designed to reduce spatial and temporal redundancy.
b.       Quantizing is an irreversible process that reduces the accuracy of the mapper output in accordance with a pre-established fidelity criterion. The goal is to keep irrelevant information out of the compressed representation.
c.       Symbol coding is reversible and generates a fixed-length or variable-length code to represent the quantizer output, mapping the output according to the code.
44.   The shortest code words are assigned to the most frequently occurring quantizer output values – thus minimizing coding redundancy.
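This principle is exactly what Huffman coding (listed later among the basic methods) achieves. A sketch that computes only the code lengths, not the code words (implementation details and symbol probabilities are mine):

```python
import heapq

def huffman_lengths(freqs):
    """Huffman code length per symbol: frequent symbols get short codes."""
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    nxt = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)   # take the two least probable subtrees
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: n + 1 for s, n in {**d1, **d2}.items()}  # depths grow by 1
        heapq.heappush(heap, (f1 + f2, nxt, merged))
        nxt += 1
    return heap[0][2]

# Hypothetical intensity probabilities: the most probable symbol gets 1 bit.
lengths = huffman_lengths({'a': 0.47, 'b': 0.25, 'c': 0.25, 'd': 0.03})
print(lengths['a'], lengths['d'])  # 1 3
```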
45.   The decoder contains a symbol decoder and an inverse mapper. Obviously there is no de-quantizer, as quantization is irreversible.
46.    In video applications, decoded output frames are maintained in an internal frame store and used to reinsert the temporal redundancy that was removed at the encoder.
47.   Image formats, containers and compression standards :
a.       Image file format is a standard way to organize and store image data. It defines how the data is arranged and the type of compression – if any – that is used.
b.       Image container  is similar to a file format but handles multiple types of image data.
c.       Image compression standards define procedures for compressing and decompressing images.
48.   Standards for continuous-tone still images: JPEG, JPEG-LS, JPEG 2000, BMP, GIF, PDF, PNG, TIFF.
49.   VIDEO Standards : DV, H.261, H.262, H.263, H.264, MPEG-1, MPEG-2, MPEG-4, MPEG-4 AVC, AVS, HDV, M-JPEG, QUICK-TIME, VC-1, WMV9.
50.   BASIC COMPRESSION METHODS:
a.       Huffman coding
b.       Golomb coding
c.       Arithmetic coding
d.       Lempel-Ziv-Welch (LZW) coding
e.       Run-Length coding
f.        Bit-plane coding
g.       Block Transform coding
h.       Predictive coding  :- (i) lossless  (ii) lossy
i.        Wavelet coding 

Image Restoration and reconstruction


  1. Image enhancement is subjective and image restoration is objective.
  2. Goal is to improve the image in some predefined sense.
  3. Attempt to recover a degraded image by using knowledge of the degradation phenomenon, which is usually available beforehand.
  4. Understanding the degradation process and then modeling it is the key to success here.
  5. Example: Given y[m, n], we model y[m, n] = x[m, n] + η[m, n], where η is noise. We can recover x[m, n] if we can characterize η, the noise that degraded the original x[m, n]. By performing the inverse of the degradation process on y[m, n], the original x[m, n] can be restored.
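A toy sketch of this additive model, showing that zero-mean noise can be suppressed when many observations of the same value are available (the constant intensity, noise parameters, and seed are assumptions of mine):

```python
import random

random.seed(0)
x = 100.0                                    # true (constant) intensity
# y = x + eta: zero-mean Gaussian noise added at acquisition time
ys = [x + random.gauss(0, 5) for _ in range(10000)]
estimate = sum(ys) / len(ys)                 # averaging suppresses zero-mean noise
print(round(estimate, 1))                    # close to 100
```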
  6. In reality, one can only estimate η. The quality of restoration largely depends on the closeness of the estimate to η.
  7. As in image enhancement, some image restoration techniques are best formulated in the spatial domain, while others are better suited to the frequency domain.
  8. Rule of thumb: if the noise is additive, use spatial-domain techniques; if the degradation is a motion-induced blur, use frequency-domain techniques.
  9. However, additive noise is also taken care of in the frequency domain.
  10. Image restoration is divided into subtopics as follows :
a.       A model of the image degradation/restoration process
b.       Understanding how to model noise
c.       Understand the PDFs of some important noise distributions such as Gaussian, Rayleigh, Erlang (gamma), Exponential, Uniform, Impulse (salt-and-pepper) and finally periodic noise.
  1. The process of image restoration is modeled as follows: the input image f(x, y) passes through a degradation function H and additive noise η(x, y) is introduced, giving the degraded image g(x, y); restoration filters then produce an estimate of the original f(x, y).

  1. Noise gets added to an image at the time of image acquisition and/or transmission.
  2. Factors include environmental conditions, quality of sensing elements, light levels, sensor temperature during acquisition.
  3. During transmission, interference in the channel and corruption due to lightning or other atmospheric disturbances introduce noise.
  4. Noise models: noise can be understood through the corresponding Probability Density Function (PDF). Some of the most common and important noise PDFs are the following: Gaussian noise, Rayleigh noise, Erlang (gamma) noise, Exponential noise, Uniform noise, Impulse (salt-and-pepper) noise.
  5. The above PDFs, as a group, provide useful tools for modeling a broad range of noise corruption  situations found in practice.
  6. Gaussian noise is due to electronic circuit noise and sensor noise due to poor illumination and/or high temperature.
  7. Rayleigh noise  density is helpful in range imaging.
  8. Exponential and gamma densities find application in laser imaging.
  9. Impulse  noise is found in situations where quick transients, such as faulty switching, take place during imaging.
  10. An important observation is that it is difficult to differentiate visually between the first five noisy images even though their histograms are significantly different.
  11. The salt-and-pepper appearance of the image corrupted by impulse noise is the only one that is visually indicative of the type of noise causing the degradation.
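How such impulse noise is typically simulated can be sketched as follows (the probability p, the seed, and the helper name are assumptions of mine):

```python
import random

def add_salt_pepper(pixels, p=0.1, seed=1):
    """With probability p, replace a pixel by 0 (pepper) or 255 (salt)."""
    rng = random.Random(seed)
    out = []
    for v in pixels:
        r = rng.random()
        if r < p / 2:
            out.append(0)          # pepper: dark impulse
        elif r < p:
            out.append(255)        # salt: bright impulse
        else:
            out.append(v)          # pixel left untouched
    return out

noisy = add_salt_pepper([128] * 20)
print(noisy)
```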
  12. Periodic noise in an image arises typically from electrical or electromechanical interference during image acquisition. This is a spatially dependent noise which can be reduced significantly via frequency domain filtering.
  13. Restoration in the presence of noise is possible through spatial filtering.
  14. Spatial filters are of three types
a.      Mean filters: Arithmetic Mean Filter, Geometric Mean Filter, Harmonic Mean Filter, Contraharmonic Mean Filter
b.      Order-Statistic Filters : Median Filter, Max filter, Min filter, Midpoint Filter, Alpha-trimmed mean filter.
c.       Adaptive Filters : Adaptive local noise reduction filter,  Adaptive Median Filter
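Among these, the median filter is the classic remedy for the salt-and-pepper noise described earlier; a 1-D sketch with edge replication (implementation details are mine):

```python
def median_filter_1d(signal, size=3):
    """Replace each sample by the median of its size-wide neighbourhood."""
    half = size // 2
    # Replicate the edge samples so every position has a full window.
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [sorted(padded[i:i + size])[half] for i in range(len(signal))]

noisy = [10, 10, 255, 10, 10, 0, 10]    # impulse spikes at 255 and 0
print(median_filter_1d(noisy))          # [10, 10, 10, 10, 10, 10, 10]
```

Because the median ignores extreme values in each window, isolated impulses are removed without blurring edges the way a mean filter would.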
  1. Periodic noise reduction using Frequency domain filtering : These include Bandreject filters, Bandpass filters, Notch filters, Optimum notch filters.
  2. Estimating the Degradation function is done as follows :
a.       Estimation  by image observation
b.       Estimation by Experimentation.
c.       Estimation by modeling
  1. Inverse Filtering in general has poor performance and is improved by the following three methods :
a.       Minimum Mean Square Error (Wiener) Filtering makes provision for both the degradation function and the statistical characteristics of noise in the restoration process. Here, image and noise are treated as random variables and the objective is to minimize the mean-square error.
b.       Constrained Least Squares Filtering
c.       Geometric Mean Filter
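The Wiener filter is often applied per frequency component in the simplified constant-K form, F^(u) = H*(u) G(u) / (|H(u)|^2 + K), where K approximates the noise-to-signal power ratio. A sketch (the spectra and helper name are hypothetical):

```python
def wiener_restore(H, G, K):
    """Apply the constant-K Wiener formula to each frequency component of G."""
    return [h.conjugate() * g / (abs(h) ** 2 + K) for h, g in zip(H, G)]

# With no noise (K = 0) and nonzero H, this reduces to inverse filtering:
H = [1 + 0j, 0.5 + 0j]             # hypothetical degradation spectrum
F = [2 + 0j, 4 + 0j]               # true image spectrum
G = [h * f for h, f in zip(H, F)]  # degraded spectrum, noise term omitted
print(wiener_restore(H, G, 0.0))   # recovers F exactly
```

Choosing K > 0 damps frequencies where H is small, which is exactly where plain inverse filtering blows up the noise.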