

10.4 Discrete Cosine Transforms in Image Compression
The discrete cosine transform (DCT) is used to transform a signal from the spatial domain into the frequency domain. The reverse process, that of transforming a signal from the frequency domain into the spatial domain, is called the inverse discrete cosine transform (IDCT).
A signal in the frequency domain contains the same information as that in the spatial domain. The order of values obtained by applying the DCT is coincidentally from lowest to highest frequency.
This feature and the psychological observation that the human eye and ear are less sensitive to recognizing the higherorder frequencies leads to the possibility of compressing a spatial signal by transforming it to the frequency domain and dropping highorder values and keeping loworder ones. When reconstructing the signal, and transforming it back to the spatial domain, the results are remarkably similar to the original signal.
This process, with a few extra bells and whistles and slightly modified versions of DCT, is the essence behind jpeg, mpeg, and mp3 compression.
Here, we look at a simplified case of compression using the DCT and IDCT without bells and whistles. The process:
 X = Apply DCT to a sequence of values.
 X' = Drop a portion of highorder values from X.
 X'' = Apply IDCT to X'
 Draw X'' and observe the similarity to the original X.
For the purpose of comparison, we drop highorder values in the following ranges:
 1/2 of all values, resulting in 50% compression.
 3/4 of all values, resulting in 75% compression.
 7/8 of all values, resulting in 87.5% compression.
We apply the above process to only one dimension (the horizontal) for now. Each row is considered as a sequence of values, and undergoes a DCT and an IDCT.
pomegranate = imread('pomegranateSeeds.jpg');

The original size of each row:
origWidth = size(pomegranate, 2);

Number of loworder samples to extract for the 3 cases (50%, 75%, 87.5% compression):
samplesHalf = floor(origWidth / 2); samplesQuarter = floor(origWidth / 4); samplesEighth = floor(origWidth / 8);

Final results are stored in these matrices:
pomegranateCompressed2 = []; pomegranateCompressed4 = []; pomegranateCompressed8 = [];

The following code traverses all colors and for each color all rows. Each row is transformed using the DCT, then highorder samples are discarded, and the loworder samples are reversetransformed using IDCT.
Function
dct
takes as an argument a sequence of numerical values, and returns a sequence of coefficients.
Function
idct
takes as an argument coefficients, as well as a value for the size of the desired inverse transform vector. Values of the input sequence are padded or truncated accordingly.
For efficiency of coding, all 3 images are built in one round
for k=1:3 % all color layers: RGB for i=1:size(pomegranate, 1) % all rows rowDCT=dct(double(pomegranate(i,:,k))); pomegranateCompressed2(i,:,k) = idct(rowDCT(1:samplesHalf), origWidth); pomegranateCompressed4(i,:,k) = idct(rowDCT(1:samplesQuarter), origWidth); pomegranateCompressed8(i,:,k) = idct(rowDCT(1:samplesEighth), origWidth); end end

subplot(2,2,1), image(uint8(pomegranate)), title('Original Image'); subplot(2,2,2), image(uint8(pomegranateCompressed2)), title('Compression Factor 2'); subplot(2,2,3), image(uint8(pomegranateCompressed4)), title('Compression Factor 4'); subplot(2,2,4), image(uint8(pomegranateCompressed8)), title('Compression Factor 8');


Figure 10.33 Click image to enlarge, or click here to open



Figure 10.34 Click image to enlarge, or click here to open


10.4.2 2dimensional compression
In addition to the previous compression of rows, we can compress columns of data as well. The compression rate for the entire image then increases by some factor:
 50% horizontal, 50% vertical = 25% total
 75% horiztonal, 75% vertical = 93.75% total
 87.5% horizontal, 87.5% vertical = 98.44% total
Starting from the horizontal compression:
origHeight = size(pomegranate, 1); samplesHalf = floor(origHeight / 2); samplesQuarter = floor(origHeight / 4); samplesEighth = floor(origHeight / 8);

pomegranateCompressed2full = []; pomegranateCompressed4full = []; pomegranateCompressed8full = [];

for k=1:3 % all color layers: RGB for i=1:size(pomegranate, 2) % all columns columnDCT2=dct(double(pomegranateCompressed2(:,i,k))); columnDCT4=dct(double(pomegranateCompressed4(:,i,k))); columnDCT8=dct(double(pomegranateCompressed8(:,i,k))); pomegranateCompressed2full(:,i,k) = idct(columnDCT2(1:samplesHalf), origHeight); pomegranateCompressed4full(:,i,k) = idct(columnDCT4(1:samplesQuarter), origHeight); pomegranateCompressed8full(:,i,k) = idct(columnDCT8(1:samplesEighth), origHeight); end end

subplot(2,2,1), image(uint8(pomegranate)), title('Original Image'); subplot(2,2,2), image(uint8(pomegranateCompressed2full)), title('Compression Factor 2 * 2'); subplot(2,2,3), image(uint8(pomegranateCompressed4full)), title('Compression Factor 4 * 4'); subplot(2,2,4), image(uint8(pomegranateCompressed8full)), title('Compression Factor 8 * 8');


Figure 10.35 Click image to enlarge, or click here to open



Figure 10.36 Click image to enlarge, or click here to open


10.4.3 Discussion
The compression shown here is somewhat simplistic  we compressed full rows separately from full columns. Typically, images are compressed in macroblocks of 8 rows by 8 columns, where each block is linearized into a onedimensional vector. This approach is certainly better, since it leverages localization of color within a macroblock. It is unlikely that color within an 8x8 block changes dramatically, and thus high compression rates can be achieved while retaining quality of detail.
When does compression fail? Frequently changing colors in dense spaces cannot be represented well with few coefficients. For example, a row of pixels interchanging between black and white pixelbypixel, is viewed as a high frequency in the frequency domain. However, a high frequency cannot be represented with few coefficients, and thus dropping highorder coefficients from the DCT removes the necessary detail. This is also the reason why diagrams are not compressed using jpeg compression.
Out of curiosity, we will compress the 3rd dimension of the image, i.e. the 3value color for each pixel. Because this requires computing the DCT for rows*columns vectors, and such computation is lengthy, we will resize the original image.
A simple algorithm for resizing by factors of 2 is the nearest neighbor algorithm. For a halfsize in both dimensions, we simply remove every other pixel in rows and columns. This is easily done in Matlab:
pomegranateSmall = pomegranate(1:2:size(pomegranate,1),1:2:size(pomegranate,2),:);

This image is roughly the same as the original, but it is missing 75% of its content.
Because there aren't many options for compressing 3 values, we will examine compression for:
 3 to 2 colors
 3 to 1 color (grayscale)
The final compressedanduncompressed images will be stored in these variables:
pomegranateColorCompression2 = []; pomegranateColorCompression3 = [];

It is necessary to iterate over all rows and all columns, and for each pixel, we linearize the RGB color values, take the DCT, drop values, and apply the IDCT.
for i=1:size(pomegranateSmall, 1) for j=1:size(pomegranateSmall, 2) rgb = pomegranateSmall(i,j,:); rgb = rgb(:)'; rgbDCT = dct(double(rgb)); rgbIDCT2 = idct(rgbDCT(1:2), 3); rgbIDCT3 = idct(rgbDCT(1), 3);
pomegranateColorCompression2(i,j,1) = rgbIDCT2(1); pomegranateColorCompression2(i,j,2) = rgbIDCT2(2); pomegranateColorCompression2(i,j,3) = rgbIDCT2(3);
pomegranateColorCompression3(i,j,1) = rgbIDCT3(1); pomegranateColorCompression3(i,j,2) = rgbIDCT3(2); pomegranateColorCompression3(i,j,3) = rgbIDCT3(3); end end

subplot(2,2,1), image(pomegranateSmall), title('Original Image'); subplot(2,2,3), image(uint8(pomegranateColorCompression2)), title('Compression 2 color values'); subplot(2,2,4), image(uint8(pomegranateColorCompression3)), title('Compression 3 color values');


Figure 10.37 Click image to enlarge, or click here to open



Figure 10.38 Click image to enlarge, or click here to open



