Digital Image Processing: Transforms, Compression, and Filters

Posted by Anonymous and classified in Technology


Fourier Transform

1. Definition

The Fourier Transform (FT) is used to convert an image from the spatial domain to the frequency domain. It expresses the image as a sum of sinusoidal functions of varying frequencies, amplitudes, and phases.

2. Intuition

In the spatial domain, we deal with pixel intensity at each location. In the frequency domain, we analyze how intensity varies—i.e., how fast brightness changes across pixels. This helps in analyzing patterns, removing noise, and performing filtering.

3. Mathematical Forms

  • A. 1D Continuous FT (for signals): $F(u) = \int_{-\infty}^{\infty} f(x)\, e^{-j 2\pi u x}\, dx$
  • B. 2D Continuous FT (for images): $F(u,v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y)\, e^{-j 2\pi (ux + vy)}\, dx\, dy$
  • C. 2D Discrete FT (DFT): $F(u,v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j 2\pi (ux/M + vy/N)}$

The DFT is the form used in DIP because images are digital. In these formulas:

  • f(x,y): Input image
  • F(u,v): Frequency domain representation
  • M×N: Size of image

4. Inverse Fourier Transform

$f(x,y) = \frac{1}{MN} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u,v)\, e^{j 2\pi (ux/M + vy/N)}$

Used to go back from the frequency domain to the spatial domain.
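
The round trip between the two domains can be checked with a minimal pure-Python sketch. It assumes the common convention that puts the 1/(MN) factor on the inverse transform; the function names `dft2` and `idft2` and the sample image are illustrative only, and the quadruple loop is far too slow for real images (an FFT would be used in practice).

```python
import cmath

def dft2(f):
    """Naive 2D DFT of an M x N image stored as a list of rows."""
    M, N = len(f), len(f[0])
    F = [[0j] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            total = 0j
            for x in range(M):
                for y in range(N):
                    angle = -2 * cmath.pi * (u * x / M + v * y / N)
                    total += f[x][y] * cmath.exp(1j * angle)
            F[u][v] = total
    return F

def idft2(F):
    """Naive inverse 2D DFT; the 1/(MN) scaling is applied here (assumed convention)."""
    M, N = len(F), len(F[0])
    f = [[0j] * N for _ in range(M)]
    for x in range(M):
        for y in range(N):
            total = 0j
            for u in range(M):
                for v in range(N):
                    angle = 2 * cmath.pi * (u * x / M + v * y / N)
                    total += F[u][v] * cmath.exp(1j * angle)
            f[x][y] = total / (M * N)
    return f

# Hypothetical 3 x 4 test image with a bright bar in the middle row.
image = [[10, 10, 10, 10],
         [10, 50, 50, 10],
         [10, 10, 10, 10]]

spectrum = dft2(image)      # F(0,0) equals the sum of all pixel values
restored = idft2(spectrum)  # matches the input up to rounding error
```

The value `spectrum[0][0]` is the DC term (the sum of all pixels), and the real parts of `restored` reproduce `image` up to floating-point error.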

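5. Properties

  • Linearity: The FT of a sum is the sum of the FTs.
  • Translation: Shifting in space → phase shift in frequency (the magnitude spectrum is unchanged).
  • Scaling: Compressing in space → stretching in frequency, and vice versa.
  • Rotation: Rotating the image → rotates the spectrum by the same angle.
  • Convolution: Convolution in space ↔ multiplication in frequency.
  • Correlation: Similar to convolution, but one spectrum is complex-conjugated before multiplying; used in pattern matching.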
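
The convolution property can be verified numerically. The sketch below is a 1D illustration using only the standard library; for the DFT the property holds for circular convolution, so the signals are zero-padded to avoid wrap-around. The kernel `h` and all function names are made up for the example.

```python
import cmath

def dft(signal):
    """Naive 1D DFT."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * cmath.pi * u * x / n) for x in range(n))
            for u in range(n)]

def idft(spectrum):
    """Naive inverse 1D DFT with the 1/n factor on the inverse."""
    n = len(spectrum)
    return [sum(spectrum[u] * cmath.exp(2j * cmath.pi * u * x / n) for u in range(n)) / n
            for x in range(n)]

def circular_convolve(f, h):
    """Direct circular convolution in the spatial domain."""
    n = len(f)
    return [sum(f[k] * h[(x - k) % n] for k in range(n)) for x in range(n)]

f = [1, 2, 3, 4, 0, 0, 0, 0]   # signal, zero-padded to length 8
h = [1, 1, 1, 0, 0, 0, 0, 0]   # hypothetical 3-tap box kernel, zero-padded

direct = circular_convolve(f, h)

# Multiply the spectra element by element, then transform back.
product = [a * b for a, b in zip(dft(f), dft(h))]
via_frequency = [value.real for value in idft(product)]
```

Both routes give `[1, 3, 6, 9, 7, 4, 0, 0]` (the frequency route only up to rounding error), which is why large filters are often applied in the frequency domain.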

6. Applications in DIP

  • Image filtering (low-pass, high-pass)
  • Image compression (JPEG uses the DCT, a real-valued transform closely related to the DFT)
  • Edge detection
  • Image enhancement
  • Noise removal

Discrete Cosine Transform (DCT)

Definition

The Discrete Cosine Transform converts an image from the spatial domain to the frequency domain. Unlike the Fourier Transform, which uses both sine and cosine terms, the DCT uses only cosine basis functions, so its output is real-valued with no imaginary components. For typical images it also packs most of the energy into a few low-frequency coefficients, which makes it well suited to compression.

1D DCT Equation

$C(u) = \alpha(u) \sum_{x=0}^{N-1} f(x) \cos\left[ \frac{(2x+1)u\pi}{2N} \right]$, for $u = 0, 1, \dots, N-1$, where $\alpha(0) = \sqrt{1/N}$ and $\alpha(u) = \sqrt{2/N}$ for $u \neq 0$.
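
A direct translation of the 1D DCT into Python might look like the sketch below. It assumes the orthonormal DCT-II, with a scale factor of sqrt(1/N) for u = 0 and sqrt(2/N) otherwise; other texts and libraries scale differently. The name `dct_1d` and the sample signals are illustrative.

```python
import math

def dct_1d(signal):
    """Orthonormal DCT-II of a 1D sequence (assumed normalisation)."""
    n = len(signal)
    coeffs = []
    for u in range(n):
        alpha = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
        total = sum(signal[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    for x in range(n))
        coeffs.append(alpha * total)
    return coeffs

flat = dct_1d([5, 5, 5, 5])   # constant signal: only the DC coefficient is non-zero
ramp = dct_1d([1, 2, 3, 4])   # slowly varying signal: energy sits in the low indices
```

With this normalisation the transform preserves energy: the sum of squared coefficients equals the sum of squared samples.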

2D DCT for Images

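For an $N \times N$ image: $C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[ \frac{(2x+1)u\pi}{2N} \right] \cos\left[ \frac{(2y+1)v\pi}{2N} \right]$, where $\alpha(0) = \sqrt{1/N}$ and $\alpha(k) = \sqrt{2/N}$ for $k \neq 0$.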

Properties of DCT

Property | Description
Real-valued | Output contains no imaginary values
Energy compaction | Most information is concentrated in a few low-frequency coefficients
Orthogonality | Basis functions are orthogonal
Separability | 2D DCT = 1D DCT on the rows, followed by 1D DCT on the columns
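
Separability and energy compaction can both be seen in a short sketch. It assumes the orthonormal DCT-II normalisation; the function names and the smooth 4×4 test block are hypothetical.

```python
import math

def dct_1d(signal):
    """Orthonormal DCT-II of a 1D sequence (assumed normalisation)."""
    n = len(signal)
    coeffs = []
    for u in range(n):
        alpha = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
        total = sum(signal[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    for x in range(n))
        coeffs.append(alpha * total)
    return coeffs

def dct_2d(block):
    """2D DCT built from 1D passes: transform every row, then every column."""
    row_pass = [dct_1d(row) for row in block]
    col_pass = [dct_1d(list(col)) for col in zip(*row_pass)]
    return [list(row) for row in zip(*col_pass)]   # transpose back to [u][v] order

# Hypothetical smooth block: a gentle brightness gradient.
block = [[100 + 2 * x + 3 * y for y in range(4)] for x in range(4)]
coeffs = dct_2d(block)

total_energy = sum(c * c for row in coeffs for c in row)
dc_share = coeffs[0][0] ** 2 / total_energy   # fraction of energy in the DC term
```

For this smooth block more than 99% of the energy ends up in the single DC coefficient, which is the behaviour compression schemes exploit.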

Applications

  • Image and video compression (JPEG, MPEG)
  • Image denoising
  • Feature extraction (e.g., in face recognition)
  • Watermarking and image hiding

DCT vs FFT

Feature | DCT | FFT
Functions used | Cosines only | Sines and cosines (complex)
Output | Real numbers | Complex numbers
Energy | More compact | Spread across frequencies
Use-case | Compression | Frequency analysis

Wavelet Transform

Definition

The Wavelet Transform represents an image in terms of both space and frequency, unlike FT or DCT which focus on global frequency. It uses small, localized waveforms called wavelets instead of sine or cosine functions and offers multi-resolution analysis.

Key Concept

Fourier and DCT use global basis functions: good for frequency, bad for localization. The Wavelet transform uses short basis functions that are scaled and shifted—ideal for localized changes, like edges and textures.

Types

  • 1. Continuous Wavelet Transform (CWT): Infinite number of scales and positions; mainly used in theoretical analysis.
  • 2. Discrete Wavelet Transform (DWT): Used in practical applications (e.g., image processing); decomposes image into approximations and details at different scales.

DWT Process

At each level, the image is divided into 4 parts:

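  • LL (Approximation): Low-frequency components (smooth areas)
  • LH (Horizontal detail): Horizontal edges (low-pass along rows, high-pass along columns)
  • HL (Vertical detail): Vertical edges (high-pass along rows, low-pass along columns)
  • HH (Diagonal detail): Diagonal edges, corners, and noise

Some texts swap the LH and HL labels, so check which filter order a given source uses.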
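
One decomposition level can be sketched with the Haar wavelet, the simplest case. This is an assumption made for illustration (the text does not name a wavelet), and the code uses plain averages and differences rather than the 1/sqrt(2) orthonormal scaling. The sub-bands are given descriptive names because the LH/HL labelling differs between texts.

```python
def haar_dwt2(image):
    """One level of an averaging Haar DWT on an image with even width and height."""
    rows, cols = len(image), len(image[0])
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            p, q = image[r][c], image[r][c + 1]          # top pair of the 2x2 block
            s, t = image[r + 1][c], image[r + 1][c + 1]  # bottom pair
            a_row.append((p + q + s + t) / 4)  # local average (approximation)
            h_row.append((p + q - s - t) / 4)  # top minus bottom: horizontal edges
            v_row.append((p - q + s - t) / 4)  # left minus right: vertical edges
            d_row.append((p - q - s + t) / 4)  # diagonal differences
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag

# Hypothetical image with a single horizontal edge between row 0 and row 1.
image = [[0, 0, 0, 0],
         [8, 8, 8, 8],
         [8, 8, 8, 8],
         [8, 8, 8, 8]]

approx, horiz, vert, diag = haar_dwt2(image)
```

Only the horizontal-detail band responds to the edge; the vertical and diagonal bands are all zero. Applying `haar_dwt2` again to `approx` gives the next, coarser level.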

Properties of Wavelet Transform

Property | Description
Multi-resolution | Can analyze image at different scales/resolutions
Localization | Good spatial and frequency localization
Energy efficient | Stores edge and texture information compactly
Time-frequency | Combines advantages of both domains

Run Length Coding (RLC)

Definition

Run Length Coding (RLC) is a lossless compression technique used to reduce the size of data by encoding repeated values (runs) as a single value and count. It is simple, effective, and fast—especially when data has many repeating elements.

Working Principle

Instead of storing repeated values individually, RLC stores: (Value, Run Length).
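
A minimal sketch of the (Value, Run Length) idea for one image row follows. The function names are illustrative, and real formats add details such as maximum run lengths and bit packing.

```python
def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [(value, length) for value, length in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    decoded = []
    for value, length in runs:
        decoded.extend([value] * length)
    return decoded

row = [0, 0, 0, 0, 255, 255, 0, 0, 0]
encoded = rle_encode(row)       # [(0, 4), (255, 2), (0, 3)]
restored = rle_decode(encoded)  # identical to row, since RLC is lossless
```

Nine pixels shrink to three pairs here. A row with no repeats would grow instead, because every pixel would need its own pair.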

Advantages and Limitations

Advantages:

Feature | Explanation
Simple algorithm | Easy to implement and decode
Lossless | No loss of image quality
Efficient for sparse images | Great for binary and fax images

Limitations:

Issue | Why it matters
Not good for complex images | High variation leads to poor performance
Inefficient for noisy images | Random pixel changes break runs
Dependent on scanning order | Different orders give different results

Lempel-Ziv Coding (LZ Coding)

A family of lossless compression algorithms that reduce file size by replacing repeated patterns with references to previous occurrences. It uses a dictionary-based approach.

Variants

  • LZ77: Uses a sliding window to find repeated sequences.
  • LZ78: Builds a dictionary dynamically.
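  • LZW: Refines LZ78 by pre-loading the dictionary with all single symbols, so the output consists only of dictionary codes (often stored as fixed-length codes).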
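
As an illustration of the dictionary idea, here is a small LZW sketch. It assumes an initial dictionary of the 256 single-character strings and returns the codes as plain integers, without the bit packing or dictionary-size limits that a real codec would add.

```python
def lzw_encode(text):
    """Return a list of integer codes for the input string."""
    dictionary = {chr(i): i for i in range(256)}   # assumed initial alphabet
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                    # keep growing the match
        else:
            codes.append(dictionary[current])      # emit the longest known match
            dictionary[candidate] = next_code      # learn the new pattern
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes

def lzw_decode(codes):
    """Rebuild the string; the dictionary is reconstructed while decoding."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:                    # code defined by this very step
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(output)

message = "ABABABABAB"
codes = lzw_encode(message)      # [65, 66, 256, 258, 257, 66]
restored = lzw_decode(codes)
```

Ten characters become six codes, and the decoder never needs the dictionary to be transmitted because it rebuilds the same entries in the same order.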

Image Filtering Techniques

Median Filter

A non-linear filter that replaces each pixel with the median value of the pixels in its neighborhood. It is highly effective at removing impulse noise (salt-and-pepper noise) and preserves edges much better than a linear averaging filter of the same size.
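
A minimal 3×3 median filter in pure Python is sketched below. Border pixels are handled by replicating the nearest edge pixel, which is one common choice among several; the function name and the test image are illustrative.

```python
from statistics import median

def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighborhood."""
    rows, cols = len(image), len(image[0])
    radius = size // 2
    output = [row[:] for row in image]
    for y in range(rows):
        for x in range(cols):
            window = [
                # Clamp coordinates so the border replicates the nearest pixel.
                image[min(max(y + dy, 0), rows - 1)][min(max(x + dx, 0), cols - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            output[y][x] = median(window)
    return output

# Hypothetical image: a vertical edge (10 | 200) with one salt and one pepper pixel.
noisy = [[10, 10, 10, 200, 200],
         [10, 10, 10, 0, 200],     # pepper pixel at column 3
         [10, 255, 10, 200, 200],  # salt pixel at column 1
         [10, 10, 10, 200, 200],
         [10, 10, 10, 200, 200]]

filtered = median_filter(noisy)
```

Both outliers disappear while the vertical edge stays exactly where it was; an averaging filter would have smeared both the noise and the edge.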

Geometric Mean Filter

A non-linear filter that replaces each pixel with the geometric mean of the pixel values in its neighborhood: $\hat{f}(x,y) = \left[ \prod_{(s,t) \in S_{xy}} g(s,t) \right]^{\frac{1}{mn}}$, where $g$ is the noisy image and $S_{xy}$ is the $m \times n$ window centered at $(x,y)$.

Harmonic Mean Filter

A non-linear filter that replaces each pixel with the harmonic mean of the pixel values in its neighborhood: $\hat{f}(x,y) = \frac{mn}{\sum_{(s,t) \in S_{xy}} \frac{1}{g(s,t)}}$, where $g$ is the noisy image and $S_{xy}$ is the $m \times n$ window centered at $(x,y)$.
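
The geometric and harmonic mean filters differ only in how the window is reduced to one value, so a single sketch can cover both. It assumes strictly positive pixel values (a zero forces the geometric mean to zero and breaks the harmonic mean) and replicated borders; all names are illustrative.

```python
import math

def window_filter(image, reducer, size=3):
    """Apply `reducer` to every size x size neighborhood (borders replicated)."""
    rows, cols = len(image), len(image[0])
    radius = size // 2
    output = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            window = [
                image[min(max(y + dy, 0), rows - 1)][min(max(x + dx, 0), cols - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out_row.append(reducer(window))
        output.append(out_row)
    return output

def geometric_mean(window):
    """(product of the values) ** (1 / count)."""
    return math.prod(window) ** (1 / len(window))

def harmonic_mean(window):
    """count / (sum of reciprocals); assumes no zero-valued pixels."""
    return len(window) / sum(1 / value for value in window)

# Hypothetical 3 x 3 patch with one bright (salt) outlier in the centre.
patch = [[10, 10, 10],
         [10, 250, 10],
         [10, 10, 10]]

geo = window_filter(patch, geometric_mean)[1][1]    # about 14.3
har = window_filter(patch, harmonic_mean)[1][1]     # about 11.2
arithmetic = sum(sum(row) for row in patch) / 9     # about 36.7, for comparison
```

Both filters pull the bright outlier much closer to the background than the arithmetic mean does, with the harmonic mean being the more aggressive of the two.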


Lossy Compression Techniques

1. Transform Coding

Converts an image from the spatial domain to a frequency domain. High-frequency components, which carry little of the energy in typical images, are discarded or coarsely quantized, while low-frequency components are kept.
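
A 1D sketch of the idea follows, using the orthonormal DCT-II as the transform (an assumption; any decorrelating transform could be used) and simple truncation in place of a real quantizer. The function names, the pixel row, and the `keep` parameter are hypothetical.

```python
import math

def _alpha(u, n):
    """Scale factor of the orthonormal DCT-II."""
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)

def dct_1d(signal):
    n = len(signal)
    return [_alpha(u, n) * sum(signal[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                               for x in range(n))
            for u in range(n)]

def idct_1d(coeffs):
    n = len(coeffs)
    return [sum(_alpha(u, n) * coeffs[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for x in range(n)]

def transform_code(signal, keep):
    """Keep the `keep` lowest-frequency coefficients, zero the rest, and invert."""
    coeffs = dct_1d(signal)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return idct_1d(truncated)

def max_error(original, approximation):
    return max(abs(a - b) for a, b in zip(original, approximation))

row = [100, 102, 104, 107, 110, 112, 115, 117]   # smooth row of pixel values

coarse = transform_code(row, keep=1)   # only the mean survives
better = transform_code(row, keep=3)   # three of eight coefficients kept
exact = transform_code(row, keep=8)    # nothing discarded
```

Keeping three of the eight coefficients already reproduces this smooth row to within a few grey levels, while keeping all eight reconstructs it up to rounding error.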

2. K-L Transform (Karhunen-Loève Transform)

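A technique used for decorrelation and dimensionality reduction. It uses the eigenvectors of the data's covariance matrix as an orthogonal basis, which gives the best approximation in the mean-square-error sense for a given number of retained coefficients. Its drawback is that the basis depends on the data and must be computed for each image or image set.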
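
In symbols, the standard formulation can be summarised as follows. The notation is an assumption made here: $\mathbf{x}$ is a vector of pixel values, $\mathbf{m}_x$ its mean, and $\mathbf{C}_x$ its covariance matrix.

```latex
\mathbf{m}_x = E\{\mathbf{x}\}, \qquad
\mathbf{C}_x = E\{(\mathbf{x} - \mathbf{m}_x)(\mathbf{x} - \mathbf{m}_x)^{T}\}

\mathbf{C}_x \mathbf{e}_i = \lambda_i \mathbf{e}_i, \qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n

\mathbf{y} = \mathbf{A}(\mathbf{x} - \mathbf{m}_x), \qquad
\text{rows of } \mathbf{A} \text{ are } \mathbf{e}_1^{T}, \dots, \mathbf{e}_n^{T}

\hat{\mathbf{x}} = \mathbf{A}_k^{T} \mathbf{y}_k + \mathbf{m}_x, \qquad
\text{mean-square error} = \sum_{i=k+1}^{n} \lambda_i
```

Here $\mathbf{A}_k$ keeps only the $k$ eigenvectors with the largest eigenvalues, so the error is the sum of the discarded eigenvalues.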

3. Discrete Cosine Transform (DCT)

Represents the image as a weighted sum of cosine functions of different frequencies. It is the transform used in the JPEG image compression standard.

4. Block Truncation Coding (BTC)

Divides the image into blocks and represents each block with a limited number of pixel values (usually two representative values based on mean and variance).
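
One block of BTC might be coded as in the sketch below. It assumes the classic moment-preserving variant, in which the two output levels are chosen so that the decoded block keeps the original mean and variance, and it thresholds at "greater than or equal to the mean". The function names and the sample block are illustrative.

```python
import math

def btc_encode(block):
    """Encode one block as (bitmap, low_level, high_level)."""
    pixels = [p for row in block for p in row]
    m = len(pixels)
    mean = sum(pixels) / m
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / m)
    bitmap = [[1 if p >= mean else 0 for p in row] for row in block]
    q = sum(sum(row) for row in bitmap)        # number of pixels at or above the mean
    if q == 0 or q == m:                       # flat block: a single level suffices
        low = high = mean
    else:
        low = mean - std * math.sqrt(q / (m - q))
        high = mean + std * math.sqrt((m - q) / q)
    return bitmap, low, high

def btc_decode(bitmap, low, high):
    """Rebuild the block from the bitmap and the two levels."""
    return [[high if bit else low for bit in row] for row in bitmap]

# Hypothetical 4 x 4 block of grey levels.
block = [[2, 9, 12, 15],
         [2, 11, 11, 9],
         [2, 3, 12, 15],
         [3, 3, 4, 14]]

bitmap, low, high = btc_encode(block)
decoded = btc_decode(bitmap, low, high)
```

Each 4×4 block is stored as 16 bits plus two levels instead of 16 full pixel values, and the decoded block has the same mean and variance as the original even though individual pixels differ.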
