DCT in JPEG Compression
Signal Processing · Independent Research
Executed the core JPEG compression sequence on my own raw photographic data. Computed the Discrete Cosine Transform analytically, applied standard quantization matrices, and validated reconstruction accuracy against Fast Fourier Transform benchmarks.
At a glance (one-pager)
Low-Frequency Dominance Validated at DC
Manual computation on the first pixel block yielded a DC coefficient of . Because the DC term is proportional to the block average, its dominance over the AC terms indicates that most of this block's energy sits in the lowest frequencies. Subsequent Inverse DCT reconstruction was near-identical to the original block, which supports the JPEG strategy of quantizing high-frequency AC coefficients aggressively.
Pixel Depth
8-bit values, level-shifted from [0, 255] to [−128, 127] before the DCT.
Block Size
8×8, the JPEG standard; a 512×512 image contains 4096 such blocks.
JPEG Quality
Quality 50. Balances compression ratio against perceptual fidelity.
DC Coefficient
Largest coefficient by far, consistent with low-frequency content dominating the block.
DCT Reconstruction
Near-identical. FFT gave [24,12,20…48] — visible artifacts.
Compression Algorithm
Quantization of the DCT coefficients (element-wise division and rounding) to discard fine detail.
JPEG remains one of the most widely used image compression standards, and it relies heavily on a frequency-domain transform. The objective of this analysis was to test the efficiency of the Discrete Cosine Transform (DCT) by isolating the transform and quantization stages of the pipeline. This involved extracting pixel intensities in Python and computing the transform by hand, so that the mechanics could be examined before any entropy encoding takes place.
Methodology & reflection
Stopping before the entropy coding phase exposed a weakness in the model: the output stage was never exercised. The Inverse DCT reconstruction produced minor boundary clipping artifacts, with some pixel intensities coming out negative and requiring manual zeroing. Huffman coding is lossless, so these out-of-range values most likely come from quantization rounding, which a complete decoder handles by clamping to the valid pixel range. The lesson for future biomedical signal processing research is that testing an algorithm in isolation is insufficient. Future compression or kinematic models should be validated through complete, end-to-end pipelines so that data integrity is checked at the physical output layer.
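The manual zeroing step can be generalised into a clamp applied after the inverse level shift. The following is a minimal sketch, not the code used in the project: it assumes the reconstructed block is a nested list of level-shifted floats, that the shift was 128, and the function name `to_display_range` is hypothetical.

```python
def to_display_range(recon_block, shift=128):
    """Undo the level shift and clamp every sample to the 8-bit range [0, 255].

    recon_block: nested list of level-shifted values straight out of the IDCT.
    shift: assumed level shift (128 for 8-bit samples).
    """
    clamped = []
    for row in recon_block:
        # round() gives the nearest integer; min/max enforce both bounds,
        # so negative values become 0 and overshoots become 255.
        clamped.append([min(255, max(0, round(v) + shift)) for v in row])
    return clamped
```

Clamping both ends, rather than only zeroing negatives, also catches values that overshoot 255 in bright blocks.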
I converted a personal beach photo to greyscale in Python, extracted the first pixel block $M$, and computed $D = T M T^{\top}$, where $T$ is the cosine transform matrix. The pipeline included level-shifting pixel values from $[0, 255]$ to $[-128, 127]$, element-wise division by the quality-50 quantisation matrix $Q_{50}$, and reconstruction via IDCT. The derivation of the DCT from the DFT was worked through analytically in the essay body.
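The steps in this paragraph can be sketched in plain Python. This is an illustration under stated assumptions rather than the project's actual code: the names `T`, `M`, `D` and `Q50` are chosen here for the transform matrix, the level-shifted block, the coefficient matrix and the quantisation table; the table is the standard JPEG luminance table usually quoted for quality 50; and the input is a hypothetical smooth gradient, not the beach photo.

```python
import math

N = 8  # JPEG block size

# Standard JPEG luminance quantisation table (commonly used for quality 50).
Q50 = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix T, so that D = T M T^T."""
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        scale = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        for j in range(n):
            t[i][j] = scale * math.cos((2 * j + 1) * i * math.pi / (2 * n))
    return t


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def jpeg_block_roundtrip(block, q=Q50):
    """Level shift, DCT, quantise, dequantise, IDCT, inverse shift (no clamping)."""
    T = dct_matrix()
    Tt = transpose(T)
    M = [[p - 128 for p in row] for row in block]          # level shift
    D = matmul(matmul(T, M), Tt)                           # forward DCT
    C = [[round(D[i][j] / q[i][j]) for j in range(N)]      # quantise
         for i in range(N)]
    R = [[C[i][j] * q[i][j] for j in range(N)]             # dequantise
         for i in range(N)]
    recon = matmul(matmul(Tt, R), T)                       # inverse DCT
    restored = [[round(v) + 128 for v in row] for row in recon]
    return D, C, restored


# Hypothetical smooth block (a gentle gradient) standing in for real image data.
block = [[100 + 2 * i + 3 * j for j in range(N)] for i in range(N)]
D, C, restored = jpeg_block_roundtrip(block)
```

On a smooth block like this one, most entries of `C` outside the top-left corner quantise to zero, which is the behaviour the analysis set out to observe. Values in `restored` are deliberately left unclamped so that any out-of-range samples remain visible.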
The derivation began with the Discrete Fourier Transform and took only its real part, discarding the imaginary (sine) terms to arrive at a cosine-only transform, a variant of the DCT. The essay then worked step by step towards the 2D DCT-II used in JPEG.
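For reference, the 2D DCT-II on an 8×8 block is commonly written in the form below. The symbols are assumptions made here, not notation taken from the essay: $f(x, y)$ is the level-shifted pixel value, $D(u, v)$ the coefficient at horizontal frequency $u$ and vertical frequency $v$, and $C(\cdot)$ the normalisation factor.

```latex
D(u, v) = \frac{1}{4}\, C(u)\, C(v)
          \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y)\,
          \cos\!\left[\frac{(2x + 1)\,u\,\pi}{16}\right]
          \cos\!\left[\frac{(2y + 1)\,v\,\pi}{16}\right],
\qquad
C(k) =
\begin{cases}
  \dfrac{1}{\sqrt{2}} & k = 0 \\[4pt]
  1 & k > 0
\end{cases}
```

With this normalisation the transform is orthonormal, so the same computation can be expressed as a matrix product with a cosine transform matrix applied to the rows and columns of the block.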
This established why the DCT is used in JPEG, rather than treating it as an arbitrary historical choice. A secondary comparison against FFT reconstruction on the same input was included to test the claim that the DCT is better suited to natural images.
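The energy-compaction argument behind that comparison can be illustrated on a toy signal. The sketch below is not the comparison carried out in the essay: it assumes a synthetic 8-sample ramp as a stand-in for a smooth image row, keeps three low-frequency coefficients from each transform (DCT terms 0 to 2; DFT bins 0, 1 and the conjugate bin 7), and measures the squared reconstruction error. All function names are hypothetical.

```python
import cmath
import math


def dct_1d(x):
    """Orthonormal 1D DCT-II."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            x[t] * math.cos((2 * t + 1) * k * math.pi / (2 * n))
            for t in range(n)))
    return out


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c[k] * math.cos((2 * t + 1) * k * math.pi / (2 * n))
        out.append(total)
    return out


def dft(x):
    """Direct O(n^2) DFT; an FFT would return the same values."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(X):
    """Inverse DFT, returning the real part of each sample."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def squared_error(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


ramp = [float(v) for v in range(8)]  # smooth, non-periodic test signal

coeffs = dct_1d(ramp)
dct_kept = [coeffs[k] if k < 3 else 0.0 for k in range(8)]
dct_err = squared_error(ramp, idct_1d(dct_kept))

spectrum = dft(ramp)
dft_kept = [spectrum[k] if k in (0, 1, 7) else 0.0 for k in range(8)]
dft_err = squared_error(ramp, idft_real(dft_kept))
```

On this ramp `dct_err` comes out well below 1 while `dft_err` is above 10. The DFT treats the signal as periodic, so the jump from the last sample back to the first spreads energy into high frequencies; the DCT's implicit mirrored extension avoids that jump.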
The manual calculation of the first block yielded a DC coefficient of . Its magnitude far exceeded that of the high-frequency AC coefficients, which supports the JPEG quantization strategy of aggressively zeroing out high-frequency data. IDCT reconstruction produced a near-lossless output matrix, with minor clipping artifacts where reconstructed values fell outside the valid pixel range. Huffman encoding was deliberately excluded from this model; being lossless, it would not change the reconstructed values.
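The link between the DC coefficient and the block average can be made explicit. Assuming the orthonormal 8×8 DCT-II and writing $f(x, y)$ for the level-shifted pixel values (notation chosen here for illustration), setting both frequencies to zero turns every cosine into 1:

```latex
D(0, 0) = \frac{1}{4} \cdot \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}}
          \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y)
        = \frac{1}{8} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y)
        = 8\,\bar{f},
\qquad
\bar{f} = \frac{1}{64} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y)
```

Under that assumption the DC coefficient is eight times the mean of the level-shifted block, so a block whose average sits far from mid-grey produces a large DC term regardless of its fine detail.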