Differentiable JPEG: The Devil is in the Details

WACV 2024

¹NEC Laboratories America, Inc.
²Technische Universität Darmstadt

TL;DR: We present a differentiable implementation of JPEG coding, able to accurately approximate JPEG over the whole compression range.

Abstract

JPEG remains one of the most widespread lossy image coding methods. However, the non-differentiable nature of JPEG restricts the application in deep learning pipelines. Several differentiable approximations of JPEG have recently been proposed to address this issue. This paper conducts a comprehensive review of existing diff. JPEG approaches and identifies critical details that have been missed by previous methods. To this end, we propose a novel diff. JPEG approach, overcoming previous limitations. Our approach is differentiable w.r.t. the input image, the JPEG quality, the quantization tables, and the color conversion parameters. We evaluate the forward and backward performance of our diff. JPEG approach against existing methods. Additionally, extensive ablations are performed to evaluate crucial design choices. Our proposed diff. JPEG resembles the (non-diff.) reference implementation best, significantly surpassing the recent-best diff. approach by 3.47dB (PSNR) on average. For strong compression rates, we can even improve PSNR by 9.51dB. Strong adversarial attack results are yielded by our diff. JPEG, demonstrating the effective gradient approximation.

Method

Existing differentiable JPEG approaches do not model all discretizations and bounds of standard (non-diff.) JPEG, resulting in a vast deterioration in approximation performance for strong compression (low JPEG quality). We present surrogates of these missed discretizations and bounds. In particular, we introduce a differentiable clipping of both the codec image and the quantization table. Additionally, we propose to differentiably floor both the quantization table scale and the quantization table itself. Modeling these bounds and discretizations in a differentiable setting leads to a strong approximation over the whole JPEG quality range. We also propose a Straight-Through Estimator (STE) of our differentiable JPEG approach, further enhancing our approximation performance.
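The role of these bounds and discretizations can be illustrated on a single quantization table entry. The sketch below is not taken from our repository. It assumes the IJG/libjpeg quality-to-scale convention for the reference path, and it uses a polynomial floor surrogate and a leaky clipping surrogate as stand-ins for the differentiable operations. The function names and the `leak` value are hypothetical, chosen only for illustration.

```python
import math


def quality_to_scale(quality: int) -> int:
    """IJG/libjpeg-style mapping from JPEG quality to a percentage scale."""
    quality = max(1, min(100, quality))  # bound on the quality itself
    # Integer division below 50 is the first discretization (floor of the scale).
    return 5000 // quality if quality < 50 else 200 - 2 * quality


def hard_quant_entry(base: int, quality: int) -> int:
    """Reference (non-differentiable) scaling of one quantization table entry."""
    value = (base * quality_to_scale(quality) + 50) // 100  # floor of the entry
    return max(1, min(255, value))  # clip to the 8-bit baseline range


def soft_floor(x: float) -> float:
    """Assumed polynomial floor surrogate: floor(x) plus a cubic term whose
    slope is non-zero almost everywhere (it vanishes only mid-interval)."""
    return math.floor(x) + (x - math.floor(x) - 0.5) ** 3


def soft_clip(x: float, lo: float, hi: float, leak: float = 1e-3) -> float:
    """Assumed leaky clipping surrogate: identity inside [lo, hi], small slope
    outside, so out-of-range values still carry a gradient signal."""
    if x < lo:
        return lo + leak * (x - lo)
    if x > hi:
        return hi + leak * (x - hi)
    return x


def soft_quant_entry(base: float, quality: float) -> float:
    """Surrogate path mirroring hard_quant_entry step by step."""
    q = soft_clip(quality, 1.0, 100.0)
    scale = soft_floor(5000.0 / q) if q < 50.0 else 200.0 - 2.0 * q
    return soft_clip(soft_floor((base * scale + 50.0) / 100.0), 1.0, 255.0)


if __name__ == "__main__":
    # Compare both paths for a base table entry of 16 at three qualities.
    for quality in (1, 50, 100):
        hard = hard_quant_entry(16, quality)
        soft = soft_quant_entry(16.0, float(quality))
        print(f"quality={quality:3d}  hard={hard:3d}  soft={soft:.3f}")
```

At quality 1 the unclipped entry would be about 800, while the reference path saturates at 255. A surrogate without the clipping step would therefore quantize far more aggressively than real JPEG at low qualities. In an autograd framework, a straight-through variant would typically return the hard value in the forward pass and reuse the surrogate's gradient in the backward pass.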

Results

Here we show qualitative results of our differentiable JPEG approach compared to the non-differentiable reference (OpenCV) JPEG implementation. We also showcase results of the differentiable JPEG approach by Shin et al.

JPEG Quality = 1

JPEG Quality = 50

In Table 1, we provide quantitative results of our differentiable JPEG approach against other existing approaches. For more results and ablations, please refer to our paper, where we analyze both the forward and backward performance of our differentiable JPEG approach.

Table 1. Forward function performance summary & ablation. To ablate our approach, we gradually add our novel components to Shin. We also report the performance of other diff. approaches.

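The forward performance reported on this page is measured in PSNR between a differentiable JPEG output and the reference JPEG output. As a reminder of how that metric is computed, here is a minimal sketch, assuming 8-bit images that have been flattened into equal-length sequences of pixel values. The helper name `psnr` and its signature are illustrative and not part of our code base.

```python
import math


def psnr(reference, approximation, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(approximation):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - a) ** 2 for r, a in zip(reference, approximation)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    reference_pixels = [10, 20, 30, 40]
    approx_pixels = [11, 21, 31, 41]  # every pixel off by one gray level
    print(f"{psnr(reference_pixels, approx_pixels):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a gain of roughly 3 dB corresponds to about halving the squared error against the reference output.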

Presentation

Code

We offer a PyTorch implementation of our differentiable JPEG approach. Please refer to our GitHub repository. If you encounter issues or have questions about our implementation, please open a GitHub issue or reach out to us.

Citation

If you use our differentiable JPEG or find this research useful in your work, please cite our paper:

@inproceedings{Reich2024,
    title={{Differentiable JPEG: The Devil is in the Details}},
    author={Reich, Christoph and Debnath, Biplob and Patel, Deep and Chakradhar, Srimat},
    booktitle={{IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)}},
    year={2024}
}
