Evaluating JPEG, WebP, and AVIF for PDF Thumbnails

After making some improvements to the PDF thumbnails in DSpace 7.5 and DSpace 7.6, I wanted to explore some of the ideas I had highlighted for future work. One of the most promising improvements is switching from the JPEG image format to a more modern format, such as WebP, AVIF, or JPEG-XL.

In brief, JPEG is an older image format, while WebP, AVIF, and JPEG-XL are each successively newer and more advanced. These newer formats provide similar visual quality to JPEG but with smaller file sizes. However, the level of support for these formats varies, with only WebP and AVIF having gained broad support in client-side applications like web browsers.¹

Based on my evaluation of WebP and AVIF against JPEG using a representative sample of real PDFs from our institutional repository, I recommend that DSpace default to WebP as its image format for PDF thumbnails.

Note: This was originally published on GitHub Pages alongside the code and data for my mini “improving DSpace thumbnails” project.

Background and Methodology

More than a decade ago, Google conducted their second major WebP compression study. Although much time has passed, the methodology remains relevant: because “quality” settings between image formats are not consistent, we must use a visual quality metric to establish equivalent “quality” settings for each format before we can compare compression results accurately. For example, JPEG Q92 is not the same as WebP Q92 or AVIF Q92. The current state-of-the-art perceptual quality metric is ssimulacra2 (v2.1) by Jon Sneyers.

Using a sample of thirty-five PDFs, I generated thumbnails at quality settings 1 to 100 for JPEG, WebP, and AVIF with ImageMagick. For each source PDF, I generated a lossless reference (PNG) and calculated the ssimulacra2 score of each thumbnail with regard to that reference. These tasks were broken up into individual “jobs” and submitted to a local process queue called pueue. Needless to say, managing this workflow and the nearly 11,000 jobs became a significant part of the analysis!² Results were saved to CSV and plotted using gnuplot.
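To give a sense of how the job count grows, here is a minimal Python sketch that enumerates one ImageMagick command per PDF, format, and quality setting. It is not the script used in this analysis: the file names, the output directory, the 800x800 bounding box, and the choice of `-thumbnail` on the first page are all assumptions made for illustration, and the `magick` entry point assumes ImageMagick 7 (older installations use `convert`).

```python
from itertools import product
from pathlib import Path

FORMATS = ("jpg", "webp", "avif")
QUALITIES = range(1, 101)  # quality settings 1 to 100 inclusive


def thumbnail_jobs(pdfs, out_dir="thumbnails", size="800x800"):
    """Return one ImageMagick command (as an argument list) per PDF, format, and quality.

    out_dir and size are hypothetical defaults, not values taken from the analysis.
    """
    jobs = []
    for pdf, fmt, quality in product(pdfs, FORMATS, QUALITIES):
        out = Path(out_dir) / f"{Path(pdf).stem}-q{quality:03d}.{fmt}"
        jobs.append([
            "magick", f"{pdf}[0]",   # [0] selects the first page of the PDF
            "-thumbnail", size,      # fit inside the box, keeping the aspect ratio
            "-quality", str(quality),
            str(out),
        ])
    return jobs


if __name__ == "__main__":
    # Hypothetical file names standing in for the thirty-five sample PDFs.
    pdfs = [f"pdfs/sample-{i:02d}.pdf" for i in range(1, 36)]
    jobs = thumbnail_jobs(pdfs)
    print(len(jobs))  # 35 PDFs x 3 formats x 100 qualities = 10500
```

Each argument list could then be handed to a job queue such as pueue or run directly with `subprocess.run`. Scoring each thumbnail against its reference would be a further step that this sketch does not cover.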

Bits Per Pixel (BPP)

For each of the thirty-five PDFs, I calculated the average ssimulacra2 scores for quality settings 1 to 100 for each image format relative to a lossless reference. Then I plotted the ssimulacra2 scores against the average number of bits per pixel (BPP) needed by each format:
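For readers unfamiliar with the metric, bits per pixel is simply the encoded file size in bits divided by the number of pixels in the image. The sketch below shows the arithmetic in Python; the function names and the sample dimensions are illustrative assumptions, not taken from the project's scripts.

```python
def bits_per_pixel(file_size_bytes, width, height):
    """Bits per pixel: file size in bits divided by the pixel count."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return file_size_bytes * 8 / (width * height)


def mean_bpp(thumbnails):
    """Average BPP over an iterable of (file_size_bytes, width, height) tuples."""
    values = [bits_per_pixel(size, w, h) for size, w, h in thumbnails]
    if not values:
        raise ValueError("no thumbnails given")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Made-up example: a 40,000 byte thumbnail measuring 500 x 800 pixels.
    print(bits_per_pixel(40_000, 500, 800))  # 0.8
```

Normalizing by pixel count makes thumbnails of different dimensions comparable, which plain file size does not.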

From the plot above it is clear that both WebP and AVIF outperform JPEG in terms of BPP: they both need fewer bits to achieve the same perceptual quality as JPEG.

Another way to visualize the same is by plotting the average file size in bytes versus average ssimulacra2 scores:

Equivalent Quality Settings

We can also see that both WebP and AVIF have diminishing returns after achieving a ssimulacra2 score of around 80. According to ssimulacra2, scores can range between -infinity and 100, with 70 being “high quality” and 90 being “very high quality”. For the purpose of this evaluation I will assume that a score of 80 is high enough. In any case, it seems that WebP (in particular) and AVIF have trouble achieving “very high quality” scores, at least with this dataset:

gnuplot graph showing average ssimulacra2 scores versus quality for JPEG, WebP, and AVIF.

Using a ssimulacra2 score of 80 as the target perceptual quality gives the following equivalencies between “quality” settings for each format:

  • JPEG: Q89
  • WebP: Q86
  • AVIF: Q65
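One way to derive such equivalencies from the averaged scores is to pick, for each format, the lowest quality setting whose average ssimulacra2 score reaches the target. The Python sketch below illustrates that rule; the selection rule itself is an assumption (picking the setting closest to the target would be another reasonable choice), and the scores in the example are made up rather than taken from the results.

```python
def equivalent_quality(avg_scores, target=80.0):
    """Return the lowest quality setting whose average score reaches the target.

    avg_scores maps a quality setting (int) to an average ssimulacra2 score.
    Returns None if no setting reaches the target.
    """
    for quality in sorted(avg_scores):
        if avg_scores[quality] >= target:
            return quality
    return None


if __name__ == "__main__":
    # Made-up scores for a single hypothetical format, for illustration only.
    toy_scores = {10: 40.0, 50: 72.5, 60: 81.0, 90: 88.0}
    print(equivalent_quality(toy_scores))        # 60
    print(equivalent_quality(toy_scores, 95.0))  # None
```

Returning `None` when the target is never reached matters here, since some formats struggle to reach the highest score brackets at any setting.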

Summary of Findings

After establishing equivalent “quality” settings for JPEG, WebP, and AVIF for this dataset, we can see that WebP and AVIF require 33% and 46% fewer bits than JPEG, respectively, to achieve a ssimulacra2 score of 80. A score of 80 falls between the “high quality” and “very high quality” brackets on ssimulacra2’s scoring scale. We observed that WebP and AVIF have problems reaching “very high quality” scores, which may or may not be a problem depending on application requirements.
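The percentages above are relative savings: the difference between the JPEG figure and the other format's figure, divided by the JPEG figure, with both measured at the same perceptual quality. A short Python sketch of that arithmetic follows; the function name is hypothetical and the input values are placeholders, not measurements from this dataset.

```python
def relative_saving(baseline, candidate):
    """Percentage by which candidate is smaller than baseline (e.g. BPP or bytes)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - candidate) / baseline * 100


if __name__ == "__main__":
    # Placeholder values: a baseline of 2.0 BPP versus a candidate of 1.5 BPP.
    print(relative_saving(2.0, 1.5))  # 25.0
```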

Finally, while AVIF may be compelling technically, according to caniuse.com’s data about web browser share and feature implementation, its support is still too low to consider in practice. Many DSpace repositories are deployed in developing countries or corporate settings where users may not have the latest web browsers.

Thumbnail Gallery

The thumbnails below are 800 pixels on their longest side (usually the height) and are rendered at 400 pixels in CSS for crispness and ease of side-by-side comparison here. This is a selection of the thirty-five PDFs I used in the analysis. See the main thumbnail gallery for a larger number of interactive comparisons.

Note: As DSpace 7.5 uses the default ImageMagick quality setting of 92 for JPEG thumbnails, I use that for the comparison here. For the WebP thumbnails I use a quality setting of 86, which I found to be sufficient in my tests above.

The goal of this thumbnail gallery is not to perform an “apples to apples” comparison, but to show how WebP compares visually to existing DSpace JPEG thumbnails. This approach gives JPEG an advantage when it comes to visual quality and a disadvantage on file size.

10568/103447

10568/3149

10568/68624

10568/3030

More Information

The thumbnails in this gallery were generated by the src/create-thumbnails.sh script using PDF bitstreams from the CGSpace repository. The scripts to reproduce the results are available on GitHub.

Upstream Progress

Future Work

Future work may include:

  • Switch to libvips, which is faster, uses less memory, and handles things like pagination, CMYK, PDF CropBox, and more automatically.
  • Re-work the DSpace ImageMagick PDF filter to avoid generation loss. I proposed using MIFF to address this in the pull request on 2023-05-18 above.
  • Switch to AVIF or JPEG-XL when the browser support and tooling are more widespread.

¹ As of April 2023, WebP and AVIF are supported by 97.66% and 84% of web users, respectively, according to caniuse.com.

² To make matters more complicated, I performed a simultaneous analysis using libvips. The total number of jobs was over 20,000, which is well beyond what pueue was designed for, and which caused some scaling issues on my systems.