How to Measure Image Quality?

Francis Doumet


Oftentimes, we need to compare the quality of two or more images. This is true for web designers who want to provide a premium experience to their users, e-commerce businesses that want to showcase high product detail and quality, and mobile game developers who want to make sure their games look their best across various screen resolutions.

Given the variety of image compression tools available, how do we objectively compare the quality of each?

At first glance, it seems like it might be incredibly cumbersome to compare visual differences pixel by pixel across several bands (usually RGB). But it turns out there are a few tools that can provide an objective assessment of image quality.


MS-SSIM

Standing for Multi-Scale Structural Similarity for Image Quality Assessment, MS-SSIM's key insight is that the perceived visual quality of an image depends largely on the distance between a human observer and the image itself. It examines the intensity of light (known as luminance) and the contrast of the input images, and measures how closely both of these features vary together across the two images. This measurement is repeated at different scales of the input images to simulate increasing distance between the image and an observer. A weighted average of the measurements at the different scales then yields a final overall score for the similarity of the two images.
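To make the multi-scale idea concrete, here is a deliberately simplified sketch in Python. It is not the reference algorithm: it uses global (whole-image) statistics instead of the local sliding windows used in practice, three scales instead of the usual five, and illustrative weights, so all names and constants below are assumptions for demonstration only.

```python
import numpy as np

def ssim_global(x, y, data_range=255.0):
    """Simplified single-scale SSIM using whole-image statistics
    (the real algorithm averages SSIM over local sliding windows)."""
    c1 = (0.01 * data_range) ** 2  # stabilizing constants from the SSIM paper
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()          # luminance terms
    vx, vy = x.var(), y.var()            # contrast terms
    cov = ((x - mx) * (y - my)).mean()   # how the two images vary together
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

def ms_ssim_sketch(x, y, weights=(0.2, 0.3, 0.5)):
    """Weighted average of the simplified SSIM at successively halved
    scales; the halving simulates a more distant observer."""
    scores = []
    for _ in weights:
        scores.append(ssim_global(x, y))
        # Downsample both images by 2x2 block averaging.
        x = x[: x.shape[0] // 2 * 2, : x.shape[1] // 2 * 2].reshape(
            x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))
        y = y[: y.shape[0] // 2 * 2, : y.shape[1] // 2 * 2].reshape(
            y.shape[0] // 2, 2, y.shape[1] // 2, 2).mean(axis=(1, 3))
    return float(np.dot(weights, scores))
```

Identical images score 1.0 at every scale (and therefore overall), while a degraded copy scores below 1.0.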


PSNR

PSNR, or Peak Signal-to-Noise Ratio, is a pixel-by-pixel calculation that determines the differences between two input images. Derived from the mean squared error, PSNR is great at determining how close an image is to being an exact copy of another, but performs poorly as a measure of perceived quality. Two images might be visually very similar, yet have a low PSNR score if, for instance, there are slight color differences imperceptible to the human eye.
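Since PSNR follows directly from the mean squared error, it fits in a few lines of NumPy. This sketch assumes 8-bit images with a peak value of 255; the function name is our own:

```python
import numpy as np

def psnr(original, compressed, data_range=255.0):
    """Peak Signal-to-Noise Ratio in decibels, derived from the
    mean squared error between the two images."""
    diff = original.astype(np.float64) - compressed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # the images are exact copies
    return 10.0 * np.log10(data_range ** 2 / mse)
```

A uniform per-pixel error of 1 gives an MSE of 1, so the PSNR is simply 20·log10(255) ≈ 48.1 dB; exact copies have infinite PSNR.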


Butteraugli

Developed by Google, Butteraugli ignores visually imperceptible differences and outputs a score focused on the parts of the compared image that look the worst. It produces not only a quality metric, but also a heatmap highlighting the magnitude of the differences across the image.

[Figures: original image; heatmap of the compressed image]

The compute cost for these algorithms varies widely. Below are compute times we found when running on a 2.7 GHz Intel Core i7 processor with 16 GB RAM:

  • MS-SSIM: 23 seconds
  • Butteraugli: 1 minute, 36 seconds
  • PSNR: 2 seconds

So Butteraugli is roughly 4 times more expensive than MS-SSIM, which in turn is roughly 10 times more expensive than PSNR. Running on a GPU instead of a CPU reduces compute time by a further factor of about 10.
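The ratios quoted above follow directly from the measured times, as a quick sanity check shows:

```python
# Timings from the list above, in seconds (1 minute 36 seconds = 96 s).
timings = {"PSNR": 2, "MS-SSIM": 23, "Butteraugli": 96}

print(timings["Butteraugli"] / timings["MS-SSIM"])  # ≈ 4.2, i.e. "about 4 times"
print(timings["MS-SSIM"] / timings["PSNR"])         # 11.5, i.e. "around 10 times"
```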

We selected MS-SSIM to train our algorithms because of its superior ability to approximate the human visual system, as well as its relatively low compute cost.
