Adaptive streaming techniques for responsive images

Responsive images is a term in the Web world for the techniques an author uses to have a website rendered with the right image for the right client, given the client's viewing conditions (screen size, pixel density, network, …). From a research point of view, this falls into the broad category of adapting media to the user's context. At first sight, the problem seemed like a no-brainer to me, as it has been solved several times, including in the Web world for video streaming with recent adaptive streaming approaches such as DASH. Naively, I thought the same techniques could be reused. However, after attending some meetings, including this week's meetup, it appears that the environment constraints make the problem less simple to solve than it seems. In this post, I want to highlight the differences and give an example of how a DASH manifest could be used for responsive images (I'm not really proposing it, though).


A first obvious difference is that adaptive video streaming has a notion of time that images don't have. However, at each point in time in adaptive streaming you have to decide which video segment to choose, much as a browser has to decide which image variant to fetch. There are differences, but I think the comparison is interesting.

A second difference is the use of a manifest. For adaptive video streaming, a manifest describes a set of video streams and their segments. For responsive images, you cannot have one manifest per image, as that would require two requests per image (one for the manifest, one for the image). You could, however, have one manifest per page, included in the page to avoid the additional request. In some sense, this is similar to the picture element, but in my view the markup to be rendered and the markup describing resources should be kept separate.

A third difference is with respect to network behavior. Browsers typically use one TCP connection per image and download in parallel, alongside other downloads for scripts, CSS, etc., because pipelining apparently does not work so well for Web content. As a consequence, adapting to network bandwidth (enabled in DASH by providing segment sizes) is harder, because bandwidth estimation is harder.

A fourth difference is with regard to adaptation to DPI, aspect ratio, and art direction. In adaptive streaming, in particular in DASH, all resources sharing the same aspect ratio but different resolutions are grouped together in an Adaptation Set. Adaptation Sets then carry annotations describing how they differ or relate, and Subsets describe how they can be presented (e.g. whether they are mutually exclusive). For instance, a spatial descriptor may indicate how a (set of) video(s) corresponds to a sub-region of another (set of) video(s). In some sense, that is similar to art direction.


As an exercise, here is an example of how an adaptive streaming manifest (DASH in this case) could be used to address a responsive images example (taken from the Adobe page):
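The original markup from that page is not reproduced here; a hypothetical picture-style example of the same shape (the image URLs, breakpoints, and densities are made up for illustration) might look like:

```html
<picture>
  <!-- Wide viewports: large variants at 1x and 2x pixel density -->
  <source media="(min-width: 1024px)"
          srcset="hero-large.jpg 1x, hero-large-2x.jpg 2x">
  <!-- Medium viewports -->
  <source media="(min-width: 640px)"
          srcset="hero-medium.jpg 1x, hero-medium-2x.jpg 2x">
  <!-- Fallback: a cropped image for narrow viewports (art direction) -->
  <img src="hero-small.jpg" alt="Hero image">
</picture>
```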

An equivalent MPEG-DASH representation would be:
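A sketch of what such a manifest could look like follows. The element structure is that of a DASH MPD, but using it for still images is the exercise here: the contentType="image" value, URLs, sizes, and bandwidth figures are illustrative assumptions, not part of the DASH specification's defined use.

```xml
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static">
  <Period>
    <!-- One Adaptation Set per aspect ratio / crop (art direction) -->
    <AdaptationSet id="1" contentType="image" par="16:9">
      <!-- Representations differ in resolution, covering DPI variants -->
      <Representation id="large-2x" width="2048" height="1152" bandwidth="800000">
        <BaseURL>hero-large-2x.jpg</BaseURL>
      </Representation>
      <Representation id="large" width="1024" height="576" bandwidth="300000">
        <BaseURL>hero-large.jpg</BaseURL>
      </Representation>
    </AdaptationSet>
    <!-- A differently cropped version for narrow viewports -->
    <AdaptationSet id="2" contentType="image" par="1:1">
      <Representation id="small" width="320" height="320" bandwidth="50000">
        <BaseURL>hero-small.jpg</BaseURL>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
```

Grouping by aspect ratio into Adaptation Sets mirrors how the picture element separates art-directed crops into distinct source elements, while the Representations inside a set play the role of the 1x/2x srcset entries.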

This could be used like this:
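No browser mechanism exists for this; the following is purely hypothetical syntax, invented to illustrate the idea of one per-page manifest referenced by the rendering markup:

```html
<head>
  <!-- Hypothetical: one manifest per page, describing all image resources -->
  <link rel="image-manifest" href="images.mpd">
</head>
<body>
  <!-- Hypothetical: the img points at an Adaptation Set in the manifest;
       the browser would pick the best Representation for the current
       viewing conditions (viewport, pixel density, network) -->
  <img src="images.mpd#adaptationSet=1" alt="Hero image">
</body>
```

This keeps the resource description (the manifest) separate from the markup to be rendered, which is the separation argued for above.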
