Barcode

Barcode is an internal service/endpoint which gathers all of the headline images used in FT articles from a provided date range, the last 24 hours for example, and squashes them to give one condensed image. The final result looks similar to a coloured barcode.

Barcode of articles published during 2018-11-13, sorted by colour

How did it come about?

We were experimenting with ways to show an extremely summarised view of the news, aiming to give readers a speedier overview of recent events. During the rapid prototyping phase, one idea was to display all of the images from the last 24 hours' worth of news at the same time.

This gave us a large number of images which we tried to display, equally, on the screen at the same time.

Extreme summarisation headline image grid view

It was pleasing to look at; however, its ability to show a meaningful summary of the news is debatable.

While building the above, and looking for image summarisation inspiration, we came across an image created from every single frame of the Harry Potter movie series.

All the Harry Potter movie frames in one image https://www.reddit.com/r/dataisbeautiful/comments/194gmp/all_the_harry_potter_frames_from_all_the_movies/

Even though you cannot see the contents of each individual frame, you can see the movies getting darker with each film and the (spoilers) white train station scene at the end of the series.

So we made a version of this for FT news and called it Stretched Images:

24 hours of FT headline images displayed using HTML, CSS & JS

Similar to the gridded view, it doesn't give a great view of what is actually happening in the news. We can, however, see the variety of images used by articles, and it does create a pleasing image.

This was as far as the project went under the extreme summarisation banner.

However…we were not done yet.

Phase 2

Stretched Images worked on a hardcoded set of parameters and was quite fiddly to update or change. So, in what started as a 20% time project, we rebuilt Stretched Images into project Barcode - a parameterised endpoint we could use to explore FT News images.

The first task was to actually turn a stack of images into one single image, as Stretched Images only simulated this effect by displaying a series of stretched images one after the other. So we turned to Node, Sharp & GraphicsMagick: Node handles the backend image requests and operations, Sharp handles resizing and optimisation of images, and GraphicsMagick stitches a series of image files into one image file.

Barcode works in a series of simple steps:

  • Accept a series of parameters
  • Request images that match the parameters from FT's Origami Image Service
  • Combine all selected images into a single image
  • Save and return that image
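
Before the images can be combined, something has to decide where each one lands in the final image. The following is a minimal sketch of that layout calculation, not Barcode's actual code. It assumes every image gets an equal share of the canvas and that "vertical" means vertical strips placed side by side. The names Strip and layoutStrips are hypothetical.

```typescript
// Hypothetical names throughout: a sketch of one way to lay out strips.
type Orientation = "horizontal" | "vertical";

interface Strip {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Divide a width x height canvas into `count` strips.
// Assumption: "vertical" = vertical strips side by side (left to right),
// "horizontal" = horizontal strips stacked top to bottom.
function layoutStrips(
  count: number,
  width: number,
  height: number,
  orientation: Orientation
): Strip[] {
  if (count <= 0) return [];
  const total = orientation === "vertical" ? width : height;
  const strips: Strip[] = [];
  for (let i = 0; i < count; i++) {
    // Rounded cumulative boundaries make the strips tile the canvas exactly,
    // so no pixel column or row is left uncovered. If count exceeds the
    // canvas size in pixels, some strips end up zero pixels thick.
    const start = Math.round((i * total) / count);
    const end = Math.round(((i + 1) * total) / count);
    strips.push(
      orientation === "vertical"
        ? { left: start, top: 0, width: end - start, height }
        : { left: 0, top: start, width, height: end - start }
    );
  }
  return strips;
}
```

Each resulting rectangle could then be handed to whichever image library does the resizing and stitching.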

We wanted experimenting with the display of the images to be simpler than it was with Stretched Images, so we created a series of parameters to explore barcode images. Some of the initial parameters we added were quite basic:

  • width - width of final image
  • height - height of final image
  • dateFrom - date to start image selection from
  • dateTo - date to end image selection on

Then we started to come up with some more exploratory features we could mix-n-match for different results:

  • orientation - stack images horizontally or vertically
  • fit - gets images as masks, squashed or solid colour
  • order - display order of images (published/colour)
  • sort - sort order by asc or desc
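
To make the parameters concrete, here is a rough sketch of how a query string might be turned into a validated set of options. The parameter names come from the lists above. The accepted values for fit, all of the defaults, the YYYY-MM-DD date format, and the helper names (BarcodeParams, parseParams, oneOf, positiveInt) are assumptions made for illustration.

```typescript
// Parameter names are from the post; values and defaults are assumptions.
interface BarcodeParams {
  width: number;
  height: number;
  dateFrom: string; // assumed format: YYYY-MM-DD
  dateTo: string;   // assumed format: YYYY-MM-DD
  orientation: "horizontal" | "vertical";
  fit: "mask" | "squash" | "solid"; // hypothetical value names
  order: "published" | "colour";
  sort: "asc" | "desc";
}

// Return `value` if it is one of the allowed options, otherwise the fallback.
function oneOf<T extends string>(
  value: string | undefined,
  allowed: readonly T[],
  fallback: T
): T {
  return allowed.indexOf(value as T) !== -1 ? (value as T) : fallback;
}

// Parse a positive integer, falling back when the value is missing or invalid.
function positiveInt(value: string | undefined, fallback: number): number {
  const n = Number(value);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

function parseParams(query: Record<string, string | undefined>): BarcodeParams {
  const datePattern = /^\d{4}-\d{2}-\d{2}$/;
  const dateFrom = query.dateFrom;
  const dateTo = query.dateTo;
  if (!dateFrom || !datePattern.test(dateFrom) || !dateTo || !datePattern.test(dateTo)) {
    throw new Error("dateFrom and dateTo are required in YYYY-MM-DD format");
  }
  return {
    width: positiveInt(query.width, 1024),  // hypothetical default
    height: positiveInt(query.height, 200), // hypothetical default
    dateFrom,
    dateTo,
    orientation: oneOf(query.orientation, ["horizontal", "vertical"] as const, "vertical"),
    fit: oneOf(query.fit, ["mask", "squash", "solid"] as const, "squash"),
    order: oneOf(query.order, ["published", "colour"] as const, "published"),
    sort: oneOf(query.sort, ["asc", "desc"] as const, "asc"),
  };
}
```

Falling back to a default for an unrecognised value, rather than rejecting the request, keeps exploratory mix-n-match queries forgiving. A stricter service might prefer to return an error instead.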

We used rgb-hex, Color Sort & Image Average Color to help with the colour conversion, organising and sorting.
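
As a rough illustration of what these three jobs involve, the sketch below uses hand-written helpers rather than the libraries named above, whose APIs are not shown here. It assumes pixels arrive as [r, g, b] triples of integers from 0 to 255. Sorting by hue is just one possible ordering, chosen for the example, and is not necessarily what Color Sort does.

```typescript
// Hand-written helpers for illustration; not the APIs of the named libraries.
type RGB = [number, number, number];

// Convert an RGB triple (integers 0-255) to a hex string such as "#4183c4".
function rgbToHex([r, g, b]: RGB): string {
  return "#" + [r, g, b].map((c) => ("0" + c.toString(16)).slice(-2)).join("");
}

// Average colour of a set of pixels: the per-channel mean, rounded.
function averageColour(pixels: RGB[]): RGB {
  if (pixels.length === 0) throw new Error("no pixels to average");
  let r = 0;
  let g = 0;
  let b = 0;
  for (const [pr, pg, pb] of pixels) {
    r += pr;
    g += pg;
    b += pb;
  }
  const n = pixels.length;
  return [Math.round(r / n), Math.round(g / n), Math.round(b / n)];
}

// Hue in degrees (0-360), using the standard RGB-to-HSL hue formula.
function hue([r, g, b]: RGB): number {
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const delta = max - min;
  if (delta === 0) return 0; // greys have no hue; treat them as 0
  let h: number;
  if (max === r) h = ((g - b) / delta) % 6;
  else if (max === g) h = (b - r) / delta + 2;
  else h = (r - g) / delta + 4;
  return (h * 60 + 360) % 360;
}

// Sort a copy of the colours by hue, leaving the input untouched.
function sortByHue(colours: RGB[]): RGB[] {
  return colours.slice().sort((a, b) => hue(a) - hue(b));
}

// Why a plain channel sum is a poor sort key: very different colours collide.
function channelSum([r, g, b]: RGB): number {
  return r + g + b;
}
```

Pure red, green and blue all have a channel sum of 255, so a sum-based sort cannot tell them apart, while their hues (0, 120 and 240 degrees) separate them cleanly.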

Finally we created a service you could query and get some rather interesting images back.

Horizontal block 'average image colour' lines, sorted by colour
Vertical masked lines, sorted by published date
Vertical stretched images, sorted by published date

Lessons learned

A couple of times during the project we planned to implement a feature thinking “this will be quite simple to add”, and quickly realised it was more complex than we thought, though not impossible.

  • Creating stretched images - replicating CSS background-size: cover with an image library turned out to be an expensive operation. Thankfully the Origami team gave us a big hand and added an option to their image service we could use
  • Image generation speeds - resizing a series of images can be quite process-intensive. We had to try out a few libraries before settling on Sharp
  • Extract pixel colour from an image? Easy...right? - not so much. We thought we could do this without a library, gave up and used Image Average Color instead
  • Sort by colour? Easy...right? - again, not so much. The initial thought was to add up the RGB values and sort by the resulting sum; however, the result didn't look sorted at all. Color Sort to the rescue
  • Generation timeouts - some images take longer than average to generate, resulting in test server timeouts. We added optimisations to improve the speed of the process
  • Concurrency issues - Our test server refreshes every 24 hours, wiping any generated/cached images. When the server restarts it is vulnerable to a race condition where we could end up with 2 or more items being processed concurrently, as multiple images are requested (with no available cache). We're currently looking into using a mutex flag to restrict access to the queue
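
For the concurrency issue, one possible shape for a "mutex flag" on a queue is sketched below. This is an illustration of the general idea under stated assumptions, not the fix the team settled on. The names SerialQueue and Job are hypothetical, and it assumes each job signals completion by calling a done callback.

```typescript
// Hypothetical sketch: a queue guarded by a busy flag so only one job runs at a time.
type Job = (done: () => void) => void;

class SerialQueue {
  private busy = false; // the "mutex flag"
  private pending: Job[] = [];

  // Add a job; it starts immediately only if nothing else is running.
  push(job: Job): void {
    this.pending.push(job);
    this.next();
  }

  private next(): void {
    if (this.busy) return; // another job holds the flag
    const job = this.pending.shift();
    if (!job) return; // nothing left to do
    this.busy = true;
    let finished = false;
    job(() => {
      if (finished) return; // ignore repeated calls to done()
      finished = true;
      this.busy = false; // release the flag, then start the next job
      this.next();
    });
  }
}
```

With this shape, several image requests arriving at once after a restart would be generated one after the other rather than concurrently. Note that a job which never calls done would block the queue, so a real service would likely want a timeout as well.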

Next steps/future creep

Throughout the creation of Barcode we were pleasantly plagued by new ideas and features for the service. This is commonly known as feature creep. While quite a few of these ideas made it into the project, there are some that we'd still like to explore:

  • Add a greater variety of colour sorting algorithms
  • Add the option to search for images based on other tags, such as genre
  • Use a facial recognition service to identify images of a given person and return a barcode/mosaic of that person
  • Post daily images to a social media thread, "Barcode of the day"
  • Generate an animation (gif or video) of a series of Barcode images, e.g. a month of Barcode images

We've also had a few suggestions from other teams that would be interesting to look into.

So perhaps we’re not quite finished with Barcode yet...