Everyone’s heard of six degrees of separation, often through popular derivatives like the Bacon number or the Erdős number. The FT links people together all the time by writing about them in the same article. What can we do with that?
We set out to find some interesting use cases for so-called ‘co-occurrence data’: instances of two or more people being tagged in the same article. Working with Mattijas Larsson from our product team, we came up with two ideas that we explored in more detail.
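To make the idea of co-occurrence concrete, here is a minimal sketch of counting how often each pair of people is tagged in the same article. The input shape (a list of name lists) and the function name `count_cooccurrences` are assumptions for illustration, not the format of the FT's metadata.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(articles):
    """Count how often each pair of people is tagged in the same article.

    `articles` is an iterable of collections of person names
    (a hypothetical input shape, chosen for this sketch).
    """
    pair_counts = Counter()
    for people in articles:
        # Sort and de-duplicate so (A, B) and (B, A) share one key.
        for pair in combinations(sorted(set(people)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Made-up sample data: each inner list is the people tagged in one article.
articles = [
    ["Angela Merkel", "Tony Blair"],
    ["Tony Blair", "Russell Brand"],
    ["Angela Merkel", "Tony Blair", "Ban Ki-moon"],
]
counts = count_cooccurrences(articles)
print(counts[("Angela Merkel", "Tony Blair")])  # 2
```

The pair counts double as a rough measure of how strong a link is: the more articles two people share, the less tenuous the connection.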
We called our project ‘Six degrees of Angela Merkel’, because everyone featured in the FT seems to be connected to her somehow. Our first prototype is a route-finding engine that plots a path through our content to find the strongest link between two individuals.
For example, we can link Taylor Swift to Pope Benedict via this chain:
- Taylor Swift
- Jay-Z
- Russell Brand
- Tony Blair
- Ban Ki-moon
- Pope Francis
- Pope Benedict
Looking at the links is fascinating. Swift and Jay-Z have both been associated with attempts to change the way digital music is sold: Swift most notably by pulling her tracks from Spotify, and Jay-Z through his troubled music streaming service Tidal. The link between Jay-Z and Russell Brand is shakier: both have been customers of the jeweller Stephen Webster, among other relatively tenuous connections.
Brand and Blair are best buds in comparison, appearing together in plenty of content, though their relationship is not exactly a collaboration! Blair and Ban Ki-moon have both long been associated with the peace process in the Middle East. Pope Francis weighed in on climate change, which connected him to the UN secretary general. Finally, the two popes are frequently mentioned together in articles about Vatican policy on subjects such as gay rights, feminism and the EU.
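A chain like this can be found by treating people as nodes in a graph and co-occurrences as edges. The sketch below uses a plain breadth-first search, which returns the chain with the fewest hops and ignores how strong each link is. That is a simplifying assumption: how the prototype actually ranks routes is not described here. The helper names `build_graph` and `shortest_chain` are hypothetical.

```python
from collections import deque


def build_graph(pairs):
    """Turn (person, person) pairs into an undirected adjacency map."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_chain(graph, start, goal):
    """Return the fewest-hops chain from start to goal, or None if unlinked."""
    if start == goal:
        return [start]
    previous = {start: None}  # who we reached each person from
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for neighbour in graph.get(person, ()):
            if neighbour in previous:
                continue  # already reached by an equal or shorter route
            previous[neighbour] = person
            if neighbour == goal:
                # Walk back from the goal to the start, then reverse.
                chain = [goal]
                while previous[chain[-1]] is not None:
                    chain.append(previous[chain[-1]])
                return chain[::-1]
            queue.append(neighbour)
    return None


# The links discussed above, written out as co-occurring pairs.
links = [
    ("Taylor Swift", "Jay-Z"),
    ("Jay-Z", "Russell Brand"),
    ("Russell Brand", "Tony Blair"),
    ("Tony Blair", "Ban Ki-moon"),
    ("Ban Ki-moon", "Pope Francis"),
    ("Pope Francis", "Pope Benedict"),
]
graph = build_graph(links)
chain = shortest_chain(graph, "Taylor Swift", "Pope Benedict")
print(" -> ".join(chain))
```

To favour strong links over short ones, the same graph could carry edge weights derived from co-occurrence counts and be searched with a weighted algorithm such as Dijkstra's instead.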
The longest chain we’ve discovered (where there was no shorter route available linking the two individuals) is 13 links long, though it changes all the time as the application uses a rolling window of the last 100 days of metadata.
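The rolling window is why chains come and go: an article drops out of the graph once it is more than 100 days old. A minimal sketch of that filter follows, assuming each article is a dictionary with a `published` date. Only the 100-day figure comes from the project; the field name and function name are illustrative.

```python
from datetime import date, timedelta

WINDOW_DAYS = 100  # the rolling window described above


def within_window(articles, today, window_days=WINDOW_DAYS):
    """Keep only the articles published in the last `window_days` days."""
    cutoff = today - timedelta(days=window_days)
    # The cutoff day itself is excluded, so exactly `window_days`
    # calendar days (including today) remain in the window.
    return [a for a in articles if cutoff < a["published"] <= today]


# Made-up sample data.
today = date(2015, 7, 1)
articles = [
    {"title": "recent", "published": today - timedelta(days=3)},
    {"title": "edge of window", "published": today - timedelta(days=99)},
    {"title": "just expired", "published": today - timedelta(days=100)},
]
current = within_window(articles, today)
print([a["title"] for a in current])
```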
Person in the news
Our second project moves through time one day at a time, showing who we wrote about most on that day, and who they are connected to. Built with the D3.js graphing library, the visualisation shows how the news agenda changes: connections are forged, profiles are raised, and when events pass, people fade from the headlines.
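The data behind each frame of such an animation could be prepared along these lines: for a given day, count how many articles tag each person, keep the most-mentioned few, and list who they appeared alongside. This is a sketch under assumed names (`daily_snapshot`, the `published` and `people` fields, the `top_n` cut-off). The visualisation itself uses D3.js, which is not shown here.

```python
from collections import Counter, defaultdict
from datetime import date


def daily_snapshot(articles, day, top_n=3):
    """Return the most-tagged people on `day` and who they appeared with."""
    mentions = Counter()
    connections = defaultdict(set)
    for article in articles:
        if article["published"] != day:
            continue
        people = set(article["people"])
        mentions.update(people)  # one mention per person per article
        for person in people:
            connections[person] |= people - {person}
    top = [name for name, _ in mentions.most_common(top_n)]
    return {name: sorted(connections[name]) for name in top}


# Made-up sample data for a single day.
day = date(2015, 7, 1)
articles = [
    {"published": day, "people": ["Angela Merkel", "Tony Blair"]},
    {"published": day, "people": ["Angela Merkel", "Ban Ki-moon"]},
    {"published": day, "people": ["Angela Merkel"]},
    {"published": date(2015, 6, 30), "people": ["Taylor Swift", "Jay-Z"]},
]
snapshot = daily_snapshot(articles, day, top_n=1)
print(snapshot)
```

Calling this once per day across the window gives a sequence of small graphs, which is the kind of input an animated node-and-link layout needs.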
This is intended as a non-interactive exhibition piece, to be displayed on large screens around the FT building.
We started with a large set of potentially interesting but under-used co-occurrence metadata, readily gleaned from recent site articles and covering an assortment of ontologies such as people, organisations and regions. From that, we settled on two very different ways of viewing the data.
Merkel Chain emphasises the articles themselves, and the possibility of stepping from one person in the news to (almost) any other person in the news. Part serendipity, part research, this is an engaging tool, combining the initial jolt of interest when the chain is unveiled with the articles that explain each link in the chain, inviting the user to dive into the site to read all about it.
The animated graph of ‘Person in the news’ emphasises the dynamic ebb and flow of connections between the leading players in the news. It is not surprising that they are connected, but it is interesting how the connections change, and that not all of the main players are actually connected.
This project lasted a mere two and a half weeks, and it just hints at the rich seam of metadata we have and the multifarious possibilities arising from it. Factor in the new, richer and massively detailed structured ontologies coming along, and the number of possible use cases explodes.