BBC newsHack: Multilingual newsrooms

Last month, we took part in another BBC newsHack, focused on multilingual tools. Having already experimented with automated article translations earlier this year, we thought it would be a good opportunity to expand our exploration of the topic. Along with a cross-disciplinary team from various departments of the FT, and some new acquaintances made on the day, we set out to bring new translation ideas to life.

Four fine folk from from three different @FT teams have combined to tackle the 'Tools for multilingual newsrooms' #newshack organised by @bbc_connected & @BBC_News_Labs:

@lily_2point0, @jamiebrownco, @joannaskao, @MHA_076

(https://t.co/lUkLbB9O3e)
— FT Labs (@FTLabs) June 11, 2019

Forming an idea

Early ideas for the hack on Post-It notes

Exercises at the beginning of the first day proved really useful in refining our hack. We started with the POINT (Problem, Opportunity, Insight, Need, Theme) system, then formulated ideas starting with “how might we”, carried on with dot voting and a rapid generation of 8 ideas, and finally settled on one after another round of voting.

After some refinement and regrouping of ideas, the team came up with 8 propositions:

Connecting readers via voice devices
Translating article comments
Picture drawing about article/concept. “Picture is worth 1000 words”
Highlight text and ask group for translation
Side-by-side thread of a conversation conducted in different languages
Emoji translation -- Universal language?
Linking phrases/idioms that have a very local meaning to a source of explanation automatically
Filtering

Fairly soon, the propositions started converging towards a core idea: creating a community around languages, an ideal way to break language barriers by discussing and correcting machine translations.

MetaCrowd

What is it?

MetaCrowd is a commenting system specifically focused on translated content. Readers can highlight poor or unclear translations in articles. The crowd (readers and journalists alike) can respond, comment and discuss, improving readability. A voting feature identifies the most recommended and useful translation for a phrase/keyword, which can then be used to update the translation for that specific article.
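To make the voting idea more concrete, here is a minimal sketch in Python of how the top-voted suggestion for a phrase might be picked and applied to a translated article. It is not the code from the hack: the `Suggestion` class, the `best_suggestions` and `apply_suggestions` helpers, and the "upvotes minus downvotes" scoring rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """A crowd-proposed replacement for a highlighted phrase (hypothetical model)."""
    phrase: str        # highlighted phrase in the machine translation
    replacement: str   # translation proposed by a reader or journalist
    upvotes: int = 0
    downvotes: int = 0

    @property
    def score(self) -> int:
        # Assumed scoring rule: net votes.
        return self.upvotes - self.downvotes


def best_suggestions(suggestions, min_score=1):
    """Return the highest-scoring replacement for each highlighted phrase.

    Suggestions scoring below min_score are ignored; on a tie, the
    suggestion seen first wins.
    """
    best = {}
    for s in suggestions:
        if s.score < min_score:
            continue
        current = best.get(s.phrase)
        if current is None or s.score > current.score:
            best[s.phrase] = s
    return {phrase: s.replacement for phrase, s in best.items()}


def apply_suggestions(translated_text, suggestions, min_score=1):
    """Replace each highlighted phrase with its winning suggestion."""
    winners = best_suggestions(suggestions, min_score)
    for phrase, replacement in winners.items():
        translated_text = translated_text.replace(phrase, replacement)
    return translated_text
```

For example, if a machine translation renders an English idiom literally as "il pleut des chats et des chiens" and the crowd votes "il pleut des cordes" to the top, `apply_suggestions` would swap in the idiomatic version for that article only. A real system would also need to track where in the article the phrase was highlighted, rather than replacing every occurrence.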

Demo

Who is it for?

The tool is primarily for non-native speakers of English who read the news and might use machine translation services to get clarification on sticky points (idioms, mistranslations, etc.). It also relies on a community of multilingual people/journalists using the same platform to pitch in and answer queries. The main user flow can be seen below:

1. French speaker translates English article and seeks clarity on a specific phrase

2. Native English speaker who is also fluent in French responds to query

3. French speaker views response to query, and provides a rating

Technical challenges

Putting a working prototype together in a couple of days came with its challenges. While we had the basic translation framework in place, our efforts were focused on setting up yet another comments system. We are conscious of all the (multilingual) moderation issues that might entail and, for the sake of the prototype, assumed reporting systems were in place. Much more work would be needed in that area for a client-facing product.

Synchronising text highlights proved to be a bit of a hack. Since translations are presented side-by-side with the original text, we thought it would be useful to highlight both sides when text gets selected. We chose text that suited the example, but a word-for-word match is impossible in most cases. There is no cost-effective way to match each word with its translation, as it would require repeated calls to the translation API, although NLP might be an option.
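One cheaper approximation, which is not what the prototype did, would be to synchronise highlights at sentence level rather than word level. The Python sketch below assumes the translation keeps the same number of sentences in the same order as the original, which will not always hold, and uses a deliberately naive punctuation-based sentence splitter. The function names `sentence_spans` and `matching_highlight` are hypothetical.

```python
import re

# Naive splitter: a sentence ends at ., ! or ? followed by whitespace.
# This is an assumption for illustration; abbreviations will break it.
_SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def sentence_spans(text):
    """Return (start, end) character offsets for each sentence in text."""
    spans = []
    start = 0
    for match in _SENTENCE_BREAK.finditer(text):
        spans.append((start, match.start()))
        start = match.end()
    if start < len(text):
        spans.append((start, len(text)))
    return spans


def matching_highlight(original, translation, selection_start):
    """Map a selection offset in the original to a span in the translation.

    Returns the (start, end) offsets of the sentence to highlight in the
    translation, or None when the two texts cannot be aligned by sentence
    position or the offset falls outside every sentence.
    """
    source = sentence_spans(original)
    target = sentence_spans(translation)
    if len(source) != len(target):
        return None  # Assumption violated: sentence counts differ.
    for index, (start, end) in enumerate(source):
        if start <= selection_start < end:
            return target[index]
    return None
```

This needs no extra calls to a translation API, at the cost of highlighting a whole sentence instead of the exact phrase the reader selected.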

Another challenge was the organisation of the team itself. With participants from outside the FT, we couldn’t share access to our systems, and therefore had to find the right balance of tasks to share. It seems to have worked out quite well in the end.

With a polished presentation which finished *within* the allotted 4 mins, the Metacrowd team included a working demo, and started by getting right to the point... pic.twitter.com/jqBqhzVozn
— FT Labs (@FTLabs) June 13, 2019