3 Tips for Captioning & Subtitling Languages with Accented Characters

Diacritic marks – often called accents – are a feature of languages as diverse as Vietnamese, Hindi, Thai and French. Despite their prevalence, however, they can pose challenges for captioning and subtitling projects. So what do multimedia localization professionals need to know to avoid issues with accented characters in their burned-in and text subtitle deliveries?

This post will list three tips to help you when captioning & subtitling languages with accented characters.

[Average read time: 3 minutes]

What exactly are diacritic marks, or accents?

They are marks added to a letter or glyph, usually to change the way it’s pronounced. For example, in French, the cédille diacritic mark turns the letter c into a ç, which is pronounced as a soft “ess” before the vowels a, o and u – as in the word garçon, which is pronounced “gahr-ssohn.” They can also differentiate words that would otherwise be spelled the same – for example, “como” and “cómo” in Spanish, which translate to “as” and “how” respectively. There are various other uses, but in general a diacritic mark changes the properties of the letter or word to which it’s added.

Why are they a challenge for captioning & subtitling localization?

The main reason is that English makes almost no use of diacritic marks. Loan words like café and résumé have them, but that’s about it, making English an anomaly among European languages. At the same time, much of the post-production captioning and subtitling technology in use today was initially developed in the US, and early versions of it supported only the very limited English-language character set.

This has been rectified for the most part with the advent of Unicode and increased language support, including locale-specific keyboards and robust font sets. But languages with diacritics can still trip up multimedia localization professionals on captioning and subtitling projects, especially when the source language is English. Here’s what you need to know to avoid issues with these languages.

1. Be careful with captioning & subtitling font support.

Not all fonts are created equal – certainly not in terms of character support. In fact, most fonts cover only a small part of the Unicode character set. Moreover, fonts that support a specific script or language don’t always support the full complement of diacritic marks. This is a common issue in Hindi subtitling, for example – some fonts created to support the Devanagari script have issues with letters that have multiple diacritic marks added to them. The same is true for other languages that use the Devanagari script, like Marathi and Nepali. Make sure that any fonts you use support not just the character set of your language, but also all the configurations of diacritic marks it may employ.
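
A quick automated check can catch the most basic font gaps before a subtitle file goes to burn-in. The Python sketch below is only an illustration: the function name and the `supported` set are hypothetical, and you would have to fill that set from your actual font (a library such as fontTools can read a font's character map). It also only tests whether each character has a glyph at all. It can't tell you whether stacked or combined marks render correctly, so a visual check by a native speaker is still needed.

```python
import unicodedata

def missing_from_font(subtitle_text, supported):
    """List the characters in subtitle_text that are not in `supported`.

    `supported` is a set of Unicode code points (integers) that the font
    is assumed to cover. It is a stand-in for real font data.
    """
    needed = {ord(ch) for ch in subtitle_text if not ch.isspace()}
    missing = sorted(needed - supported)
    return [f"U+{cp:04X} {unicodedata.name(chr(cp), 'UNNAMED')}" for cp in missing]

# Hypothetical font that only covers printable ASCII.
ascii_only = set(range(0x20, 0x7F))

print(missing_from_font("gar\u00e7on", ascii_only))
# ['U+00E7 LATIN SMALL LETTER C WITH CEDILLA']
```

Running the same check on a Hindi line would list every Devanagari letter and vowel sign the font lacks, which makes it obvious when a font is the wrong choice for the language.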

For more information, see our previous post What You Need to Know for Captioning & Subtitling in Indian Languages.

2. Make sure the leading is right for your language.

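Most diacritic marks are placed above or below letters, so they can collide with the line of text above or below them. That means that languages that use diacritics generally require more “leading,” or space between lines. For example, leading that leaves an English subtitle enough space between lines can be too tight for the corresponding Spanish subtitle.

[Image: an English subtitle and the corresponding Spanish subtitle set with the same leading]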

This issue is even more pronounced when subtitling languages that “stack” multiple diacritic marks on letters, as well as languages that place diacritic marks both above and below letters, like Arabic. Arabic subtitling, for example, requires what can seem to an English speaker like an unusually large amount of leading just to avoid diacritic marks overlapping.

3. Be extra careful when implementing minor linguistic tweaks.

Because accents get added to letters, it can be very easy to delete them without realizing it, especially when English speakers make minor linguistic tweaks based on QA or in-country reviews. They can delete a letter but not all its accents, or vice versa, or even copy and paste strings without their full set of diacritics – particularly easy to do in right-to-left languages like Arabic and Hebrew. In fact, this is a common issue for video localization in general, affecting on-screen title replacement and e-Learning course integration. Be extra careful if tweaks have to be made directly in subtitle files by non-native speakers.
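
If your subtitle files are plain text, a simple before-and-after comparison can flag this kind of accidental change. The Python sketch below is a rough illustration, not a complete tool: the function names are hypothetical, and it assumes you have both the approved line and the tweaked line as strings. It flags any word whose base letters are the same but whose diacritic marks differ, so a native speaker can confirm whether the change was intended.

```python
import unicodedata

def strip_marks(text):
    """Return text with all combining marks removed (base letters only)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed
                   if not unicodedata.category(ch).startswith("M"))

def diacritic_warnings(before, after):
    """Return (old, new) pairs that differ only in their diacritic marks."""
    old_words, new_words = before.split(), after.split()
    if len(old_words) != len(new_words):
        # Word count changed, so compare the whole lines instead.
        old_words, new_words = [before], [after]
    return [(old, new) for old, new in zip(old_words, new_words)
            if old != new and strip_marks(old) == strip_marks(new)]

print(diacritic_warnings("No sé cómo está", "No sé como esta"))
# [('cómo', 'como'), ('está', 'esta')]
```

A flagged pair isn't necessarily an error, since a reviewer may have corrected an accent on purpose. Treat the output as a list of spots for a linguist to double-check.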

Rigorous captioning & subtitling QA is critical

Finally, it’s absolutely essential to implement a QA review process when captioning and subtitling into languages with heavy accenting. It’s the only way to ensure that no character issues creep in from human error during translation/review, or template issues during implementation, or even from the specs in deliverable text files. This is best practice for all audio and video localization, of course, and something that JBI Studios provides on every single subtitling project, as well as on voice-over and dubbing productions. Make sure it’s part of all your productions as well.
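
Part of that QA can be automated for text deliverables. The Python sketch below assumes your spec calls for UTF-8 files, which is an assumption here, so check your own delivery requirements. The function names are hypothetical, and the garbled-text test is only a heuristic: it looks for lines that turn back into valid text when the common UTF-8-read-as-Windows-1252 mix-up is reversed. It complements a human review and doesn't replace it.

```python
def looks_like_mojibake(line):
    """Heuristic: True if the line looks like UTF-8 mis-read as Windows-1252."""
    if line.isascii():
        return False
    try:
        line.encode("cp1252").decode("utf-8")
    except UnicodeError:
        return False  # the round trip failed, so the text is probably fine
    return True

def encoding_problems(raw):
    """Return a list of problems found in a subtitle file's raw bytes."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return [f"not valid UTF-8 at byte {err.start}"]
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if "\ufffd" in line:
            problems.append(f"line {number}: replacement character U+FFFD")
        if looks_like_mojibake(line):
            problems.append(f"line {number}: probable garbled accents")
    return problems

good = "garçon".encode("utf-8")
garbled = good.decode("cp1252").encode("utf-8")  # simulates a wrong-encoding save
print(encoding_problems(good))     # []
print(encoding_problems(garbled))  # ['line 1: probable garbled accents']
```

To check a real file, read it in binary mode and pass the bytes to `encoding_problems`.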