Time-stamping, time-coding, and spotting are all crucial parts of audio and video workflows, especially for captioning and subtitling services and translation. The terms also get confused with each other, not least because they refer to similar processes – with critical differences. Knowing these differences, and their application, is crucial for the success of video localization projects.
This post will explain time-stamping, time-coding and spotting, how they’re used, and what you need to know to source the correct one for your project.
[Average read time: 5 minutes]
In general, all three processes involve one thing – adding some kind of timing information to a transcription of the voice-over or other sound elements in audio or video files. What kind of information is added, as well as how it’s added, is what makes them different.
If you’re unfamiliar with time-codes in general, you should check out our previous post, What are time-codes? Do they matter? (Yes!) – it’s good reading before delving into this post, in particular if you don’t understand time-code formats, which will be discussed at length.
This refers to the process of adding timing markers – also known as time-stamps – to a transcription. The time-stamps can be added at regular intervals, or when certain events happen in the audio or video file. Usually the time-stamps just contain minutes and seconds, though they can sometimes contain frames or milliseconds as well.
It’s easier to understand if you see it. The following example is a transcription from an interview video. Note that there are red time-stamps indicating when the speaker changes, at 12, 18 and 24 seconds:
These time-stamps are event-based, meaning that they are inserted every time a specific event happens in the video. In this case that event is a “change of speaker,” but it can be many things, like time-stamping every question in an interview, or time-stamping when a specific person appears in the footage, or when someone speaks in a foreign language, etc. Time-stamps can also be in intervals, as in the following example:
Here, the time-stamps just note when the footage gets to the 30-second and 1-minute mark – these particular ones are are added in 30-second intervals.
Both kinds of time-stamps – event and interval-based – are useful for finding content in the source footage. For example, on documentaries directors write editing instructions on the time-stamped transcriptions, which editors can then use to find the footage relatively easily. Same thing with court transcriptions of testimony – lawyers can use the time-stamps to find the exact audio and video they want to play for a jury.
Remember that time-stamps are generally used to help humans navigate through long audio and video files – therefore, they don’t have to be more accurate than minutes and seconds, and they’re attuned to what people actually need to find. Because of this, time-stamp content, interval length, and format can vary, sometimes substantially.
Time-coding refers to gathering actual timing information as well. Unlike time-stamps, though, time-codes are always frame-accurate – that is to say, they also contain frames or milliseconds, and follow a particular format.
Time-codes are often gathered for captioning and subtitling, it’s crucial to note that this isn’t always the case. They have multiple uses, usually ones that require frame-accurate or millisecond-accurate timings. For example, in lip-sync dubbing, gathering the time-codes of the English-language loops can help sound engineers lay down markers quickly – but again, only if the time-codes are in the correct, frame-accurate format.
For more information about loops check out our previous post, What is an audio loop? Why is filenaming crucial to localization?)
The following example adds loop marker time-codes to our above script.
Note that this time-coded script could be used in the studio for an audio voice-over production – however, it could not be used for a subs project.
Spotting is collecting time-codes and formatting them specifically for video subtitle services or closed captioning. This means that not only must the time-codes be in the right format, but that the final time-coded file must be formatted also so that it can be used as a captioning or subtitling file.
In the following example, we’ve spotted the interview transcription from above into the SRT format:
Note that this spotted file has time-codes for both when then caption should appear and when it should disappear, and that it adheres to a strict segmentation. It also breaks down the voice-over transcript into smaller sections, and adds line-breaks, which some are preferable for some languages. (For example, most Japanese subtitles deliverables add them.)
To start, from the fact that the three terms have much in common. After all, spotting is just a very specific kind of time-coding, and time-coding is most commonly done specifically for spotting projects. In fact, time-coding and spotting are almost always used interchangeably, though spotting is more prevalent in entertainment contexts, while time-coding is more common in online and e-Learning contexts. That said, it’s still crucial to understand the difference to avoid issues.
In particular, remember that time-stamps standards used for online media platforms – especially for e-Learning and web animations – will not work for spotting. For example, some video systems and programs like Brainshark can import timings lists, applying these time-stamps to animation elements. Same for PowerPoint – users can adjust timings in the Animations panel, usually to align each one to an audio voice-over element. The issue with those time-stamps is that they’re not time-codes – meaning that they’re not frame-accurate, and can’t really be leveraged for spotting projects.
There are three ways to do that:
Specificity is the key here. Professional transcription studios like JBI can adjust to most client requests. More importantly, they can come up with a workflow test, to make sure that the time-stamped, time-coded or spotted files will work fully with your final integration. Remember that this requires a little time up front to test the workflows, and as well to adjust them if any issues come up. But it’s well worth it – as with most audio and video translation, proper preparation is the best way to keep projects on time, on scope, and on budget.