A Comprehensive Look at HTML5’s track Element

Ankul Jain

Video and audio are commonplace on the web these days; the internet is no longer purely text-based. Video contributes to almost 40% of search results. For users, multimedia makes for a more interactive and engaging experience. For developers, on the other hand, enhancing that experience was an arduous task until HTML5’s <track> element came to the rescue.

After reading this article, you’ll understand how to add timed text tracks, such as subtitles, to your media files. You’ll also see how they add to the SEO value of a web page. Before moving ahead, you might want to go through an intro on HTML video and audio that explains container formats, codecs, markup and the basics.

The <track> Element

The <track> element defines any text that you want to display along with the playing media file. Text may include subtitles, captions, descriptions, chapters or metadata.

In other words, the <track> element lets you specify additional time-synchronized text resources that align with the timeline of the media loaded by the <audio> or <video> element.

The track element is a void (empty) element, so it must not have a closing tag. It must be a child of a <video> or <audio> element. If the media element also contains <source> elements, the <track> elements should appear after them.

For example:

<video width="640" height="320" controls>
  <source src="some_video.mp4" type="video/mp4">
  <source src="some_video.ogg" type="video/ogg">
  <track src="some_video_subtitles.srt"
         kind="subtitles"
         srclang="en"
         label="English_subs">
</video>

The src Attribute

This attribute specifies the address of the text file that contains the track data, and is naturally a required attribute. The value should be an absolute or relative URL. In practice, the files need to be served by a web server; browsers generally refuse to load track files from a file:// URL.

For example:

<track src="video_captions.srt">
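One practical caveat: the file names in this article end in .srt, but as far as we know current browsers only parse WebVTT (.vtt) files referenced from <track>. The two formats are close cousins, so a basic conversion is short. The following is a rough JavaScript sketch rather than a complete converter: the function name srtToVtt is made up for this example, and it assumes plain SRT cues without styling or positioning extensions.

```javascript
// Hypothetical helper: convert basic SRT text into WebVTT text.
// Assumes plain SRT cues (no styling or positioning extensions).
function srtToVtt(srtText) {
  const body = srtText
    .replace(/^\uFEFF/, "")   // drop a leading byte-order mark, if any
    .replace(/\r\n?/g, "\n")  // normalise line endings
    // SRT writes 00:00:01,500 while WebVTT expects 00:00:01.500.
    // Only timing lines (start --> end) are rewritten, so commas in
    // the cue text itself are left alone.
    .replace(
      /^(\d{2,}:\d{2}:\d{2}),(\d{3})(\s+-->\s+)(\d{2,}:\d{2}:\d{2}),(\d{3})/gm,
      "$1.$2$3$4.$5"
    )
    .trim();
  // Every WebVTT file starts with the WEBVTT signature and a blank line.
  return "WEBVTT\n\n" + body + "\n";
}

const sampleSrt = "1\n00:00:01,000 --> 00:00:04,000\nHello, world!\n";
console.log(srtToVtt(sampleSrt));
```

The converted file would then be referenced as, say, src="video_captions.vtt" (a hypothetical file name).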

The srclang Attribute

srclang defines the language of the time-tracked data. This attribute must be included if the kind attribute is set to a value of subtitles (more on that below). The value of the srclang attribute must be a valid BCP 47 language tag. For instance, the value hi represents Hindi and en is used for English. There are about 8,000 language subtags available.

<track src="video_subtitles.srt" kind="subtitles" srclang="en">

The above example specifies English as the language of the timed text track file.
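If srclang values come from a CMS or user input, it can help to check them before writing the markup. The sketch below leans on the standard Intl.getCanonicalLocales function, which throws a RangeError for structurally invalid tags. The helper name canonicalLanguageTag is our own, and note that this only checks the shape of the tag, not whether its subtags are actually registered.

```javascript
// Returns the canonical form of a well-formed BCP 47 tag, or null otherwise.
// This checks structure only, not whether the subtags are registered.
function canonicalLanguageTag(tag) {
  try {
    const canonical = Intl.getCanonicalLocales(tag)[0];
    return canonical === undefined ? null : canonical;
  } catch (err) {
    if (err instanceof RangeError) return null; // malformed tag
    throw err; // anything else is unexpected
  }
}

console.log(canonicalLanguageTag("en"));    // a well-formed tag
console.log(canonicalLanguageTag("en_US")); // underscore is not allowed
```

The canonical form also normalizes casing, so a value typed as "EN-us" could be written to the attribute as "en-US".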

The kind Attribute

This attribute defines the kind of track we want to add. It may contain one of a number of values, each explained below.

subtitles
These are usually a translation of the dialogue in the video or audio file. They come in handy when the user cannot understand the language of the soundtrack but can read the dialogue in their preferred language. Specifying the language of the track is mandatory. This is done by adding the appropriate value to the srclang attribute:

<track src="video_subtitles.srt" kind="subtitles" srclang="en">

captions
Captions are a transcription of the dialogue, sound effects and other relevant audio information in the media. They are appropriate when the sound is unclear or unavailable, for example because the video is muted or the user is deaf or hard of hearing. A simple example:

<track src="video_captions.srt" kind="captions">

descriptions
As the name suggests, these are used to describe the media content. They are appropriate when the user is blind, or when the video is unavailable or obscured. Timed tracks that are marked as descriptions are usually synthesized as a separate audio track. For example:

<track src="video_descriptions.srt" kind="descriptions">

metadata
This is intended for timed tracks that offer metadata content. It is not normally displayed by the user agent. Tracks of this kind are meant to be consumed by scripts, typically JavaScript. Here’s an example:

<track src="video_metadata.srt" kind="metadata">
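To give a feel for how a script might consume a metadata track, here is a minimal sketch. It assumes each cue’s text is a JSON payload, which is our convention for this example and not something the track element requires. The helper name readActiveMetadata is also made up. The browser wiring at the bottom uses the standard textTracks list, the track’s mode property and its cuechange event, and is skipped when no DOM is available.

```javascript
// Collect the payloads of the cues that are active right now.
// Assumption: each cue's text is JSON; anything else is kept as a string.
function readActiveMetadata(textTrack) {
  const payloads = [];
  const cues = textTrack.activeCues || [];
  for (let i = 0; i < cues.length; i++) {
    const text = cues[i].text;
    try {
      payloads.push(JSON.parse(text));
    } catch (err) {
      payloads.push(text);
    }
  }
  return payloads;
}

// Browser-only wiring; skipped when no DOM is available.
if (typeof document !== "undefined") {
  const video = document.querySelector("video");
  const metadataTrack =
    video && Array.from(video.textTracks).find((t) => t.kind === "metadata");
  if (metadataTrack) {
    metadataTrack.mode = "hidden"; // load cues and fire events without rendering
    metadataTrack.addEventListener("cuechange", () => {
      console.log(readActiveMetadata(metadataTrack));
    });
  }
}
```

Setting mode to "hidden" matters here: a track that stays disabled does not load its cues, so no cuechange events would fire.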

chapters
These represent chapter titles, intended for use when the user is navigating the media resource. Tracks that are marked as chapters are usually displayed as an interactive list in the user agent’s interface.

<track src="video_chapters.srt" kind="chapters">
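A chapters file is just a list of timed cues whose text is the chapter title, so it is easy to generate from data you already have. The sketch below writes WebVTT, the format browsers parse for <track>. The function names and the { title, start, end } object shape are assumptions made for this illustration, with times given in seconds.

```javascript
// Format a number of seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function formatTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const pad = (n, width) => String(n).padStart(width, "0");
  const h = Math.floor(totalMs / 3600000);
  const m = Math.floor((totalMs % 3600000) / 60000);
  const s = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// chapters: [{ title, start, end }] with times in seconds (hypothetical shape).
function chaptersToVtt(chapters) {
  const cues = chapters.map(
    (c) => `${formatTimestamp(c.start)} --> ${formatTimestamp(c.end)}\n${c.title}`
  );
  return "WEBVTT\n\n" + cues.join("\n\n") + "\n";
}

console.log(
  chaptersToVtt([
    { title: "Intro", start: 0, end: 30 },
    { title: "Demo", start: 30, end: 95.25 },
  ])
);
```

Working in whole milliseconds before splitting into hours, minutes and seconds avoids rounding surprises with fractional second values.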

The label Attribute

This attribute is used to define the title of the text track included in the video/audio file. It is used by the browser while listing the available text tracks and is a user-readable label of the text track.

If you add the label attribute to the <track> element, its value must be a non-empty string, or else the code will not validate. If the attribute is not present, the browser will assign a default value like “untitled”.

<track src="video_subtitles.srt"
       kind="subtitles"
       srclang="en"
       label="English_subtitles">

In the example above we can see that the label attribute contains a value of “English_subtitles”.

The default Attribute

This is a Boolean attribute that marks a track as the default. The default track is enabled unless the user’s preferences indicate that another track would be more appropriate. Only one track of a given type should carry default: according to the HTML specification, the limit applies separately to subtitles and captions (counted together), to descriptions, and to chapters, while metadata tracks are not limited.

In the following example, subtitles are available in three languages (Hindi, English, and Japanese), and the Hindi subtitles are marked as the default track:

<track kind="subtitles" src="video_subtitles_hi.srt" srclang="hi" default>
<track kind="subtitles" src="video_subtitles_en.srt" srclang="en">
<track kind="subtitles" src="video_subtitles_ja.srt" srclang="ja">
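The interplay between default and user preferences can be pictured with a small selection function. To be clear, this is not the algorithm browsers actually run, which varies between implementations; it is only a sketch of the idea. The function name pickTrack and the { srclang, isDefault } shape are invented for this example.

```javascript
// tracks: [{ srclang, isDefault }]
// preferredLanguages: language tags in priority order, e.g. navigator.languages.
function pickTrack(tracks, preferredLanguages) {
  // Compare primary language subtags only, so "ja-JP" matches "ja".
  const primary = (tag) => tag.toLowerCase().split("-")[0];
  for (const lang of preferredLanguages) {
    const match = tracks.find((t) => primary(t.srclang) === primary(lang));
    if (match) return match;
  }
  // No preference matched: fall back to the track marked as default.
  return tracks.find((t) => t.isDefault) || null;
}

const tracks = [
  { srclang: "hi", isDefault: true },
  { srclang: "en", isDefault: false },
  { srclang: "ja", isDefault: false },
];
console.log(pickTrack(tracks, ["ja-JP", "en"])); // the Japanese track
console.log(pickTrack(tracks, ["fr"]));          // falls back to the Hindi default
```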

Now that we’ve examined most of the relevant attributes you can use with the <track> element, a complete example summarizing the usage of <track> along with the video and source elements is shown below:

<video controls>
  <source src="some_video.mp4" type="video/mp4">
  <source src="some_video.ogg" type="video/ogg">
  <track kind="captions" src="video_captions.srt" srclang="en">
  <track kind="descriptions" src="video_descriptions.srt" srclang="en">
  <track kind="chapters" src="video_chapters.srt" srclang="en">
  <track kind="subtitles" src="video_subtitles_en.srt" srclang="en" default>
  <track kind="subtitles" src="video_subtitles_oz.srt" srclang="oz">
  <track kind="metadata" src="video_metadata1.srt" srclang="en" label="Metadata 1">
  <track kind="metadata" src="video_metadata2.srt" srclang="en" label="Metadata 2">
</video>
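The attribute rules covered so far lend themselves to a quick automated check, for instance in a build step that generates this markup. The following sketch encodes them over plain objects. The validateTracks name and the object shape are our own, and the per-group handling of default reflects our reading of the HTML specification (subtitles and captions counted together, metadata unlimited).

```javascript
const KNOWN_KINDS = ["subtitles", "captions", "descriptions", "chapters", "metadata"];

// tracks: [{ kind, src, srclang, label, default }] (hypothetical shape).
// Returns a list of human-readable problems; an empty list means no issues found.
function validateTracks(tracks) {
  const problems = [];
  const defaultsPerGroup = {};
  tracks.forEach((track, index) => {
    // A missing kind attribute is treated as "subtitles".
    const kind = track.kind === undefined ? "subtitles" : track.kind;
    if (!track.src) problems.push(`track ${index}: src is required`);
    if (!KNOWN_KINDS.includes(kind)) problems.push(`track ${index}: unknown kind "${kind}"`);
    if (kind === "subtitles" && !track.srclang) {
      problems.push(`track ${index}: subtitles need srclang`);
    }
    if (track.label === "") problems.push(`track ${index}: label must not be empty`);
    if (track.default && kind !== "metadata") {
      // Subtitles and captions share one default slot.
      const group = kind === "captions" ? "subtitles" : kind;
      defaultsPerGroup[group] = (defaultsPerGroup[group] || 0) + 1;
      if (defaultsPerGroup[group] === 2) {
        problems.push(`more than one default track in the ${group} group`);
      }
    }
  });
  return problems;
}

console.log(validateTracks([{ kind: "subtitles", src: "subs.vtt" }]));
```

A check like this complements a markup validator rather than replacing it, since it knows nothing about the surrounding document.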

SEO Benefits of Media Tracks

The <track> element has opened the door to video Search Engine Optimization (SEO), offering a cost-effective way to help search engines better understand your video files.

Search engines are becoming more multimedia-aware, and the more information you attach to your files, the better targeted and higher your traffic is likely to be, which can in turn convert into income for you.

Some of the key SEO advantages are:

  • Improved search presence: Search engines can crawl any text content associated with the video.
  • Deep linking: Search engines can return results that point to a specific part of a video, using the associated time codes.
  • Associated content: The text files can be easily incorporated into associated text content on the same page.
  • Accessibility and UX: The subtitles and captions improve the usability and accessibility for those with disabilities.
  • Thumbnail in search results: A video page has the advantage of being displayed as a rich snippet with a thumbnail in search results which can increase click-through rates.
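For the “associated content” point in the list above, one option is to turn a track file into a plain transcript that can be placed on the page next to the video. The sketch below does this for WebVTT text. The function name vttToTranscript is hypothetical, and the parsing is deliberately simple: it keeps the text of every block that has a timing line and strips inline cue tags.

```javascript
// Extract the cue text from WebVTT source, dropping timings, ids and notes.
function vttToTranscript(vttText) {
  const blocks = vttText.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  const lines = [];
  for (const block of blocks) {
    const rows = block.split("\n");
    const timingIndex = rows.findIndex((row) => row.includes("-->"));
    if (timingIndex === -1) continue; // header, NOTE or STYLE block
    const text = rows
      .slice(timingIndex + 1)
      .join(" ")
      .replace(/<[^>]+>/g, "") // strip cue markup such as <v Speaker> or <i>
      .trim();
    if (text) lines.push(text);
  }
  return lines.join(" ");
}

const sampleVtt =
  "WEBVTT\n\nNOTE demo file\n\n1\n00:00:01.000 --> 00:00:04.000\n<v Ann>Hello there.\n\n" +
  "00:00:04.000 --> 00:00:06.000\nWelcome to\nthe show.\n";
console.log(vttToTranscript(sampleVtt));
```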

More info on the SEO benefits of video transcripts and captions can be found in this article.

Browser Support for <track>

All things considered, the <track> element has excellent browser support:

  • Chrome
  • Firefox 31+
  • IE10+
  • Safari 6+
  • Opera 15+
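For older browsers outside the list above, a script can test for support before relying on text tracks. The sketch below follows a commonly used detection approach: check that a created track element exposes a kind property and that video elements have the addTextTrack method. The supportsTrack name is ours, and the function takes the document as a parameter purely so it can be exercised outside a browser.

```javascript
// doc is a Document, or any stand-in object with a createElement method.
function supportsTrack(doc) {
  if (!doc || typeof doc.createElement !== "function") return false;
  const video = doc.createElement("video");
  const track = doc.createElement("track");
  return "kind" in track && typeof video.addTextTrack === "function";
}

// In a browser this passes the real document; elsewhere it reports false.
console.log(supportsTrack(typeof document === "undefined" ? null : document));
```

When the check fails, a page could fall back to showing a transcript or a script-driven caption display instead.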

Demos and Conclusion

The <track> element standardizes adding track data to media files. It enables the use of dynamic content linked to media playback, which in turn adds value to the audio and video elements and has the potential for SEO benefits.