The Web Receiver SDK supports three types of streaming protocols today:
DASH, HTTP Live Streaming, and Smooth Streaming.
This document lists our support for each of these streaming protocols. Note that the explanation of supported tags for each protocol is quite abbreviated compared to the detailed protocol specification. The goal is to provide a quick glimpse and understanding of how to use each protocol, and which features of each protocol are supported on Cast-enabled devices to deliver their streaming experiences.
Dynamic Adaptive Streaming over HTTP (DASH)
ISO's detailed specification of DASH.
DASH is an adaptive bitrate streaming protocol that enables high quality video streaming through HTTP(S) servers. A manifest, composed in XML, contains most of the metadata information for how to initialize and download the video content. The key concepts that the Web Receiver Player supports are <Period>, <AdaptationSet>, <Representation>, <SegmentTemplate>, <SegmentList>, <BaseURL>, and <ContentProtection>.
A DASH manifest starts with a root <MPD> tag and contains one or more <Period> tags, each representing one piece of streaming content. <Period> tags allow ordering of different pieces of streaming content and are often used to separate main content from advertisements, or to chain multiple consecutive videos.
An <AdaptationSet> under <MPD> is a set of representations for one type of media stream, in most cases video, audio, or captions. The most commonly supported MIME types are "video/mp4", "audio/mp4", and "text/vtt". An optional <ContentComponent contentType="$TYPE$"> can be included under <AdaptationSet>.
Inside each <AdaptationSet>, a list of <Representation> tags should be present; the Web Receiver Player uses the codecs information to initialize the MSE source buffer and the bandwidth information to automatically choose the right representation/bitrate to play.
For each <Representation>, media segments are described using either a <BaseURL> for a single-segment representation, a <SegmentList> for a list of segments (similar to HLS), or a <SegmentTemplate>.
A <SegmentTemplate> indicates how the initialization segment and the media segments can be represented through templating. In the example below, $Number$ indicates the segment number as available from the CDN, so it translates to seg1.m4s, seg2.m4s, and so on as playback continues.
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:ns2="http://www.w3.org/1999/xlink"
profiles="urn:mpeg:dash:profile:isoff-live:2011,http://dashif.org/guidelines/dash264" type="static"
publishTime="2016-10-05T22:07:14.859Z" mediaPresentationDuration="P1DT0H0M0.000S" minBufferTime="P0DT0H0M7.500S">
<Period id="P0">
<AdaptationSet lang="en" segmentAlignment="true">
<ContentComponent id="1" contentType="audio"/>
<SegmentTemplate media="seg$Number$.m4s" initialization="seginit.mp4"
duration="10000" startNumber="1" timescale="1000" presentationTimeOffset="0"/>
<Representation id="1" bandwidth="150123" audioSamplingRate="44100"
mimeType="audio/mp4" codecs="mp4a.40.2" startWithSAP="1">
<AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2"/>
<BaseURL>http://www.google.com/testVideo</BaseURL>
</Representation>
</AdaptationSet>
<AdaptationSet segmentAlignment="true">
<ContentComponent id="1" contentType="video"/>
<SegmentTemplate media="seg$Number$.m4s" initialization="seginit.mp4"
duration="10000" startNumber="1" timescale="1000" presentationTimeOffset="0"/>
<Representation id="1" bandwidth="212191" width="384" height="208" sar="26:27"
frameRate="25" mimeType="video/mp4" codecs="avc1.42c01f" startWithSAP="1">
<BaseURL>http://www.google.com/testVideo/bitrate1/</BaseURL>
</Representation>
<Representation id="1" bandwidth="366954" width="512" height="288" sar="1:1"
frameRate="25" mimeType="video/mp4" codecs="avc1.42c01f" startWithSAP="1">
<BaseURL>http://www.google.com/testVideo/bitrate2/</BaseURL>
</Representation>
<Representation id="1" bandwidth="673914" width="640" height="352" sar="44:45"
frameRate="25" mimeType="video/mp4" codecs="avc1.42c01f" startWithSAP="1">
<BaseURL>http://www.google.com/testVideo/bitrate3/</BaseURL>
</Representation>
</AdaptationSet>
</Period>
</MPD>
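To make the $Number$ substitution concrete, here is a small sketch (not part of the SDK; names are illustrative) that resolves a segment URL for a given playback time using the duration, startNumber, and timescale values from the manifest above:

// Illustrative only: resolve a $Number$-based <SegmentTemplate> from the
// manifest above. duration="10000" at timescale="1000" means 10-second segments.
const template = {media: 'seg$Number$.m4s', duration: 10000, timescale: 1000, startNumber: 1};

function segmentUrlForTime(seconds) {
  const segmentSeconds = template.duration / template.timescale; // 10 seconds
  const number = template.startNumber + Math.floor(seconds / segmentSeconds);
  return template.media.replace('$Number$', String(number));
}

segmentUrlForTime(0);  // 'seg1.m4s'
segmentUrlForTime(25); // 'seg3.m4s'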
For a <SegmentTemplate>, it's common to use the <SegmentTimeline> tag to indicate how long each segment is and which segments repeat. A timescale (units to represent one second) is often included as part of the attributes of <SegmentTemplate> so that we can calculate the time of each segment based on this unit. In the example below, the <S> tag signifies a segment, the d attribute specifies how long the segment is, and the r attribute specifies how many segments of the same duration repeat so that $Time$ can be calculated properly for downloading the media segment as specified in the media attribute.
<SegmentTemplate
    timescale="48000"
    initialization="$RepresentationID$-init.dash"
    media="$RepresentationID$-$Time$.dash"
    startNumber="1">
<SegmentTimeline>
<S t="0" d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
</SegmentTimeline>
</SegmentTemplate>
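The timing arithmetic can be sketched as follows (illustrative only, not SDK code); 'video' stands in for a hypothetical $RepresentationID$:

// Illustrative only: expand the <SegmentTimeline> above into $Time$ values.
const timescale = 48000;
const timeline = [
  {d: 96256, r: 2}, {d: 95232},
  {d: 96256, r: 2}, {d: 95232},
  {d: 96256, r: 2},
];

let t = 0; // the first <S> carries t="0"
const segments = [];
for (const s of timeline) {
  for (let i = 0; i <= (s.r || 0); i++) { // r counts additional repeats
    segments.push({
      url: `video-${t}.dash`, // media="$RepresentationID$-$Time$.dash"
      startSeconds: t / timescale,
      durationSeconds: s.d / timescale,
    });
    t += s.d;
  }
}
// segments[0].url === 'video-0.dash'; segments[1].url === 'video-96256.dash'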
Here is an example of a representation that uses <SegmentList>:
<Representation id="FirstRep" bandwidth="2000000" width="1280"
height="720">
<BaseURL>FirstRep/</BaseURL>
<SegmentList timescale="90000" duration="270000">
<RepresentationIndex sourceURL="representation-index.sidx"/>
<SegmentURL media="seg-1.ts"/>
<SegmentURL media="seg-2.ts"/>
<SegmentURL media="seg-3.ts"/>
</SegmentList>
</Representation>
For a single-segment file, a <SegmentBase> is often used with byte range requests to specify which part of a <BaseURL> file contains the index; the rest can be fetched on demand as playback continues or when a seek happens. Here the <Initialization> range specifies the init metadata range and the indexRange specifies the index for the media segments. Note that right now we only support consecutive byte ranges.
<Representation bandwidth="4190760" codecs="avc1.640028"
height="1080" id="1" mimeType="video/mp4" width="1920">
<BaseURL>video.mp4</BaseURL>
<SegmentBase indexRange="674-1149">
<Initialization range="0-673" />
</SegmentBase>
</Representation>
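As an illustration of how such a representation can be consumed, the sketch below (not an SDK API) fetches the declared byte ranges with HTTP Range requests; 'video.mp4' is the <BaseURL> from the example, assumed to be resolved against the manifest location:

// Illustrative only: download the byte ranges declared by <SegmentBase>.
async function fetchRange(url, firstByte, lastByte) {
  const response = await fetch(url, {headers: {'Range': `bytes=${firstByte}-${lastByte}`}});
  return response.arrayBuffer(); // typically an HTTP 206 Partial Content reply
}

async function loadIndex() {
  const initSegment = await fetchRange('video.mp4', 0, 673);     // range="0-673"
  const segmentIndex = await fetchRange('video.mp4', 674, 1149); // indexRange="674-1149"
  // Remaining media bytes can be fetched on demand as playback or seeks require.
}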
Regardless of which representation is used, if the streams are protected, a <ContentProtection> section can appear under <AdaptationSet>, where a schemeIdUri uniquely identifies the DRM system to use. An optional key ID can be included for common encryption.
<!-- Common Encryption -->
<ContentProtection
schemeIdUri="urn:mpeg:dash:mp4protection:2011"
value="cenc"
cenc:default_KID="7D2714D0-552D-41F5-AD56-8DD9592FF891">
</ContentProtection>
<!-- Widevine -->
<ContentProtection
schemeIdUri="urn:uuid:EDEF8BA9-79D6-4ACE-A3C8-27DCD51D21ED">
</ContentProtection>
For more examples and details please refer to the MPEG-DASH specification. Below is a list of additional DASH attributes on tags not mentioned above that we currently support:
Attribute Name | Attribute Function |
---|---|
mediaPresentationDuration | How long the video content is. |
minimumUpdatePeriod | Attribute of the <MPD> tag; specifies how often we need to reload the manifest. |
type | Attribute of the <MPD> tag; "dynamic" indicates that this is a live stream. |
presentationTimeOffset | Attribute of the <SegmentBase> tag; specifies the presentation time offset from the beginning of the period. |
startNumber | Specifies the number of the first media segment in a presentation in a period. This is often used in live streams. |
We also support recognizing the EMSG box inside MP4 fragments for DASH and provide an EmsgEvent to developers.
While our current Web Receiver Player supports the major DASH use cases, here is a list of common attributes that our current implementation of DASH ignores or does not use. Regardless of whether the manifest contains them, they have no impact on the playback experience of the content.
- availabilityStartTime
- segmentAlignment
HTTP Live Streaming (HLS)
The overview and full specification of HTTP Live Streaming are available in RFC 8216.
One of the key strengths of the Web Receiver Player is its ability to support playback of HLS in MSE. Unlike DASH, where the manifest comes in a single file, HLS sends a master playlist containing a list of all the variant streams with their respective URLs; the variant playlists are the media playlists. The two major HLS tags that the Web Receiver Player currently supports in the master playlist are:
Tag Name | Functionality |
---|---|
#EXT-X-STREAM-INF | Specifies a bitrate/variant stream. The BANDWIDTH attribute is required and supports adaptive bitrate streaming selection. The CODECS attribute is strongly recommended for initializing MSE, such as "avc1.42c01e,mp4a.40.2". If not specified, the default case is set to H264 main profile 3.0 video and "mp4a.40.2" audio encoded content. |
#EXT-X-MEDIA | Specifies an additional media playlist (in the URI attribute) that represents the content. These are usually alternative audio streams in another format (5.1 surround sound) or another language. A TYPE attribute containing either VIDEO, AUDIO, SUBTITLES, or CLOSED-CAPTIONS is allowed. Setting the DEFAULT attribute to YES indicates that this alternative stream should be chosen by default. |
Here is a list of HLS tags that the Web Receiver Player currently supports in the media playlist:
Tag Name | Functionality |
---|---|
#EXTINF | Stream information, usually followed by the duration of the segment in seconds, and on the next line the URL of the segment. |
#EXT-X-TARGETDURATION | How long in seconds each segment is. This also determines how often we download/refresh the playlist manifest for a live stream. The Web Receiver Player does not support durations shorter than 0.1 seconds. |
#EXT-X-MEDIA-SEQUENCE | The sequence number (often for a live stream) that the first segment in this playlist represents. |
#EXT-X-KEY | DRM key information. The METHOD attribute tells us what key system to use. Today we support AES-128 and SAMPLE-AES. |
#EXT-X-BYTERANGE | The byte range to fetch for a segment URL. |
#EXT-X-DISCONTINUITY | Specifies a discontinuity between consecutive segments. This is often seen with server-side ad insertion, where an ad segment appears in the middle of the main stream. |
#EXT-X-PROGRAM-DATE-TIME | Absolute time of the first sample of the next segment, for example "2016-09-21T23:23:52.066Z". |
#EXT-X-ENDLIST | Whether this is a VOD or live stream. |
For live streams, we use #EXT-X-PROGRAM-DATE-TIME and #EXT-X-MEDIA-SEQUENCE as the key factors for determining how to merge a newly refreshed manifest. If present, #EXT-X-PROGRAM-DATE-TIME is used to match the refreshed segments; otherwise, the #EXT-X-MEDIA-SEQUENCE number is used. Note that, per the HLS spec, we do not use file name comparison for matching.
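The following simplified sketch shows that merge rule; it is not the player's actual internals, and each segment object here is a hypothetical record built from the parsed playlist:

// Illustrative only: find where a refreshed playlist extends the old one.
// Segments are {programDateTime?: number, mediaSequence: number, uri: string}.
function firstNewSegmentIndex(oldSegments, newSegments) {
  const lastOld = oldSegments[oldSegments.length - 1];
  if (lastOld.programDateTime !== undefined) {
    // Prefer absolute timestamps when #EXT-X-PROGRAM-DATE-TIME is present.
    return newSegments.findIndex(s => s.programDateTime > lastOld.programDateTime);
  }
  // Otherwise fall back to #EXT-X-MEDIA-SEQUENCE numbering.
  return newSegments.findIndex(s => s.mediaSequence > lastOld.mediaSequence);
}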
Our HLS implementation supports selecting an alternative audio stream, such as
5.1 surround sound, as the main audio playback. This can be accomplished by
having an #EXT-X-MEDIA
tag with the alternative codecs as well as providing
the segment format in the stream configuration.
The Web Receiver Player expects certain per-spec behavior. For example, after an #EXTINF tag, we expect a URI. If the next line is not a URI (for example, an #EXT-X-DISCONTINUITY tag), parsing of the playlist will fail.
Every #EXT-X-TARGETDURATION seconds, we reload the playlist/manifest to get new segment lists and update the internal representation of all segments to the new one. Any time a seek is requested, we only seek within the seekable range. For live streams, we only allow seeking from the beginning of the newest list until three target durations from the end. So, for example, if you have a 10-segment list and you are on segment 6, you can only seek up to segment 7, but not segment 8.
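The arithmetic behind that example, with a hypothetical 4-second target duration:

// Illustrative arithmetic only, not SDK code.
const targetDuration = 4;  // #EXT-X-TARGETDURATION, in seconds
const segmentCount = 10;
const playlistEnd = segmentCount * targetDuration;    // 40 s
const seekableEnd = playlistEnd - 3 * targetDuration; // 28 s: end of segment 7, so segment 8 is not seekable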
Segment format support
The CAF SDK supports playing content delivered in multiple formats, as referenced in HlsSegmentFormat for audio and HlsVideoSegmentFormat for video. This includes support for packed audio such as AAC and AC3 playback, both encrypted and non-encrypted. It is required to specify this information in the MediaInformation of the LoadRequestData in order to properly describe your content to the player. If not specified, the default player configuration will attempt to play the content as Transport Stream packaged content. This property can be set from any of the senders in the load request data (Android, iOS and Web) or within the receiver through message interceptors.
Check out the sample code snippet below or the Loading media using contentId, contentUrl and entity guide for more information on how to prepare content on the Web Receiver.
playerManager.setMessageInterceptor(
cast.framework.messages.MessageType.LOAD, loadRequestData => {
...
// Specify segment format for an HLS stream playing CMAF packaged content.
loadRequestData.media.contentType = 'application/x-mpegurl';
loadRequestData.media.hlsSegmentFormat = cast.framework.messages.HlsSegmentFormat.FMP4;
loadRequestData.media.hlsVideoSegmentFormat = cast.framework.messages.HlsVideoSegmentFormat.FMP4;
...
return loadRequestData;
});
Content protection
As listed in the #EXT-X-KEY tag section above, the Cast SDK supports SAMPLE-AES or SAMPLE-AES-CTR, where a URI to the key and an initialization vector can be specified:
#EXT-X-KEY: METHOD=SAMPLE-AES, \
URI="data:text/plain;base64,XXXXXX", \
IV=0x6df49213a781e338628d0e9c812d328e, \
KEYFORMAT="com.widevine", \
KEYFORMATVERSIONS="1"
The KEYFORMAT we support now is Widevine, and the URI contains a BASE64-encoded DRM info XXXXXX which, when decoded, contains the key ID:
{
"content_id": "MTQ1NjkzNzM1NDgxNA==",
"key_ids": [
"xxxxxxxxxxxxxxxx"
]
}
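For illustration, the key IDs can be recovered from the data: URI with standard web APIs; this sketch is not an SDK call, and keyUri stands in for the URI attribute value above:

// Illustrative only: decode the BASE64 payload of the #EXT-X-KEY URI.
function keyIdsFromKeyUri(keyUri) {
  const base64 = keyUri.split('base64,')[1];
  const drmInfo = JSON.parse(atob(base64)); // {content_id, key_ids}
  return drmInfo.key_ids;
}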
Version 1 defines the following attributes:
Attribute | Example | Description |
---|---|---|
KEYFORMATVERSIONS | "1" | This proposal defines key format version 1. |
KEYFORMAT | "urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed" | The UUID is the Widevine UUID from DASH IF IOP. The exact same string is used in MPDs with Widevine-encrypted streams. |
URI | "data:text/plain;base64, <base64 encoded PSSH box>" | URI of the stream containing the data type and PSSH box. |
METHOD | SAMPLE-AES-CTR | Indicates the encryption cipher used when encrypting the content. SAMPLE-AES signals that the content is encrypted using 'cbcs'. SAMPLE-AES-CTR signals that the content is encrypted using one of the AES-CTR protection schemes, namely 'cenc'. |
Attributes mapped to DASH MPD:
Attribute | Description |
---|---|
KEYFORMAT | The ContentProtection element's schemeIdUri attribute. |
URI | The content of the cenc:pssh element. |
KEYID | 16-byte hexadecimal string encoding the key ID, which has the same role as default_KID in MPEG-DASH. If using a hierarchical key scheme, this would be the "root" key. |
Example HLS Playlist with V2 Signaling:
#EXTM3U
#EXT-X-VERSION:6
#EXT-X-TARGETDURATION:2
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-MAP:URI="init_segment.mp4"
#EXTINF:1.001,
output_video-1.mp4
#EXT-X-DISCONTINUITY
#EXT-X-KEY:METHOD=SAMPLE-AES,URI="data:text/plain;base64,AAAAPXBzc2gAAAAA7e+LqXnWSs6jyCfc1R0h7QAAAB0aDXdpZGV2aW5lX3Rlc3QiDHRlc3QgY29udGVudA==",KEYID=0x112233445566778899001122334455,KEYFORMAT="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed",KEYFORMATVERSIONS="1"
#EXTINF:1.001,
output_video-2.mp4
#EXTINF:0.734,
output_video-3.mp4
#EXT-X-ENDLIST
Below is a list of features and tags in HLS that we currently do not use or support. Their presence or absence does not affect the streaming behavior.
- The RESOLUTION= attribute in #EXT-X-STREAM-INF is ignored.
- The AUTOSELECT= attribute in #EXT-X-MEDIA is not used; instead we rely on DEFAULT=.
- #EXT-X-I-FRAME-STREAM-INF in the master playlist is ignored.
- #EXT-X-DISCONTINUITY-SEQUENCE is ignored.
- #EXT-X-PLAYLIST-TYPE:EVENT can be present in a live stream and #EXT-X-PLAYLIST-TYPE:VOD can be present in a VOD stream, but currently our Web Receiver Player only relies on the existence of #EXT-X-ENDLIST to determine live vs. VOD.
Smooth streaming
Microsoft's official Smooth Streaming spec.
Smooth Streaming provides an adaptive streaming protocol and XML specification over HTTP (similar to DASH). Unlike DASH, Smooth Streaming recommends only MPEG-4 packaging for the media segments.
Here is a table of the most common tags and attributes in Smooth Streaming that the Web Receiver Player supports today. Many concepts are already explained in the DASH section above.
Tag/Attribute | Usage |
---|---|
<SmoothStreamingMedia> | Main tag for the manifest; contains attributes such as TimeScale (units to represent one second), Duration (length of the content in TimeScale units), IsLive (whether the manifest describes a live stream), and DVRWindowLength (the DVR window for live streams), as seen in the example manifest below. |
<StreamIndex> | One set of streams, similar to DASH's AdaptationSet. The type is usually "text", "video", or "audio". The Url attribute usually contains a templated fragment URL using information like bitrate or start time. |
<QualityLevel> | Each QualityLevel tag specifies its Bitrate and a FourCC codec. The FourCC codes are often 'H264', 'AVC1', 'AACL', etc. For video, it specifies its resolution through MaxWidth and MaxHeight. For audio, it specifies its frequency (such as 44100) through SamplingRate and the number of Channels. |
<c> | Stream fragment element. Contains the fragment duration (d) and, optionally, its start time (t), as seen in the example manifest below. |
<Protection> | A tag with the optional SystemID attribute listing the ID of the DRM system to use, under the <SmoothStreamingMedia> tag. |
<ProtectionHeader> | Under <Protection>, can contain an attribute of SystemID and custom data, usually Base64-encoded. For Widevine, it will contain the key ID, the key length, the algorithm ID such as AESCTR, the LA_URL (license acquisition URL), LUI_URL (license user interface URL), and DS_ID (domain service ID). |
Content protection
To encode the protection system IDs properly, please use the mapping below:
- WIDEVINE: 'EDEF8BA9-79D6-4ACE-A3C8-27DCD51D21ED',
- CLEARKEY: '1077EFEC-C0B2-4D02-ACE3-3C1E52E2FB4B',
- MPEG_DASH_MP4PROTECTION: 'URN:MPEG:DASH:MP4PROTECTION:2011'
For <ProtectionHeader>, below is an example with Base64-encoded data. The data, when decoded, conforms to the same decoded format as described in the DASH content protection support above.
<Protection>
<ProtectionHeader SystemID="9a04f079-9840-4286-ab92-e65be0885f95">
$BASE64ENCODED_DATA
</ProtectionHeader>
</Protection>
Below is an example of a live Smooth Streaming manifest with 300 seconds of content (Duration="3000000000" at TimeScale="10000000"):
<?xml version="1.0"?>
<SmoothStreamingMedia MajorVersion="2" MinorVersion="0" Duration="3000000000"
TimeScale="10000000" IsLive="TRUE" LookAheadFragmentCount="2" DVRWindowLength="600000000" CanSeek="TRUE" CanPause="TRUE">
<StreamIndex Type="text" Name="textstream301_swe" Language="swe" Subtype="CAPT" Chunks="0"
TimeScale="10000000" Url="QualityLevels({bitrate})/Fragments(textstream301_swe={start time})">
<QualityLevel Index="0" Bitrate="20000" CodecPrivateData="" FourCC="DFXP"/>
<c d="40000000" t="80649382288125"/>
<c d="39980000"/>
<c d="40020000"/>
</StreamIndex>
<Protection>
<ProtectionHeader SystemID="9a04f079-9840-4286-ab92-e65be0885f95">$BASE64ENCODEDDRMDATA$</ProtectionHeader>
</Protection>
<StreamIndex Type="audio" Name="audio101_eng" Language="eng" Subtype="AACL" Chunks="0"
TimeScale="10000000" Url="QualityLevels({bitrate})/Fragments(audio101_eng={start time})">
<QualityLevel Index="0" Bitrate="128000" CodecPrivateData="1290" FourCC="AACL" AudioTag="255"
Channels="2" SamplingRate="32000" BitsPerSample="16" PacketSize="4"/>
<c d="40000000" t="80649401327500"/>
<c d="40000000"/>
<c d="40000000"/>
</StreamIndex>
<StreamIndex Type="video" Name="video" Subtype="AVC1" Chunks="0" TimeScale="10000000"
Url="QualityLevels({bitrate})/Fragments(video={start time})">
<QualityLevel Index="0" Bitrate="400000" CodecPrivateData="000000016742E01596540C0EFCB808140000000168CE3880"
FourCC="AVC1" MaxWidth="384" MaxHeight="216"/>
<QualityLevel Index="1" Bitrate="800000" CodecPrivateData="00000001674D401E965281004B6020500000000168EF3880"
FourCC="AVC1" MaxWidth="512" MaxHeight="288"/>
<QualityLevel Index="2" Bitrate="1600000" CodecPrivateData="00000001674D401E965281B07BCDE020500000000168EF3880"
FourCC="AVC1" MaxWidth="854" MaxHeight="480"/>
<QualityLevel Index="3" Bitrate="2200000" CodecPrivateData="00000001674D401F96528080093602050000000168EF3880"
FourCC="AVC1" MaxWidth="1024" MaxHeight="576"/>
<c d="40000000" t="80649401378125"/>
<c d="40000000"/>
<c d="40000000"/>
</StreamIndex>
</SmoothStreamingMedia>
In the above example for the video stream, the URL template is:
QualityLevels({bitrate})/Fragments(video={start time})
So the first two segments, assuming we are on the Index="2" quality level (whose Bitrate is "1600000"), will be the following, with the initial time extracted from t="80649401378125" under the video StreamIndex and a time increment of 4 seconds * 10000000 per segment:
QualityLevels(1600000)/Fragments(video=80649401378125)
QualityLevels(1600000)/Fragments(video=80649441378125)
...
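The fragment URL generation can be sketched as follows (illustrative only; the values come from the manifest above):

// Illustrative only: build fragment URLs for the video stream above.
const urlTemplate = 'QualityLevels({bitrate})/Fragments(video={start time})';
const bitrate = 1600000;           // Bitrate of QualityLevel Index="2"
const firstStart = 80649401378125; // t= on the first video <c>
const fragmentDuration = 40000000; // d= (4 s at TimeScale 10000000)

function fragmentUrl(index) {
  const startTime = firstStart + index * fragmentDuration;
  return urlTemplate
      .replace('{bitrate}', String(bitrate))
      .replace('{start time}', String(startTime));
}

fragmentUrl(0); // 'QualityLevels(1600000)/Fragments(video=80649401378125)'
fragmentUrl(1); // 'QualityLevels(1600000)/Fragments(video=80649441378125)'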
Here is a list of Smooth Streaming attributes that we currently ignore; they have no effect on the streaming experience regardless of whether they are provided:
- CanSeek, CanPause in the <SmoothStreamingMedia> tag.
- Chunks, QualityLevels in the <StreamIndex> tag. Instead, we calculate the number of segments and the number of quality levels based on the information provided inside <StreamIndex>, such as the actual QualityLevel tags and the <c> tags.
- BitsPerSample, PacketSize in <QualityLevel> are not used.
Check display type
The canDisplayType method checks for video and audio capabilities of the Web Receiver device and display by validating the media parameters passed in, returning a boolean. All parameters but the first are optional; the more parameters you include, the more precise the check will be.
Its signature is canDisplayType(mimeType, codecs, width, height, framerate)
Examples:
Checks whether the Web Receiver device and display support the video/mp4 mimetype with this particular codec, dimensions, and framerate:
canDisplayType("video/mp4", "avc1.42e015,mp4a.40.5", 1920, 1080, 30)
Checks whether the Web Receiver device and display support 4K video format for this codec by specifying the width of 3840 and height of 2160:
canDisplayType("video/mp4", "hev1.1.2.L150", 3840, 2160)
Checks whether the Web Receiver device and display support HDR10 for this codec, dimensions, and framerate:
canDisplayType("video/mp4", "hev1.2.6.L150", 3840, 2160, 30)
Checks whether the Web Receiver device and display support Dolby Vision (DV) for this codec, dimensions, and framerate:
canDisplayType("video/mp4", "dvhe.04.06", 1920, 1080, 30)
DRM
Some media content requires Digital Rights Management (DRM). For media content that has its DRM license (and key URL) stored in its manifest (DASH or HLS), the Cast SDK handles this case for you. A subset of that content requires a licenseUrl, which is needed to obtain the decryption key. In the Web Receiver, you can use PlaybackConfig to set the licenseUrl as needed.
The following code snippet shows how you can set request information for license requests, such as withCredentials:
const context = cast.framework.CastReceiverContext.getInstance();
const playbackConfig = new cast.framework.PlaybackConfig();
// Customize the license url for playback
playbackConfig.licenseUrl = 'http://widevine/yourLicenseServer';
playbackConfig.protectionSystem = cast.framework.ContentProtection.WIDEVINE;
playbackConfig.licenseRequestHandler = requestInfo => {
requestInfo.withCredentials = true;
};
context.start({playbackConfig: playbackConfig});
// Update playback config licenseUrl according to provided value in load request.
context.getPlayerManager().setMediaPlaybackInfoHandler((loadRequest, playbackConfig) => {
if (loadRequest.media.customData && loadRequest.media.customData.licenseUrl) {
playbackConfig.licenseUrl = loadRequest.media.customData.licenseUrl;
}
return playbackConfig;
});
If you have a Google Assistant integration, some of the DRM information, such as the credentials necessary for the content, might be linked directly to your Google account through mechanisms such as OAuth/SSO. In those cases, if the media content is loaded through voice or comes from the cloud, a setCredentials is invoked from the cloud to the Cast device, providing those credentials. Applications writing a Web Receiver app can then use the setCredentials information to operate DRM as necessary.
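Here is a sketch of one way to use those credentials; the header name and bearer format are assumptions for illustration, not a prescribed SDK pattern:

// Illustrative only: forward credentials from the load request to license requests.
const context = cast.framework.CastReceiverContext.getInstance();
const playbackConfig = new cast.framework.PlaybackConfig();
context.getPlayerManager().setMessageInterceptor(
    cast.framework.messages.MessageType.LOAD, loadRequestData => {
      const credentials = loadRequestData.credentials;
      if (credentials) {
        playbackConfig.licenseRequestHandler = requestInfo => {
          // Hypothetical header; use whatever your license server expects.
          requestInfo.headers = {'Authorization': 'Bearer ' + credentials};
        };
      }
      return loadRequestData;
    });
context.start({playbackConfig: playbackConfig});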
Tip: Also see Loading media using contentId, contentUrl and entity.
Audio channel handling
When the Cast player loads media, it sets up a single audio source buffer. At the same time, it also selects an appropriate codec to be used by the buffer, based on the MIME type of the primary track. A new buffer and codec are set up:
- when playback starts,
- at every ad break, and
- every time the main content resumes.
Because the buffer uses a single codec, and because the codec is chosen based on the primary track, there are situations where secondary tracks may be filtered out and not heard. This can happen when a media program's primary track is in surround-sound, but secondary audio tracks use stereo sound. Because secondary tracks are frequently used to offer content in alternate languages, providing media containing different numbers of tracks can have a substantial impact, such as large numbers of viewers being unable to hear content in their native language.
The following scenarios illustrate why it is important to provide programming where primary and secondary tracks contain the same number of channels:
Scenario 1 - media stream lacking channel parity across primary and secondary tracks:
- English - AC-3 5.1 channel (primary)
- Swedish - AAC 2-channel
- French - AAC 2-channel
- German - AAC 2-channel
In this scenario, if the player's language is set to anything other than English, the user does not hear the track they expect to hear, because all two-channel tracks are filtered out during playback. The only track that could be played would be the primary AC-3 5.1-channel track, and then only when the language is set to English.
Scenario 2 - media stream with channel parity across primary and secondary tracks:
- English - AC-3 5.1 channel (primary)
- Swedish - AC-3 5.1 channel
- French - AC-3 5.1 channel
- German - AC-3 5.1 channel
Because this stream's tracks all have the same number of channels, an audience will hear a track regardless of the selected language.
Shaka audio channel handling
The Shaka player (DASH) defaults to a preferred channel count of two, as a mitigation measure when encountering media that lacks parity across secondary audio tracks.
If the primary track is not surround sound (for instance, a two-channel stereo track), then the Shaka player will default to two channels, and will automatically filter out any secondary media tracks that have more than two channels.
Shaka's preferred number of audio channels can also be configured by setting preferredAudioChannelCount in the shakaConfig property on cast.framework.PlaybackConfig. For example:
shakaConfig = { "preferredAudioChannelCount": 6 };
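A fuller sketch of applying this configuration when starting the receiver:

const context = cast.framework.CastReceiverContext.getInstance();
const playbackConfig = new cast.framework.PlaybackConfig();
playbackConfig.shakaConfig = {'preferredAudioChannelCount': 6};
context.start({playbackConfig: playbackConfig});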
With preferredAudioChannelCount set to 6, Shaka Player checks whether it can support the surround sound codecs (AC-3 or EC-3), and automatically filters out any media tracks that do not conform to the preferred number of channels.