1. Introduction
This document describes an extension to both HTML media elements and the HTML canvas element that enables the capture of the output of the element in the form of streaming media.
The captured media is formed into a MediaStream ([mediacapture-streams]), which can then be consumed by the various APIs that process streams of media, such as [WEBAUDIO] or [WEBRTC].
2. HTML Media Element Media Capture Extensions
This section defines a method captureStream() on HTMLMediaElements.
Both MediaStream and HTMLMediaElement expose the concept of a track. Since HTMLMediaElement has no common type for its tracks, this document uses the term track to refer to either a VideoTrack or an AudioTrack. MediaStreamTrack is used to identify the media in a MediaStream.
partial interface HTMLMediaElement {
    MediaStream captureStream();
};
2.1. Methods
captureStream()
The captureStream() method produces a real-time capture of the media that is rendered to the media element.
The captured MediaStream comprises MediaStreamTracks that render the content from the set of VideoTrack.selected (for VideoTracks, or other exclusively selected track types) or AudioTrack.enabled (for AudioTracks, or other track types that support multiple selections) tracks from the media element. If the media element does not have a selected or enabled track of a given type, then no MediaStreamTrack of that type is present in the captured stream.
A <video> element can therefore capture a video MediaStreamTrack and any number of audio MediaStreamTracks. An <audio> element can capture any number of audio MediaStreamTracks. In both cases, the set of captured MediaStreamTracks could be empty.
Unless and until there is a track of a given type that is selected or enabled, no MediaStreamTrack of that type is present in the captured stream. In particular, if the media element does not have a source assigned, then the captured MediaStream has no tracks. Consequently, a media element with a ready state of HAVE_NOTHING produces no captured MediaStreamTrack instances. Once metadata is available and the selected or enabled tracks are determined, new captured MediaStreamTrack instances are created and added to the MediaStream.
A captured MediaStreamTrack ends when playback ends (and the ended event fires) or when the track that it captures is no longer selected or enabled for playback. A track is no longer selected or enabled if the source is changed by setting the src or srcObject attributes of the media element. The steps in stop() are performed on the MediaStreamTrack when it ends.
The set of captured MediaStreamTracks changes if the source of the media element changes. If the source for the media element ends, a different source is selected.
If the selected VideoTrack or enabled AudioTracks for the media element change, an addtrack event with a new MediaStreamTrack is generated for each track that was not previously selected or enabled; and a removetrack event is generated for each track that ceases to be selected or enabled. A MediaStreamTrack MUST be ended prior to being removed from the MediaStream.
Since a MediaStreamTrack can only end once, a track that is enabled, disabled and re-enabled will be captured as two separate tracks. Similarly, restarting playback after playback ends causes a new set of captured MediaStreamTrack instances to be created. Seeking during playback without changing track selection does not generate events or cause a captured MediaStreamTrack to end.
The MediaStreamTracks that comprise the captured MediaStream become muted or not muted as the tracks they capture change state. At any time, a media element might not have active content available for capture on a given track for a variety of reasons:
- Media playback could be paused.
- A track might not have content for the current playback time if that time is either before the content of that track starts or after the content ends.
- A MediaStreamTrack that is acting as a source could be muted or not enabled.
- The contents of the track might become inaccessible to the current origin due to cross-origin protections. For instance, content that is rendered from an HTTP URL can be subject to a redirect on a request for partial content, or the enabled or selected tracks can change to include cross-origin content.
Absence of content is reflected in captured tracks through the muted attribute. A captured MediaStreamTrack MUST have a muted attribute set to true if its corresponding source track does not have available and accessible content. An event named mute is raised on the MediaStreamTrack when content availability changes.
What output a muted capture produces as a result will vary based on the type of media: a VideoTrack ceases to capture new frames when muted, causing the captured stream to show the last captured frame; a muted AudioTrack produces silence.
Whether a media element is actively rendering content (e.g., to a screen or audio device) has no effect on the content of captured streams. Muting the audio on a media element does not cause the capture to produce silence, nor does hiding a media element cause captured video to stop.
Captured audio from an element with an effective playback rate other than 1.0 MUST be time-stretched. An unplayable playback rate causes the captured audio track to become muted.
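The relationship between track selection and the set of captured tracks can be illustrated with a small model. The following is a minimal sketch, not a user agent implementation: the names CapturedStreamModel, SourceTrack, CapturedTrack, StreamEvent and captureId are hypothetical and exist only for this illustration, and the sketch covers only the addtrack and removetrack behaviour (muting and playback end are not modelled).

```typescript
// Hypothetical model types for illustration; they are not defined by this specification.
type TrackKind = "audio" | "video";

interface SourceTrack {
  id: string;
  kind: TrackKind;
  active: boolean; // stands in for VideoTrack.selected or AudioTrack.enabled
}

interface CapturedTrack {
  captureId: number; // distinguishes separate captured MediaStreamTrack instances
  sourceId: string;
  kind: TrackKind;
  ended: boolean;
}

interface StreamEvent {
  type: "addtrack" | "removetrack";
  captureId: number;
}

class CapturedStreamModel {
  tracks: CapturedTrack[] = [];
  private nextCaptureId = 1;

  // Reconcile the captured tracks with the element's current track selection.
  update(sources: SourceTrack[]): StreamEvent[] {
    const events: StreamEvent[] = [];
    const activeIds = new Set(sources.filter((s) => s.active).map((s) => s.id));

    // End, then remove, every capture whose source is no longer selected or enabled.
    const kept: CapturedTrack[] = [];
    for (const captured of this.tracks) {
      if (activeIds.has(captured.sourceId)) {
        kept.push(captured);
      } else {
        captured.ended = true; // a track is ended prior to being removed
        events.push({ type: "removetrack", captureId: captured.captureId });
      }
    }
    this.tracks = kept;

    // Create a new capture for every source that was not previously captured.
    for (const source of sources) {
      if (source.active && !this.tracks.some((t) => t.sourceId === source.id)) {
        const created: CapturedTrack = {
          captureId: this.nextCaptureId++,
          sourceId: source.id,
          kind: source.kind,
          ended: false,
        };
        this.tracks.push(created);
        events.push({ type: "addtrack", captureId: created.captureId });
      }
    }
    return events;
  }
}

// Usage: enabling, disabling and re-enabling one audio track.
const stream = new CapturedStreamModel();
const audioTrack: SourceTrack = { id: "audio-1", kind: "audio", active: true };
stream.update([audioTrack]); // addtrack for the first capture
audioTrack.active = false;
stream.update([audioTrack]); // removetrack; the first capture is ended
audioTrack.active = true;
const reEnabled = stream.update([audioTrack]); // addtrack for a second, separate capture
```

Because an ended capture is never reused in this model, the re-enabled source receives a fresh captureId, mirroring the rule that a track that is enabled, disabled and re-enabled is captured as two separate tracks.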
3. HTML Canvas Element Media Capture Extensions
The captureStream() method is added to the HTMLCanvasElement. The resulting CanvasCaptureMediaStreamTrack provides methods that allow for controlling when frames are sampled from the canvas.
partial interface HTMLCanvasElement {
    MediaStream captureStream(optional double frameRequestRate);
};
3.1. Methods
captureStream(optional double frameRequestRate)
This method produces a real-time video capture of the surface of the canvas. The resulting MediaStream has a single video CanvasCaptureMediaStreamTrack that matches the dimensions of the canvas element.
Content from a canvas that is not origin-clean MUST NOT be captured. This method throws a SecurityError exception if the canvas is not origin-clean.
A captured stream MUST immediately cease to capture content if the origin-clean flag of the source canvas becomes false after the stream is created by captureStream(). The captured MediaStreamTrack MUST become muted, producing no new content while the canvas remains in this state.
Each track that captures a canvas has an internal frameCaptureRequested property that is set to true when a new frame is requested from the canvas.
The value of the frameCaptureRequested property on all new tracks is set to true when the track is created. On creation of the captured track with a specific, non-zero frameRequestRate, the user agent starts a periodic timer at an interval of 1/frameRequestRate seconds. At each activation of the timer, the frameCaptureRequested property is set to true.
In order to support manual control of frame capture with the requestFrame() method, browsers MUST support a value of 0 for frameRequestRate. However, a captured stream MUST request capture of a frame when created, even if frameRequestRate is zero.
This method throws a NotSupportedError if frameRequestRate is negative.
A new frame is requested from the canvas when frameCaptureRequested is true and the canvas is painted. Each time that the captured canvas is painted, execute the following steps, for each track capturing from the canvas:
- If new content has been drawn to the canvas since it was last painted, and if the frameCaptureRequested internal property of track is set, add a new frame to track containing what was painted to the canvas.
- If a frameRequestRate value was specified, set the frameCaptureRequested internal property of track to false.
When adding new frames to track containing what was painted to the canvas, the alpha channel content of the canvas must be captured and preserved if the canvas is not fully opaque. The consumers of this track might not preserve the alpha channel.
This algorithm results in a captured track not starting until something changes in the canvas.
Parameters: frameRequestRate, of type double; not nullable; optional.
Return type: MediaStream
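The interaction between frameRequestRate, the periodic timer and the paint steps can be traced with a small simulation. This is a sketch under stated assumptions rather than the algorithm as a user agent would implement it: the class CanvasCaptureTrackModel and its members onTimer, onPaint, frames and timerIntervalSeconds are hypothetical, frames are represented as strings, a plain Error stands in for the NotSupportedError exception, and requestFrame() is assumed to set frameCaptureRequested to true.

```typescript
// Hypothetical model of one track capturing a canvas. Only frameCaptureRequested,
// frameRequestRate and requestFrame() are names taken from this document.
class CanvasCaptureTrackModel {
  frameCaptureRequested = true; // every new track requests a first frame
  frames: string[] = []; // stand-in for captured frames
  timerIntervalSeconds: number | null = null;
  private rateSpecified: boolean;

  constructor(frameRequestRate?: number) {
    if (frameRequestRate !== undefined && frameRequestRate < 0) {
      // Stand-in for the NotSupportedError thrown for a negative rate.
      throw new Error("NotSupportedError: frameRequestRate is negative");
    }
    this.rateSpecified = frameRequestRate !== undefined;
    if (frameRequestRate !== undefined && frameRequestRate > 0) {
      // A non-zero rate starts a periodic timer at 1/frameRequestRate seconds.
      this.timerIntervalSeconds = 1 / frameRequestRate;
    }
  }

  // Called at each activation of the (simulated) periodic timer.
  onTimer(): void {
    this.frameCaptureRequested = true;
  }

  // Assumption: a manual request sets the same internal property.
  requestFrame(): void {
    this.frameCaptureRequested = true;
  }

  // Called each time the captured canvas is painted.
  onPaint(painted: string, hasNewContent: boolean): void {
    // Step 1: capture only if content changed and a frame is requested.
    if (hasNewContent && this.frameCaptureRequested) {
      this.frames.push(painted);
    }
    // Step 2: with a specified rate, wait for the next timer tick or request.
    if (this.rateSpecified) {
      this.frameCaptureRequested = false;
    }
  }
}

// Usage: with a rate of 0, only the initial frame and manually requested frames are captured.
const manualTrack = new CanvasCaptureTrackModel(0);
manualTrack.onPaint("frame A", true); // captured: first frame is always requested
manualTrack.onPaint("frame B", true); // skipped: no request pending
manualTrack.requestFrame();
manualTrack.onPaint("frame C", true); // captured
```

When no frameRequestRate is given, the second step never clears the property in this model, so every paint that contains new content adds a frame.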
3.2. CanvasCaptureMediaStreamTrack
CanvasCaptureMediaStreamTrack is an extension of MediaStreamTrack that provides a single requestFrame() method. Applications that depend on tight control over the rendering of content to the media stream can use this method to control when frames from the canvas are captured.
interface CanvasCaptureMediaStreamTrack : MediaStreamTrack {
    readonly attribute HTMLCanvasElement canvas;
    void requestFrame();
};
3.2.1. Attributes
canvas, of type HTMLCanvasElement, readonly
The HTMLCanvasElement element being captured.
3.2.2. Methods
requestFrame()
This method allows applications to manually request that a frame from the canvas be captured and rendered into the track. In cases where applications progressively render to a canvas, this allows applications to avoid capturing a partially rendered frame.
As currently specified, this results in no SecurityError or other error feedback if the canvas is not origin-clean. In part, this is because we don’t track where requests for frames come from. Do we want to highlight that?
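One way an application might combine captureStream(0) with requestFrame() to avoid capturing partially rendered frames is sketched below. The interfaces CanvasTrackLike, CanvasStreamLike and CanvasLike and the function startManualCapture are hypothetical stand-ins introduced for this example, so that the sketch does not depend on a browser environment; in a page, a canvas element would be passed in their place.

```typescript
// Hypothetical structural stand-ins for the objects involved; only captureStream(),
// getVideoTracks() and requestFrame() are names from the relevant specifications.
interface CanvasTrackLike {
  kind: string;
  requestFrame?(): void;
}

interface CanvasStreamLike {
  getVideoTracks(): CanvasTrackLike[];
}

interface CanvasLike {
  captureStream(frameRequestRate?: number): CanvasStreamLike;
}

function startManualCapture(canvas: CanvasLike) {
  // A rate of 0 disables the periodic timer; frames are captured on request.
  const stream = canvas.captureStream(0);
  const track = stream.getVideoTracks()[0];
  const requestFrame = track?.requestFrame;
  if (!track || typeof requestFrame !== "function") {
    throw new Error("captured track does not support requestFrame()");
  }
  return {
    stream,
    // Run every drawing step first, then request capture of the finished frame.
    commitFrame(drawSteps: Array<() => void>): void {
      for (const step of drawSteps) {
        step();
      }
      requestFrame.call(track);
    },
  };
}
```

A caller would invoke commitFrame() once per complete frame, passing all of that frame's drawing operations. Note that the stream also requests one frame when it is created, even with a rate of zero, so the first painted content can be captured before any call to commitFrame().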
4. Security considerations
Media elements can render media resources from origins that differ from the origin of the media element. In those cases, the contents of the resulting MediaStreamTrack MUST be protected from access by the document origin.
How this protection manifests will differ, depending on how the content is accessed. For instance, rendering inaccessible video to a canvas element causes the origin-clean property of the canvas to become false; attempting to create a Web Audio MediaStreamAudioSourceNode succeeds, but produces no information to the document origin (that is, only silence is transmitted into the audio context); attempting to transfer the media using RTCPeerConnection results in no information being transmitted.
The origin of the media that is rendered by a media element can change at any time. This is even the case for a single media resource. User agents MUST ensure that a change in the origin of media doesn’t result in exposure of cross-origin content.
5. Acknowledgements
This document is based on the stream processing specification [streamproc] originally developed by Robert O’Callahan.