Streams and MediaStreamTracks

Editor’s Draft

This version:
https://wicg.github.io/streams-mediastreamtrack
Issue Tracking:
GitHub
Inline In Spec
Editors:
(Google Inc.)
(Google Inc.)

Abstract

This document describes an API providing ReadableStreams (and associated data types) out of MediaStreamTracks.

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Introduction

Streams ([STREAMS]) are designed to provide real-time streams of data with powerful semantics (e.g. built-in backpressure and queuing) that allow users to build higher-level abstractions. MediaStreamTracks ([GETUSERMEDIA]) are opaque handles to real-time video or audio being transported in the browser. This document describes how a ReadableStream can be created out of a MediaStreamTrack.

Please see the Readme/Explainer in the repository for use cases and more rationale.

2. MediaStreamTrack API extension

partial interface MediaStreamTrack {
  // |any| should be ReadableStream, but that is not an IDL type.
  [CallWith=ScriptState] readonly attribute any readable;
};
readable, of type any, readonly
Returns a ReadableStream constructed out of this MediaStreamTrack; the stream follows the lifetime of the MediaStreamTrack. A ReadableStreamReader obtained from this stream produces VideoFrames.
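
As a rough sketch of how a consumer might pull frames from this attribute, the loop below acquires a reader and reads until the stream closes or a frame limit is reached. The helper name collectFrames and its maxFrames parameter are illustrative assumptions and not part of this API; only readable comes from the IDL above, while getReader(), read(), cancel() and releaseLock() are defined in [STREAMS].

```javascript
// Hypothetical helper (not part of this specification): reads up to
// maxFrames chunks from track.readable and returns them in an array.
async function collectFrames(track, maxFrames = Infinity) {
  const reader = track.readable.getReader();
  const frames = [];
  try {
    while (frames.length < maxFrames) {
      const { value, done } = await reader.read();
      if (done) break; // The stream closed, e.g. because the track ended.
      frames.push(value);
    }
    if (frames.length >= maxFrames) {
      // Stopping early: signal that no further chunks are wanted.
      await reader.cancel();
    }
  } finally {
    // Release the lock so that another reader can be acquired later.
    reader.releaseLock();
  }
  return frames;
}
```

Passing a small maxFrames is a convenient way to sample a few frames without holding the stream locked for the whole lifetime of the track.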

3. VideoFrame

typedef (Uint8Array or FrozenArray<Uint8Array>) VideoFrameDataArray;

interface VideoFrame {
  readonly attribute VideoFrameDataArray data;
  readonly attribute unsigned long width;
  readonly attribute unsigned long height;
  readonly attribute PixelFormat format;

  readonly attribute DOMHighResTimeStamp timecode;
};
When format == rgba, VideoFrame is just an HTML Canvas 2D Context §imagedata with a timecode. How to represent that in WebIDL?
data, of type VideoFrameDataArray, readonly
Consider using Web IDL §ArrayBufferView ("used to represent objects that provide a view on to an Web IDL §ArrayBuffer.") -- it might expose too many format combinations though.
width, of type unsigned long, readonly
Actual horizontal dimension of the data in the data object, in pixels.
height, of type unsigned long, readonly
Actual vertical dimension of the data in the data object, in pixels.
format, of type PixelFormat, readonly
This attribute specifies the concrete pixel format of data; rgba is equivalent to the format used by HTML Canvas 2D Context §imagedata.
timecode, of type DOMHighResTimeStamp, readonly
The difference between the timestamp of the first generated chunk of data in this VideoFrame and the timestamp of the first chunk in the first VideoFrame produced by this reader. Note that the timecode in the first produced VideoFrame does not need to be zero.
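
Since the timecode of the first produced VideoFrame need not be zero, a consumer that wants the time elapsed since the first frame it received has to subtract that first value itself. The following is a minimal sketch under the assumption that frames are objects exposing a numeric timecode in milliseconds (the unit of DOMHighResTimeStamp); elapsedSinceFirst is a hypothetical helper name, not part of this API.

```javascript
// Hypothetical helper: rebases timecodes so that the first frame maps to 0.
function elapsedSinceFirst(frames) {
  if (frames.length === 0) return [];
  const origin = frames[0].timecode;
  return frames.map((frame) => frame.timecode - origin);
}
```

For frames with timecodes 120, 153.5 and 187, this yields 0, 33.5 and 67.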

3.1. PixelFormat

enum PixelFormat {
  "rgba",
  "yuv420",
};
rgba
Specifies a one-dimensional data array in RGBA order, as integers in the range 0 to 255. This is the same format as the one in HTML Canvas 2D Context §imagedata.
yuv420
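
For rgba, the one-dimensional layout implies four bytes per pixel in row-major order, as in ImageData. The sketch below shows how a single pixel could be looked up under that reading. It assumes that data is a single Uint8Array with no row padding, which this draft does not state explicitly, and pixelAt is a hypothetical helper rather than part of this API. The layout of yuv420 is not described in this draft, so it is not covered here.

```javascript
// Hypothetical helper: returns the RGBA components of the pixel at (x, y).
// Assumes frame.data is one tightly packed Uint8Array (4 bytes per pixel).
function pixelAt(frame, x, y) {
  if (frame.format !== "rgba") {
    throw new TypeError("pixelAt only handles rgba frames");
  }
  if (x < 0 || y < 0 || x >= frame.width || y >= frame.height) {
    throw new RangeError("pixel coordinates are out of bounds");
  }
  // Rows are stored one after another; each pixel occupies 4 bytes.
  const offset = (y * frame.width + x) * 4;
  const [r, g, b, a] = frame.data.subarray(offset, offset + 4);
  return { r, g, b, a };
}
```

Under the same assumption, a well-formed rgba frame carries width * height * 4 bytes of data.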

4. Examples

4.1. Reading VideoFrames and drawing them onto a <canvas>

// Assuming |theCanvas| and |theStream| exist already.

let context = theCanvas.getContext("2d");

let track = theStream.getVideoTracks()[0];

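track.readable.pipeTo(new WritableStream({
  write(videoFrame) {
    // Only rgba frames map directly onto ImageData; skip anything else.
    console.assert(videoFrame.format == "rgba");
    if (videoFrame.format != "rgba")
      return;

    theCanvas.width  = videoFrame.width;
    theCanvas.height = videoFrame.height;

    // putImageData() requires an ImageData, so wrap the rgba bytes in one.
    // This assumes |data| is a single Uint8Array for rgba frames.
    let pixels = new Uint8ClampedArray(videoFrame.data.buffer,
                                       videoFrame.data.byteOffset,
                                       videoFrame.data.byteLength);
    context.putImageData(
        new ImageData(pixels, videoFrame.width, videoFrame.height), 0, 0);
  },
  close() {
    console.log("All data successfully read!");
  },
  abort(e) {
    console.error("Uh, oh, something went wrong: ", e);
  }
}));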

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[GETUSERMEDIA]
Daniel Burnett; et al. Media Capture and Streams. URL: https://www.w3.org/TR/mediacapture-streams/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[STREAMS]
Domenic Denicola; 吉野剛史 (Takeshi Yoshino). Streams Standard. Living Standard. URL: https://streams.spec.whatwg.org/
[WebIDL]
Cameron McCormack; Boris Zbarsky; Tobie Langel. Web IDL. URL: https://heycam.github.io/webidl/

IDL Index

partial interface MediaStreamTrack {
  // |any| should be ReadableStream, but that is not an IDL type.
  [CallWith=ScriptState] readonly attribute any readable;
};

typedef (Uint8Array or FrozenArray<Uint8Array>) VideoFrameDataArray;

interface VideoFrame {
  readonly attribute VideoFrameDataArray data;
  readonly attribute unsigned long width;
  readonly attribute unsigned long height;
  readonly attribute PixelFormat format;

  readonly attribute DOMHighResTimeStamp timecode;
};

enum PixelFormat {
  "rgba",
  "yuv420",
};

Issues Index

When format == rgba, VideoFrame is just an HTML Canvas 2D Context §imagedata with a timecode. How to represent that in WebIDL?
Consider using Web IDL §ArrayBufferView ("used to represent objects that provide a view on to an Web IDL §ArrayBuffer.") -- it might expose too many format combinations though.