1. Introduction
The API defined in this document captures images from a photographic device referenced through a valid MediaStreamTrack. The produced image can be in the form of a Blob (see the takePhoto() method) or of an ImageBitmap (see grabFrame()). Picture-specific settings can optionally be provided as arguments to be applied to the device for the capture.
2. Image Capture API
The User Agent must support Promises in order to implement the Image Capture API. Any Promise object is assumed to have a resolver object, with resolve() and reject() methods associated with it.
[Constructor(MediaStreamTrack track)]
interface ImageCapture {
  readonly attribute MediaStreamTrack videoStreamTrack;
  Promise<Blob>              takePhoto();
  Promise<PhotoCapabilities> getPhotoCapabilities();
  Promise<void>              setOptions(optional PhotoSettings photoSettings);
  Promise<ImageBitmap>       grabFrame();
};
takePhoto() returns a captured image encoded in the form of a Blob, whereas grabFrame() returns a snapshot of the videoStreamTrack video feed in the form of a non-encoded ImageBitmap.
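A minimal, non-normative sketch contrasting the two methods; it assumes imageCapture is an already constructed ImageCapture instance:

imageCapture.takePhoto()
    .then(blob => console.log('Encoded image: ' + blob.type + ', ' + blob.size + ' bytes'));
imageCapture.grabFrame()
    .then(imageBitmap => console.log('Unencoded frame: ' +
        imageBitmap.width + 'x' + imageBitmap.height));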
2.1. Attributes
videoStreamTrack, of type MediaStreamTrack, readonly
- The MediaStreamTrack passed into the constructor.
2.2. Methods
ImageCapture(MediaStreamTrack track)

Parameter: track, of type MediaStreamTrack, neither nullable nor optional - The MediaStreamTrack to be used as the source of data. This will be the value of the videoStreamTrack attribute.

The MediaStreamTrack passed to the constructor MUST have its kind attribute set to "video"; otherwise a DOMException of type NotSupportedError will be thrown.

takePhoto()
takePhoto() produces the result of a single photographic exposure using the video capture device sourcing the videoStreamTrack, applying any PhotoSettings previously configured, and returning an encoded image in the form of a Blob if successful. When this method is invoked:
- If the readyState of the videoStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException whose name is InvalidStateError.
- Otherwise it MUST queue a task, using the DOM manipulation task source, that runs the following steps:
  - Gather data from the videoStreamTrack into a Blob containing a single still image. The method of doing this will depend on the underlying device.
  - If the UA is unable to execute the takePhoto() method for any reason (for example, upon invocation of multiple takePhoto() method calls in rapid succession), then the UA MUST return a promise rejected with a new DOMException whose name is UnknownError.
  - Return a resolved promise with the Blob object.
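The following non-normative sketch shows the expected usage pattern, including handling of the rejection names defined above; it assumes a camera is available via getUserMedia():

navigator.mediaDevices.getUserMedia({video: true})
    .then(mediaStream => {
      const track = mediaStream.getVideoTracks()[0];
      const imageCapture = new ImageCapture(track);
      return imageCapture.takePhoto();
    })
    .then(blob => console.log('Photo captured: ' + blob.type + ', ' + blob.size + 'B'))
    .catch(err => {
      // err.name is InvalidStateError if the track was not live, or
      // UnknownError if the capture could not be completed.
      console.error('takePhoto() failed: ', err);
    });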
getPhotoCapabilities()
getPhotoCapabilities() is used to retrieve the ranges of available configuration options and their current setting values, if any. When this method is invoked:
- If the readyState of the videoStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException whose name is InvalidStateError.
- Otherwise it MUST queue a task, using the DOM manipulation task source, that runs the following steps:
  - Gather data from the videoStreamTrack into a PhotoCapabilities object containing the available capabilities of the device, including ranges where appropriate. The resolved PhotoCapabilities will also include the current conditions in which the capabilities of the device are found. The method of doing this will depend on the underlying device.
  - If the UA is unable to execute the getPhotoCapabilities() method for any reason (for example, the MediaStreamTrack being ended asynchronously), then the UA MUST return a promise rejected with a new DOMException whose name is OperationError.
  - Return a resolved promise with the PhotoCapabilities object.
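A non-normative sketch of querying the capabilities; it assumes imageCapture is an already constructed ImageCapture whose track is live:

imageCapture.getPhotoCapabilities()
    .then(photoCapabilities => {
      // Each MediaSettingsRange attribute carries min, max, current and step.
      console.log('Zoom range: [' + photoCapabilities.zoom.min + ', ' +
                  photoCapabilities.zoom.max + '], current: ' +
                  photoCapabilities.zoom.current);
      console.log('Fill light mode: ' + photoCapabilities.fillLightMode);
    })
    .catch(err => console.error('getPhotoCapabilities() failed: ', err));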
setOptions()
setOptions() is used to configure a number of settings affecting the image capture and/or the current video feed in the videoStreamTrack. When this method is invoked:
- If the readyState of the videoStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException whose name is InvalidStateError.
- If an invalid PhotoSettings object is passed as argument, return a promise rejected with a new DOMException whose name is SyntaxError.
- Otherwise it MUST queue a task, using the DOM manipulation task source, that runs the following steps:
  - Configure the underlying image capture device with the settings parameter.
  - If the UA cannot successfully apply the settings, then the UA MUST return a promise rejected with a new DOMException whose name is OperationError.
  - If the UA can successfully apply the settings, then the UA MUST return a resolved promise.

If the UA can successfully apply the settings, the effect MAY be reflected, if visible at all, in the videoStreamTrack. The result of applying some of the settings MAY force the latter to not satisfy its constraints (e.g. the frame rate). The result of applying some of these settings might not be immediate.

Parameter: settings, of type PhotoSettings, nullable, not optional - The PhotoSettings dictionary to be applied.

Note: Many of the PhotoSettings represent hardware capabilities that cannot be modified instantaneously, e.g. zoom or focus. setOptions() will resolve the Promise as soon as possible. The actual status of any field can be monitored using getPhotoCapabilities().
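A non-normative sketch of applying a setting and then polling its actual value, as the note above suggests; it assumes imageCapture is an already constructed ImageCapture:

imageCapture.setOptions({zoom: 2.0})
    .then(() => imageCapture.getPhotoCapabilities())
    .then(photoCapabilities => {
      // The promise resolves as soon as the request is accepted; the value
      // actually in effect is reported by getPhotoCapabilities().
      console.log('Zoom now at: ' + photoCapabilities.zoom.current);
    })
    .catch(err => console.error('setOptions() failed: ', err));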
grabFrame()
grabFrame() takes a snapshot of the live video being held in the videoStreamTrack, returning an ImageBitmap if successful. grabFrame() returns data only once upon being invoked. When this method is invoked:
- If the readyState of the videoStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException whose name is InvalidStateError.
- Otherwise it MUST queue a task, using the DOM manipulation task source, that runs the following steps:
  - Gather data from the videoStreamTrack into an ImageBitmap object. The width and height of the ImageBitmap object are derived from the constraints of the videoStreamTrack. The result of grabFrame() is affected by any options set by setOptions() if those are reflected in the videoStreamTrack.
  - Return a resolved promise with a newly created ImageBitmap object.
  - If the UA is unable to execute the grabFrame() method for any reason (for example, upon invocation of multiple grabFrame()/takePhoto() method calls in rapid succession), then the UA MUST return a promise rejected with a new DOMException whose name is UnknownError.
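Besides the drawImage() approach of §9.2 and §9.3, the returned ImageBitmap can, where the UA supports the HTML 'bitmaprenderer' canvas context, be handed to a canvas without an intermediate copy. A non-normative sketch, assuming a <canvas> element in the page and an already constructed imageCapture:

imageCapture.grabFrame()
    .then(imageBitmap => {
      const canvas = document.querySelector('canvas');
      canvas.width = imageBitmap.width;
      canvas.height = imageBitmap.height;
      // Transfer the ImageBitmap directly into the canvas.
      canvas.getContext('bitmaprenderer').transferFromImageBitmap(imageBitmap);
    })
    .catch(err => console.error('grabFrame() failed: ', err));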
3. PhotoCapabilities
interface PhotoCapabilities {
  readonly attribute MeteringMode       whiteBalanceMode;
  readonly attribute MediaSettingsRange colorTemperature;
  readonly attribute MeteringMode       exposureMode;
  readonly attribute MediaSettingsRange exposureCompensation;
  readonly attribute MediaSettingsRange iso;
  readonly attribute boolean            redEyeReduction;
  readonly attribute MeteringMode       focusMode;
  readonly attribute MediaSettingsRange brightness;
  readonly attribute MediaSettingsRange contrast;
  readonly attribute MediaSettingsRange saturation;
  readonly attribute MediaSettingsRange sharpness;
  readonly attribute MediaSettingsRange imageHeight;
  readonly attribute MediaSettingsRange imageWidth;
  readonly attribute MediaSettingsRange zoom;
  readonly attribute FillLightMode      fillLightMode;
};
3.1. Attributes
whiteBalanceMode, of type MeteringMode, readonly
- This reflects the current white balance mode setting.

colorTemperature, of type MediaSettingsRange, readonly
- This range reflects the current correlated color temperature being used for the scene white balance calculation and its available range.

exposureMode, of type MeteringMode, readonly
- This reflects the current exposure mode setting.

exposureCompensation, of type MediaSettingsRange, readonly
- This reflects the current exposure compensation setting and permitted range. The supported range can be, and usually is, centered around 0 EV.

iso, of type MediaSettingsRange, readonly
- This reflects the current camera ISO setting and permitted range. Values are numeric.

redEyeReduction, of type boolean, readonly
- This reflects whether camera red eye reduction is on or off, and is boolean: on is true.

focusMode, of type MeteringMode, readonly
- This reflects the current focus mode setting.

brightness, of type MediaSettingsRange, readonly
- This reflects the current brightness setting of the camera and permitted range. Values are numeric. Increasing values indicate increasing brightness.

contrast, of type MediaSettingsRange, readonly
- This reflects the current contrast setting of the camera and permitted range. Values are numeric. Increasing values indicate increasing contrast.

saturation, of type MediaSettingsRange, readonly
- This reflects the current saturation setting of the camera and permitted range. Values are numeric. Increasing values indicate increasing saturation.

sharpness, of type MediaSettingsRange, readonly
- This reflects the current sharpness setting of the camera and permitted range. Values are numeric. Increasing values indicate increasing sharpness, and the minimum value always implies no sharpness enhancement or processing.

imageHeight, of type MediaSettingsRange, readonly
- This reflects the image height range supported by the UA and the current height setting.

imageWidth, of type MediaSettingsRange, readonly
- This reflects the image width range supported by the UA and the current width setting.

zoom, of type MediaSettingsRange, readonly
- This reflects the zoom value range supported by the UA and the current zoom setting.

fillLightMode, of type FillLightMode, readonly
- This reflects the current fill light (flash) mode setting. Values are of type FillLightMode.
Note: The imageWidth and imageHeight ranges are provided to prevent increasing the fingerprinting surface and to allow the UA to make a best-effort decision with regards to the actual hardware configuration.

3.2. Discussion
This section is non-normative. The PhotoCapabilities interface provides the photo-specific settings and their current values. Many of these fields mirror hardware capabilities that are hard to define, since they can be implemented in a number of ways. Moreover, hardware manufacturers tend to publish vague definitions to protect their intellectual property. The following definitions are assumed for individual settings and are provided for information purposes:
- White balance mode is a setting that cameras use to adjust for different color temperatures. Color temperature is the temperature of the background light (usually measured in Kelvin). This setting can usually be automatically and continuously determined by the implementation, but it's also common to offer a manual mode in which the estimated temperature of the scene illumination is hinted to the implementation. Typical temperature ranges for popular modes are provided below:

  Mode               Kelvin range
  incandescent       2500-3500
  fluorescent        4000-5000
  warm-fluorescent   5000-5500
  daylight           5500-6500
  cloudy-daylight    6500-8000
  twilight           8000-9000
  shade              9000-10000

- Exposure is the amount of time during which light is allowed to fall on the photosensitive device. Auto-exposure mode is a camera setting where the exposure levels are automatically adjusted by the implementation based on the subject of the photo.
- Exposure Compensation is a numeric camera setting that adjusts the exposure level from the current value used by the implementation. This value can be used to bias the exposure level enabled by auto-exposure, and usually is a symmetric range around 0 EV (the no-compensation value).
- The ISO setting of a camera describes the sensitivity of the camera to light. It is a numeric value, where the higher the value the greater the sensitivity. This value should follow the [iso12232] standard.
- Red Eye Reduction is a feature in cameras that is designed to limit or prevent the appearance of red pupils ("Red Eye") in photography subjects due to prolonged exposure to a camera’s flash.
- Focus mode describes the focus setting of the capture device (e.g. `auto` or `manual`).
- [LIGHTING-VOCABULARY] defines brightness as "the attribute of a visual sensation according to which an area appears to emit more or less light" and in the context of the present API, it refers to the numeric camera setting that adjusts the perceived amount of light emitting from the photo object. A higher brightness setting increases the intensity of darker areas in a scene while compressing the intensity of brighter parts of the scene. The range and effect of this setting is implementation dependent but in general it translates into a numerical value that is added to each pixel with saturation.
- Contrast is the numeric camera setting that controls the difference in brightness between light and dark areas in a scene. A higher contrast setting reflects an expansion in the difference in brightness. The range and effect of this setting is implementation dependent but it can be understood as a transformation of the pixel values so that the luma range in the histogram becomes larger; the transformation is sometimes as simple as a gain factor (a sketch of this and of the brightness interpretation follows this list).
- [LIGHTING-VOCABULARY] defines saturation as "the colourfulness of an area judged in proportion to its brightness" and in the current context it refers to a numeric camera setting that controls the intensity of color in a scene (i.e. the amount of gray in the scene). Very low saturation levels will result in photos closer to black-and-white. Saturation is similar to contrast but referring to colors, so its implementation, albeit being platform dependent, can be understood as a gain factor applied to the chroma components of a given image.
- Sharpness is a numeric camera setting that controls the intensity of edges in a scene. Higher sharpness settings result in higher contrast along the edges, while lower settings result in less contrast and blurrier edges (i.e. soft focus). The implementation is platform dependent, but it can be understood as the linear combination of an edge detection operation applied to the original image and the original image itself, with the relative weights controlled by this sharpness setting.
- Zoom is a numeric camera setting that controls the focal length of the lens. The setting usually represents a ratio, e.g. 4 is a zoom ratio of 4:1. The minimum value is usually 1, to represent a 1:1 ratio (i.e. no zoom).
- Fill light mode describes the flash setting of the capture device (e.g. `auto`, `off`, `on`).
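As an informal illustration of the brightness and contrast interpretations above (an additive offset applied with saturation, and a gain factor around the mid level, respectively), the following non-normative sketch applies both to an ImageData obtained as in §9.3; the transfer functions used by real devices are implementation dependent.

function applyBrightnessContrast(imageData, brightnessOffset, contrastGain) {
  const data = imageData.data;
  for (let i = 0; i < data.length; i += 4) {
    for (let c = 0; c < 3; ++c) {  // red, green, blue; alpha is left untouched
      // Contrast as a gain around the mid level, brightness as an additive
      // offset, both clamped ("with saturation") to the [0, 255] range.
      const value = (data[i + c] - 128) * contrastGain + 128 + brightnessOffset;
      data[i + c] = Math.max(0, Math.min(255, value));
    }
  }
  return imageData;
}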
4. PhotoSettings
dictionary PhotoSettings {
  MeteringMode      whiteBalanceMode;
  double            colorTemperature;
  MeteringMode      exposureMode;
  double            exposureCompensation;
  double            iso;
  boolean           redEyeReduction;
  MeteringMode      focusMode;
  sequence<Point2D> pointsOfInterest;
  double            brightness;
  double            contrast;
  double            saturation;
  double            sharpness;
  double            zoom;
  double            imageHeight;
  double            imageWidth;
  FillLightMode     fillLightMode;
};
4.1. Members
whiteBalanceMode, of type MeteringMode
- This reflects the desired white balance mode setting.

colorTemperature, of type double
- Color temperature to be used for the white balance calculation of the scene. This field is only significant if whiteBalanceMode is manual.

exposureMode, of type MeteringMode
- This reflects the desired exposure mode setting. Acceptable values are of type MeteringMode.

exposureCompensation, of type double
- This reflects the desired exposure compensation setting. A value of 0 EV is interpreted as no exposure compensation.

iso, of type double
- This reflects the desired camera ISO setting.

redEyeReduction, of type boolean
- This reflects whether camera red eye reduction is desired.

focusMode, of type MeteringMode
- This reflects the desired focus mode setting. Acceptable values are of type MeteringMode.

pointsOfInterest, of type sequence<Point2D>
- A sequence of Point2Ds to be used as metering area centers for other settings, e.g. Focus, Exposure and Auto White Balance. A Point2D Point of Interest is interpreted to represent a pixel position in a normalized square space ({x, y} ∈ [0.0, 1.0]). The origin of coordinates {x, y} = {0.0, 0.0} represents the upper leftmost corner, whereas {x, y} = {1.0, 1.0} represents the lower rightmost corner: the x coordinate (columns) increases rightwards and the y coordinate (rows) increases downwards. Values beyond the mentioned limits will be clamped to the closest allowed value (see the sketch after this list).

brightness, of type double
- This reflects the desired brightness setting of the camera.

contrast, of type double
- This reflects the desired contrast setting of the camera.

saturation, of type double
- This reflects the desired saturation setting of the camera.

sharpness, of type double
- This reflects the desired sharpness setting of the camera.

zoom, of type double
- This reflects the desired zoom setting of the camera.

imageWidth, of type double
- This reflects the desired image width. The UA MUST select the closest width value to this setting if it supports a discrete set of width options.

imageHeight, of type double
- This reflects the desired image height. The UA MUST select the closest height value to this setting if it supports a discrete set of height options.

fillLightMode, of type FillLightMode
- This reflects the desired fill light (flash) mode setting. Acceptable values are of type FillLightMode.
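A non-normative sketch of a combined PhotoSettings dictionary, including the pointsOfInterest usage described above; it assumes imageCapture is an already constructed ImageCapture.

imageCapture.setOptions({
  whiteBalanceMode: 'manual',
  colorTemperature: 3000,                 // only significant in 'manual' mode
  pointsOfInterest: [{x: 0.5, y: 0.5}],   // meter on the center of the frame
  fillLightMode: 'auto'
})
.catch(err => console.error('setOptions() failed: ', err));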
5. MediaSettingsRange
interface MediaSettingsRange {
  readonly attribute double max;
  readonly attribute double min;
  readonly attribute double current;
  readonly attribute double step;
};
5.1. Attributes
max, of type double, readonly
- The maximum value of this setting.

min, of type double, readonly
- The minimum value of this setting.

current, of type double, readonly
- The current value of this setting.

step, of type double, readonly
- The minimum difference between consecutive values of this setting.
6. FillLightMode
enum FillLightMode {
"unavailable",
"auto",
"off",
"flash",
"torch"
};
6.1. Values
unavailable
- This source does not have an option to change fill light modes (e.g., the camera does not have a flash).

auto
- The video device’s fill light will be enabled when required (typically low light conditions). Otherwise it will be off. Note that auto does not guarantee that a flash will fire when takePhoto() is called. Use flash to guarantee firing of the flash for the takePhoto() method.

off
- The source’s fill light and/or flash will not be used.

flash
- This value will always cause the flash to fire for the takePhoto() method.

torch
- The source’s fill light will be turned on (and remain on) while the source videoStreamTrack is active.
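A non-normative sketch of guaranteeing flash usage for a capture, as described for the flash value; it assumes imageCapture is an already constructed ImageCapture.

imageCapture.setOptions({fillLightMode: 'flash'})
    .then(() => imageCapture.takePhoto())
    .then(blob => console.log('Photo taken with flash: ' + blob.size + 'B'))
    .catch(err => console.error('Capture failed: ', err));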
7. MeteringMode
enum MeteringMode {
"none",
"manual",
"single-shot",
"continuous"
};
7.1. Values
none
- This source does not offer focus/exposure/white balance mode. For setting, this is interpreted as a command to turn off the feature.
manual
- The capture device is set to manually control the lens position/exposure time/white balance, or such a mode is requested to be configured.
single-shot
- The capture device is configured for single-sweep autofocus/one-shot exposure/white balance calculation, or such a mode is requested.
continuous
- The capture device is configured for continuous focusing for near-zero shutter-lag/continuous auto exposure/white balance calculation, or such continuous focus hunting/exposure/white balance calculation mode is requested.
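A non-normative sketch of reading the current focus mode and requesting a single autofocus sweep; it assumes imageCapture is an already constructed ImageCapture.

imageCapture.getPhotoCapabilities()
    .then(photoCapabilities => {
      if (photoCapabilities.focusMode === 'none') {
        return;  // The device does not offer a focus mode.
      }
      // Request a single-sweep autofocus before the next capture.
      return imageCapture.setOptions({focusMode: 'single-shot'});
    })
    .catch(err => console.error(err));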
8. Point2D
A Point2D represents a location in a two-dimensional space. The origin of coordinates is situated in the upper leftmost corner of the space.
dictionary Point2D {
  double x = 0.0;
  double y = 0.0;
};
8.1. Members
x, of type double, defaulting to 0.0
- Value of the horizontal (abscissa) coordinate.

y, of type double, defaulting to 0.0
- Value of the vertical (ordinate) coordinate.
9. Examples
9.1. Update camera zoom and takePhoto()
<html>
<body>
<video autoplay></video>
<img>
<input type="range" hidden>
<script>
var imageCapture;

navigator.mediaDevices.getUserMedia({video: true})
    .then(gotMedia)
    .catch(err => console.error('getUserMedia() failed: ', err));

function gotMedia(mediastream) {
  const video = document.querySelector('video');
  video.srcObject = mediastream;

  const track = mediastream.getVideoTracks()[0];
  imageCapture = new ImageCapture(track);

  imageCapture.getPhotoCapabilities()
      .then(photoCapabilities => {
        // Check whether zoom is supported or not.
        if (!photoCapabilities.zoom.min && !photoCapabilities.zoom.max) {
          return;
        }

        // Map zoom to a slider element.
        const input = document.querySelector('input[type="range"]');
        input.min = photoCapabilities.zoom.min;
        input.max = photoCapabilities.zoom.max;
        input.step = photoCapabilities.zoom.step;
        input.value = photoCapabilities.zoom.current;
        input.oninput = function(event) {
          imageCapture.setOptions({zoom: event.target.value});
        }
        input.hidden = false;
      })
      .catch(err => console.error('getPhotoCapabilities() failed: ', err));
}

function takePhoto() {
  imageCapture.takePhoto()
      .then(blob => {
        console.log('Photo taken: ' + blob.type + ', ' + blob.size + 'B');
        const image = document.querySelector('img');
        image.src = URL.createObjectURL(blob);
      })
      .catch(err => console.error('takePhoto() failed: ', err));
}
</script>
</body>
</html>
9.2. Repeated grabbing of a frame with grabFrame()
<html>
<body>
<canvas></canvas>
<button onclick="stopGrabFrame()">Stop frame grab</button>
<script>
const canvas = document.querySelector('canvas');
var interval;
var track;

navigator.mediaDevices.getUserMedia({video: true})
    .then(gotMedia)
    .catch(err => console.error('getUserMedia() failed: ', err));

function gotMedia(mediastream) {
  track = mediastream.getVideoTracks()[0];
  var imageCapture = new ImageCapture(track);

  interval = setInterval(function () {
    imageCapture.grabFrame()
        .then(processFrame)
        .catch(err => console.error('grabFrame() failed: ', err));
  }, 1000);
}

function processFrame(imgData) {
  canvas.width = imgData.width;
  canvas.height = imgData.height;
  canvas.getContext('2d').drawImage(imgData, 0, 0);
}

function stopGrabFrame(e) {
  clearInterval(interval);
  track.stop();
}
</script>
</body>
</html>
9.3. Grabbing a Frame and Post-Processing
<html>
<body>
<canvas></canvas>
<script>
const canvas = document.querySelector('canvas');
var track;

navigator.mediaDevices.getUserMedia({video: true})
    .then(gotMedia)
    .catch(err => console.error('getUserMedia() failed: ', err));

function gotMedia(mediastream) {
  track = mediastream.getVideoTracks()[0];
  var imageCapture = new ImageCapture(track);
  imageCapture.grabFrame()
      .then(processFrame)
      .catch(err => console.error('grabFrame() failed: ', err));
}

function processFrame(imageBitmap) {
  track.stop();

  // |imageBitmap| pixels are not directly accessible: we need to paint
  // the grabbed frame onto a <canvas>, then getImageData() from it.
  const ctx = canvas.getContext('2d');
  canvas.width = imageBitmap.width;
  canvas.height = imageBitmap.height;
  ctx.drawImage(imageBitmap, 0, 0);

  // Read back the pixels from the <canvas>, and invert the colors.
  const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
  var data = imageData.data;
  for (var i = 0; i < data.length; i += 4) {
    data[i]     = 255 - data[i];      // red
    data[i + 1] = 255 - data[i + 1];  // green
    data[i + 2] = 255 - data[i + 2];  // blue
  }

  // Finally, draw the inverted image to the <canvas>
  ctx.putImageData(imageData, 0, 0);
}
</script>
</body>
</html>