This document outlines the supported USB Video Class Extension Unit (XU) APIs used by Google Meet conference systems to enable intelligent camera features. The purpose of creating this specification is to influence practices to enable these features and to allow for better asynchronous scalability and testing for our partners.
For details about the latest changes to this document, go to Release notes.
Test facility
To help partners validate compliance with this specification, we provide a test
facility in Chromebox-for-Meetings devices set in
developer mode.
Enable
writing to the filesystem.
Add the following lines to
/etc/chrome_dev.conf
:
--enable-logging
--log-level=0
Restart the device, connect the camera and a USB
keyboard, press Ctrl-Alt-X
, and the active camera's spec compliance will be
exercised and logged to /home/chronos/user/log/chrome
.
Little-endian convention
USB is a little-endian standard. Within this document:
- Multi-byte numbers appear big-endian (and are transmitted little-endian).
- Byte arrays are in the little-endian memory layout.
For example, 0x12345678
is the same as [0x78, 0x56, 0x34, 0x12]
.
Extension unit GUID
Extension units supporting this Meet XU control specification must use this GUID.
Extension Unit | GUID |
---|---|
Peripheral Control XU | {74D7E924-49C9-4A45-98A3-8A9F60061E83} |
Peripheral control XU selectors
These are the defined peripheral control XU selectors.
Control selectors | Value |
---|---|
GOOGXU_FRAME_STRATEGY |
0x01 |
GOOGXU_REFRAME |
0x02 |
GOOGXU_OCCUPANCY_COUNTING_TOGGLE |
0x03 |
GOOGXU_OCCUPANCY_COUNTING_READ |
0x04 |
GOOGXU_STATUS_INFO |
0x05 |
GOOGXU_STATUS_RESET |
0x06 |
GOOGXU_PRESETS |
0x07 |
GOOGXU_PAN_TILT_ABSOLUTE |
0x08 |
GOOGXU_PAN_TILT_RELATIVE |
0x09 |
Control request type
Control request types are defined in Chapter 4: Class Specific Requests of the UVC 1.5 Class Specification.
Operation | UVC control |
---|---|
GET |
GET_CUR , GET_MIN , GET_MAX , GET_RES , GET_LEN , GET_INFO , GET_DEF |
SET |
SET_CUR |
Camera modes
Camera modes are used to frame individuals in a meeting room and are a tuple
of:
- Strategy (camera view)
- Bias (speaker or room)
- Feeds (single or multiple streams)
Each dimension can take values described in the following sections.
Auto-framing strategy
Feature | Description |
---|---|
None | The camera disables all intelligent framing features and allows the client to freely control the PTZ values. Note: When set to this framing strategy, the camera remains in its current pan, tilt, and zoom position. |
Continuous Framing (CAZ) | Based on the framing bias, the camera continually tracks people in the room. Note: PTZ is disabled in this mode. |
Split Frames | The camera creates as many video views as needed. Based on the Auto-framing feeds option, it either composes them into tiles in a single stream or creates separate video streams for each view. Note: PTZ is disabled in this mode. |
Dynamic View | One or more cameras attempt to provide the best view of the room. It can decide whether or not to composite multiple feeds into one or provide an "interesting" view of the current room. The purpose of this view is to provide the most equitable view of in-room participants to the call. Notes:
|
Auto-framing bias
Feature | Description |
---|---|
High-Stakes Presenter (Speaker Tracking) | The camera attempts to best frame the person actively speaking in the room. In this scenario, the camera should bias toward the presenter. For example, the CEO in a boardroom giving a presentation. |
Collaboration (Room Tracking) |
The camera attempts to best frame all participants in the room. In this scenario, the camera should treat every participant equitably. Most meetings should use this strategy. |
Auto-framing feeds
Feature | Description |
---|---|
Single-Stream | The camera sends a single video stream to the host device. |
Multi-Stream (Work-In-Progress) |
The camera splits the stream and creates multiple video streams to send to the host. Note: The full specification and expected behavior of this feature is pending review and isn't supported until later revisions of this document. |
Auto-framing mode bitmap values
Except for the default state of None
that's represented by an empty byte
array, each bit in the byte array represents a different camera mode which is a
specific combination of the Auto-framing strategy, Auto-framing
bias, and the Auto-framing feeds.
Bit index | CAZ | Split-Frame | Dynamic | - |
---|---|---|---|---|
Speaker | D1 - |
- - |
D5 D6 |
Single-Stream Multi-Stream |
Room | D2 - |
D3 D4 |
D7 D8 |
Single-Stream Multi-Stream |
Frame modes | Frame mode value (least significant byte) |
---|---|
None |
0x00 |
CAZ, Speaker, Single-Stream |
0x01 |
CAZ, Room, Single-Stream |
0x02 |
Split-Frame, Room, Single-Stream |
0x04 |
Split-Frame, Room, Multi-Stream |
0x08 |
Dynamic, Speaker, Single-Stream |
0x10 |
Dynamic, Speaker, Multi-Stream |
0x20 |
Dynamic, Room, Single-Stream |
0x40 |
Dynamic, Room, Multi-Stream |
0x80 |
Control: GOOGXU_FRAME_STRATEGY
This control is used to get or set the framing modes of the camera as listed in
Auto-framing mode bitmap values. Each mode is represented as
a bit in their respective bitmap. The command GET_RES
returns an 8-byte long
bitmask with a value of zero (0) or one (1) to respectively indicate if the
feature is unsupported or supported by the device. For example, if a camera
supports CAZ, Speaker, Single-Stream
, Split-Frame, Room, Single-Stream
, and
Dynamic, Room, Multi-Stream
but no other modes, then GET_RES
should return
0x000000000000000085 (i.e. 0b10000101
followed by seven zero bytes).
The command SET_CUR
is used to send bitmaps to tell the camera which
SINGLE camera mode to enable.
Control selector | 1 | |||
---|---|---|---|---|
Operation | GET / SET |
|||
wLength |
8 | |||
Offset | Field | Size | Value | Description |
0 | bActiveMode |
8 | Bitmap | Set or return Active Camera Mode |
Notes:
|
The behavior of the supported request types is as follows:
Offset | 0 | Description |
---|---|---|
GET_CUR |
Get Active Framing Camera Mode | |
GET_MIN |
Camera-dependent | |
GET_MAX |
Camera-dependent | |
GET_RES |
Returns an 8-byte long bitmask of supported camera modes | |
GET_LEN |
0x0008 | Length |
GET_INFO |
0x0B | AutoUpdate / Write / Read |
GET_DEF |
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 |
Default value |
SET_CUR |
Set Active Framing Camera mode |
Control: GOOGXU_REFRAME
This control is used to trigger One-Shot Framing, also known as OTAZ. When OTAZ is triggered, the camera view snaps to the best view of the room. Afterwards, the client regains the ability to control the PTZ values. If one-shot framing isn't supported, the camera shouldn't define this control.
Control selector | 2 | |||
---|---|---|---|---|
Operation | SET |
|||
wLength |
1 | |||
Offset | Field | Size | Value | Description |
0 | bReframe |
1 | Number | 0x01 Execute Reframe Request |
The behavior of the supported request types is as follows:
Offset | 0 | Description |
---|---|---|
GET_MIN |
0x00 | |
GET_MAX |
0x01 | |
GET_RES |
0x01 | |
GET_LEN |
0x0001 | |
GET_INFO |
0x02 | Write Only |
GET_DEF |
0x00 | |
SET_CUR |
Set request for One-Shot Framing |
Occupancy counting
Occupancy counting (OC) is a feature used to estimate the number of participants in a meeting room, despite the camera's cropped view.
This table shows the expected behavior of the OC controls and their interactions with the camera video stream and the camera LED indicator.
When Occupancy Counting is | & the camera video stream is: | The camera LED indicator should be | GOOGXU_OCCUPANCY_COUNTING_TOGGLE GET_CUR should be |
GOOGXU_OCCUPANCY_COUNTING_READ GET_CUR should be |
---|---|---|---|---|
Turned on | Not streaming and not muted | On | 0x01 |
The count of persons in the camera's full field of view. |
Turned on | Streaming | On | 0x01 |
The count of persons in the camera's full field of view. |
Turned on | Muted | Off | 0x01 |
Turned off |
Turned off | Not streaming and not muted | Off | 0x00 |
Turned off |
Turned off | Streaming | On | 0x00 |
Turned off |
Turned off | Muted | Off | 0x00 |
Turned off |
Control: GOOGXU_OCCUPANCY_COUNTING_TOGGLE
This control is used to enable or disable the feature to count occupants in a room. Setting a value of zero (0) disables this feature and one (1) enables this feature. If this feature is unsupported, the camera shouldn't define this control.
Control selector | 3 | |||
---|---|---|---|---|
Operation | GET / SET |
|||
wLength |
1 | |||
Offset | Field | Size | Value | Description |
0 | bOccupancy |
1 | Boolean | Set occupancy counting function 0x00 Turn off function 0x01 Turn on function |
The behavior of the supported request types is as follows:
Offset | 0 | Description |
---|---|---|
GET_CUR |
Return if occupancy counting is turned on | |
GET_MIN |
0x00 | |
GET_MAX |
0x01 | |
GET_RES |
0x01 | |
GET_LEN |
0x0001 | |
GET_INFO |
0x0B | AutoUpdate / Write / Read |
GET_DEF |
0x00 | |
SET_CUR |
Enable or disable occupancy counting feature |
Control: GOOGXU_OCCUPANCY_COUNTING_READ
This control is used to read the number of participants in a room reported by the camera when occupancy counting is enabled. When occupancy counting is disabled, the camera should disable this control. If occupancy counting isn't supported, the camera shouldn't define this control.
Control selector | 4 | |||
---|---|---|---|---|
Operation | GET |
|||
wLength |
2 | |||
Offset | Field | Size | Value | Description |
0 | bNumPeople |
2 | Number | The number of detected occupants in view. (Read Only) |
The behavior of the supported request types is as follows:
Offset | 0 | Description |
---|---|---|
GET_CUR |
Return number of detected occupants | |
GET_MIN |
0x0000 | |
GET_MAX |
0x00FF | |
GET_RES |
0x0001 | |
GET_LEN |
0x0002 | |
GET_INFO |
0x09 | AutoUpdate / Read |
GET_DEF |
0x0000 |
Device telemetry & diagnostics
These controls are meant to encourage better debugging practices with Meet hardware and are usually not user facing.
Control: GOOGXU_STATUS_INFO
This control is used to query information from the host camera to share with partners for debugging.
Control selector | 5 | |||
---|---|---|---|---|
Operation | GET |
|||
wLength |
8 | |||
Offset | Field | Size | Value | Description |
0 | bNumCameras |
1 | Number | The number of additional satellites attached to the main camera that may affect the camera stream returned to the host. |
1 | bIsMoving |
1 | Bitmap | 0 when the camera is idle, and non-zero when its PTZ values are changing. Vendors are free to map different axes or motors to different bits. |
2 | Undef |
6 | Undef | To be extended in the future. |
The behavior of the supported request types is as follows:
Offset | 0 | 1 | 2 | Description |
---|---|---|---|---|
GET_MIN |
0x00 | 0x00 | 0x00 0x00 0x00 0x00 0x00 0x00 |
|
GET_MAX |
0xFF | 0xFF | 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF |
|
GET_RES |
0x01 | 0x01 | 0x01 0x00 0x00 0x00 0x00 0x00 |
|
GET_LEN |
0x08 | 0x00 | 0x0008 | |
GET_INFO |
0x09 | AutoUpdate / Read | ||
GET_DEF |
0x00 | 0x00 | 0x00 0x00 0x00 0x00 0x00 0x00 |
Control: GOOGXU_STATUS_RESET
This control is used to issue a reset request to the camera. Setting a value of one (1) requests the camera to reset. The camera returns zero (0) if there's been no request to restart the camera since the last reset and one (1) if it's resetting. The reset must trigger a camera reboot. (This is needed for self-powered devices where forcing a USB-disconnect to emulate a hotplug isn't useful.)
Control selector | 6 | |||
---|---|---|---|---|
Operation | GET / SET |
|||
wLength |
1 | |||
Offset | Field | Size | Value | Description |
0 | bResetRequest |
1 | Boolean | Issue a reset request to the host and connected cameras. Returns 0x01 if reset request issued since last reset, else 0x00. |
The behavior of the supported request types is as follows:
Offset | 0 | Description |
---|---|---|
GET_MIN |
0x00 | |
GET_MAX |
0x01 | |
GET_RES |
0x01 | |
GET_LEN |
0x0001 | |
GET_INFO |
0x03 | Write / Read |
GET_DEF |
0x00 |
PTZ presets
Used to configure and restore the camera's field of view into a preset position.
Control: GOOGXU_PRESETS
This control is used to set the camera's pan, tilt, and zoom (PTZ) values to a preset configuration.
The Preset Action
is used to state the intended action of the command. Setting
a value of one (1) is used to map the current pan, tilt, and zoom values to a
provided preset index. Setting a value of two (2) should transition the pan,
tilt, and zoom of the camera to the previously mapped values for the provided
index, or the default factory coordinates (if not previously mapped). Setting a
value of three (3) resets the index to the factory default coordinates.
The Preset Index
is used to specify the PTZ coordinates mapped to the index.
The Preset index
of zero (0) is mapped to the home coordinates and should be
the camera's default position on wake when the GOOGXU_FRAME_STRATEGY
is set to
NONE
.
Control selector | 7 | |||
---|---|---|---|---|
Operation | SET |
|||
wLength |
2 | |||
Offset | Field | Size | Value | Description |
0 | bPresetAction |
1 | Number | 0x01: Save preset 0x02: Restore preset 0x03: Reset preset to default. (Default should be a valid preset coordinate.) |
1 | bPresetIndex |
1 | Number | The Active Preset index. 0~N-1 Where 0 is considered the default camera start position and N-1 is a vendor-defined constant for number of presets. |
The behavior of the supported request types is as follows:
Offset | 0 | 1 | Description |
---|---|---|---|
GET_MIN |
0x00 | 0x00 | |
GET_MAX |
0x03 | N-1 | N max presets supported |
GET_RES |
0x01 | 0x01 | |
GET_LEN |
0x02 | 0x00 | 0x0002 |
GET_INFO |
0x02 | Write only | |
GET_DEF |
0x00 | 0x00 |
Pan & tilt auxiliary mapping
Some cameras have special components, such as the motors for mechanical cameras, or digital PTZ capabilities. For these, use the standard V4L2 controls for pan, tilt, and zoom.
Control: GOOGXU_PAN_TILT_ABSOLUTE
(deprecated)
Pan and tilt auxiliary mapping controls are defined in Chapter 4: Class Specific Requests Section 4.2.2.1.14 PanTilt (Absolute) Control of the UVC 1.5 Class Specification.
Control: GOOGXU_PAN_TILT_RELATIVE
(deprecated)
Pan and tilt auxiliary mapping controls are defined in Chapter 4: Class Specific Requests Section 4.2.2.1.15 PanTilt (Relative) Control of the UVC 1.5 Class Specification.
Related topics
- Linux USB Video Class (UVC) driver
- Video Class v1.5 document set
- USB 2.0 specification
- Google Meet Hardware help center
- Google Meet Hardware section of the Google Workspace Admin help center
Release notes
These release notes reflect improvements and new features in each revision of this document.
May 21, 2024
- Removed test script in favor of the new CfM test facility,
which exercises
SET
and also supports validation of Series One cameras. - Clarified fields in
GOOGXU_STATUS_INFO
. - Clarified behavior of
GOOGXU_STATUS_RESET
. - Deprecated
GOOGXU_PAN_TILT_ABSOLUTE
andGOOGXU_PAN_TILT_RELATIVE
.
November 15, 2023
Updated test script to check and interpret valid framing modes. Clarified byte representations.
July 21, 2023
Added test script for partners to validate implementations for compliance with this specification.
May 25, 2023
Corrected
GOOGXU_PRESETS
note regarding the number of presets. It should be N, not N-1.
April 17, 2023
Initial release.