Tutorial 01: Capture Screens
Getting visual input is a fundamental requirement for training a deep learning model. In this tutorial, we show how to use the VIVID Python API to capture images.

VIVID provides several types of visual input, including scene, depth, and segmentation, based on AirSim. To get pictures from a camera, simply call `simGetImages` on a `VividClient` and pass it a list of `ImageRequest` objects. The response of `simGetImages` is a list of byte arrays in PNG format, one for each `ImageRequest`. For example:

```python
response = client.simGetImages([ImageRequest(0, AirSimImageType.Scene)])
```
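The returned bytes can be written straight to disk. A minimal sketch, assuming the AirSim-style response objects expose the raw PNG bytes via an attribute such as `image_data_uint8` (the helper itself is plain Python):

```python
def save_response_png(png_bytes, path):
    """Write a raw PNG byte string returned by simGetImages to a file."""
    with open(path, "wb") as f:
        f.write(bytes(png_bytes))
    return path

# With a connected client (requires a running simulator, not executed here):
# responses = client.simGetImages([ImageRequest(0, AirSimImageType.Scene)])
# save_response_png(responses[0].image_data_uint8, "scene.png")
```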
In the following sections, we briefly describe the parameters of `ImageRequest` and the available image types, and provide an example.
| Type | Parameter | Description |
|---|---|---|
| int | cameraId | Camera ID |
| int | image_type | Type of image. Default: `AirSimImageType.Scene` |
| bool | pixels_as_float | `True` to return pixels as floating-point values. Default: `False` |
| bool | compress | `True` to return a compressed image. Default: `False` |
| Type | Description |
|---|---|
| Scene | Visual appearance observed by the camera |
| DepthVis | Each pixel is interpolated from black to white by its distance from the camera plane, for depth visualization. Pure black: 0 m; pure white: 100 m or more |
| DepthPlanner | Distance from the camera plane to the plane through the pixel (parallel to the camera plane) |
| DepthPerspective | Distance from the camera viewpoint to the pixel |
| Segmentation | Ground-truth segmentation (by object) of the screen |
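When `pixels_as_float=True` is set on a depth request, each pixel in the response is a float distance in meters rather than a PNG byte stream. A minimal sketch that rescales such values to 0–255 grayscale, clipping at 100 m to mirror the DepthVis convention above (the helper is plain Python; the commented request lines and the `image_data_float` attribute are assumptions carried over from the AirSim-style API):

```python
def depth_to_gray(depths, max_depth=100.0):
    """Map float depths (meters) to 0-255 grayscale, clipping at max_depth."""
    return [int(min(d, max_depth) / max_depth * 255) for d in depths]

# With a connected client (not executed here):
# req = ImageRequest(0, AirSimImageType.DepthPerspective, True, False)
# resp = client.simGetImages([req])[0]
# gray = depth_to_gray(resp.image_data_float)
```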
The subwindows from left to right in the example picture below show the captured images (DepthVis, Segmentation, and Scene, respectively).
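Several image types can be requested in a single `simGetImages` call, as in the picture above. A small sketch that keys responses by the type requested, relying on the fact that responses come back in the same order as the requests (the `image_type` attribute name is an assumption carried over from the AirSim-style API):

```python
def responses_by_type(requests, responses):
    """Pair each request's image_type with its response (same order)."""
    return {req.image_type: resp for req, resp in zip(requests, responses)}

# With a connected client (not executed here):
# requests = [ImageRequest(0, AirSimImageType.DepthVis),
#             ImageRequest(0, AirSimImageType.Segmentation),
#             ImageRequest(0, AirSimImageType.Scene)]
# images = responses_by_type(requests, client.simGetImages(requests))
```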
A Python example is provided here.