Vision Service

The vision service lets your machine use its on-board cameras to see and interpret the world around it. The camera component gives you access to what your machine’s camera sees; the vision service interprets that image data.

The vision service supports the following kinds of operations:

Detections

[Figure: A white dog with a bounding box around it labeled 'Dog: 0.71'.]

2D Object Detection is the process of taking a 2D image from a camera and identifying and drawing a box around the distinct “objects” of interest in the scene. Any camera that can return 2D images can use 2D object detection.

You can use different types of detectors, based on either heuristics or machine learning, depending on the objects you need to identify.

The returned detections consist of the bounding box around the identified object, as well as its label and confidence score:

  • x_min, y_min, x_max, y_max (int): specify the bounding box around the object.
  • class_name (string): specifies the label of the found object.
  • confidence (float): specifies the confidence of the assigned label. Between 0.0 and 1.0, inclusive.
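
As a rough illustration of working with these fields, the sketch below filters detections by confidence and computes each box's center. The `Detection` dataclass, `filter_detections`, and `box_center` are hypothetical stand-ins that only mirror the field names listed above; they are not the SDK's own types, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical stand-in mirroring the detection fields listed above."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int
    class_name: str
    confidence: float


def filter_detections(detections, min_confidence=0.5):
    """Keep detections at or above min_confidence, most confident first."""
    kept = [d for d in detections if d.confidence >= min_confidence]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)


def box_center(d):
    """Return the (x, y) center of a detection's bounding box, in pixels."""
    return ((d.x_min + d.x_max) / 2, (d.y_min + d.y_max) / 2)


# Made-up sample data for illustration only.
detections = [
    Detection(40, 60, 240, 300, "Dog", 0.71),
    Detection(10, 10, 50, 40, "Cat", 0.32),
]
confident = filter_detections(detections, min_confidence=0.5)
for d in confident:
    print(d.class_name, d.confidence, box_center(d))
```

The 0.5 threshold is arbitrary; a suitable cutoff depends on the detector and on how costly false positives are for your application.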

Supported API methods:

  • GetDetections
  • GetDetectionsFromCamera

Classifications

2D Image Classification is the process of taking a 2D image from a camera and deciding which class label, out of many, best describes the given image. Any camera that can return 2D images can use 2D image classification.

The class labels used for classification vary and depend on the machine learning model and how it was trained.

The returned classifications consist of the image’s class label and confidence score:

  • class_name (string): specifies the label assigned to the image.
  • confidence (float): specifies the confidence of the assigned label. Between 0.0 and 1.0, inclusive.
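
A classifier can return several candidate labels for one image, so a common follow-up step is to rank them by confidence. The sketch below shows one way to do that; the `Classification` dataclass and `top_classifications` helper are hypothetical and only mirror the two fields listed above, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    """Hypothetical stand-in mirroring the classification fields listed above."""
    class_name: str
    confidence: float


def top_classifications(classifications, count=1):
    """Return the `count` most confident classifications, best first."""
    ranked = sorted(classifications, key=lambda c: c.confidence, reverse=True)
    return ranked[:count]


# Made-up sample data for illustration only.
results = [
    Classification("cat", 0.12),
    Classification("dog", 0.83),
    Classification("fox", 0.05),
]
best = top_classifications(results, count=2)
print([(c.class_name, c.confidence) for c in best])
```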

Supported API methods:

  • GetClassifications
  • GetClassificationsFromCamera

Segmentations

3D Object Segmentation is the process of separating and returning a list of the identified “objects” from a 3D scene. The “objects” are usually a list of point clouds with associated metadata, like the label, the 3D bounding box, and center coordinates of the object.

3D object segmentation is useful for obstacle detection. For an example of automated obstacle avoidance built on 3D object segmentation, see our guide Navigate with a Rover Base.

Any camera that can return 3D point clouds can use 3D object segmentation.
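
To make the obstacle-detection use case concrete, the sketch below picks the segmented object whose center is closest to the camera. `SegmentedObject` and `nearest_object` are hypothetical; the sketch assumes each object's metadata exposes a label and center coordinates, treats the camera as the origin, and assumes millimeters as the unit. None of those details are confirmed by this page.

```python
import math
from dataclasses import dataclass


@dataclass
class SegmentedObject:
    """Hypothetical stand-in for one segmented object's metadata."""
    label: str
    center: tuple  # (x, y, z) center coordinates; millimeters assumed


def nearest_object(objects, origin=(0.0, 0.0, 0.0)):
    """Return (object, distance) for the object closest to origin, or None."""
    if not objects:
        return None
    closest = min(objects, key=lambda o: math.dist(origin, o.center))
    return closest, math.dist(origin, closest.center)


# Made-up sample data for illustration only.
objects = [
    SegmentedObject("wall", (0.0, 0.0, 2000.0)),
    SegmentedObject("chair", (300.0, 0.0, 400.0)),
]
nearest = nearest_object(objects)
if nearest is not None:
    obj, distance = nearest
    print(f"nearest obstacle: {obj.label} at {distance:.0f} mm")
```

Measuring to the object's center is a simplification; an obstacle-avoidance routine would more likely measure to the nearest point of the bounding box or point cloud.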

Supported API methods:

  • GetObjectPointClouds

Configuration

For configuration information, click on the model name:

Model
Description

API

The vision service supports the following vision service API methods:

  • GetDetectionsFromCamera: Get a list of detections from the next image from a specified camera using a configured detector.
  • GetDetections: Get a list of detections from a given image using a configured detector.
  • GetClassificationsFromCamera: Get a list of classifications from the next image from a specified camera using a configured classifier.
  • GetClassifications: Get a list of classifications from a given image using a configured classifier.
  • GetObjectPointClouds: Get a list of 3D point cloud objects and associated metadata in the latest picture from a 3D camera (using a specified segmenter).
  • CaptureAllFromCamera: Get the next image, detections, classifications, and objects all together, given a camera name.
  • Reconfigure: Reconfigure this resource.
  • DoCommand: Execute model-specific commands that are not otherwise defined by the service API.
  • GetResourceName: Get the ResourceName for this instance of the vision service with the given name.
  • GetProperties: Fetch information about which vision methods a given vision service supports.
  • Close: Safely shut down the resource and prevent further use.
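
Because a given vision service may support only some of these operations, client code can check what GetProperties reports before calling a method. The sketch below illustrates that gating logic only. `VisionProperties` and its three boolean field names are assumptions made for illustration, not a confirmed response shape, and `supported_methods` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class VisionProperties:
    """Hypothetical stand-in for what GetProperties reports (field names assumed)."""
    detections_supported: bool
    classifications_supported: bool
    object_point_clouds_supported: bool


def supported_methods(props):
    """List the operation-specific API methods that the properties allow."""
    methods = []
    if props.detections_supported:
        methods += ["GetDetections", "GetDetectionsFromCamera"]
    if props.classifications_supported:
        methods += ["GetClassifications", "GetClassificationsFromCamera"]
    if props.object_point_clouds_supported:
        methods += ["GetObjectPointClouds"]
    return methods


# Example: a service backed by a detector only.
detector_only = VisionProperties(
    detections_supported=True,
    classifications_supported=False,
    object_point_clouds_supported=False,
)
print(supported_methods(detector_only))
```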

Next Steps

For general configuration and development info, see: