The vision service enables your machine to use its on-board cameras to intelligently see and interpret the world around it. While the camera component lets you access what your machine’s camera sees, the vision service allows you to interpret your image data.
The vision service API allows you to get detections, classifications, or point cloud objects, depending on the ML model the vision service is using.
The vision service supports the following kinds of operations:

2D Object Detection is the process of taking a 2D image from a camera and identifying and drawing a box around the distinct “objects” of interest in the scene. Any camera that can return 2D images can use 2D object detection.
You can use different types of detectors, based on either heuristics or machine learning, for any object you may need to identify.
The returned detections consist of the bounding box around the identified object, as well as its label and confidence score:
- x_min, y_min, x_max, y_max (int): specify the bounding box around the object using a rectangular area defined by two points: the top left point (x_min, y_min) and the bottom right point (x_max, y_max). The origin (0, 0) occupies the top left pixel of the image; X values increase as you move right, Y values increase as you move down.
- class_name (string): specifies the label of the found object.
- confidence (float): specifies the confidence of the assigned label. Between 0.0 and 1.0, inclusive.

Supported API methods:
- GetDetections
- GetDetectionsFromCamera
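To make the bounding box fields concrete, here is a minimal sketch in plain Python. It does not call the vision service: `Detection` is a hypothetical stand-in that mirrors the fields described above, and `box_size`, `box_center`, and `filter_confident` are illustrative helper names, not part of any SDK.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical stand-in mirroring the detection fields described above."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int
    class_name: str
    confidence: float


def box_size(d: Detection) -> Tuple[int, int]:
    """Return the (width, height) of the bounding box in pixels."""
    return d.x_max - d.x_min, d.y_max - d.y_min


def box_center(d: Detection) -> Tuple[float, float]:
    """Return the (x, y) center of the box; y grows downward from the top left origin."""
    return (d.x_min + d.x_max) / 2, (d.y_min + d.y_max) / 2


def filter_confident(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Keep only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d.confidence >= threshold]


# Made-up sample data, for illustration only.
sample = [
    Detection(10, 20, 110, 220, "person", 0.92),
    Detection(300, 40, 360, 90, "cup", 0.31),
]
confident = filter_confident(sample, threshold=0.5)
```

Filtering on confidence before acting on a detection is a common pattern; the threshold of 0.5 here is arbitrary and would normally be tuned per model.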
2D Image Classification is the process of taking a 2D image from a camera and deciding which class label, out of many, best describes the given image. Any camera that can return 2D images can use 2D image classification.
The class labels used for classification vary and depend on the machine learning model and how it was trained.
The returned classifications consist of the image’s class label and confidence score.
- class_name (string): specifies the label of the found object.
- confidence (float): specifies the confidence of the assigned label. Between 0.0 and 1.0, inclusive.

Supported API methods:
- GetClassifications
- GetClassificationsFromCamera
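The classification methods take a count of how many results to return. The sketch below shows, in plain Python, what selecting the top results by confidence looks like. `Classification` and `top_n` are hypothetical names used only for illustration; the vision service performs this selection itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Classification:
    """Hypothetical stand-in with the two fields described above."""
    class_name: str
    confidence: float


def top_n(classifications: List[Classification], n: int) -> List[Classification]:
    """Return the n classifications with the highest confidence, best first."""
    return sorted(classifications, key=lambda c: c.confidence, reverse=True)[:n]


# Made-up sample data, for illustration only.
scores = [
    Classification("cat", 0.15),
    Classification("dog", 0.80),
    Classification("fox", 0.05),
]
best_two = top_n(scores, 2)
```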
3D Object Segmentation is the process of separating and returning a list of the identified “objects” from a 3D scene. The “objects” are usually a list of point clouds with associated metadata, like the label, the 3D bounding box, and center coordinates of the object.
3D object segmentation is useful for obstacle detection. See our guide Navigate with a Rover Base for an example of automating obstacle avoidance with 3D object segmentation for obstacle detection.
Any camera that can return 3D point clouds can use 3D object segmentation.
3D segmentation operations require frame system configuration to properly relate camera coordinates to your machine’s spatial reference frames. This enables the vision service to provide meaningful 3D coordinates and spatial relationships.
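To illustrate the kind of metadata that accompanies a segmented object, the sketch below computes a center and an axis-aligned bounding box from a small list of points. It is plain Python under stated assumptions, not the vision service's implementation: `Point`, `centroid`, and `bounding_box` are hypothetical names, and the coordinates are assumed to be expressed in a single reference frame.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def centroid(points: List[Point]) -> Point:
    """Average the x, y, and z coordinates of all points."""
    n = len(points)
    if n == 0:
        raise ValueError("point cloud is empty")
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def bounding_box(points: List[Point]) -> Tuple[Point, Point]:
    """Return the (min corner, max corner) of the axis-aligned box enclosing the points."""
    if not points:
        raise ValueError("point cloud is empty")
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# Made-up sample points, for illustration only.
cloud = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (2.0, 4.0, 3.0), (0.0, 4.0, 3.0)]
center = centroid(cloud)
low, high = bounding_box(cloud)
```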
Supported API methods:
- GetObjectPointClouds
The vision service supports the following methods:
| Method Name | Description | 
|---|---|
| GetDetectionsFromCamera | Get a list of detections from the next image from a specified camera using a configured detector. | 
| GetDetections | Get a list of detections from a given image using a configured detector. | 
| GetClassificationsFromCamera | Get a list of classifications from the next image from a specified camera using a configured classifier. | 
| GetClassifications | Get a list of classifications from a given image using a configured classifier. | 
| GetObjectPointClouds | Get a list of 3D point cloud objects and associated metadata in the latest picture from a 3D camera (using a specified segmenter). | 
| CaptureAllFromCamera | Get the next image, detections, classifications, and objects all together, given a camera name. | 
| Reconfigure | Reconfigure this resource. | 
| DoCommand | Execute model-specific commands that are not otherwise defined by the service API. | 
| GetResourceName | Get the ResourceName for this instance of the vision service. | 
| GetProperties | Fetch information about which vision methods a given vision service supports. | 
| Close | Safely shut down the resource and prevent further use. | 
Get a list of detections from the next image from a specified camera using a configured detector.
Parameters:
- camera_name (str) (required): The name of the camera to use for detection.
- extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
- timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.
Returns:
Raises:
Example:
my_detector = VisionClient.from_robot(robot=machine, name="my_detector")
# Get detections for the next image from the specified camera
detections = await my_detector.get_detections_from_camera("my_camera")
For more information, see the Python SDK Docs.
Parameters:
- ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
- cameraName (string): The name of the camera from which to get an image to run detections on.
- extra (map[string]interface{}): Extra options to pass to the underlying RPC call.
Returns:
Example:
myDetectorService, err := vision.FromRobot(machine, "my_detector")
if err != nil {
  logger.Error(err)
  return
}
// Get detections from the camera output
detections, err := myDetectorService.DetectionsFromCamera(context.Background(), "my_camera", nil)
if err != nil {
  logger.Fatalf("Could not get detections: %v", err)
}
if len(detections) > 0 {
  logger.Info(detections[0])
}
For more information, see the Go SDK Docs.
Parameters:
- cameraName (string) (required): The name of the camera to use for detection.
- extra (None) (optional)
- callOptions (CallOptions) (optional)
Returns:
Example:
const vision = new VIAM.VisionClient(machine, 'my_vision');
const detections = await vision.getDetectionsFromCamera('my_camera');
For more information, see the TypeScript SDK Docs.
Get a list of detections from a given image using a configured detector.
Parameters:
- image (viam.media.video.ViamImage) (required): The image to get detections for.
- extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
- timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.
Returns:
Raises:
Example:
my_camera = Camera.from_robot(robot=machine, name="my_camera")
my_detector = VisionClient.from_robot(robot=machine, name="my_detector")
# Get an image from the camera
img = await my_camera.get_image()
# Get detections for that image
detections = await my_detector.get_detections(img)
For more information, see the Python SDK Docs.
Parameters:
- ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
- img (image.Image): The image in which to look for detections.
- extra (map[string]interface{}): Extra options to pass to the underlying RPC call.
Returns:
Example:
 // add "go.viam.com/rdk/utils" to imports to use this code snippet
  myCam, err := camera.FromRobot(machine, "my_camera")
  if err != nil {
    logger.Error(err)
    return
  }
  // Get an image from the camera decoded as an image.Image
  img, err := camera.DecodeImageFromCamera(context.Background(), utils.MimeTypeJPEG, nil, myCam)
  if err != nil {
    logger.Error(err)
    return
  }
  myDetectorService, err := vision.FromRobot(machine, "my_detector")
  if err != nil {
    logger.Error(err)
    return
  }
  // Get the detections from the image
  detections, err := myDetectorService.Detections(context.Background(), img, nil)
  if err != nil {
    logger.Fatalf("Could not get detections: %v", err)
  }
  if len(detections) > 0 {
    logger.Info(detections[0])
  }
For more information, see the Go SDK Docs.
Parameters:
- image (Uint8Array) (required): The image from which to get detections.
- width (number) (required): The width of the image.
- height (number) (required): The height of the image.
- mimeType (MimeType) (required): The MimeType of the image.
- extra (None) (optional)
- callOptions (CallOptions) (optional)
Returns:
Example:
const camera = new VIAM.CameraClient(machine, 'my_camera');
const vision = new VIAM.VisionClient(machine, 'my_vision');
const mimeType = 'image/jpeg';
const image = await camera.getImage(mimeType);
const detections = await vision.getDetections(
  image,
  600,
  600,
  mimeType
);
For more information, see the TypeScript SDK Docs.
Get a list of classifications from the next image from a specified camera using a configured classifier.
Parameters:
- camera_name (str) (required): The name of the camera to use for classification.
- count (int) (required): The number of classifications desired.
- extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
- timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.
Returns:
Example:
my_classifier = VisionClient.from_robot(robot=machine, name="my_classifier")
# Get the 2 classifications with the highest confidence scores for the next image from the camera
classifications = await my_classifier.get_classifications_from_camera(
    "my_camera", 2)
For more information, see the Python SDK Docs.
Parameters:
- ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
- cameraName (string): The name of the camera from which to get an image to run the classifier on.
- n (int): The number of classifications to return. For example, if you specify 3 you will get the top three classifications with the greatest confidence scores.
- extra (map[string]interface{}): Extra options to pass to the underlying RPC call.
Returns:
Example:
myClassifierService, err := vision.FromRobot(machine, "my_classifier")
if err != nil {
  logger.Error(err)
  return
}
// Get the 2 classifications with the highest confidence scores from the camera output
classifications, err := myClassifierService.ClassificationsFromCamera(context.Background(), "my_camera", 2, nil)
if err != nil {
  logger.Fatalf("Could not get classifications: %v", err)
}
if len(classifications) > 0 {
  logger.Info(classifications[0])
}
For more information, see the Go SDK Docs.
Parameters:
- cameraName (string) (required): The name of the camera to use for classification.
- count (number) (required): The number of Classifications requested.
- extra (None) (optional)
- callOptions (CallOptions) (optional)
Returns:
Example:
const vision = new VIAM.VisionClient(machine, 'my_vision');
const classifications = await vision.getClassificationsFromCamera(
  'my_camera',
  10
);
For more information, see the TypeScript SDK Docs.
Parameters:
Returns:
Example:
// Example:
var classifications = await myVisionService.classificationsFromCamera('myWebcam', 2);
For more information, see the Flutter SDK Docs.
Get a list of classifications from a given image using a configured classifier.
Parameters:
- image (viam.media.video.ViamImage) (required): The image to get classifications for.
- count (int) (required): The number of classifications desired.
- extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
- timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.
Returns:
Example:
my_camera = Camera.from_robot(robot=machine, name="my_camera")
my_classifier = VisionClient.from_robot(robot=machine, name="my_classifier")
# Get an image from the camera
img = await my_camera.get_image()
# Get the 2 classifications with the highest confidence scores for the image
classifications = await my_classifier.get_classifications(img, 2)
For more information, see the Python SDK Docs.
Parameters:
- ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
- img (image.Image): The image in which to look for classifications.
- n (int): The number of classifications to return. For example, if you specify 3 you will get the top three classifications with the greatest confidence scores.
- extra (map[string]interface{}): Extra options to pass to the underlying RPC call.
Returns:
Example:
 // add "go.viam.com/rdk/utils" to imports to use this code snippet
  myCam, err := camera.FromRobot(machine, "my_camera")
  if err != nil {
    logger.Error(err)
    return
  }
  // Get an image from the camera decoded as an image.Image
  img, err := camera.DecodeImageFromCamera(context.Background(), utils.MimeTypeJPEG, nil, myCam)
  if err != nil {
    logger.Error(err)
    return
  }
  myClassifierService, err := vision.FromRobot(machine, "my_classifier")
  if err != nil {
    logger.Error(err)
    return
  }
  // Get the 2 classifications with the highest confidence scores from the image
  classifications, err := myClassifierService.Classifications(context.Background(), img, 2, nil)
  if err != nil {
    logger.Fatalf("Could not get classifications: %v", err)
  }
  if len(classifications) > 0 {
    logger.Info(classifications[0])
  }
For more information, see the Go SDK Docs.
Parameters:
- image (Uint8Array) (required): The image from which to get classifications.
- width (number) (required): The width of the image.
- height (number) (required): The height of the image.
- mimeType (MimeType) (required): The MimeType of the image.
- count (number) (required): The number of Classifications requested.
- extra (None) (optional)
- callOptions (CallOptions) (optional)
Returns:
Example:
const camera = new VIAM.CameraClient(machine, 'my_camera');
const vision = new VIAM.VisionClient(machine, 'my_vision');
const mimeType = 'image/jpeg';
const image = await camera.getImage(mimeType);
const classifications = await vision.getClassifications(
  image,
  600,
  600,
  mimeType,
  10
);
For more information, see the TypeScript SDK Docs.
Parameters:
Returns:
Example:
// Example:
var latestImage = await myWebcam.image();
var classifications = await myVisionService.classifications(latestImage, 2);
For more information, see the Flutter SDK Docs.
Get a list of 3D point cloud objects and associated metadata in the latest picture from a 3D camera (using a specified segmenter).
Parameters:
- camera_name (str) (required): The name of the camera.
- extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
- timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.
Returns:
Example:
import numpy as np
import open3d as o3d
my_segmenter = VisionClient.from_robot(robot=machine, name="my_segmenter")
# Get the objects from the camera output
objects = await my_segmenter.get_object_point_clouds("my_camera")
# write the first object point cloud into a temporary file
with open("/tmp/pointcloud_data.pcd", "wb") as f:
    f.write(objects[0].point_cloud)
pcd = o3d.io.read_point_cloud("/tmp/pointcloud_data.pcd")
points = np.asarray(pcd.points)
For more information, see the Python SDK Docs.
Parameters:
- ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
- cameraName (string): The name of the 3D camera from which to get point cloud data.
- extra (map[string]interface{}): Extra options to pass to the underlying RPC call.
Returns:
Example:
mySegmenterService, err := vision.FromRobot(machine, "my_segmenter")
if err != nil {
  logger.Error(err)
  return
}
// Get the objects from the camera output
objects, err := mySegmenterService.GetObjectPointClouds(context.Background(), "my_camera", nil)
if err != nil {
  logger.Fatalf("Could not get point clouds: %v", err)
}
if len(objects) > 0 {
  logger.Info(objects[0])
}
For more information, see the Go SDK Docs.
Parameters:
- cameraName (string) (required): The name of the camera.
- extra (None) (optional)
- callOptions (CallOptions) (optional)
Returns:
Example:
const vision = new VIAM.VisionClient(machine, 'my_vision');
const pointCloudObjects =
  await vision.getObjectPointClouds('my_camera');
For more information, see the TypeScript SDK Docs.
Parameters:
Returns:
Example:
// Example:
var ptCloud = await myVisionService.objectPointClouds('myCamera');
For more information, see the Flutter SDK Docs.
Get the next image, detections, classifications, and objects all together, given a camera name. Used for visualization.
Parameters:
- camera_name (str) (required): The name of the camera to use for detection.
- return_image (bool) (required): Ask the vision service to return the camera’s latest image.
- return_classifications (bool) (required): Ask the vision service to return its latest classifications.
- return_detections (bool) (required): Ask the vision service to return its latest detections.
- return_object_point_clouds (bool) (required): Ask the vision service to return its latest 3D segmentations.
- extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
- timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.
Returns:
Example:
my_detector = VisionClient.from_robot(machine, "my_detector")
# Get the captured data for a camera
result = await my_detector.capture_all_from_camera(
    "my_camera",
    return_image=True,
    return_detections=True,
)
image = result.image
detections = result.detections
For more information, see the Python SDK Docs.
Parameters:
- ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
- cameraName (string): The name of the camera to use for detection.
- opts (viscapture.CaptureOptions): Additional options to provide if desired.
- extra (map[string]interface{}): Extra options to pass to the underlying RPC call.
Returns:
Example:
// The data to capture and return from the camera
captOpts := viscapture.CaptureOptions{
  ReturnImage: true,
  ReturnDetections: true,
}
// Get the captured data for a camera
capture, err := visService.CaptureAllFromCamera(context.Background(), "my_camera", captOpts, nil)
if err != nil {
  logger.Fatalf("Could not get capture data from vision service: %v", err)
}
image := capture.Image
detections := capture.Detections
classifications := capture.Classifications
objects := capture.Objects
For more information, see the Go SDK Docs.
Parameters:
- cameraName (string) (required): The name of the camera to use for classification, detection, and segmentation.
- __namedParameters (CaptureAllOptions) (required)
- extra (None) (optional)
- callOptions (CallOptions) (optional)
Returns:
Example:
const vision = new VIAM.VisionClient(machine, 'my_vision');
const captureAll = await vision.captureAllFromCamera('my_camera', {
  returnImage: true,
  returnClassifications: true,
  returnDetections: true,
  returnObjectPointClouds: true,
});
For more information, see the TypeScript SDK Docs.
Reconfigure this resource. Reconfigure must reconfigure the resource atomically and in place.
Parameters:
- ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
- deps (Dependencies): The resource dependencies.
- conf (Config): The resource configuration.
Returns:
For more information, see the Go SDK Docs.
Execute model-specific commands that are not otherwise defined by the service API.
Most models do not implement DoCommand.
Any available model-specific commands should be covered in the model’s documentation.
If you are implementing your own vision service and want to add features that have no corresponding built-in API method, you can implement them with DoCommand.
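One way a custom service might route model-specific commands is a small dispatch table keyed by command name. The sketch below is plain Python and does not use the Viam SDK: `MyVisionService`, the `set_threshold` and `get_threshold` commands, and the `threshold` attribute are all hypothetical, chosen only to show the pattern.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Mapping


class MyVisionService:
    """Hypothetical custom service; only the DoCommand-style dispatch is shown."""

    def __init__(self) -> None:
        self.threshold = 0.5
        # Map each supported command key to the coroutine that handles it.
        self._handlers: Dict[str, Callable[[Any], Awaitable[Any]]] = {
            "set_threshold": self._set_threshold,
            "get_threshold": self._get_threshold,
        }

    async def _set_threshold(self, value: Any) -> float:
        self.threshold = float(value)
        return self.threshold

    async def _get_threshold(self, _value: Any) -> float:
        return self.threshold

    async def do_command(self, command: Mapping[str, Any]) -> Dict[str, Any]:
        """Run every recognized key in `command` and collect the results."""
        results: Dict[str, Any] = {}
        for key, value in command.items():
            handler = self._handlers.get(key)
            if handler is None:
                raise ValueError(f"unknown command: {key}")
            results[key] = await handler(value)
        return results


service = MyVisionService()
reply = asyncio.run(service.do_command({"set_threshold": 0.75}))
```

Rejecting unknown keys with an error, rather than silently ignoring them, makes typos in command names easier to spot from the client side.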
Parameters:
- command (Mapping[str, ValueTypes]) (required): The command to execute.
- timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.
Returns:
Example:
my_vision_svc = VisionClient.from_robot(robot=machine, name="my_vision_svc")
my_command = {
  "cmnd": "dosomething",
  "someparameter": 52
}
await my_vision_svc.do_command(command=my_command)
For more information, see the Python SDK Docs.
Parameters:
- ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
- cmd (map[string]interface{}): The command to execute.
Returns:
Example:
myVisionSvc, err := vision.FromRobot(machine, "my_vision_svc")
command := map[string]interface{}{"cmd": "test", "data1": 500}
result, err := myVisionSvc.DoCommand(context.Background(), command)
For more information, see the Go SDK Docs.
Parameters:
- command (Struct) (required): The command to execute.
- callOptions (CallOptions) (optional)
Returns:
Example:
import { Struct } from '@viamrobotics/sdk';
const result = await resource.doCommand(
  Struct.fromJson({
    myCommand: { key: 'value' },
  })
);
For more information, see the TypeScript SDK Docs.
Get the ResourceName for this instance of the vision service.
Parameters:
- name (str) (required): The name of the Resource.
Returns:
Example:
my_vision_svc_name = VisionClient.get_resource_name("my_vision_svc")
For more information, see the Python SDK Docs.
Parameters:
Returns:
Example:
myVisionSvc, err := vision.FromRobot(machine, "my_vision_svc")
// Name returns the resource name, not an error
name := myVisionSvc.Name()
For more information, see the Go SDK Docs.
Parameters:
Returns:
Example:
vision.name
For more information, see the TypeScript SDK Docs.
Parameters:
- name (String) (required)
Returns:
Example:
final myVisionServiceResourceName = myVisionService.getResourceName("my_vision_service");
For more information, see the Flutter SDK Docs.
Fetch information about which vision methods a given vision service supports.
Parameters:
- extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
- timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.
Returns:
Example:
my_detector = VisionClient.from_robot(robot=machine, "my_detector")
properties = await my_detector.get_properties()
detections_supported = properties.detections_supported
classifications_supported = properties.classifications_supported
For more information, see the Python SDK Docs.
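As a rough illustration of how these flags might gate later calls, the sketch below maps a set of properties to the method names from the table above. It is plain Python and does not use the SDK: `Properties` is a hypothetical stand-in for the object returned by the properties call, `supported_methods` is an illustrative helper, and the `object_point_clouds_supported` field name is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Properties:
    """Hypothetical stand-in for the properties object; field names are assumptions."""
    classifications_supported: bool
    detections_supported: bool
    object_point_clouds_supported: bool


def supported_methods(props: Properties) -> List[str]:
    """List the vision API methods that are worth calling for these properties."""
    methods: List[str] = []
    if props.detections_supported:
        methods += ["GetDetections", "GetDetectionsFromCamera"]
    if props.classifications_supported:
        methods += ["GetClassifications", "GetClassificationsFromCamera"]
    if props.object_point_clouds_supported:
        methods.append("GetObjectPointClouds")
    return methods


# A service backed by a detector only, for illustration.
detector_only = Properties(
    classifications_supported=False,
    detections_supported=True,
    object_point_clouds_supported=False,
)
available = supported_methods(detector_only)
```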
Parameters:
- ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
- extra (map[string]interface{}): Extra options to pass to the underlying RPC call.
Returns:
For more information, see the Go SDK Docs.
Parameters:
- extra (None) (optional)
- callOptions (CallOptions) (optional)
Returns:
Example:
const vision = new VIAM.VisionClient(machine, 'my_vision');
const properties = await vision.getProperties();
For more information, see the TypeScript SDK Docs.
Parameters:
Returns:
Example:
// Example:
var properties = await myVisionService.properties();
properties.detections_supported
properties.classifications_supported
For more information, see the Flutter SDK Docs.
Safely shut down the resource and prevent further use.
Parameters:
Returns:
Example:
my_vision_svc = VisionClient.from_robot(robot=machine, name="my_vision_svc")
await my_vision_svc.close()
For more information, see the Python SDK Docs.
Parameters:
- ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
Returns:
Example:
myVisionSvc, err := vision.FromRobot(machine, "my_vision_svc")
err = myVisionSvc.Close(context.Background())
For more information, see the Go SDK Docs.