Vision service API

The vision service API allows you to get detections, classifications, or point cloud objects, depending on the ML model the vision service is using.

The vision service supports the following methods:

  • GetDetectionsFromCamera: Get a list of detections from the next image from a specified camera using a configured detector.
  • GetDetections: Get a list of detections from a given image using a configured detector.
  • GetClassificationsFromCamera: Get a list of classifications from the next image from a specified camera using a configured classifier.
  • GetClassifications: Get a list of classifications from a given image using a configured classifier.
  • GetObjectPointClouds: Get a list of 3D point cloud objects and associated metadata in the latest picture from a 3D camera (using a specified segmenter).
  • CaptureAllFromCamera: Get the next image, detections, classifications, and objects all together, given a camera name.
  • Reconfigure: Reconfigure this resource.
  • DoCommand: Execute model-specific commands that are not otherwise defined by the service API.
  • GetResourceName: Get the ResourceName for this instance of the vision service with the given name.
  • GetProperties: Fetch information about which vision methods a given vision service supports.
  • Close: Safely shut down the resource and prevent further use.

Establish a connection

To get started using Viam’s SDKs to connect to and control your machine, go to your machine’s page on the Viam app, navigate to the CONNECT tab’s Code sample page, select your preferred programming language, and copy the sample code.

When executed, this sample code creates a connection to your machine as a client.

The following examples assume that you have a machine configured with a camera and a vision service detector, classifier, or segmenter.

# Python
from viam.services.vision import VisionClient

// Go
import (
  "go.viam.com/rdk/services/vision"
)

API

GetDetectionsFromCamera

Get a list of detections from the next image from a specified camera using a configured detector.

Parameters:

  • camera_name (str) (required): The name of the camera to use for detection.
  • extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
  • timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.

Returns:

  • (List[viam.proto.service.vision.Detection]): A list of 2D bounding boxes, their labels, and the confidence score of the labels, around the found objects in the next 2D image from the given camera, with the given detector applied to it.

Raises:

  • (ViamError): Raised if given an image without a specified width and height.

Example:

camera_name = "cam1"

# Grab the detector you configured on your machine
my_detector = VisionClient.from_robot(robot, "my_detector")

# Get detections from the next image from the camera
detections = await my_detector.get_detections_from_camera(camera_name)

For more information, see the Python SDK Docs.
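
Each returned detection carries bounding-box coordinates, a class_name, and a confidence score. As a sketch of typical post-processing (using plain dictionaries to stand in for Detection objects, so it runs without a machine connection), you might keep only high-confidence detections:

```python
# Stand-in detections; real Detection objects expose the same fields
# (x_min, y_min, x_max, y_max, class_name, confidence).
detections = [
    {"class_name": "cat", "confidence": 0.91,
     "x_min": 10, "y_min": 20, "x_max": 110, "y_max": 180},
    {"class_name": "dog", "confidence": 0.42,
     "x_min": 0, "y_min": 0, "x_max": 50, "y_max": 60},
]

# Keep only detections above a chosen confidence threshold
CONFIDENCE_THRESHOLD = 0.6
confident = [d for d in detections if d["confidence"] >= CONFIDENCE_THRESHOLD]

for d in confident:
    print(f"{d['class_name']}: {d['confidence']:.2f}")
```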

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
  • cameraName (string): The name of the camera from which to get an image to run detections on.
  • extra (map[string]interface{}): Extra options to pass to the underlying RPC call.

Returns:

  • ([]objectdetection.Detection): A list of detections, each with a 2D bounding box, a label, and the label's confidence score.
  • (error): An error, if one occurred.

Example:

// Get detections from the camera output
detections, err := visService.DetectionsFromCamera(context.Background(), myCam, nil)
if err != nil {
    logger.Fatalf("Could not get detections: %v", err)
}
if len(detections) > 0 {
    logger.Info(detections[0])
}

For more information, see the Go SDK Docs.

Parameters:

Returns:

Example:

var detections = await myVisionService.detectionsFromCamera('myWebcam');

For more information, see the Flutter SDK Docs.

GetDetections

Get a list of detections from a given image using a configured detector.

Parameters:

  • image (viam.media.video.ViamImage) (required): The image to get detections from.
  • extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
  • timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.

Returns:

  • (List[viam.proto.service.vision.Detection]): A list of 2D bounding boxes, their labels, and the confidence score of the labels, around the found objects in the given 2D image, with the given detector applied to it.

Raises:

  • (ViamError): Raised if given an image without a specified width and height.

Example:

# Grab camera from the machine
cam1 = Camera.from_robot(robot, "cam1")

# Get the detector you configured on your machine
my_detector = VisionClient.from_robot(robot, "my_detector")

# Get an image from the camera
img = await cam1.get_image()

# Get detections from that image
detections = await my_detector.get_detections(img)

For more information, see the Python SDK Docs.
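
A detection's bounding box is given in pixel coordinates. A small sketch of deriving the box center and area from those fields (again using a plain dict as a stand-in for a Detection object):

```python
# Stand-in for a single Detection's fields
detection = {"class_name": "person", "confidence": 0.88,
             "x_min": 40, "y_min": 30, "x_max": 200, "y_max": 330}

# Box dimensions, center, and area in pixels
width = detection["x_max"] - detection["x_min"]
height = detection["y_max"] - detection["y_min"]
center = ((detection["x_min"] + detection["x_max"]) / 2,
          (detection["y_min"] + detection["y_max"]) / 2)
area = width * height
print(center, area)  # (120.0, 180.0) 48000
```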

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
  • img (image.Image): The image in which to look for detections.
  • extra (map[string]interface{}): Extra options to pass to the underlying RPC call.

Returns:

  • ([]objectdetection.Detection): A list of detections, each with a 2D bounding box, a label, and the label's confidence score.
  • (error): An error, if one occurred.

Example:

// Get the stream from a camera
camStream, err := myCam.Stream(context.Background())
if err != nil {
    logger.Error(err)
    return
}

// Get an image from the camera stream
img, release, err := camStream.Next(context.Background())
if err != nil {
    logger.Error(err)
    return
}
defer release()

// Get the detections from the image
detections, err := visService.Detections(context.Background(), img, nil)
if err != nil {
    logger.Fatalf("Could not get detections: %v", err)
}
if len(detections) > 0 {
    logger.Info(detections[0])
}

For more information, see the Go SDK Docs.

Parameters:

Returns:

Example:

var latestImage = await myWebcam.image();
var detections = await myVisionService.detections(latestImage);

For more information, see the Flutter SDK Docs.

GetClassificationsFromCamera

Get a list of classifications from the next image from a specified camera using a configured classifier.

Parameters:

  • camera_name (str) (required): The name of the camera to use for classification.
  • count (int) (required): The number of classifications desired.
  • extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
  • timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.

Returns:

  • (List[viam.proto.service.vision.Classification]): A list of classifications, each with a label and the confidence score of the label.

Example:

camera_name = "cam1"

# Grab the classifier you configured on your machine
my_classifier = VisionClient.from_robot(robot, "my_classifier")

# Get the 2 classifications with the highest confidence scores from the next image from the camera
classifications = await my_classifier.get_classifications_from_camera(
    camera_name, 2)

For more information, see the Python SDK Docs.
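
A common follow-up is to act only on the top classification when its score clears a threshold. A minimal sketch (plain dicts standing in for Classification objects, which expose class_name and confidence):

```python
# Stand-in classifications, as returned ordered by confidence
classifications = [
    {"class_name": "daisy", "confidence": 0.72},
    {"class_name": "rose", "confidence": 0.21},
]

# Act on the highest-confidence label only if it is confident enough
best = max(classifications, key=lambda c: c["confidence"])
if best["confidence"] >= 0.5:
    print("label:", best["class_name"])
else:
    print("no confident label")
```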

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
  • cameraName (string): The name of the camera from which to get an image to run the classifier on.
  • n (int): The number of classifications to return. For example, if you specify 3 you will get the top three classifications with the greatest confidence scores.
  • extra (map[string]interface{}): Extra options to pass to the underlying RPC call.

Returns:

  • (classification.Classifications): A list of classifications with their labels and confidence scores.
  • (error): An error, if one occurred.

Example:

// Get the 2 classifications with the highest confidence scores from the camera output
classifications, err := visService.ClassificationsFromCamera(context.Background(), myCam, 2, nil)
if err != nil {
    logger.Fatalf("Could not get classifications: %v", err)
}
if len(classifications) > 0 {
    logger.Info(classifications[0])
}

For more information, see the Go SDK Docs.

Parameters:

Returns:

Example:

var classifications = await myVisionService.classificationsFromCamera('myWebcam', 2);

For more information, see the Flutter SDK Docs.

GetClassifications

Get a list of classifications from a given image using a configured classifier.

Parameters:

  • image (viam.media.video.ViamImage) (required): The image to get classifications from.
  • count (int) (required): The number of classifications desired.
  • extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
  • timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.

Returns:

  • (List[viam.proto.service.vision.Classification]): A list of classifications, each with a label and the confidence score of the label.

Example:

# Grab camera from the machine
cam1 = Camera.from_robot(robot, "cam1")

# Get the classifier you configured on your machine
my_classifier = VisionClient.from_robot(robot, "my_classifier")

# Get an image from the camera
img = await cam1.get_image()

# Get the 2 classifications with the highest confidence scores
classifications = await my_classifier.get_classifications(img, 2)

For more information, see the Python SDK Docs.
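
When you request more than one classification, comparing the top two scores tells you whether the result is decisive. A sketch of that check (plain dicts standing in for Classification objects; the 0.2 margin is an arbitrary choice for illustration):

```python
# Stand-in for two classifications, ordered by confidence
classifications = [
    {"class_name": "daisy", "confidence": 0.55},
    {"class_name": "sunflower", "confidence": 0.41},
]

# Treat the result as ambiguous when the top two scores are close
top, runner_up = classifications[0], classifications[1]
ambiguous = (top["confidence"] - runner_up["confidence"]) < 0.2
print("ambiguous" if ambiguous else top["class_name"])  # ambiguous
```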

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
  • img (image.Image): The image in which to look for classifications.
  • n (int): The number of classifications to return. For example, if you specify 3 you will get the top three classifications with the greatest confidence scores.
  • extra (map[string]interface{}): Extra options to pass to the underlying RPC call.

Returns:

Example:

// Get the stream from a camera
camStream, err := myCam.Stream(context.Background())
if err != nil {
    logger.Error(err)
    return
}

// Get an image from the camera stream
img, release, err := camStream.Next(context.Background())
if err != nil {
    logger.Error(err)
    return
}
defer release()

// Get the 2 classifications with the highest confidence scores from the image
classifications, err := visService.Classifications(context.Background(), img, 2, nil)
if err != nil {
    logger.Fatalf("Could not get classifications: %v", err)
}
if len(classifications) > 0 {
    logger.Info(classifications[0])
}

For more information, see the Go SDK Docs.

Parameters:

Returns:

Example:

var latestImage = await myWebcam.image();
var classifications = await myVisionService.classifications(latestImage, 2);

For more information, see the Flutter SDK Docs.

GetObjectPointClouds

Get a list of 3D point cloud objects and associated metadata in the latest picture from a 3D camera (using a specified segmenter).

Parameters:

  • camera_name (str) (required): The name of the camera.
  • extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
  • timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.

Returns:

  • (List[viam.proto.common.PointCloudObject]): A list of point cloud objects and their associated metadata.

Example:

import numpy as np
import open3d as o3d

# Grab the 3D camera from the machine
cam1 = Camera.from_robot(robot, "cam1")
# Grab the object segmenter you configured on your machine
my_segmenter = VisionClient.from_robot(robot, "my_segmenter")
# Get the objects from the camera output (pass the camera's name)
objects = await my_segmenter.get_object_point_clouds("cam1")
# write the first object point cloud into a temporary file
with open("/tmp/pointcloud_data.pcd", "wb") as f:
    f.write(objects[0].point_cloud)
pcd = o3d.io.read_point_cloud("/tmp/pointcloud_data.pcd")
points = np.asarray(pcd.points)

For more information, see the Python SDK Docs.
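
Once the points are in a NumPy array, per-object statistics are straightforward. A sketch of computing an object's centroid (a small hard-coded array stands in for the (N, 3) XYZ array loaded above):

```python
import numpy as np

# Stand-in for the (N, 3) array of XYZ points from a segmented object
points = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Mean over the points gives the object's centroid
centroid = points.mean(axis=0)
print(centroid)  # [0.25 0.25 0.25]
```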

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
  • cameraName (string): The name of the 3D camera from which to get point cloud data.
  • extra (map[string]interface{}): Extra options to pass to the underlying RPC call.

Returns:

  • ([]*viz.Object): A list of point clouds and associated metadata like the center coordinates of each point cloud.
  • (error): An error, if one occurred.

Example:

// Get the objects from the camera output
objects, err := visService.GetObjectPointClouds(context.Background(), "cam1", nil)
if err != nil {
    logger.Fatalf("Could not get point clouds: %v", err)
}
if len(objects) > 0 {
    logger.Info(objects[0])
}

For more information, see the Go SDK Docs.

Parameters:

Returns:

Example:

var ptCloud = await myVisionService.objectPointClouds('myCamera');

For more information, see the Flutter SDK Docs.

CaptureAllFromCamera

Get the next image, detections, classifications, and objects all together, given a camera name. Used for visualization.

Parameters:

  • camera_name (str) (required): The name of the camera to use for detection.
  • return_image (bool) (required): Ask the vision service to return the camera’s latest image.
  • return_classifications (bool) (required): Ask the vision service to return its latest classifications.
  • return_detections (bool) (required): Ask the vision service to return its latest detections.
  • return_object_point_clouds (bool) (required): Ask the vision service to return its latest 3D segmentations.
  • extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
  • timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.

Returns:

  • (viam.services.vision.vision.CaptureAllResult): A class that stores all potential returns from the vision service. It can return the image from the camera along with its associated detections, classifications, and objects, as well as any extra info the model may provide.

Example:

camera_name = "cam1"

# Grab the detector you configured on your machine
my_detector = VisionClient.from_robot(robot, "my_detector")

# capture all from the next image from the camera
result = await my_detector.capture_all_from_camera(
    camera_name,
    return_image=True,
    return_detections=True,
)

For more information, see the Python SDK Docs.
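
Fields of the result that you did not request come back empty, so guard on them before use. A sketch of that pattern (a SimpleNamespace stands in for the CaptureAllResult object):

```python
from types import SimpleNamespace

# Stand-in for a CaptureAllResult where only image and detections
# were requested; unrequested fields are None
result = SimpleNamespace(
    image=b"...",
    detections=[{"class_name": "cat", "confidence": 0.91}],
    classifications=None,
    objects=None,
)

if result.image is not None:
    print("got an image")
if result.detections:
    print(f"{len(result.detections)} detection(s)")
if result.classifications is None:
    print("classifications were not requested")
```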

Parameters:

Returns:

  • (viscapture.VisCapture): A class that stores all potential returns from the vision service. It can return the image from the camera along with its associated detections, classifications, and objects, as well as any extra info the model may provide.
  • (error): An error, if one occurred.

Example:

// The data to capture and return from the camera
captOpts := viscapture.CaptureOptions{}
// Get the captured data for a camera
capture, err := visService.CaptureAllFromCamera(context.Background(), "cam1", captOpts, nil)
if err != nil {
    logger.Fatalf("Could not get capture data from vision service: %v", err)
}
image := capture.Image
detections := capture.Detections
classifications := capture.Classifications
objects := capture.Objects

For more information, see the Go SDK Docs.

Reconfigure

Reconfigure this resource. Reconfigure must reconfigure the resource atomically and in place.

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
  • deps (Dependencies): The resource dependencies.
  • conf (Config): The resource configuration.

Returns:

  • (error): An error, if one occurred.

For more information, see the Go SDK Docs.

DoCommand

Execute model-specific commands that are not otherwise defined by the service API. For built-in service models, any model-specific commands available are covered with each model’s documentation. If you are implementing your own vision service and add features that have no built-in API method, you can access them with DoCommand.

Parameters:

  • command (Mapping[str, ValueTypes]) (required): The command to execute.
  • timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.

Returns:

  • (Mapping[str, viam.utils.ValueTypes]): Result of the executed command.

Example:

service = VisionClient.from_robot(robot, "my_vision_svc")

my_command = {
  "cmnd": "dosomething",
  "someparameter": 52
}

# Can be used with any resource, using the vision service as an example
await service.do_command(command=my_command)

For more information, see the Python SDK Docs.
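
On the module side, a custom vision service typically dispatches on keys in the command mapping and returns a mapping of results. A minimal sketch of that pattern (the "cmnd" and "dosomething" names mirror the hypothetical client example above):

```python
def do_command(command: dict) -> dict:
    # Dispatch on the command name and echo back a result mapping
    if command.get("cmnd") == "dosomething":
        return {"status": "done", "parameter": command.get("someparameter")}
    return {"status": "unknown command"}

result = do_command({"cmnd": "dosomething", "someparameter": 52})
print(result)  # {'status': 'done', 'parameter': 52}
```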

Parameters:

Returns:

Example:

myVision, err := vision.FromRobot(machine, "my_vision")

command := map[string]interface{}{"cmd": "test", "data1": 500}
result, err := myVision.DoCommand(context.Background(), command)

For more information, see the Go SDK Docs.

Parameters:

Returns:

Example:

// Example using doCommand with the vision service
const command = {'cmd': 'test', 'data1': 500};
var result = await myVisionService.doCommand(command);

For more information, see the Flutter SDK Docs.

GetResourceName

Get the ResourceName for this instance of the vision service with the given name.

Parameters:

  • name (str) (required): The name of the Resource.

Returns:

  • (viam.proto.common.ResourceName): The ResourceName of this Resource.

Example:

my_vision_svc_name = VisionClient.get_resource_name("my_vision_svc")

For more information, see the Python SDK Docs.

Parameters:

Returns:

For more information, see the Flutter SDK Docs.

GetProperties

Fetch information about which vision methods a given vision service supports.

Parameters:

  • extra (Mapping[str, Any]) (optional): Extra options to pass to the underlying RPC call.
  • timeout (float) (optional): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.

Returns:

Example:

# Grab the detector you configured on your machine
my_detector = VisionClient.from_robot(robot, "my_detector")
properties = await my_detector.get_properties()
properties.detections_supported      # returns True
properties.classifications_supported # returns False

For more information, see the Python SDK Docs.
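
A client can use the properties to decide which methods are safe to call. A sketch of collecting the supported methods (a SimpleNamespace stands in for the returned properties object; the field names follow the vision service proto):

```python
from types import SimpleNamespace

# Stand-in for the properties object returned by get_properties()
properties = SimpleNamespace(
    detections_supported=True,
    classifications_supported=False,
    object_point_clouds_supported=False,
)

# Collect the names of the supported vision methods
supported = [
    name
    for name in ("detections", "classifications", "object_point_clouds")
    if getattr(properties, f"{name}_supported")
]
print(supported)  # ['detections']
```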

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
  • extra (map[string]interface{}): Extra options to pass to the underlying RPC call.

Returns:

For more information, see the Go SDK Docs.

Close

Safely shut down the resource and prevent further use.

Parameters:

  • None.

Returns:

  • None.

Example:

await my_vision_svc.close()

For more information, see the Python SDK Docs.

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.

Returns:

  • (error): An error, if one occurred.

Example:

myVisionSvc, err := vision.FromRobot(machine, "my_vision_svc")

err = myVisionSvc.Close(context.Background())

For more information, see the Go SDK Docs.

Have questions, or want to meet other people working on robots? Join our Community Discord.

If you notice any issues with the documentation, feel free to file an issue or edit this file.