Vision Service
The vision service enables your machine to use its on-board cameras to intelligently see and interpret the world around it. While the camera component lets you access what your machine’s camera sees, the vision service allows you to interpret your image data.
The vision service supports the following kinds of operations:
Detections
2D Object Detection is the process of taking a 2D image from a camera and identifying and drawing a box around the distinct “objects” of interest in the scene. Any camera that can return 2D images can use 2D object detection.
You can use different types of detectors, based on either heuristics or machine learning, to identify the objects you need.
The returned detections consist of the bounding box around the identified object, as well as its label and confidence score:
- x_min, y_min, x_max, y_max (int): specify the bounding box around the object.
- class_name (string): specifies the label of the found object.
- confidence (float): specifies the confidence of the assigned label. Between 0.0 and 1.0, inclusive.
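As a minimal sketch of working with these fields, the following example filters detections by confidence and computes each bounding box's pixel area. The `Detection` class and the `filter_detections` helper are hypothetical stand-ins written for illustration; they are not the SDK's own types, and the 0.5 threshold is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    """Hypothetical stand-in that mirrors the fields described above."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int
    class_name: str
    confidence: float

    def area(self) -> int:
        """Pixel area of the bounding box (0 if the box is degenerate)."""
        width = max(0, self.x_max - self.x_min)
        height = max(0, self.y_max - self.y_min)
        return width * height


def filter_detections(detections: List[Detection],
                      min_confidence: float = 0.5) -> List[Detection]:
    """Keep detections at or above min_confidence, highest confidence first."""
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0.0 and 1.0")
    kept = [d for d in detections if d.confidence >= min_confidence]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)


# Made-up sample data, for illustration only.
sample = [
    Detection(10, 20, 110, 220, "person", 0.91),
    Detection(300, 40, 360, 90, "cup", 0.42),
    Detection(50, 60, 150, 110, "dog", 0.77),
]
confident = filter_detections(sample, min_confidence=0.5)
for d in confident:
    print(d.class_name, d.confidence, d.area())
```

Thresholding on confidence is a common way to discard weak detections before acting on them; the right cutoff depends on the detector and the application.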
Supported API methods: GetDetections, GetDetectionsFromCamera.
Classifications
2D Image Classification is the process of taking a 2D image from a camera and deciding which class label, out of many, best describes the given image. Any camera that can return 2D images can use 2D image classification.
The class labels used for classification vary and depend on the machine learning model and how it was trained.
The returned classifications consist of the image’s class label and confidence score:
- class_name (string): specifies the label of the found object.
- confidence (float): specifies the confidence of the assigned label. Between 0.0 and 1.0, inclusive.
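Because a classifier may return several candidate labels, a typical next step is to rank them by confidence. The sketch below assumes a hypothetical `Classification` class that mirrors the two fields above, and a hypothetical `top_classifications` helper; neither is the SDK's own type.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Classification:
    """Hypothetical stand-in that mirrors the fields described above."""
    class_name: str
    confidence: float


def top_classifications(classifications: List[Classification],
                        count: int = 1) -> List[Classification]:
    """Return the `count` most confident classifications, best first."""
    if count < 1:
        raise ValueError("count must be at least 1")
    ranked = sorted(classifications, key=lambda c: c.confidence, reverse=True)
    return ranked[:count]


# Made-up sample data, for illustration only.
results = [
    Classification("cat", 0.12),
    Classification("dog", 0.81),
    Classification("fox", 0.07),
]
best = top_classifications(results, count=2)
for c in best:
    print(c.class_name, c.confidence)
```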
Supported API methods: GetClassifications, GetClassificationsFromCamera.
Segmentations
3D Object Segmentation is the process of separating and returning a list of the identified “objects” from a 3D scene. The “objects” are usually a list of point clouds with associated metadata, like the label, the 3D bounding box, and center coordinates of the object.
3D object segmentation is useful for obstacle detection. See our guide Navigate with a Rover Base for an example of automating obstacle avoidance with 3D object segmentation.
Any camera that can return 3D point clouds can use 3D object segmentation.
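To illustrate the kind of metadata that can accompany a segmented object, the sketch below derives an axis-aligned 3D bounding box and its center from a list of points. The `SegmentedObject` class and both helpers are hypothetical, and representing a point cloud as a plain list of `(x, y, z)` tuples is an assumption made to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class SegmentedObject:
    """Hypothetical stand-in for one segmented object and its points."""
    label: str
    points: List[Point]


def bounding_box(points: List[Point]) -> Tuple[Point, Point]:
    """Return the (min corner, max corner) of the axis-aligned 3D box."""
    if not points:
        raise ValueError("cannot bound an empty point cloud")
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def box_center(points: List[Point]) -> Point:
    """Return the center of the axis-aligned bounding box."""
    low, high = bounding_box(points)
    return tuple((a + b) / 2 for a, b in zip(low, high))


# Made-up sample data, for illustration only.
obstacle = SegmentedObject(
    "obstacle",
    [(0.0, 0.0, 1.0), (2.0, 1.0, 3.0), (1.0, 4.0, 2.0)],
)
low, high = bounding_box(obstacle.points)
center = box_center(obstacle.points)
print(low, high, center)
```

An obstacle-avoidance routine could compare the box center or nearest corner against the machine's planned path, though the details depend on the segmenter and the coordinate frame in use.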
Supported API methods: GetObjectPointClouds.
Configuration
For configuration information, click on the model name:
Add support for other models
If none of the existing models fits your use case, you can create a modular resource to add support for the model you need.
API
The vision service supports the following API methods:
Method Name | Description |
---|---|
GetDetectionsFromCamera | Get a list of detections from the next image from a specified camera using a configured detector. |
GetDetections | Get a list of detections from a given image using a configured detector. |
GetClassificationsFromCamera | Get a list of classifications from the next image from a specified camera using a configured classifier. |
GetClassifications | Get a list of classifications from a given image using a configured classifier. |
GetObjectPointClouds | Get a list of 3D point cloud objects and associated metadata in the latest picture from a 3D camera (using a specified segmenter). |
CaptureAllFromCamera | Get the next image, detections, classifications, and objects all together, given a camera name. |
Reconfigure | Reconfigure this resource. |
DoCommand | Execute model-specific commands that are not otherwise defined by the service API. |
GetResourceName | Get the ResourceName for this instance of the vision service with the given name. |
GetProperties | Fetch information about which vision methods a given vision service supports. |
Close | Safely shut down the resource and prevent further use. |
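Since not every vision service model supports every kind of operation, client code may want to check capabilities before calling a method. The sketch below shows one way to map capability flags, as a GetProperties-style call might report them, to the method names in the table above. The `VisionProperties` class and its field names are assumptions made for illustration, not a confirmed description of what GetProperties returns.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VisionProperties:
    """Hypothetical capability flags; the field names are assumptions."""
    detections_supported: bool
    classifications_supported: bool
    object_point_clouds_supported: bool


def supported_methods(props: VisionProperties) -> List[str]:
    """List the table's per-operation methods that the flags allow."""
    methods: List[str] = []
    if props.detections_supported:
        methods += ["GetDetections", "GetDetectionsFromCamera"]
    if props.classifications_supported:
        methods += ["GetClassifications", "GetClassificationsFromCamera"]
    if props.object_point_clouds_supported:
        methods.append("GetObjectPointClouds")
    return methods


# A made-up service that only performs detection.
detector_only = VisionProperties(
    detections_supported=True,
    classifications_supported=False,
    object_point_clouds_supported=False,
)
print(supported_methods(detector_only))
```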
Next Steps
For general configuration and development info, see: