Detection (or 2D object detection)

Changed in RDK v0.2.36 and API v0.1.118

2D Object Detection is the process of taking a 2D image from a camera and identifying and drawing a box around the distinct “objects” of interest in the scene. Any camera that can return 2D images can use 2D object detection.

The service provides different types of detectors, based on both heuristics and machine learning, so that you can create, register, and use detectors for any object you need to identify.

The returned detections consist of the bounding box around the identified object, as well as its label and confidence score:

  • x_min, y_min, x_max, y_max (int): specify the bounding box around the object.
  • class_name (string): specifies the label of the found object.
  • confidence (float): specifies the confidence of the assigned label. Between 0.0 and 1.0, inclusive.
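As an illustration of how these fields might be consumed, the sketch below filters detections by confidence and computes each box's size and center. The `Detection` dataclass is a hypothetical stand-in that mirrors the fields listed above; it is not the SDK's own type, and the 0.5 threshold is an arbitrary choice.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical stand-in mirroring the fields listed above (not the SDK type)."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int
    class_name: str
    confidence: float


def filter_by_confidence(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Keep only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d.confidence >= threshold]


def box_size(d: Detection) -> Tuple[int, int]:
    """Return (width, height) of the bounding box in pixels."""
    return d.x_max - d.x_min, d.y_max - d.y_min


def box_center(d: Detection) -> Tuple[float, float]:
    """Return the (x, y) center of the bounding box in pixels."""
    return (d.x_min + d.x_max) / 2, (d.y_min + d.y_max) / 2


# Made-up sample data for illustration only.
sample = [
    Detection(10, 20, 110, 220, "blue_square", 0.92),
    Detection(0, 0, 15, 15, "blue_square", 0.31),
]
for d in filter_by_confidence(sample):
    print(d.class_name, box_size(d), box_center(d))
```

The same helper functions should work on any object that exposes these six attributes.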

You can use the following types of detectors:

  • color_detector: A heuristic detector that draws boxes around objects according to their hue (does not detect black, gray, and white).
  • mlmodel: A machine learning detector that draws bounding boxes according to an ML model available on the robot’s hard drive.

Configure a color_detector

This heuristic detector draws boxes around objects according to their hue. Color detectors do not detect black, perfect grays (grays where the red, green, and blue color component values are equal), or white. They only detect hues found on the color wheel.
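To see why black, white, and perfect grays have no hue to match, it can help to convert a few colors to HSV. The sketch below uses Python's standard `colorsys` module; it only illustrates the color-space property and is not a description of how the detector is implemented. The helper names are hypothetical.

```python
import colorsys


def hex_to_hsv(hex_color: str):
    """Convert '#RRGGBB' to (hue, saturation, value), each in [0.0, 1.0]."""
    digits = hex_color.lstrip("#")
    r, g, b = (int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return colorsys.rgb_to_hsv(r, g, b)


def has_usable_hue(hex_color: str) -> bool:
    """A color has a meaningful hue only when its saturation is non-zero."""
    _, saturation, _ = hex_to_hsv(hex_color)
    return saturation > 0.0


# A blue, a perfect gray, black, and white.
for color in ("#1C4599", "#808080", "#000000", "#FFFFFF"):
    print(color, has_usable_hue(color))
```

When the red, green, and blue components are equal, the saturation is zero, so there is no position on the color wheel to compare against.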

Navigate to the robot page on the Viam app. Click on the robot you wish to add the Vision Service to. Select the Config tab, and click on Services.

Scroll to the Create Service section. To create a Vision Service:

  1. Select vision as the Type.
  2. Enter a name as the Name.
  3. Select Color Detector as the Model.
  4. Click Create Service.

Create Vision Service for color detector

In your Vision Service’s panel, select the color your vision service will be detecting, as well as a hue tolerance and a segment size (in pixels):

Color detector panel with color and hue tolerance selection and a field for the segment size

Alternatively, add the Vision Service object to the services array in your raw JSON configuration:

"services": [
  {
    "name": "<service_name>",
    "type": "vision",
    "model": "color_detector",
    "attributes": {
      "segment_size_px": <integer>,
      "detect_color": "#ABCDEF",
      "hue_tolerance_pct": <number>,
      "saturation_cutoff_pct": <number>,
      "value_cutoff_pct": <number>
    }
  },
  ... // Other services
]

For example, the following configuration defines two color detectors:

"services": [
  {
    "name": "blue_square",
    "type": "vision",
    "model": "color_detector",
    "attributes": {
      "segment_size_px": 100,
      "detect_color": "#1C4599",
      "hue_tolerance_pct": 0.07,
      "value_cutoff_pct": 0.15
    }
  },
  {
    "name": "green_triangle",
    "type": "vision",
    "model": "color_detector",
    "attributes": {
      "segment_size_px": 200,
      "detect_color": "#62963F",
      "hue_tolerance_pct": 0.05,
      "value_cutoff_pct": 0.20
    }
  }
]

The following parameters are available for a "color_detector".

Parameter | Inclusion | Description
segment_size_px | Required | An integer that sets the minimum size (in pixels) of a contiguous color region to be detected. Regions below that size are filtered out.
detect_color | Required | The color to detect in the image, written as a hexadecimal string of the form #RRGGBB, prefixed by ‘#’.
hue_tolerance_pct | Required | A number greater than 0.0 and less than or equal to 1.0 that defines how closely a hue must match the requested color. Values near 0.0 mean the color must match exactly, while 1.0 matches every color, regardless of the requested color. 0.05 is a good starting value.
saturation_cutoff_pct | Optional | A number greater than 0.0 and less than or equal to 1.0 that defines the minimum saturation before a color is ignored. Defaults to 0.2.
value_cutoff_pct | Optional | A number greater than 0.0 and less than or equal to 1.0 that defines the minimum value before a color is ignored. Defaults to 0.3.
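The constraints listed above can be checked before saving a configuration. The following is a minimal sketch of such a check; `validate_color_detector` is a hypothetical helper, not part of any Viam SDK, and it encodes only the rules stated in this section.

```python
import re
from typing import Any, Dict, List

HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def _in_unit_range(value: Any) -> bool:
    """True if value is a number with 0.0 < value <= 1.0."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and 0.0 < value <= 1.0


def validate_color_detector(attrs: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the attributes look valid."""
    problems = []
    size = attrs.get("segment_size_px")
    if not isinstance(size, int) or isinstance(size, bool):
        problems.append("segment_size_px is required and must be an integer")
    color = attrs.get("detect_color")
    if not isinstance(color, str) or not HEX_COLOR.match(color):
        problems.append("detect_color is required and must look like #RRGGBB")
    if not _in_unit_range(attrs.get("hue_tolerance_pct")):
        problems.append("hue_tolerance_pct is required and must be > 0.0 and <= 1.0")
    # Optional attributes are only checked when present.
    for key in ("saturation_cutoff_pct", "value_cutoff_pct"):
        if key in attrs and not _in_unit_range(attrs[key]):
            problems.append(f"{key} must be > 0.0 and <= 1.0")
    return problems


print(validate_color_detector({
    "segment_size_px": 100,
    "detect_color": "#1C4599",
    "hue_tolerance_pct": 0.07,
    "value_cutoff_pct": 0.15,
}))
```

Returning a list of messages rather than raising on the first error makes it easier to report every problem in one pass.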

Click Save config and head to the Components tab. Proceed to Add a camera component and a “transform” model.

Configure an mlmodel detector

A machine learning detector that draws bounding boxes according to the specified TensorFlow Lite model file available on the robot’s hard drive. To create an mlmodel detector, you need an ML Model Service with a suitable model.

Navigate to the robot page on the Viam app. Click on the robot you wish to add the Vision Service to. Select the Config tab, and click on Services.

Scroll to the Create Service section.

  1. Select vision as the Type.
  2. Enter a name as the Name.
  3. Select ML Model as the Model.
  4. Click Create Service.

Create Vision Service for mlmodel

In your Vision Service’s panel, fill in the Attributes field.

{
  "mlmodel_name": "<detector_name>"
}

Alternatively, add the Vision Service object to the services array in your raw JSON configuration:

"services": [
  {
    "name": "<service_name>",
    "type": "vision",
    "model": "mlmodel",
    "attributes": {
      "mlmodel_name": "<detector_name>"
    }
  }
]

For example:

"services": [
  {
    "name": "person_detector",
    "type": "vision",
    "model": "mlmodel",
    "attributes": {
      "mlmodel_name": "person_detector"
    }
  }
]

Click Save config and head to the Components tab.

Add a camera component and a “transform” model

You cannot interact with the Vision Service directly. To view its output, you must:

  1. Configure a physical camera component.
  2. Configure a transform camera to view output from the detector overlaid on images from the physical camera.
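As a rough sketch of the second step, the snippet below builds a transform camera entry and prints it as JSON. The names "cam1" and "my_detector" match those used elsewhere on this page, while "detectionCam" is a hypothetical name. The "detections" pipeline step and its attribute names are assumptions here and should be verified against the transform camera documentation before use.

```python
import json

# Assumption: "cam1" is the physical camera and "my_detector" is the Vision
# Service configured above; "detectionCam" is a hypothetical component name.
# Assumption: the "detections" pipeline step and its attribute names must be
# checked against the transform camera documentation.
transform_camera = {
    "name": "detectionCam",
    "type": "camera",
    "model": "transform",
    "attributes": {
        "source": "cam1",
        "pipeline": [
            {
                "type": "detections",
                "attributes": {
                    "detector_name": "my_detector",
                    "confidence_threshold": 0.5,
                },
            }
        ],
    },
}

# Print the entry so it can be pasted into the components array.
print(json.dumps(transform_camera, indent=2))
```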

After adding the component and its attributes, click Save config.

Wait for the robot to reload, and then go to the Control tab to test the stream of detections.

Code

The following code gets the robot’s Vision Service and then runs a color detector named "my_detector" on an image from the robot’s camera "cam1":

from viam.components.camera import Camera
from viam.services.vision import VisionClient

# connect() is assumed to be your own helper that returns a connected robot client
robot = await connect()
# grab camera from the robot
cam1 = Camera.from_robot(robot, "cam1")
# grab Viam's vision service for the detector
my_detector = VisionClient.from_robot(robot, "my_detector")

img = await cam1.get_image()
detections = await my_detector.get_detections(img)

await robot.close()

To learn more about how to use detection, see the Python SDK docs.
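Once you have a list of detections, a common next step is to pick one target and work out where it sits in the frame. The sketch below is one way to do that, assuming each detection exposes the attributes described at the top of this page. The helper names are hypothetical, and the `SimpleNamespace` objects are stand-ins for real detections so that the example runs without a robot.

```python
from types import SimpleNamespace


def best_detection(detections, class_name=None):
    """Return the highest-confidence detection, optionally restricted to one label."""
    candidates = [d for d in detections if class_name is None or d.class_name == class_name]
    return max(candidates, key=lambda d: d.confidence, default=None)


def horizontal_offset(detection, image_width):
    """Signed offset of the box center from the image center, scaled to [-1.0, 1.0]."""
    center_x = (detection.x_min + detection.x_max) / 2
    half_width = image_width / 2
    return (center_x - half_width) / half_width


# Stand-in data; in real code this list would come from get_detections().
fake = [
    SimpleNamespace(x_min=0, y_min=0, x_max=100, y_max=100, class_name="blue_square", confidence=0.6),
    SimpleNamespace(x_min=400, y_min=50, x_max=600, y_max=250, class_name="blue_square", confidence=0.9),
]
target = best_detection(fake, "blue_square")
print(target.confidence, horizontal_offset(target, image_width=640))
```

A positive offset means the object is to the right of the image center, and a negative offset means it is to the left.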

import (
    "context"

    "go.viam.com/rdk/components/camera"
    "go.viam.com/rdk/services/vision"
)

// grab the camera from the robot
cameraName := "cam1" // make sure to use the same component name that you have in your robot configuration
myCam, err := camera.FromRobot(robot, cameraName)
if err != nil {
  logger.Fatalf("cannot get camera: %v", err)
}

visService, err := vision.FromRobot(robot, "my_detector")
if err != nil {
    logger.Fatalf("Cannot get Vision Service: %v", err)
}

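// get a stream from the camera
camStream, err := myCam.Stream(context.Background())
if err != nil {
    logger.Fatalf("cannot get camera stream: %v", err)
}

// get an image from the camera stream
img, release, err := camStream.Next(context.Background())
if err != nil {
    logger.Fatalf("cannot get image: %v", err)
}
defer release()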

// Apply the detector to the image from your camera (configured as "cam1")
detections, err := visService.Detections(context.Background(), img, nil)
if err != nil {
    logger.Fatalf("Could not get detections: %v", err)
}
if len(detections) > 0 {
    logger.Info(detections[0])
}

To learn more about how to use detection, see the Go SDK docs.
