Configure an mlmodel Detector or Classifier

Changed in RDK v0.2.36 and API v0.1.118

The mlmodel vision service model is a machine learning detector or classifier that draws bounding boxes or returns class labels according to the specified TensorFlow Lite model file available on the machine’s hard drive. To create an mlmodel detector or classifier, you need an ML model service with a suitable model. Before configuring your mlmodel detector or classifier, you need to:

Train or upload an ML model

You can add an existing model or train your own models for object detection and classification using data from the data management service.

Deploy your model

To make use of ML models with your machine, use the built-in ML model service to deploy and run the model.


Once you have deployed your ML model, configure your mlmodel detector or classifier:

Navigate to your machine’s Config tab on the Viam app. Click the Services subtab and click Create service in the lower-left corner. Select the Vision type, then select the ML Model model. Enter a name for your service and click Create.

In your vision service’s panel, fill in the Attributes field:

{
  "mlmodel_name": "<mlmodel-service-name>"
}

Add the vision service object to the services array in your raw JSON configuration:

"services": [
  {
    "name": "<service_name>",
    "type": "vision",
    "model": "mlmodel",
    "attributes": {
      "mlmodel_name": "<mlmodel-service-name>"
    }
  },
  ... // Other services
]

Example configuration for a detector:

"services": [
  {
    "name": "person_detector",
    "type": "vision",
    "model": "mlmodel",
    "attributes": {
      "mlmodel_name": "my_mlmodel_service"
    }
  }
]

Example configuration for a classifier:

"services": [
  {
    "name": "fruit_classifier",
    "type": "vision",
    "model": "mlmodel",
    "attributes": {
      "mlmodel_name": "fruit_classifier"
    }
  }
]

Click Save config. Proceed to test your detector or classifier.
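
If you generate machine configurations from a script, the services entry shown above can be built programmatically. The following is a minimal sketch in Python, not part of any Viam SDK: the helper name make_mlmodel_vision_service is hypothetical, and the field names are copied from the raw JSON template on this page.

```python
import json


def make_mlmodel_vision_service(name: str, mlmodel_name: str) -> dict:
    """Build one entry for the "services" array of the raw JSON config.

    name is the vision service name; mlmodel_name is the name of the
    ML model service that the vision service should use.
    (Hypothetical helper for illustration only.)
    """
    if not name or not mlmodel_name:
        raise ValueError("both name and mlmodel_name must be non-empty")
    return {
        "name": name,
        "type": "vision",
        "model": "mlmodel",
        "attributes": {"mlmodel_name": mlmodel_name},
    }


# Reproduce the detector example from this page.
services = [
    make_mlmodel_vision_service("person_detector", "my_mlmodel_service")
]
print(json.dumps({"services": services}, indent=2))
```

The printed object can be pasted into the raw JSON configuration in place of the hand-written services array.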

Test your detector or classifier

You can test your detector or classifier with existing images in the Viam app or live camera footage. You can also test classifiers with existing images on a computer.

Existing images in the cloud

If you have images stored in the Viam cloud, you can run your classifier against your images in the Viam app.

  1. Navigate to the Data tab and click on the Images subtab.
  2. Click on an image to open the side menu, and select the Actions tab under the Data tab.
  3. In the Run model section, select your model and specify a confidence threshold.
  4. Click Run model.

If the classifier’s results exceed the confidence threshold, the Run model section shows a label and the corresponding confidence score.
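
To illustrate what a confidence threshold does, here is a small sketch in Python. It is not the Viam app's implementation: LabelScore and above_threshold are hypothetical names, and treating "exceed" as a strict greater-than comparison is an assumption.

```python
from typing import NamedTuple


class LabelScore(NamedTuple):
    """Hypothetical stand-in for one classifier result."""
    label: str
    confidence: float


def above_threshold(results, threshold):
    """Keep results whose confidence is strictly greater than threshold.

    The kept results are returned with the highest confidence first.
    """
    kept = [r for r in results if r.confidence > threshold]
    return sorted(kept, key=lambda r: r.confidence, reverse=True)


results = [
    LabelScore("apple", 0.82),
    LabelScore("pear", 0.41),
    LabelScore("plum", 0.07),
]
for r in above_threshold(results, 0.5):
    print(f"{r.label}: {r.confidence:.2f}")
```

With a threshold of 0.5, only the "apple" result is kept; lowering the threshold lets more labels through.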

Live camera footage

If you intend to use the detector or classifier with a camera that is part of your machine, you can test your detector or classifier from the Control tab or with code:

  1. Configure a camera component.

  2. (Optional) If you would like to see detections or classifications from the Control tab, configure a transform camera with the following attributes:

{
  "pipeline": [
    {
      "type": "detections",
      "attributes": {
        "confidence_threshold": 0.5,
        "detector_name": "<vision-service-name>"
      }
    }
  ],
  "source": "<camera-name>"
}

For a classifier, use a classifications transform instead:

{
  "pipeline": [
    {
      "type": "classifications",
      "attributes": {
        "confidence_threshold": 0.5,
        "classifier_name": "<vision-service-name>"
      }
    }
  ],
  "source": "<camera-name>"
}
  3. After adding the components and their attributes, click Save config.

  4. Navigate to the Control tab, click on your transform camera and toggle it on. If you’ve configured a detector, the transform camera will now show detections with bounding boxes around the object.

    Viam app control tab interface showing bounding boxes around two office chairs, both labeled “chair” with confidence score “0.50.”

    If you’ve configured a classifier, the transform camera will now show classifications on the image.

    Model recognizes a star on camera feed

  5. The following code gets the machine’s vision service and then runs a detector or classifier vision model on an image from the machine’s camera "cam1".

from viam.components.camera import Camera
from viam.services.vision import VisionClient

robot = await connect()
camera_name = "cam1"

# Grab camera from the machine
cam1 = Camera.from_robot(robot, camera_name)
# Grab Viam's vision service for the detector
my_detector = VisionClient.from_robot(robot, "my_detector")

detections = await my_detector.get_detections_from_camera(camera_name)

# If you need to store the image, get the image first
# and then run detections on it. This process is slower:
img = await cam1.get_image()
detections_from_image = await my_detector.get_detections(img)

await robot.close()

To learn more about how to use detection, see the Python SDK docs.
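
Once you have a list of detections, you will usually want to filter it and read out the bounding boxes. The sketch below shows one way to do that in Python. It runs without a machine connection by using StubDetection, a hypothetical stand-in class; the field names (class_name, confidence, x_min, y_min, x_max, y_max) are assumed to match the detection objects returned by the SDK, so check them against the SDK docs before relying on them.

```python
from dataclasses import dataclass


@dataclass
class StubDetection:
    """Stand-in with the field names assumed for SDK detection objects."""
    class_name: str
    confidence: float
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def confident_boxes(detections, min_confidence=0.5):
    """Return (label, confidence, width, height) tuples.

    Only detections with confidence >= min_confidence are included.
    """
    boxes = []
    for d in detections:
        if d.confidence >= min_confidence:
            width = d.x_max - d.x_min
            height = d.y_max - d.y_min
            boxes.append((d.class_name, d.confidence, width, height))
    return boxes


sample = [
    StubDetection("chair", 0.91, 10, 20, 110, 220),
    StubDetection("chair", 0.32, 300, 40, 360, 120),
]
for label, conf, width, height in confident_boxes(sample):
    print(f"{label} ({conf:.2f}): {width}x{height} px")
```

In a real program you would pass the list returned by get_detections_from_camera or get_detections instead of sample.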

import (
  "context"

  "go.viam.com/rdk/components/camera"
  "go.viam.com/rdk/services/vision"
)

// Grab the camera from the machine
cameraName := "cam1" // make sure to use the same component name that you have in your machine configuration
myCam, err := camera.FromRobot(robot, cameraName)
if err != nil {
  logger.Fatalf("cannot get camera: %v", err)
}

myDetector, err := vision.FromRobot(robot, "my_detector")
if err != nil {
    logger.Fatalf("Cannot get vision service: %v", err)
}

// Get detections from the camera output
// (pass the camera's name, not the camera object)
detections, err := myDetector.DetectionsFromCamera(context.Background(), cameraName, nil)
if err != nil {
    logger.Fatalf("Could not get detections: %v", err)
}
if len(detections) > 0 {
    logger.Info(detections[0])
}

// If you need to store the image, get the image first
// and then run detections on it. This process is slower:

// Get the stream from a camera
camStream, err := myCam.Stream(context.Background())
if err != nil {
    logger.Fatalf("Could not get camera stream: %v", err)
}

// Get an image from the camera stream
img, release, err := camStream.Next(context.Background())
if err != nil {
    logger.Fatalf("Could not get image: %v", err)
}
defer release()

// Apply the detector to the image from your camera (configured as "cam1")
detectionsFromImage, err := myDetector.Detections(context.Background(), img, nil)
if err != nil {
    logger.Fatalf("Could not get detections: %v", err)
}
if len(detectionsFromImage) > 0 {
    logger.Info(detectionsFromImage[0])
}

To learn more about how to use detection, see the Go SDK docs.

from viam.components.camera import Camera
from viam.services.vision import VisionClient

robot = await connect()
camera_name = "cam1"
# Grab camera from the machine
cam1 = Camera.from_robot(robot, camera_name)
# Grab Viam's vision service for the classifier
my_classifier = VisionClient.from_robot(robot, "my_classifier")

# Get the top 2 classifications with the highest confidence scores from the
# camera output
classifications = await my_classifier.get_classifications_from_camera(
    camera_name, 2)

# If you need to store the image, get the image first
# and then run classifications on it. This process is slower:
img = await cam1.get_image()
classifications_from_image = await my_classifier.get_classifications(img, 2)

await robot.close()

To learn more about how to use classification, see the Python SDK docs.
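
After requesting the top classifications, a common next step is to pick the single most confident label. The following Python sketch shows the idea without a machine connection. StubClassification and best_label are hypothetical names, and the class_name and confidence fields are assumed to match the classification objects returned by the SDK.

```python
from typing import NamedTuple


class StubClassification(NamedTuple):
    """Stand-in with the field names assumed for SDK classification objects."""
    class_name: str
    confidence: float


def best_label(classifications, min_confidence=0.0):
    """Return the class name with the highest confidence.

    Returns None when no classification has confidence >= min_confidence.
    """
    candidates = [c for c in classifications if c.confidence >= min_confidence]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.confidence).class_name


sample = [
    StubClassification("star", 0.76),
    StubClassification("circle", 0.18),
]
print(best_label(sample))
print(best_label(sample, min_confidence=0.9))
```

Returning None rather than raising makes it easy to treat "nothing recognized" as an ordinary case in a control loop.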

import (
  "context"

  "go.viam.com/rdk/components/camera"
  "go.viam.com/rdk/services/vision"
)

// Grab the camera from the machine
cameraName := "cam1" // make sure to use the same component name that you have in your machine configuration
myCam, err := camera.FromRobot(robot, cameraName)
if err != nil {
  logger.Fatalf("cannot get camera: %v", err)
}

myClassifier, err := vision.FromRobot(robot, "my_classifier")
if err != nil {
    logger.Fatalf("Cannot get vision service: %v", err)
}

// Get the top 2 classifications with the highest confidence scores from the camera output
// (pass the camera's name, not the camera object)
classifications, err := myClassifier.ClassificationsFromCamera(context.Background(), cameraName, 2, nil)
if err != nil {
    logger.Fatalf("Could not get classifications: %v", err)
}
if len(classifications) > 0 {
    logger.Info(classifications[0])
}

// If you need to store the image, get the image first
// and then run classifications on it. This process is slower:

// Get the stream from a camera
camStream, err := myCam.Stream(context.Background())
if err != nil {
    logger.Fatalf("Could not get camera stream: %v", err)
}

// Get an image from the camera stream
img, release, err := camStream.Next(context.Background())
if err != nil {
    logger.Fatalf("Could not get image: %v", err)
}
defer release()

// Apply the classifier to the image from your camera (configured as "cam1")
// Get the top 2 classifications with the highest confidence scores
classificationsFromImage, err := myClassifier.Classifications(context.Background(), img, 2, nil)
if err != nil {
    logger.Fatalf("Could not get classifications: %v", err)
}
if len(classificationsFromImage) > 0 {
    logger.Info(classificationsFromImage[0])
}

To learn more about how to use classification, see the Go SDK docs.
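
The transform camera attributes used earlier in this section differ only in the transform type and in the key that names the vision service. The sketch below builds either variant in Python. The helper name make_transform_attributes is hypothetical, and the check that the threshold lies between 0 and 1 is an assumption rather than a documented rule.

```python
import json


def make_transform_attributes(kind, vision_service, source_camera,
                              confidence_threshold=0.5):
    """Build transform camera attributes for a detector or classifier.

    kind must be "detections" or "classifications"; the matching key
    ("detector_name" or "classifier_name") is taken from this page.
    """
    name_keys = {
        "detections": "detector_name",
        "classifications": "classifier_name",
    }
    if kind not in name_keys:
        raise ValueError(f"unsupported transform type: {kind!r}")
    # Assumption: confidence values are fractions between 0 and 1.
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0 and 1")
    return {
        "pipeline": [
            {
                "type": kind,
                "attributes": {
                    "confidence_threshold": confidence_threshold,
                    name_keys[kind]: vision_service,
                },
            }
        ],
        "source": source_camera,
    }


print(json.dumps(
    make_transform_attributes("detections", "my_detector", "cam1"),
    indent=2,
))
```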

Existing images on your machine

If you would like to test your detector or classifier with existing images, load the images and pass them to the detector or classifier:

from viam.services.vision import VisionClient
from PIL import Image

robot = await connect()
# Grab Viam's vision service for the detector
my_detector = VisionClient.from_robot(robot, "my_detector")

# Load an image
img = Image.open('test-image.png')

# Apply the detector to the image
detections_from_image = await my_detector.get_detections(img)

await robot.close()

To learn more about how to use detection, see the Python SDK docs.

import (
  "context"
  "image/jpeg"
  "os"

  "go.viam.com/rdk/services/vision"
)

myDetector, err := vision.FromRobot(robot, "my_detector")
if err != nil {
    logger.Fatalf("Cannot get Vision Service: %v", err)
}

// Read image from existing file
file, err := os.Open("test-image.jpeg")
if err != nil {
    logger.Fatalf("Could not get image: %v", err)
}
defer file.Close()
img, err := jpeg.Decode(file)
if err != nil {
    logger.Fatalf("Could not decode image: %v", err)
}

// Apply the detector to the image
detectionsFromImage, err := myDetector.Detections(context.Background(), img, nil)
if err != nil {
    logger.Fatalf("Could not get detections: %v", err)
}
if len(detectionsFromImage) > 0 {
    logger.Info(detectionsFromImage[0])
}

To learn more about how to use detection, see the Go SDK docs.

from viam.services.vision import VisionClient
from PIL import Image

robot = await connect()
# Grab Viam's vision service for the classifier
my_classifier = VisionClient.from_robot(robot, "my_classifier")

# Load an image
img = Image.open('test-image.png')

# Apply the classifier to the image and get the top 2 classifications
classifications_from_image = await my_classifier.get_classifications(img, 2)

await robot.close()

To learn more about how to use classification, see the Python SDK docs.
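
To test against a whole folder of images rather than a single file, you can loop over the files and collect the results. The following Python sketch shows the pattern with stand-in functions so that it runs without a machine connection. The names classify_files, fake_get_classifications, and fake_load are hypothetical.

```python
import asyncio


async def classify_files(get_classifications, load_image, paths, count=2):
    """Run a classifier coroutine over several image files.

    get_classifications is awaited as get_classifications(image, count);
    load_image turns a path into an image. Returns {path: results}.
    """
    results = {}
    for path in paths:
        img = load_image(path)
        results[path] = await get_classifications(img, count)
    return results


async def fake_get_classifications(img, count):
    # Hypothetical stand-in for a real classifier call.
    return [("label-for-" + img, 1.0)][:count]


def fake_load(path):
    # Hypothetical stand-in for an image loader.
    return path.upper()


results = asyncio.run(
    classify_files(fake_get_classifications, fake_load, ["a.png", "b.png"])
)
print(results)
```

In a real program, you would pass my_classifier.get_classifications and Image.open in place of the two stand-ins, and await classify_files from within your existing async code instead of calling asyncio.run.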

import (
  "context"
  "image/jpeg"
  "os"

  "go.viam.com/rdk/services/vision"
)

myClassifier, err := vision.FromRobot(robot, "my_classifier")
if err != nil {
    logger.Fatalf("Cannot get Vision Service: %v", err)
}

// Read image from existing file
file, err := os.Open("test-image.jpeg")
if err != nil {
    logger.Fatalf("Could not get image: %v", err)
}
defer file.Close()
img, err := jpeg.Decode(file)
if err != nil {
    logger.Fatalf("Could not decode image: %v", err)
}

// Apply the classifier to the image and get the top 2 classifications
classificationsFromImage, err := myClassifier.Classifications(context.Background(), img, 2, nil)
if err != nil {
    logger.Fatalf("Could not get classifications: %v", err)
}
if len(classificationsFromImage) > 0 {
    logger.Info(classificationsFromImage[0])
}

To learn more about how to use classification, see the Go SDK docs.
