Deploy an ML Model with the ML Model Service

The Machine Learning (ML) model service allows you to deploy machine learning models to your machine.

The service supports the following built-in model:

  • tflite_cpu: Runs a TensorFlow Lite model that you have trained or uploaded.

Used with

After deploying your model, you need to configure an additional service to use the deployed model. For example, you can configure an mlmodel vision service and a transform camera to visualize the predictions your model makes.
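
For instance, a vision service configured to use this ML model service might look roughly like the following. This is a minimal sketch: the service name my_detector and the referenced ML model service name fruit_classifier are placeholders to adapt to your own configuration.

"services": [
  {
    "name": "my_detector",
    "type": "vision",
    "model": "mlmodel",
    "attributes": {
      "mlmodel_name": "fruit_classifier"
    }
  }
]

A transform camera can then reference the vision service by name in its pipeline to overlay the model's detections or classifications on a camera stream.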

Modular resources

Search for additional mlmodel models that you can add from the Viam Registry. For configuration information, click on the model name.

Create an ML model service

You can use the ML model service to deploy and run ML models, such as TFLite models that you have trained or uploaded, on your machine:

Navigate to your machine’s Config tab on the Viam app. Click the Services subtab and click Create service in the lower-left corner. Select the ML Model type, then select the TFLite CPU model. Enter a name for your service and click Create.

You can choose to configure your service with an existing model on the machine or deploy a model onto your machine:

To deploy a model onto your machine:

  1. Select Deploy Model On Robot for the Deployment field.
  2. Click on Models to open a dropdown with all of the ML models available to you privately, as well as all of the ML models available in the registry, which are shared by users. You can select any of these models to deploy on your robot.
  3. Optionally, select the Number of threads.

To configure your service with an existing model on the machine:

  1. Select Path to Existing Model On Robot for the Deployment field.
  2. Specify the absolute Model Path and any Optional Settings, such as the absolute Label Path and the Number of threads.

Alternatively, add the tflite_cpu ML model object to the services array in your raw JSON configuration:

"services": [
  {
    "name": "<mlmodel_name>",
    "type": "mlmodel",
    "model": "tflite_cpu",
    "attributes": {
      "model_path": "${packages.ml_model.<model_name>}/<model-name>.tflite",
      "label_path": "${packages.ml_model.<model_name>}/labels.txt",
      "num_threads": <number>
    }
  },
  ... // Other services
]
"services": [
  {
    "name": "fruit_classifier",
    "type": "mlmodel",
    "model": "tflite_cpu",
    "attributes": {
      "model_path": "${packages.ml_model.my_fruit_model}/my_fruit_model.tflite",
      "label_path": "${packages.ml_model.my_fruit_model}/labels.txt",
      "num_threads": 1
    }
  }
]

The following parameters are available for a "tflite_cpu" model:

  • model_path (required): The absolute path to the .tflite model file, as a string.
  • label_path (optional): The absolute path to a .txt file that holds class labels for your TFLite model, as a string. This text file should contain an ordered listing of class labels. Without this file, classes will read as "1", "2", and so on.
  • num_threads (optional): An integer that defines how many CPU threads to use to run inference. Default: 1.

Save the configuration.

Versioning for deployed models

If you upload or train a new version of a model, Viam automatically deploys the latest version of the model to the machine. If you do not want Viam to automatically deploy the latest version of the model, you can change the packages configuration in the Raw JSON machine configuration.

You can get the version number for a specific model version by clicking COPY on the model on the Models tab of the DATA page. The model package config looks like this:

{
  "package": "<model_id>/allblack",
  "version": "YYYYMMDDHHMMSS",
  "name": "<model_name>",
  "type": "ml_model"
}
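
If you want the machine to keep tracking the newest upload, the version is typically set to "latest" instead of a timestamp; a sketch of that form, with placeholder values, might look like:

{
  "package": "<model_id>/<model_name>",
  "version": "latest",
  "name": "<model_name>",
  "type": "ml_model"
}

Replacing "latest" with a specific timestamp version, as in the example above, pins the model and prevents automatic updates.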

tflite_cpu limitations

We strongly recommend that you package your tflite_cpu model with metadata in the standard form.

In the absence of metadata, your tflite_cpu model must satisfy the following requirements:

  • A single input tensor representing the image, of type UInt8 (expecting values from 0 to 255) or Float32 (expecting values from -1 to 1).
  • At least 3 output tensors (the rest won’t be read) containing the bounding boxes, class labels, and confidence scores (in that order).
  • A bounding box output tensor ordered [x x y y], where x is an x-boundary (xmin or xmax) of the bounding box and the same is true for y. Each value should be between 0 and 1, designating the percentage of the image at which the boundary can be found.

These requirements are satisfied by a few publicly available model architectures including EfficientDet, MobileNet, and SSD MobileNet V1. You can use one of these architectures or build your own.
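
As a rough illustration of these input requirements, the following sketch (assuming an existing machine connection named robot and an ML model service named my_mlmodel_service, both placeholders) checks the service metadata and runs inference on a blank UInt8 image tensor:

import numpy as np

from viam.services.mlmodel import MLModelClient

# Placeholder names: adjust the connection and service name for your machine.
my_mlmodel = MLModelClient.from_robot(robot=robot, name="my_mlmodel_service")

# Inspect the expected input tensor names, shapes, and types before building inputs.
metadata = await my_mlmodel.metadata()
print(metadata.input_info)

# A blank 300x300 RGB image as a UInt8 tensor (values 0 to 255); replace the shape
# and the "image" key with whatever your model's metadata reports.
image = np.zeros((1, 300, 300, 3), dtype=np.uint8)
output_tensors = await my_mlmodel.infer({"image": image})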

API

The MLModel service supports the following methods:

  • Infer: Take an already ordered input tensor as an array, make an inference on the model, and return an output tensor map.
  • Metadata: Get the metadata (such as name, type, expected tensor/array shape, inputs, and outputs) associated with the ML model.
  • DoCommand: Send arbitrary commands to the resource.
  • Close: Safely shut down the resource and prevent further use.

Infer

Take an already ordered input tensor as an array, make an inference on the model, and return an output tensor map.

Parameters:

  • input_tensors (Dict[str, NDArray]): A dictionary of input flat tensors, as specified in the metadata.
  • timeout (Optional[float]): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.

Returns:

  • (Dict[str, NDArray]): A dictionary of output flat tensors as specified in the metadata, after being run through an inference engine.

For more information, see the Python SDK Docs.

import numpy as np

from viam.services.mlmodel import MLModelClient

# Get the ML model service from an existing machine connection (robot).
my_mlmodel = MLModelClient.from_robot(robot=robot, name="my_mlmodel_service")

nd_array = np.array([1, 2, 3], dtype=np.float64)
input_tensors = {"0": nd_array}

output_tensors = await my_mlmodel.infer(input_tensors)

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
  • tensors (ml.Tensors): The input map of tensors, as specified in the metadata.

Returns:

  • (ml.Tensors): The output map of tensors, as specified in the metadata, after being run through an inference engine.
  • (error): An error, if one occurred.

For more information, see the Go SDK Docs.

myMLModel, err := mlmodel.FromRobot(robot, "my_mlmodel_service")

// A 1x2x3 input tensor backed by six float32 values.
input_tensors := ml.Tensors{"0": tensor.New(tensor.WithShape(1, 2, 3), tensor.WithBacking([]float32{1, 2, 3, 4, 5, 6}))}

output_tensors, err := myMLModel.Infer(context.Background(), input_tensors)

Metadata

Get the metadata: name, data type, expected tensor/array shape, inputs, and outputs associated with the ML model.

Parameters:

  • timeout (Optional[float]): An option to set how long to wait (in seconds) before calling a time-out and closing the underlying RPC call.

Returns:

  • (Metadata): Name, type, expected tensor/array shape, inputs, and outputs associated with the ML model.

For more information, see the Python SDK Docs.

my_mlmodel = MLModelClient.from_robot(robot=robot, name="my_mlmodel_service")

metadata = await my_mlmodel.metadata()

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.

Returns:

  • (MLMetadata): Struct containing the metadata of the model file, such as the name of the model, what kind of model it is, and the expected tensor/array shape and types of the inputs and outputs of the model.
  • (error): An error, if one occurred.

For more information, see the Go SDK Docs.

myMLModel, err := mlmodel.FromRobot(robot, "my_mlmodel_service")

metadata, err := myMLModel.Metadata(context.Background())

DoCommand

Execute model-specific commands that are not otherwise defined by the service API. For built-in service models, any model-specific commands available are covered with each model’s documentation. If you are implementing your own ML model service and add features that have no built-in API method, you can access them with DoCommand.

Parameters:

  • command (Mapping[str, ValueTypes]): The command to execute.

Returns:

  • (Mapping[str, ValueTypes]): Result of the executed command.

Raises:

  • NotImplementedError: Raised if the Resource does not support arbitrary commands.

my_mlmodel = MLModelClient.from_robot(robot=robot, name="my_mlmodel_service")

my_command = {
  "command": "dosomething",
  "someparameter": 52
}

await my_mlmodel.do_command(my_command)

For more information, see the Python SDK Docs.

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.
  • cmd (map[string]interface{}): The command to execute.

Returns:

  • (map[string]interface{}): Result of the executed command.
  • (error): An error, if one occurred.

myMLModel, err := mlmodel.FromRobot(robot, "my_mlmodel_service")

resp, err := myMLModel.DoCommand(context.Background(), map[string]interface{}{"command": "dosomething", "someparameter": 52})

For more information, see the Go SDK Docs.

Close

Safely shut down the resource and prevent further use.

Parameters:

  • None

Returns:

  • None
my_mlmodel = MLModelClient.from_robot(robot, "my_mlmodel_service")

await my_mlmodel.close()

For more information, see the Python SDK Docs.

Parameters:

  • ctx (Context): A Context carries a deadline, a cancellation signal, and other values across API boundaries.

Returns:

  • (error): An error, if one occurred.

myMLModel, err := mlmodel.FromRobot(robot, "my_mlmodel_service")

err = myMLModel.Close(context.Background())

For more information, see the Go SDK Docs.

Use the ML model service with the Viam Python SDK

To use the ML model service from the Viam Python SDK, install the Python SDK using the mlmodel extra:

pip install 'viam-sdk[mlmodel]'

You can also run this command on an existing Python SDK install to add support for the ML model service.

See the Python documentation for more information about the MLModel service in Python.

See Program a machine for more information about using an SDK to control your machine.
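
The SDK snippets on this page assume an existing machine connection named robot. A minimal connection sketch, assuming you authenticate with an API key and use your machine's address from the Viam app (all values below are placeholders), might look like this:

import asyncio

from viam.robot.client import RobotClient
from viam.services.mlmodel import MLModelClient


async def main():
    # Placeholder credentials and address; copy real values from your machine in the Viam app.
    opts = RobotClient.Options.with_api_key(
        api_key="<API-KEY>",
        api_key_id="<API-KEY-ID>",
    )
    robot = await RobotClient.at_address("<MACHINE-ADDRESS>", opts)

    my_mlmodel = MLModelClient.from_robot(robot=robot, name="my_mlmodel_service")
    print(await my_mlmodel.metadata())

    await robot.close()


asyncio.run(main())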

Next steps

To make use of your model with your machine, add a vision service or a modular resource.


