How to train and deploy ML/computer vision models

You can use Viam’s built-in tools to train a machine learning (ML) model on your images and then deploy computer vision on your machines.

Diagram of the camera component to data management service to ML model service to vision service pipeline.

You can use ML models to help your machines adapt their behavior to the world around them.

For example, you can train a model to recognize your dog and detect whether they are sitting or standing. You could then use that knowledge to give your dog treats or capture images only when your dog is in the camera frame so you don’t capture hundreds of photos of an empty room.

Prerequisites

A running machine connected to the Viam app.
Add a new machine in the Viam app. Then follow the setup instructions to install viam-server on the computer you’re using for your project and connect to the Viam app. Wait until your machine has successfully connected.
A configured camera.

Navigate to the CONFIGURE tab of your machine’s page in the Viam app. Click the + icon next to your machine part in the left-hand menu and select Component. Then find and add a camera model that supports your camera.

If you are not sure what to use, start with a webcam which supports most USB cameras and inbuilt laptop webcams.

No computer or webcam?

No problem. You don’t need to buy or own any hardware to complete this guide.

Use Try Viam to borrow a rover free of cost online. The rover already has viam-server installed and is configured with some components, including a webcam.

Once you have borrowed a rover, go to its CONTROL tab where you can view camera streams and also drive the rover. You should have a front-facing camera and an overhead view of your rover. Now you know what the rover can perceive.

To change what the front-facing camera is pointed at, find the cam camera panel on the CONTROL tab and click Toggle picture-in-picture so you can continue to view the camera stream. Then, find the viam_base panel and drive the rover around.

Now that you have seen that the cameras on your Try Viam rover work, begin by creating a dataset and labeling data. You can drive the rover around as you capture data to get a variety of images from different angles.

Create a dataset and label data

Start by assembling the dataset to train your machine learning model on.

Just testing and want a dataset to get started with?

We have two datasets you can use for testing, one with shapes and the other with a wooden figure:

The shapes dataset. The datasets subtab of the data tab in the Viam app, showing a custom 'viam-figure' dataset of 25 images, most containing the wooden Viam figure
  1. Download the shapes dataset or download the wooden figure dataset.

  2. Unzip the download.

  3. Open a terminal and go to the dataset folder.

  4. In it you will find a Python script to upload the data to the Viam app.

  5. Open the script and fill in the constants at the top of the file.

  6. Run the script to upload the data into a dataset in Viam app:

    python3 upload_data.py
    
  7. Continue to Train a machine learning model.

1. Collect images

Start by collecting images from your cameras and syncing them to the Viam app. See Collect image data and sync it to the cloud for instructions.

When training machine learning models, it is important to supply varied data about the subject, such as images taken from different angles or in different lighting. A more varied dataset generally produces a more accurate model.

2. Label your images

Once you have enough images of the objects you’d like to identify captured and synced to the Viam app, use the interface on the DATA tab to label your data.

You can label your images to create:

  • Detection models: Draw bounding boxes around distinct objects within captured images. The trained model enables your machine to detect those objects on its own.
  • Classification models: Add tags to each of your images with class labels that describe it. The trained model will enable your machine to classify similar images on its own.

Create image tags (for an image classifier)

You can use tags to create classification models for images. For example, if you would like to create a model that identifies an image of a star in a set of images, tag each image containing a star with a star tag. You also need images without the star tag or with another tag like notstar.
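Before training a classifier, it can help to confirm that a tag has both positive and negative examples. The following is a minimal sketch in plain Python, assuming you have a mapping from image name to its tags. It does not call the Viam API, and the names `tag_balance` and `image_tags` are hypothetical.

```python
from typing import Dict, List, Tuple


def tag_balance(image_tags: Dict[str, List[str]], tag: str) -> Tuple[int, int]:
    """Return (number of images with the tag, number of images without it)."""
    with_tag = sum(1 for tags in image_tags.values() if tag in tags)
    return with_tag, len(image_tags) - with_tag


# Hypothetical labels for four images; in practice these come from your dataset.
image_tags = {
    "img_001.jpg": ["star"],
    "img_002.jpg": ["notstar"],
    "img_003.jpg": ["star"],
    "img_004.jpg": [],
}

positives, negatives = tag_balance(image_tags, "star")
print(f"star: {positives} with the tag, {negatives} without it")
```

If either count is zero, the dataset gives the model nothing to contrast the tag against.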

To tag an image, click on an image and select the Image tags mode in the menu that opens. Add one or more tags to your image.

If you want to expand the image, click on the expand side menu arrow in the corner of the image.

Repeat this with all images.

Create bounding boxes (for an object detector)

You can create one or more bounding boxes for objects in each image. For example, if you would like to create a model that detects a dog in an image, add bounding boxes around the dog in each of your images and add or select the label dog.

To add a bounding box, click on an image and select the Bounding box mode in the menu that opens. Choose an existing label or create a new label. Click on the image where you would like to add the bounding box and drag to where the bounding box should end.

To expand the image, click on the expand side menu arrow in the corner of the image.

Repeat this with all images. To see all the images that have bounding boxes, you can filter your dataset by selecting the label from the Bounding box labels dropdown in the Filters menu.
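Conceptually, a click-and-drag gesture defines two opposite corners of a box. The sketch below shows one way to turn those two points into a box, assuming pixel coordinates and a normalized (0 to 1) output. The function name `box_from_drag` and the normalized format are illustrative assumptions, not a description of how the Viam app stores labels.

```python
from typing import Tuple

Point = Tuple[float, float]


def box_from_drag(
    start: Point, end: Point, image_size: Tuple[int, int]
) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max), each normalized to the image size.

    The drag may go in any direction, so the corners are sorted first.
    """
    (x0, y0), (x1, y1) = start, end
    width, height = image_size
    x_min, x_max = sorted((x0, x1))
    y_min, y_max = sorted((y0, y1))
    return (x_min / width, y_min / height, x_max / width, y_max / height)


# Dragging from bottom-right to top-left yields the same box as the reverse.
print(box_from_drag((300, 200), (100, 50), (400, 200)))
```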

3. Create a dataset

A dataset allows you to conveniently view, work with, and train an ML model on a collection of images.

Use the interface on the DATA tab (or the viam dataset data add command) to add all images you want to train the model on to a dataset.

Click on an image you want to train your ML model on. In the Actions pane on the right-hand side, enter a dataset name under Datasets, then press return. Repeat this with all images you want to add to your dataset.

Want to do this programmatically?

You can also add all data with a certain label to a dataset using the viam dataset data add command or the Data Client API:

viam dataset create --org-id=<org-id> --name=<name>
viam dataset data add filter --dataset-id=<dataset-id> --tags=red_star,blue_square

You can run this script to add all data from your machine to a dataset:

import asyncio

from viam.rpc.dial import DialOptions, Credentials
from viam.app.viam_client import ViamClient
from viam.utils import create_filter
from viam.proto.app.data import BinaryID

# Replace "<ORG-ID>" (including brackets) with your organization ID
ORG_ID = "<ORG-ID>"


async def connect() -> ViamClient:
    dial_options = DialOptions(
      credentials=Credentials(
        type="api-key",
        # Replace "<API-KEY>" (including brackets) with your machine's API key
        payload='<API-KEY>',
      ),
      # Replace "<API-KEY-ID>" (including brackets) with your machine's
      # API key ID
      auth_entity='<API-KEY-ID>'
    )
    return await ViamClient.create_from_dial_options(dial_options)


async def main():
    # Make a ViamClient
    viam_client = await connect()
    # Instantiate a DataClient to run data client API methods on
    data_client = viam_client.data_client

    # Replace "<PART-ID>" (including brackets) with your machine's part id
    my_filter = create_filter(part_id="<PART-ID>")

    print("Getting data for part...")
    binary_metadata, _, _ = await data_client.binary_data_by_filter(
        my_filter,
        include_binary_data=False
    )
    my_binary_ids = []

    for obj in binary_metadata:
        my_binary_ids.append(
            BinaryID(
                file_id=obj.metadata.id,
                organization_id=obj.metadata.capture_metadata.organization_id,
                location_id=obj.metadata.capture_metadata.location_id
                )
            )
    print("Creating dataset...")
    # Create dataset
    try:
        dataset_id = await data_client.create_dataset(
            name="MyDataset",
            organization_id=ORG_ID
        )
        print("Created dataset: " + dataset_id)
    except Exception:
        print("Error. Check that the dataset name does not already exist.")
        print("See: https://app.viam.com/data/datasets")
        # Close the connection before exiting early
        viam_client.close()
        return 1

    print("Adding data to dataset...")
    await data_client.add_binary_data_to_dataset_by_ids(
        binary_ids=my_binary_ids,
        dataset_id=dataset_id
    )
    print("Added files to dataset.")
    print("See dataset: https://app.viam.com/data/datasets?id=" + dataset_id)

    viam_client.close()

if __name__ == '__main__':
    asyncio.run(main())

To remove an image from a dataset, click the x button next to the dataset name.

Train a machine learning (ML) model

1. Train an ML model

In the Viam app, navigate to your list of DATASETS and select the one you want to train on.

Click Train model and follow the prompts.

Select to train a new model or update an existing model. You can train or update using Built-in training or using a training script from the Viam Registry.

Click Next steps.

2. Select the details for your ML model

  • Enter a name or use the suggested name for your new model.
  • For built-in training scripts, select a Model Type. Depending on the training script you’ve chosen, you may have a number of these options:
    • Single Label Classification: The resulting model predicts one of the selected labels or UNKNOWN per image. Select this if you only have one label on each image. Ensure that the dataset you are training on also contains unlabeled images.
    • Multi Label Classification: The resulting model predicts one or more of the selected labels per image.
    • Object Detection: The resulting model predicts either no detected objects or any number of object labels alongside their locations per image.
  • For built-in classification training, select the tags you want to train your model on from the Labels section. Unselected tags will be ignored, and will not be part of the resulting model.
  • Click Train model.
The data tab showing the train a model pane
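The difference between the two classification types can be sketched in a few lines of Python. This is an illustration only, assuming the model produces one score per label and that UNKNOWN is one of the possible outputs of a single-label model. The function names and scores are hypothetical and do not reflect the internals of Viam's training scripts.

```python
from typing import Dict, List


def single_label(scores: Dict[str, float]) -> str:
    """Single Label Classification: exactly one label per image."""
    return max(scores, key=scores.get)


def multi_label(scores: Dict[str, float], threshold: float) -> List[str]:
    """Multi Label Classification: every label whose score clears the threshold."""
    return [label for label, score in scores.items() if score >= threshold]


# Hypothetical per-label scores for one image.
single_scores = {"star": 0.2, "circle": 0.1, "UNKNOWN": 0.7}
multi_scores = {"star": 0.7, "circle": 0.6, "square": 0.1}

print(single_label(single_scores))     # UNKNOWN
print(multi_label(multi_scores, 0.5))  # ['star', 'circle']
```

An object detection model differs from both: instead of labels for the whole image, it returns zero or more labels, each with a location.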

3. Wait for your model to train

The model now starts training and you can follow its progress in the Training section of the Models page.

The models tab on the data page showing a model named my-classifier-model being trained

Once the model has finished training, it becomes visible in the Models section of the page. You will receive an email when your model finishes training.

The trained model

4. Debug your training job

If your training job failed, you can check your job’s logs with the CLI.

You can obtain the job’s id by listing the jobs:

viam train list --org-id=<INSERT ORG ID> --job-status=unspecified

Then use the job id to get your training job’s logs:

viam train logs --job-id=<JOB ID>

Test your ML model

Once your model has finished training, you can test it with images in the Viam app:

  1. Navigate to the DATA tab and click on the Images subtab.
  2. Click on an image to open the side menu, and select the Actions tab.
  3. In the Run model section, select your model and specify a confidence threshold.
  4. Click Run model

If the results exceed the confidence threshold, the Run model section shows a label and the corresponding confidence score.

When satisfied that your ML model is working well, continue to deploy an ML model. If the model is not detecting or classifying reliably, adjust it by adding and labeling more images in your dataset.

Ideally, you want your ML model to be able to identify objects with a high level of confidence, which is dependent on a robust training dataset.

Deploy an ML model

To use an ML model on your machine, you need to deploy it to your machine using a compatible ML model service. The ML model service will run the model and allow a vision service to use it.

1. Deploy your ML model

Navigate to the CONFIGURE tab of one of your machines in the Viam app. Here, add an ML model service that supports the ML model you just trained and add the model as the Model. For example, use the TFLite CPU ML model service for TFLite ML models. This service will deploy and run the model.

2. Configure an mlmodel vision service

The vision service takes the ML model and applies it to the stream of images from your camera.

Add the vision / ML model service to your machine. Then, from the Select model dropdown, select the name of the ML model service you configured in the last step (for example, mlmodel-1).

Click Save to save your changes.

3. Use your detector or classifier

You can test your detector by clicking on the Test area of the vision service’s configuration panel or from the CONTROL tab.

The camera stream will show classifications or detections when it identifies something, depending on your model. Try placing an object your ML model can recognize in front of the camera. If you are using a Viam rover, use the viam_base panel to move your rover, then click on the vision panel to check for classifications or detections.

Detected blue star Detection of a viam figure with a confidence score of 0.97
Want to limit the number of shown classifications or detections?

If you are seeing a lot of classifications or detections, you can set a minimum confidence threshold.

On the configuration page of the vision service in the top right corner, click {} (Switch to advanced). Add the following JSON to the JSON configuration to set the default_minimum_confidence of the detector:

"default_minimum_confidence": 0.82

The full configuration for the attributes of the vision service should resemble:

{
  "mlmodel_name": "mlmodel-1",
  "default_minimum_confidence": 0.82
}

This optional attribute reduces your output by filtering out classifications or detections below the threshold of 82% confidence. You can adjust this attribute as necessary.
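The effect of a minimum confidence threshold can be sketched in plain Python. This is an illustration under stated assumptions, not the vision service's implementation: the `Prediction` class and `filter_by_confidence` function are hypothetical, and whether a value exactly equal to the threshold is kept is assumed here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    label: str
    confidence: float  # between 0.0 and 1.0


def filter_by_confidence(
    predictions: List[Prediction], minimum: float
) -> List[Prediction]:
    """Keep only predictions at or above the minimum confidence."""
    return [p for p in predictions if p.confidence >= minimum]


# Hypothetical model output for one camera frame.
predictions = [
    Prediction("star", 0.97),
    Prediction("circle", 0.41),
    Prediction("star", 0.60),
]

kept = filter_by_confidence(predictions, 0.82)
print([p.label for p in kept])  # ['star']
```

A higher threshold shows fewer but more reliable results, and a lower threshold shows more results at the cost of more false positives.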

Click the Save button in the top right corner of the page to save your configuration, then close and reopen the Test panel of the vision service configuration panel. You will now only see classifications or detections with a confidence value higher than the default_minimum_confidence attribute.

For more detailed information, including optional attribute configuration, see the mlmodel docs.

You can also test your detector or classifier with code.

Versioning for deployed models

If you upload or train a new version of a model, Viam automatically deploys the latest version of the model to the machine. If you do not want Viam to automatically deploy the latest version of the model, you can edit the "packages" array in the JSON configuration of your machine. This array is automatically created when you deploy the model and is not embedded in your service configuration.

You can get the version number of a specific model version by navigating to the models page, finding the model’s row, clicking the menu on the right side of the row, and selecting Copy package JSON. A version number looks like this: 2024-02-28T13-36-51. The model package config looks like this:

"packages": [
  {
    "package": "<model_id>/<model_name>",
    "version": "YYYY-MM-DDThh-mm-ss",
    "name": "<model_name>",
    "type": "ml_model"
  }
]
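If you generate machine configuration with a script, a small helper can build a package entry pinned to one version and reject malformed version strings. This is a sketch that assumes the version follows the YYYY-MM-DDThh-mm-ss pattern shown above. The helper name `pinned_package` and the example IDs are hypothetical.

```python
import json
from datetime import datetime
from typing import Dict

# Matches version strings such as 2024-02-28T13-36-51.
VERSION_FORMAT = "%Y-%m-%dT%H-%M-%S"


def pinned_package(model_id: str, model_name: str, version: str) -> Dict[str, str]:
    """Build a "packages" entry for one specific model version.

    Raises ValueError if the version does not match VERSION_FORMAT.
    """
    datetime.strptime(version, VERSION_FORMAT)
    return {
        "package": f"{model_id}/{model_name}",
        "version": version,
        "name": model_name,
        "type": "ml_model",
    }


# Placeholder values; replace them with the values from Copy package JSON.
entry = pinned_package("my-model-id", "my-classifier-model", "2024-02-28T13-36-51")
print(json.dumps({"packages": [entry]}, indent=2))
```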

Next steps

To work with datasets programmatically, see the data API, which includes several methods to work with datasets.

For examples of using machine learning models to make your machine interact intelligently based on what it detects, see the Viam tutorials.

Have questions, or want to meet other people working on robots? Join our Community Discord.
