Train models with any machine learning framework

You can create custom Python training scripts that train ML models to your specifications using PyTorch, TensorFlow, TFLite, ONNX, or any other machine learning framework. Once you upload a training script to the Viam Registry, you can use it to build ML models in the Viam Cloud based on your datasets.

Prerequisites

A dataset with data you can train an ML model on.

For image data, you can follow the instructions to Create a dataset and label data to create a dataset.

For other data, you can use the Data Client API from within the training script to access data stored in the Viam Cloud.

The Viam CLI.

You must have the Viam CLI installed to upload training scripts to the registry.

To download the Viam CLI on a macOS computer, install brew and run the following commands:

brew tap viamrobotics/brews
brew install viam

To download the Viam CLI on a Linux computer with the aarch64 architecture, run the following commands:

sudo curl -o /usr/local/bin/viam https://storage.googleapis.com/packages.viam.com/apps/viam-cli/viam-cli-stable-linux-arm64
sudo chmod a+rx /usr/local/bin/viam

To download the Viam CLI on a Linux computer with the amd64 (Intel x86_64) architecture, run the following commands:

sudo curl -o /usr/local/bin/viam https://storage.googleapis.com/packages.viam.com/apps/viam-cli/viam-cli-stable-linux-amd64
sudo chmod a+rx /usr/local/bin/viam

You can also install the Viam CLI using brew on Linux amd64 (Intel x86_64):

brew tap viamrobotics/brews
brew install viam

If you have Go installed, you can build the Viam CLI directly from source using the go install command:

go install go.viam.com/rdk/cli/viam@latest

To confirm viam is installed and ready to use, run the viam command from your terminal. If you see help instructions, everything is installed correctly. If you do not see help instructions, add your local go/bin directory to your PATH variable. If you use bash as your shell, you can use the following command:

echo 'export PATH="$HOME/go/bin:$PATH"' >> ~/.bashrc

For more information, see Install the Viam CLI.

Create a training script

1. Create files

Create the following folders and empty files:

my-training/
├── model/
│   ├── training.py
│   └── __init__.py
└── setup.py

2. Add setup.py code

Add the following code to setup.py, adding any additional required packages where the TODO comment indicates:

from setuptools import find_packages, setup

setup(
    name="my-training",
    version="0.1",
    packages=find_packages(),
    include_package_data=True,
    install_requires=[
        "google-cloud-aiplatform",
        "google-cloud-storage",
        # TODO: Add additional required packages
    ],
)
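To confirm the dependencies in setup.py resolve before you upload the script, you can install the package locally in editable mode (standard pip behavior, nothing Viam-specific):

pip install -e my-training/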

3. Create __init__.py

If you haven’t already, create a folder called model and create an empty file inside it called __init__.py.

4. Add training.py code

Copy this template into training.py:

import argparse
import json
import os
import typing as ty

single_label = "MODEL_TYPE_SINGLE_LABEL_CLASSIFICATION"
multi_label = "MODEL_TYPE_MULTI_LABEL_CLASSIFICATION"
labels_filename = "labels.txt"
unknown_label = "UNKNOWN"

# Set automatically when the training job runs in the Viam Cloud;
# may be unset when running locally.
API_KEY = os.environ.get("API_KEY")
API_KEY_ID = os.environ.get("API_KEY_ID")


# This parses the required args for the training script.
# The model_dir variable will contain the output directory where
# the ML model that this script creates should be stored.
# The data_json variable will contain the metadata for the dataset
# that you should use to train the model.
def parse_args():
    """Returns dataset file, model output directory, and num_epochs if present.
    These must be parsed as command line arguments and then used as the model
    input and output, respectively. The number of epochs can be used to
    optionally override the default.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset_file", dest="data_json", type=str)
    parser.add_argument("--model_output_directory", dest="model_dir", type=str)
    parser.add_argument("--num_epochs", dest="num_epochs", type=int)
    args = parser.parse_args()
    return args.data_json, args.model_dir, args.num_epochs


# Parse the dataset file (produced and stored in Viam) to get
# the label annotations.
# Used for training classification models.
def parse_filenames_and_labels_from_json(
    filename: str, all_labels: ty.List[str], model_type: str
) -> ty.Tuple[ty.List[str], ty.List[str]]:
    """Load and parse JSON file to return image filenames and corresponding
    labels. The JSON file contains lines, where each line has the keys
    "image_path" and "classification_annotations".
    Args:
        filename: JSONLines file containing filenames and labels
        all_labels: list of all N_LABELS
        model_type: string single_label or multi_label
    """
    image_filenames = []
    image_labels = []

    with open(filename, "rb") as f:
        for line in f:
            json_line = json.loads(line)
            image_filenames.append(json_line["image_path"])

            annotations = json_line["classification_annotations"]
            labels = [unknown_label]
            for annotation in annotations:
                if model_type == multi_label:
                    if annotation["annotation_label"] in all_labels:
                        labels.append(annotation["annotation_label"])
                # For single label model, we want at most one label.
                # If multiple valid labels are present, we arbitrarily select
                # the last one.
                if model_type == single_label:
                    if annotation["annotation_label"] in all_labels:
                        labels = [annotation["annotation_label"]]
            image_labels.append(labels)
    return image_filenames, image_labels


# Parse the dataset file (produced and stored in Viam) to get
# bounding box annotations
# Used for training object detection models
def parse_filenames_and_bboxes_from_json(
    filename: str,
    all_labels: ty.List[str],
) -> ty.Tuple[ty.List[str], ty.List[str], ty.List[ty.List[float]]]:
    """Load and parse JSON file to return image filenames
    and corresponding labels with bboxes.
    Args:
        filename: JSONLines file containing filenames and bboxes
        all_labels: list of all N_LABELS
    """
    image_filenames = []
    bbox_labels = []
    bbox_coords = []

    with open(filename, "rb") as f:
        for line in f:
            json_line = json.loads(line)
            image_filenames.append(json_line["image_path"])
            annotations = json_line["bounding_box_annotations"]
            labels = []
            coords = []
            for annotation in annotations:
                if annotation["annotation_label"] in all_labels:
                    labels.append(annotation["annotation_label"])
                    # Store coordinates in rel_yxyx format so that
                    # we can use the keras_cv function
                    coords.append(
                        [
                            annotation["y_min_normalized"],
                            annotation["x_min_normalized"],
                            annotation["y_max_normalized"],
                            annotation["x_max_normalized"],
                        ]
                    )
            bbox_labels.append(labels)
            bbox_coords.append(coords)
    return image_filenames, bbox_labels, bbox_coords


# Build the model
def build_and_compile_model(
    labels: ty.List[str], model_type: str, input_shape: ty.Tuple[int, int, int]
) -> "Model":  # TODO: replace with your framework's model type
    """Builds and compiles a model
    Args:
        labels: list of output labels for the model
        model_type: string single_label or multi_label
        input_shape: 3D shape of input
    """

    # TODO: Add logic to build and compile model

    return model


def save_labels(labels: ty.List[str], model_dir: str) -> None:
    """Saves a label.txt of output labels to the specified model directory.
    Args:
        labels: list of string lists, where each string list contains up to
        N_LABEL labels associated with an image
        model_dir: output directory for model artifacts
    """
    filename = os.path.join(model_dir, labels_filename)
    with open(filename, "w") as f:
        for label in labels[:-1]:
            f.write(label + "\n")
        f.write(labels[-1])


def save_model(
    model: "Model",  # TODO: replace with your framework's model type
    model_dir: str,
    model_name: str,
) -> None:
    """Save model as a TFLite model.
    Args:
        model: trained model
        model_dir: output directory for model artifacts
        model_name: name of saved model
    """
    file_type = ""  # TODO: set to your model artifact's file extension, e.g. "tflite"

    # Save the model to the output directory.
    filename = os.path.join(model_dir, f"{model_name}.{file_type}")
    with open(filename, "wb") as f:
        f.write(model)


if __name__ == "__main__":
    DATA_JSON, MODEL_DIR, NUM_EPOCHS = parse_args()

    IMG_SIZE = (256, 256)

    # Read dataset file.
    # TODO: change labels to the desired model output.
    LABELS = ["orange_triangle", "blue_star"]

    # The model type can be changed based on whether you want the model to
    # output one label per image or multiple labels per image
    model_type = multi_label
    image_filenames, image_labels = parse_filenames_and_labels_from_json(
        DATA_JSON, LABELS, model_type)

    # Build and compile model on data
    model = build_and_compile_model(LABELS, model_type, IMG_SIZE + (3,))

    # Save labels.txt file
    save_labels(LABELS + [unknown_label], MODEL_DIR)
    # Convert the model to tflite
    save_model(model, MODEL_DIR, "classification_model")

5. Understand template script parsing functionality

When a training script is run, the Viam platform passes the dataset file for the training and the designated model output directory to the script.

The template contains functionality to parse the command line inputs and parse annotations from the dataset file.

Parsing command line inputs

The script you are creating must take the following command line inputs:

  • dataset_file: a file containing the data and metadata for the training job
  • model_output_directory: the location where the produced model artifacts are saved

The parse_args() function in the template parses your arguments.

You can add additional custom command line inputs by adding them to the parse_args() function, as in the sketch below.
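For example, a minimal sketch extending the template's parse_args() with a hypothetical --custom_arg input (the argument name, type, and default are illustrative, not part of the template):

import argparse


def parse_args():
    """Returns dataset file, model output directory, num_epochs, and a
    hypothetical custom argument."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset_file", dest="data_json", type=str)
    parser.add_argument("--model_output_directory", dest="model_dir", type=str)
    parser.add_argument("--num_epochs", dest="num_epochs", type=int)
    # Hypothetical custom input; pass it with --args when submitting a job.
    parser.add_argument("--custom_arg", dest="custom_arg", type=int, default=3)
    args = parser.parse_args()
    return args.data_json, args.model_dir, args.num_epochs, args.custom_arg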

Parsing annotations from the dataset file

When you submit a training job to the Viam Cloud, Viam passes a dataset_file to the training script. The file contains metadata from the dataset used for the training, including the file path for each data point and any annotations associated with the data.

Dataset files for image datasets with bounding box labels and classification labels are in JSON Lines format, one object per line (shown pretty-printed here for readability):

{
    "image_path": "/path/to/data/data1.jpeg",
    "bounding_box_annotations": [
        {
            "annotation_label": "blue_star",
            "x_min_normalized": 0.38175675675675674,
            "x_max_normalized": 0.5101351351351351,
            "y_min_normalized": 0.35585585585585583,
            "y_max_normalized": 0.527027027027027
        }
    ],
    "classification_annotations": [
        {
            "annotation_label": "blue_star"
        }
    ]
}
{
    "image_path": "/path/to/data/data2.jpeg",
    "bounding_box_annotations": [
        {
            "annotation_label": "blue_star",
            "x_min_normalized": 0.2939189189189189,
            "x_max_normalized": 0.4594594594594595,
            "y_min_normalized": 0.25225225225225223,
            "y_max_normalized": 0.5495495495495496
        }
    ],
    "classification_annotations": [
        {
            "annotation_label": "blue_star"
        }
    ]
}

{
    "image_path": "/path/to/data/data3.jpeg",
    "bounding_box_annotations": [
        {
            "annotation_label": "blue_star",
            "x_min_normalized": 0.03557312252964427,
            "x_max_normalized": 0.2015810276679842,
            "y_min_normalized": 0.30526315789473685,
            "y_max_normalized": 0.5368421052631579
        },
        {
            "annotation_label": "blue_square",
            "x_min_normalized": 0.039525691699604744,
            "x_max_normalized": 0.2015810276679842,
            "y_min_normalized": 0.2578947368421053,
            "y_max_normalized": 0.5473684210526316
        }
    ],
    "classification_annotations": [
        {
            "annotation_label": "blue_star"
        },
        {
            "annotation_label": "blue_square"
        }
    ]
}

In your training script, you must parse the dataset file for the classification or bounding box annotations in the dataset metadata. Depending on whether you are training a classification or a detection model, use the template's parse_filenames_and_labels_from_json() or parse_filenames_and_bboxes_from_json() function, respectively.
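For instance, given the third example object above and all_labels = ["blue_star", "blue_square"], parse_filenames_and_bboxes_from_json() would return the bounding boxes re-ordered into rel_yxyx form:

# Illustrative parsed output for the third example object:
image_filenames = ["/path/to/data/data3.jpeg"]
bbox_labels = [["blue_star", "blue_square"]]
bbox_coords = [[
    # [y_min, x_min, y_max, x_max], normalized (rel_yxyx)
    [0.30526315789473685, 0.03557312252964427,
     0.5368421052631579, 0.2015810276679842],
    [0.2578947368421053, 0.039525691699604744,
     0.5473684210526316, 0.2015810276679842],
]]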

If the script you are creating does not use an image dataset, you only need the model output directory.

6. Add logic to produce the model artifact

You must fill in the build_and_compile_model function. In this part of the script, you use the data and annotations from the dataset file to build a machine learning model.

As an example, you can refer to the logic from model/training.py from this example classification training script that trains a classification model using TensorFlow and Keras.
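For illustration, here is a minimal sketch of build_and_compile_model assuming TensorFlow/Keras and the template's module-level single_label constant; the network architecture is arbitrary and should be replaced with one suited to your data:

import typing as ty

import tensorflow as tf
from tensorflow.keras import layers

# As in the template.
single_label = "MODEL_TYPE_SINGLE_LABEL_CLASSIFICATION"


def build_and_compile_model(
    labels: ty.List[str], model_type: str, input_shape: ty.Tuple[int, int, int]
) -> tf.keras.Model:
    """Builds and compiles a small image classifier."""
    # Single-label: softmax over classes with categorical cross-entropy;
    # multi-label: independent sigmoids with binary cross-entropy.
    if model_type == single_label:
        activation, loss = "softmax", "categorical_crossentropy"
    else:
        activation, loss = "sigmoid", "binary_crossentropy"
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Rescaling(1.0 / 255),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(len(labels), activation=activation),
    ])
    model.compile(optimizer="adam", loss=loss, metrics=["accuracy"])
    return model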

7. Save the model artifact

The save_model() and save_labels() functions in the template, defined before the main logic, save the model artifact your training job produces to the model_output_directory in the cloud.

Once a training job is complete, Viam checks the output directory and creates a package with all of the contents of the directory, creating or updating a registry item for the ML model.

You must fill in these functions.

As an example, you can refer to the logic from model/training.py from this example classification training script that trains a classification model using TensorFlow and Keras.
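For illustration, a minimal sketch of save_model assuming a TensorFlow/Keras model converted to TFLite (matching the template's docstring); adapt the conversion to your framework:

import os

import tensorflow as tf


def save_model(model: tf.keras.Model, model_dir: str, model_name: str) -> None:
    """Converts a trained Keras model to TFLite and writes it to model_dir."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()  # flatbuffer bytes
    filename = os.path.join(model_dir, f"{model_name}.tflite")
    with open(filename, "wb") as f:
        f.write(tflite_model)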

8. Update the main method

Update the main method to call the functions you have created, passing each the arguments it expects.

9. Use Viam APIs in a training script

If you need to access any of the Viam APIs within a custom training script, you can use the environment variables API_KEY and API_KEY_ID to establish a connection. These environment variables will be available to training scripts.

import os

from viam.app.viam_client import ViamClient
from viam.rpc.dial import DialOptions


async def connect() -> ViamClient:
    """Returns an authenticated connection to the ViamClient for the requested
    org associated with the submitted training job."""
    # The API key and key ID can be accessed programmatically, using the
    # environment variable API_KEY and API_KEY_ID. The user does not need to
    # supply the API keys, they are provided automatically when the training
    # job is submitted.
    dial_options = DialOptions.with_api_key(
        os.environ.get("API_KEY"), os.environ.get("API_KEY_ID")
    )
    return await ViamClient.create_from_dial_options(dial_options)
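Building on connect() above, a sketch of how you might use the connection to reach the Data Client API from the training script's main logic (the method you call on data_client depends on your data; see the Data Client API documentation):

import asyncio


async def main() -> None:
    viam_client = await connect()
    data_client = viam_client.data_client
    # Use data_client here to fetch data stored in the Viam Cloud
    # for training.
    viam_client.close()


if __name__ == "__main__":
    asyncio.run(main())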

Test your training script locally

You can export one of your Viam datasets to test your training script locally.

1. Export your dataset

You can get the dataset ID from the dataset's page in the Viam app or by using the viam dataset list command:

viam dataset export --destination=<destination> --dataset-id=<dataset-id> --include-jsonl=true

The exported dataset is formatted like the one Viam produces for training. Use the parse_filenames_and_labels_from_json and parse_filenames_and_bboxes_from_json functions to get the images and annotations from your dataset file, as in the sketch below.
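As a quick sanity check, you can call the parsing functions directly on the exported file from the project root (the path and labels here are illustrative):

from model.training import multi_label, parse_filenames_and_labels_from_json

filenames, labels = parse_filenames_and_labels_from_json(
    "/path/to/dataset.jsonl", ["orange_triangle", "blue_star"], multi_label
)
print(f"parsed {len(filenames)} images; first labels: {labels[:3]}")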

2. Run your training script locally

Install any required dependencies and run your training script, specifying the path to the dataset.jsonl file from your exported dataset and any custom arguments you defined:

python3 -m model.training --dataset_file=/path/to/dataset.jsonl \
    --model_output_directory=. --custom_arg=3

Upload your training script

To use your training script in the Viam platform, you must upload it to the Viam Registry.

1. Package the training script as a tar.gz source distribution

Before you can upload your training script to Viam, you have to compress your project folder into a tar.gz file:

tar -czvf my-training.tar.gz my-training/

2. Upload a training script

To upload your custom training script to the registry, use the viam training-script upload command:

viam training-script upload --path=<path-to-tar.gz> \
  --org-id=<org-id> --script-name=<training-script-name>

For example:

viam training-script upload --path=my-training.tar.gz \
  --org-id=<ORG_ID> --script-name=my-training-script

You can also specify the version, framework, type, visibility, and description when uploading a custom training script:

viam training-script upload --path=my-training.tar.gz \
  --org-id=<ORG_ID> --script-name=my-training \
  --framework=tensorflow --type=single_label_classification \
  --description="Custom image classification model" \
  --visibility=private

To find your organization’s ID, run the following command:

viam organization list

After a successful upload, the CLI displays a confirmation message with a link to view your changes online. You can view uploaded training scripts by navigating to the registry’s Training Scripts page.

Submit a training job

After uploading the training script, you can run it by submitting a training job through the Viam app or using the Viam CLI or ML Training client API.

1. Create the training job


In the Viam app, navigate to your list of DATASETS and select the one you want to train a model on.

Click Train model and select Train on a custom training script, then follow the prompts.

Alternatively, you can use viam train submit custom from-registry to submit a training job from the CLI.

For example:

viam train submit custom from-registry --dataset-id=<INSERT DATASET ID> \
  --org-id=<INSERT ORG ID> --model-name=MyRegistryModel \
  --model-version=2 --version=1 \
  --script-name=mycompany:MyCustomTrainingScript \
  --args=custom_arg1=3,custom_arg2="'green_square blue_star'"

This command submits a training job that runs the previously uploaded MyCustomTrainingScript on the specified dataset, trains MyRegistryModel, and publishes the resulting model to the registry.

You can get the dataset ID from the dataset page or by using the viam dataset list command.

2. Check on training job progress

You can view your training job on the DATA page’s TRAINING tab.

Once the model has finished training, it becomes visible on the DATA page’s MODELS tab.

You will receive an email when your training job completes.

You can also check your training jobs and their status from the CLI:

viam train list --org-id=<INSERT ORG ID> --job-status=unspecified

3. Debug your training job

From the DATA page’s TRAINING tab, click on your training job’s ID to see its logs.

You can also view your training jobs’ logs with the viam train logs command.

Next steps

To use your new model with machines, you must deploy it with the ML model service. Then you can use another service, such as the vision service, to apply the deployed model to camera feeds.
