Create custom training scripts

You can create a custom Python training script that trains ML models to your specifications using the machine learning framework of your choice (PyTorch, TensorFlow, TFLite, ONNX, or any other framework). Once you have added the training script to the Viam Registry, you can use it to build ML models based on your datasets.

Prerequisites

  • A dataset with data you can train an ML model on. To create one, follow the instructions in Create a dataset and label data.
  • The Viam CLI, installed as described below.

You must have the Viam CLI installed to upload training scripts to the registry.

To download the Viam CLI on a macOS computer, run the following commands:

brew tap viamrobotics/brews
brew install viam

To download the Viam CLI on a Linux computer with the aarch64 architecture, run the following commands:

sudo curl -o /usr/local/bin/viam https://storage.googleapis.com/packages.viam.com/apps/viam-cli/viam-cli-stable-linux-arm64
sudo chmod a+rx /usr/local/bin/viam

To download the Viam CLI on a Linux computer with the amd64 (Intel x86_64) architecture, run the following commands:

sudo curl -o /usr/local/bin/viam https://storage.googleapis.com/packages.viam.com/apps/viam-cli/viam-cli-stable-linux-amd64
sudo chmod a+rx /usr/local/bin/viam

You can also install the Viam CLI using brew on Linux amd64 (Intel x86_64):

brew tap viamrobotics/brews
brew install viam

If you have Go installed, you can build the Viam CLI directly from source using the go install command:

go install go.viam.com/rdk/cli/viam@latest

To confirm viam is installed and ready to use, issue the viam command from your terminal. If you see help instructions, everything is correctly installed. If you do not see help instructions, add your local go/bin/* directory to your PATH variable. If you use bash as your shell, you can use the following command:

echo 'export PATH="$HOME/go/bin:$PATH"' >> ~/.bashrc
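
If you would rather check the installation from Python, for example at the start of a helper script, the following minimal sketch looks for the viam executable on your PATH. The function name cli_on_path is hypothetical and is not part of any Viam tooling.

```python
import shutil


def cli_on_path(executable: str = "viam") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None


if cli_on_path():
    print("viam CLI found")
else:
    print("viam CLI not found; check your PATH")
```

This only confirms that an executable named viam is reachable. It does not check the CLI version or whether you are logged in.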

For more information, see Install the Viam CLI.

Create a training script

1. Create folder structure

Create a folder for the training script, for example my-training.

2. Create setup.py

Inside the top-level folder (in this example my-training), create a file called setup.py with the following contents:

from setuptools import find_packages, setup

setup(
    name="my-training",
    version="0.1",
    packages=find_packages(),
    include_package_data=True,
    install_requires=[
        "google-cloud-aiplatform",
        "google-cloud-storage",
        # TODO: Add additional required packages
    ],
)

Add any additional packages your script requires to install_requires, in place of the TODO comment on line 11.

3. Create __init__.py

Inside the top-level folder (in this example my-training), create a folder called model and create an empty file called __init__.py inside it.
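
If you prefer to create this layout programmatically, the following minimal sketch creates the project folder, the model folder, and the empty __init__.py file in one call. The function name scaffold is hypothetical, and the sketch does not write setup.py, which you still create as described in step 2.

```python
from pathlib import Path


def scaffold(root: Path) -> Path:
    """Create <root>/model/__init__.py as an empty file and return its path."""
    model_dir = root / "model"
    # parents=True also creates the top-level project folder if needed.
    model_dir.mkdir(parents=True, exist_ok=True)
    init_file = model_dir / "__init__.py"
    init_file.touch(exist_ok=True)
    return init_file
```

For example, scaffold(Path("my-training")) produces my-training/model/__init__.py relative to the current working directory.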

4. Create training.py

Inside the model folder, create a file called training.py.

Copy this template into training.py:

import argparse
import json
import os
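import typing as ty

# Placeholder type for the model object used in the type hints below.
# TODO: Replace with your framework's model class.
Model = ty.Any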

single_label = "MODEL_TYPE_SINGLE_LABEL_CLASSIFICATION"
multi_label = "MODEL_TYPE_MULTI_LABEL_CLASSIFICATION"
labels_filename = "labels.txt"
unknown_label = "UNKNOWN"

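# Viam provides these variables when the script runs as a training job.
# Using .get() avoids a KeyError when you run the script locally without them.
API_KEY = os.environ.get("API_KEY")
API_KEY_ID = os.environ.get("API_KEY_ID")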


# This parses the required args for the training script.
# The model_dir variable will contain the output directory where
# the ML model that this script creates should be stored.
# The data_json variable will contain the metadata for the dataset
# that you should use to train the model.
def parse_args():
    """Returns dataset file, model output directory, and num_epochs if present.
    These must be parsed as command line arguments and then used as the model
    input and output, respectively. The number of epochs can be used to
    optionally override the default.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset_file", dest="data_json", type=str)
    parser.add_argument("--model_output_directory", dest="model_dir", type=str)
    parser.add_argument("--num_epochs", dest="num_epochs", type=int)
    args = parser.parse_args()
    return args.data_json, args.model_dir, args.num_epochs


# Parse the dataset file (produced and stored in Viam) to get
# the label annotations
# Used for training classification models
def parse_filenames_and_labels_from_json(
    filename: str, all_labels: ty.List[str], model_type: str
) -> ty.Tuple[ty.List[str], ty.List[str]]:
    """Load and parse JSON file to return image filenames and corresponding
    labels. The JSON file contains lines, where each line has the key
    "image_path" and "classification_annotations".
    Args:
        filename: JSONLines file containing filenames and labels
        all_labels: list of all N_LABELS
        model_type: string single_label or multi_label
    """
    image_filenames = []
    image_labels = []

    with open(filename, "rb") as f:
        for line in f:
            json_line = json.loads(line)
            image_filenames.append(json_line["image_path"])

            annotations = json_line["classification_annotations"]
            labels = [unknown_label]
            for annotation in annotations:
                if model_type == multi_label:
                    if annotation["annotation_label"] in all_labels:
                        labels.append(annotation["annotation_label"])
                # For single label model, we want at most one label.
                # If multiple valid labels are present, we arbitrarily select
                # the last one.
                if model_type == single_label:
                    if annotation["annotation_label"] in all_labels:
                        labels = [annotation["annotation_label"]]
            image_labels.append(labels)
    return image_filenames, image_labels


# Parse the dataset file (produced and stored in Viam) to get
# bounding box annotations
# Used for training object detection models
def parse_filenames_and_bboxes_from_json(
    filename: str,
    all_labels: ty.List[str],
) -> ty.Tuple[ty.List[str], ty.List[str], ty.List[ty.List[float]]]:
    """Load and parse JSON file to return image filenames
    and corresponding labels with bboxes.
    Args:
        filename: JSONLines file containing filenames and bboxes
        all_labels: list of all N_LABELS
    """
    image_filenames = []
    bbox_labels = []
    bbox_coords = []

    with open(filename, "rb") as f:
        for line in f:
            json_line = json.loads(line)
            image_filenames.append(json_line["image_path"])
            annotations = json_line["bounding_box_annotations"]
            labels = []
            coords = []
            for annotation in annotations:
                if annotation["annotation_label"] in all_labels:
                    labels.append(annotation["annotation_label"])
                    # Store coordinates in rel_yxyx format so that
                    # we can use the keras_cv function
                    coords.append(
                        [
                            annotation["y_min_normalized"],
                            annotation["x_min_normalized"],
                            annotation["y_max_normalized"],
                            annotation["x_max_normalized"],
                        ]
                    )
            bbox_labels.append(labels)
            bbox_coords.append(coords)
    return image_filenames, bbox_labels, bbox_coords


# Build the model
def build_and_compile_model(
    labels: ty.List[str], model_type: str, input_shape: ty.Tuple[int, int, int]
) -> Model:
    """Builds and compiles a model
    Args:
        labels: list of string lists, where each string list contains up to
        N_LABEL labels associated with an image
        model_type: string single_label or multi_label
        input_shape: 3D shape of input
    """

    # TODO: Add logic to build and compile model

    return model


def save_labels(labels: ty.List[str], model_dir: str) -> None:
    """Saves a labels.txt of output labels to the specified model directory.
    Args:
        labels: list of string lists, where each string list contains up to
        N_LABEL labels associated with an image
        model_dir: output directory for model artifacts
    """
    filename = os.path.join(model_dir, labels_filename)
    with open(filename, "w") as f:
        for label in labels[:-1]:
            f.write(label + "\n")
        f.write(labels[-1])


def save_model(
    model: Model,
    model_dir: str,
    model_name: str,
) -> None:
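    """Save the trained model to the output directory.
    Args:
        model: trained model
        model_dir: output directory for model artifacts
        model_name: name of saved model
    """
    # TODO: Set the file extension for your framework, for example "tflite".
    file_type = ""

    # Save the model to the output directory.
    # TODO: Serialize the model with your framework first; f.write() expects
    # a bytes-like object.
    filename = os.path.join(model_dir, f"{model_name}.{file_type}")
    with open(filename, "wb") as f:
        f.write(model)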


if __name__ == "__main__":
    # parse_args() returns three values; NUM_EPOCHS is None if not supplied.
    DATA_JSON, MODEL_DIR, NUM_EPOCHS = parse_args()

    IMG_SIZE = (256, 256)

    # Read dataset file.
    # TODO: change labels to the desired model output.
    LABELS = ["orange_triangle", "blue_star"]

    # The model type can be changed based on whether you want the model to
    # output one label per image or multiple labels per image
    model_type = multi_label
    image_filenames, image_labels = parse_filenames_and_labels_from_json(
        DATA_JSON, LABELS, model_type)

    # Build and compile model on data
    model = build_and_compile_model(image_labels, model_type, IMG_SIZE + (3,))

    # Save labels.txt file
    save_labels(LABELS + [unknown_label], MODEL_DIR)
    # Save the trained model artifact
    save_model(model, MODEL_DIR, "classification_model")

5. Template script parsing functionality explained

You do not need to edit the script's parsing functionality, but if you want to understand the script fully, read the following sections:

Parsing command line inputs

The script you are creating must take the following command line inputs:

  • dataset_file: a file containing the data and metadata for the training job
  • model_output_directory: the location where the produced model artifacts are saved to

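The parse_args() function in the template parses these arguments. It also accepts an optional num_epochs argument that you can use to override the default number of training epochs.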
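
To illustrate which values reach the script, the following sketch is a variant of parse_args() that accepts an explicit argument list, so you can call it without a real command line. Marking the two inputs as required and falling back to 10 epochs are assumptions made for this example and are not part of the template.

```python
import argparse
import typing as ty


def parse_args(argv: ty.Optional[ty.List[str]] = None):
    """Parse the training arguments from argv (defaults to sys.argv[1:])."""
    parser = argparse.ArgumentParser()
    # required=True is an assumption of this sketch, not of the template.
    parser.add_argument("--dataset_file", dest="data_json", type=str,
                        required=True)
    parser.add_argument("--model_output_directory", dest="model_dir",
                        type=str, required=True)
    parser.add_argument("--num_epochs", dest="num_epochs", type=int)
    args = parser.parse_args(argv)
    return args.data_json, args.model_dir, args.num_epochs


data_json, model_dir, num_epochs = parse_args(
    ["--dataset_file=/path/to/dataset.jsonl", "--model_output_directory=."]
)
# num_epochs is None when the flag is omitted, so fall back to a default.
# The value 10 is an arbitrary placeholder.
epochs = num_epochs if num_epochs is not None else 10
```

Because num_epochs has no default in the parser, your main logic decides what happens when the flag is absent.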

Parsing annotations from dataset file

When you submit a training job to the Viam Cloud, Viam passes a dataset_file to the training script. The file contains metadata from the dataset used for the training, including the file path for each data point and any annotations associated with the data.

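Dataset files for image datasets with bounding box labels and classification labels are in JSON Lines format: each line of the file is one JSON object that describes one image. The objects are shown here expanded over several lines for readability: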

{
    "image_path": "/path/to/data/data1.jpeg",
    "bounding_box_annotations": [
        {
            "annotation_label": "blue_star",
            "x_min_normalized": 0.38175675675675674,
            "x_max_normalized": 0.5101351351351351,
            "y_min_normalized": 0.35585585585585583,
            "y_max_normalized": 0.527027027027027
        }
    ],
    "classification_annotations": [
        {
            "annotation_label": "blue_star"
        }
    ]
}
{
    "image_path": "/path/to/data/data2.jpeg",
    "bounding_box_annotations": [
        {
            "annotation_label": "blue_star",
            "x_min_normalized": 0.2939189189189189,
            "x_max_normalized": 0.4594594594594595,
            "y_min_normalized": 0.25225225225225223,
            "y_max_normalized": 0.5495495495495496
        }
    ],
    "classification_annotations": [
        {
            "annotation_label": "blue_star"
        }
    ]
}

{
    "image_path": "/path/to/data/data3.jpeg",
    "bounding_box_annotations": [
        {
            "annotation_label": "blue_star",
            "x_min_normalized": 0.03557312252964427,
            "x_max_normalized": 0.2015810276679842,
            "y_min_normalized": 0.30526315789473685,
            "y_max_normalized": 0.5368421052631579
        },
        {
            "annotation_label": "blue_square",
            "x_min_normalized": 0.039525691699604744,
            "x_max_normalized": 0.2015810276679842,
            "y_min_normalized": 0.2578947368421053,
            "y_max_normalized": 0.5473684210526316
        }
    ],
    "classification_annotations": [
        {
            "annotation_label": "blue_star"
        },
        {
            "annotation_label": "blue_square"
        }
    ]
}

In your training script, you must parse the dataset file for the classification or bounding box annotations from the dataset metadata. The template script provides parse_filenames_and_labels_from_json() for training classification models and parse_filenames_and_bboxes_from_json() for training detection models.
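
As a small illustration of working with one parsed line, the following sketch converts a bounding box annotation into pixel coordinates. It assumes that the normalized values are fractions of the image width and height, and the helper name bbox_to_pixels and the 640x480 image size are hypothetical.

```python
import json
import typing as ty


def bbox_to_pixels(
    annotation: ty.Dict[str, ty.Any], width: int, height: int
) -> ty.Tuple[int, int, int, int]:
    """Convert one normalized bounding box to (x_min, y_min, x_max, y_max)
    pixel coordinates for an image of the given size."""
    return (
        round(annotation["x_min_normalized"] * width),
        round(annotation["y_min_normalized"] * height),
        round(annotation["x_max_normalized"] * width),
        round(annotation["y_max_normalized"] * height),
    )


# One line of a dataset file: a single JSON object.
line = json.dumps({
    "image_path": "/path/to/data/data1.jpeg",
    "bounding_box_annotations": [{
        "annotation_label": "blue_star",
        "x_min_normalized": 0.25,
        "x_max_normalized": 0.5,
        "y_min_normalized": 0.5,
        "y_max_normalized": 0.75,
    }],
})

record = json.loads(line)
for annotation in record["bounding_box_annotations"]:
    print(annotation["annotation_label"],
          bbox_to_pixels(annotation, width=640, height=480))
```

Whether you need pixel or normalized coordinates depends on the framework you train with, so treat this conversion as optional.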

6. Add logic to produce the model artifact

You must fill in the build_and_compile_model function. In this part of the script, you use the data from the dataset and the annotations from the dataset file to build a Machine Learning model.
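
Most frameworks expect numeric targets rather than label strings, so one step you will likely need before or inside build_and_compile_model is encoding the parsed labels. The following framework-independent sketch builds multi-hot vectors from the label lists that parse_filenames_and_labels_from_json() returns. The helper name encode_labels is hypothetical, and your framework may offer its own utility for this.

```python
import typing as ty


def encode_labels(
    image_labels: ty.List[ty.List[str]], all_labels: ty.List[str]
) -> ty.List[ty.List[int]]:
    """Turn each image's label list into a multi-hot vector whose positions
    follow the order of all_labels. Labels not in all_labels are ignored."""
    index = {label: i for i, label in enumerate(all_labels)}
    encoded = []
    for labels in image_labels:
        vector = [0] * len(all_labels)
        for label in labels:
            if label in index:
                vector[index[label]] = 1
        encoded.append(vector)
    return encoded


all_labels = ["orange_triangle", "blue_star", "UNKNOWN"]
# Shaped like the template's multi-label output, which starts every
# image's label list with the unknown label.
image_labels = [["UNKNOWN", "blue_star"], ["UNKNOWN"]]
targets = encode_labels(image_labels, all_labels)
```

The order of all_labels here should match the order you later write to labels.txt, so that output indices map back to the right label names.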

As an example, you can refer to the logic from model/training.py from this example classification training script that trains a classification model using TensorFlow and Keras.

7. Save the model artifact

The save_model() and save_labels() functions, defined before the main logic in the template, save the model artifact and its labels file to the model_output_directory in the cloud.

Once a training job is complete, Viam checks the output directory and creates a package with all of the contents of the directory, creating or updating a registry item for the ML model.

You must fill in these functions.
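
As a starting point, the following sketch shows one possible shape for the saving step. It assumes that you have already serialized your model to bytes with your framework's own export or conversion tool. The function name save_model_bytes and its file_type parameter are hypothetical and differ from the template's save_model() signature.

```python
import os


def save_model_bytes(
    model_bytes: bytes, model_dir: str, model_name: str, file_type: str
) -> str:
    """Write serialized model bytes to <model_dir>/<model_name>.<file_type>
    and return the path of the written file."""
    # Create the output directory if it does not exist yet.
    os.makedirs(model_dir, exist_ok=True)
    filename = os.path.join(model_dir, f"{model_name}.{file_type}")
    with open(filename, "wb") as f:
        f.write(model_bytes)
    return filename
```

For example, save_model_bytes(data, MODEL_DIR, "classification_model", "tflite") writes classification_model.tflite into the output directory, where data is a placeholder for the bytes your framework produced.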

As an example, you can refer to the logic from model/training.py from this example classification training script that trains a classification model using TensorFlow and Keras.

8. Update the main method

Update the main block so that it calls the functions you have just created.

9. Using Viam APIs in a training script

If you need to access any of the Viam APIs within a custom training script, you can use the environment variables API_KEY and API_KEY_ID to establish a connection. These environment variables will be available to training scripts.

import os

from viam.app.viam_client import ViamClient
from viam.rpc.dial import DialOptions


async def connect() -> ViamClient:
    """Returns an authenticated connection to the ViamClient for the requested
    org associated with the submitted training job."""
    # The API key and key ID can be accessed programmatically, using the
    # environment variable API_KEY and API_KEY_ID. The user does not need to
    # supply the API keys, they are provided automatically when the training
    # job is submitted.
    dial_options = DialOptions.with_api_key(
        os.environ.get("API_KEY"), os.environ.get("API_KEY_ID")
    )
    return await ViamClient.create_from_dial_options(dial_options)
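
When you run the script outside of a training job, these variables may not be set. The following sketch reads both values and raises a descriptive error if either one is missing, which can be easier to debug than a failed connection. The helper name get_api_credentials is hypothetical and is not part of the Viam SDK.

```python
import os
import typing as ty


def get_api_credentials() -> ty.Tuple[str, str]:
    """Return (API_KEY, API_KEY_ID) from the environment, or raise a
    descriptive error if either variable is missing or empty."""
    missing = [
        name for name in ("API_KEY", "API_KEY_ID")
        if not os.environ.get(name)
    ]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return os.environ["API_KEY"], os.environ["API_KEY_ID"]
```

You could call this helper before creating the dial options and pass the two returned values on in the same order.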

Test your training script locally

You can export one of your Viam datasets to test your training script locally.

1. Export your dataset

You can get the dataset ID from the dataset page or by using the viam dataset list command. Then export the dataset:

viam dataset export --destination=<destination> --dataset-id=<dataset-id> --include-jsonl=true

The dataset will be formatted like the one Viam produces for the training. Use the parse_filenames_and_labels_from_json and parse_filenames_and_bboxes_from_json functions to get the images and annotations from your dataset file.
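
Before you start a local training run, it can help to check that the exported file is readable. The following sketch reports lines that are not valid JSON, lines without an image_path, and image paths that do not exist. It assumes that image_path values resolve from your current working directory, and the function name check_dataset_file is hypothetical.

```python
import json
import os
import typing as ty


def check_dataset_file(filename: str) -> ty.List[str]:
    """Return a list of problems found in a JSON Lines dataset file."""
    problems = []
    with open(filename, "r") as f:
        for number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"line {number}: not valid JSON")
                continue
            if not isinstance(record, dict) or "image_path" not in record:
                problems.append(f"line {number}: no image_path")
            elif not os.path.exists(record["image_path"]):
                problems.append(
                    f"line {number}: missing {record['image_path']}"
                )
    return problems
```

An empty list means that every non-blank line parsed and pointed to an existing file.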

2. Run your training script locally

Install any required dependencies and run your training script specifying the path to the <dataset.jsonl> file from your exported dataset:

python3 -m model.training --dataset_file=/path/to/dataset.jsonl --model_output_directory=.

Upload your training script

To be able to use your training script in the Viam platform, you must upload it to the Viam Registry.

1. Package the training script as a tar.gz source distribution

To run your training script on datasets in Viam, compress your project folder into a tar.gz file. You can run this command to create a .tar.gz archive from your project folder:

tar -czvf my-training.tar.gz my-training/
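
If the tar command is not available on your system, Python's standard tarfile module can produce an equivalent archive. The following is a minimal sketch, and the function name package_script is hypothetical.

```python
import os
import tarfile


def package_script(project_dir: str, archive_path: str) -> None:
    """Create a gzip-compressed tar archive containing project_dir."""
    # Use the folder's own name as the top-level entry in the archive,
    # as tar does when run from the parent directory.
    top_level = os.path.basename(os.path.normpath(project_dir))
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(project_dir, arcname=top_level)
```

For example, package_script("my-training", "my-training.tar.gz") creates the archive in the current working directory.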

2. Upload a new training script (or a new version)

To upload a custom training script to the registry, use the viam training-script upload command.

For example:

viam training-script upload --path=<path-to-tar.gz> \
  --org-id=<org-id> --script-name=<training-script-name>

viam training-script upload --path=my-training.tar.gz \
  --org-id=<ORG_ID> --script-name=my-training-script

viam training-script upload --path=my-training.tar.gz \
  --org-id=<ORG_ID> --script-name=my-training \
  --framework=tensorflow --type=single_label_classification \
  --description="Custom image classification model" \
  --visibility=private

You can also specify the version, framework, type, visibility, and description when uploading a custom training script.

To find your organization’s ID, run the following command:

viam organization list

After a successful upload, the CLI displays a confirmation message with a link to view your changes online. Once uploaded, you can view the script by navigating to the registry's Training Scripts page.

Submit a training job

After uploading the training script, you can run it by submitting a training job through the Viam app or using the Viam CLI or ML Training client API.

1. Create the training job


In the Viam app, navigate to your list of DATASETS and select the one you want to train on.

Click Train model and select Train on a custom training script, then follow the prompts.

In the CLI, you can use viam train submit custom from-registry to submit a training job from a training script already uploaded to the registry, or viam train submit custom with-upload to upload a training script and submit a training job at the same time.

For example:

viam train submit custom from-registry --dataset-id=<INSERT DATASET ID> \
  --org-id=<INSERT ORG ID> --model-name=MyRegistryModel \
  --model-version=2 --version=1 \
  --script-name=mycompany:MyCustomTrainingScript

This command submits a training job to the previously uploaded MyCustomTrainingScript with another input dataset, which trains MyRegistryModel and publishes that to the registry.

viam train submit custom with-upload --dataset-id=<INSERT DATASET ID> \
  --model-org-id=<INSERT ORG ID> --model-name=MyRegistryModel \
  --model-type=single_label_classification --model-version=2 \
  --version=1 --path=<path-to-tar.gz> \
  --script-name=mycompany:MyCustomTrainingScript

This command uploads a script called MyCustomTrainingScript to the registry under the specified organization and also submits a training job to that script with the input dataset, which generates a new version of the single-label classification ML model MyRegistryModel and publishes that to the registry.

To find the dataset ID of a given dataset, go to the DATASETS subtab of the DATA tab on the Viam app and select a dataset. Click in the left-hand menu and click Copy dataset ID.

To find your organization’s ID, navigate to your organization’s Settings page in the Viam app. Find Organization ID and click the copy icon.

2. Check on training job progress

Once submitted, you can view your training job on the DATA page’s MODELS tab.

You will receive an email when your training job completes.

You can also list your training jobs and their status from the CLI:

viam train list --org-id=<INSERT ORG ID> --job-status=unspecified

3. Debug your training job

If your training job failed, you can check your job's logs with the CLI:

viam train logs --job-id=<JOB ID>

You can obtain the job's ID by listing the jobs as in step 2.
