Train models with any machine learning framework
You can create custom Python training scripts that train ML models to your specifications using PyTorch, TensorFlow, TFLite, ONNX, or any other machine learning framework. Once you upload a training script to the Viam Registry, you can use it to build ML models in the Viam Cloud based on your datasets.
In this page
- Create a training script from a template.
- Test your training script locally with a downloaded dataset.
- Upload your training script.
- Submit a training job that uses the training script on a dataset to train a new ML model.
Prerequisites
Create a training script
1. Create files
Create the following folders and empty files:
my-training/
├── model/
| ├── training.py
| └── __init__.py
└── setup.py
2. Add setup.py code
Add the following code to setup.py and add additional required packages on line 11:
from setuptools import find_packages, setup

setup(
    name="my-training",
    version="0.1",
    packages=find_packages(),
    include_package_data=True,
    install_requires=[
        "google-cloud-aiplatform",
        "google-cloud-storage",
        # TODO: Add additional required packages
    ],
)
3. Create __init__.py
If you haven’t already, create a folder called model and create an empty file inside it called __init__.py.
4. Add training.py code
Copy the training script template into training.py.
5. Understand template script parsing functionality
When a training script is run, the Viam platform passes the dataset file for the training and the designated model output directory to the script.
The template contains functionality to parse the command line inputs and parse annotations from the dataset file.
If the script you are creating does not use an image dataset, you only need the model output directory.
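As a reference, the parsing logic in the template looks roughly like the following sketch. The --dataset_file and --model_output_directory flags match the command you run later in this guide; the helper signature and the JSON field names are illustrative and may differ slightly from the template you copied, so check them against your exported dataset.
import argparse
import json
from typing import List, Tuple


def parse_args() -> argparse.Namespace:
    """Parse the flags the Viam platform passes to every training script."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset_file", dest="data_json", type=str, required=True)
    parser.add_argument("--model_output_directory", dest="model_dir", type=str, required=True)
    # Any custom flags you define can be supplied with --args at submission time.
    parser.add_argument("--custom_arg", type=int, default=0)
    return parser.parse_args()


def parse_filenames_and_labels_from_json(filename: str) -> Tuple[List[str], List[List[str]]]:
    """Read image paths and classification labels from a JSON Lines dataset file."""
    image_filenames, image_labels = [], []
    with open(filename, "rb") as f:
        for line in f:
            json_line = json.loads(line)
            image_filenames.append(json_line["image_path"])
            annotations = json_line["classification_annotations"]
            image_labels.append([a["annotation_label"] for a in annotations])
    return image_filenames, image_labels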
6. Add logic to produce the model artifact
You must fill in the build_and_compile_model function.
In this part of the script, you use the data and annotations from the dataset file to build a machine learning model.
As an example, you can refer to the logic in the example classification training script, or to the sketch below.
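A minimal sketch of build_and_compile_model for single-label image classification, assuming TensorFlow/Keras as the framework. The framework, architecture, and function signature are illustrative choices; use whatever your script needs and add the framework to install_requires in setup.py.
import tensorflow as tf


def build_and_compile_model(num_classes: int, input_shape=(256, 256, 3)) -> tf.keras.Model:
    """Build and compile a small convolutional image classifier."""
    model = tf.keras.Sequential([
        tf.keras.layers.Rescaling(1.0 / 255, input_shape=input_shape),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model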
7. Save the model artifact
The save_model() and save_labels() functions in the template, defined before the main logic, save the model artifact your training job produces to the model_output_directory in the cloud.
Once a training job is complete, Viam checks the output directory, packages all of its contents, and creates or updates a registry item for the ML model.
You must fill in these functions.
As an example, you can refer to the logic in the example classification training script, or to the sketch below.
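A minimal sketch of these functions, assuming the Keras model from the previous sketch is converted to TFLite before being written out. The output filenames and the conversion step are illustrative; save your artifact in whatever format your deployment target expects.
import os
from typing import List

import tensorflow as tf


def save_model(model: tf.keras.Model, model_dir: str, model_name: str) -> None:
    """Convert the trained model to TFLite and write it to the output directory."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()
    os.makedirs(model_dir, exist_ok=True)
    with open(os.path.join(model_dir, f"{model_name}.tflite"), "wb") as f:
        f.write(tflite_model)


def save_labels(labels: List[str], model_dir: str) -> None:
    """Write one label per line so the deployed model can map outputs to label names."""
    with open(os.path.join(model_dir, "labels.txt"), "w") as f:
        f.write("\n".join(labels))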
8. Update the main method
Update the main method to call the functions you have just created. A sketch of how the pieces fit together follows.
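This is a minimal sketch of the main logic, wiring together the illustrative helpers from the previous sketches; exact names and training details depend on your template and framework.
if __name__ == "__main__":
    args = parse_args()

    # Load image paths and labels from the dataset file Viam passes in.
    image_filenames, image_labels = parse_filenames_and_labels_from_json(args.data_json)
    all_labels = sorted({label for labels in image_labels for label in labels})

    # Build and train the model, then save the artifacts Viam will package.
    model = build_and_compile_model(num_classes=len(all_labels))
    # TODO: build a training dataset from image_filenames/image_labels and call model.fit()
    save_model(model, args.model_dir, "my-model")
    save_labels(all_labels, args.model_dir)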
9. Using Viam APIs in a training script
If you need to access any of the Viam APIs within a custom training script, you can use the environment variables API_KEY
and API_KEY_ID
to establish a connection.
These environment variables will be available to training scripts.
import os

from viam.app.viam_client import ViamClient
from viam.rpc.dial import DialOptions


async def connect() -> ViamClient:
    """Returns an authenticated connection to the ViamClient for the requested
    org associated with the submitted training job."""
    # The API key and key ID can be accessed programmatically using the
    # environment variables API_KEY and API_KEY_ID. You do not need to
    # supply the API keys; they are provided automatically when the training
    # job is submitted.
    dial_options = DialOptions.with_api_key(
        os.environ.get("API_KEY"), os.environ.get("API_KEY_ID")
    )
    return await ViamClient.create_from_dial_options(dial_options)
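For example, you might use this connection to reach the data client from within your training logic; a brief illustrative sketch, assuming the connect() helper above:
async def run_with_viam_apis() -> None:
    """Open an authenticated client, use the data client, then close the connection."""
    viam_client = await connect()
    data_client = viam_client.data_client
    # ... call data client methods here as needed ...
    viam_client.close()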
Test your training script locally
You can export one of your Viam datasets to test your training script locally.
1. Export your dataset
You can get the dataset ID from the dataset page or using the viam dataset list
command:
viam dataset export --destination=<destination> --dataset-id=<dataset-id> --include-jsonl=true
The exported dataset is formatted like the one Viam produces for training.
Use the parse_filenames_and_labels_from_json
and parse_filenames_and_bboxes_from_json
functions to get the images and annotations from your dataset file.
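For example, a quick local sanity check that the exported dataset parses as expected (this assumes the illustrative single-argument helper sketched earlier; the template's actual signature may differ):
image_filenames, image_labels = parse_filenames_and_labels_from_json(
    "/path/to/dataset.jsonl"
)
print(f"{len(image_filenames)} images; labels for the first image: {image_labels[0]}")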
2. Run your training script locally
Install any required dependencies and run your training script, specifying the path to the downloaded dataset file:
python3 -m model.training --dataset_file=/path/to/dataset.jsonl \
--model_output_directory=. --custom_arg=3
Upload your training script
To be able to use your training script in the Viam platform, you must upload it to the Viam Registry.
1. Package the training script as a tar.gz file
Before you can upload your training script to Viam, you have to compress your project folder into a tar.gz file:
tar -czvf my-training.tar.gz my-training/
Tip
You can refer to the directory structure of this example classification training script.
2. Upload a training script
To upload your custom training script to the registry, use the viam training-script upload
command.
viam training-script upload --path=<path-to-tar.gz> \
    --org-id=<org-id> --script-name=<training-script-name>
For example:
viam training-script upload --path=my-training.tar.gz \
    --org-id=<ORG_ID> --script-name=my-training-script
You can also specify the version, framework, type, visibility, and description when uploading a custom training script. For example:
viam training-script upload --path=my-training.tar.gz \
    --org-id=<ORG_ID> --script-name=my-training \
    --framework=tensorflow --type=single_label_classification \
    --description="Custom image classification model" \
    --visibility=private
To find your organization’s ID, run the following command:
viam organization list
After a successful upload, the CLI displays a confirmation message with a link to view your changes online. You can view uploaded training scripts by navigating to the registry’s Training Scripts page.
Submit a training job
After uploading the training script, you can run it by submitting a training job through the Viam app or using the Viam CLI or ML Training client API.
1. Create the training job
In the Viam app, navigate to your list of DATASETS and select the one you want to train a model on.
Click Train model and select Train on a custom training script, then follow the prompts.
You can use viam train submit custom from-registry
to submit a training job.
For example:
viam train submit custom from-registry --dataset-id=<INSERT DATASET ID> \
    --org-id=<INSERT ORG ID> --model-name=MyRegistryModel \
    --model-version=2 --version=1 \
    --script-name=mycompany:MyCustomTrainingScript \
    --args=custom_arg1=3,custom_arg2="'green_square blue_star'"
This command submits a training job to the previously uploaded MyCustomTrainingScript with the specified dataset as input, trains MyRegistryModel, and publishes it to the registry.
You can get the dataset ID from the dataset page or with the viam dataset list command.
2. Check on training job progress
You can view your training job on the DATA page’s TRAINING tab.
Once the model has finished training, it becomes visible on the DATA page’s MODELS tab.
You will receive an email when your training job completes.
You can also check your training jobs and their status from the CLI:
viam train list --org-id=<INSERT ORG ID> --job-status=unspecified
3. Debug your training job
From the DATA page’s TRAINING tab, click on your training job’s ID to see its logs.
Note
Your training script may output logs at the error level but still succeed.
You can also view your training jobs’ logs with the viam train logs
command.
Next steps
To use your new model with machines, you must deploy it with the ML model service. Then you can use another service, such as the vision service, to apply the deployed model to camera feeds.
To see models in use with machines, see one of the following resources: