Create custom training scripts
You can create your own custom Python training script that trains ML models to your specifications using the machine learning framework of your choice (PyTorch, TensorFlow, TFLite, ONNX, or any other framework). Once added to the Viam Registry, you can use the training script to build ML models based on your datasets.
In this page
- Create a training script from a template.
- Test your training script locally with a downloaded dataset.
- Upload your training script.
- Submit a training job that uses the training script on a dataset to train a new ML model.
Prerequisites
Create a training script
1. Create folder structure
Create a folder for the training-script, for example
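A minimal sketch of scaffolding such a project from Python. The folder name my-training and the package name model are assumptions taken from the tar and python3 -m model.training commands shown later on this page, and the helper name scaffold_training_project is hypothetical.

```python
from pathlib import Path


def scaffold_training_project(root: Path, package: str = "model") -> list[Path]:
    """Create empty placeholder files for a training-script project.

    Layout (assumed from the commands later on this page):
        <root>/setup.py
        <root>/<package>/__init__.py
        <root>/<package>/training.py
    """
    files = [
        root / "setup.py",
        root / package / "__init__.py",
        root / package / "training.py",
    ]
    for path in files:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)  # leaves existing file content untouched
    return files
```

Calling scaffold_training_project(Path("my-training")) creates the three files if they do not exist yet.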
2. Create setup.py
Inside the top level folder (in this example setup.py
with the following contents:
from setuptools import find_packages, setup

setup(
    name="my-training",
    version="0.1",
    packages=find_packages(),
    include_package_data=True,
    install_requires=[
        "google-cloud-aiplatform",
        "google-cloud-storage",
        # TODO: Add additional required packages
    ],
)
Ensure you add any additional required packages to the install_requires list, at the # TODO comment.
3. Create __init__.py
Inside the top level folder (in this example
Inside the
Copy this template into
You do not need to edit the script's parsing functionality.
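For orientation, the argument-parsing step can be sketched as follows. The two flag names come from the local test command later on this page (--dataset_file and --model_output_directory). The function name parse_args and the help strings are assumptions, and the actual template may define more flags.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the flags passed to the training script."""
    parser = argparse.ArgumentParser(description="Custom training script")
    parser.add_argument(
        "--dataset_file",
        required=True,
        help="Path to the JSON Lines dataset file",
    )
    parser.add_argument(
        "--model_output_directory",
        required=True,
        help="Directory where the model artifact is written",
    )
    # parse_known_args tolerates extra flags this sketch does not know about.
    args, _unknown = parser.parse_known_args(argv)
    return args
```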
6. Add logic to produce the model artifact
You must fill in the build_and_compile_model function.
In this part of the script, you use the data from the dataset and the annotations from the dataset file to build a Machine Learning model.
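Before a model can be built, most frameworks need the text labels from the annotations as numeric targets. The following is a framework-independent sketch of that preparation step, not the template's implementation; encode_labels and its arguments are hypothetical names.

```python
def encode_labels(labels_per_image, all_labels):
    """Turn per-image label lists into multi-hot vectors.

    labels_per_image: list of lists of label strings, one list per image.
    all_labels: ordered list of every label the model should predict.
    """
    index = {label: i for i, label in enumerate(all_labels)}
    encoded = []
    for labels in labels_per_image:
        vector = [0] * len(all_labels)
        for label in labels:
            if label in index:  # ignore labels the model is not trained on
                vector[index[label]] = 1
        encoded.append(vector)
    return encoded
```

The resulting vectors can then be handed to whichever framework you use inside build_and_compile_model.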
As an example, you can refer to the logic from
7. Save the model artifact
The save_model() and save_labels() functions, defined in the template before the main logic, save the model artifact your training job produces to the model_output_directory in the cloud.
Once a training job is complete, Viam checks the output directory and creates a package with all of the contents of the directory, creating or updating a registry item for the ML model.
You must fill in these functions.
As an example, you can refer to the logic from
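A minimal sketch of the labels half of this step, assuming the labels are written one per line to a file named labels.txt inside the output directory. The file name and the function signature are assumptions, so check the template for the exact names it expects. Saving the model itself depends on your framework and is not shown.

```python
import os


def save_labels(labels, model_output_directory, filename="labels.txt"):
    """Write one label per line and return the path of the written file."""
    os.makedirs(model_output_directory, exist_ok=True)
    path = os.path.join(model_output_directory, filename)
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(labels))
    return path
```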
8. Update the main method
Update the main function so that it calls the functions you have just created.
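The overall flow of the main logic can be sketched as below. The three callables are passed in as parameters so the sketch stays framework-independent. Their names and signatures (load_dataset, train, save) are hypothetical and stand in for the functions you filled in during the previous steps.

```python
def run_training(dataset_file, model_output_directory, load_dataset, train, save):
    """Wire the steps together: load data, train, then save the artifact."""
    filenames, labels = load_dataset(dataset_file)
    if not filenames:
        raise ValueError(f"No training examples found in {dataset_file}")
    model = train(filenames, labels)
    save(model, model_output_directory)
    return model
```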
9. Using Viam APIs in a training script
If you need to access any of the Viam APIs within a custom training script, you can use the environment variables API_KEY and API_KEY_ID to establish a connection.
These environment variables will be available to training scripts.
import os

from viam.app.viam_client import ViamClient
from viam.rpc.dial import DialOptions


async def connect() -> ViamClient:
    """Returns an authenticated connection to the ViamClient for the requested
    org associated with the submitted training job."""
    # The API key and key ID can be accessed programmatically, using the
    # environment variables API_KEY and API_KEY_ID. The user does not need to
    # supply the API keys; they are provided automatically when the training
    # job is submitted.
    dial_options = DialOptions.with_api_key(
        os.environ.get("API_KEY"), os.environ.get("API_KEY_ID")
    )
    return await ViamClient.create_from_dial_options(dial_options)
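Because os.environ.get returns None when a variable is missing, a local run without these variables only fails once the connection is attempted. A small guard like the following makes that failure explicit. The helper name read_api_credentials is hypothetical and is not part of the Viam SDK.

```python
import os


def read_api_credentials(environ=os.environ):
    """Return (api_key, api_key_id), raising if either variable is unset."""
    missing = [name for name in ("API_KEY", "API_KEY_ID") if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return environ["API_KEY"], environ["API_KEY_ID"]
```

The returned pair is in the same order that the connect function above passes to DialOptions.with_api_key.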
Test your training script locally
You can export one of your Viam datasets to test your training script locally.
1. Export your dataset
Export your dataset with the following command. You can get the dataset ID from the dataset page or by running the viam dataset list command:
viam dataset export --destination=<destination> --dataset-id=<dataset-id> --include-jsonl=true
The dataset will be formatted like the one Viam produces for training.
Use the parse_filenames_and_labels_from_json and parse_filenames_and_bboxes_from_json functions to get the images and annotations from your dataset file.
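To illustrate what such a parser does, here is a simplified sketch for classification labels. It assumes each line of the dataset file is a JSON object with an image_path field and a classification_annotations list whose entries carry an annotation_label. These field names are an assumption about the export format, so inspect your own dataset.jsonl before relying on them. The function name is hypothetical and this is not the template's implementation.

```python
import json


def read_classification_examples(dataset_file, wanted_labels):
    """Return (image_paths, labels_per_image) from a JSON Lines dataset file."""
    image_paths, labels_per_image = [], []
    with open(dataset_file, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            annotations = record.get("classification_annotations", [])
            labels = [
                a["annotation_label"]
                for a in annotations
                if a.get("annotation_label") in wanted_labels
            ]
            image_paths.append(record["image_path"])
            labels_per_image.append(labels)
    return image_paths, labels_per_image
```

Images without a matching annotation are kept with an empty label list, so the two returned lists always have the same length.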
2. Run your training script locally
Install any required dependencies and run your training script, specifying the path to the <dataset.jsonl> file from your exported dataset:
python3 -m model.training --dataset_file=/path/to/dataset.jsonl --model_output_directory=.
Upload your training script
To be able to use your training script in the Viam platform, you must upload it to the Viam Registry.
1. Package the training script as a tar.gz archive
To run your training script on datasets in Viam, compress your project folder into a .tar.gz archive. You can create the archive with this command:
tar -czvf my-training.tar.gz my-training/
Tip
You can refer to the directory structure of this example classification training script.
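If you prefer to build the archive from Python, for example in a build script, the standard library's tarfile module can produce an equivalent of the tar command above. This is a sketch; the function name make_archive is hypothetical.

```python
import tarfile
from pathlib import Path


def make_archive(project_dir, archive_path):
    """Compress project_dir into a gzip-compressed tar archive."""
    project_dir = Path(project_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps the top-level folder name inside the archive,
        # matching `tar -czvf my-training.tar.gz my-training/`.
        tar.add(project_dir, arcname=project_dir.name)
    return archive_path
```

For example, make_archive("my-training", "my-training.tar.gz") writes the archive to the current directory.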
2. Upload a new training script (or a new version)
To upload a custom training script to the registry, use the viam training-script upload command.
The general form of the command is:
viam training-script upload --path=<path-to-tar.gz> \
  --org-id=<org-id> --script-name=<training-script-name>
For example:
viam training-script upload --path=my-training.tar.gz \
  --org-id=<ORG_ID> --script-name=my-training-script
You can also specify the version, framework, type, visibility, and description when uploading a custom training script:
viam training-script upload --path=my-training.tar.gz \
  --org-id=<ORG_ID> --script-name=my-training \
  --framework=tensorflow --type=single_label_classification \
  --description="Custom image classification model" \
  --visibility=private
To find your organization’s ID, run the following command:
viam organization list
After a successful upload, the CLI prints a confirmation message with a link to view your changes online. Once uploaded, you can view the script by navigating to the registry’s Training Scripts page.
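When uploads are scripted, for example in CI, it can help to assemble the command in one place. The sketch below only builds the argument list, using the flags shown in the upload examples above. The helper name build_upload_command is hypothetical, and running the result assumes the viam CLI is installed and authenticated.

```python
def build_upload_command(path, org_id, script_name, **optional):
    """Return the argv list for `viam training-script upload`.

    Optional keyword arguments (for example framework, type, description,
    visibility, version) are appended as --name=value flags when given.
    """
    cmd = [
        "viam", "training-script", "upload",
        f"--path={path}",
        f"--org-id={org_id}",
        f"--script-name={script_name}",
    ]
    for name, value in optional.items():
        if value is not None:
            cmd.append(f"--{name}={value}")
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True), which avoids shell quoting issues with values such as the description.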
Submit a training job
After uploading the training script, you can run it by submitting a training job through the Viam app or using the Viam CLI or ML Training client API.
1. Create the training job
In the Viam app, navigate to your list of DATASETS and select the one you want to train on.
Click Train model and select Train on a custom training script, then follow the prompts.
You can use viam train submit custom from-registry to submit a training job from a training script already uploaded to the registry, or viam train submit custom with-upload to upload a training script and submit a training job at the same time.
For example:
viam train submit custom from-registry --dataset-id=<INSERT DATASET ID> \
--org-id=<INSERT ORG ID> --model-name=MyRegistryModel \
--model-version=2 --version=1 \
--script-name=mycompany:MyCustomTrainingScript
This command submits a training job to the previously uploaded MyCustomTrainingScript with another input dataset, which trains MyRegistryModel and publishes that to the registry.
viam train submit custom with-upload --dataset-id=<INSERT DATASET ID> \
--model-org-id=<INSERT ORG ID> --model-name=MyRegistryModel \
--model-type=single_label_classification --model-version=2 \
--version=1 --path=<path-to-tar.gz> \
--script-name=mycompany:MyCustomTrainingScript
This command uploads a script called MyCustomTrainingScript to the registry under the specified organization and also submits a training job to that script with the input dataset, which generates a new version of the single-label classification ML model MyRegistryModel and publishes that to the registry.
To find the dataset ID of a given dataset, go to the DATASETS subtab of the DATA tab on the Viam app and select a dataset. Click … in the left-hand menu and click Copy dataset ID.
To find your organization’s ID, navigate to your organization’s Settings page in the Viam app. Find Organization ID and click the copy icon.
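If you submit jobs from a script, the from-registry example above can be captured in a small data structure so that the IDs are filled in once. This is a sketch: the class name and field names are hypothetical, it assumes --version refers to the training script version, and it only builds the argument list rather than calling the CLI.

```python
from dataclasses import dataclass


@dataclass
class RegistryTrainingJob:
    """Parameters for `viam train submit custom from-registry`."""

    dataset_id: str
    org_id: str
    model_name: str
    model_version: str
    script_name: str
    script_version: str  # assumed to map to the --version flag

    def to_command(self):
        """Return the argv list for the CLI invocation."""
        return [
            "viam", "train", "submit", "custom", "from-registry",
            f"--dataset-id={self.dataset_id}",
            f"--org-id={self.org_id}",
            f"--model-name={self.model_name}",
            f"--model-version={self.model_version}",
            f"--version={self.script_version}",
            f"--script-name={self.script_name}",
        ]
```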
2. Check on training job progress
Once submitted, you can view your training job on the DATA page’s MODELS tab.
You will receive an email when your training job completes.
You can also list your training jobs and their status from the CLI:
viam train list --org-id=<INSERT ORG ID> --job-status=unspecified
3. Debug your training job
If your training job failed, you can check your job’s logs with the CLI:
viam train logs --job-id=<JOB ID>
You can obtain the job’s ID by listing the jobs as in step 2.
Next steps