The Machine Learning (ML) model service allows you to deploy machine learning models to your smart machine.
Vision services, like an "mlmodel" detector or classifier, enable your machines to identify and classify objects in images using the deployed models' predictions.
The two services work closely together: the vision service relies on the deployed ML model to make inferences.
If you are designing your own ML Model service, your ML models' input and output tensor shapes must match the tensors the mlmodel vision service expects, so that the two services can coordinate on classification or detection.
To use a deployed ML model, the mlmodel vision service checks for descriptions of these characteristics in the metadata of the model, as defined in the Python SDK.
For an example of this, see Example Metadata.
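You can inspect a deployed model's metadata yourself before pairing it with a vision service. The following is a minimal sketch using the Python SDK's MLModelClient; the service name "my_mlmodel" and the already-connected RobotClient passed in as machine are assumptions for this example:

from viam.services.mlmodel import MLModelClient

async def print_model_metadata(machine):
    # "my_mlmodel" is the assumed name of the ML model service on the machine
    my_mlmodel = MLModelClient.from_robot(machine, "my_mlmodel")
    metadata = await my_mlmodel.metadata()
    print(metadata.input_info)   # names, data types, and shapes of the expected input tensors
    print(metadata.output_info)  # names, data types, and shapes of the returned output tensors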
input_info in metadata
For both classification and detection models, the vision service sends a single input tensor to the ML Model with the following structure:
- name: "image"
- type: uint8 or float32
- shape: (1, height, width, 3), with the last channel 3 being the RGB bytes of the pixel
If height and width are unknown or variable, then height and/or width = -1. During inference the image will have a known height and width.
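As an illustration of that shape, the following sketch builds such an input tensor from an image file with numpy and Pillow; the file name example.jpg is a placeholder:

import numpy as np
from PIL import Image

# Load an image and arrange it as (1, height, width, 3) uint8 RGB,
# the single input tensor shape described above.
image = Image.open("example.jpg").convert("RGB")   # placeholder file name
tensor = np.asarray(image, dtype=np.uint8)         # (height, width, 3)
tensor = np.expand_dims(tensor, axis=0)            # (1, height, width, 3)
print(tensor.shape, tensor.dtype)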
output_info in metadata
Data can be returned by the ML model in many ways, due to the variety of machine learning models for computer vision. The vision service will try to take into account many different forms of models, as specified by the metadata of the model. If the model does not provide metadata, the vision service will make the following assumptions:
For classifications:
- An output tensor named "probability" with shape (1, n_classifications), containing values between 0 and 1.
- If the returned values are not between 0 and 1, the vision service computes a softmax over the data, resulting in floating point numbers between 0 and 1 representing probability.
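The following sketch mirrors that normalization behavior: it leaves values alone if they already look like probabilities and applies a softmax otherwise. The tensor values are illustrative only:

import numpy as np

def to_probabilities(raw: np.ndarray) -> np.ndarray:
    # If the model's output already lies in [0, 1], use it directly;
    # otherwise apply a softmax, as described above.
    if raw.min() >= 0 and raw.max() <= 1:
        return raw
    exps = np.exp(raw - raw.max())  # subtract the max for numerical stability
    return exps / exps.sum()

# A "probability" tensor with shape (1, n_classifications)
logits = np.array([[2.0, 0.5, -1.0]], dtype=np.float32)
print(to_probabilities(logits[0]))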
"Location"
: the bounding boxes(1, n_detections, 4)
(xmin, ymin, xmax, ymax)
0
and 1
."Category"
: the labels on the boxes(1, n_detections)
"Score"
: The confidence scores of the label(1, n_detections)
0
and 1
.For labels:
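To make these shapes concrete, the following sketch assembles a fake set of detection outputs; the tensor names follow the example metadata below, and all values are made up for illustration:

import numpy as np

n_detections = 2
outputs = {
    # Bounding boxes, shape (1, n_detections, 4): (xmin, ymin, xmax, ymax) in [0, 1]
    "location": np.array([[[0.10, 0.20, 0.45, 0.60],
                           [0.55, 0.05, 0.90, 0.40]]], dtype=np.float32),
    # Class indices for each box, shape (1, n_detections)
    "category": np.array([[0.0, 17.0]], dtype=np.float32),
    # Confidence per box in [0, 1], shape (1, n_detections)
    "score": np.array([[0.93, 0.71]], dtype=np.float32),
}
for name, tensor in outputs.items():
    print(name, tensor.shape)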
For labels:
Many computer vision models have an associated 'labelfile.txt' that lists the class labels the model can predict.
To get the labels associated with the model, the vision service currently looks at the first element of the output_info list in the ML model's metadata and checks for a key called "labels" in its "extra" struct.
The value of that key should be the full path to the label file on the machine.
See Example Metadata for an example of this.
label_path = ml_model_metadata.output_info[0].extra["labels"]
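Once you have that path, reading the labels is a plain file read. The sketch below assumes a newline-delimited label file like the one referenced in the example metadata:

# Read the label file found at label_path; each line is one class label,
# and the "category" values index into this list.
with open(label_path) as f:
    labels = [line.strip() for line in f if line.strip()]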
Example Metadata
For example, a TFLite detector model that works with the vision service is structured with the following metadata:
name: "EfficientDet Lite0 V1"
type: "tflite_detector"
description: "Identify which of a known set of objects might be present and provide information about their positions within the given image or a video stream."
input_info {
name: "image"
description: "Input image to be detected. The expected image is 320 x 320, with three channels (red, blue, and green) per pixel. Each value in the tensor is a single byte between 0 and 255."
data_type: "uint8"
shape: 1
shape: 320
shape: 320
shape: 3
extra {
}
}
output_info {
name: "location"
description: "The locations of the detected boxes."
data_type: "float32"
extra {
fields {
key: "labels"
value {
string_value: "/Users/<username>/.viam/packages/.data/ml_model/effdet0-1685040512967/effdetlabels.txt"
}
}
}
}
output_info {
name: "category"
description: "The categories of the detected boxes."
data_type: "float32"
associated_files {
name: "labelmap.txt"
description: "Label of objects that this model can recognize."
label_type: LABEL_TYPE_TENSOR_VALUE
}
extra {
}
}
output_info {
name: "score"
description: "The scores of the detected boxes."
data_type: "float32"
extra {
}
}
output_info {
name: "number of detections"
description: "The number of the detected boxes."
data_type: "float32"
extra {
}
}
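If you are writing your own ML Model service, you return a description like this from the service's metadata method. The following is a minimal sketch, assuming the Metadata and TensorInfo message types exported by the Python SDK's viam.services.mlmodel module; the import path, field values, and file path are assumptions and may differ between SDK versions:

from google.protobuf.struct_pb2 import Struct

from viam.services.mlmodel import Metadata, TensorInfo  # assumed import path

# Point the vision service at the label file through the "extra" struct
# of the first output tensor, as described above. The path is a placeholder.
labels_extra = Struct()
labels_extra.update({"labels": "/path/to/effdetlabels.txt"})

detector_metadata = Metadata(
    name="EfficientDet Lite0 V1",
    type="tflite_detector",
    input_info=[
        TensorInfo(name="image", data_type="uint8", shape=[1, 320, 320, 3]),
    ],
    output_info=[
        TensorInfo(name="location", data_type="float32", extra=labels_extra),
        TensorInfo(name="category", data_type="float32"),
        TensorInfo(name="score", data_type="float32"),
        TensorInfo(name="number of detections", data_type="float32"),
    ],
)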