Capture and sync edge data

You can use the data management service to capture data from supported components and services, then sync it to the cloud. You can also sync data from arbitrary folders on your machine.

Data capture and sync in Viam involves two key pieces:

  • The data management service that writes captured data to local edge device storage and syncs that data with the cloud.
  • Individual resource configurations that specify what data to capture and how often.

How data capture and data sync work

The data management service writes data from your configured Viam resources to local storage on your edge device and syncs data from the edge device to the cloud:

  • The data management service stores captured data locally in ~/.viam/capture by default.
  • Data is synced to the Viam cloud at a configured sync interval using encrypted gRPC calls and deleted from the disk once synced.
  • You can capture and sync data independently; one can run without the other.

For more information, see How sync works.

Configure the data management service for your machine

From your machine’s CONFIGURE tab in the Viam app, add the data management service. On the panel that appears, configure data capture and sync attributes as applicable. To both capture data and sync it to the cloud, keep both Capturing and Syncing switched on.

Click the Save button in the top right corner of the page to save your config.

{
  "components": [],
  "services": [
    {
      "name": "my-data-manager",
      "namespace": "rdk",
      "type": "data_manager",
      "attributes": {
        "sync_interval_mins": 1,
        "capture_dir": "",
        "tags": [],
        "capture_disabled": false,
        "sync_disabled": true,
        "delete_data_on_part_deletion": true,
        "delete_every_nth_when_disk_full": 5,
        "maximum_num_sync_threads": 250
      }
    }
  ]
}
{
  "components": [],
  "services": [
    {
      "name": "my-data-manager",
      "namespace": "rdk",
      "type": "data_manager",
      "attributes": {
        "capture_dir": "",
        "tags": [],
        "additional_sync_paths": [],
        "sync_interval_mins": 3
      }
    }
  ]
}

The following attributes are available for the data management service:

Name | Type | Required? | Description
capture_disabled | bool | Optional | Toggle data capture on or off for the entire machine part. Even if capture is on for the whole part, no data is captured unless capture is also configured for at least one individual resource (see Configure data capture for individual resources). Default: false.
capture_dir | string | Optional | Path to the directory on your machine where you want to store captured data. If you change the directory for data capture, only new data is stored in the new directory. Existing data remains in the directory where it was stored. Default: ~/.viam/capture.
tags | array of strings | Optional | Tags to apply to all images or tabular data captured by this machine part. May include alphanumeric characters, underscores, and dashes.
sync_disabled | bool | Optional | Toggle cloud sync on or off for the entire machine part. Default: false.
additional_sync_paths | array of strings | Optional | Paths to any other directories on your machine from which you want to sync data to the cloud. Once data is synced from a directory, it is automatically deleted from your machine. We recommend using absolute paths. For relative paths, see How sync works.
sync_interval_mins | float | Optional | Time interval in minutes between syncing to the cloud. Viam does not impose a minimum or maximum on the frequency of data syncing. However, in practice, your hardware or network speed may impose limits on the frequency of data syncing. Default: 0.1, meaning once every 6 seconds.
delete_data_on_part_deletion | bool | Optional | Whether deleting this machine or machine part should result in deleting all the data captured by that machine part. Default: false.
delete_every_nth_when_disk_full | int | Optional | How many files to delete when local storage meets the fullness criteria. The data management service will delete every Nth file that has been captured upon reaching this threshold. Use JSON mode to configure this attribute. Default: 5, meaning that every fifth captured file will be deleted.
maximum_num_sync_threads | int | Optional | Maximum number of CPU threads to use for syncing data to the Viam Cloud. Default: runtime.NumCPU/2, that is, half the number of logical CPUs available to viam-server.
mongo_capture_config.uri | string | Optional | The MongoDB URI that data capture will attempt to write tabular data to after the data is enqueued to be written to disk. When non-empty, data capture will capture tabular data to the configured MongoDB database and collection at that URI. See mongo_capture_config.database and mongo_capture_config.collection below for database and collection defaults. See Capture directly to MongoDB for an example config.
mongo_capture_config.database | string | Optional | When mongo_capture_config.uri is non-empty, changes the database data capture will write tabular data to. Default: "sensorData".
mongo_capture_config.collection | string | Optional | When mongo_capture_config.uri is non-empty, changes the collection data capture will write tabular data to. Default: "readings".
cache_size_kb | float | Optional | viam-micro-server only. The maximum amount of storage, in kilobytes, allocated to a data collector. Default: 1 KB.
file_last_modified_millis | float | Optional | How long, in milliseconds, an arbitrary file must remain unmodified before it is synced. Normal .capture files are synced as soon as they are able to be synced. Default: 10000 milliseconds.
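
To make the units of two of these attributes concrete, here is a minimal Python sketch. It is illustrative only: the helper names are hypothetical and not part of any Viam API, os.cpu_count() is used as a stand-in for Go's runtime.NumCPU, and the floor of one thread is an assumption.

```python
import os


def sync_interval_seconds(sync_interval_mins: float = 0.1) -> float:
    """Convert the sync_interval_mins attribute to seconds."""
    return sync_interval_mins * 60


def default_sync_threads() -> int:
    """Approximate the documented default of half the logical CPUs.

    Assumptions: os.cpu_count() stands in for runtime.NumCPU, and the
    result is floored at 1 so the sketch never returns zero.
    """
    return max(1, (os.cpu_count() or 1) // 2)
```

With the default sync_interval_mins of 0.1, the first helper yields roughly 6 seconds, matching the default described in the table.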

Configure data capture for individual resources

You can capture data for any resource that supports it, including resources on remote parts.

Configure data capture for individual resources in their configuration by:

  • Selecting which resource methods to capture data from
  • Setting the capture frequency for each method

For each resource you can capture data for, there is a Data capture section in its panel. Select a Method and specify a capture Frequency in hertz, for example 0.1 to capture an image every 10 seconds. You can add multiple methods with different capture frequencies. Some methods will prompt you to add additional parameters.

The available methods, and corresponding additional parameters, depend on the component or service type. For example, a camera has the options ReadImage and NextPointCloud. Some models do not support all options (for example, webcams do not capture point clouds), so choose the method accordingly.
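
Because the frequency is given in hertz, the time between captures is its reciprocal. The following Python sketch shows the conversion; the function names are hypothetical helpers for illustration, not part of Viam.

```python
def capture_period_seconds(capture_frequency_hz: float) -> float:
    """Return the number of seconds between captures for a frequency in hertz."""
    if capture_frequency_hz <= 0:
        raise ValueError("capture_frequency_hz must be positive")
    return 1.0 / capture_frequency_hz


def captures_per_day(capture_frequency_hz: float) -> float:
    """Estimate how many captures one method produces in 24 hours."""
    return capture_frequency_hz * 24 * 60 * 60
```

For example, a frequency of 0.1 Hz corresponds to one capture about every 10 seconds, or about 8,640 captures per day for that method.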

component config example

This example configuration captures data from the ReadImage method of a camera:

{
  "services": [
    ...
    ,
    {
      "name": "data_manager",
      "type": "data_manager",
      "attributes": {
        "sync_interval_mins": 5,
        "capture_dir": "",
        "sync_disabled": false,
        "tags": []
      }
    }
  ],
  "remotes": [
    {
        ...
    }
  ],
  "components": [
        ...
    ,
    {
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "capture_frequency_hz": 0.333,
                "disabled": false,
                "method": "ReadImage",
                "additional_params": {
                  "reader_name": "cam1",
                  "mime_type": "image/jpeg"
                }
              }
            ],
            "retention_policy": {
              "days": 5
            }
          }
        }
      ],
      "model": "webcam",
      "name": "cam",
      "type": "camera",
      "attributes": {
        "video_path": "video0"
      },
      "depends_on": [
        "local"
      ]
    },
    ...
  ]
}

This example configuration captures data from the Readings method of a temperature sensor and a WiFi signal sensor:

{
  "services": [
    {
      "attributes": {
        "capture_dir": "",
        "tags": [],
        "additional_sync_paths": [],
        "sync_interval_mins": 3
      },
      "name": "dm",
      "namespace": "rdk",
      "type": "data_manager"
    }
  ],
  "components": [
    {
      "type": "sensor",
      "model": "tmp36",
      "attributes": {
        "analog_reader": "temp",
        "num_readings": 15
      },
      "depends_on": [],
      "service_configs": [
        {
          "attributes": {
            "capture_methods": [
              {
                "capture_frequency_hz": 0.2,
                "cache_size_kb": 10,
                "additional_params": {},
                "method": "Readings"
              }
            ]
          },
          "type": "data_manager"
        }
      ],
      "name": "tmp36",
      "namespace": "rdk"
    },
    {
      "type": "sensor",
      "model": "wifi-rssi",
      "attributes": {},
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "additional_params": {},
                "method": "Readings",
                "capture_frequency_hz": 0.1,
                "cache_size_kb": 10
              }
            ]
          }
        }
      ],
      "name": "my-wifi-sensor",
      "namespace": "rdk"
    }
  ]
}

Example for a vision service:

This example configuration captures data from the CaptureAllFromCamera method of the vision service:

{
  "components": [
    {
      "name": "camera-1",
      "namespace": "rdk",
      "type": "camera",
      "model": "webcam",
      "attributes": {}
    }
  ],
  "services": [
    {
      "name": "vision-1",
      "namespace": "rdk",
      "type": "vision",
      "model": "mlmodel",
      "attributes": {},
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "method": "CaptureAllFromCamera",
                "capture_frequency_hz": 1,
                "additional_params": {
                  "mime_type": "image/jpeg",
                  "camera_name": "camera-1",
                  "min_confidence_score": "0.7"
                }
              }
            ]
          }
        }
      ]
    },
    {
      "name": "data_manager-1",
      "namespace": "rdk",
      "type": "data_manager",
      "attributes": {
        "sync_interval_mins": 0.1,
        "capture_dir": "",
        "tags": [],
        "additional_sync_paths": []
      }
    },
    {
      "name": "mlmodel-1",
      "namespace": "rdk",
      "type": "mlmodel",
      "model": "viam:mlmodel-tflite:tflite_cpu",
      "attributes": {}
    }
  ],
  "modules": [
    {
      "type": "registry",
      "name": "viam_tflite_cpu",
      "module_id": "viam:tflite_cpu",
      "version": "0.0.3"
    }
  ]
}
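
The examples above share one structure: each resource carries a service_configs entry of type data_manager whose attributes hold a capture_methods list. As a rough sketch of how to inspect that structure, the following Python function lists the enabled capture methods in a configuration. The function name is hypothetical, and the sample dictionary is a trimmed copy of the vision service example.

```python
def list_capture_methods(config: dict) -> list:
    """Return (resource name, method, frequency in Hz) for each enabled capture method."""
    found = []
    # Both components and services can carry data capture configuration.
    for section in ("components", "services"):
        for resource in config.get(section, []):
            for svc in resource.get("service_configs", []):
                if svc.get("type") != "data_manager":
                    continue
                methods = svc.get("attributes", {}).get("capture_methods", [])
                for m in methods:
                    if m.get("disabled", False):
                        continue  # skip methods that are switched off
                    found.append(
                        (resource["name"], m["method"], m["capture_frequency_hz"])
                    )
    return found


# Trimmed copy of the vision service example above.
example_config = {
    "components": [{"name": "camera-1", "type": "camera", "model": "webcam"}],
    "services": [
        {
            "name": "vision-1",
            "type": "vision",
            "service_configs": [
                {
                    "type": "data_manager",
                    "attributes": {
                        "capture_methods": [
                            {
                                "method": "CaptureAllFromCamera",
                                "capture_frequency_hz": 1,
                            }
                        ]
                    },
                }
            ],
        }
    ],
}
```

Calling list_capture_methods(example_config) returns a single entry for vision-1, because the camera itself has no capture methods configured.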

Viam supports data capture from resources on remote parts. For example, if you use a part that does not have a Linux operating system or does not have enough storage or processing power to run viam-server, you can still process and capture the data from that part’s resources by adding it as a remote part.

Currently, you can only configure data capture from remote resources in your JSON configuration. To add them to your JSON configuration you must explicitly add the remote resource’s type, model, name, and additional_params to the data_manager service configuration in the remotes configuration:

Key | Description
type | The type tells your machine what the resource is. For example, a board.
model | The model is a colon-delimited-triplet that specifies the namespace, the type of the part, and the part itself.
name | The name specifies the fully qualified name of the part.
additional_params | The additional parameters specify the data sources when you are using a board.
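
As a sketch of how such an entry fits together, the following Python helper assembles one capture method for a remote resource. The helper and its parameter names are hypothetical. The fully qualified name format is inferred from the example configurations in this section (for instance rdk:component:board/my-esp32) and should be treated as an assumption rather than a specification.

```python
def remote_capture_method(
    resource_type: str,
    resource_name: str,
    method: str,
    capture_frequency_hz: float,
    additional_params: dict = None,
    namespace: str = "rdk",
    kind: str = "component",
) -> dict:
    """Build one capture_methods entry for a resource on a remote part.

    Assumption: the fully qualified name follows the pattern
    <namespace>:<kind>:<type>/<name>, as seen in this page's examples.
    """
    return {
        "method": method,
        "capture_frequency_hz": capture_frequency_hz,
        "name": f"{namespace}:{kind}:{resource_type}/{resource_name}",
        "additional_params": additional_params or {},
        "disabled": False,
    }
```

For example, remote_capture_method("board", "my-esp32", "Analogs", 1, {"reader_name": "A1"}) produces an entry whose name is rdk:component:board/my-esp32.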

The following example shows the configuration of the part that we will establish as a remote, in this case an ESP32 board. This config is just like that of a non-remote part; the remote connection is established by the main part (in the next example).

{
  "components": [
    {
      "name": "my-esp32",
      "model": "esp32",
      "type": "board",
      "namespace": "rdk",
      "attributes": {
        "pins": [27],
        "analogs": [
          {
            "pin": "34",
            "name": "A1"
          },
          {
            "pin": "35",
            "name": "A2"
          }
        ]
      },
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "method": "Analogs",
                "additional_params": {
                  "reader_name": "A1"
                },
                "cache_size_kb": 10,
                "capture_frequency_hz": 10
              },
              {
                "method": "Analogs",
                "additional_params": {
                  "reader_name": "A2"
                },
                "cache_size_kb": 10,
                "capture_frequency_hz": 10
              }
            ]
          }
        }
      ]
    }
  ]
}

The following example of a configuration with a remote part captures data from two analog readers that provide a voltage reading and from pin 27 of the GPIO of the board that we configured in the previous example:

{
  "services": [
    {
      "name": "data_manager",
      "type": "data_manager",
      "attributes": {
        "capture_dir": "",
        "sync_disabled": true,
        "sync_interval_mins": 5,
        "tags": ["tag1", "tag2"]
      }
    }
  ],
  "components": [],
  "remotes": [
    {
      "name": "esp-home",
      "address": "esp-home-main.33vvxnbbw9.viam.cloud:80",
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "method": "Analogs",
                "capture_frequency_hz": 1,
                "cache_size_kb": 10,
                "name": "rdk:component:board/my-esp32",
                "additional_params": { "reader_name": "A1" },
                "disabled": false
              },
              {
                "method": "Analogs",
                "capture_frequency_hz": 1,
                "cache_size_kb": 10,
                "name": "rdk:component:board/my-esp32",
                "additional_params": { "reader_name": "A2" },
                "disabled": false
              },
              {
                "method": "Gpios",
                "capture_frequency_hz": 1,
                "cache_size_kb": 10,
                "name": "rdk:component:board/my-esp32",
                "additional_params": {
                  "pin_name": "27"
                },
                "disabled": false
              }
            ]
          }
        }
      ],
      "secret": "REDACTED"
    }
  ]
}

The following example of a configuration with a remote part captures data from the ReadImage method of a camera:

{
  "services": [
    {
      "attributes": {
        "capture_dir": "",
        "sync_disabled": true,
        "sync_interval_mins": 5,
        "tags": []
      },
      "name": "data_manager",
      "type": "data_manager"
    }
  ],
  "components": [],
  "remotes": [
    {
      "name": "pi-test-main",
      "address": "pi-test-main.vw3iu72d8n.viam.cloud",
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "capture_frequency_hz": 1,
                "name": "rdk:component:camera/cam",
                "disabled": false,
                "method": "ReadImage",
                "additional_params": {
                  "mime_type": "image/jpeg",
                  "reader_name": "cam1"
                }
              }
            ]
          }
        }
      ],
      "secret": "REDACTED"
    }
  ]
}

The following attributes are available for data capture configuration:

Name | Type | Required? | Description
capture_frequency_hz | float | Required | Frequency in hertz at which to capture data. For example, to capture a reading every 2 seconds, enter 0.5.
method | string | Required | Depends on the type of component or service. See Supported components and services.
retention_policy | object | Optional | Option to configure how long data collected by this component or service should remain stored in the Viam Cloud. You must set this in JSON mode. See the JSON example for a camera component. Options: "days": <int>, "binary_limit_gb": <int>, "tabular_limit_gb": <int>. Days are in UTC time. Setting a retention policy of 1 day means that data stored now will be deleted the following day in UTC time. You can set either or both of the size limit options and size is in gigabytes.
additional_params | depends | depends | Varies based on the method. For example, ReadImage requires a MIME type.
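
Since retention_policy must be written by hand in JSON mode, a quick check for typos can help. The following Python sketch accepts only the three option names listed above. The function name is hypothetical, and the requirement that values be positive integers is an assumption made for illustration.

```python
ALLOWED_RETENTION_KEYS = {"days", "binary_limit_gb", "tabular_limit_gb"}


def validate_retention_policy(policy: dict) -> dict:
    """Raise ValueError if the policy uses an unknown key or a bad value."""
    unknown = set(policy) - ALLOWED_RETENTION_KEYS
    if unknown:
        raise ValueError(f"unknown retention_policy keys: {sorted(unknown)}")
    for key, value in policy.items():
        # Assumption: each option is a positive integer (bool is excluded).
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer")
    return policy
```

For example, validate_retention_policy({"days": 5}) passes, while a misspelled key such as "day" raises an error.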

Click the Save button in the top right corner of the page to save your config.

If cloud sync is enabled, the data management service deletes captured data once it has successfully synced to the cloud.


With viam-server, the data management service will also automatically delete local data in the event your machine’s local storage fills up. Local data is automatically deleted when all of the following conditions are met:

  • Data capture is enabled on the data management service
  • Local disk usage percentage is greater than or equal to 90%
  • The Viam capture directory is at least 50% of the current local disk usage

If local disk usage is greater than or equal to 90%, but the Viam capture directory is not at least 50% of that usage, a warning log message will be emitted instead and no action will be taken.
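
The decision described above can be summarized in a few lines of Python. This is a sketch of the documented conditions only, not viam-server's actual implementation; the function name, parameters, and return values are hypothetical.

```python
def disk_full_action(
    capture_enabled: bool,
    disk_usage_fraction: float,
    capture_dir_bytes: int,
    disk_used_bytes: int,
) -> str:
    """Return "delete", "warn", or "none" according to the documented rules."""
    # Nothing happens unless capture is enabled and the disk is at least 90% full.
    if not capture_enabled or disk_usage_fraction < 0.90:
        return "none"
    # The capture directory must account for at least 50% of current disk usage.
    if capture_dir_bytes >= 0.5 * disk_used_bytes:
        return "delete"
    # Disk is nearly full, but captured data is not the main cause.
    return "warn"
```

For instance, a disk that is 95% full where the capture directory accounts for 60% of the used space yields "delete", while the same disk with a capture directory at 10% of used space yields "warn".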

Automatic file deletion only applies to files in the specified Viam capture directory, which is set to ~/.viam/capture by default. Data outside of this directory is not touched by automatic data deletion.

If your machine captures a large amount of data, or frequently goes offline for long periods of time while capturing data, consider moving the Viam capture directory to a larger, dedicated storage device on your machine if available. You can change the capture directory using the capture_dir attribute.

You can also control how local data is deleted if your machine’s local storage becomes full, using the delete_every_nth_when_disk_full attribute.
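
To illustrate what deleting every Nth file means, here is a small Python sketch. The function name is hypothetical. The ordering of files and the choice of which file within each group of N is removed are assumptions; the documentation only states that every Nth captured file is deleted.

```python
def files_to_delete(captured_files: list, n: int = 5) -> list:
    """Select every Nth file from an ordered list of captured files.

    Assumption: counting starts at the first file, so with n=5 the
    5th, 10th, 15th, ... files are selected.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return captured_files[n - 1 :: n]
```

With the default of 5 and ten captured files, this selects the fifth and tenth files, which removes 20% of the captured data while keeping samples spread across the whole capture period.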

Supported resources

The following components and services support data capture and cloud sync:

Type | Method
Arm | EndPosition, JointPositions
Board | Analogs, Gpios
Camera | GetImages, ReadImage, NextPointCloud
Encoder | TicksCount
Gantry | Lengths, Position
Motor | Position, IsPowered
Movement sensor | AngularVelocity, CompassHeading, LinearAcceleration, LinearVelocity, Orientation, Position
Sensor | Readings
Servo | Position
Vision service | CaptureAllFromCamera

View captured data

To view all the captured data you have access to, go to the DATA tab where you can filter by location, type of data, and more.

You can also access data from a resource or machine part menu.

Stop data capture and data sync

If you don’t need to capture data, for instance in a test scenario, you can turn off data capture to reduce unnecessary storage. Alternatively, see advanced data capture and sync configurations for other ways to control data usage, such as conditional sync or retention policies.

To turn off data capture for a specific resource’s capture method (for example, a camera component capturing through the ReadImage capture method), navigate to the Data capture section of your resource’s configuration card and toggle the configured capture method’s switch to Off. You can also globally turn off data capture on the data_manager service configuration card by toggling the Capturing switch to Off.

To turn off data sync, navigate to the data_manager service configuration card and toggle the Syncing switch to Off.

Click the Save button in the top right corner of the page to save your config.
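
If you manage configuration as JSON, the global switches correspond to the capture_disabled and sync_disabled attributes of the data management service. The following Python sketch sets both on a configuration dictionary. The function name is hypothetical, and the sketch only edits a local copy of the config; it does not talk to a machine or to the Viam app.

```python
import copy


def set_capture_and_sync(config: dict, capture_disabled: bool, sync_disabled: bool) -> dict:
    """Return a copy of config with the data manager's global switches set."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    for service in updated.get("services", []):
        if service.get("type") == "data_manager":
            attrs = service.setdefault("attributes", {})
            attrs["capture_disabled"] = capture_disabled
            attrs["sync_disabled"] = sync_disabled
    return updated
```

Setting only capture_disabled to true stops capture while letting already captured data continue to sync, which reflects that capture and sync run independently.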

Advanced data capture and sync configurations

Capture directly to MongoDB

You can configure direct capture of tabular data to a MongoDB instance alongside disk storage on your edge device. This can be useful for powering real-time dashboards before data is synced from the edge to the cloud.

Configure this using the mongo_capture_config attributes of your data management service.

Here is a sample configuration that captures fake sensor readings both to the configured MongoDB URI and to the ~/.viam/capture directory on disk:

{
  "components": [
    {
      "name": "sensor-1",
      "namespace": "rdk",
      "type": "sensor",
      "model": "fake",
      "attributes": {},
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "method": "Readings",
                "capture_frequency_hz": 0.5,
                "additional_params": {}
              }
            ]
          }
        }
      ]
    }
  ],
  "services": [
    {
      "name": "data_manager-1",
      "namespace": "rdk",
      "type": "data_manager",
      "attributes": {
        "mongo_capture_config": {
          "uri": "mongodb://127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000"
        }
      }
    }
  ]
}

When mongo_capture_config.uri is configured, data capture will attempt to connect to the configured MongoDB server and write captured tabular data to the configured mongo_capture_config.database and mongo_capture_config.collection (or their defaults if unconfigured) after enqueuing that data to be written to disk.

If writes to MongoDB fail for any reason, data capture will log an error for each failed write and continue capturing.

Failing to write to MongoDB doesn’t affect capturing and syncing data to cloud storage other than adding capture latency.
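
To show how the database and collection defaults apply, here is a Python sketch that resolves the write target from the data management service attributes. The function name is hypothetical; the default values "sensorData" and "readings" are the ones listed in the attribute table.

```python
def mongo_target(attributes: dict):
    """Return (uri, database, collection), or None if MongoDB capture is off."""
    cfg = attributes.get("mongo_capture_config", {})
    uri = cfg.get("uri", "")
    if not uri:
        return None  # an empty or missing URI means no MongoDB capture
    # Fall back to the documented defaults when a value is unset.
    database = cfg.get("database") or "sensorData"
    collection = cfg.get("collection") or "readings"
    return (uri, database, collection)
```

Applied to the sample configuration above, which sets only uri, the resolved database is sensorData and the resolved collection is readings.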

Conditional sync

By default, viam-server checks for new data to sync at the configured interval (sync_interval_mins). You can additionally configure sync to only happen when certain conditions are met. For example:

  • Only sync when on WiFi
  • Sync when specific events are detected
  • Sync during certain time windows

See Conditional cloud sync for how to implement conditional syncs.

Cloud data retention

Configure how long your synced data remains stored in the cloud:

  • Retain data up to a certain size (for example, 100GB) or for a specific length of time (for example, 14 days): Set retention_policy at the resource level. See the retention_policy field in data capture configuration attributes.
  • Delete data captured by a machine when you delete the machine: Control whether your cloud data is deleted when a machine or machine part is removed. See the delete_data_on_part_deletion field in the data management service configuration attributes.

Sync optimization

Configurable sync threads: You can control how many concurrent sync operations occur by adjusting the maximum_num_sync_threads setting. Higher values may improve throughput on more powerful hardware, but raising it too high may introduce instability on resource-constrained devices.

Wait time before syncing arbitrary files: If you choose to sync arbitrary files (beyond those captured by the data management service), the file_last_modified_millis configuration attribute specifies how long a file must remain unmodified before the data manager considers it for syncing. The default is 10 seconds.
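
The wait-time rule amounts to comparing a file's age since last modification against the threshold. The following Python sketch shows that comparison. It is not the data manager's actual code, and the function names are hypothetical.

```python
import os
import time


def is_old_enough(mtime_seconds: float, now_seconds: float,
                  file_last_modified_millis: float = 10000) -> bool:
    """Return True if the file has been unmodified for at least the threshold."""
    age_millis = (now_seconds - mtime_seconds) * 1000
    return age_millis >= file_last_modified_millis


def ready_to_sync(path: str, file_last_modified_millis: float = 10000) -> bool:
    """Apply the same check to a file on disk using its modification time."""
    return is_old_enough(os.path.getmtime(path), time.time(),
                         file_last_modified_millis)
```

A longer threshold can reduce the chance of syncing a file that another process is still writing, at the cost of a delay before the file appears in the cloud.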