Edge AI with Cumulocity: Bringing Intelligence Closer to the Source

Authors: @Kanishk_Chaturvedi @Tobias_Sommer

Introduction

Artificial Intelligence is rapidly moving beyond the confines of centralized cloud data centers. As industrial systems, sensors, and connected products continue to generate massive volumes of data, it becomes increasingly inefficient — and sometimes impractical — to send all that data to the cloud for analysis.
This is where Edge AI comes in.

Edge AI refers to deploying and executing AI models directly on edge devices — such as gateways, cameras, or embedded systems — located near the data source. This approach offers several advantages:

  • Low latency: Inference happens locally, enabling real-time decision-making.
  • Reduced bandwidth usage: Only insights or anomalies are sent to the cloud, not raw data.
  • Improved privacy and security: Sensitive data can stay within local boundaries.
  • Higher resilience: Systems can continue operating even when offline or disconnected.

Cumulocity provides a robust foundation for realizing these Edge AI scenarios. With capabilities for device management, model versioning, remote software deployment, and secure connectivity, Cumulocity enables enterprises to operationalize AI models seamlessly across their device fleets — regardless of use case or hardware.

After our earlier demonstration of EdgeMLOps for Vision AI, this article explores how generic Edge AI use cases (like anomaly detection) can be achieved using the same framework — empowering industries to detect early signs of failure, optimize performance, and automate operations directly at the edge.

Architecture

At the heart of Edge AI with Cumulocity lies a simple yet powerful architecture that bridges AI models, cloud management, and edge execution.

This architecture ensures that:

  • Models are centrally managed in the cloud (Cumulocity)
  • Devices are securely connected and controlled through thin-edge.io
  • AI models are executed locally on the edge hardware for immediate insights

Example: Anomaly Detection at the Edge

Let’s explore an example where we use this setup to detect anomalies in vibration data — a typical requirement for predictive maintenance or equipment monitoring.

Imagine a motor or pump continuously streaming vibration signals along three axes: Vibration_X, Vibration_Y, and Vibration_Z. Normally, these signals follow predictable patterns. However, sudden spikes, drops, or irregular patterns could indicate potential mechanical issues or early-stage faults.

An anomaly detection model can continuously analyze this data to detect when the system’s behavior deviates from the normal range.

Step 1: Prepare the Model

Anomaly detection models are designed to identify unusual patterns that differ from the system’s normal operating behavior.

They can be unsupervised (trained only on normal data) or semi-supervised (using both normal and abnormal examples).

Many pre-trained models are already available on public repositories like Hugging Face, covering domains such as vibration monitoring, temperature sensing, and acoustic anomaly detection. However, organizations often build custom models tailored to their specific signals and operational environments.

Custom models can be developed using frameworks such as TensorFlow, PyTorch, or scikit-learn.

In this exercise, we trained a tiny 1D Convolutional Autoencoder using PyTorch to detect anomalies in vibration data (Vibration_X, Vibration_Y, Vibration_Z).

The autoencoder learns to reconstruct “normal” vibration signals. When it encounters a signal it cannot reconstruct accurately (i.e., high reconstruction error), it flags it as an anomaly.

This approach is lightweight, explainable, and ideal for running directly on edge hardware.
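As a rough illustration of what such a model can look like, here is a minimal sketch of a tiny 1D convolutional autoencoder in PyTorch. The layer sizes, window length of 64 samples, and variable names are illustrative assumptions, not the exact model used in this exercise:

```python
import torch
import torch.nn as nn

class VibrationAutoencoder(nn.Module):
    """Tiny 1D convolutional autoencoder for 3-axis vibration windows."""

    def __init__(self):
        super().__init__()
        # Encoder compresses the 3 axes (X, Y, Z) into a small latent map.
        self.encoder = nn.Sequential(
            nn.Conv1d(3, 8, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(8, 4, kernel_size=5, padding=2), nn.ReLU(),
        )
        # Decoder tries to reconstruct the original 3-axis signal.
        self.decoder = nn.Sequential(
            nn.Conv1d(4, 8, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(8, 3, kernel_size=5, padding=2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

model = VibrationAutoencoder()
batch = torch.randn(16, 3, 64)  # (batch, axes, samples per window)
recon = model(batch)
# Training would minimize this reconstruction error on normal data only.
loss = nn.functional.mse_loss(recon, batch)
```

During training, only "normal" vibration windows are fed in, so the network never learns to reproduce faulty patterns; at inference time, a high per-window loss is the anomaly signal.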

Step 2: Export to ONNX

Once trained, the model is exported to ONNX (Open Neural Network Exchange) format — a standardized representation that allows models from different frameworks (like PyTorch, TensorFlow, or Keras) to run on any ONNX-compatible runtime.

Why ONNX?

  • Interoperability: Train in any framework and deploy anywhere.
  • Portability: Same model can run across different hardware (Jetson, x86, ARM, etc.).
  • Optimization: Supports quantization, pruning, and graph optimization for faster inference on edge devices.
  • Lightweight Execution: ONNX Runtime provides efficient execution even on resource-limited devices.

Step 3: Manage the Model in Cumulocity

The exported ONNX model is uploaded to the Cumulocity repository, where it becomes part of a version-controlled asset library.

Cumulocity provides:

  • Centralized storage of models (ONNX, TensorRT, TFLite, etc.)
  • Metadata management (model type, input shape, accuracy, training date)
  • Version control for continuous improvement (v1.0, v1.1, v2.0, etc.)
  • Deployment grouping (deploy model X to all “Factory Line A” devices)

This ensures that all models are traceable, auditable, and securely distributed across edge nodes.
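Uploading can also be scripted against Cumulocity's REST API. The sketch below uses the `/inventory/binaries` endpoint with a multipart request; the tenant URL, credentials, the `modelVersion` fragment, and the `upload_model` helper are illustrative assumptions:

```python
import json
import requests  # third-party; pip install requests

C8Y_BASE = "https://<tenant>.cumulocity.com"  # placeholder tenant URL
AUTH = ("<tenant>/<user>", "<password>")      # placeholder credentials

def upload_model(path: str, name: str, version: str) -> dict:
    """Upload an ONNX model as a binary managed object (hypothetical helper)."""
    # Custom fragments like "modelVersion" are free-form metadata on the
    # managed object; choose whatever your deployment pipeline expects.
    metadata = {
        "name": name,
        "type": "application/octet-stream",
        "modelVersion": version,
    }
    with open(path, "rb") as f:
        resp = requests.post(
            f"{C8Y_BASE}/inventory/binaries",
            auth=AUTH,
            files={
                "object": (None, json.dumps(metadata), "application/json"),
                "file": (name, f, "application/octet-stream"),
            },
        )
    resp.raise_for_status()
    return resp.json()
```

A CI pipeline can call such a helper after every successful training run, so each model version lands in the repository automatically.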

Step 4: Connect the Edge Device via thin-edge.io

The target edge device — in this case, an NVIDIA Jetson Nano — runs thin-edge.io, a lightweight open-source agent that bridges Cumulocity and the device.

Once installed and connected:

  • The device registers itself on the Cumulocity platform.
  • It reports its status, capabilities, and resource availability.
  • It becomes eligible for receiving AI model updates and configuration changes.
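Connecting a device typically takes only a few tedge CLI commands. The tenant URL, user, and device ID below are placeholders; check the thin-edge.io documentation for your version:

```shell
# Point the device at your Cumulocity tenant (placeholder URL).
sudo tedge config set c8y.url "<tenant>.cumulocity.com"

# Create a device certificate and upload it to the tenant.
sudo tedge cert create --device-id jetson-nano-01
sudo tedge cert upload c8y --user "<c8y-user>"

# Establish the bridge between the device and Cumulocity.
sudo tedge connect c8y
```

After `tedge connect c8y` succeeds, the device appears in the Cumulocity device list and is ready for software deployments.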

Step 5: Deploy the Model

From the Cumulocity UI, the AI model package (ONNX file) can be deployed to the connected device using the software management feature.

thin-edge.io automatically:

  • Downloads and installs the model
  • Invokes the local runtime (ONNX Runtime in our example, but it could also be TensorRT or a custom Python service)
  • Starts inferencing on live vibration data streams

This process is fully automated and scalable — whether for one device or hundreds.

Step 6: Perform On-Device Inferencing

The edge device now performs real-time inference on vibration data.

Whenever the model detects abnormal behavior — such as an unexpected spike in vibration magnitude or signal deviation — it flags an anomaly event.

As seen in the screenshot, the anomaly result can then be:

  • Sent to Cumulocity as an alarm
  • Visualized on dashboards or widgets
  • Used to trigger Smart Rules or further actions (like sending maintenance alerts)
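The inference loop on the device boils down to computing the reconstruction error and comparing it against a threshold. The sketch below assumes the model was exported with input name `vibration`; the threshold value and helper names are illustrative, and the threshold would be calibrated on held-out normal data:

```python
import numpy as np

THRESHOLD = 0.05  # hypothetical value; calibrate on held-out normal data

def reconstruction_error(window: np.ndarray, recon: np.ndarray) -> float:
    """Mean squared error between a vibration window and its reconstruction."""
    return float(np.mean((window - recon) ** 2))

def is_anomaly(session, window: np.ndarray) -> bool:
    """Run the ONNX model; flag windows the autoencoder cannot reconstruct."""
    recon = session.run(None, {"vibration": window.astype(np.float32)})[0]
    return reconstruction_error(window, recon) > THRESHOLD

# Typical use on the device (assumes onnxruntime and the deployed model):
#   import onnxruntime as ort
#   session = ort.InferenceSession("vibration_autoencoder.onnx")
#   if is_anomaly(session, window):
#       # Raise an alarm through the local thin-edge.io agent, e.g.:
#       #   tedge mqtt pub 'te/device/main///a/vibration_anomaly' \
#       #     '{"text": "Vibration anomaly detected", "severity": "major"}'
#       ...
```

Publishing the alarm over the local MQTT broker lets thin-edge.io forward it to Cumulocity, where it appears as a regular alarm and can drive Smart Rules; the exact alarm topic depends on your thin-edge.io version.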

Step 7: Continuous Improvement (Edge MLOps)

AI models are never static. As new data becomes available, models can be retrained, improved, and versioned.

Cumulocity supports an Edge MLOps lifecycle, where improved versions of models can be uploaded to the AI Repository and pushed to edge devices as updates — ensuring continuous learning and adaptation.

This workflow closes the loop:
Train → Export → Manage → Deploy → Infer → Monitor → Retrain → Redeploy

Conclusion

Cumulocity makes it possible to realize end-to-end Edge AI workflows — from model management in the cloud to real-time inference on the edge.

Whether it’s computer vision, anomaly detection, predictive maintenance, or acoustic monitoring, the same principles apply:

  • Centralize model governance in the cloud
  • Securely connect devices via thin-edge.io
  • Push the right model to the right device
  • Run inference locally and report only insights

The result is a flexible, scalable, and hardware-agnostic Edge AI framework — one that accelerates innovation while reducing latency, bandwidth, and operational costs.

With Cumulocity, AI doesn’t just live in the cloud — it thrives at the edge.
