Video · Cloud AI on clips

On-device NPU today. Cloud AI on the bytes you upload.

mos-roi-shader and mos-ai-runtime do triage on the box; a Munic cloud product runs heavier per-clip AI on uploaded segments and merges with the lifecycle event stream — driver behaviour, DTC, OBD position, GNSS pose. The clip is GDPR-clean by the time the cloud sees it.

How it works

Two AI tiers. One lifecycle event stream.

Capture with GDPR live blur feeds on-device NPU triage (mos-roi-shader and mos-ai-runtime) and the recorder segment. The segment is uploaded via SFTP as GDPR-clean bytes to cloud AI. Driver behaviour, DTC/OBD position, and GNSS pose feed the lifecycle event stream, which merges with cloud AI per-clip inference and outputs to the fleet dashboard.

flowchart TD
  CAP[Capture<br/>GDPR live blur at capture time] --> NPU[NPU triage<br/>on-device mos-roi-shader + mos-ai-runtime]
  CAP --> REC[Recorder segment<br/>anonymised clip]
  REC --> UPL[SFTP upload<br/>GDPR-clean bytes]
  UPL --> CAI[Cloud AI<br/>per-clip inference]
  DRV[Driver behaviour] --> LS[Lifecycle event stream]
  OBD[DTC / OBD position] --> LS
  GNS[GNSS pose] --> LS
  LS --> CAI
  CAI --> DSH[Fleet dashboard]

On-device triage runs before a byte leaves the box — Vulkan ROI extraction and NPU inference discard irrelevant frames. What reaches the cloud is already anonymised and already filtered. Cloud AI runs heavier per-clip models and merges with the AI Funnel lifecycle event stream for fleet-scale driver behaviour and incident correlation.
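The triage gate described above can be sketched as a confidence filter. The `npu_score` field and the threshold value are illustrative assumptions, not the mos-roi-shader / mos-ai-runtime internals:

```python
from dataclasses import dataclass

@dataclass
class Frame:
    index: int
    npu_score: float  # hypothetical per-frame relevance score from on-device NPU triage

def triage(frames, threshold=0.5):
    """Keep only frames the on-device model considers relevant.

    Everything below the threshold is discarded before upload, so the
    cloud tier only ever sees filtered, already-anonymised segments.
    """
    return [f for f in frames if f.npu_score >= threshold]

frames = [Frame(0, 0.1), Frame(1, 0.9), Frame(2, 0.7), Frame(3, 0.2)]
kept = triage(frames)  # keeps frames 1 and 2
```

The point of the sketch is the ordering: the filter runs on the box, so the discarded frames never become upload bytes at all.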

  • 2 AI tiers: on-device NPU triage + cloud per-clip inference
  • 0 CPU pixel copies in the AI hot path: dmabuf-to-NPU zero-copy on Qualcomm via rpcmem/ION
  • ~100 TOPS AI-class silicon: QCS6490 · iMX8M Plus NPU slices
  • 0 lines of integration code for the AI Funnel pipeline: lifecycle event correlation ships with the microservice

What you do not write

The infrastructure MOS4 ships.

service mos-roi-shader

Vulkan ROI extraction with dmabuf-to-NPU zero-copy on Qualcomm via rpcmem/ION — no CPU pixel copies in the AI hot path. Regions of interest are extracted and delivered to the NPU without leaving the accelerator memory domain.
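Geometrically, ROI extraction reduces to padding a detection box, clamping it to the frame, and snapping the crop onto an alignment grid the accelerator can consume without a CPU repack. A minimal sketch — the padding and alignment values are illustrative, not mos-roi-shader defaults:

```python
def roi_from_detection(box, frame_w, frame_h, pad=16, align=8):
    """Expand a detection box (x0, y0, x1, y1) by `pad` pixels, clamp it
    to the frame, and snap origin down / extent up to `align`-pixel
    boundaries so the crop maps cleanly onto accelerator memory."""
    x0, y0, x1, y1 = box
    x0 = max(0, x0 - pad); y0 = max(0, y0 - pad)
    x1 = min(frame_w, x1 + pad); y1 = min(frame_h, y1 + pad)
    x0 -= x0 % align; y0 -= y0 % align          # origin down to the grid
    w = -(-(x1 - x0) // align) * align          # extent up to the grid
    h = -(-(y1 - y0) // align) * align
    w = min(w, frame_w - x0); h = min(h, frame_h - y0)  # stay inside the frame
    return (x0, y0, w, h)

roi_from_detection((100, 50, 200, 150), 1920, 1080)  # -> (80, 32, 136, 136)
```

The zero-copy part is what the sketch cannot show: on Qualcomm the resulting crop stays in the dmabuf/ION memory domain end to end.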

service mos-ai-runtime

.tflite NPU inference on iMX8M Plus and QCS6490 — multi-model concurrent execution across NPU slices, managed by the microservice. Bring your .tflite model file; the runtime handles scheduling, memory, and result delivery.

pipeline AI Funnel

Cloud retraining, quantisation, and OTA model delivery — managed by the AI Funnel pipeline. Retrained models land on the device without a firmware update cycle. No integration code is required to wire the feedback loop.

runtime lifecycle event correlation

TimeCorrelator attaches a wall_clock_trustworthy flag to each lifecycle event — driver behaviour, DTC, OBD position, GNSS pose. Cloud AI uses the flag to reject ambiguous timestamp matches before merging with clip inference results. GDPR posture is documented at /platform/compliance.
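The correlation rule can be sketched as a nearest-event match that refuses ambiguity. The tolerance value and the event tuple shape are illustrative assumptions, not the TimeCorrelator contract:

```python
def match_clip(clip_ts, events, tolerance=2.0):
    """Match a clip timestamp to at most one lifecycle event.

    Events are (timestamp, wall_clock_trustworthy, payload) tuples
    (illustrative shape). Events without a trustworthy wall clock are
    skipped outright, and two trusted candidates inside the tolerance
    window count as ambiguous -> no match, mirroring the
    reject-ambiguous rule described above.
    """
    candidates = [e for e in events
                  if e[1] and abs(e[0] - clip_ts) <= tolerance]
    if len(candidates) != 1:
        return None  # no candidate, or an ambiguous pair
    return candidates[0][2]
```

Returning nothing on ambiguity is the design choice that matters here: a missed merge is recoverable, a wrong clip-to-event attachment is not.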

Scope boundary

A specific Munic cloud product provides the per-clip inference and lifecycle merge described here. Product name, pricing, and cloud architecture are out of scope for this page — that conversation starts with engineering. Use the CTA below to connect.

FAQ

Frequently asked questions

  • Where does AI actually run — on-device or in the cloud?

    Both tiers work together. On-device, mos-roi-shader (Vulkan ROI extraction, dmabuf-to-NPU zero-copy) and mos-ai-runtime (.tflite inference on NPU) run triage directly on the box — discarding irrelevant frames before they leave the device. A Munic cloud product then runs heavier per-clip inference on the uploaded segments. The two tiers are complementary; you do not have to choose one.

  • How do I tie a clip to a hard-brake or DTC event?

    Matching is done cloud-side via timestamp correlation. The lifecycle event stream carries driver behaviour signals, DTC codes, OBD position, and GNSS pose with wall-clock timestamps. The TimeCorrelator component attaches a wall_clock_trustworthy flag to each event so the cloud can reject ambiguous correlations. You do not write the correlation logic — the AI Funnel pipeline handles it.

  • Is the clip GDPR-compliant by the time the cloud sees it?

    Yes. Anonymisation happens at capture time on the device, before the segment is written to disk. The cloud never receives raw pixel data containing personally identifiable information. See the /platform/compliance page for the full GDPR posture — blur pipeline, retention policy enforcement, and audit trail.

  • Do I write any AI runtime code?

    No. mos-ai-runtime ships the on-device .tflite inference engine — multi-model concurrent execution across NPU slices is handled by the microservice. The AI Funnel pipeline covers cloud retraining, quantisation, and OTA model delivery. You supply the model file and the integration point; MOS4 owns the runtime.

  • Can I deploy a custom AI model?

    Yes. mos-ai-runtime accepts .tflite models and supports multi-model concurrent execution across NPU slices on supported silicon (iMX8M Plus, QCS6490). You bring the model; the microservice manages scheduling, dmabuf-to-NPU zero-copy, and result delivery to the AI Funnel pipeline. Talk to engineering for silicon-specific quantisation guidance.

Two AI tiers. Zero integration code for the runtime.

On-device NPU triage filters before upload; a Munic cloud product runs per-clip AI and merges with the lifecycle event stream. Talk to engineering to scope the model and the silicon.