📊 Distributed Asset Monitoring System

Real-time observability for geographically distributed infrastructure, IoT fleets, and hybrid cloud assets.

The Distributed Asset Monitoring System (DAMS) provides a unified platform to collect, process, and visualize metrics from thousands of assets spread across multiple regions. It supports agent-based and agentless monitoring, anomaly detection, and automated incident response.

🌍 multi-region ⚡ low latency 🔒 zero-trust 📈 prometheus-compatible
Latest: v2.4 introduces distributed tracing and enhanced edge aggregation. See installation guide.

🏗️ Architecture

DAMS follows a hub-and-spoke model with regional aggregators that preprocess data before forwarding to the central analytics engine.

┌──────────────┐     ┌────────────────┐     ┌─────────────────┐
│  Edge Agents │────▶│ Regional Hub   │────▶│ Central Brain   │
│  (assets)    │     │ (aggregator)   │     │ (ML + storage)  │
└──────────────┘     └────────────────┘     └─────────────────┘
                                                    │
                                            ┌───────▼────────┐
                                            │  Dashboard &   │
                                            │  Alert Manager │
                                            └────────────────┘

Data flows through gRPC streams secured with mTLS. Regional hubs buffer telemetry during network partitions and synchronize once connectivity resumes.
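
The buffering behavior described above can be illustrated with a minimal sketch. It is not the hub's actual implementation: the `PartitionBuffer` class, its `send` callback, and the `max_items` limit are hypothetical names chosen for illustration, and the drop-oldest policy is an assumption.

```python
from collections import deque
from typing import Callable, Deque


class PartitionBuffer:
    """Bounded FIFO buffer that holds telemetry while the uplink is down."""

    def __init__(self, send: Callable[[dict], bool], max_items: int = 10_000) -> None:
        # send() is a hypothetical callback that returns True once the
        # central side has accepted the record.
        self._send = send
        # Assumption: when the buffer is full, the oldest records are dropped.
        self._pending: Deque[dict] = deque(maxlen=max_items)

    def submit(self, record: dict) -> None:
        """Queue a record, then try to drain the queue in arrival order."""
        self._pending.append(record)
        self.flush()

    def flush(self) -> int:
        """Send queued records until one fails; return how many were delivered."""
        delivered = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # uplink still down: keep the record for the next attempt
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self) -> int:
        return len(self._pending)
```

A hub built this way would call `flush()` again whenever connectivity is reported as restored, so records arrive in the order they were collected.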

🔹 Edge Layer

Lightweight agents written in Rust collect CPU, memory, disk, and custom metrics.

🔹 Regional Hub

Aggregates, deduplicates, and compresses data; runs local anomaly checks.

🔹 Central Brain

Timeseries DB (TimescaleDB) + Apache Kafka for streaming.

🔹 UI & API

React dashboard + REST/gRPC API for queries and management.

🧩 Core Components

1. Asset Agent

Runs on each monitored node. Supports Linux and Windows hosts, including ARM boards such as the Raspberry Pi. Exposes a local health endpoint on port 9091.

2. Regional Collector

Deployed per region or logical cluster. Accepts agent streams and forwards compressed batches every 10 seconds.
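
As a rough illustration of what one forwarding cycle might do, the sketch below deduplicates a window of samples and packs them into a single compressed payload. The sample fields (`asset_id`, `metric`, `ts`, `value`), the JSON encoding, and gzip are all assumptions made for this example; the collector's real wire format is not documented here.

```python
import gzip
import json
from typing import Dict, Iterable, List, Tuple


def build_batch(samples: Iterable[dict]) -> bytes:
    """Deduplicate samples and return one gzip-compressed JSON payload."""
    unique: Dict[Tuple[str, str, int], dict] = {}
    for sample in samples:
        # Two samples with the same asset, metric, and timestamp count as duplicates.
        key = (sample["asset_id"], sample["metric"], sample["ts"])
        unique[key] = sample  # a later duplicate overwrites the earlier one
    ordered: List[dict] = [unique[k] for k in sorted(unique)]
    raw = json.dumps(ordered, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def read_batch(payload: bytes) -> List[dict]:
    """Inverse of build_batch, as a receiving side might apply it."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))
```

In this sketch a timer would call `build_batch` once per 10-second window and hand the resulting bytes to the uplink.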

3. Central Storage & Query Engine

Uses TimescaleDB for metrics and ClickHouse for log-derived data. 95th-percentile query latency is under 50 ms.
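
To check the latency figure against your own measurements, a 95th percentile can be computed from recorded query times. The sketch below uses the nearest-rank method, which is one of several percentile definitions and is not necessarily the one behind the number quoted above; the function names are illustrative.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Return the pct-th percentile using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_target(samples_ms: Sequence[float], target_ms: float = 50.0) -> bool:
    """True if the 95th-percentile latency is below target_ms."""
    return percentile_nearest_rank(samples_ms, 95) < target_ms
```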

4. Policy Engine

Evaluates rules (e.g., CPU above 90% for 5 minutes) and triggers alerts via email, Slack, PagerDuty, or webhook.
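
A rule with a duration only fires when the condition holds continuously for that long. The following sketch shows one way to evaluate such a rule over timestamped samples. It assumes samples arrive in ascending time order and is not the Policy Engine's actual algorithm; `breached_for` is a hypothetical helper.

```python
from typing import Iterable, Optional, Tuple


def breached_for(
    samples: Iterable[Tuple[float, float]],
    threshold: float,
    duration_s: float,
) -> bool:
    """Return True if the value stays above threshold for at least duration_s.

    samples are (unix_timestamp, value) pairs in ascending time order.
    """
    breach_start: Optional[float] = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts  # first sample of a new breach
            if ts - breach_start >= duration_s:
                return True
        else:
            breach_start = None  # any dip below the threshold resets the clock
    return False
```

For the example rule, the call would be `breached_for(samples, threshold=90, duration_s=300)`.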

📥 Installation

Quick start with Docker Compose (recommended for evaluation):

git clone https://github.com/dams-project/dams.git
cd dams/deploy
docker compose up -d

For production, use the Helm chart (Kubernetes):

helm repo add dams https://charts.dams.io
helm repo update
helm install dams-central dams/central --namespace dams --create-namespace
helm install dams-regional dams/regional --namespace dams --create-namespace --set region=eu-west-1
⚠️ Ensure dams-regional has network access to central Kafka brokers and the asset agents.

⚙️ Configuration

Central configuration file (dams-central.yaml):

central:
  listen: 0.0.0.0:8443
  tls:
    cert_file: /etc/dams/certs/server.crt
    key_file: /etc/dams/certs/server.key
storage:
  type: timescaledb
  connection: postgres://dams:secret@timescale:5432/dams
alerting:
  slack_webhook: https://hooks.slack.com/services/...
  pagerduty_key: YOUR_PD_KEY

Regional hub config overrides thresholds and retention per location.
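
One common way to apply per-location overrides is a recursive merge in which regional values win over central defaults. The sketch below illustrates the idea only: the `thresholds` and `retention` keys in the usage example are hypothetical and do not reflect a documented schema.

```python
from typing import Any, Dict


def merge_config(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which values from override replace those in base.

    Nested dicts are merged key by key; any other value is replaced outright.
    Sections that are not overridden are shared with base, not copied.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical central defaults and a regional override.
central_defaults = {
    "thresholds": {"cpu_percent": 90, "disk_used_percent": 95},
    "retention": {"raw_days": 30},
}
regional_override = {"thresholds": {"cpu_percent": 80}}
effective = merge_config(central_defaults, regional_override)
```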

📡 API Reference

REST API base: https://central.dams.io/api/v2

GET /assets

List all registered assets with health status.

curl -H "Authorization: Bearer $TOKEN" \
  https://central.dams.io/api/v2/assets

GET /metrics/{asset_id}

Retrieve real-time metrics for a specific asset.

curl https://central.dams.io/api/v2/metrics/node-42 \
  -H "Authorization: Bearer $TOKEN"

POST /alerts/silence

Silence an active alert for a defined duration.

curl -X POST https://central.dams.io/api/v2/alerts/silence \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"alert_id":"abc123","duration":"2h"}'
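
The same calls can be scripted. The sketch below builds the requests with Python's standard library, using the base URL and paths from this reference. It assumes every endpoint accepts the same bearer token; the helper names are illustrative, and because response schemas are not described here, `call` returns the raw body.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://central.dams.io/api/v2"  # base URL from this reference


def build_request(
    method: str, path: str, token: str, body: Optional[dict] = None
) -> urllib.request.Request:
    """Build an authenticated request without sending it."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}{path}", data=data, headers=headers, method=method
    )


def call(req: urllib.request.Request, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


# "example-token" is a placeholder; substitute a real token before calling call().
list_assets = build_request("GET", "/assets", token="example-token")
node_metrics = build_request("GET", "/metrics/node-42", token="example-token")
silence = build_request(
    "POST",
    "/alerts/silence",
    token="example-token",
    body={"alert_id": "abc123", "duration": "2h"},
)
```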

🖥️ Monitoring Dashboard

The web dashboard provides real-time maps, asset health timelines, and customizable widgets. Access it at https://central.dams.io after deployment.

🗺️ Geo Map

Assets plotted by region with color-coded status.

📊 Timeseries

CPU, memory, disk, network charts with zoom.

🔔 Alert Feed

Live stream of firing and resolved alerts.

🧩 Custom Dashboards

Drag-and-drop widgets for team views.

🚨 Alerting & Policies

Define alert rules in policies.yaml or via the UI. Example rule:

rules:
  - name: high-cpu-warning
    condition: avg(cpu_percent) > 90
    duration: 5m
    severity: warning
    channels: [slack, email]
  - name: disk-full-critical
    condition: disk_used_percent > 95
    severity: critical
    channels: [pagerduty]

Alerts are automatically deduplicated and grouped by asset region.
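
Deduplication and grouping could work roughly as sketched below. The alert fields (`rule`, `asset_id`, `region`) and the choice of fingerprint are assumptions for illustration; DAMS may identify duplicates differently.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def group_alerts(alerts: Iterable[dict]) -> Dict[str, List[dict]]:
    """Drop repeated alerts and group the remainder by region."""
    seen: Set[Tuple[str, str]] = set()
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for alert in alerts:
        # Assumption: the same rule firing on the same asset is a duplicate.
        fingerprint = (alert["rule"], alert["asset_id"])
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        grouped[alert["region"]].append(alert)
    return dict(grouped)
```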

❓ FAQ

What operating systems are supported for agents?

Linux (amd64, arm64, and ARMv7, including Raspberry Pi) and Windows Server 2019+ are fully supported. macOS builds are intended for development only.

Can I integrate with existing Prometheus setups?

Yes. DAMS exposes a Prometheus-compatible /metrics endpoint and can scrape external Prometheus instances.
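
For orientation, a Prometheus-compatible endpoint serves plain text in the Prometheus exposition format. The sketch below renders one gauge in that format. The metric name `dams_cpu_percent` and the `asset_id` label are hypothetical and may not match what DAMS actually exports.

```python
from typing import Dict


def render_gauge(
    name: str, help_text: str, samples: Dict[str, float], label: str = "asset_id"
) -> str:
    """Render one gauge metric in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for label_value, value in sorted(samples.items()):
        # Label values must escape backslashes, double quotes, and newlines.
        escaped = (
            label_value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        lines.append(f'{name}{{{label}="{escaped}"}} {value}')
    return "\n".join(lines) + "\n"
```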

How is data secured in transit?

All communication uses mutual TLS (mTLS) over TLS 1.3 with short-lived certificates. Agents authenticate via SPIFFE IDs.
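
As a generic illustration of what "mutual TLS over TLS 1.3" means on the server side, the sketch below configures a Python `ssl` context that refuses older protocol versions and requires a client certificate. It does not reproduce DAMS internals or SPIFFE ID validation, and the file paths in the usage note are examples.

```python
import ssl
from typing import Optional


def server_mtls_context(
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
    ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a server-side context that enforces TLS 1.3 and client certificates."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3  # reject TLS 1.2 and older
    ctx.verify_mode = ssl.CERT_REQUIRED           # the client must present a certificate
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        # CA bundle used to verify client (agent) certificates.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

With the certificate paths from the central configuration, a call might look like `server_mtls_context("/etc/dams/certs/server.crt", "/etc/dams/certs/server.key", ca_file=...)`, where the CA bundle location depends on your deployment.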

What is the retention policy?

Default: 30 days raw metrics, 1 year downsampled. Configurable per regional hub.
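
To make the two retention tiers concrete, the sketch below keeps points inside the 30-day raw window untouched and averages older points into buckets. The one-hour bucket size and the use of a mean are assumptions; the actual downsampling resolution and aggregation function are not stated here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

RAW_RETENTION_S = 30 * 24 * 3600  # 30 days of raw metrics (documented default)
BUCKET_S = 3600                   # assumed downsampling resolution: one hour

Point = Tuple[int, float]  # (unix_timestamp, value)


def downsample(points: Sequence[Point]) -> List[Point]:
    """Average points into fixed-size buckets keyed by bucket start time."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % BUCKET_S].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


def split_by_age(points: Sequence[Point], now: int) -> Tuple[List[Point], List[Point]]:
    """Return (raw points inside the window, downsampled older points)."""
    cutoff = now - RAW_RETENTION_S
    raw = [(ts, v) for ts, v in points if ts >= cutoff]
    old = [(ts, v) for ts, v in points if ts < cutoff]
    return raw, downsample(old)
```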