Distributed Asset Monitoring System
Real-time observability for geographically distributed infrastructure, IoT fleets, and hybrid cloud assets.
The Distributed Asset Monitoring System (DAMS) provides a unified platform to collect, process, and visualize metrics from thousands of assets spread across multiple regions. It supports agent-based and agentless monitoring, anomaly detection, and automated incident response.
Architecture
DAMS follows a hub-and-spoke model with regional aggregators that preprocess data before forwarding to the central analytics engine.
┌──────────────┐     ┌────────────────┐     ┌─────────────────┐
│ Edge Agents  │────▶│ Regional Hub   │────▶│ Central Brain   │
│ (assets)     │     │ (aggregator)   │     │ (ML + storage)  │
└──────────────┘     └────────────────┘     └────────┬────────┘
                                                     │
                                            ┌────────▼────────┐
                                            │ Dashboard &     │
                                            │ Alert Manager   │
                                            └─────────────────┘
Data flows through gRPC streams secured with mTLS. Regional hubs buffer telemetry during network partitions and synchronize once connectivity resumes.
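The buffering behaviour described above can be illustrated with a minimal sketch. It is not the DAMS implementation: the `HubBuffer` class, the `send` callback, and the `max_samples` limit are hypothetical names chosen for illustration, and dropping the oldest samples when the buffer is full is an assumed policy.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Sample = Dict[str, object]


class HubBuffer:
    """Bounded FIFO buffer that holds telemetry while the uplink is down.

    Hypothetical sketch: `send` stands in for the hub-to-central stream and
    is expected to raise ConnectionError during a network partition.
    """

    def __init__(self, send: Callable[[List[Sample]], None], max_samples: int = 10_000) -> None:
        self._send = send
        # deque(maxlen=...) discards the oldest sample once the limit is hit.
        self._pending: Deque[Sample] = deque(maxlen=max_samples)

    def publish(self, sample: Sample) -> None:
        """Queue the sample, then try to drain the queue in arrival order."""
        self._pending.append(sample)
        self.flush()

    def flush(self) -> bool:
        """Send everything pending; keep it buffered if the uplink fails."""
        if not self._pending:
            return True
        batch = list(self._pending)
        try:
            self._send(batch)
        except ConnectionError:
            return False  # still partitioned; retry on the next publish or flush
        self._pending.clear()
        return True

    def __len__(self) -> int:
        return len(self._pending)
```

Calling `flush()` from a reconnect handler would synchronize the backlog as soon as connectivity resumes.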
Edge Layer
Lightweight agents (Rust) collecting CPU, memory, disk, custom metrics.
Regional Hub
Aggregates, deduplicates, and compresses data; runs local anomaly checks.
Central Brain
Timeseries DB (TimescaleDB) + Apache Kafka for streaming.
UI & API
React dashboard + REST/gRPC API for queries and management.
Core Components
1. Asset Agent
Runs on each monitored node. Supports Linux, Windows, ARM (Raspberry Pi). Exposes a local health endpoint on :9091.
2. Regional Collector
Deployed per region or logical cluster. Accepts agent streams and forwards compressed batches every 10 seconds.
3. Central Storage & Query Engine
Uses TimescaleDB for metrics and ClickHouse for log-derived data. 95th-percentile query latency is under 50 ms.
4. Policy Engine
Evaluates rules (e.g., cpu > 90% for 5min) and triggers alerts via email, Slack, PagerDuty, or webhook.
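A rule such as "cpu > 90% for 5min" only fires when the breach is sustained. The sketch below shows one way to express that check; the function name `sustained_breach` and its signature are hypothetical and do not describe the actual Policy Engine internals.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


def sustained_breach(
    samples: Iterable[Tuple[datetime, float]],
    threshold: float,
    duration: timedelta,
) -> bool:
    """Return True if values stayed above `threshold` for at least `duration`.

    `samples` must be ordered by timestamp. Any sample at or below the
    threshold resets the breach window.
    """
    breach_start: Optional[datetime] = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts  # first sample of a new breach window
            if ts - breach_start >= duration:
                return True
        else:
            breach_start = None
    return False
```

A single dip below the threshold restarts the timer, which avoids alerting on short spikes.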
Installation
Quick start with Docker Compose (recommended for evaluation):
git clone https://github.com/dams-project/dams.git
cd dams/deploy
docker compose up -d
For production, use the Helm chart (Kubernetes):
helm repo add dams https://charts.dams.io
helm install dams-central dams/central --namespace dams
helm install dams-regional dams/regional --set region=eu-west-1
Ensure that dams-regional has network access to the central Kafka brokers and to the asset agents.
Configuration
Central configuration file (dams-central.yaml):
central:
  listen: 0.0.0.0:8443
  tls:
    cert_file: /etc/dams/certs/server.crt
    key_file: /etc/dams/certs/server.key
  storage:
    type: timescaledb
    connection: postgres://dams:secret@timescale:5432/dams
  alerting:
    slack_webhook: https://hooks.slack.com/services/...
    pagerduty_key: YOUR_PD_KEY
Regional hub config overrides thresholds and retention per location.
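Per-location overrides can be thought of as a recursive merge of the regional settings over the central defaults. The following is a sketch under that assumption; the `merge_config` helper and the `thresholds` / `retention` keys in the usage note are hypothetical and are not taken from the DAMS schema.

```python
from copy import deepcopy
from typing import Any, Dict


def merge_config(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where `override` values win over `base` values.

    Nested dicts are merged key by key; any other value is replaced whole.
    Neither input is modified.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged
```

For example, merging `{"thresholds": {"cpu_percent": 80}}` over central defaults would change only that one threshold and leave the other defaults, such as retention, untouched.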
API Reference
REST API base: https://central.dams.io/api/v2
GET /assets
List all registered assets with health status.
curl -H "Authorization: Bearer $TOKEN" \
https://central.dams.io/api/v2/assets
GET /metrics/{asset_id}
Retrieve real-time metrics for a specific asset.
curl https://central.dams.io/api/v2/metrics/node-42 \
-H "Authorization: Bearer $TOKEN"
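The same GET requests can be issued from a script. This sketch uses only the Python standard library; the helper names are hypothetical, and it assumes the API returns JSON, which the reference above does not state explicitly.

```python
import json
import urllib.request
from typing import Any
from urllib.parse import quote

BASE_URL = "https://central.dams.io/api/v2"


def build_request(path: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for a path under the API base."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        method="GET",
    )


def asset_metrics_path(asset_id: str) -> str:
    """Return the metrics path for one asset, percent-encoding the ID."""
    return f"/metrics/{quote(asset_id, safe='')}"


def get_json(path: str, token: str, timeout: float = 10.0) -> Any:
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(path, token), timeout=timeout) as resp:
        return json.load(resp)
```

With a valid token, `get_json("/assets", token)` and `get_json(asset_metrics_path("node-42"), token)` correspond to the two curl calls above.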
POST /alerts/silence
Silence an active alert for a defined duration.
curl -X POST https://central.dams.io/api/v2/alerts/silence \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"alert_id":"abc123","duration":"2h"}'
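Before sending a silence request, a client may want to validate the duration string locally. The sketch below assumes durations are written as an integer followed by one of `s`, `m`, `h`, or `d` (as in "2h"); the full set of formats the server accepts is not documented here, and both helper names are hypothetical.

```python
import json
import re
from datetime import timedelta

# Assumed unit suffixes; the server may accept more or fewer than these.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
_DURATION_RE = re.compile(r"(\d+)([smhd])")


def parse_duration(text: str) -> timedelta:
    """Convert a string such as '2h' or '5m' into a timedelta."""
    match = _DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def silence_payload(alert_id: str, duration: str) -> str:
    """Build the JSON body for POST /alerts/silence after validating `duration`."""
    parse_duration(duration)  # raises ValueError on malformed input
    return json.dumps({"alert_id": alert_id, "duration": duration}, separators=(",", ":"))
```

Validating on the client side gives a clearer error than a rejected request, at the cost of possibly being stricter than the server.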
Monitoring Dashboard
The web dashboard provides real-time maps, asset health timelines, and customizable widgets. Access it at https://central.dams.io after deployment.
Geo Map
Assets plotted by region with color-coded status.
Timeseries
CPU, memory, disk, network charts with zoom.
Alert Feed
Live stream of firing and resolved alerts.
Custom Dashboards
Drag-and-drop widgets for team views.
Alerting & Policies
Define alert rules in policies.yaml or via the UI. Example rule:
rules:
  - name: high-cpu-warning
    condition: avg(cpu_percent) > 90
    duration: 5m
    severity: warning
    channels: [slack, email]
  - name: disk-full-critical
    condition: disk_used_percent > 95
    severity: critical
    channels: [pagerduty]
Alerts are automatically deduplicated and grouped by asset region.
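Deduplication and regional grouping could look roughly like the sketch below. The alert fields (`rule`, `asset_id`, `region`) and the choice of `(rule, asset_id)` as the deduplication key are assumptions made for illustration, not the documented behaviour of the Alert Manager.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Alert = Dict[str, str]


def group_alerts(alerts: Iterable[Alert]) -> Dict[str, List[Alert]]:
    """Drop repeated (rule, asset_id) alerts and group the rest by region.

    The first occurrence of each alert is kept; input order is preserved
    within each region.
    """
    seen: Set[Tuple[str, str]] = set()
    grouped: Dict[str, List[Alert]] = defaultdict(list)
    for alert in alerts:
        key = (alert["rule"], alert["asset_id"])
        if key in seen:
            continue  # duplicate firing of the same rule on the same asset
        seen.add(key)
        grouped[alert["region"]].append(alert)
    return dict(grouped)
```

Grouping by region lets a single notification summarize a regional outage instead of sending one message per asset.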
FAQ
What operating systems are supported for agents?
Linux (amd64, arm64), Windows Server 2019+, and macOS (for development). ARMv7 (Raspberry Pi) is fully supported.
Can I integrate with existing Prometheus setups?
Yes. DAMS exposes a Prometheus-compatible /metrics endpoint and can scrape external Prometheus instances.
How is data secured in transit?
All communication uses mutual TLS (mTLS) over TLS 1.3 with short-lived certificates. Agents authenticate via SPIFFE IDs.
What is the retention policy?
Default: 30 days raw metrics, 1 year downsampled. Configurable per regional hub.
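Downsampling means replacing raw samples with one aggregate per time bucket. The sketch below averages samples into fixed-width buckets; the use of a mean and the one-hour bucket in the usage note are assumptions, since the aggregation DAMS applies is not specified here.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import fmean
from typing import Dict, Iterable, List, Tuple


def downsample(
    samples: Iterable[Tuple[datetime, float]],
    bucket: timedelta,
) -> List[Tuple[datetime, float]]:
    """Average timezone-aware (timestamp, value) samples per fixed-width bucket.

    Each output timestamp is the UTC start of its bucket; buckets are
    aligned to the Unix epoch and returned in chronological order.
    """
    width = bucket.total_seconds()
    buckets: Dict[float, List[float]] = defaultdict(list)
    for ts, value in samples:
        start = ts.timestamp() // width * width  # floor to the bucket start
        buckets[start].append(value)
    return [
        (datetime.fromtimestamp(start, tz=timezone.utc), fmean(values))
        for start, values in sorted(buckets.items())
    ]
```

Calling `downsample(raw, timedelta(hours=1))` would reduce raw samples to one averaged point per hour, which is the kind of series that can be kept for the longer retention period.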