Monitoring Flow
Observability for the CDC pipeline, the databases, and the API. This is not a full APM setup; the focus is on operating the migration and the data sinks.
Observability components
flowchart TB
subgraph sources [Signal sources]
PG[(PostgreSQL)]
KF[Kafka]
CH[(ClickHouse)]
BE[NestJS]
PM2[PM2 logs]
end
subgraph surfaces [Monitoring surfaces]
UI[migrasi-ui / cdc-monitor]
APIH[GET /api/health]
LOGS[logs schema / file]
CHK[CHECKLIST_SERVER.md]
end
PG --> UI
KF --> UI
CH --> UI
BE --> APIH
BE --> LOGS
PM2 --> LOGS
UI --> CHK
Operational dashboard (migrasi-ui)
| Path | Function |
|---|---|
| frontend/web/migrasi-ui/ | Workflows for bootstrap, ETL, CDC, and PG=CH validation |
| frontend/web/cdc-monitor/ | Kafka lag monitor (read-focused) |
Metrics shown (CDC):
- Kafka lag per tenant topic
- Consumer throughput (rows/s, ETA), available after the 2026 tuning
- Debezium connector status (via REST :8083)
- Sequential workflow steps (guardrails are partial)
Example env: KAFKA_CONTAINER, KAFKA_BOOTSTRAP, and the monitor's Tailscale URL (see cdc/README).
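
The lag, ETA, and connector-status figures listed above can be derived roughly as follows. This is a minimal sketch, not the code used by cdc-monitor: the function names are hypothetical, lag is assumed to be end offset minus committed offset per partition, and the connector check assumes the standard Kafka Connect status response (GET /connectors/<name>/status on port 8083) with a `connector.state` field and a `tasks` list.

```python
from typing import Mapping, Optional


def topic_lag(end_offsets: Mapping[int, int], committed: Mapping[int, int]) -> int:
    """Total lag for one topic: sum of (end offset - committed offset) per partition.

    Simplification (assumption): a partition without a committed offset is
    counted from offset 0.
    """
    return sum(max(end - committed.get(p, 0), 0) for p, end in end_offsets.items())


def eta_seconds(lag: int, consume_rate: float, produce_rate: float = 0.0) -> Optional[float]:
    """Seconds until lag reaches zero, or None when the consumer is not gaining.

    Rates are in rows/s; the net drain rate is consume_rate - produce_rate.
    """
    if lag <= 0:
        return 0.0
    net = consume_rate - produce_rate
    if net <= 0:
        return None
    return lag / net


def connector_healthy(status: dict) -> bool:
    """True when the connector and every one of its tasks report RUNNING."""
    if status.get("connector", {}).get("state") != "RUNNING":
        return False
    tasks = status.get("tasks", [])
    return bool(tasks) and all(t.get("state") == "RUNNING" for t in tasks)
```

Returning None from `eta_seconds` rather than a very large number makes it explicit in a UI that the backlog is not shrinking at the current rates.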
Backend health API
| Endpoint | Checks |
|---|---|
| GET /api/health | DB connection, audit healthcheck |
See backend/RUNBOOK.md.
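
A minimal sketch of an external probe for this endpoint. It assumes only that a healthy backend answers GET /api/health with HTTP 200; the response body is not documented here, so it is not inspected. The function name and the injectable `opener` parameter are hypothetical and exist to keep the probe testable without a network.

```python
import urllib.error
import urllib.request


def probe_health(base_url: str, timeout: float = 3.0, opener=urllib.request.urlopen) -> bool:
    """Return True if GET <base_url>/api/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/api/health"
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, DNS failures, and HTTP error
        # statuses (urlopen raises HTTPError, a URLError subclass, for 4xx/5xx).
        return False
```

A cron job or PM2-managed script could call `probe_health` with the backend's base URL and log or notify on False.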
PostgreSQL audit & logging
| Location | Contents |
|---|---|
| logs.login | Auth sessions |
| logs.audit | Sensitive changes (depending on the implementation) |
| Trigger → pg_notify | Events for FCM/SSE (planned) |
Data validation (not time-series metrics)
| Check | Tool / document |
|---|---|
| PG vs CH row count | migrasi-ui, manual SQL |
| Unbalanced journals in CH | CH audit query |
| Cash balance, PG vs CH | CDC RUNBOOK |
| ETL reconciliation | database/laporan/ |
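
The first two checks in the table reduce to simple comparisons once the counts and journal lines have been fetched. The sketch below is illustrative only: the function names and the (journal_id, debit, credit) row shape are assumptions, not the actual schema or the queries used by migrasi-ui.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable, List, Optional, Tuple


def diff_row_counts(
    pg: Dict[str, int], ch: Dict[str, int]
) -> List[Tuple[str, Optional[int], Optional[int]]]:
    """Return (table, pg_count, ch_count) for every table whose counts differ.

    A table present on only one side is reported with None for the other.
    """
    mismatches = []
    for table in sorted(set(pg) | set(ch)):
        pg_n, ch_n = pg.get(table), ch.get(table)
        if pg_n != ch_n:
            mismatches.append((table, pg_n, ch_n))
    return mismatches


def unbalanced_journals(lines: Iterable[Tuple[str, str, str]]) -> List[str]:
    """Return the ids of journals whose total debit does not equal total credit.

    Amounts are passed as strings and summed as Decimal to avoid float rounding.
    """
    net = defaultdict(lambda: Decimal("0"))
    for journal_id, debit, credit in lines:
        net[journal_id] += Decimal(debit) - Decimal(credit)
    return sorted(j for j, total in net.items() if total != 0)
```

An empty result from either function means the check passed for the data supplied; it says nothing about rows still in flight through Kafka, so these checks are most meaningful when lag is zero.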
PM2 processes
| Process | Role |
|---|---|
| clickhouse-etl | Kafka consumer → CH |
| cdc-monitor | Web monitor (if run separately) |
| erp-backend | NestJS API |
Restart: pm2 restart clickhouse-etl (see cdc/RUNBOOK).
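
Process state can also be checked from a script. The sketch below parses the JSON printed by `pm2 jlist` and reports which of the processes in the table are not online. It assumes each entry carries `name` and `pm2_env.status`, as in PM2's JSON listing; the function name is hypothetical, and cdc-monitor should be left out of the expected list on hosts where it is not run separately.

```python
import json
from typing import Dict, Iterable

EXPECTED = ("clickhouse-etl", "cdc-monitor", "erp-backend")


def pm2_problems(jlist_output: str, expected: Iterable[str] = EXPECTED) -> Dict[str, str]:
    """Map each expected process that is not online to its status, or 'missing'."""
    statuses = {
        proc["name"]: proc.get("pm2_env", {}).get("status", "unknown")
        for proc in json.loads(jlist_output)
    }
    return {
        name: statuses.get(name, "missing")
        for name in expected
        if statuses.get(name) != "online"
    }
```

The input would typically come from `subprocess.run(["pm2", "jlist"], capture_output=True, text=True, check=True).stdout`. An empty dict means every expected process is online.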
Control plane & job audit (planned / in progress)
Metadata: pusat.cdc_job_operasional, cdc_job_operasional_log, cdc_status_sink_tenant.
Draft API: backend/docs/01-CONTROL-PLANE-CDC.md.
Assumption / not yet verified: the unified production dashboard for all jobs has not yet replaced every function of migrasi-ui.
Server checklist
CHECKLIST_SERVER.md: manual post-deploy verification (Kafka topics, publication, env).
Observability gaps (TODO)
| Area | Status |
|---|---|
| Prometheus/Grafana | Not documented in the repo |
| Distributed tracing | None |
| Automated CDC lag alerting | Partial (manual, via the UI) |
| DLQ monitoring dashboard | Log-based; formal metrics TBD |
See ../troubleshooting/cdc-lag.md.
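
As one possible direction for the lag-alerting gap, a rule could fire only when lag stays above a threshold for several consecutive samples, which avoids alerting on short bursts. This is a hypothetical sketch and not an existing component; the function name, the threshold, and the sample count are all assumptions to be tuned per tenant topic.

```python
from typing import Sequence


def lag_alert(samples: Sequence[int], threshold: int, sustained: int = 3) -> bool:
    """True when the most recent `sustained` lag samples all exceed `threshold`."""
    if sustained <= 0 or len(samples) < sustained:
        return False
    return all(lag > threshold for lag in samples[-sustained:])
```

For example, with lag sampled once a minute and `sustained=3`, an alert requires three minutes of continuous lag above the threshold.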