How the monitoring pipeline works

An hourly flow that turns raw PostHog behavioral data into prioritized, explained, revenue-scored issues. This site visualizes the data spine: what each step analyzes, what drives the recommendations, and what gets generated between steps.

Q1

What data is analyzed? Trace the inputs each stage reads.

Q2

What drives recommendations? Evidence → playbooks → template whitelist → LLM.

Q3

What is generated per step? Signals, clusters, hypotheses, diagnoses, alerts.

End-to-end flow

Hourly scheduler → snapshots → quality gate → signals → evidence → hypotheses → diagnosis → alerts. Click a step to drill in.

flowchart TD
    CRON["Hourly Cloud Scheduler<br/>POST /jobs/pipeline"]

    subgraph S1["Stage 1 · Funnel Snapshots"]
        FS["Capture funnel metrics<br/>purchase · checkout · engagement"]
        FS_DB[(posthog_funnel_snapshots)]
        FS --> FS_DB
    end

    subgraph S2["Stage 2 · Data-Quality Gate"]
        DQ["computeDataQualityReport<br/>pixel vs volume · 2-run persistence<br/>reliability + trustScore"]
        DQFLAG{"alertingDisabled?"}
        DQ --> DQFLAG
    end

    subgraph S3["Stage 3 · Signal Detection"]
        SD_ALL["Run all detectors<br/>(full catalog)"]
        SD_HEALTH["Run data_health only<br/>(tracking-only mode)"]
        QGATE["Signal Quality Gate<br/>active vs watch"]
        SIG_DB[(posthog_signals)]
        SD_ALL --> QGATE
        SD_HEALTH --> QGATE
        QGATE --> SIG_DB
    end

    subgraph S3b["Stage 3b · Recording Enhancement"]
        REC["Sample recorded sessions<br/>on top product page →<br/>LLM friction analysis"]
        REC --> SIG_DB
    end

    EXP["Stage 4 · Expire signals<br/>not re-detected in window"]

    subgraph S5["Stage 5 · Clustering + Evidence"]
        CLUSTER["correlateSignals<br/>URL / funnel-step / type"]
        ENRICH["enrichClusterEvidence<br/>replays · screenshots · DOM<br/>JS stacks · cohort compare<br/>→ predicates"]
        EVID_DB[(posthog_evidence_artifacts)]
        CLUSTER --> ENRICH --> EVID_DB
    end

    subgraph S6["Stage 6 · Planner + Hypotheses"]
        PLAN["planRecommendations<br/>→ template whitelist"]
        HYP["generateHypotheses<br/>precedence playbooks<br/>6-factor MIN confidence"]
        HYP_DB[(posthog_hypotheses)]
        PLAN --> HYP --> HYP_DB
    end

    subgraph S7["Stage 7 · Constrained LLM"]
        IMPACT["Deterministic<br/>recoverable-revenue range"]
        LLM["Gemini narrates rank-1<br/>(bounded by whitelist)"]
        ZOD["Zod: revenue in range?<br/>templates in whitelist?"]
        IMPACT --> LLM --> ZOD
    end

    QG["Stage 8 · Quality Gate<br/>8 downgrade-only checks"]
    DIAG_DB[("posthog_diagnoses<br/>+ recommendations")]

    subgraph S9["Stage 9 · Alerts"]
        SCORE_A["alert_score = geom-mean<br/>impact · confidence<br/>actionability · freshness"]
        DISPATCH["Email / In-app / Webhook"]
        ALERT_DB[(posthog_alerts)]
        SCORE_A --> DISPATCH --> ALERT_DB
    end

    CRON --> FS
    FS_DB --> DQ
    DQFLAG -->|"No"| SD_ALL
    DQFLAG -->|"Yes (outage)"| SD_HEALTH
    SIG_DB --> EXP --> CLUSTER
    EVID_DB --> PLAN
    HYP_DB --> IMPACT
    ZOD --> QG --> DIAG_DB --> SCORE_A

    click FS "stages/funnel-snapshots.html" "Stage 1"
    click DQ "stages/data-quality-gate.html" "Stage 2"
    click QGATE "stages/signal-detection.html" "Stage 3"
    click REC "stages/signal-detection.html" "Stage 3b"
    click ENRICH "stages/clustering-evidence.html" "Stage 5"
    click HYP "stages/hypothesis-engine.html" "Stage 6"
    click LLM "stages/llm-diagnosis.html" "Stage 7"
    click QG "stages/quality-gate.html" "Stage 8"
    click SCORE_A "stages/alerts.html" "Stage 9"

    style S1 fill:#e8f4f8,stroke:#2196F3
    style S2 fill:#ffebee,stroke:#F44336
    style S3 fill:#fff3e0,stroke:#FF9800
    style S3b fill:#fce4ec,stroke:#E91E63
    style S5 fill:#e0f7fa,stroke:#00ACC1
    style S6 fill:#f1f8e9,stroke:#689F38
    style S7 fill:#e8f5e9,stroke:#4CAF50
    style S9 fill:#fff8e1,stroke:#FFC107
    style EXP fill:#f3e5f5,stroke:#9C27B0
    style QG fill:#fff9c4,stroke:#FBC02D

Cylinders are persisted tables — the artifacts generated between steps. Tip: click a node to open its detail page.
Legend: Snapshots · Quality gate · Detection · Recordings · Evidence · Hypotheses · LLM diagnosis · Final gate · Alerts
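
The Stage 9 node in the flowchart describes alert_score as a geometric mean of impact, confidence, actionability, and freshness. A minimal sketch of that calculation follows. The four factor names come from the diagram; the `AlertFactors` type, the `alertScore` function name, and the assumption that each factor is normalized to the range 0 to 1 are illustrative and not confirmed by the pipeline's code.

```typescript
// Hypothetical input shape. Factor names are taken from the diagram;
// the [0, 1] scale for each factor is an assumption of this sketch.
interface AlertFactors {
  impact: number;
  confidence: number;
  actionability: number;
  freshness: number;
}

// Geometric mean of the four factors: the 4th root of their product.
function alertScore(f: AlertFactors): number {
  const values = [f.impact, f.confidence, f.actionability, f.freshness];
  if (values.some((v) => !Number.isFinite(v) || v < 0 || v > 1)) {
    throw new RangeError("each factor must be a finite number in [0, 1]");
  }
  const product = values.reduce((acc, v) => acc * v, 1);
  return Math.pow(product, 1 / values.length);
}

// Usage: a stale alert scores low even when the other factors are strong.
const fresh = alertScore({ impact: 0.9, confidence: 0.8, actionability: 0.7, freshness: 0.9 });
const stale = alertScore({ impact: 0.9, confidence: 0.8, actionability: 0.7, freshness: 0.05 });
console.log(fresh > stale);
```

A geometric mean, unlike an arithmetic mean, drops to zero when any single factor is zero, so one very weak factor cannot be offset by the other three. That is likely the property that motivates the choice here, though the source does not state the rationale.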

The data spine

One line: how data transforms from raw behavior to a delivered alert.

PostHog events & funnels → signals (active / watch) → clusters + evidence pack → ranked hypotheses + template whitelist → diagnosis + recommendations → alerts
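
The step from ranked hypotheses and a template whitelist to a diagnosis is where the LLM output is constrained: per the flowchart, Stage 7 checks that the narrated revenue sits inside the deterministic range and that every recommended template is whitelisted. The sketch below illustrates those two checks in plain TypeScript as a stand-in for the Zod schema the diagram mentions. All type names, field names, and template IDs are hypothetical.

```typescript
// Hypothetical shapes; the real schema is defined with Zod and is not shown in the source.
interface RevenueRange {
  low: number;
  high: number;
}

interface DraftDiagnosis {
  recoverableRevenue: number; // figure narrated by the LLM
  templateIds: string[];      // recommendation templates the LLM selected
}

// Returns a list of violations; an empty list means the draft passes both checks.
function validateDiagnosis(
  draft: DraftDiagnosis,
  range: RevenueRange,
  whitelist: ReadonlySet<string>,
): string[] {
  const errors: string[] = [];
  if (draft.recoverableRevenue < range.low || draft.recoverableRevenue > range.high) {
    errors.push(`revenue ${draft.recoverableRevenue} outside [${range.low}, ${range.high}]`);
  }
  for (const id of draft.templateIds) {
    if (!whitelist.has(id)) {
      errors.push(`template "${id}" is not in the whitelist`);
    }
  }
  return errors;
}

// Usage with made-up values: one in-range revenue figure, one unlisted template.
const whitelist = new Set(["tpl_checkout_copy", "tpl_fix_js_error"]);
const issues = validateDiagnosis(
  { recoverableRevenue: 1200, templateIds: ["tpl_checkout_copy", "tpl_redesign_site"] },
  { low: 1000, high: 5000 },
  whitelist,
);
console.log(issues);
```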

Stages

Each card shows what the stage reads and what it produces. Open one to see the full data flow.
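
As a rough companion to the cards, the reads and writes visible in the flowchart can be listed as data. The table names below are the cylinders from the diagram; the `StageIO` type and `unproducedReads` helper are illustrative, the Stage 5 read is inferred from the diagram edge that routes posthog_signals through Stage 4, and inputs that are not persisted tables (such as raw PostHog events) are left out.

```typescript
// Hypothetical structure for describing what each stage reads and produces.
interface StageIO {
  stage: string;
  reads: string[];  // persisted tables consumed
  writes: string[]; // persisted tables produced
}

// Table names are taken from the flowchart; only table-level edges are listed.
const STAGES: StageIO[] = [
  { stage: "1 · Funnel Snapshots", reads: [], writes: ["posthog_funnel_snapshots"] },
  { stage: "2 · Data-Quality Gate", reads: ["posthog_funnel_snapshots"], writes: [] },
  { stage: "3 · Signal Detection", reads: [], writes: ["posthog_signals"] },
  { stage: "3b · Recording Enhancement", reads: [], writes: ["posthog_signals"] },
  { stage: "4 · Expire signals", reads: ["posthog_signals"], writes: [] },
  // Inferred: the diagram feeds clustering from posthog_signals via Stage 4.
  { stage: "5 · Clustering + Evidence", reads: ["posthog_signals"], writes: ["posthog_evidence_artifacts"] },
  { stage: "6 · Planner + Hypotheses", reads: ["posthog_evidence_artifacts"], writes: ["posthog_hypotheses"] },
  { stage: "7 · Constrained LLM", reads: ["posthog_hypotheses"], writes: [] },
  { stage: "8 · Quality Gate", reads: [], writes: ["posthog_diagnoses"] },
  { stage: "9 · Alerts", reads: ["posthog_diagnoses"], writes: ["posthog_alerts"] },
];

// Lists every read of a table that no earlier stage has written.
function unproducedReads(stages: StageIO[]): string[] {
  const produced = new Set<string>();
  const missing: string[] = [];
  for (const s of stages) {
    for (const table of s.reads) {
      if (!produced.has(table)) missing.push(`${s.stage} reads ${table}`);
    }
    for (const table of s.writes) produced.add(table);
  }
  return missing;
}

console.log(unproducedReads(STAGES));
```

An empty result means every table a stage reads was written by an earlier stage, which matches the top-to-bottom order of the diagram.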

Reference

Playbook catalog: all 17 playbooks, with what fires each one and what it may recommend.
Data model: how the generated entities relate (signals, evidence, hypotheses, diagnoses, recommendations, alerts).