How the monitoring pipeline works
An hourly pipeline that turns raw PostHog behavior data into prioritized, explained, revenue-scored issues. This site visualizes the data spine: what each step analyzes, what drives the recommendations, and what gets generated between steps.
What data is analyzed? Trace the inputs each stage reads.
What drives recommendations? Evidence → playbooks → template whitelist → LLM.
What is generated per step? Signals, clusters, hypotheses, diagnoses, alerts.
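The "template whitelist → LLM" constraint can be illustrated with a small check. This is a minimal sketch, not the pipeline's actual code: the diagram says Stage 7 uses Zod for this validation, while the example below uses plain TypeScript to stay dependency-free. The interface names, field names, and template identifiers are all hypothetical.

```typescript
// Hypothetical shapes: these names are assumptions for illustration only.
interface DiagnosisBounds {
  templateWhitelist: string[]; // templates the planner allowed
  revenueMin: number;          // low end of the deterministic recoverable-revenue range
  revenueMax: number;          // high end of that range
}

interface LlmDiagnosis {
  templates: string[];         // recommendation templates the LLM chose
  recoverableRevenue: number;  // revenue figure the LLM narrated
}

// Returns a list of violations; an empty list means the output stayed in bounds.
function checkDiagnosis(output: LlmDiagnosis, bounds: DiagnosisBounds): string[] {
  const violations: string[] = [];
  const allowed = new Set(bounds.templateWhitelist);

  // Every template the LLM used must come from the planner's whitelist.
  for (const template of output.templates) {
    if (!allowed.has(template)) {
      violations.push(`template not in whitelist: ${template}`);
    }
  }

  // The narrated revenue must fall inside the precomputed range (inclusive).
  const revenue = output.recoverableRevenue;
  if (revenue < bounds.revenueMin || revenue > bounds.revenueMax) {
    violations.push(`revenue out of range: ${revenue}`);
  }
  return violations;
}
```

The point of the design is that the LLM only narrates: both the allowed recommendations and the revenue range are fixed before it runs, so any output outside them can be rejected mechanically.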
End-to-end flow
Hourly scheduler → snapshots → data-quality gate → signals → evidence → hypotheses → diagnosis → alerts. Click a step to drill in.
flowchart TD
CRON["Hourly Cloud Scheduler
POST /jobs/pipeline"]
subgraph S1["Stage 1 · Funnel Snapshots"]
FS["Capture funnel metrics
purchase · checkout · engagement"]
FS_DB[(posthog_funnel_snapshots)]
FS --> FS_DB
end
subgraph S2["Stage 2 · Data-Quality Gate"]
DQ["computeDataQualityReport
pixel vs volume · 2-run persistence
reliability + trustScore"]
DQFLAG{"alertingDisabled?"}
DQ --> DQFLAG
end
subgraph S3["Stage 3 · Signal Detection"]
SD_ALL["Run all detectors
(full catalog)"]
SD_HEALTH["Run data_health only
(tracking-only mode)"]
QGATE["Signal Quality Gate
active vs watch"]
SIG_DB[(posthog_signals)]
SD_ALL --> QGATE
SD_HEALTH --> QGATE
QGATE --> SIG_DB
end
subgraph S3b["Stage 3b · Recording Enhancement"]
REC["Sample recorded sessions
on top product page →
LLM friction analysis"]
REC --> SIG_DB
end
EXP["Stage 4 · Expire signals
not re-detected in window"]
subgraph S5["Stage 5 · Clustering + Evidence"]
CLUSTER["correlateSignals
URL / funnel-step / type"]
ENRICH["enrichClusterEvidence
replays · screenshots · DOM
JS stacks · cohort compare
→ predicates"]
EVID_DB[(posthog_evidence_artifacts)]
CLUSTER --> ENRICH --> EVID_DB
end
subgraph S6["Stage 6 · Planner + Hypotheses"]
PLAN["planRecommendations
→ template whitelist"]
HYP["generateHypotheses
precedence playbooks
6-factor MIN confidence"]
HYP_DB[(posthog_hypotheses)]
PLAN --> HYP --> HYP_DB
end
subgraph S7["Stage 7 · Constrained LLM"]
IMPACT["Deterministic
recoverable-revenue range"]
LLM["Gemini narrates rank-1
(bounded by whitelist)"]
ZOD["Zod: revenue in range?
templates in whitelist?"]
IMPACT --> LLM --> ZOD
end
QG["Stage 8 · Quality Gate
8 downgrade-only checks"]
DIAG_DB[(posthog_diagnoses
+ recommendations)]
subgraph S9["Stage 9 · Alerts"]
SCORE_A["alert_score = geom-mean
impact · confidence
actionability · freshness"]
DISPATCH["Email / In-app / Webhook"]
ALERT_DB[(posthog_alerts)]
SCORE_A --> DISPATCH --> ALERT_DB
end
CRON --> FS
FS_DB --> DQ
DQFLAG -->|"No"| SD_ALL
DQFLAG -->|"Yes (outage)"| SD_HEALTH
SIG_DB --> EXP --> CLUSTER
EVID_DB --> PLAN
HYP_DB --> IMPACT
ZOD --> QG --> DIAG_DB --> SCORE_A
click FS "stages/funnel-snapshots.html" "Stage 1"
click DQ "stages/data-quality-gate.html" "Stage 2"
click QGATE "stages/signal-detection.html" "Stage 3"
click REC "stages/signal-detection.html" "Stage 3b"
click ENRICH "stages/clustering-evidence.html" "Stage 5"
click HYP "stages/hypothesis-engine.html" "Stage 6"
click LLM "stages/llm-diagnosis.html" "Stage 7"
click QG "stages/quality-gate.html" "Stage 8"
click SCORE_A "stages/alerts.html" "Stage 9"
style S1 fill:#e8f4f8,stroke:#2196F3
style S2 fill:#ffebee,stroke:#F44336
style S3 fill:#fff3e0,stroke:#FF9800
style S3b fill:#fce4ec,stroke:#E91E63
style S5 fill:#e0f7fa,stroke:#00ACC1
style S6 fill:#f1f8e9,stroke:#689F38
style S7 fill:#e8f5e9,stroke:#4CAF50
style S9 fill:#fff8e1,stroke:#FFC107
style EXP fill:#f3e5f5,stroke:#9C27B0
style QG fill:#fff9c4,stroke:#FBC02D
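Two scoring rules named in the diagram can be sketched in a few lines: the Stage 9 alert score (a geometric mean of impact, confidence, actionability, and freshness) and the Stage 6 "MIN" confidence (the weakest of six factors). This is an illustrative sketch under assumptions: it treats every factor as a number in [0, 1] and weights the four alert factors equally. The source names neither the six confidence factors nor any weights.

```typescript
// Geometric mean of n non-negative values: the n-th root of their product.
function geometricMean(values: number[]): number {
  if (values.length === 0) throw new Error("need at least one value");
  if (values.some((v) => v < 0)) throw new Error("values must be non-negative");
  const product = values.reduce((acc, v) => acc * v, 1);
  return Math.pow(product, 1 / values.length);
}

// Assumption: each factor is normalized to the range [0, 1].
interface AlertFactors {
  impact: number;
  confidence: number;
  actionability: number;
  freshness: number;
}

// Alert score as the equally weighted geometric mean of the four factors.
function alertScore(f: AlertFactors): number {
  return geometricMean([f.impact, f.confidence, f.actionability, f.freshness]);
}

// MIN confidence: the overall value is capped by the weakest factor.
function minConfidence(factors: number[]): number {
  if (factors.length === 0) throw new Error("need at least one factor");
  return Math.min(...factors);
}
```

Both rules are conservative. A geometric mean drops to zero when any single factor is zero, so a high-impact alert that is stale or not actionable cannot score well. Taking the minimum likewise stops one strong factor from masking a weak one.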
The data spine
One line: how data transforms from raw behavior to a delivered alert.
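Two points on the spine decide which data moves forward at all: the Stage 2 `alertingDisabled` flag, which selects between the full detector catalog and `data_health` only, and the Stage 4 expiry of signals not re-detected within a window. A minimal sketch of both follows. The type names, the `lastDetectedAt` field, and the window length are hypothetical, since the source does not specify them.

```typescript
// Which detectors Stage 3 runs, depending on the Stage 2 gate.
type DetectorMode = "full_catalog" | "data_health_only";

function selectDetectorMode(alertingDisabled: boolean): DetectorMode {
  // When data quality indicates an outage, only tracking-health detectors run.
  return alertingDisabled ? "data_health_only" : "full_catalog";
}

// Hypothetical signal shape: lastDetectedAt is epoch milliseconds.
interface Signal {
  id: string;
  lastDetectedAt: number;
}

// Splits signals into those still inside the re-detection window and those past it.
function partitionExpired(
  signals: Signal[],
  now: number,
  windowMs: number
): { active: Signal[]; expired: Signal[] } {
  const active: Signal[] = [];
  const expired: Signal[] = [];
  for (const signal of signals) {
    if (now - signal.lastDetectedAt > windowMs) {
      expired.push(signal);
    } else {
      active.push(signal);
    }
  }
  return { active, expired };
}
```

For example, with a hypothetical 24-hour window, a signal last detected 25 hours ago would be expired before clustering, while one detected an hour ago would continue to Stage 5.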
Stages
Each card shows what the stage reads and what it produces. Open one to see the full data flow.
Reference
Playbook catalog: all 17 playbooks, with what fires each one and what it may recommend.
Data model: how the generated entities relate (signals, evidence, hypotheses, diagnoses, recommendations, alerts).