How the monitoring pipeline works

An hourly pipeline turns raw PostHog behavior into prioritized, explained, revenue-scored issues. This site visualizes the data spine: what each step analyzes, what drives the recommendations, and what is generated between steps.

What data is analyzed? Trace the inputs each stage reads.

What drives recommendations? Evidence → playbooks → template whitelist → LLM.

What is generated per step? Signals, clusters, hypotheses, diagnoses, alerts.

End-to-end flow

Hourly scheduler → snapshots → quality gate → signals → evidence → hypotheses → diagnosis → alerts. Click a step to drill in.

flowchart TD
    CRON["Hourly Cloud Scheduler
POST /jobs/pipeline"]

    subgraph S1["Stage 1 · Funnel Snapshots"]
        FS["Capture funnel metrics
purchase · checkout · engagement"]
        FS_DB[(posthog_funnel_snapshots)]
        FS --> FS_DB
    end

    subgraph S2["Stage 2 · Data-Quality Gate"]
        DQ["computeDataQualityReport
pixel vs volume · 2-run persistence
reliability + trustScore"]
        DQFLAG{"alertingDisabled?"}
        DQ --> DQFLAG
    end

    subgraph S3["Stage 3 · Signal Detection"]
        SD_ALL["Run all detectors
(full catalog)"]
        SD_HEALTH["Run data_health only
(tracking-only mode)"]
        QGATE["Signal Quality Gate
active vs watch"]
        SIG_DB[(posthog_signals)]
        SD_ALL --> QGATE
        SD_HEALTH --> QGATE
        QGATE --> SIG_DB
    end

    subgraph S3b["Stage 3b · Recording Enhancement"]
        REC["Sample recorded sessions
on top product page →
LLM friction analysis"]
        REC --> SIG_DB
    end

    EXP["Stage 4 · Expire signals
not re-detected in window"]

    subgraph S5["Stage 5 · Clustering + Evidence"]
        CLUSTER["correlateSignals
URL / funnel-step / type"]
        ENRICH["enrichClusterEvidence
replays · screenshots · DOM
JS stacks · cohort compare
→ predicates"]
        EVID_DB[(posthog_evidence_artifacts)]
        CLUSTER --> ENRICH --> EVID_DB
    end

    subgraph S6["Stage 6 · Planner + Hypotheses"]
        PLAN["planRecommendations
→ template whitelist"]
        HYP["generateHypotheses
precedence playbooks
6-factor MIN confidence"]
        HYP_DB[(posthog_hypotheses)]
        PLAN --> HYP --> HYP_DB
    end

    subgraph S7["Stage 7 · Constrained LLM"]
        IMPACT["Deterministic
recoverable-revenue range"]
        LLM["Gemini narrates rank-1
(bounded by whitelist)"]
        ZOD["Zod: revenue in range?
templates in whitelist?"]
        IMPACT --> LLM --> ZOD
    end

    QG["Stage 8 · Quality Gate
8 downgrade-only checks"]
    DIAG_DB[("posthog_diagnoses
+ recommendations")]

    subgraph S9["Stage 9 · Alerts"]
        SCORE_A["alert_score = geom-mean
impact · confidence
actionability · freshness"]
        DISPATCH["Email / In-app / Webhook"]
        ALERT_DB[(posthog_alerts)]
        SCORE_A --> DISPATCH --> ALERT_DB
    end

    CRON --> FS
    FS_DB --> DQ
    DQFLAG -->|"No"| SD_ALL
    DQFLAG -->|"Yes (outage)"| SD_HEALTH
    SIG_DB --> EXP --> CLUSTER
    EVID_DB --> PLAN
    HYP_DB --> IMPACT
    ZOD --> QG --> DIAG_DB --> SCORE_A

    click FS "stages/funnel-snapshots.html" "Stage 1"
    click DQ "stages/data-quality-gate.html" "Stage 2"
    click QGATE "stages/signal-detection.html" "Stage 3"
    click REC "stages/session-recording-analysis.html" "Session Recording Analysis"
    click ENRICH "stages/clustering-evidence.html" "Stage 5"
    click HYP "stages/hypothesis-engine.html" "Stage 6"
    click LLM "stages/llm-diagnosis.html" "Stage 7"
    click QG "stages/quality-gate.html" "Stage 8"
    click SCORE_A "stages/alerts.html" "Stage 9"

    style S1 fill:#e8f4f8,stroke:#2196F3
    style S2 fill:#ffebee,stroke:#F44336
    style S3 fill:#fff3e0,stroke:#FF9800
    style S3b fill:#fce4ec,stroke:#E91E63
    style S5 fill:#e0f7fa,stroke:#00ACC1
    style S6 fill:#f1f8e9,stroke:#689F38
    style S7 fill:#e8f5e9,stroke:#4CAF50
    style S9 fill:#fff8e1,stroke:#FFC107
    style EXP fill:#f3e5f5,stroke:#9C27B0
    style QG fill:#fff9c4,stroke:#FBC02D

Cylinders are persisted tables — the artifacts generated between steps. Tip: click a node to open its detail page.

Snapshots · Quality gate · Detection · Recordings · Evidence · Hypotheses · LLM diagnosis · Final gate · Alerts
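
The chart names two scoring rules: a "6-factor MIN confidence" in Stage 6 and an alert score that is the geometric mean of impact, confidence, actionability, and freshness in Stage 9. The sketch below shows one way those two rules could be computed. It is an illustration, not the pipeline's code: the function names, the factor keys, and the assumption that every input is normalized to [0, 1] are all hypothetical.

```typescript
// Hypothetical shape: the chart only says "6-factor MIN confidence",
// so the factor names are left open here. Values are assumed to be in [0, 1].
type ConfidenceFactors = Record<string, number>;

// MIN aggregation: overall confidence equals the weakest factor,
// so a single poor factor caps the result.
function minConfidence(factors: ConfidenceFactors): number {
  const values = Object.values(factors);
  if (values.length === 0) {
    throw new Error("at least one factor is required");
  }
  return Math.min(...values);
}

interface AlertInputs {
  impact: number;
  confidence: number;
  actionability: number;
  freshness: number;
}

// Geometric mean of the four components: (a * b * c * d) ^ (1/4).
// Any component at 0 drives the whole score to 0.
function alertScore(x: AlertInputs): number {
  const parts = [x.impact, x.confidence, x.actionability, x.freshness];
  if (parts.some((p) => p < 0 || p > 1)) {
    throw new Error("components are assumed to be in [0, 1]");
  }
  const product = parts.reduce((acc, p) => acc * p, 1);
  return Math.pow(product, 1 / parts.length);
}

// Example: six made-up factors; the weakest one (0.4) becomes the confidence.
const confidence = minConfidence({
  f1: 0.9, f2: 0.8, f3: 0.4, f4: 0.7, f5: 0.95, f6: 0.6,
});
const score = alertScore({
  impact: 0.8, confidence, actionability: 0.9, freshness: 0.5,
});
console.log(confidence, score.toFixed(3));
```

Both rules penalize a weak component more than an arithmetic average would: MIN ignores the strong factors entirely, and the geometric mean collapses to zero when any component is zero.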

The data spine

One line: how data transforms from raw behavior to a delivered alert.

PostHog events & funnels → signals (active / watch) → clusters + evidence pack → ranked hypotheses + template whitelist → diagnosis + recommendations → alerts
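
The step from "ranked hypotheses + template whitelist" to "diagnosis + recommendations" is where the chart's Stage 7 constrains the LLM: the revenue figure must fall inside the deterministic range and every template must be in the whitelist. The pipeline does this with Zod; the sketch below uses plain TypeScript so it runs without dependencies. All type names, field names, and template ids are assumptions made for illustration.

```typescript
// Hypothetical shapes: none of these names come from the pipeline itself.
interface RevenueRange {
  low: number;
  high: number;
}

interface LlmDiagnosis {
  recoverableRevenue: number;
  templateIds: string[];
}

interface ValidationResult {
  ok: boolean;
  errors: string[];
}

// Checks the two constraints named in the chart:
//   1. revenue in range?
//   2. templates in whitelist?
function validateDiagnosis(
  diagnosis: LlmDiagnosis,
  range: RevenueRange,
  whitelist: ReadonlySet<string>,
): ValidationResult {
  const errors: string[] = [];
  const revenue = diagnosis.recoverableRevenue;

  if (!Number.isFinite(revenue) || revenue < range.low || revenue > range.high) {
    errors.push(`revenue ${revenue} is outside [${range.low}, ${range.high}]`);
  }
  for (const id of diagnosis.templateIds) {
    if (!whitelist.has(id)) {
      errors.push(`template "${id}" is not in the whitelist`);
    }
  }
  return { ok: errors.length === 0, errors };
}

// Example with made-up values.
const whitelist = new Set(["fix_checkout_button", "restore_tracking"]);
const range: RevenueRange = { low: 1000, high: 5000 };

console.log(validateDiagnosis(
  { recoverableRevenue: 3200, templateIds: ["fix_checkout_button"] },
  range,
  whitelist,
));
console.log(validateDiagnosis(
  { recoverableRevenue: 9000, templateIds: ["invented_template"] },
  range,
  whitelist,
));
```

Collecting every violation rather than stopping at the first makes a rejected output easier to debug. Whether the real pipeline retries, rejects, or downgrades on failure is not stated here.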

Stages

Each card shows what the stage reads and what it produces. Open one to see the full data flow.
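
As a compact view of "reads" and "produces", the sketch below records only the table-level edges drawn in the flow chart and checks that every table a stage reads was produced by an earlier stage. The `StageIO` type and `unproducedReads` helper are hypothetical names. Each stage's detail page may list more inputs than the chart shows, and the second table behind "posthog_diagnoses + recommendations" is left out because the chart does not name it.

```typescript
// Hypothetical descriptor: which persisted tables a stage reads and writes.
interface StageIO {
  stage: string;
  reads: string[];
  produces: string[];
}

// Table-level edges taken from the flow chart only.
const STAGES: StageIO[] = [
  { stage: "1 Funnel Snapshots", reads: [], produces: ["posthog_funnel_snapshots"] },
  { stage: "2 Data-Quality Gate", reads: ["posthog_funnel_snapshots"], produces: [] },
  { stage: "3 Signal Detection", reads: [], produces: ["posthog_signals"] },
  { stage: "3b Recording Enhancement", reads: [], produces: ["posthog_signals"] },
  { stage: "4 Expire signals", reads: ["posthog_signals"], produces: [] },
  { stage: "5 Clustering + Evidence", reads: ["posthog_signals"], produces: ["posthog_evidence_artifacts"] },
  { stage: "6 Planner + Hypotheses", reads: ["posthog_evidence_artifacts"], produces: ["posthog_hypotheses"] },
  { stage: "7 Constrained LLM", reads: ["posthog_hypotheses"], produces: [] },
  { stage: "8 Quality Gate", reads: [], produces: ["posthog_diagnoses"] },
  { stage: "9 Alerts", reads: ["posthog_diagnoses"], produces: ["posthog_alerts"] },
];

// Walks the stages in order and reports any read of a table
// that no earlier stage has produced.
function unproducedReads(stages: StageIO[]): string[] {
  const available = new Set<string>();
  const missing: string[] = [];
  for (const s of stages) {
    for (const table of s.reads) {
      if (!available.has(table)) {
        missing.push(`${s.stage}: ${table}`);
      }
    }
    for (const table of s.produces) {
      available.add(table);
    }
  }
  return missing;
}

console.log(unproducedReads(STAGES)); // empty when the order is consistent
```

A check like this is a cheap way to catch a stage being reordered ahead of the table it depends on.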

Reference

Playbook catalog: all 17 playbooks, with what fires each one and what it may recommend.
Session recording analysis: the daily replay-analysis batch (separate schedule) and the inline Stage 3b enrichment.
Data model: how the generated entities relate (signals, evidence, hypotheses, diagnoses, recommendations, alerts).