Data flow

Active signals are correlated into clusters, enriched with concrete evidence, and reduced to short predicate tags the downstream engine routes on.

flowchart TD
    ACTIVE["Active, undiagnosed signals"]
    subgraph Correlate["correlateSignals"]
        C1{"data_health?"}
        C1 -->|"Yes"| SOLO["Solo cluster"]
        C1 -->|"No"| C2["Group by URL / funnel step"]
        C2 --> C3["Merge funnel_drop into<br/>matching URL cluster"]
        C2 --> C4["Remaining → group by type"]
    end
    CLUSTERS["Signal Clusters"]
    subgraph Enrich["enrichClusterEvidence"]
        E1["Pull replays (PostHog)"]
        E2["Capture screenshots (Steel → R2)"]
        E3["Fetch JS stack traces"]
        E4["Pull rage_click DOM snapshots"]
        E5["Cohort comparison<br/>mobile vs desktop · source vs source"]
        E6["Derive predicates<br/>url_specific · js_error_on_conversion_path<br/>rage_click_on_cta · replay_corroboration"]
    end
    PACK["EvidencePack<br/>supporting · contradicting · missing<br/>predicatesSatisfied · clusterKey"]
    EVID_DB[("posthog_evidence_artifacts<br/>upsert on dedupKey")]
    ACTIVE --> C1
    SOLO --> CLUSTERS
    C3 --> CLUSTERS
    C4 --> CLUSTERS
    CLUSTERS --> Enrich
    Enrich --> PACK
    PACK --> EVID_DB
    style Enrich fill:#e0f7fa
    style PACK fill:#c8e6c9
    style EVID_DB fill:#e0f7fa,stroke:#00ACC1
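
The correlation stage in the diagram can be sketched roughly as follows. This is a minimal illustration, not the actual `correlateSignals` implementation: the `Signal` and `Cluster` shapes, the cluster key format (`solo:`, `url:`, `step:`, `type:`), and the rule that a `funnel_drop` matches a URL cluster by sharing its `url` are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real schema is not shown in this document.
type Signal = {
  id: string;
  type: string;        // e.g. "data_health", "funnel_drop", "rage_click"
  url?: string;
  funnelStep?: string;
};

type Cluster = { clusterKey: string; signals: Signal[] };

function correlateSignals(active: Signal[]): Cluster[] {
  const clusters = new Map<string, Cluster>();
  const add = (key: string, s: Signal): void => {
    const c = clusters.get(key) ?? { clusterKey: key, signals: [] };
    c.signals.push(s);
    clusters.set(key, c);
  };

  const funnelDrops: Signal[] = [];
  const remaining: Signal[] = [];

  for (const s of active) {
    if (s.type === "data_health") {
      add(`solo:${s.id}`, s);            // data_health always gets a solo cluster
    } else if (s.type === "funnel_drop") {
      funnelDrops.push(s);               // merged after URL clusters exist
    } else if (s.url) {
      add(`url:${s.url}`, s);            // group by URL
    } else if (s.funnelStep) {
      add(`step:${s.funnelStep}`, s);    // group by funnel step
    } else {
      remaining.push(s);
    }
  }

  // Merge each funnel_drop into a matching URL cluster, if one exists
  // (assumed matching rule: same url).
  for (const s of funnelDrops) {
    const key = s.url ? `url:${s.url}` : "";
    if (key && clusters.has(key)) add(key, s);
    else remaining.push(s);
  }

  // Whatever is left is grouped by signal type.
  for (const s of remaining) add(`type:${s.type}`, s);

  return Array.from(clusters.values());
}
```

Under these assumptions, funnel drops are deferred until the URL clusters have been built, so the merge does not depend on the order in which signals arrive.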