Observability

Production telemetry runs on three legs: Sentry for errors, OpenTelemetry traces for the hot path, and Horizon for queue health. The platform admin surfaces a Site Health pill that rolls all three up to a single colour.

Overview

Each leg answers a different question. Sentry tells you what broke. OpenTelemetry tells you where the time went. Horizon tells you what's piling up. All three are optional — the app runs without any of them — but production deployments should enable at least Sentry and Horizon.

Error tracking — Sentry

When SENTRY_LARAVEL_DSN is set, unhandled exceptions are forwarded to Sentry with full stack traces. The workspace ID and agent ID are attached as Sentry tags, so errors can be scoped to a specific tenant. PII scrubbing relies on Sentry's data-scrubbing settings, which you configure server-side in Sentry.

Performance tracing — OpenTelemetry

When OTEL_EXPORTER_OTLP_ENDPOINT is set, the request pipeline is instrumented end-to-end. The headline metric is p95 of rag.llm.first_token — the time from request receipt to the first SSE token. The target is under 1 second.
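To make the headline metric concrete, here is a minimal sketch of how a p95 over exported rag.llm.first_token durations could be checked against the 1 second target. It assumes the durations are already available as a list of seconds and uses the nearest-rank percentile method; the function names are illustrative and not part of Glozr, and your tracing backend may compute percentiles differently.

```python
FIRST_TOKEN_TARGET_S = 1.0  # target stated above: p95 under 1 second


def p95(samples):
    """Return the 95th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest rank = ceil(0.95 * n), computed with integers to avoid float error.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def first_token_within_target(durations_s):
    """True when the p95 of the given first-token durations is under the target."""
    return p95(durations_s) < FIRST_TOKEN_TARGET_S
```

With 20 samples the nearest rank is 19, so a single slow request does not move the p95, but two slow requests do.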

Other named spans worth alerting on:

  • rag.retrieve — vector search + rerank.
  • rag.prompt.build — system-prompt assembly.
  • rag.llm.stream — full streaming duration.
  • rag.persist — the post-stream persistence batch.

Queue health — Horizon

Horizon's UI is served at /horizon and is restricted to super-admins. Three queues carry the workload:

  • default — miscellaneous jobs (notifications, webhook dispatch).
  • crawl — source crawling.
  • index — chunking, embedding, vector upsert.
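The split above can be pictured as a simple routing table. The sketch below is illustrative only: the queue names come from the list, while the job-type keys and the fallback to default for unknown jobs are assumptions, not Glozr's actual job classes or dispatch logic.

```python
# Queue names come from the list above; the job-type keys are hypothetical.
QUEUE_FOR_JOB = {
    "notification": "default",
    "webhook_dispatch": "default",
    "source_crawl": "crawl",
    "chunk": "index",
    "embed": "index",
    "vector_upsert": "index",
}


def queue_for(job_type):
    """Return the queue a job type runs on.

    Unknown job types fall back to "default" (an assumption, based on that
    queue being described as the one for miscellaneous jobs).
    """
    return QUEUE_FOR_JOB.get(job_type, "default")
```

Keeping crawl and index work off the default queue means a large crawl or re-index backlog does not delay notifications and webhooks.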

Priority metrics

Metric                              Target
First-token latency (p95)           < 1 s
Full response latency (p95)         < 5 s
Queue depth (all queues)            Drains within minutes
Failed jobs                         0 in steady state
Vector store query latency (p95)    < 100 ms
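As a rough illustration, the numeric targets above can be encoded as alert rules. This is a sketch under assumptions: the metric keys are hypothetical names, the measured values are expected to come from whatever monitoring backend you use, and queue depth is left out because its target is not a single number.

```python
# Numeric targets from the table above. Queue depth is omitted because its
# target ("drains within minutes") is not a single threshold.
TARGETS = {
    "first_token_p95_s": ("<", 1.0),
    "full_response_p95_s": ("<", 5.0),
    "failed_jobs": ("==", 0),
    "vector_query_p95_ms": ("<", 100.0),
}


def breaches(measured):
    """Return the sorted names of measured metrics that miss their target.

    Metrics absent from `measured` are skipped rather than treated as failures.
    """
    missed = []
    for name, (op, limit) in TARGETS.items():
        if name not in measured:
            continue
        value = measured[name]
        ok = value < limit if op == "<" else value == limit
        if not ok:
            missed.append(name)
    return sorted(missed)
```

For example, a full response p95 of 6.2 s together with three failed jobs would report both of those metrics and nothing else.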

Site Health pill

The platform admin header shows a Site Health pill that aggregates the configured signals into a single colour:

  • Green — everything operational, no configuration warnings.
  • Amber — configuration warnings (missing keys, no observability target, no mail credentials). Clicking the pill links straight to the remediation page.
  • Red — the app is up but a critical subsystem (LLM provider, vector store, queue worker) is failing health checks.
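The roll-up described above can be sketched as a small function. This is not Glozr's implementation: the subsystem identifiers are hypothetical, and the rule that Red takes precedence over Amber when both conditions hold is an assumption.

```python
# Hypothetical identifiers for the critical subsystems named above.
CRITICAL_SUBSYSTEMS = ("llm_provider", "vector_store", "queue_worker")


def site_health_colour(config_warnings, failing_subsystems):
    """Collapse configuration warnings and health-check failures into one colour.

    Assumes red outranks amber when both conditions are present.
    """
    if any(name in CRITICAL_SUBSYSTEMS for name in failing_subsystems):
        return "red"
    if config_warnings:
        return "amber"
    return "green"
```

Checking failures first means a missing mail credential never masks an outage in the LLM provider, vector store, or queue worker.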

Note. If you only set up one observability tool, pick Sentry. The error stream is the most actionable signal you'll have in the first weeks of running Glozr in production.