Data model

Glozr's Postgres schema is organised so that every table holding tenant data carries a workspace_id column — either directly, or transitively through an agent_id. Models use the BelongsToWorkspace or BelongsToAgent traits, which apply a global scope at the ORM layer.
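The effect of a global scope can be illustrated with a minimal sketch. This is not Glozr's ORM code: the `Row` and `WorkspaceScopedTable` names are hypothetical, and an in-memory list stands in for a Postgres table. The point is only that the workspace filter is applied in one place, so every read goes through it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Row:
    id: int
    workspace_id: int
    name: str


class WorkspaceScopedTable:
    """Hypothetical stand-in for a model with a workspace-level global scope."""

    def __init__(self, rows: List[Row], current_workspace_id: int) -> None:
        self._rows = rows
        self._workspace_id = current_workspace_id

    def all(self) -> List[Row]:
        # The tenant filter lives here, so callers cannot forget to apply it.
        return [r for r in self._rows if r.workspace_id == self._workspace_id]

    def find(self, row_id: int) -> Optional[Row]:
        # Lookups by id go through the same scoped view.
        return next((r for r in self.all() if r.id == row_id), None)


rows = [Row(1, 10, "Support bot"), Row(2, 20, "Sales bot"), Row(3, 10, "Docs bot")]
agents = WorkspaceScopedTable(rows, current_workspace_id=10)
visible_ids = [r.id for r in agents.all()]
```

In this sketch, a request running as workspace 10 sees rows 1 and 3, and a lookup of row 2 returns nothing even though the row exists.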

Overview

The schema falls into four broad areas: identity (users, workspaces, memberships), agent configuration (agents and their immutable published versions), knowledge (sources, documents, chunks), and operational data (conversations, messages, leads, usage events). Vector embeddings themselves live outside Postgres, in the vector store.

Identity & workspaces

Users authenticate with an email address and a bcrypt-hashed password, and may enable TOTP-based two-factor authentication. The role column on users is global (customer or super_admin) and gates platform-admin access.

Workspaces own Stripe customer IDs, plan information, and billing state. The workspace_users pivot assigns a role to each membership: owner, admin, or member.

Agents & versions

The agents table stores the mutable editing state: persona, theme, system prompt, guardrails, behaviour rules, CTAs, and curated answers. Editing an agent does not affect live visitors until the changes are published.

Every publish writes an immutable snapshot to agent_versions and points the agent's published_version_id at it. The widget runtime reads from the version, never from the agent row. Rollback is a pointer change: published_version_id is set back to an earlier version.
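The publish and rollback flow can be sketched as follows. This is an illustration under assumptions, not the actual implementation: `Agent`, `VersionStore`, and their methods are hypothetical names, and a dictionary stands in for the agent_versions table.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Agent:
    draft: dict                                # mutable editing state
    published_version_id: Optional[int] = None


@dataclass
class VersionStore:
    versions: Dict[int, dict] = field(default_factory=dict)
    next_id: int = 1

    def publish(self, agent: Agent) -> int:
        # Snapshot the draft; nothing in this sketch mutates the copy afterwards.
        version_id = self.next_id
        self.next_id += 1
        self.versions[version_id] = copy.deepcopy(agent.draft)
        agent.published_version_id = version_id
        return version_id

    def rollback(self, agent: Agent, version_id: int) -> None:
        # Rollback only moves the pointer; no snapshot is rewritten.
        if version_id not in self.versions:
            raise KeyError(version_id)
        agent.published_version_id = version_id

    def live_config(self, agent: Agent) -> Optional[dict]:
        # What the widget would read: the version, never the draft.
        if agent.published_version_id is None:
            return None
        return self.versions[agent.published_version_id]


store = VersionStore()
agent = Agent(draft={"system_prompt": "v1"})
v1 = store.publish(agent)
agent.draft["system_prompt"] = "v2"            # unpublished edit
live_during_edit = store.live_config(agent)["system_prompt"]
v2 = store.publish(agent)
store.rollback(agent, v1)
live_after_rollback = store.live_config(agent)["system_prompt"]
```

Editing the draft between publishes leaves the live configuration unchanged, and rolling back restores the first snapshot without touching the second.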

Knowledge: sources, documents, chunks

Sources are the ingestion units. A source can be a URL, a sitemap, an RSS feed, a Notion page, a Google Drive document, or a SQL database connection. Sensitive credentials (OAuth tokens, DB passwords) are encrypted at rest with AES-256-GCM via Laravel's encrypted casts.

Each source produces one or more documents after crawling. Documents are split into chunks — the embeddable unit. The chunks table tracks chunk text, token count, and metadata, but the embeddings themselves live in Vectorize (or Qdrant) and are joined back by chunk id.
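A sketch of the join-back step, assuming the vector store returns (chunk id, score) pairs. The `hydrate` helper, the sample rows, and the choice to skip hits with no matching chunk row are all assumptions made for illustration; no real Vectorize or Qdrant API is used.

```python
from typing import Dict, List, Tuple

# Hypothetical rows from the chunks table, keyed by chunk id.
chunks: Dict[int, dict] = {
    101: {"document_id": 1, "text": "Refunds are issued within 14 days.", "token_count": 7},
    102: {"document_id": 1, "text": "Contact support to start a refund.", "token_count": 7},
}

# Hypothetical vector-store response: (chunk_id, similarity score), best first.
hits: List[Tuple[int, float]] = [(102, 0.91), (101, 0.84), (999, 0.80)]


def hydrate(hits: List[Tuple[int, float]], chunks: Dict[int, dict]) -> List[dict]:
    """Attach chunk text from Postgres to each vector hit, preserving rank order."""
    results = []
    for chunk_id, score in hits:
        row = chunks.get(chunk_id)
        if row is None:
            # Assumption: a hit without a chunk row is stale and is dropped.
            continue
        results.append({"chunk_id": chunk_id, "score": score, "text": row["text"]})
    return results


hydrated = hydrate(hits, chunks)
```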

Operational data

Table            What it holds
conversations    One row per visitor session. Carries agent_id, visitor identity, and a claimed_by_user_id when a human takes over.
messages         The turn log. Role (user / assistant / human), content, citations, confidence score.
leads            Captured visitor info, with a unique constraint to deduplicate within an agent.
knowledge_gaps   Low-confidence answers queued for review and curated-answer drafting.
usage_events     One row per billable event. Aggregated for quota enforcement and admin analytics.
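As a rough illustration of how usage_events rows might be aggregated for quota enforcement, the sketch below counts a workspace's events in the current calendar month. The calendar-month window, the `monthly_limit` parameter, and the function names are assumptions, not documented behaviour.

```python
from datetime import datetime, timezone
from typing import List


def month_start(now: datetime) -> datetime:
    # First instant of the month containing `now`.
    return now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def usage_this_month(events: List[dict], workspace_id: int, now: datetime) -> int:
    # Count one unit per billable event inside the window, for one workspace.
    start = month_start(now)
    return sum(
        1
        for e in events
        if e["workspace_id"] == workspace_id and start <= e["created_at"] <= now
    )


def within_quota(events: List[dict], workspace_id: int, now: datetime, monthly_limit: int) -> bool:
    return usage_this_month(events, workspace_id, now) < monthly_limit


utc = timezone.utc
events = [
    {"workspace_id": 10, "created_at": datetime(2024, 3, 1, 9, 0, tzinfo=utc)},
    {"workspace_id": 10, "created_at": datetime(2024, 3, 15, 14, 30, tzinfo=utc)},
    {"workspace_id": 10, "created_at": datetime(2024, 2, 28, 23, 0, tzinfo=utc)},
    {"workspace_id": 20, "created_at": datetime(2024, 3, 10, 8, 0, tzinfo=utc)},
]
now = datetime(2024, 3, 20, 12, 0, tzinfo=utc)
used = usage_this_month(events, workspace_id=10, now=now)
```

In Postgres the same count would be a range query on workspace_id and created_at.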

Critical indexes

  • conversations(workspace_id, visitor_id) — fast lookup of a returning visitor's history.
  • messages(conversation_id, created_at) — turn-order retrieval.
  • usage_events(workspace_id, created_at) — monthly quota windows.
  • leads(agent_id, email) unique — dedup at insert time.
  • chunks(document_id) + chunks(source_id) — purge and re-index paths.
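The unique index on leads(agent_id, email) can be demonstrated with a small sketch. SQLite (via Python's standard library) stands in for Postgres here, and the `capture_lead` helper and its return convention are hypothetical; the idea is only that the database rejects the duplicate at insert time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE leads ("
    " id INTEGER PRIMARY KEY,"
    " agent_id INTEGER NOT NULL,"
    " email TEXT NOT NULL)"
)
# Same shape as the leads(agent_id, email) unique index described above.
conn.execute("CREATE UNIQUE INDEX leads_agent_email ON leads (agent_id, email)")


def capture_lead(conn: sqlite3.Connection, agent_id: int, email: str) -> bool:
    """Insert a lead; return False if this agent already has that email."""
    try:
        conn.execute(
            "INSERT INTO leads (agent_id, email) VALUES (?, ?)",
            (agent_id, email),
        )
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected the duplicate.
        return False


first = capture_lead(conn, 1, "ada@example.com")
duplicate = capture_lead(conn, 1, "ada@example.com")
other_agent = capture_lead(conn, 2, "ada@example.com")
total = conn.execute("SELECT COUNT(*) FROM leads").fetchone()[0]
```

The same email is accepted for a different agent, because deduplication is scoped to the agent.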

Note. Vector embeddings are never stored in Postgres. They live in the vector store keyed by chunk id, with workspace_id and agent_id attached as metadata for tenant filtering.
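Tenant filtering on vector metadata might look like the sketch below. An in-memory list and a hand-written cosine similarity stand in for the vector store; the `search` function and the record layout are assumptions and do not reflect the Vectorize or Qdrant client APIs.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index: List[dict], query: List[float], workspace_id: int, agent_id: int, top_k: int = 3) -> List[int]:
    """Return chunk ids ranked by similarity, restricted to one tenant and agent."""
    # Filter on metadata first, so other tenants' vectors are never ranked.
    candidates = [
        v for v in index
        if v["metadata"]["workspace_id"] == workspace_id
        and v["metadata"]["agent_id"] == agent_id
    ]
    ranked = sorted(candidates, key=lambda v: cosine(query, v["values"]), reverse=True)
    return [v["id"] for v in ranked[:top_k]]


# Hypothetical index entries: id is the chunk id, metadata carries the tenant keys.
index = [
    {"id": 101, "values": [1.0, 0.0], "metadata": {"workspace_id": 10, "agent_id": 1}},
    {"id": 102, "values": [0.8, 0.6], "metadata": {"workspace_id": 10, "agent_id": 1}},
    {"id": 201, "values": [1.0, 0.0], "metadata": {"workspace_id": 20, "agent_id": 7}},
]
own_results = search(index, [1.0, 0.0], workspace_id=10, agent_id=1)
```

Chunk 201 matches the query exactly but belongs to another workspace, so it never appears in workspace 10's results.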