What you can do#
The public Q&A widget provides a chat interface for website visitors to ask questions about your product. It uses LumenFlow's knowledge corpus and hybrid retrieval to generate accurate, grounded answers without requiring authentication.
| Capability | Description | Status |
|---|---|---|
| Streaming chat | Real-time SSE streaming responses | Shipped |
| Hybrid retrieval | Combines corpus knowledge with semantic search | Shipped |
| Rate limiting | Per-IP rate limiting (database-backed or in-memory) | Shipped |
| Abuse controls | Message length limits, conversation cap, payload size | Shipped |
| Analytics | Aggregate metrics to Axiom (no transcript storage) | Shipped |
| Kill switch | Disable instantly via environment variable | Shipped |
How it works#
The public chat endpoint streams AI responses grounded in your knowledge corpus. When a visitor asks a question:
- The message is validated for length, payload size, and abuse
- Rate limits are checked (per-IP)
- Relevant knowledge is retrieved using hybrid search (keyword + semantic)
- The AI generates a streaming response grounded in the retrieved context
- Aggregate analytics are emitted (topics, turn count, message length bucket)
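The in-memory per-IP rate limiting mentioned above can be sketched roughly as a sliding-window counter. This is an illustrative sketch, not LumenFlow's implementation; the class name, window size, and request limit are assumptions:

```typescript
// Hypothetical sketch of an in-memory per-IP sliding-window rate limiter,
// the kind of fallback used when no database-backed limiter is available.
// The limit (20 requests) and window (60 s) are illustrative assumptions.
class InMemoryRateLimiter {
  private hits = new Map<string, number[]>();

  constructor(private limit = 20, private windowMs = 60_000) {}

  // Returns true if the request from `ip` is allowed, false if rate limited.
  allow(ip: string, now: number = Date.now()): boolean {
    // Keep only timestamps that still fall inside the window.
    const recent = (this.hits.get(ip) ?? []).filter((t) => now - t < this.windowMs);
    if (recent.length >= this.limit) {
      this.hits.set(ip, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(ip, recent);
    return true;
  }
}
```

A database-backed limiter would follow the same shape but persist the per-IP counters so they survive across server instances.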
> **Info:** The public chat does not store conversation transcripts. Only aggregate analytics (topic distribution, turn counts, rate limit events) are logged.
Configuration#
The public chat widget is controlled by environment variables:
| Variable | Description | Default |
|---|---|---|
| `SIDEKICK_PUBLIC_CHAT_ENABLED` | Kill switch (`"false"` to disable) | `"true"` |
| `SIDEKICK_PUBLIC_API_KEY` | Company-owned API key for the AI model | Required |
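A handler reading these variables might resolve them as below. This is a hedged sketch: only the two documented variable names come from this page; the function name and error handling are assumptions.

```typescript
// Hypothetical helper: resolve public chat configuration from the
// environment. Only the two variable names are documented; everything
// else here is an illustrative assumption.
function publicChatConfig(env: Record<string, string | undefined>) {
  // Kill switch defaults to enabled; only the literal "false" disables it.
  const enabled = env.SIDEKICK_PUBLIC_CHAT_ENABLED !== "false";
  const apiKey = env.SIDEKICK_PUBLIC_API_KEY;
  if (enabled && !apiKey) {
    throw new Error("SIDEKICK_PUBLIC_API_KEY is required when public chat is enabled");
  }
  return { enabled, apiKey };
}
```

In a Node-based deployment this would typically be called with `process.env` at startup so a missing key fails fast rather than at request time.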
Limits#
- **Conversation turns**: maximum 10 turns per conversation
- **Message length**: enforced server-side per message
- **Payload size**: request body size capped for abuse prevention
- **Rate limiting**: per-IP; database-backed in production, with an in-memory fallback for development
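Server-side enforcement of these limits might look like the following. Only the 10-turn cap is documented above; the specific character and byte limits, the function name, and the doubling of turns into messages are assumptions for illustration:

```typescript
// Hypothetical sketch of the server-side limit checks. MAX_TURNS comes
// from the documented 10-turn cap; the character and byte limits below
// are illustrative assumptions, not LumenFlow's actual values.
const MAX_TURNS = 10;
const MAX_MESSAGE_CHARS = 4000;   // assumption
const MAX_PAYLOAD_BYTES = 32_768; // assumption

function validateChatBody(raw: string): { ok: boolean; error?: string } {
  // Payload size check on the raw request body.
  if (new TextEncoder().encode(raw).length > MAX_PAYLOAD_BYTES) {
    return { ok: false, error: "payload too large" };
  }
  let body: { messages?: { content?: string }[] };
  try {
    body = JSON.parse(raw);
  } catch {
    return { ok: false, error: "invalid JSON" };
  }
  const messages = body.messages ?? [];
  // Assume one user + one assistant message per turn.
  if (messages.length > MAX_TURNS * 2) {
    return { ok: false, error: "conversation too long" };
  }
  // Per-message length check.
  for (const m of messages) {
    if ((m.content ?? "").length > MAX_MESSAGE_CHARS) {
      return { ok: false, error: "message too long" };
    }
  }
  return { ok: true };
}
```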
Knowledge corpus#
The widget draws answers from the public knowledge corpus built into LumenFlow. This includes product documentation, feature descriptions, and frequently asked questions. The corpus is cached at the module level for fast response times.
Embedding the widget#
The public chat endpoint is available at:

```
POST /api/public/sidekick/chat
```

Send a JSON body with a `messages` array following the standard chat message format. The response is an SSE stream compatible with the Vercel AI SDK `useChat` hook.
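A minimal client call might be assembled as follows. The endpoint path and `messages` body come from this page; the helper name, message type, and header choices are assumptions:

```typescript
// Hypothetical helper that builds the fetch request for the public chat
// endpoint. The path and `messages` array shape come from the docs; the
// rest is an illustrative sketch.
type ChatMessage = { role: "user" | "assistant"; content: string };

function buildChatRequest(messages: ChatMessage[]) {
  return {
    url: "/api/public/sidekick/chat",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages }),
    },
  };
}

// Usage in a browser (the response body is an SSE stream):
//   const req = buildChatRequest([{ role: "user", content: "What is LumenFlow?" }]);
//   const res = await fetch(req.url, req.init);
// With the Vercel AI SDK, you can instead point useChat at the endpoint:
//   useChat({ api: "/api/public/sidekick/chat" })
```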
> **Warning:** The public chat uses a company-owned API key, not workspace keys. Ensure `SIDEKICK_PUBLIC_API_KEY` is set in your deployment environment before enabling the widget.