What is the architecture of Rozz.Site?
Rozz is built as an independent, embeddable web component that runs entirely in the browser, leveraging a lightweight semantic‑search engine and a small LLM interface.
1. Component Layer
- Lit‑based Web Component – Encapsulates its own DOM and styles, preventing interference with the host page.
- Keyboard & ARIA support – Full accessibility, screen‑reader friendly, WCAG 2.1 AA compliant.
2. Crawling & Embedding
- Client‑side crawler – Walks the site’s HTML, PDFs, and other documents, extracting text and building a vector index.
- Embeddings – Generated using a lightweight on‑device transformer, stored in IndexedDB for offline use.
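As a rough illustration of the crawl-and-embed step, the sketch below splits extracted page text into passages and attaches a vector to each one. Everything here is an assumption made for illustration: the names `IndexEntry`, `chunkText`, `toyEmbed`, and `buildIndex` are hypothetical, the hashed bag-of-words function merely stands in for the on-device transformer, and a plain array stands in for the IndexedDB store.

```typescript
// Hypothetical types and helpers; none of these names come from Rozz itself.
interface IndexEntry {
  url: string;
  passage: string;
  vector: number[];
}

// Split extracted page text into passages of at most `maxWords` words.
function chunkText(text: string, maxWords = 50): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}

// Stand-in for the on-device transformer (assumption): a hashed
// bag-of-words vector, used only so the sketch runs without a model.
function toyEmbed(text: string, dims = 64): number[] {
  const vec = new Array<number>(dims).fill(0);
  for (const word of text.toLowerCase().split(/\W+/)) {
    if (word.length === 0) continue;
    let hash = 0;
    for (let i = 0; i < word.length; i++) {
      hash = (hash * 31 + word.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
    }
    vec[hash % dims] += 1;
  }
  return vec;
}

// Build an in-memory index: one entry per passage of every crawled page.
function buildIndex(
  pages: { url: string; text: string }[],
  maxWords = 50
): IndexEntry[] {
  const index: IndexEntry[] = [];
  for (const page of pages) {
    for (const passage of chunkText(page.text, maxWords)) {
      index.push({ url: page.url, passage, vector: toyEmbed(passage) });
    }
  }
  return index;
}
```

In a browser, the resulting entries could be written to an IndexedDB object store instead of being kept in an array; that persistence step is left out to keep the sketch short.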
3. Query Processing
- Semantic search – Matches user queries against the index, returning the most relevant passages.
- LLM “chatbot” layer – A small, fine‑tuned model (e.g., GPT‑4‑turbo‑style) formats answers, summarizes, and handles follow‑ups.
- Prompt guardrails – Built‑in protection against XSS, prompt injection, and cross‑domain leakage.
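The semantic-search step can be pictured as a nearest-neighbour lookup over passage vectors. The sketch below ranks passages by cosine similarity to a query vector. It assumes the query has already been embedded by the same model as the passages; the names `Passage`, `cosineSimilarity`, and `topK` are hypothetical, and the source does not state which similarity measure Rozz actually uses.

```typescript
// Hypothetical shape of an indexed passage (assumption, not Rozz's schema).
interface Passage {
  text: string;
  vector: number[];
}

// Cosine similarity between two equal-length vectors, in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k passages most similar to the query vector, best first.
function topK(
  queryVector: number[],
  index: Passage[],
  k = 3
): { text: string; score: number }[] {
  return index
    .map((p) => ({ text: p.text, score: cosineSimilarity(queryVector, p.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan like this is usually adequate for an index the size of a single website; an approximate nearest-neighbour structure would only matter for much larger collections.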
4. Data Flow
1. User types a question →
2. Component queries the local vector index →
3. Result snippets passed to the LLM →
4. Formatted response displayed in the chat UI.
No sensitive data is sent to external servers; all processing stays on the client, keeping user privacy intact.
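The four steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: `Search`, `Llm`, `buildPrompt`, `escapeHtml`, and `answerQuestion` are hypothetical names, the search and model calls are passed in as functions because their real interfaces are not described here, and the prompt wording is invented. Escaping the model output before rendering is one simple way to address the XSS concern mentioned under the guardrails.

```typescript
// Hypothetical function types for the two pluggable stages (assumptions).
type Search = (question: string) => Promise<string[]>;
type Llm = (prompt: string) => Promise<string>;

// Escape text so it can be inserted into the chat UI without being
// interpreted as HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Combine the retrieved snippets and the question into a single prompt.
function buildPrompt(question: string, snippets: string[]): string {
  const context = snippets.map((s, i) => `[${i + 1}] ${s}`).join("\n");
  return (
    "Answer using only the context below.\n\n" +
    `Context:\n${context}\n\n` +
    `Question: ${question}`
  );
}

// Steps 2 to 4 of the data flow: search, ask the model, return safe text.
async function answerQuestion(
  question: string,
  search: Search,
  llm: Llm
): Promise<string> {
  const snippets = await search(question); // step 2: query the local index
  const prompt = buildPrompt(question, snippets); // step 3: pass snippets on
  const raw = await llm(prompt); // step 3: model produces an answer
  return escapeHtml(raw); // step 4: safe to display in the chat UI
}
```

Passing `search` and `llm` in as parameters keeps the flow easy to test with stubs, for example `answerQuestion("What is Rozz?", async () => ["Rozz is a web component."], async (p) => p)`.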
5. Deployment & Integration
- Add a single <script src="https://cdn.rozz.site/rozz.js"></script> tag and a <rozz-search></rozz-search> tag to your site.
- No backend integration required; the component can be used on static sites, CMS, or SPAs.
What aspect of the architecture would you like to dive deeper into—crawling strategy, embedding model, or integration details?