System Architecture

This page gives a high-level view of how Genie is built, intended for users and admins curious about what runs behind the scenes. It is not a developer reference — for technical specifications, see the engineering repositories.

Authoritative architecture diagram

The full architecture is documented in the source-of-truth diagram below (also available as a standalone file: architecture-diagram.svg):

Click the diagram to open the full-size SVG in a new tab.

The simplified Mermaid renderings below are easier to read in-page but reflect the same logical layers.

High-level flow

The layered architecture

Genie is built as a layered system with clear separation of concerns:

The components in plain English

Frontend (React SPA)

Built with React 19 + TypeScript using Vite as the build tool.
UI is composed from Microsoft Fluent UI and Material UI components, with Tailwind CSS for layout and theming utilities.
Hosted on Azure Static Web Apps for global low-latency delivery.
Uses MSAL.js (Microsoft Authentication Library) to handle SSO + MFA against Entra ID.
Streaming chat responses are consumed as NDJSON (newline-delimited JSON) over plain HTTP — every token is appended to the answer as it arrives.
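
The NDJSON handling described above can be illustrated with a small sketch. It is written in Python for consistency with the other examples on this page (the real frontend is TypeScript), and the "token" field name and the sample chunks are assumptions made for illustration, not Genie's actual wire format. The point it shows is that a network chunk can end in the middle of a line, so the consumer buffers until it sees a newline before parsing.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Yield one parsed JSON object per complete line, buffering partial lines."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A chunk may end mid-line, so only consume up to the last newline seen.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)
    if buffer.strip():
        # Final line that arrived without a trailing newline.
        yield json.loads(buffer)


def assemble_answer(chunks: Iterable[bytes]) -> str:
    """Append each token to the answer in arrival order."""
    answer = ""
    for event in iter_ndjson(chunks):
        answer += event.get("token", "")  # "token" is a hypothetical field name
    return answer


# Simulated network chunks; the second JSON line is split across two chunks.
demo_chunks = [b'{"token": "Hel"}\n{"tok', b'en": "lo"}\n', b'{"token": "!"}\n']
print(assemble_answer(demo_chunks))
```

In a browser the same buffering logic would typically sit on top of a streamed response body, with the UI re-rendering after each appended token.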

Backend (Python Quart)

Quart is an asynchronous Python web framework (similar to Flask but ASGI-based), chosen so that chat streaming and background jobs can run concurrently within a single process.
Runs on Azure Container Apps with horizontal auto-scaling.
Validates the signature of every incoming JWT against the signing keys published in Entra ID's JWKS, and uses the On-Behalf-Of (OBO) flow to call Microsoft Graph on behalf of the user (for user/group lookups).
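
To make the "asynchronous, single process" point concrete, here is a minimal sketch using only Python's standard asyncio library. It is not Quart code and not Genie's implementation; the function names, the token list, and the three-step job are all hypothetical. It only shows how a streamed response and a background job can interleave on one event loop without threads.

```python
import asyncio
from typing import AsyncIterator, List, Tuple


async def stream_tokens(tokens: List[str]) -> AsyncIterator[str]:
    """Yield tokens one at a time, returning control to the event loop in between."""
    for token in tokens:
        await asyncio.sleep(0)  # stands in for waiting on the model
        yield token


async def background_job(log: List[str]) -> None:
    """Hypothetical background job, e.g. processing an uploaded file."""
    for step in range(3):
        await asyncio.sleep(0)  # stands in for I/O such as a storage call
        log.append(f"job step {step}")


async def main() -> Tuple[str, List[str]]:
    log: List[str] = []
    # Schedule the job so it runs alongside the stream on the same event loop.
    job = asyncio.create_task(background_job(log))
    answer = ""
    async for token in stream_tokens(["Gen", "ie", " ", "ok"]):
        answer += token
    await job  # make sure the job has finished before returning
    return answer, log


answer, log = asyncio.run(main())
print(answer, log)
```

An ASGI framework applies the same idea per request: while one handler awaits the next token, the event loop is free to serve other requests or advance background work.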

Knowledge Base = Multi-Tenant Architecture

Each knowledge base in Genie is a logical tenant:

Documents live in Azure Blob Storage under a {kb-id}/{filename} path
Search records live in a shared Azure AI Search index, filtered by a forced knowledge_base_id filter on every query (so you can never search across KBs)
Metadata, users, and admin lists live in Cosmos DB documents
Chat history lives in a separate Cosmos DB container, partitioned by your user ID, so conversations are scoped to a KB but stored per user

There is no traditional "department" or "tenant" object — each KB is its own tenant, and HR/HSE/Quality/etc. are just KBs with the right Entra groups assigned.
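
The per-KB isolation described above can be sketched as two small helpers. This is an illustration under stated assumptions, not the backend's code: the function names and the example `language` field are hypothetical, while the `{kb-id}/{filename}` layout and the `knowledge_base_id` field come from the description above. The filter uses the OData syntax that Azure AI Search accepts, where a single quote inside a string literal is escaped by doubling it.

```python
from typing import Optional


def blob_path(kb_id: str, filename: str) -> str:
    """Build the {kb-id}/{filename} storage path for a document."""
    return f"{kb_id}/{filename}"


def scoped_filter(kb_id: str, extra_filter: Optional[str] = None) -> str:
    """Build an OData filter that always pins the query to a single KB."""
    escaped = kb_id.replace("'", "''")  # OData escapes ' by doubling it
    forced = f"knowledge_base_id eq '{escaped}'"
    if extra_filter:
        # The forced clause is AND-ed in, so an extra filter can only narrow results.
        return f"({forced}) and ({extra_filter})"
    return forced


print(blob_path("hr", "leave-policy.pdf"))
print(scoped_filter("hr"))
print(scoped_filter("hr", "language eq 'en'"))  # "language" is a hypothetical field
```

Because the KB clause is added by the server rather than supplied by the caller, a query has no way to drop it and reach another KB's records.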

The RAG pipeline

When you ask a question:

Document ingestion pipeline

When an admin uploads a document:

Each chunk is stored with its source filename and page number, which is how citations work.
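
As a rough illustration of how stored chunk metadata can turn into citations, the sketch below models a chunk with the two fields mentioned above and formats a de-duplicated citation list. The class name, field names, label format, and sample data are assumptions for illustration only; the actual index schema may differ.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Chunk:
    text: str
    source_filename: str  # hypothetical field name
    page_number: int      # hypothetical field name


def format_citations(chunks: List[Chunk]) -> List[str]:
    """Return de-duplicated 'filename, p. N' labels in first-seen order."""
    seen: Set[Tuple[str, int]] = set()
    labels: List[str] = []
    for chunk in chunks:
        key = (chunk.source_filename, chunk.page_number)
        if key not in seen:
            seen.add(key)
            labels.append(f"{chunk.source_filename}, p. {chunk.page_number}")
    return labels


retrieved = [
    Chunk("Annual leave is 25 days.", "leave-policy.pdf", 3),
    Chunk("Leave requests need manager approval.", "leave-policy.pdf", 3),
    Chunk("Carry-over is limited to 5 days.", "leave-policy.pdf", 4),
]
print(format_citations(retrieved))
```

Two chunks from the same page collapse into one label, so the answer cites each page once even when several passages from it were retrieved.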

Azure services in use

Service	Purpose
Azure Container Apps	Hosts the backend Quart application
Azure Static Web Apps	Hosts the React frontend + this documentation site
Azure OpenAI	GPT models for chat, embeddings for vector search
Azure AI Search	Hybrid keyword + vector + semantic search over documents
Azure Cosmos DB	KB metadata, chat history, file processing jobs, join requests, audit logs
Azure Blob Storage / Data Lake Gen2	Source documents and figure images
Azure Key Vault	Secrets and certificates (no plain-text secrets in code or env vars)
Azure Speech Services	Text-to-speech for the optional "read aloud" feature
Azure Document Intelligence	Extracting text, tables, and figures from PDFs
Azure Cognitive Services Vision	OCR and image analysis for figures inside documents
Azure Communication Services	Email notifications for join requests
Azure Defender for Storage	Malware scanning of every uploaded file
Azure Application Insights	Telemetry, performance monitoring, error tracking
Microsoft Entra ID (Azure AD)	Identity, SSO, MFA, group memberships
Microsoft Graph	Looking up user profiles and group lists

Related

Security & Privacy: how the architecture protects your data
Known Limitations: current architectural constraints
