The assistant uses a two-tier knowledge base to ground its responses in relevant documentation. Documents are chunked, embedded, and stored in a local ChromaDB vector database at ~/.qai/assist/chroma/.

Two Tiers

Product Knowledge (Tier 1)

qai’s own documentation is indexed automatically. This includes:
  • All Mintlify pages in the documentation/ directory
  • README.md
  • docs/Architecture.md and docs/Roadmap.md
  • src/q_ai/core/data/frameworks.yaml
Product knowledge is treated as trusted content — it’s injected directly into the prompt without boundary markers. Approximately two-thirds of retrieved chunks come from this tier.

User Knowledge (Tier 2)

You can add your own reference material to ~/.qai/knowledge/. The assistant indexes these files and retrieves relevant chunks alongside product documentation. User knowledge is treated as semi-trusted content — it’s wrapped in boundary markers and referenced but not followed as instructions. Approximately one-third of retrieved chunks come from this tier.
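A minimal sketch of how the two trust levels might be applied at prompt-assembly time. The marker strings below are hypothetical, not qai’s actual format:

```python
def render_chunk(chunk: str, source: str, tier: str) -> str:
    """Format a retrieved chunk for prompt injection.

    Product knowledge (tier 1) is trusted and injected verbatim; user
    knowledge (tier 2) is fenced in boundary markers so the model treats
    it as reference material, not instructions. Marker text is hypothetical.
    """
    if tier == "product":
        return chunk
    return (
        f"<<<BEGIN USER KNOWLEDGE: {source}>>>\n"
        f"{chunk}\n"
        f"<<<END USER KNOWLEDGE>>>"
    )
```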

Supported File Formats

Format       Extensions
Markdown     .md, .mdx
Plain text   .txt
YAML         .yaml, .yml
MDX files are preprocessed to strip JSX component tags (e.g., Mintlify Card, Tabs, CodeGroup elements) while preserving the text content.
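A rough sketch of that preprocessing step, assuming JSX component tags can be recognised by their leading capital letter (the convention Mintlify components follow):

```python
import re

def strip_jsx(mdx: str) -> str:
    """Remove JSX component tags such as <Card title="...">, <Tabs>, or
    </CodeGroup> while keeping the text they enclose.

    Heuristic: component names start with a capital letter, which
    distinguishes them from plain HTML tags like <b> or <code>.
    """
    return re.sub(r"</?[A-Z][A-Za-z]*[^>]*>", "", mdx)
```

Self-closing tags like `<Tip />` are also caught by the same pattern, while lowercase HTML passes through untouched.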

What to Put in the User Knowledge Directory

The user knowledge directory is for reference material that helps the assistant give you better answers. Good candidates:
  • TTPs and attack playbooks — techniques and procedures relevant to your testing methodology
  • Threat models — architecture diagrams and threat narratives for systems you’re testing
  • Environment documentation — target system details, network layouts, MCP server configurations
  • Research notes — findings from previous assessments, paper summaries, conference notes
  • Custom frameworks — organisation-specific security taxonomies or testing checklists
The assistant will retrieve relevant sections when your questions match the content. You don’t need to tell it about specific files — retrieval is automatic.

How Indexing Works

Chunking

Documents are split into chunks of approximately 500 tokens:
  1. YAML frontmatter is extracted (title and description prepended to the first chunk)
  2. Text is split on Markdown headings (h2, h3, h4)
  3. Sections exceeding the target size are split on paragraph boundaries
  4. Adjacent chunks overlap by approximately 50 tokens for continuity
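The steps above can be sketched as a simplified splitter. Token counts here are approximated by whitespace words, and the frontmatter-extraction step is omitted; the real implementation presumably counts tokens with the embedding model’s tokenizer:

```python
import re

CHUNK_TOKENS = 500   # approximate target chunk size
OVERLAP = 50         # approximate token overlap between adjacent chunks

def n_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace word.
    return len(text.split())

def chunk_markdown(text: str) -> list[str]:
    """Split on h2-h4 headings, then on paragraph boundaries for
    sections that exceed the target size, carrying a small overlap
    forward between adjacent chunks."""
    sections = re.split(r"\n(?=#{2,4} )", text)
    chunks = []
    for sec in sections:
        if n_tokens(sec) <= CHUNK_TOKENS:
            chunks.append(sec)
            continue
        current = ""
        for para in sec.split("\n\n"):
            if current and n_tokens(current) + n_tokens(para) > CHUNK_TOKENS:
                chunks.append(current)
                # Keep the last ~OVERLAP tokens for continuity.
                current = " ".join(current.split()[-OVERLAP:])
            current = (current + "\n\n" + para).strip()
        if current:
            chunks.append(current)
    return chunks
```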

Embedding

Chunks are embedded using a local sentence-transformers model (default: all-MiniLM-L6-v2). Embeddings run entirely on your machine — no data is sent to external services for indexing.

Storage

Embeddings are stored in two ChromaDB collections:
Collection          Content
product_knowledge   qai documentation chunks
user_knowledge      User-provided reference chunks

Change Detection

On each query, the assistant checks whether indexed files have changed:
  1. Computes a SHA-256 hash of each source file
  2. Compares against the stored manifest at ~/.qai/assist/index_manifest.json
  3. If any file in a tier has changed, that tier’s collection is rebuilt
This means adding or modifying files in ~/.qai/knowledge/ takes effect on the next query without manual intervention.

Forcing a Reindex

To rebuild the entire knowledge base from scratch:
qai assist reindex
Use this if:
  • You changed the embedding model
  • The index seems corrupted or stale
  • You want to verify that all files are properly indexed

Retrieval

When you ask a question, the assistant:
  1. Encodes your query using the embedding model
  2. Retrieves the most relevant chunks from both collections (weighted ~2/3 product, ~1/3 user)
  3. Ranks results by cosine similarity
  4. Includes retrieved chunks in the prompt, respecting the model’s context window budget
The number of chunks retrieved adapts to available context space. Larger context windows (common with cloud models) allow more retrieval; smaller windows (common with local models) retrieve fewer but still relevant chunks.
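A simplified version of the ranking and budgeting logic. Cosine similarity and the ~2/3 / ~1/3 split come from the description above; the chunk structure and the exact budget arithmetic are illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, product, user, budget_tokens, chunk_tokens=500):
    """Rank chunks from both collections by cosine similarity, split the
    slots ~2/3 product / ~1/3 user, and keep only as many chunks as the
    context budget allows."""
    n_total = max(1, budget_tokens // chunk_tokens)
    n_product = max(1, round(n_total * 2 / 3))
    n_user = n_total - n_product

    def top(chunks, k):
        ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]),
                        reverse=True)
        return ranked[:k]

    return top(product, n_product) + top(user, n_user)
```

With a larger budget, `n_total` grows and more chunks from both tiers make it into the prompt, matching the adaptive behaviour described above.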