The assistant uses a two-tier knowledge base to ground its responses in relevant documentation. Documents are chunked, embedded, and stored in a local ChromaDB vector database at ~/.qai/assist/chroma/.

Two Tiers

Product Knowledge (Tier 1)

qai’s own documentation is indexed automatically. This includes:
  • All Mintlify pages in the documentation/ directory
  • README.md
  • docs/Architecture.md and docs/Roadmap.md
  • src/q_ai/core/data/frameworks.yaml
Product knowledge is treated as trusted content — it’s injected directly into the prompt without boundary markers. Approximately two-thirds of retrieved chunks come from this tier.

User Knowledge (Tier 2)

You can add your own reference material to ~/.qai/knowledge/. The assistant indexes these files and retrieves relevant chunks alongside product documentation. User knowledge is treated as semi-trusted content — it’s wrapped in boundary markers and referenced but not followed as instructions. Approximately one-third of retrieved chunks come from this tier.
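A minimal sketch of how the two trust levels might be applied at prompt-assembly time. The marker strings below are hypothetical, not qai’s actual format:

```python
def render_chunk(chunk: str, source: str, tier: str) -> str:
    """Format a retrieved chunk for prompt injection.

    Product knowledge (tier 1) is trusted and injected verbatim; user
    knowledge (tier 2) is fenced in boundary markers so the model treats
    it as reference material, not instructions. Marker text is hypothetical.
    """
    if tier == "product":
        return chunk
    return (
        f"<<<BEGIN USER KNOWLEDGE: {source}>>>\n"
        f"{chunk}\n"
        f"<<<END USER KNOWLEDGE>>>"
    )
```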

Supported File Formats

Format       Extensions
Markdown     .md, .mdx
Plain text   .txt
YAML         .yaml, .yml
MDX files are preprocessed to strip JSX component tags (e.g., Mintlify Card, Tabs, CodeGroup elements) while preserving the text content.
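A rough sketch of that preprocessing step, assuming JSX component tags can be recognised by their leading capital letter (the convention Mintlify components follow):

```python
import re

def strip_jsx(mdx: str) -> str:
    """Remove JSX component tags such as <Card title="...">, <Tabs>, or
    </CodeGroup> while keeping the text they enclose.

    Heuristic: component names start with a capital letter, which
    distinguishes them from plain HTML tags like <b> or <code>.
    """
    return re.sub(r"</?[A-Z][A-Za-z]*[^>]*>", "", mdx)
```

Self-closing tags like `<Tip />` are also caught by the same pattern, while lowercase HTML passes through untouched.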

What to Put in the User Knowledge Directory

The user knowledge directory is for reference material that helps the assistant give you better answers. Good candidates:
  • TTPs and attack playbooks — techniques and procedures relevant to your testing methodology
  • Threat models — architecture diagrams and threat narratives for systems you’re testing
  • Environment documentation — target system details, network layouts, MCP server configurations
  • Research notes — findings from previous assessments, paper summaries, conference notes
  • Custom frameworks — organisation-specific security taxonomies or testing checklists
The assistant will retrieve relevant sections when your questions match the content. You don’t need to tell it about specific files — retrieval is automatic.

How Indexing Works

Chunking

Documents are split into chunks of approximately 500 tokens:
  1. YAML frontmatter is extracted (title and description prepended to the first chunk)
  2. Text is split on Markdown headings (h2, h3, h4)
  3. Sections exceeding the target size are split on paragraph boundaries
  4. Adjacent chunks overlap by approximately 50 tokens for continuity
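The steps above can be sketched as a simplified splitter. Token counts here are approximated by whitespace words, and the frontmatter-extraction step is omitted; the real implementation presumably counts tokens with the embedding model’s tokenizer:

```python
import re

CHUNK_TOKENS = 500   # approximate target chunk size
OVERLAP = 50         # approximate token overlap between adjacent chunks

def n_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace word.
    return len(text.split())

def chunk_markdown(text: str) -> list[str]:
    """Split on h2-h4 headings, then on paragraph boundaries for
    sections that exceed the target size, carrying a small overlap
    forward between adjacent chunks."""
    sections = re.split(r"\n(?=#{2,4} )", text)
    chunks = []
    for sec in sections:
        if n_tokens(sec) <= CHUNK_TOKENS:
            chunks.append(sec)
            continue
        current = ""
        for para in sec.split("\n\n"):
            if current and n_tokens(current) + n_tokens(para) > CHUNK_TOKENS:
                chunks.append(current)
                # Keep the last ~OVERLAP tokens for continuity.
                current = " ".join(current.split()[-OVERLAP:])
            current = (current + "\n\n" + para).strip()
        if current:
            chunks.append(current)
    return chunks
```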

Embedding

Chunks are embedded using a local sentence-transformers model (default: all-MiniLM-L6-v2). Embeddings run entirely on your machine — no data is sent to external services for indexing.

Storage

Embeddings are stored in two ChromaDB collections:
Collection          Content
product_knowledge   qai documentation chunks
user_knowledge      User-provided reference chunks

Change Detection

On each query, the assistant checks whether indexed files have changed:
  1. Computes a SHA-256 hash of each source file
  2. Compares against the stored manifest at ~/.qai/assist/index_manifest.json
  3. If any file in a tier has changed, that tier’s collection is rebuilt
This means adding or modifying files in ~/.qai/knowledge/ takes effect on the next query without manual intervention.

Forcing a Reindex

To rebuild the entire knowledge base from scratch:
qai assist reindex
Use this if:
  • You changed the embedding model
  • The index seems corrupted or stale
  • You want to verify that all files are properly indexed

Retrieval

When you ask a question, the assistant:
  1. Encodes your query using the embedding model
  2. Retrieves the most relevant chunks from both collections (weighted ~2/3 product, ~1/3 user)
  3. Ranks results by cosine similarity
  4. Includes retrieved chunks in the prompt, respecting the model’s context window budget
The number of chunks retrieved adapts to available context space. Larger context windows (common with cloud models) allow more retrieval; smaller windows (common with local models) retrieve fewer but still relevant chunks.
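A simplified version of the ranking and budgeting logic. Cosine similarity and the ~2/3 / ~1/3 split come from the description above; the chunk structure and the exact budget arithmetic are illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, product, user, budget_tokens, chunk_tokens=500):
    """Rank chunks from both collections by cosine similarity, split the
    slots ~2/3 product / ~1/3 user, and keep only as many chunks as the
    context budget allows."""
    n_total = max(1, budget_tokens // chunk_tokens)
    n_product = max(1, round(n_total * 2 / 3))
    n_user = n_total - n_product

    def top(chunks, k):
        ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]),
                        reverse=True)
        return ranked[:k]

    return top(product, n_product) + top(user, n_user)
```

With a larger budget, `n_total` grows and more chunks from both tiers make it into the prompt, matching the adaptive behaviour described above.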