~/.qai/assist/chroma/.
Two Tiers
Product Knowledge (Tier 1)
qai’s own documentation is indexed automatically. This includes:
- All Mintlify pages in the documentation/ directory
- README.md
- docs/Architecture.md and docs/Roadmap.md
- src/q_ai/core/data/frameworks.yaml
User Knowledge (Tier 2)
You can add your own reference material to ~/.qai/knowledge/. The assistant indexes these files and retrieves relevant chunks alongside product documentation.
User knowledge is treated as semi-trusted content — it’s wrapped in boundary markers and referenced but not followed as instructions. Approximately one-third of retrieved chunks come from this tier.
Supported File Formats
| Format | Extensions |
|---|---|
| Markdown | .md, .mdx |
| Plain text | .txt |
| YAML | .yaml, .yml |
MDX component markup (Card, Tabs, CodeGroup elements) is stripped during indexing while the text content is preserved.
What to Put in the User Knowledge Directory
The user knowledge directory is for reference material that helps the assistant give you better answers. Good candidates:
- TTPs and attack playbooks — techniques and procedures relevant to your testing methodology
- Threat models — architecture diagrams and threat narratives for systems you’re testing
- Environment documentation — target system details, network layouts, MCP server configurations
- Research notes — findings from previous assessments, paper summaries, conference notes
- Custom frameworks — organisation-specific security taxonomies or testing checklists
How Indexing Works
Chunking
Documents are split into chunks of approximately 500 tokens:
- YAML frontmatter is extracted (title and description prepended to the first chunk)
- Text is split on Markdown headings (h2, h3, h4)
- Sections exceeding the target size are split on paragraph boundaries
- Adjacent chunks overlap by approximately 50 tokens for continuity
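The chunking steps above can be sketched roughly as follows. This is illustrative, not qai’s actual implementation: token counts are approximated by whitespace-split words, frontmatter extraction is omitted, and the function name is invented for the example.

```python
import re

def chunk_markdown(text: str, target_tokens: int = 500, overlap: int = 50) -> list[str]:
    """Split Markdown on h2-h4 headings, then on paragraphs, with word overlap.

    Token counts are approximated by whitespace-separated words here;
    a real indexer would use the embedding model's tokenizer.
    """
    # Split before every h2/h3/h4 heading, keeping the heading with its section.
    sections = re.split(r"(?m)^(?=#{2,4} )", text)
    chunks: list[str] = []
    for section in sections:
        words = section.split()
        if not words:
            continue
        if len(words) <= target_tokens:
            # Section already fits in one chunk.
            chunks.append(section.strip())
            continue
        # Oversized section: accumulate paragraphs, emitting chunks at the
        # target size and carrying the last `overlap` words into the next one.
        current: list[str] = []
        for para in section.split("\n\n"):
            current.extend(para.split())
            while len(current) >= target_tokens:
                chunks.append(" ".join(current[:target_tokens]))
                current = current[target_tokens - overlap:]
        if current:
            chunks.append(" ".join(current))  # flush the remainder
    return chunks
```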
Embedding
Chunks are embedded using a local sentence-transformers model (default: all-MiniLM-L6-v2). Embeddings run entirely on your machine — no data is sent to external services for indexing.
Storage
Embeddings are stored in two ChromaDB collections:

| Collection | Content |
|---|---|
| product_knowledge | qai documentation chunks |
| user_knowledge | User-provided reference chunks |
Change Detection
On each query, the assistant checks whether indexed files have changed:
- Computes a SHA-256 hash of each source file
- Compares against the stored manifest at ~/.qai/assist/index_manifest.json
- If any file in a tier has changed, that tier’s collection is rebuilt

Any addition or edit to a file in ~/.qai/knowledge/ takes effect on the next query without manual intervention.
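A minimal sketch of this hash-and-compare loop, assuming the manifest is a flat JSON map of file path to hash (the function name and exact manifest schema are illustrative):

```python
import hashlib
import json
from pathlib import Path

def files_changed(sources: list[Path], manifest_path: Path) -> bool:
    """Compare SHA-256 hashes of source files against a stored manifest.

    Returns True (and rewrites the manifest) if any file was added,
    removed, or modified since the last index build.
    """
    current = {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sources
    }
    try:
        stored = json.loads(manifest_path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        stored = {}  # no manifest yet: treat everything as changed
    if current == stored:
        return False
    manifest_path.write_text(json.dumps(current, indent=2))
    return True
```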
Forcing a Reindex
To rebuild the entire knowledge base from scratch:- You changed the embedding model
- The index seems corrupted or stale
- You want to verify that all files are properly indexed
Retrieval
When you ask a question, the assistant:
- Encodes your query using the embedding model
- Retrieves the most relevant chunks from both collections (weighted ~2/3 product, ~1/3 user)
- Ranks results by cosine similarity
- Includes retrieved chunks in the prompt, respecting the model’s context window budget
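A toy sketch of the weighted retrieval step, assuming in-memory dicts mapping chunk text to embedding vector (the real store is ChromaDB, and the function names here are illustrative):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, product, user, k=6):
    """Take ~2/3 of k from product chunks and ~1/3 from user chunks,
    each ranked by cosine similarity to the query vector."""
    def top(collection, n):
        ranked = sorted(collection,
                        key=lambda t: cosine(query_vec, collection[t]),
                        reverse=True)
        return ranked[:n]
    n_user = k // 3
    return top(product, k - n_user) + top(user, n_user)
```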