src/q_ai/ipi/ and implements an indirect prompt injection testing framework. It generates documents with hidden payloads, serves a callback listener to capture agent execution, and provides a web dashboard for campaign monitoring.
Module Structure
Data Flow
Generation Orchestrator
Source:generate_service.py
The generate_documents() function is the central dispatch point for payload creation, called by both the CLI and the web API. It:
- Resolves the target format and technique(s)
- Routes to the appropriate format-specific generator
- Builds a
Campaignmodel with a unique UUID and cryptographic token - Persists the campaign to
ipi.db - Returns generated document bytes and campaign metadata
Format-Specific Generators
Source:generators/
One module per document format, each implementing a common interface:
| Generator | Techniques | Library |
|---|---|---|
pdf.py | 10 — white ink, off-canvas, metadata, tiny text, white rect, form field, annotation, JavaScript, embedded file, incremental | ReportLab |
image.py | 3 — visible text, subtle text, EXIF metadata | Pillow |
markdown.py | 4 — HTML comment, link reference, zero-width chars, hidden block | Built-in |
html.py | 4 — script comment, CSS offscreen, data attribute, meta tag | Built-in |
docx.py | 6 — hidden text, tiny text, white text, comment, metadata, header/footer | python-docx |
ics.py | 4 — description, location, VALARM, X-property | Built-in |
eml.py | 3 — X-header, hidden HTML, attachment | Built-in |
create_<format>(technique, ...)— Generate a single document with a specific techniquecreate_all_<format>_variants(...)— Generate documents for all techniques supported by that format
Callback Server
Source:server.py
A FastAPI application started via qai ipi listen. The server mounts two routers:
| Router | Source | Purpose |
|---|---|---|
| API | api.py | Callback endpoints (/c/{uuid}/{token}), campaign JSON API |
| UI | ui.py | HTMX dashboard routes (server-rendered HTML) |
- Fake 404 responses — Callback endpoints return a fake 404 to avoid alerting the target system that the payload executed successfully
- Authenticated callbacks — URLs include a per-campaign token (
/c/{uuid}/{token}). Unauthenticated requests (/c/{uuid}) are still recorded but receive lower confidence scores - Background hit recording — Callback processing uses FastAPI
BackgroundTasksto avoid blocking the response
HTMX Dashboard
Source:ui.py, templates/, static/
The web dashboard is server-rendered with Jinja2 templates and uses HTMX for partial-page updates. No JavaScript framework — just htmx.min.js for dynamic behavior.
The dashboard provides:
- Campaign list with hit counts and confidence breakdown
- Per-campaign detail view with hit timeline
- Real-time hit feed (HTMX polling)
Payload Styles and Types
Source:models.py
Payload Styles (7)
Control the social engineering framing of the hidden instruction:| Style | Description |
|---|---|
obvious | Direct injection markers — baseline |
citation | Disguised as document reference |
reviewer | Appears as reviewer/editor note |
helpful | Framed as helpful supplementary resource |
academic | Academic cross-reference format |
compliance | Looks like compliance verification |
datasource | Appears as data source attribution |
Payload Types (7)
Define the attack objective:| Type | Dangerous | Description |
|---|---|---|
callback | No | Simple HTTP callback — proof of execution |
exfil_summary | Yes | Exfiltrate document summary via callback |
exfil_context | Yes | Exfiltrate conversation context |
ssrf_internal | Yes | Server-side request forgery to internal endpoints |
instruction_override | Yes | Override system instructions |
tool_abuse | Yes | Misuse agent tools/capabilities |
persistence | Yes | Persist instructions across sessions |
Dangerous Payload Gating
Payload types beyondcallback are considered dangerous and require the --dangerous flag at generation time. The gating is enforced in generate_service.py — attempting to generate exfil, SSRF, or behavior modification payloads without the flag raises an error.
Deterministic Generation
The--seed flag passes a seed value through to generators for reproducible output. This enables consistent test corpus generation across runs.