The import module brings external tool results into qai’s unified findings table. Findings from Garak, PyRIT, or any SARIF-producing security tool are parsed, normalized with taxonomy bridging, and stored alongside native audit and inject findings. Imported findings can then inform inject template selection in the assess workflow.
| Format | Tool | Input | What Gets Extracted |
|---|---|---|---|
| `garak` | Garak | JSONL report | Eval-level summaries — probe, detector, pass rate, taxonomy tags |
| `pyrit` | PyRIT | JSON conversation export | Final scored results — orchestrator, score value/type, labels |
| `sarif` | Any SARIF producer | SARIF 2.1.0 JSON | Results with rule metadata, severity, and message text |
## CLI Usage

```bash
# Import a Garak report
qai import report.jsonl --format garak

# Import with target association (enables workflow integration)
qai import report.jsonl --format garak --target my-target-id

# Preview what would be imported without writing to the database
qai import report.jsonl --format garak --dry-run

# Import PyRIT results
qai import conversations.json --format pyrit --target my-target-id

# Import SARIF output from any tool
qai import results.sarif --format sarif
```
## Flags

| Flag | Short | Description |
|---|---|---|
| `--format` | `-f` | Required. Source format: `garak`, `pyrit`, or `sarif` |
| `--target` | `-t` | Target ID to associate imported findings with. Enables imported findings to inform inject template selection in workflows |
| `--dry-run` | | Parse and display what would be imported without writing to the database |
## Taxonomy Bridging
External tools use their own vulnerability taxonomies. qai maps these to its internal audit categories where a meaningful equivalent exists.
Each mapping has a confidence level:
| Confidence | Meaning |
|---|---|
| `direct` | The external category maps directly to a qai audit category |
| `adjacent` | Related but not identical — the qai category covers overlapping concerns |
| `none` | No infrastructure-level equivalent in qai (typically model-level concerns) |
### Current OWASP LLM Top 10 Mappings

| External ID | OWASP LLM Top 10 | qai Category | Confidence |
|---|---|---|---|
| LLM01 | Prompt Injection | prompt_injection | direct |
| LLM02 | Insecure Output Handling | token_exposure | adjacent |
| LLM03 | Training Data Poisoning | — | none |
| LLM04 | Model Denial of Service | — | none |
| LLM05 | Supply Chain Vulnerabilities | — | none |
| LLM06 | Sensitive Information Disclosure | permissions | adjacent |
| LLM07 | Insecure Plugin Design | — | none |
| LLM08 | Excessive Agency | — | none |
| LLM09 | Overreliance | — | none |
| LLM10 | Model Theft | — | none |
Entries with `none` confidence are preserved in the database with their original taxonomy IDs but don’t map to a qai category. They won’t influence inject template selection (which matches on qai categories), but they’re visible as imported findings.
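The bridging table above amounts to a lookup from external taxonomy ID to a qai category and confidence. A minimal sketch in Python — the names `TAXONOMY_BRIDGE` and `bridge` and the tuple layout are illustrative, not qai internals:

```python
# Illustrative sketch of the OWASP LLM Top 10 -> qai bridging table.
# Dict name and tuple layout are assumptions, not qai's implementation.
TAXONOMY_BRIDGE = {
    "LLM01": ("prompt_injection", "direct"),
    "LLM02": ("token_exposure", "adjacent"),
    "LLM06": ("permissions", "adjacent"),
    # LLM03-05 and LLM07-10 have no infrastructure-level equivalent.
}

def bridge(external_id: str):
    """Return (qai_category, confidence); unmapped IDs get (None, "none")."""
    return TAXONOMY_BRIDGE.get(external_id, (None, "none"))
```

Unmapped entries still carry their original ID through to storage; only the qai-category side is empty.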
## How Target Association Works
The --target flag connects imported findings to a qai target. When you later run an assess workflow against the same target, the inject adapter queries both native audit findings and imported findings for that target to inform template selection. This means external model-testing results from Garak or PyRIT can guide which inject payloads are prioritized.
Without --target, findings are stored but isolated — they won’t inform any workflow.
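As a sketch of the association behavior, assuming a SQLite findings table with `target_id`, `source`, and `category` columns (a hypothetical schema, not qai's actual one):

```python
import sqlite3

def findings_for_target(db: sqlite3.Connection, target_id: str):
    """Collect native and imported findings for one target.

    Schema (a findings table with target_id/source/category columns)
    is a hypothetical illustration of the documented behavior.
    """
    rows = db.execute(
        "SELECT source, category FROM findings "
        "WHERE target_id = ? AND source IN ('audit', 'inject', 'import')",
        (target_id,),
    ).fetchall()
    # Findings imported without --target would have target_id = NULL,
    # never match this query, and so stay isolated from workflows.
    return rows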
## Provenance
Every import creates a parent run record (module=import) with metadata tracking:
- Tool name and version — detected from the input file (Garak version from the `start_run` setup entry; PyRIT version not available in export format; SARIF tool version from `runs[].tool.version`)
- Parser version — qai version at the time of import
- Source file checksum — SHA-256 of the imported file for integrity verification
Raw evidence and metadata are stored as evidence records on the import run, preserving the original data alongside the normalized findings.
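The source-file checksum recorded in provenance is a standard SHA-256 digest. Computing one looks like this (the function name is illustrative):

```python
import hashlib

def file_sha256(path: str) -> str:
    """SHA-256 of a file, read in chunks so large reports stay cheap."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Re-running the digest over the original file and comparing it to the stored value verifies the import hasn't drifted from its source.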
## What Each Parser Expects

### Garak

A JSONL file where each line is a JSON object. The parser looks for:

- A `start_run` setup entry (required — validates the file is a Garak report)
- `eval` entries with probe name, detector name, pass/fail counts, and optional `owasp_llm` taxonomy tags
- `attempt` entries, which are counted but not imported (stored in raw evidence)
Severity is derived from the detector pass rate: ≤ 25% pass → Critical, ≤ 50% → High, ≤ 75% → Medium, < 100% → Low, 100% → Info.
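The pass-rate bands above can be sketched as a small function (name and lowercase return values are illustrative):

```python
def garak_severity(passed: int, total: int) -> str:
    """Map a Garak detector pass rate to a severity bucket.

    Bands follow the documented thresholds: <=25% pass -> Critical,
    <=50% -> High, <=75% -> Medium, <100% -> Low, 100% -> Info.
    """
    rate = passed / total
    if rate <= 0.25:
        return "Critical"
    if rate <= 0.50:
        return "High"
    if rate <= 0.75:
        return "Medium"
    if rate < 1.0:
        return "Low"
    return "Info"
```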
### PyRIT

A JSON file containing an array of conversation objects. Each conversation should include:

- A `score` object with `score_value` and `score_type` fields
- An `orchestrator` field identifying the test orchestrator
- Optional `labels` for taxonomy preservation
Severity mapping depends on score_type: boolean true/false scores map to High/Info; numeric scores use threshold bands.
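A sketch of that mapping. The boolean High/Info split follows the description above; the numeric bands and the `score_type` string values shown are placeholder assumptions, since the actual thresholds aren't documented here:

```python
def pyrit_severity(score_type: str, score_value) -> str:
    """Map a PyRIT score to severity.

    Boolean -> High/Info follows the doc; the numeric bands and the
    "true_false" type string are assumptions, not qai's actual values.
    """
    if score_type == "true_false":
        return "High" if str(score_value).lower() == "true" else "Info"
    value = float(score_value)  # numeric score, assumed on a 0.0-1.0 scale
    if value >= 0.75:
        return "High"
    if value >= 0.5:
        return "Medium"
    if value > 0.0:
        return "Low"
    return "Info"
```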
### SARIF

A SARIF 2.1.0 JSON file with standard structure. The parser extracts:

- Results from `runs[].results[]` with rule ID, message text, and severity
- Rule metadata from `runs[].tool.driver.rules[]` for descriptions and taxonomy
- Tool name and version from `runs[].tool.driver`
Severity uses the `security-severity` rule property (CVSS-style thresholds) when available, falling back to the SARIF `level` field.
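A sketch of that fallback chain. The CVSS-style bands follow the widely used `security-severity` convention (≥ 9.0 Critical, ≥ 7.0 High, ≥ 4.0 Medium); the `level`-to-severity table is an assumed illustration, not necessarily qai's exact mapping:

```python
def sarif_severity(rule_properties: dict, level: str) -> str:
    """Severity from `security-severity` bands, else from the SARIF level.

    The level fallback (error/warning/note) is an assumed illustration.
    """
    raw = rule_properties.get("security-severity")
    if raw is not None:
        score = float(raw)  # CVSS-style scale, 0.0-10.0
        if score >= 9.0:
            return "Critical"
        if score >= 7.0:
            return "High"
        if score >= 4.0:
            return "Medium"
        return "Low"
    return {"error": "High", "warning": "Medium", "note": "Low"}.get(level, "Info")
```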
qai maintains best-effort parsers for current versions of Garak and PyRIT. External tool output formats may change between releases. If a parser fails on a newer format version, file an issue.
## Next Steps