The import module brings external tool results into qai’s unified findings table. Findings from Garak, PyRIT, or any SARIF-producing security tool are parsed, normalized with taxonomy bridging, and stored alongside native audit and inject findings. Imported findings can then inform inject template selection in the assess workflow.

Supported Formats

| Format | Tool | Input | What Gets Extracted |
|--------|------|-------|---------------------|
| `garak` | Garak | JSONL report | Eval-level summaries: probe, detector, pass rate, taxonomy tags |
| `pyrit` | PyRIT | JSON conversation export | Final scored results: orchestrator, score value/type, labels |
| `sarif` | Any SARIF producer | SARIF 2.1.0 JSON | Results with rule metadata, severity, and message text |

CLI Usage

# Import a Garak report
qai import report.jsonl --format garak

# Import with target association (enables workflow integration)
qai import report.jsonl --format garak --target my-target-id

# Preview what would be imported without writing to the database
qai import report.jsonl --format garak --dry-run

# Import PyRIT results
qai import conversations.json --format pyrit --target my-target-id

# Import SARIF output from any tool
qai import results.sarif --format sarif

Flags

| Flag | Short | Description |
|------|-------|-------------|
| `--format` | `-f` | Required. Source format: `garak`, `pyrit`, or `sarif` |
| `--target` | `-t` | Target ID to associate imported findings with. Enables imported findings to inform inject template selection in workflows |
| `--dry-run` | | Parse and display what would be imported without writing to the database |

Taxonomy Bridging

External tools use their own vulnerability taxonomies. qai maps these to its internal audit categories where a meaningful equivalent exists. Each mapping has a confidence level:
| Confidence | Meaning |
|------------|---------|
| `direct` | The external category maps directly to a qai audit category |
| `adjacent` | Related but not identical: the qai category covers overlapping concerns |
| `none` | No infrastructure-level equivalent in qai (typically model-level concerns) |

Current OWASP LLM Top 10 Mappings

| External ID | OWASP LLM Top 10 | qai Category | Confidence |
|-------------|------------------|--------------|------------|
| LLM01 | Prompt Injection | `prompt_injection` | direct |
| LLM02 | Insecure Output Handling | `token_exposure` | adjacent |
| LLM03 | Training Data Poisoning | | none |
| LLM04 | Model Denial of Service | | none |
| LLM05 | Supply Chain Vulnerabilities | | none |
| LLM06 | Sensitive Information Disclosure | `permissions` | adjacent |
| LLM07 | Insecure Plugin Design | | none |
| LLM08 | Excessive Agency | | none |
| LLM09 | Overreliance | | none |
| LLM10 | Model Theft | | none |
Entries with none confidence are preserved in the database with their original taxonomy IDs but don’t map to a qai category. They won’t influence inject template selection (which matches on qai categories), but they’re visible as imported findings.

How Target Association Works

The --target flag connects imported findings to a qai target. When you later run an assess workflow against the same target, the inject adapter queries both native audit findings and imported findings for that target to inform template selection. This means external model-testing results from Garak or PyRIT can guide which inject payloads are prioritized. Without --target, findings are stored but isolated — they won’t inform any workflow.
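Conceptually, target association means native and imported findings land in one queryable pool keyed by target ID. The sketch below assumes a hypothetical single-table schema with a `module` column distinguishing sources; qai's real schema may differ.

```python
import sqlite3

def findings_for_target(db: sqlite3.Connection, target_id: str) -> list[tuple]:
    # Hypothetical schema: native audit findings and imported findings share
    # one table keyed by target_id; imported rows carry module='import'.
    # A workflow adapter could query both in a single pass like this.
    return db.execute(
        "SELECT category, severity, module FROM findings WHERE target_id = ?",
        (target_id,),
    ).fetchall()
```

Findings imported without `--target` would simply have no `target_id`, so a query like this never sees them, which matches the "stored but isolated" behavior described above.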

Provenance

Every import creates a parent run record (module=import) with metadata tracking:
  • Tool name and version — detected from the input file (Garak version from start_run setup entry; PyRIT version not available in export format; SARIF tool version from runs[].tool.version)
  • Parser version — qai version at the time of import
  • Source file checksum — SHA-256 of the imported file for integrity verification
Raw evidence and metadata are stored as evidence records on the import run, preserving the original data alongside the normalized findings.

What Each Parser Expects

Garak

A JSONL file where each line is a JSON object. The parser looks for:
  • A start_run setup entry (required — validates the file is a Garak report)
  • eval entries with probe name, detector name, pass/fail counts, and optional owasp_llm taxonomy tags
  • attempt entries are counted but not imported (stored in raw evidence)
Severity is derived from the detector pass rate: ≤25% pass → Critical, ≤50% → High, ≤75% → Medium, under 100% → Low, 100% → Info.
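The pass-rate thresholds above translate directly into a small mapping function. A sketch (function name assumed; the thresholds come from the documented rule):

```python
def garak_severity(pass_rate: float) -> str:
    """Map a Garak detector pass rate (0.0-1.0) to a severity bucket,
    per the documented thresholds: lower pass rates mean more failed
    probes, hence higher severity."""
    if pass_rate <= 0.25:
        return "Critical"
    if pass_rate <= 0.50:
        return "High"
    if pass_rate <= 0.75:
        return "Medium"
    if pass_rate < 1.0:
        return "Low"
    return "Info"   # every probe attempt passed
```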

PyRIT

A JSON file containing an array of conversation objects. Each conversation should include:
  • A score object with score_value and score_type fields
  • An orchestrator field identifying the test orchestrator
  • Optional labels for taxonomy preservation
Severity mapping depends on score_type: boolean true/false scores map to High/Info; numeric scores use threshold bands.
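A sketch of that score-type branching. The `score_type` string values and the numeric band cut-offs here are illustrative assumptions; the docs only state that boolean scores map to High/Info and numeric scores use bands, not the exact labels or thresholds.

```python
def pyrit_severity(score_value, score_type: str) -> str:
    # Boolean scorers: True (attack judged successful) -> High, False -> Info.
    # The "true_false" type name is an assumption.
    if score_type == "true_false":
        return "High" if score_value in (True, "true", "True") else "Info"
    # Numeric scorers: these band cut-offs are illustrative, not qai's
    # actual values.
    v = float(score_value)
    if v >= 0.75:
        return "High"
    if v >= 0.50:
        return "Medium"
    if v > 0.0:
        return "Low"
    return "Info"
```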

SARIF

A SARIF 2.1.0 JSON file with standard structure. The parser extracts:
  • Results from runs[].results[] with rule ID, message text, and severity
  • Rule metadata from runs[].tool.driver.rules[] for descriptions and taxonomy
  • Tool name and version from runs[].tool.driver
Severity uses security-severity property (CVSS-style thresholds) when available, falling back to the SARIF level field.
qai maintains best-effort parsers for current versions of Garak and PyRIT. External tool output formats may change between releases. If a parser fails on a newer format version, file an issue.

Next Steps