v1 roadmap live — evidence model shipped, GitHub Action next
Pre-deployment intelligence for infrastructure changes

Deploy with
proof, not hope.

The open-source, evidence-backed deploy briefing for Terraform, Kubernetes, Ansible, Jenkins, and CloudFormation. Every finding traces to evidence. Every verdict is auditable. Self-hosted. MIT.

MIT licensed
Self-hosted
Raw IaC stays local
BYO LLM
$ docker compose up -d
analysis · #dw-0427-18:42
v1.0.0 streaming
78 High
Deployment verdict — CAUTION
RDS security-group ingress changes while ECS task definition updates. Database port widened during rollout window. Review blast radius and rollback before proceeding.
Top risk: security group sg-0a1b2c exposes port 5432 to 0.0.0.0/0 · evidence from rds/main.tf:43
Blast radius
8
services affected
Incident match
89%
INC-2024-Q3-17
Rollback
5 steps
~12 min · low complexity
findings (4) · evidence (11) · blast radius · rollback · export ↓
<15s
Analysis latency
Full narrative + blast radius + rollback
5 tools
Unified parsers
Terraform · K8s · Ansible · Jenkins · CFN
7+
AI Skills loaded
Tool-specific domain expertise
100%
Local parsing
Raw IaC never leaves your network
Unified across your IaC toolchain
Terraform · Kubernetes · Ansible · Jenkins · CloudFormation · Docker · Git
Why DeployWhisper

The pre-deploy brief
that trusts evidence
over AI prose.

Scanners flag syntax. LLM wrappers hallucinate. Enterprise platforms gate you off. DeployWhisper fills the gap nobody else owns: multi-tool, evidence-backed, community-extensible, open source.

Tool | Category | Does well | What's missing
Checkov / TFLint | Static IaC scanners | Rule-based policy checks | No narrative, no blast radius, one tool at a time
K8sGPT | AI K8s diagnosis | Plain-English Kubernetes issues | Kubernetes-only, runtime-focused (not pre-deploy)
Spacelift / env0 | Terraform platforms | Orchestration + AI review | Proprietary, Terraform-first, no community Skills
OPA / Sentinel | Policy engines | Enforcement of known rules | Cannot reason about novel cross-tool interactions
Wiz / Orca | Cloud security posture | Enterprise attack-path analysis | Closed-source, $50k+/year, top-down sell
Vanilla ChatGPT | Generic LLM prompting | Free and fast | No grounding, no evidence, no audit trail, no trust
DeployWhisper | Pre-deploy intelligence | Evidence-backed verdict across 5 tools, blast radius, rollback, incident memory, community Skills | Open source, advisory, self-hosted; designed for trust, not gatekeeping
The evidence model

No verdict without
a paper trail.

The difference between DeployWhisper and "just another AI DevOps tool": every finding you see traces back to a structured evidence item. You can audit it. Export it. Defend it in an approval thread.

01
Intake & parse
Auto-detect tool. Extract structured changes. Isolate parser failures. Never send raw IaC to a model.
ArtifactBundle → NormalizedChange[]
02
Evidence engine
Deterministic extraction. Each change + context + skill produces typed EvidenceItems.
evidence[].{source_type, source_ref, deterministic, confidence}
03
Risk engine
Deterministic scoring. Cross-tool interaction detection. Contributors breakdown. Runs before any LLM.
RiskAssessment.top_risk_contributors → evidence_ids
04
Narrator (downstream)
LLM receives the frozen verdict. Produces plain-English narrative. Cannot mutate severity. Disable it and the report still renders.
narrator(report) → narrative // never the other way
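A minimal sketch of how the four stages could fit together, using only the standard library. The field names on `EvidenceItem` follow the shape shown above; everything else (`id`, `RiskAssessment.score`, `narrate`, the example `source_type` value) is a hypothetical name chosen for illustration, and plain frozen dataclasses stand in for whatever validation layer the project actually uses.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class EvidenceItem:
    """One structured signal with provenance."""
    id: str              # hypothetical identifier used for cross-references
    source_type: str     # assumed example value: "artifact"
    source_ref: str      # e.g. "rds/main.tf:43"
    deterministic: bool  # True when no model was involved in producing it
    confidence: float    # 0.0 to 1.0


@dataclass(frozen=True)
class RiskAssessment:
    """Verdict computed before any LLM call; frozen so nothing can mutate it."""
    score: int
    severity: str
    evidence_ids: Tuple[str, ...]


def narrate(assessment: RiskAssessment,
            llm: Optional[Callable[[str], str]] = None) -> str:
    """Explain a frozen verdict; falls back to a template when no LLM is set."""
    summary = (
        f"{assessment.severity} risk ({assessment.score}/100), "
        f"backed by {len(assessment.evidence_ids)} evidence item(s)."
    )
    # The LLM only ever sees text derived from the verdict, never a mutable copy.
    return summary if llm is None else llm(summary)


item = EvidenceItem("ev-1", "artifact", "rds/main.tf:43", True, 0.95)
verdict = RiskAssessment(score=78, severity="HIGH", evidence_ids=(item.id,))
print(narrate(verdict))  # HIGH risk (78/100), backed by 1 evidence item(s).
```

Because the verdict is frozen, an attempt such as `verdict.severity = "LOW"` raises `FrozenInstanceError`, which is one way to make "cannot mutate severity" structural rather than a convention.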
Trust architecture

Every pillar is structural, not a policy.

Deterministic core
Scoring runs before narrative. Disable the LLM and the report still renders with evidence, scores, blast radius, rollback.
Evidence traceability
Every finding references structured EvidenceItem objects with source type, severity hint, and confidence.
Uncertainty surfaced
Stale topology, weak incident matches, partial coverage — all visible in the report, never hidden.
Benchmark transparency
Accuracy published quarterly against a public 100-scenario corpus. Comparator results included.
How it works

From raw diffs to a
decision-ready briefing.

Four stages. Twelve seconds on average. One audit trail.

01 · INTAKE
Drop your artifacts
Terraform plans, K8s manifests, Ansible playbooks, Jenkinsfiles, CloudFormation templates. Auto-detection. Sensitive-file blocking.
02 · ANALYZE
Evidence + Skills
Parsers produce structured changes. AI Skills inject tool-specific expertise. Evidence engine captures every signal with provenance.
03 · SCORE
Deterministic verdict
Weighted scoring, cross-tool interaction detection, blast radius traversal, rollback planning, incident similarity — all before a single LLM token.
04 · NARRATE
Ship the briefing
LLM explains the frozen verdict in plain English. Persisted report. Shareable link. PR comment. CLI output. One object, many surfaces.
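To make the scoring stage concrete, here is a small sketch of deterministic weighted scoring on a 0 to 100 scale. The 2x production multiplier comes from this page; the action weights, resource multipliers, the `Change` type, and the sum-then-cap aggregation are illustrative assumptions, not DeployWhisper's actual values or logic.

```python
from dataclasses import dataclass
from typing import Iterable

# Illustrative numbers only; the real weights are not published on this page.
ACTION_WEIGHTS = {"create": 5, "update": 10, "delete": 20}
RESOURCE_MULTIPLIERS = {"aws_security_group_rule": 2.0, "aws_db_instance": 2.5}
PROD_MULTIPLIER = 2.0  # "prod 2x"


@dataclass(frozen=True)
class Change:
    """Hypothetical normalized change produced by a parser."""
    resource_type: str
    action: str
    environment: str


def score_change(change: Change) -> float:
    """Score one change: action weight x resource multiplier x environment."""
    base = ACTION_WEIGHTS.get(change.action, 5)
    resource = RESOURCE_MULTIPLIERS.get(change.resource_type, 1.0)
    env = PROD_MULTIPLIER if change.environment == "production" else 1.0
    return base * resource * env


def score_changes(changes: Iterable[Change]) -> int:
    """Sum per-change scores and cap the total at 100 (assumed aggregation)."""
    return min(100, round(sum(score_change(c) for c in changes)))


changes = [
    Change("aws_security_group_rule", "update", "production"),
    Change("kubernetes_deployment", "update", "production"),
]
print(score_changes(changes))  # 60
```

The same inputs always produce the same score, so the verdict can be computed and frozen before any model is called.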
A worked example

One Terraform diff.
Eight services at risk.

Here's what DeployWhisper sees when a platform engineer proposes a "routine" RDS security-group change.

rds/main.tf · staging → production +3 −2
38   resource "aws_security_group_rule" "db_ingress" {
39     type              = "ingress"
40     from_port         = 5432
41     to_port           = 5432
42 -   cidr_blocks       = ["10.0.0.0/16"]
43 +   cidr_blocks       = ["0.0.0.0/0"]
44     security_group_id = aws_security_group.db.id
45   }
46
47   # Related change in k8s/deployment.yaml
48 -   replicas: 3
49 +   replicas: 10
78 HIGH
Deploy briefing — CAUTION
analysis #dw-0427-18:42 · 11.8s
HIGH Database exposed to public internet
Security-group CIDR widened from VPC to 0.0.0.0/0. During the ECS rollout window, RDS port 5432 is reachable from any source.
evidence · rds/main.tf:43
HIGH Cross-tool interaction: scale + connection pressure
Kubernetes replica count 3 → 10 during the same window. Connection pool capacity in database.yml unchanged at 20. Expect pool exhaustion under load.
evidence · 3 files · 89% match to INC-2024-Q3-17
MED Blast radius: 8 downstream services
Dependency traversal identifies checkout-api, billing-svc, and 6 others with direct or transitive DB dependency. An outage would reach beyond a single tenant.
evidence · topology.json · 30 days fresh
What a generic scanner misses: Checkov would flag line 43 in isolation. It would not connect the SG change to the replica scale-up, nor surface the matching incident. That cross-tool link is the point.
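The dependency traversal behind a blast-radius finding like the one above can be sketched as a breadth-first search. The page says DeployWhisper uses a NetworkX graph; this sketch swaps in a plain dictionary and `collections.deque` so it runs without dependencies. The topology is hypothetical: `checkout-api` and `billing-svc` appear in the example, while `rds-main`, `storefront`, and `invoice-worker` are invented for illustration.

```python
from collections import deque

# Hypothetical topology: each key maps to the services that depend on it.
DEPENDENTS = {
    "rds-main": ["checkout-api", "billing-svc"],
    "checkout-api": ["storefront"],
    "billing-svc": ["invoice-worker"],
    "storefront": [],
    "invoice-worker": [],
}


def blast_radius(graph, changed):
    """Return {service: hop distance} for everything downstream of `changed`."""
    distances = {}
    seen = {changed}
    queue = deque([(changed, 0)])
    while queue:
        node, depth = queue.popleft()
        for dependent in graph.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                distances[dependent] = depth + 1
                queue.append((dependent, depth + 1))
    return distances


affected = blast_radius(DEPENDENTS, "rds-main")
direct = sorted(s for s, hops in affected.items() if hops == 1)
transitive = sorted(s for s, hops in affected.items() if hops > 1)
print(direct)      # ['billing-svc', 'checkout-api']
print(transitive)  # ['invoice-worker', 'storefront']
```

Tracking hop distance is what lets a report separate direct dependents from transitive ones.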
What ships today

A complete product,
not a stripped-down MVP.

Nine shipped capabilities. All advisory. All self-hosted. All open source under MIT.

Evidence-backed findings
Every high-severity claim traces to a concrete artifact, topology node, or prior incident. No "just AI text."
Weighted risk scoring
0–100 severity with resource multipliers, environment detection (prod 2x), and action weights. Fully deterministic.
Blast radius mapping
NetworkX dependency graph with BFS traversal. Direct + transitive services affected, visualized.
Automated rollback plan
Ordered steps with time estimates, critical-path flags, and complexity score 1–5.
Incident memory
Cosine-similarity match against past postmortems. 70%+ similarity triggers contextual warning.
Multi-tool parsing
Unified change schema across Terraform, K8s, Ansible, Jenkins, CloudFormation. Auto-detects tool type.
Bring your own LLM
Claude, OpenAI, Ollama, Groq, Azure. Swap via env var. Ollama for fully air-gapped operation.
Analysis history
SQLite-backed. Compare runs, export JSON, track deployment-safety trends across the team.
API, CLI, GitHub-native
FastAPI + OpenAPI. CI-friendly JSON. PR comments update in place on every new commit.
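As a rough illustration of the incident-memory capability listed above, the sketch below scores a change summary against past postmortems with cosine similarity and keeps matches at or above the 70% threshold. How DeployWhisper vectorizes text is not stated here, so simple word counts are an assumption, and the incident IDs and texts are made up.

```python
import math
import re
from collections import Counter

SIMILARITY_THRESHOLD = 0.70  # the "70%+" trigger described above


def vectorize(text):
    """Bag-of-words term counts (an assumed stand-in for real embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_incidents(change_summary, postmortems):
    """Return (incident_id, similarity) pairs above the threshold, best first."""
    query = vectorize(change_summary)
    matches = []
    for incident_id, text in postmortems.items():
        similarity = cosine_similarity(query, vectorize(text))
        if similarity >= SIMILARITY_THRESHOLD:
            matches.append((incident_id, similarity))
    return sorted(matches, key=lambda match: match[1], reverse=True)


postmortems = {
    "INC-A": "replica scale up exhausted database connection pool",
    "INC-B": "expired tls certificate on ingress",
}
summary = "replica scale up exhausted database connection pool"
print([incident_id for incident_id, _ in match_incidents(summary, postmortems)])
```

Weak matches fall below the threshold and stay out of the warning list.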
AI Skills engine

Curated expertise,
not a generic prompt.

Skills are versioned markdown packs. Each one encodes the failure modes, risky patterns, and operational wisdom for one tool. They ship built-in — and soon, from the community marketplace.

T Terraform
Security group 0.0.0.0/0 detection
IAM wildcard policy detection
RDS deletion-protection gaps
State drift & backend risk patterns
Provider-specific AWS / GCP / Azure concerns
count / for_each index shift risks
K Kubernetes
Missing readiness probes
Privileged container detection
RBAC ClusterRole escalation
HPA / VPA coordination risks
Rolling-update failure patterns
Network policy gaps
A Ansible
Non-idempotent shell usage
Production targeting mistakes
Missing changed_when guards
Privilege-escalation patterns
Variable precedence conflicts
Handler ordering pitfalls
J Jenkins
Removed approval gates
Credential exposure in env vars
Jobs on the controller node
Missing rollback hooks
Shared library version drift
@NonCPS sandbox bypass patterns
C CloudFormation
Replacement-required updates
Missing DeletionPolicy rules
Cross-stack Fn::ImportValue risks
G Git
Sensitive file auto-blocking
Force-push detection
Commit message risk signals
D Docker
Running as root detection
Unpinned base-image tags
docker.sock mount exposure
Coming Q3 · Epic 4 Skills Marketplace

Browse, install, and publish community skills. Seed catalog: Helm, ArgoCD, Pulumi, Crossplane, Istio, Nginx Ingress, Cert-Manager, Flux, Tekton, OPA Gatekeeper — 20+ skills on launch.

$ deploywhisper skill install helm-rollout-risks
The path to #1

Six epics.
24 weeks.
One trust platform.

Transparent roadmap. Every epic has exit criteria, not just ambition.

✓ Shipped Epic 1
Evidence Model
Every finding traces to evidence; deterministic scoring before narrative; confidence + uncertainty on every verdict.
✓ Shipped Epic 2
Review Experience
Verdict-first UI. Evidence inspector. Context-completeness badge. Report comparison across runs.
In build Epic 3 · Q2
GitHub-Native Delivery
Official Action + App. PR comments that update in place. Check-run integration. Advisory, never blocking.
Coming Epic 4 · Q3
AI Skills Marketplace
Community registry of risk patterns. Install with one command. 20+ seed skills: Helm, ArgoCD, Pulumi, Crossplane…
Coming Epic 5 · Q3
Context Moat
Auto-topology from Terraform state. Deployment-outcome capture. Feedback loop. Calibration dashboard.
Coming Epic 6 · Q4
Benchmark Program
Public 100-scenario corpus. Quarterly published precision/recall vs. Checkov, K8sGPT, vanilla LLM. Open-source.
Built for every role

Four workflows.
One briefing.

PE
Platform engineer
Pre-deployment review
Upload a Terraform plan and K8s manifests. Get the risk narrative, blast radius, and rollback plan in under 15 seconds. Re-analyze as you fix.
SR
SRE approver
Go / no-go decision
Shared report link shows verdict, blast radius, incident match, rollback complexity. Defend the call with evidence.
JE
Junior engineer
Learning path
Plain-English "why this is risky" with actionable remediation. Shortens the tribal-knowledge gap.
CI
CI / pipelines
Automated advisory
Send changed files to /analyze, get JSON report, post PR comment. Humans decide. Never blocks.
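For the CI workflow, a pipeline step might build a request like the one below. Only the `/analyze` path comes from this page. The base URL, the port, and the JSON payload shape (`files`, `path`, `content`) are assumptions for illustration, so check the project's OpenAPI schema for the real contract.

```python
import json
import urllib.request


def build_analyze_request(base_url, files):
    """Build a POST to /analyze; the payload shape here is hypothetical."""
    body = json.dumps(
        {"files": [{"path": path, "content": content}
                   for path, content in files.items()]}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/analyze",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Assumed address of a self-hosted instance.
request = build_analyze_request(
    "http://localhost:8000",
    {"rds/main.tf": "# changed file content goes here"},
)
print(request.get_method(), request.full_url)
# Sending is left out so the sketch runs offline:
# with urllib.request.urlopen(request) as response:
#     report = json.load(response)
```

The returned JSON report is advisory input for a PR comment; the pipeline itself never fails the build on it.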
Security-first by architecture

Your infrastructure code,
under your control.

Five non-negotiable hard lines baked into the system — not policies you have to trust us on.

Raw IaC stays local
Parsers extract structured metadata on-machine. File content never reaches an external LLM endpoint.
Memory-only credentials
API keys live in env vars or session memory. Never persisted to disk, database, or logs.
Sensitive-file blocking
.env, kubeconfig, *.pem, *.tfstate auto-detected and excluded from any model-bound payload.
Air-gap operation
Ollama backend. Zero egress. Zero telemetry. Works in regulated and disconnected networks.
Advisory, never blocking
DeployWhisper produces intelligence, not authorization. No mode can prevent deployment. Humans always decide.
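Sensitive-file blocking of the kind listed above can be sketched with glob-style matching on file names. The four patterns are the ones named on this page; the function names and the example paths are hypothetical, and the real implementation may detect more than file names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns named on this page; matching is done on the file name only.
BLOCKED_PATTERNS = (".env", "kubeconfig", "*.pem", "*.tfstate")


def is_sensitive(path):
    """True when the file name matches any blocked pattern."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in BLOCKED_PATTERNS)


def filter_payload(paths):
    """Split paths into (allowed, blocked) before anything is model-bound."""
    allowed = [p for p in paths if not is_sensitive(p)]
    blocked = [p for p in paths if is_sensitive(p)]
    return allowed, blocked


allowed, blocked = filter_payload([
    "rds/main.tf",
    "secrets/.env",
    "certs/server.pem",
    "terraform.tfstate",
    "k8s/deployment.yaml",
])
print(allowed)  # ['rds/main.tf', 'k8s/deployment.yaml']
print(blocked)  # ['secrets/.env', 'certs/server.pem', 'terraform.tfstate']
```

Filtering before any payload is assembled keeps the exclusion independent of which LLM backend is configured.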
Pure Python. Bring your own LLM.

One mental model. No frontend build chain.

NiceGUI · Dashboard
FastAPI · REST API
LiteLLM · LLM layer
SQLite · Persistence
NetworkX · Blast radius
Plotly · Visualization
Pydantic · Validation
Docker · Deployment
LLM providers: Claude · GPT-4o · Llama 3 (Ollama) · Mixtral (Groq) · Azure OpenAI
Free, MIT-licensed, forever

Deploy with proof.
Start in five minutes.

Clone the repo, run docker compose up -d, and your team has a trusted pre-deploy briefing. No subscription. No sign-up. No vendor lock-in.

$ docker compose up -d
MIT License
Self-hosted
No telemetry
No sign-up
Air-gap ready