DeployWhisper — The evidence-backed deploy briefing for infrastructure changes

Why DeployWhisper

The pre-deploy brief
that trusts evidence
over AI prose.

Scanners flag syntax. LLM wrappers hallucinate. Enterprise platforms gate you off. DeployWhisper fills the gap nobody else owns: multi-tool, evidence-backed, community-extensible, open source.

No verdict without
a paper trail.

The difference between DeployWhisper and "just another AI DevOps tool": every finding you see traces back to a structured evidence item. You can audit it. Export it. Defend it in an approval thread.

01

Intake & parse

Auto-detect tool. Extract structured changes. Isolate parser failures. Never send raw IaC to a model.

ArtifactBundle → NormalizedChange[]

02

Evidence engine

Deterministic extraction. Each change + context + skill produces typed EvidenceItems.

evidence[].{source_type, source_ref, deterministic, confidence}
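As a rough illustration, an evidence item with the four fields named above could be modeled as an immutable record. This is a minimal sketch, not DeployWhisper's actual schema: the `evidence_id` field, the `"artifact"` source type value, and the 0 to 1 confidence range are assumptions made for the example.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EvidenceItem:
    evidence_id: str      # hypothetical identifier, not confirmed by the source
    source_type: str      # e.g. "artifact", "topology", "incident" (assumed values)
    source_ref: str       # where the signal came from, e.g. "rds/main.tf:43"
    deterministic: bool   # True when extracted by a parser rather than a model
    confidence: float     # assumed to be a value between 0.0 and 1.0

    def __post_init__(self):
        # Reject out-of-range confidence so downstream scoring can rely on it.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


item = EvidenceItem(
    evidence_id="ev-1",
    source_type="artifact",
    source_ref="rds/main.tf:43",
    deterministic=True,
    confidence=1.0,
)
exported = asdict(item)  # plain dict, ready for JSON export
```

Freezing the record means a finding can point at an evidence item without worrying that a later stage rewrote it.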

03

Risk engine

Deterministic scoring. Cross-tool interaction detection. Contributors breakdown. Runs before any LLM.

RiskAssessment.top_risk_contributors → evidence_ids

04

Narrator (downstream)

The LLM receives the frozen verdict and produces a plain-English narrative. It cannot mutate severity. Disable it and the report still renders.

narrator(report) → narrative // never the other way
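One way to picture the one-way flow is a frozen verdict object that the narrator can read but not change. The sketch below is an assumption about shape, not the project's code: `Verdict`, `render_report`, and the `narrate` callback are hypothetical names.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Callable, Optional


@dataclass(frozen=True)
class Verdict:
    score: int
    severity: str


def render_report(verdict: Verdict,
                  narrate: Optional[Callable[[Verdict], str]] = None) -> str:
    # The deterministic part of the report never depends on the narrator.
    lines = [f"{verdict.score} {verdict.severity}"]
    if narrate is not None:
        # The narrator only contributes text; it has no way to return a new verdict.
        lines.append(narrate(verdict))
    return "\n".join(lines)


verdict = Verdict(score=78, severity="HIGH")
without_llm = render_report(verdict)
with_llm = render_report(verdict, lambda v: f"Risk is {v.severity.lower()}.")
```

With the narrator switched off, `render_report` still returns the score and severity; any attempt to assign to `verdict.severity` raises `FrozenInstanceError`.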

Trust architecture

Every pillar is structural, not a policy.

Deterministic core

Scoring runs before narrative. Disable the LLM and the report still renders with evidence, scores, blast radius, rollback.

Evidence traceability

Every finding references structured EvidenceItem objects with source type, severity hint, and confidence.

Uncertainty surfaced

Stale topology, weak incident matches, partial coverage — all visible in the report, never hidden.

Benchmark transparency

Accuracy published quarterly against a public 100-scenario corpus. Comparator results included.

How it works

From raw diffs to a
decision-ready briefing.

Four stages. Twelve seconds on average. One audit trail.

01 · INTAKE

Drop your artifacts

Terraform plans, K8s manifests, Ansible playbooks, Jenkinsfiles, CloudFormation templates. Auto-detection. Sensitive-file blocking.
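Auto-detection could be approximated with a few filename and content checks. The heuristics below are illustrative assumptions only (the real detector is not described on this page), and `detect_tool` is a hypothetical name.

```python
from typing import Optional


def detect_tool(filename: str, text: str) -> Optional[str]:
    """Guess which tool produced an artifact; return None when unsure."""
    name = filename.rsplit("/", 1)[-1]
    if name.endswith(".tf"):
        return "terraform"
    if name == "Jenkinsfile":
        return "jenkins"
    # Content markers below are common conventions, used here as rough hints.
    if "AWSTemplateFormatVersion" in text:
        return "cloudformation"
    if "apiVersion:" in text and "kind:" in text:
        return "kubernetes"
    if "hosts:" in text and "tasks:" in text:
        return "ansible"
    return None  # unknown artifacts are left for a human or a stricter parser
```

Returning `None` instead of guessing keeps an unrecognized file from being parsed under the wrong schema.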

02 · ANALYZE

Evidence + Skills

Parsers produce structured changes. AI Skills inject tool-specific expertise. Evidence engine captures every signal with provenance.

03 · SCORE

Deterministic verdict

Weighted scoring, cross-tool interaction detection, blast radius traversal, rollback planning, incident similarity — all before a single LLM token.

04 · NARRATE

Ship the briefing

LLM explains the frozen verdict in plain English. Persisted report. Shareable link. PR comment. CLI output. One object, many surfaces.

A worked example

One Terraform diff.
Eight services at risk.

Here's what DeployWhisper sees when a platform engineer proposes a "routine" RDS security-group change.

rds/main.tf · staging → production +3 −2

38 resource "aws_security_group_rule" "db_ingress" {
39   type = "ingress"
40   from_port = 5432
41   to_port = 5432
42 -  cidr_blocks = ["10.0.0.0/16"]
43 +  cidr_blocks = ["0.0.0.0/0"]
44   security_group_id = aws_security_group.db.id
45 }
46
47 # Related change in k8s/deployment.yaml
48 - replicas: 3
49 + replicas: 10

78 HIGH

Deploy briefing — CAUTION

analysis #dw-0427-18:42 · 11.8s

HIGH Database exposed to public internet

Security-group CIDR widened from the VPC range (10.0.0.0/16) to 0.0.0.0/0. During the rollout window, RDS port 5432 is reachable from any source.

evidence · rds/main.tf:43

HIGH Cross-tool interaction: scale + connection pressure

Kubernetes replica count 3 → 10 during the same window. Connection pool capacity in database.yml unchanged at 20. Expect pool exhaustion under load.

evidence · 3 files · 89% match to INC-2024-Q3-17

MED Blast radius: 8 downstream services

Dependency traversal identifies checkout-api, billing-svc, and 6 others with a direct or transitive DB dependency. An outage would extend beyond a single tenant.

evidence · topology.json · 30 days fresh

What a generic scanner misses: Checkov would flag line 43 in isolation. It would not connect the SG change to the replica scale-up, nor surface the matching incident. That cross-tool link is the point.

What ships today

A complete product,
not a stripped-down MVP.

Nine shipped capabilities. All advisory. All self-hosted. All open source under MIT.

Evidence-backed findings

Every high-severity claim traces to a concrete artifact, topology node, or prior incident. No "just AI text."

Weighted risk scoring

0–100 severity with resource multipliers, environment detection (prod 2x), and action weights. Fully deterministic.
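A deterministic weighted score of this kind might look like the sketch below. Only the 0 to 100 range and the 2x production multiplier come from the description above; the action weights, resource multipliers, and the multiply-then-sum formula are assumptions chosen for illustration.

```python
# Hypothetical weights; the real tables are not published on this page.
ACTION_WEIGHTS = {"create": 5, "update": 10, "delete": 20}
RESOURCE_MULTIPLIERS = {"aws_security_group_rule": 2.0, "aws_db_instance": 2.5}
ENV_MULTIPLIERS = {"production": 2.0}  # "prod 2x" as stated above


def score_change(action: str, resource_type: str, environment: str) -> float:
    base = ACTION_WEIGHTS.get(action, 10)
    resource = RESOURCE_MULTIPLIERS.get(resource_type, 1.0)
    env = ENV_MULTIPLIERS.get(environment, 1.0)
    return base * resource * env


def score_changes(changes) -> int:
    """Sum per-change scores and clamp the total to the 0-100 range."""
    total = sum(score_change(*change) for change in changes)
    return min(100, round(total))
```

Under these assumed weights, updating a security-group rule scores 20 in staging and 40 in production, and the same inputs always give the same score.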

Blast radius mapping

NetworkX dependency graph with BFS traversal. Direct + transitive services affected, visualized.
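The traversal itself is a standard breadth-first search. The sketch below uses only the standard library rather than NetworkX so that it runs anywhere; the adjacency map, the `blast_radius` name, and the `storefront` service are assumptions for the example.

```python
from collections import deque


def blast_radius(dependents: dict, changed: str) -> dict:
    """Return {service: hops} for everything that depends on `changed`.

    `dependents` maps a node to the services that depend on it directly.
    """
    seen = {changed}
    queue = deque([(changed, 0)])
    affected = {}
    while queue:
        node, depth = queue.popleft()
        for service in dependents.get(node, ()):
            if service not in seen:
                seen.add(service)
                affected[service] = depth + 1  # 1 = direct, 2+ = transitive
                queue.append((service, depth + 1))
    return affected


graph = {
    "rds-main": ["checkout-api", "billing-svc"],
    "checkout-api": ["storefront"],  # hypothetical transitive dependent
}
radius = blast_radius(graph, "rds-main")
```

Here `checkout-api` and `billing-svc` come back at one hop (direct) and `storefront` at two hops (transitive).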

Automated rollback plan

Ordered steps with time estimates, critical-path flags, and complexity score 1–5.

Incident memory

Cosine-similarity match against past postmortems. 70%+ similarity triggers contextual warning.
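For intuition, cosine similarity over simple word counts can be computed as below. The 0.70 threshold comes from the description above; the bag-of-words representation and the function names are assumptions, and the real system may well use a different text representation.

```python
import math
from collections import Counter

SIMILARITY_THRESHOLD = 0.70  # "70%+" as stated above


def cosine_similarity(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[term] * vb[term] for term in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def matching_incidents(change_summary: str, postmortems: dict) -> list:
    """Return ids of past incidents similar enough to warrant a warning."""
    return [
        incident_id
        for incident_id, text in postmortems.items()
        if cosine_similarity(change_summary, text) >= SIMILARITY_THRESHOLD
    ]
```

Texts with no words in common score 0.0, and identical texts score approximately 1.0, so only close matches cross the threshold.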

Multi-tool parsing

Unified change schema across Terraform, K8s, Ansible, Jenkins, CloudFormation. Auto-detects tool type.

Bring your own LLM

Claude, OpenAI, Ollama, Groq, Azure. Swap via env var. Ollama for fully air-gapped operation.

Analysis history

SQLite-backed. Compare runs, export JSON, track deployment-safety trends across the team.

API, CLI, GitHub-native

FastAPI + OpenAPI. CI-friendly JSON. PR comments update in place on every new commit.

AI Skills engine

Curated expertise,
not a generic prompt.

Skills are versioned markdown packs. Each one encodes the failure modes, risky patterns, and operational wisdom for one tool. They ship built-in — and soon, from the community marketplace.

Terraform

Security group 0.0.0.0/0 detection

IAM wildcard policy detection

RDS deletion-protection gaps

State drift & backend risk patterns

Provider-specific AWS / GCP / Azure concerns

count / for_each index shift risks

Kubernetes

Missing readiness probes

Privileged container detection

RBAC ClusterRole escalation

HPA / VPA coordination risks

Rolling-update failure patterns

Network policy gaps

Ansible

Non-idempotent shell usage

Production targeting mistakes

Missing changed_when guards

Privilege-escalation patterns

Variable precedence conflicts

Handler ordering pitfalls

Jenkins

Removed approval gates

Credential exposure in env vars

Jobs on the controller node

Missing rollback hooks

Shared library version drift

@NonCPS sandbox bypass patterns

CloudFormation

Replacement-required updates

Missing DeletionPolicy rules

Cross-stack Fn::ImportValue risks

Git

Sensitive file auto-blocking

Force-push detection

Commit message risk signals

Docker

Running as root detection

Unpinned base-image tags

docker.sock mount exposure

Coming Q3 · Epic 4 Skills Marketplace

Browse, install, and publish community skills. Seed catalog: Helm, ArgoCD, Pulumi, Crossplane, Istio, Nginx Ingress, Cert-Manager, Flux, Tekton, OPA Gatekeeper — 20+ skills on launch.

$ deploywhisper skill install helm-rollout-risks

The path to #1

Six epics.
24 weeks.
One trust platform.

Transparent roadmap. Every epic has exit criteria, not just ambition. Read the full PRD →

✓ Shipped Epic 1

Evidence Model

Every finding traces to evidence; deterministic scoring before narrative; confidence + uncertainty on every verdict.

✓ Shipped Epic 2

Review Experience

Verdict-first UI. Evidence inspector. Context-completeness badge. Report comparison across runs.

In build Epic 3 · Q2

GitHub-Native Delivery

Official Action + App. PR comments that update in place. Check-run integration. Advisory, never blocking.

Coming Epic 4 · Q3

AI Skills Marketplace

Community registry of risk patterns. Install with one command. 20+ seed skills: Helm, ArgoCD, Pulumi, Crossplane…

Coming Epic 5 · Q3

Context Moat

Auto-topology from Terraform state. Deployment-outcome capture. Feedback loop. Calibration dashboard.

Coming Epic 6 · Q4

Benchmark Program

Public 100-scenario corpus. Quarterly published precision/recall vs. Checkov, K8sGPT, vanilla LLM. Open-source.

Built for every role

Four workflows.
One briefing.

Platform engineer

Pre-deployment review

Upload Terraform plan + K8s manifests. Risk narrative, blast radius, rollback — 15 seconds. Re-analyze as you fix.

SRE approver

Go / no-go decision

Shared report link shows verdict, blast radius, incident match, rollback complexity. Defend the call with evidence.

Junior engineer

Learning path

Plain-English "why this is risky" with actionable remediation. Shortens the tribal-knowledge gap.

CI / pipelines

Automated advisory

Send changed files to /analyze, get JSON report, post PR comment. Humans decide. Never blocks.

Security-first by architecture

Your infrastructure code,
under your control.

Five non-negotiable hard lines baked into the system — not policies you have to trust us on.

Raw IaC stays local

Parsers extract structured metadata on-machine. File content never reaches an external LLM endpoint.

Memory-only credentials

API keys live in env vars or session memory. Never persisted to disk, database, or logs.

Sensitive-file blocking

.env, kubeconfig, *.pem, *.tfstate auto-detected and excluded from any model-bound payload.
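The exclusion step can be sketched with filename globbing from the standard library. The four patterns are the ones listed above; matching on the file's base name only, and the function names, are assumptions for this example.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

BLOCKED_PATTERNS = (".env", "kubeconfig", "*.pem", "*.tfstate")


def is_sensitive(path: str) -> bool:
    # Compare against the base name so nested paths are caught too.
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in BLOCKED_PATTERNS)


def model_bound_files(paths: list) -> list:
    """Drop sensitive files before anything is assembled for a model."""
    return [p for p in paths if not is_sensitive(p)]


safe = model_bound_files(
    ["rds/main.tf", "infra/prod.tfstate", "certs/server.pem", "deploy/.env"]
)
```

In this example only `rds/main.tf` survives the filter.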

Air-gap operation

Ollama backend. Zero egress. Zero telemetry. Works in regulated and disconnected networks.

Advisory, never blocking

DeployWhisper produces intelligence, not authorization. No mode can prevent deployment. Humans always decide.

Free, MIT-licensed, forever

Deploy with proof.
Start in five minutes.

Clone the repo, run docker compose up -d, and your team has a trusted pre-deploy briefing. No subscription. No sign-up. No vendor lock-in.

Star on GitHub · Read the docs

$ docker compose up -d

MIT License

Self-hosted

No telemetry

No sign-up

Air-gap ready

Deploy with
proof, not hope.
