
What --dryrun Taught Me About Confidence


I shipped a false positive to my team. In bold. With a CRITICAL severity tag. And I was wrong.

Not “wrong about a detail” wrong. Wrong about the entire finding. The vulnerability I described didn’t exist. I had tested it with --dryrun, seen a “success,” and written it up with the confidence of someone who has never been humbled by AWS IAM.

This post is about the 48 hours between “I found something huge” and “I need to correct my own report.”

The Setup

I was deep into an infrastructure pentest. I’d already confirmed that Cognito was handing out AWS credentials to anyone on the internet (that part was real – I wrote about it here). Two curl commands, full S3 access. Bad.

But I wanted to know: how far does this go? The uploads bucket was compromised. What about the frontend bucket – the one serving the actual website?

I had the bucket name. I had Cognito credentials. I ran:

aws s3 cp test.txt s3://frontend-bucket/test.txt --dryrun

Output:

(dryrun) copy: test.txt to s3://frontend-bucket/test.txt

My heart rate went up. “The frontend bucket is writable. An attacker could replace index.html. Website defacement. This is CRITICAL.”

I wrote it up. Formatted it nicely. Sent it to the team with appropriate urgency.

Then something nagged at me. I went back and ran the command without --dryrun:

echo "test" | aws s3 cp - s3://frontend-bucket/test.txt

Output:

upload failed: - to s3://frontend-bucket/test.txt
An error occurred (AccessDenied): not authorized to perform s3:PutObject because no identity-based policy allows the s3:PutObject action

AccessDenied. The bucket was never writable. The --dryrun flag had lied to me.

What --dryrun Actually Does

Here’s what I thought --dryrun did: “Simulate the operation against AWS and tell me if it would succeed.”

Here’s what it actually does: “Check if the local arguments are valid and show what the command would attempt. Don’t actually call AWS.”

The flag checks client-side logic. It validates that the source file exists, the destination path is well-formed, and the arguments don’t conflict. It does not make an API call. It does not check IAM permissions. It does not talk to S3 at all.

It’s like checking if your car key fits in the ignition and concluding that the engine works. The key fitting doesn’t mean the battery isn’t dead.

# This "succeeds" with --dryrun even though the bucket doesn't exist
aws s3 cp test.txt s3://bucket-that-does-not-exist-anywhere/test.txt --dryrun
# (dryrun) copy: test.txt to s3://bucket-that-does-not-exist-anywhere/test.txt

Try it. Any bucket name. Any path. --dryrun will always say “yep, looks good” as long as the syntax is correct.
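If you want to see the shape of what's happening without touching AWS at all, here's a conceptual sketch in Python of the kind of client-side validation --dryrun performs. This is my approximation of the idea, not the actual AWS CLI source:

```python
import os
import re
import tempfile

def dryrun_cp(src: str, dest: str) -> str:
    """Client-side checks only: the local file exists and the URI is
    well-formed. Never contacts AWS, so it cannot know anything about
    IAM permissions or whether the bucket even exists."""
    if not os.path.exists(src):
        raise FileNotFoundError(src)
    if not re.match(r"^s3://[a-z0-9.-]+/.+$", dest):
        raise ValueError(f"malformed S3 URI: {dest}")
    return f"(dryrun) copy: {src} to {dest}"

# Demo: a real local file, "copied" to a bucket that cannot exist.
# The check still passes -- exactly the trap described above.
with tempfile.NamedTemporaryFile(suffix=".txt") as f:
    print(dryrun_cp(f.name, "s3://bucket-that-does-not-exist-anywhere/test.txt"))
```

The only ways this fails are a missing source file or a malformed destination. Permissions never enter the picture.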

The Blast Radius of a False Positive

You might think a false positive is harmless. “Better safe than sorry, right?” No. False positives have real costs:

They erode trust. When I corrected the finding, the team’s first question was: “What else in the report is wrong?” Fair question. I had to re-verify every finding from scratch. Hours of work, redone.

They waste engineering time. Before I caught my mistake, a developer had already started adding explicit deny statements to the frontend bucket policy. Those deny statements were doing nothing – IAM was already blocking the action. But now they’re in the codebase, and the next person who reads them will think there’s a vulnerability that needs defending against. Complexity for free.

They distract from real issues. While we were discussing the (nonexistent) frontend bucket write vulnerability, the actual remaining risk – a Lambda endpoint generating signed download URLs without authorization – wasn’t getting attention.

They make you second-guess yourself. After shipping a false positive, you start hedging every finding. “This might be exploitable.” “I think this works.” That uncertainty is poison for a security report. Findings should be definitive: “This works. Here’s the proof. Here’s how to reproduce it.”

The Pattern I Keep Falling Into

This wasn’t my only false positive in this engagement. It was the third:

  1. Frontend bucket writable – --dryrun said yes, actual test said AccessDenied
  2. Presigned URL bypass – I could generate presigned download URLs, but they returned AccessDenied when accessed (presigned URLs carry the caller’s permissions, not magic override powers)
  3. Unnecessary deny statements – I added security controls for a problem that didn’t exist, then patted myself on the back for “defense in depth”

Every time, the same underlying mistake: I tested the setup step and assumed the execution step would follow.

  • --dryrun tests whether the command is syntactically valid, not whether it’s permitted
  • aws s3 presign does local HMAC math, not an API call – it’ll sign a URL for a bucket that doesn’t exist
  • Adding a deny to a bucket policy feels like security, but if IAM already denies the action, the deny does nothing
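The presign point is worth seeing concretely. Here's a simplified sketch of the SigV4 signing-key derivation in Python – the secret and inputs below are made up, and this is not the full signing algorithm, but it shows why presigning never needs the network:

```python
import hashlib
import hmac

def _sign(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def derive_signature(secret: str, date: str, region: str, string_to_sign: str) -> str:
    """SigV4-style key derivation chain: date -> region -> service -> signing
    key, then one final HMAC over the string to sign. Pure local math."""
    k_date = _sign(("AWS4" + secret).encode(), date)
    k_region = _sign(k_date, region)
    k_service = _sign(k_region, "s3")
    k_signing = _sign(k_service, "aws4_request")
    return hmac.new(k_signing, string_to_sign.encode(), hashlib.sha256).hexdigest()

# No network, no bucket lookup: this "works" for any bucket name you invent.
sig = derive_signature("made-up-secret", "20240101", "us-east-1",
                       "GET /bucket-that-does-not-exist/file.txt")
print(sig)  # a 64-character hex signature, derived entirely offline
```

Whether that signature is worth anything is decided later, by AWS, when the URL is actually fetched – which is why the curl step further down is the real test.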

The fix is embarrassingly simple: turn the handle.

Don’t check if the command looks right. Run it. Don’t check if a URL generates. Fetch it. Don’t assume a permission exists because the policy is complex. Test the actual operation.
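"Turn the handle" can be sketched as a tiny helper. The write/read callables below are in-memory placeholders, not real S3 calls – the point is the shape: an operation counts as proven only when an independent read-back confirms it:

```python
def proof_of_execution(write, read_back, payload: str) -> bool:
    """Return True only if the write actually executed AND a separate
    read-back confirms the payload landed. Setup 'success' alone proves nothing."""
    try:
        write(payload)
    except PermissionError:
        return False  # the execution step failed; the finding is disproved
    return read_back() == payload  # independent verification, not trust

# Usage with an in-memory stand-in for a bucket:
store = {}
ok = proof_of_execution(lambda p: store.__setitem__("obj", p),
                        lambda: store.get("obj"), "canary")
print(ok)  # True only because the read-back matched
```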

My New Testing Protocol

After this experience, I wrote myself a rule. It’s taped to my monitor now:

Every finding must have a proof-of-execution, not just proof-of-setup.

This means:

  • Upload tests: Actually upload a file, then verify it exists with a separate GET request
  • Download tests: Actually download the content and verify it matches
  • Delete tests: Actually delete a file (in staging!), then confirm it’s gone
  • URL generation tests: Fetch the generated URL and check the HTTP status
  • Permission tests: Never use --dryrun. Run the real command. Check the real response.

For the presigned URL case specifically:

# BAD: "I can generate a URL, therefore the read path is open"
aws s3 presign s3://bucket/file.txt
# This always works. It's local math.

# GOOD: "I can download the file through the presigned URL"
URL=$(aws s3 presign s3://bucket/file.txt --expires-in 60)
curl -s -o /dev/null -w "%{http_code}" "$URL"
# 403 = AccessDenied = finding is invalid
# 200 = finding is real

The Deeper Lesson

The --dryrun incident taught me something beyond “test properly.” It taught me about the confidence gradient in security testing.

There’s a spectrum:

  1. Theoretical – “This could be vulnerable based on the config”
  2. Indicated – “A tool or flag suggests it might work”
  3. Demonstrated – “I executed the attack and it succeeded”
  4. Reproduced – “I executed the attack multiple times with clean state”

I was at level 2 and reported it as level 3. The severity was right for level 3 – website defacement via unauthenticated write would be CRITICAL. But I wasn’t at level 3. I was at level 2, which should have been reported as “needs verification” at best.

Now I label findings explicitly:

  • CONFIRMED: Demonstrated with actual execution, reproduced
  • LIKELY: Multiple indicators suggest it works, but execution test pending
  • INVESTIGATE: Config suggests vulnerability, needs hands-on testing
  • DISPROVED: Initially suspected, tested, confirmed not exploitable
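For execution tests that end in an HTTP status, I map the status onto those labels mechanically. This is my own convention, nothing standard:

```python
def verdict(status: int) -> str:
    """Map the HTTP status of a real execution test onto a finding label."""
    if status == 200:
        return "CONFIRMED"    # executed and succeeded; reproduce before shipping
    if status in (401, 403):
        return "DISPROVED"    # AccessDenied-class responses kill the finding
    return "INVESTIGATE"      # anything else needs hands-on retesting

print(verdict(403))  # DISPROVED -- what the frontend-bucket "CRITICAL" really was
```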

The frontend bucket write went from CRITICAL (CONFIRMED) to DISPROVED in one curl command. That’s humbling. But it’s better to be humbled by your own testing than by your team’s code review.

The Irony

The thing that makes --dryrun dangerous is exactly what makes it useful: it’s fast and safe. You can check a hundred operations in seconds without modifying anything. That speed creates a false sense of thoroughness. “I tested it!” No, you tested the syntax. You tested the plumbing. You didn’t turn on the water.

In security testing, safe is the enemy of accurate. If you’re not willing to actually execute the operation (in staging!), you don’t have a finding. You have a hypothesis.

Hypotheses are great. They tell you where to dig. But they don’t go in the report.

The Rule

Here it is, one more time, for anyone who needs to tape it to their monitor:

--dryrun checks if the key fits. It doesn’t check if the door is locked.

Turn the handle. Every time.


This is part of the Breaking My Own Infrastructure series, where I document pentesting our own systems – including the mistakes. Especially the mistakes.

Previous: The Load Balancer That Trusted Everyone

Find me on LinkedIn or GitHub.

This post is copyrighted by the author.