From DevOps to DevSecOps: Why I Started Breaking Things on Purpose
I’ve been a DevOps engineer for a long time. Nearly two decades, if you count the years before we called it DevOps. Back when it was just “the infrastructure guy” or “the Linux admin” or “the person who knows what Terraform does.”
Most of that time has been in small-to-mid-sized organizations — the kind where one person manages the CI/CD pipelines, the Terraform modules, the monitoring stack, and the “why is staging down?” Slack messages. Not FAANG. Not regulated finance. Just teams shipping software and trying not to break production.
I built CI/CD pipelines. I wrote Ansible playbooks. I argued about Terraform module structure. I was good at it. Comfortable.
And then I started reading breach reports at 2 AM.
The Night Everything Changed
It was March of last year. I couldn’t sleep. I’d had coffee too late in the afternoon and my brain was spinning.
I found myself on Hacker News, reading about a company that had lost customer data. Not a breach — a misconfiguration. An S3 bucket. Public read access. Millions of records.
“That could never happen to us,” I thought. “We have policies. We check these things.”
But I couldn’t shake it. I got up. Opened my laptop. Started checking.
Four hours later, I found three public S3 buckets. None of them had customer data, thankfully. But they had logs. Config files. Enough information for an attacker to map our infrastructure.
How did they get public? The Terraform that created them predated our bucket policy module. They were provisioned in the early days, before we had standards, and nobody had gone back to audit them. No checkov in the pipeline. No AWS Config rules. Just terraform apply and move on.
That’s the uncomfortable part. These weren’t someone else’s mistake. They were my infrastructure. My Terraform. My lack of a review process. Finding them wasn’t just a security win — it was a critique of how I’d been working for years.
I fixed them. Wrote an incident report. Told the team.
And then I couldn’t stop.
The Shift
I started spending my nights differently. Instead of reading about new Terraform providers or AWS service updates, I was reading about:
- JWT vulnerabilities: specifically how stateless tokens resist revocation, since a signed token stays valid until it expires unless the server keeps extra state such as a denylist
- SQL injection techniques that still work in 2026: parameterized queries solve the classic case, but ORMs like TypeORM have their own foot-guns (raw queries, addSelect on relations, QueryBuilder with string interpolation)
- How to bypass CORS: wildcard subdomain matching, reflected origins, preflight caching abuse
- What trust proxy actually does in Express, and why getting the hop count wrong means your rate limiter trusts attacker-supplied IPs (this matters if you’re behind CloudFront + ALB)
I’m not claiming expertise in all of these. I’m claiming curiosity. I went deep enough on each to spot patterns in our own code and ask better questions during reviews. That’s different from being a pentester, and I try not to confuse the two.
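To make the CORS item concrete, here is a minimal TypeScript sketch of how a loose subdomain pattern can go wrong. The domain example.com, the allowlist contents, and both function names are illustrative assumptions, not code from a real project.

```typescript
// Hypothetical allowlist; example.com stands in for a real domain.
const ALLOWED_ORIGINS = new Set<string>([
  "https://app.example.com",
  "https://admin.example.com",
]);

// Anti-pattern: the pattern is not anchored at the end and the last dot is
// not escaped, so hosts that merely contain the expected string also match.
const LOOSE_PATTERN = /https:\/\/.*\.example.com/;

function looseOriginCheck(originHeader: string): boolean {
  return LOOSE_PATTERN.test(originHeader);
}

// Safer: parse the header value and compare against an exact allowlist.
function strictOriginCheck(originHeader: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(originHeader);
  } catch {
    return false; // not a parseable URL, so reject it
  }
  return ALLOWED_ORIGINS.has(parsed.origin);
}
```

With these definitions, looseOriginCheck accepts https://app.example.com.attacker.net, while strictOriginCheck rejects it and still accepts https://app.example.com.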
I started looking at our code differently. Every endpoint was a question: “What if someone sends weird input?”
Every configuration was a suspicion: “Is this actually secure, or does it just look secure?”
I started running curl commands against our staging environment at odd hours. Creating test invoices. Seeing what error messages revealed. Mapping out what endpoints existed.
I was becoming… paranoid.
But in a good way. Mostly.
The tricky part about paranoia is knowing when to stop. I’ve lost sleep over things that turned out to be fine. I’ve filed false positives with embarrassing confidence (I wrote a whole post about that). The paranoia is useful, but it needs guardrails — a checklist, a second opinion, and the willingness to say “I was wrong.”
Why Not Just Use Tools?
People ask me why I do this manually. Why not use Burp Suite? Or OWASP ZAP? Or one of the enterprise scanners?
I do use those tools. They’re great for coverage. A DAST scanner will methodically test every parameter on every endpoint. A dependency scanner will catch known CVEs in your package-lock.json. These are table stakes — you should run them regardless.
But automated tools find known patterns. What I find manually is different:
- The CORS misconfiguration that accepts any subdomain — technically valid regex, but allows attacker-controlled origins
- The scientific notation that creates million-dollar invoices: 1e6 passes IsNumber() validation because it is a number, but the business logic doesn’t expect seven-figure quantities. The root cause is treating type checking as input validation.
- The internal fields that shouldn’t be updatable but are: Object.assign(user, req.body) merges everything, including isDeleted and role
- The webhook that accepts anything because validation was deferred to an async queue: a design tradeoff that became a security gap when the queue processor didn’t validate either
These aren’t in the CVE database. They’re business logic vulnerabilities — the result of how our specific code, our specific ORM configuration, and our specific architectural decisions interact. Automated tools don’t know your business rules.
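The scientific notation case is easy to reproduce. The sketch below contrasts a bare type check with a check that also encodes a business rule. The limit of 1000 and the function names are assumptions chosen for illustration; the real limit depends on what the product actually sells.

```typescript
// Hypothetical business limit on a single line item.
const MAX_QUANTITY = 1000;

interface QuantityResult {
  ok: boolean;
  reason?: string;
}

// Type check only: roughly what a bare "is it a number?" validator accepts.
function isNumberOnly(value: unknown): boolean {
  return typeof value === "number" && Number.isFinite(value);
}

// Type check plus business rules: whole number within an expected range.
function validateQuantity(value: unknown): QuantityResult {
  if (typeof value !== "number" || !Number.isFinite(value)) {
    return { ok: false, reason: "not a finite number" };
  }
  if (!Number.isInteger(value)) {
    return { ok: false, reason: "must be a whole number" };
  }
  if (value < 1 || value > MAX_QUANTITY) {
    return { ok: false, reason: `must be between 1 and ${MAX_QUANTITY}` };
  }
  return { ok: true };
}

// JSON allows exponent notation, so the literal 1e6 parses to 1000000.
const parsedBody = JSON.parse('{"quantity": 1e6}') as { quantity: unknown };
const typeCheckPasses = isNumberOnly(parsedBody.quantity); // true
const businessCheck = validateQuantity(parsedBody.quantity); // ok is false
```

The type check is not wrong, it is just answering a different question than the one the invoice logic cares about.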
That said, manual testing doesn’t scale. I can’t personally curl every endpoint before every release. The real answer is both: automated tools for breadth, manual testing for depth. I use the manual findings to write custom rules that catch similar patterns automatically next time.
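As one example of turning a manual finding into a custom rule, the sketch below scans source text for the Object.assign(user, req.body) pattern mentioned earlier. It is a deliberately crude regex check with hypothetical names; a proper linter or static-analysis rule would be more robust, and this version only looks at one line at a time.

```typescript
interface Finding {
  line: number; // 1-based line number in the scanned source
  text: string; // the matching line, trimmed
}

// Matches Object.assign(<target>, req.body ...) on a single line,
// i.e. the raw request body merged straight into another object.
const MASS_ASSIGN_PATTERN = /Object\.assign\s*\(\s*[^,]+,\s*req\.body\b/;

function findMassAssignment(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    if (MASS_ASSIGN_PATTERN.test(text)) {
      findings.push({ line: index + 1, text: text.trim() });
    }
  });
  return findings;
}
```

A CI step could run a function like this over changed files and fail the build when the returned list is not empty.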
The DevOps to DevSecOps Transition
Making the official shift to DevSecOps wasn’t a single moment. It was gradual.
First, I started including security checks in my Terraform modules. Automated S3 bucket policy validation. IAM policy scanning. Things that would have caught those public buckets before they reached production.
Then, I started doing pre-deployment security reviews. Not formal audits — just “let me look at this endpoint before it goes live.”
Then, I found that critical vulnerability. The one that would have let anyone create invoices without authentication. No login. No rate limiting. Just curl -X POST and a JSON payload.
That was the inflection point. Not because it was the scariest finding, but because it had a dollar sign attached. The team could see: “If an attacker found this, it would cost us real money.” Abstract security risks are easy to deprioritize. A vulnerability that generates fake invoices in your accounting system is not.
Now, I’m the person who:
- Reviews new endpoints for auth issues
- Checks input validation logic
- Runs fuzzing tests before releases
- Writes threat models for new features
- Panics at 3 AM about configurations
It’s a different job. Same skills, different focus.
What I Learned
1. It’s Usually a System Problem, Not a People Problem
Developers aren’t reckless. They’re working under deadline pressure, trusting that:
- DTOs will protect them
- Frameworks handle security
- Users will send reasonable input
- “It works” means “it’s secure”
None of these assumptions holds up reliably. But the solution isn’t “developers should be more careful.” The solution is building systems that make the secure path the easy path: validation pipes that reject unknown fields by default, auth decorators that are required rather than opt-in, CI checks that flag endpoints without authorization.
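A CI check that flags endpoints without authorization can be quite small. The sketch below assumes the route list has already been extracted somehow (for example from framework metadata); the RouteInfo shape, the public registry, and the example routes are all hypothetical.

```typescript
interface RouteInfo {
  method: string;
  path: string;
  requiresAuth: boolean; // true when an auth guard or decorator is attached
}

// Hypothetical registry of endpoints that are intentionally public.
const PUBLIC_ROUTES = new Set<string>(["GET /health", "POST /login"]);

// Returns every route that has no auth and is not in the public registry.
function findUnprotectedRoutes(routes: RouteInfo[]): string[] {
  return routes
    .filter((route) => !route.requiresAuth)
    .map((route) => `${route.method.toUpperCase()} ${route.path}`)
    .filter((key) => !PUBLIC_ROUTES.has(key));
}

const exampleRoutes: RouteInfo[] = [
  { method: "GET", path: "/health", requiresAuth: false },
  { method: "POST", path: "/invoices", requiresAuth: false },
  { method: "GET", path: "/invoices/:id", requiresAuth: true },
];

// Only POST /invoices is reported: it lacks auth and is not registered as public.
const unprotected = findUnprotectedRoutes(exampleRoutes);
```

The design choice is that being public is an explicit decision recorded in one place, so a forgotten guard shows up as a failing check rather than a surprise in production.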
2. Infrastructure and Application Security Overlap More Than You Think
You can’t fully secure the app without understanding the infrastructure. You can’t fully secure the infrastructure without understanding the app.
That ALB security group misconfiguration? That allowed the X-Forwarded-For bypass, which broke rate limiting, which enabled brute-force attacks on the login endpoint. Infrastructure problem → application vulnerability → business risk. The chain only makes sense if one person understands both layers.
They’re not the same discipline. But in a small team, they need to be the same person.
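The X-Forwarded-For chain above hinges on which header entry the application treats as the client. As a rough sketch (not Express’s actual implementation), the function below walks back a fixed number of trusted hops from the socket peer. The function name, the hop counts, and the documentation-range IP addresses are assumptions for illustration.

```typescript
// Resolve the client IP by stepping back `trustedHops` proxies from the
// socket peer. Anything further left in the header is client-controlled.
function resolveClientIp(
  socketAddress: string,
  forwardedFor: string | undefined,
  trustedHops: number,
): string {
  const forwarded = (forwardedFor ?? "")
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);

  // Nearest hop first: socket peer, then X-Forwarded-For from right to left.
  const chain = [socketAddress, ...forwarded.reverse()];

  // Skip one entry per trusted proxy without reading past the end.
  const index = Math.max(0, Math.min(trustedHops, chain.length - 1));
  return chain[index] ?? socketAddress;
}

// Assumed setup: client -> CDN edge -> load balancer -> app.
// The attacker sends their own X-Forwarded-For value (198.51.100.9);
// each proxy appends the address it received the request from.
const headerValue = "198.51.100.9, 203.0.113.7, 192.0.2.10";
const correctHops = resolveClientIp("10.0.0.5", headerValue, 2); // "203.0.113.7"
const tooManyHops = resolveClientIp("10.0.0.5", headerValue, 3); // "198.51.100.9"
```

With one hop too many, the rate limiter keys on a value the attacker chose, so rotating the header resets the limit on every request.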
3. Paranoia Is a Feature (With Limits)
I used to think being paranoid about security was a character flaw. Now I think it’s a job requirement.
Assume everything is broken until proven otherwise. Trust no input. Verify every configuration.
But healthy paranoia needs boundaries. I time-box my testing. I write up findings and move on instead of spiraling. I accept that some risk is tolerable. The goal isn’t zero vulnerabilities — it’s knowing where the vulnerabilities are and making informed decisions about which ones to fix first.
4. Documentation Is Security
The most secure system is one where the next person can understand what you built and why. Undocumented magic becomes undocumented vulnerability.
Specifically, what helps:
- Threat models — what are we protecting, from whom, and what’s the worst case?
- Runbooks — when something goes wrong, what’s the step-by-step response?
- Architecture decision records — why did we choose this auth flow? What tradeoffs did we accept?
- Security invariants: “all endpoints require @Authorized() unless explicitly listed in the public endpoint registry”
I write more docs now than ever. Not because I like it (I don’t), but because I’ve seen what happens when tribal knowledge leaves the building.
The Tools I Use
I said I do most of this manually, and I do. But I also use tools — and I should be honest that my stack is AWS-centric. If you’re on GCP or Azure, the infrastructure tools change, but the application testing approach is the same.
For infrastructure:
- checkov: Terraform security scanning (catches public buckets, overly permissive IAM)
- tfsec: Another Terraform scanner (I run both; they catch different things)
- AWS Config rules: Continuous compliance checking
For applications:
- curl: Still my go-to for simple requests. For complex flows with auth state, cookies, or multi-step sequences, I write bash scripts that chain requests together. It’s not pretty, but it’s reproducible.
- jq: JSON manipulation and analysis
- Custom scripts: For specific tests I run repeatedly (BOLA checks, input fuzzing loops)
For monitoring:
- CloudTrail logs — Watching for weird API calls
- Application logs — Looking for error patterns
- Custom alerts — For things that shouldn’t happen
What I don’t use yet but probably should: a proper SAST tool for the application code, and runtime protection (WAF rules beyond the basics). Those are next on my list.
The Mindset Shift
The biggest change wasn’t technical. It was mental.
As a DevOps engineer, my goal was: “Make it work. Make it reliable. Make it scalable.”
As a DevSecOps engineer, my goal is: “Make it work. Make it reliable. Make it scalable. And make sure an attacker can’t abuse it.”
That last part changes everything. You start doing informal threat modeling without even calling it that:
- What happens if this endpoint is called 1000 times per second? (abuse case, not scaling — does the rate limiter actually work, or is it counting the wrong IPs?)
- What happens if someone sends a 10MB JSON payload? (resource exhaustion — does the parser have a size limit?)
- What happens if the database connection fails mid-transaction? (error disclosure — does the 500 response leak connection strings?)
- What happens if someone sets isDeleted: true on their own account? (mass assignment: does the DTO reject unknown fields?)
Edge cases become primary concerns. Failure modes become features.
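The mass assignment question is the easiest of these to answer in code. Here is a minimal sketch of an update handler that only accepts an explicit list of fields and rejects everything else. The User shape, the field list, and the function name are assumptions, not taken from a real codebase.

```typescript
interface User {
  id: number;
  displayName: string;
  email: string;
  role: string;
  isDeleted: boolean;
}

// Only these fields may be changed by the account owner.
const UPDATABLE_FIELDS = ["displayName", "email"] as const;

function applyProfileUpdate(user: User, body: Record<string, unknown>): User {
  const unknownKeys = Object.keys(body).filter(
    (key) => !(UPDATABLE_FIELDS as readonly string[]).includes(key),
  );
  if (unknownKeys.length > 0) {
    // Reject instead of silently dropping, so probing attempts are visible.
    throw new Error(`Unexpected fields: ${unknownKeys.join(", ")}`);
  }

  // Copy the user and apply only allowlisted string values.
  const updated: User = { ...user };
  for (const field of UPDATABLE_FIELDS) {
    const value = body[field];
    if (typeof value === "string") {
      updated[field] = value;
    }
  }
  return updated;
}
```

Compared with Object.assign(user, req.body), a request containing isDeleted or role fails loudly here instead of being merged into the record.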
Advice for DevOps Engineers Making the Shift
If you’re reading this and thinking “I should do more security,” here’s what I’d tell you:
1. Start with your own infrastructure
Before you touch application code, audit what you already own. Run checkov on your Terraform. Check your S3 bucket policies. Review your IAM roles. You’ll probably find something, and it’ll motivate everything that follows.
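For the bucket policy part of that audit, a very rough first pass might look like the sketch below: given a bucket policy document that has already been fetched and parsed, it lists Allow statements open to everyone. The types are a simplified subset of the policy format and the function names are hypothetical. It ignores ACLs, Block Public Access settings, and the meaning of any Condition, so treat it as a triage aid rather than a verdict.

```typescript
// Simplified subset of an S3 bucket policy statement.
interface PolicyStatement {
  Effect: "Allow" | "Deny";
  Principal?: "*" | { AWS?: string | string[] };
  Action: string | string[];
  Resource: string | string[];
  Condition?: Record<string, unknown>;
}

interface BucketPolicy {
  Version?: string;
  Statement: PolicyStatement[];
}

function isWildcardPrincipal(principal: PolicyStatement["Principal"]): boolean {
  if (principal === undefined) return false;
  if (principal === "*") return true;
  const aws = principal.AWS;
  if (aws === undefined) return false;
  return Array.isArray(aws) ? aws.includes("*") : aws === "*";
}

// Returns Allow statements that apply to any principal and carry no Condition.
function findPublicStatements(policy: BucketPolicy): PolicyStatement[] {
  return policy.Statement.filter(
    (statement) =>
      statement.Effect === "Allow" &&
      isWildcardPrincipal(statement.Principal) &&
      statement.Condition === undefined,
  );
}
```

Statements with a Condition are skipped on purpose; they need a human to read them, since a condition can either restrict access properly or be trivially satisfiable.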
2. Learn one attack vector deeply
Don’t try to learn everything. Pick one that’s relevant to your stack:
- Mass assignment — if you use an ORM with auto-mapping (TypeORM, Sequelize, ActiveRecord)
- BOLA/IDOR — if your API uses sequential IDs in URLs
- Authentication bypass — if you manage your own auth flow
- SSRF — if your app makes outbound HTTP requests based on user input
Learn it well enough to spot it in the wild. Then pick another.
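To give a sense of what “spotting it in the wild” looks like, here is a small sketch of BOLA/IDOR using an in-memory map in place of a database. The Invoice shape, the sample data, and both function names are assumptions for illustration.

```typescript
interface Invoice {
  id: number;
  ownerId: number;
  total: number;
}

type Lookup = { status: 200; invoice: Invoice } | { status: 404 };

// Hypothetical in-memory store standing in for a database table.
const invoices = new Map<number, Invoice>([
  [1, { id: 1, ownerId: 100, total: 250 }],
  [2, { id: 2, ownerId: 200, total: 975 }],
]);

// Vulnerable: any authenticated caller can read any invoice by guessing IDs.
function getInvoiceUnsafe(invoiceId: number): Lookup {
  const invoice = invoices.get(invoiceId);
  return invoice === undefined ? { status: 404 } : { status: 200, invoice };
}

// Safer: the lookup is scoped to the requester. "Missing" and "not yours"
// both return 404 so responses do not confirm which IDs exist.
function getInvoiceForUser(invoiceId: number, requesterId: number): Lookup {
  const invoice = invoices.get(invoiceId);
  if (invoice === undefined || invoice.ownerId !== requesterId) {
    return { status: 404 };
  }
  return { status: 200, invoice };
}
```

The manual test is the same idea from the outside: log in as one user, request another user’s ID, and see whether the response differs from a request for an ID that does not exist.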
3. Read breach post-mortems
Not the sensational ones. The detailed ones. Cloudflare’s blog posts. GitLab’s incident reports. The ones that explain the root cause, the timeline, and what they changed. Understand how real attacks happen in production.
4. Break something (in staging, with permission)
Find a vulnerability. Exploit it responsibly. Document the steps. See how it feels.
That feeling — the mix of excitement and dread — is how you know you’re thinking like an attacker.
Always test against systems you own or have explicit authorization to test. “Staging” doesn’t mean “someone else’s staging.”
5. Build a checklist, then automate it
Every manual finding should become a check that runs automatically next time. Found a public S3 bucket? Add a checkov rule. Found missing auth on an endpoint? Add a test that verifies @Authorized() is present. The goal is to make yourself unnecessary for the easy stuff so you can focus on the hard stuff.
Where I’ve Been Wrong
I should be honest: I’ve made mistakes in this process.
I shipped a false positive report about an S3 bucket being writable. I’d tested with --dryrun, seen “success,” and written it up as critical. The actual upload failed with AccessDenied. The --dryrun flag doesn’t check IAM permissions — it just validates syntax. I had to correct my own report.
I’ve also over-prioritized. Not everything is critical. A stored XSS in a contact form that only renders in an internal admin panel is not the same as unauthenticated invoice creation. Learning to triage — to say “this is medium, not critical” — took longer than learning to find the bugs in the first place.
What’s Next
I’m not stopping. The landscape keeps changing. New frameworks, new vulnerabilities, new attack vectors.
AI is changing both sides of the equation:
- Attackers benefit: AI can scan JavaScript source, extract API keys and Cognito pool IDs, and test every permission automatically. The time between “vulnerability deployed” and “vulnerability exploited” is shrinking.
- Defenders benefit too: AI-assisted code review can flag insecure patterns before they ship. Copilot-style tools can suggest the secure version of what you’re trying to write. I use AI to help write fuzzing payloads and review Terraform for misconfigurations.
The net effect is that the baseline for security is rising. What was “good enough” two years ago isn’t anymore.
Which means I need to keep learning. Keep testing. Keep breaking things on purpose so they don’t get broken by someone else.
If you’re a DevOps engineer reading this, consider making the shift. You don’t need a certification. You don’t need to become a pentester. You just need to start asking “what if?” about the systems you already manage.
The paranoia helps. The checklists help more.
If you’re on a similar journey, or thinking about starting one, I’d love to hear from you. LinkedIn or GitHub. Let’s be paranoid together.