Why DevSecOps is the new bottleneck for cloud-native teams

This is the first of two posts on why DevSecOps can end up backfiring within cloud-native organizations and what you can do about it. In part two, we’ll share how we’ve seen some cloud-native teams overcome that cloud DevSecOps bottleneck.

DevSecOps and the “shift left” movement need no introduction in today’s cloud-native landscape. DevOps, engineering, and security have all been talking about embedding security earlier in the development lifecycle for years now. And there are some successes—AppSec and container security have certainly had their “shift left” moments, with tools like Snyk spearheading the way to put security tooling in the hands of developers.

Cloud security, on the other hand, has lagged on the DevSecOps front. We don’t want to confuse correlation with causation, but we are dealing with the consequences. Cloud misconfiguration is the number one cause of data breaches, which keep getting more expensive, and executives are only getting more concerned about them.

In the past year, while building the Bridgecrew platform, we’ve talked with many teams to understand their cloud security wins and challenges. Collaboration between security and DevOps was a common—if not the most common—topic of discussion. We heard how some cloud-native engineering and security teams are incorporating far-left DevSecOps in the cloud super successfully. We also heard about how DevSecOps has backfired for others, creating almost insurmountable bottlenecks. In this two-part series of posts, we’ll share the common themes we heard on both sides.

Conflicting goals and processes

It’s no secret that the motivations of DevOps and security are often at odds. DevOps is motivated by working iteratively and moving fast, while security gets a bad rap for being a hindrance.

DevSecOps is supposed to fix this by integrating security into the development lifecycle and embedding it into code review processes via CI/CD, but cloud security isn’t quite there. Cloud security has traditionally been concerned with the security and compliance of already deployed resources. While effective up to a point, that model operates, by design, outside of the development lifecycle. It’s reactive rather than proactive, and it is often made obsolete by the agile processes it runs out of step with. The same goes for GRC tooling and auditing, where point-in-time analysis of cloud environments occurs outside of engineering sprints and DevOps processes.

These points of friction happen partly because the security mindset has mainly been protective rather than preventative. DevSecOps in the cloud is also difficult because of the overall lack of processes for developing scalable infrastructure via code, let alone securing it.

Unclear and inconsistent policy governance

Infrastructure as code (IaC) is an answer to that call, alleviating the performance and cost-related challenges of deploying infrastructure at scale. IaC frameworks like CloudFormation and Terraform bring tremendous scalability and repeatability to cloud infrastructure. They also come with their own drawbacks.

IaC creates additional permutations of cloud infrastructure, managed by different teams with varying workflows. Those workflows create layers of complexity and ambiguity as to where and when policies are being enforced. At best, that ambiguity creates redundancy; at worst, it leads to risk. Imagine what happens when engineering assumes that security and compliance policies are being enforced at the resource level, by cloud provider tools or by other solutions that security has purchased. Meanwhile, security doesn’t even know about all of the provisioning frameworks in use, or whether the resources those frameworks deploy have the proper policies in place.
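To make the redundancy-versus-risk point concrete, here is a minimal sketch of an enforcement inventory. The policy names and enforcement layers are hypothetical, invented purely for illustration; a real inventory would come from whatever tools and pipelines a team actually runs.

```python
# Hypothetical inventory: for each policy, the layers that claim to enforce it.
ENFORCEMENT = {
    "s3 encryption": {"cloud provider rule", "ci scan"},
    "no public security groups": {"ci scan"},
    "iam least privilege": set(),  # everyone assumes someone else covers this
}


def classify(enforcement):
    """Split policies into gaps (no enforcing layer) and redundancies (several)."""
    gaps = sorted(p for p, layers in enforcement.items() if not layers)
    redundant = sorted(p for p, layers in enforcement.items() if len(layers) > 1)
    return gaps, redundant


gaps, redundant = classify(ENFORCEMENT)
print("Unenforced (risk):", gaps)
print("Enforced more than once (redundancy):", redundant)
```

Writing the mapping down, in whatever form, is the point here: a gap like the third entry only becomes visible once each team states where it thinks a policy is enforced.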

Again, with DevSecOps and IaC, these issues should be alleviated. But without a concerted effort to define where and how policies are governed, more work ends up being created. Until that happens, governance will always be a somewhat manual, and therefore inconsistent and time-consuming, process. For issues (like missing encryption on an S3 bucket, or overly permissive security group rules) that do get fixed at runtime when IaC is in use, we have found that there is a 70% chance they will recur down the line. Without a workflow to fix issues at the source, the misconfigured code keeps deploying misconfigured cloud resources, and the same work has to be done again.
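The recurrence problem can be sketched in a few lines of Python. Everything here is a simplified assumption: the dictionary stands in for an IaC template, `deploy` stands in for a provisioning run, and the two checks are toy versions of the S3 and security group examples above, not any real tool’s rules.

```python
from copy import deepcopy

# Hypothetical, simplified stand-in for an IaC template: resource name -> settings.
IAC_SOURCE = {
    "logs_bucket": {"type": "s3_bucket", "encrypted": False},
    "web_sg": {"type": "security_group", "ingress_cidrs": ["0.0.0.0/0"]},
}


def find_misconfigurations(resources):
    """Return (resource_name, issue) pairs for two example policies."""
    findings = []
    for name, cfg in sorted(resources.items()):
        if cfg["type"] == "s3_bucket" and not cfg.get("encrypted", False):
            findings.append((name, "missing encryption"))
        if cfg["type"] == "security_group" and "0.0.0.0/0" in cfg.get("ingress_cidrs", []):
            findings.append((name, "ingress open to the world"))
    return findings


def deploy(source):
    """Pretend deployment: the running environment becomes a copy of the source."""
    return deepcopy(source)


# Deploy, then patch the live environment only (a runtime fix).
runtime = deploy(IAC_SOURCE)
runtime["logs_bucket"]["encrypted"] = True
runtime["web_sg"]["ingress_cidrs"] = ["10.0.0.0/16"]
findings_after_runtime_fix = find_misconfigurations(runtime)

# The next deploy from the unchanged source brings both issues back.
runtime = deploy(IAC_SOURCE)
findings_after_redeploy = find_misconfigurations(runtime)

print("After runtime fix:", findings_after_runtime_fix)
print("After redeploy:", findings_after_redeploy)
```

The runtime patch clears both findings, but because `IAC_SOURCE` never changed, the next deployment reintroduces them. Running the same check against the source before deploying would have caught them once, at the origin.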

Skills and access gap

Automated tools, including our own open-source IaC scanner, Checkov, have made it far easier to enforce policies at different stages of development and deployment. They surface everything from critical security issues down to informational best practice violations. On the surface, they may seem like the answer to our cloud DevSecOps woes, but they can often contribute to the very bottleneck they are trying to alleviate.
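A small sketch shows how quickly scanner output turns into workload. The severity scale and the findings list below are hypothetical and do not reflect Checkov’s or any other scanner’s actual output format; they only illustrate how the volume of work depends on which findings are treated as actionable.

```python
from collections import Counter

# Hypothetical severity scale, lowest to highest.
SEVERITY_ORDER = ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"]

# Hypothetical scanner output; real tools report findings in their own formats.
FINDINGS = [
    {"resource": "logs_bucket", "check": "encryption enabled", "severity": "HIGH"},
    {"resource": "web_sg", "check": "no open ingress", "severity": "CRITICAL"},
    {"resource": "logs_bucket", "check": "has owner tag", "severity": "INFO"},
    {"resource": "app_role", "check": "description set", "severity": "INFO"},
    {"resource": "db", "check": "backup retention", "severity": "MEDIUM"},
]


def count_by_severity(findings):
    """Tally findings per severity level."""
    return Counter(f["severity"] for f in findings)


def at_or_above(findings, threshold):
    """Keep only findings whose severity is at or above the threshold."""
    min_rank = SEVERITY_ORDER.index(threshold)
    return [f for f in findings if SEVERITY_ORDER.index(f["severity"]) >= min_rank]


print(dict(count_by_severity(FINDINGS)))
print("Tickets if everything is actionable:", len(at_or_above(FINDINGS, "INFO")))
print("Tickets if only HIGH and above:", len(at_or_above(FINDINGS, "HIGH")))
```

Even in this toy case, treating every finding as a ticket more than doubles the work compared with acting only on the most severe ones. Multiply that by every repository and every scan, and the backlog grows fast.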

When misconfigurations are identified, that means work needs to be done. Security and compliance typically don’t have the necessary access to code repos or cloud consoles to implement fixes. Even if they do, they likely lack the know-how and full application or infrastructure awareness to fix cloud misconfigurations themselves.

So they open a ticket and assign it to a team lead who may or may not work on the team the issue relates to. Then that remediation gets slotted into some numbered sprint. Meanwhile, ten more “P0” tickets have been created. With automated scanning in place, those tickets easily snowball into hundreds more, get shuffled from team to team, and pile up in the backlog.

On the other end, you have the well-documented cybersecurity skills shortage, which exacerbates the issues above. As a result, security and compliance responsibilities get put on the plates of developers who are often not adequately prepared. Whether there are security resources at hand or not, it is unrealistic to expect developers to have comprehensive knowledge of all possible configuration issues they may face when building cloud infrastructure. This makes it easy for misconfigurations to be deployed. Those unaddressed misconfigurations don’t just represent risk; they become exponentially more costly to address in production. And then the cycle repeats.

DevSecOps aims to bring security into the fold, moving it directly into the software development lifecycle in order to avoid misconfigurations or weak implementations that can leave the app, service, and infrastructure vulnerable. As we continue to shoehorn cloud security into DevSecOps, we have to be aware of these gotchas, or risk slowing down the process and incurring unnecessary costs along the way—which is the exact opposite of what DevOps aims to do.

In part two, we’ll share how we’ve seen some cloud-native teams overcome that cloud DevSecOps bottleneck.

This post originally appeared on The New Stack.