Addressing security throughout the infrastructure DevOps lifecycle

No, this isn’t another post about the Secure Development Lifecycle. This is a practical post on why and how to address cloud security at each step of the infrastructure development lifecycle, from infrastructure as code in your IDE to running cloud resources.

The idea of addressing cloud security best practices earlier in the development lifecycle is catching on, as cloud security posture management (CSPM) vendors and cloud workload protection platforms (CWPPs) expand their offerings and acquire developer-first companies like ours. We think it’s for a good reason.

Whether you call it developer-first security, DevSecOps, or “shift left,” this approach provides a solution to two pervasive challenges with DevOps and cloud native methodologies:

  1. The further software flaws get from developers’ workstations, the more expensive they become. Research suggests that it can be up to 100x more expensive to find and fix a software flaw in production.
  2. Misconfigurations due to human error account for 95% of cloud security incidents. The cloud provides increased scalability, mobility and flexibility, while raising the stakes for mistakes. Although cloud service providers (CSPs) themselves are super secure, the complexity of APIs and services dramatically increases the risk of security misconfigurations.

Building on the increased use of cloud configuration frameworks and infrastructure as code (IaC), the “shift left” approach promises improved efficiency and decreased risk. But as is true with most new and disruptive ideas, it’s essential to view it within the context of real cloud environments, real developer workflows, and real business risk.

Misconfigurations indeed become more risky and expensive the further along in the lifecycle they get. However, it’s also true that the earlier in the lifecycle you are, the less information is available about what infrastructure will look like and, thus, what misconfigurations may be present.

That’s why it’s important to address infrastructure security at both runtime and build time. We would take that a step further and say that, depending on the nature of your team, environment, and processes, there are differences between the phases of the development lifecycle that are important to consider.

Runtime/Production

Keep in mind that developer-first security doesn’t preclude “traditional” cloud security methods—namely monitoring running cloud resources for security and compliance misconfigurations.

First of all, unless you’ve achieved 100% parity between IaC and the cloud (unlikely), runtime scanning is essential for complete coverage. You probably have teams or parts of your environment, such as legacy resources, that are still provisioned manually through older systems or directly in your console, and those need to be continuously monitored.

Even if you are mostly covered by IaC, humans make mistakes and SREmergencies are bound to happen. We recently wrote about the importance of cloud drift detection to catch manual changes that result in unintentional deltas between code configuration and running cloud resources. Insight into those resources in production is essential to identify those potentially risky gaps.
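To make the idea of drift concrete, here is a minimal sketch that compares the attributes declared in code with the attributes observed on the running resource. The function name, the ABSENT marker, and the sample attribute values are illustrative assumptions; in practice the declared side would come from your IaC and the actual side from your cloud provider’s API.

```python
# Hypothetical marker for an attribute that exists on only one side.
ABSENT = "<absent>"


def detect_drift(declared, actual):
    """Return {attribute: (declared_value, actual_value)} for every mismatch."""
    drift = {}
    for key in sorted(set(declared) | set(actual)):
        want = declared.get(key, ABSENT)
        have = actual.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


# Illustrative data: what the code declares vs. what is running.
declared = {"storage_encrypted": True, "publicly_accessible": False}
actual = {
    "storage_encrypted": True,
    "publicly_accessible": True,   # changed by hand in the console
    "backup_retention_period": 0,  # never declared in code
}

for attribute, (want, have) in detect_drift(declared, actual).items():
    print(f"{attribute}: declared={want!r} actual={have!r}")
```

Any non-empty result is a delta worth reviewing, whether the fix is to update the code or to revert the manual change.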

Runtime scanning also has some advantages. Because it tracks the actual state of configurations, it’s the only viable way of evaluating configuration changes over time when configuration is managed through multiple methods.

Relying solely on build-time findings without attributing them to actual configuration states in runtime could result in configuration clashes. For example, attempting to encrypt a previously unencrypted DB instance could fail to apply, or it could destroy the current instance and replace it with a new, empty one, because some managed DB services, like RDS, do not permit enabling encryption on an existing instance.
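One way to catch this kind of clash before it happens is to look for resources that a plan would destroy and recreate. The sketch below assumes a Terraform plan that has been exported to JSON (for example with terraform show -json) and loaded into a dictionary; the function name and the sample plan are hypothetical, and only the resource_changes, address, and change.actions fields of that format are used.

```python
def find_replacements(plan):
    """Return addresses of resources the plan would destroy and recreate."""
    replaced = []
    for resource in plan.get("resource_changes", []):
        actions = resource.get("change", {}).get("actions", [])
        # A replacement shows up as both "delete" and "create" on one resource.
        if "delete" in actions and "create" in actions:
            replaced.append(resource["address"])
    return replaced


# Illustrative plan fragment, not real output.
plan = {
    "resource_changes": [
        {
            "address": "aws_db_instance.orders",
            "type": "aws_db_instance",
            "change": {"actions": ["delete", "create"]},
        },
        {
            "address": "aws_s3_bucket.logs",
            "type": "aws_s3_bucket",
            "change": {"actions": ["update"]},
        },
    ]
}

print(find_replacements(plan))
```

A database showing up in this list is a signal to stop and plan a snapshot-and-restore migration rather than letting the change apply as is.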

Runtime scanning is also a great way to satisfy compliance audits that require continuous change-control auditing and tracing. By mapping findings to compliance benchmarks and sections, you can use the scan reports as baseline evidence to meet most industry-specific requirements and audits.

CI/CD and Pull Request

The beauty of IaC is that it allows for preemptive and even proactive monitoring of infrastructure—before it’s deployed. CI/CD pipelines are critical in compiling infrastructure and testing that compiled code before it’s delivered and deployed. At this phase, IaC is quite representative of how it will be provisioned once it’s deployed.

For Terraform code, for example, the output of terraform plan contains far more detail than the code does at earlier stages. Scanning at this level is important for seeing whether misconfigurations exist in the core resources about to be provisioned and in their variables and dependent modules.
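As a rough illustration of what scanning a plan can look like, the sketch below walks a plan exported to JSON and checks the planned values of each resource against a small rule table. The CHECKS table, the messages, and the sample plan are assumptions made for this example; a real scanner such as Checkov ships far more policies and also handles values that are unknown until apply.

```python
# Hypothetical rule table: resource type -> (attribute, expected value, message).
CHECKS = {
    "aws_db_instance": [
        ("storage_encrypted", True, "RDS storage should be encrypted"),
        ("publicly_accessible", False, "RDS instance should not be public"),
    ],
}


def scan_plan(plan):
    """Return (address, message) for every planned value that breaks a rule."""
    findings = []
    for resource in plan.get("resource_changes", []):
        after = resource.get("change", {}).get("after")
        if after is None:
            continue  # resource is being deleted, nothing to evaluate
        for attribute, expected, message in CHECKS.get(resource.get("type"), []):
            # A missing attribute is treated as a failure in this sketch.
            if after.get(attribute) != expected:
                findings.append((resource["address"], message))
    return findings


# Illustrative plan fragment, not real output.
plan = {
    "resource_changes": [
        {
            "address": "aws_db_instance.orders",
            "type": "aws_db_instance",
            "change": {
                "actions": ["create"],
                "after": {"storage_encrypted": False, "publicly_accessible": False},
            },
        },
        {
            "address": "aws_db_instance.old",
            "type": "aws_db_instance",
            "change": {"actions": ["delete"], "after": None},
        },
    ]
}

for address, message in scan_plan(plan):
    print(f"{address}: {message}")
```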

The other benefit of embedding infrastructure security into the CI/CD pipeline is that it’s automated and can be fully customized for your workflow. You can determine what kinds of checks fail builds and surface feedback directly in your CI/CD provider. It’s also a collaborative process that your entire team is using to review, reject, and approve changes.
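Deciding which checks fail a build often comes down to a severity threshold. The following is a minimal sketch of that gate; the severity labels, the finding structure, and the check IDs are hypothetical and would need to match whatever your scanner actually reports.

```python
# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def build_exit_code(findings, fail_on="HIGH"):
    """Return 1 if any finding is at or above the threshold, otherwise 0."""
    threshold = SEVERITY_ORDER.index(fail_on)
    blocking = [
        finding
        for finding in findings
        if SEVERITY_ORDER.index(finding["severity"]) >= threshold
    ]
    return 1 if blocking else 0


# Illustrative findings with made-up check IDs.
findings = [
    {"id": "EXAMPLE_1", "severity": "MEDIUM"},
    {"id": "EXAMPLE_2", "severity": "LOW"},
]

print(build_exit_code(findings))                    # default threshold: HIGH
print(build_exit_code(findings, fail_on="MEDIUM"))  # stricter threshold
```

In a pipeline, the returned value would be passed to sys.exit so that the CI/CD provider marks the job as failed, while lower-severity findings are still surfaced as feedback.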

Similarly, for teams who live and die by their version control systems (VCSs), embedding security into pull requests and sanctioned code review processes has many benefits. Depending on what triggers your CI/CD builds, this approach can be used in tandem with, or instead of, scanning via CI/CD.

Your VCS may also offer some unique features for implementing internal controls. All three major platforms (GitHub, GitLab, and Bitbucket) offer versioning, branching, and embedded code review and authorization controls that allow developers to test without compromising running production systems and to continue making code changes before merging.

For example, you may want to set up your code review settings to block merges when checks fail, further ensuring that misconfigured IaC doesn’t make its way to your master branch.

Regardless of whether you’re scanning in CI/CD or your VCS, this level of control promotes collaboration and accessibility to developers. If you can provide fixes, in addition to detection, at the same stage, your task is that much easier.

Pre-Commit and IDE

One of the major benefits of scanning code before a CI/CD build gets triggered is the ability to make changes without waiting for your build to finish. For many of us, waiting for builds to finish is the bane of our existence. Re-running builds that failed due to a security misconfiguration is a huge time suck that can be avoided by shifting security a step left. Pre-commit unit and integration testing is a generally accepted best practice, but pre-commit security testing may not yet be on your list of priorities as an infrastructure developer.

Scanning IaC locally for misconfigurations is the best way to address errors in the safety of your own workspace, without having to waste time failing builds (and delaying your teammates’ builds) or tripping the wire on pull request checks.

Raising flags before code even reaches your shared repositories is the epitome of shift-left security, but the onus is on the developer to run the scans and make the changes.
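A pre-commit hook can take some of that onus off the developer by scanning only the IaC files that are about to be committed. The sketch below shows just the file-selection step; the suffix list and function name are assumptions, and in a real hook the staged paths would typically come from git diff --cached --name-only before being handed to your scanner of choice.

```python
from pathlib import PurePosixPath

# Assumed set of IaC file suffixes; extend it for your own frameworks.
IAC_SUFFIXES = {".tf", ".tfvars"}


def staged_iac_files(staged_paths):
    """Keep only the staged paths that look like IaC files."""
    return [
        path
        for path in staged_paths
        if PurePosixPath(path).suffix in IAC_SUFFIXES
    ]


# Illustrative list of staged paths.
staged = ["main.tf", "README.md", "modules/db/variables.tf", "prod.tfvars"]

print(staged_iac_files(staged))
```

Limiting the scan to changed files keeps the hook fast enough that developers are less tempted to skip it.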

The only thing better than scanning your IaC before committing it is getting in-line feedback wherever you’re actually writing code. Because it minimizes context switching (there’s no need to run ad hoc scans), it’s the cheapest and most foolproof way to identify misconfigurations.

To shift cloud security as far “left” as possible (besides the design and plan phases), embedding guardrails into your IDE is the best way to go. This likely will come in the form of a plugin or extension, such as the Checkov VS Code extension. The Checkov extension makes that scanning process even more seamless, by providing real-time and continuous scanning and in-line fixes as you code. Whether you’re taking the feedback and implementing changes in the moment, or you’re using IDE suggestions as an educational tool for writing secure infrastructure, there are very few downsides.

With the right balance of passive analysis and actionable feedback, you can become a better security advocate without an outsized amount of effort.

At the end of the day, that’s what will move the needle when it comes to making infrastructure security and compliance accessible to developers. Making these tools accessible while enforcing guardrails collectively and at different checkpoints is the true manifestation of shifting security left.

This post originally appeared in The New Stack.