Part 3: Top trends from analyzing the security posture of open-source Helm charts

So far, in this three-part series on the open-source Helm security state, we’ve discussed:

  • The importance of addressing security and compliance best practices when leveraging open-source Helm charts
  • Why we put together this research and high-level findings
  • The part dependencies play in the reusable Kubernetes manifest ecosystem

This section takes our analysis a step further by examining a single Helm chart from our Artifact Hub dataset. To illustrate why the security configuration of an open-source Helm chart matters when you adopt it for your own purposes, we’ll look at one of the most widely used dependencies on Artifact Hub, bitnami/postgresql. This chart bootstraps a PostgreSQL deployment on a Kubernetes cluster using the Helm package manager.

We leveraged Checkov to scan bitnami/postgresql, omitting policies for default namespaces as they don’t make sense in a Helm context. We’ll analyze the failing policies below and provide guidance on ensuring our charts conform to security and compliance best practices.

Before we look at the failed policies, we will compare the posture of the chart with the posture of the deployed Kubernetes manifest running on a cluster. To ensure the issues are consistent and we didn’t lose any resolution by scanning the template versus its runtime configuration, we compared the output from our scanner with output from Bridgecrew’s Kubernetes runtime integration.

High severity policies

There was only one high severity failed check for this chart. 

Ensure Containers Do Not Run With AllowPrivilegeEscalation

Check: CKV_K8S_20: "Containers should not run with allowPrivilegeEscalation"
FAILED for resource: StatefulSet.postgresql-postgresql.default (container 0) - postgresql
	File: /runtime.statefulsets.yaml:256-332
	Guide: https://docs.bridgecrew.io/docs/bc_k8s_19

This check failed because allowPrivilegeEscalation: false was not specified in the container’s securityContext. The CIS Kubernetes guidelines v1.6 recommend a Pod Security Policy that enforces this control at the cluster level by setting allowPrivilegeEscalation to false.

By default, it is set to true, so we should set it to false to ensure commands run as the configured runAsUser cannot gain privileges beyond their existing set of permissions. The linked guide provides context and helps with remediation.

The chart templates the securityContext section of the pod spec into values.yaml, so we can add our new entry there.
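As a rough sketch, this is the setting the rendered StatefulSet needs for the check to pass. It assumes the chart passes its security context values through to the container unchanged. The values.yaml key that feeds this field varies between chart versions and is not shown here.

```yaml
# Fragment of the rendered StatefulSet pod template (illustrative only).
# The container name matches the scan output above; all other fields
# of the container spec are omitted for brevity.
spec:
  template:
    spec:
      containers:
        - name: postgresql
          securityContext:
            # Stops the process from gaining more privileges than its parent.
            allowPrivilegeEscalation: false
```

Note that allowPrivilegeEscalation is a container-level field, so it has to end up under the container’s securityContext rather than the pod-level one.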

Once that remediation is made, the chart passes CKV_K8S_20.

Medium severity policies

After the high severity issue, we took a look at two medium severity policies.

Ensure that the seccomp profile is set to docker/default or runtime/default

Check: CKV_K8S_31: "Ensure that the seccomp profile is set to docker/default or runtime/default"
        FAILED for resource: StatefulSet.RELEASE-NAME-postgresql.default
        File: /postgresql/templates/statefulset.yaml:3-144
        Guide: https://docs.bridgecrew.io/docs/bc_k8s_29

This issue relates to a recently released Kubernetes feature that enforces seccomp kernel protections for workloads. Restricting the kernel surface exposed to a vulnerable application limits the blast radius of a privilege escalation.

Because the changes were so new and unfamiliar to us, we had to dig into the related Kubernetes PRs for documentation. In our research, we found this comment on the issue, spoken like a true security professional: “Short term, the default profile should be that of the underlying container runtime installation. While a constrained seccomp profile could break backward compatibility, critical safety features should be opt-out, not opt-in.”

The fix differs depending on the Kubernetes version: seccomp went GA in version 1.19, and from that version onward it is configured under spec > securityContext > seccompProfile rather than through annotations.
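The two variants could look roughly like the fragments below. This is a sketch that assumes the setting is applied at pod level in the StatefulSet template; how it is exposed through values.yaml depends on the chart.

```yaml
# Kubernetes 1.19 and later: first-class field on the pod securityContext.
spec:
  template:
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
---
# Before 1.19: the same intent expressed as a pod annotation.
spec:
  template:
    metadata:
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: runtime/default
```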

Use read-only filesystem for containers where possible

Check: CKV_K8S_22: "Use read-only filesystem for containers where possible"
        FAILED for resource: StatefulSet.RELEASE-NAME-postgresql.default (container 0) - RELEASE-NAME-postgresql
        File: /postgresql/templates/statefulset.yaml:63-144
        Guide: https://docs.bridgecrew.io/docs/bc_k8s_21

Although not covered as part of the CIS Kubernetes 1.6 guidelines, this check suggests that a read-only root filesystem helps enforce an immutable infrastructure strategy. 

Because a container may serve many different requests for many users in its life, ensuring that the state of the container does not change during its lifetime is a sensible security precaution. Initial attempts to chain vulnerabilities to gain a foothold on a compromised system may be harder to achieve if there is no writable location.

With the expectation that containers are short-lived, all data should be persisted to an external database or a persistent volume mount as part of the application design. That way, functionality isn’t affected by enforcing the root filesystem to be read-only.

However, many initialization scripts run inside containers at startup to edit or create configuration files or to download necessary components. While this is not best practice for a container image, images set up this way will break when readOnlyRootFilesystem is enforced.
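One common way to reconcile the two is to keep the root filesystem read-only and mount small writable volumes only where the image needs them. The fragment below is a minimal sketch of that pattern. The /tmp mount is a hypothetical example, and the paths a given image actually writes to have to be found by testing.

```yaml
spec:
  template:
    spec:
      containers:
        - name: postgresql
          securityContext:
            readOnlyRootFilesystem: true
          volumeMounts:
            # Hypothetical writable scratch path; real paths depend on the image.
            - name: tmp
              mountPath: /tmp
      volumes:
        # Ephemeral volume that lives and dies with the pod.
        - name: tmp
          emptyDir: {}
```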

Low severity policies

We will round off this section by looking at lower severity policies, some of which we covered in the “Low Hanging Fruit” section in part one of this series.

Image Pull Policy should be Always

Check: CKV_K8S_15: "Image Pull Policy should be Always"
        FAILED for resource: StatefulSet.RELEASE-NAME-postgresql.default 
        File: /postgresql/templates/statefulset.yaml:56-130
        Guide: https://docs.bridgecrew.io/docs/bc_k8s_14

This check ensures that new pods pull images from the source with valid credentials every time. Without it, any pod in a multi-tenant cluster, even one with no reason to, could use an image already cached on the node without presenting credentials. Simply changing pullPolicy to Always fixes the misconfiguration. The trade-off is some extra network load, since the registry is contacted on every container start, in exchange for better security.
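As a sketch, the override could look like this. The nesting of pullPolicy under an image key is an assumption about the chart’s values.yaml layout and should be checked against the chart version in use.

```yaml
# values.yaml override (key nesting is an assumption)
image:
  pullPolicy: Always

# In the rendered container spec this should surface as:
#   imagePullPolicy: Always
```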

Containers should run as a high UID to avoid host conflict

Check: CKV_K8S_40: "Containers should run as a high UID to avoid host conflict"
        FAILED for resource: StatefulSet.RELEASE-NAME-postgresql.default
        File: /postgresql/templates/statefulset.yaml:3-144
        Guide: https://docs.bridgecrew.io/docs/bc_k8s_37

For this check, the guidance suggests setting UIDs above 10,000 so that, after a container escape, processes do not run on the host under a UID that maps to a valid host user. The default UID for this chart, however, is 1001. Some environments may consider this high enough, in which case the check could be skipped on a per-environment basis. For good measure, we will update the chart to use UID 10001.
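In the rendered manifest, the change could look roughly like the fragment below. Setting fsGroup to the same value is an assumption on our part, made so that the data volume stays writable for the new UID. Whether that is needed, and which values.yaml keys control it, depends on the chart and image.

```yaml
spec:
  template:
    spec:
      securityContext:
        # UID above 10,000, as suggested by the check's guidance.
        runAsUser: 10001
        # Assumption: keep volume group ownership aligned with the new UID.
        fsGroup: 10001
```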

Ensure that Service Account Tokens are only mounted where necessary

Check: CKV_K8S_38: "Ensure that Service Account Tokens are only mounted where necessary"
        FAILED for resource: StatefulSet.RELEASE-NAME-postgresql.default
        File: /postgresql/templates/statefulset.yaml:3-144
        Guide: https://docs.bridgecrew.io/docs/bc_k8s_35

While it would make more sense to apply this on the default service account objects themselves, there is nothing wrong with being verbose and ensuring that our charts won’t mount the secret regardless of the destination clusters’ configuration.

This change goes directly into the template itself, as there is no existing values.yaml section we can customize. The template is edited for both StatefulSet definitions.
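A minimal sketch of the edit, shown as a fragment of a StatefulSet pod template with everything else omitted:

```yaml
spec:
  template:
    spec:
      # Do not mount the service account token into this pod,
      # regardless of how the service account itself is configured.
      automountServiceAccountToken: false
```

If a workload does need to talk to the Kubernetes API, this field would have to stay enabled for that pod, so exposing it as a chart value may be a reasonable alternative to hard-coding it.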

Image should use digest

Check: CKV_K8S_43: "Image should use digest"
        FAILED for resource: StatefulSet.RELEASE-NAME-postgresql.default (container 0) - RELEASE-NAME-postgresql
        File: /postgresql/templates/statefulset.yaml:60-135
        Guide: https://docs.bridgecrew.io/docs/bc_k8s_39

This policy suggests specifying the digest of each image used. Unlike tags, digests cannot be changed or pointed at a different image. A tag can move by accident, so that you lose track of a repeatable version for your deployment, or maliciously, when an attacker takes control of the container registry and causes your deployment to run their code.

Versioned image tags are still a vast improvement over using :latest, which is covered by a separate check that this chart passed.
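The difference between the two reference styles is sketched below. The registry, repository, tag, and digest are deliberately left as placeholders and need to be replaced with real values from your registry.

```yaml
containers:
  - name: postgresql
    # Tag-based reference (mutable):
    #   image: "<registry>/<repository>:<tag>"
    # Digest-based reference (immutable):
    image: "<registry>/<repository>@sha256:<digest>"
```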

Memory and CPU limits should be set

Check: CKV_K8S_13: "Memory limits should be set"
        FAILED for resource: StatefulSet.RELEASE-NAME-postgresql.default (container 0) - RELEASE-NAME-postgresql
        File: /postgresql/templates/statefulset.yaml:56-130
        Guide: https://docs.bridgecrew.io/docs/bc_k8s_12
Check: CKV_K8S_11: "CPU limits should be set"
        FAILED for resource: StatefulSet.RELEASE-NAME-postgresql.default (container 0) - RELEASE-NAME-postgresql
        File: /postgresql/templates/statefulset.yaml:56-130
        Guide: https://docs.bridgecrew.io/docs/bc_k8s_10

Because the chart already defines resource requests, we can tackle CPU and memory limits together and set the limits to the same values as the requests. Requests are used to place the workload on a node, and limits enforce a ceiling on usage. If adding limits causes issues with the workload, then the requests themselves were incorrect and also need updating.
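A sketch of what that could look like in the container’s resources block. The numbers are illustrative rather than the chart’s actual defaults. The point is only that limits mirror requests.

```yaml
resources:
  requests:
    cpu: 250m      # illustrative value
    memory: 256Mi  # illustrative value
  limits:
    cpu: 250m      # same as the request
    memory: 256Mi  # same as the request
```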

Minimize the admission of containers with capabilities assigned

Check: CKV_K8S_37: "Minimize the admission of containers with capabilities assigned"
        FAILED for resource: StatefulSet.RELEASE-NAME-postgresql.default (container 0) - RELEASE-NAME-postgresql
        File: /postgresql/templates/statefulset.yaml:60-138
        Guide: https://docs.bridgecrew.io/docs/bc_k8s_34

This is another failsafe check, highlighted in CIS Kubernetes section 5.2.9, which may be better implemented as a PodSecurityPolicy by the owner of the cluster. However, because we are writing a Helm chart and do not know the destination cluster, let’s ensure we drop all capabilities as an explicit default.

By dropping all capabilities, we also resolve another check, CKV_K8S_28, which suggests minimizing the admission of containers with the NET_RAW capability. Allowing NET_RAW explicitly in this configuration would bring back this failure.
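In the container’s securityContext, dropping everything could be written as in the sketch below, shown as a fragment of the container spec:

```yaml
securityContext:
  capabilities:
    drop:
      # Removes every capability, including NET_RAW.
      - ALL
```

If the workload turns out to need a specific capability, it can be added back with an explicit add list, at the cost of re-triggering the related checks.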

Prefer using secrets as files over secrets as environment variables

Check: CKV_K8S_35: "Prefer using secrets as files over secrets as environment variables"
        FAILED for resource: StatefulSet.RELEASE-NAME-postgresql.default (container 0) - RELEASE-NAME-postgresql
        File: /postgresql/templates/statefulset.yaml:63-144
        Guide: https://docs.bridgecrew.io/docs/bc_k8s_33

Because applications commonly log their environment (including environment variables) for debugging purposes, CIS Kubernetes guideline 5.4.1 recommends mapping secrets into Kubernetes pods as files rather than environment variables. That way, secrets are protected from accidentally ending up in log files.

Complying with this guideline may require application code changes so that the secret is read from a file. The issue will trigger anywhere a secret store is being used to provide an environment variable to a pod.
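The file-based pattern could look roughly like the fragment below. The secret name, key, mount path, and the DB_PASSWORD_FILE variable are all hypothetical. The application or image has to support reading its password from a file for this to work.

```yaml
spec:
  template:
    spec:
      containers:
        - name: postgresql
          env:
            # Hypothetical variable: holds a file path, not the secret value.
            - name: DB_PASSWORD_FILE
              value: /opt/secrets/db-password
          volumeMounts:
            - name: db-credentials
              mountPath: /opt/secrets
              readOnly: true
      volumes:
        - name: db-credentials
          secret:
            secretName: example-db-secret  # hypothetical Secret name
            items:
              - key: password              # hypothetical key in the Secret
                path: db-password
```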

High severity issues and Kubernetes machinery

Although the dependency we’ve been exploring, postgresql, is a developer application, Helm is often used to package and deploy components for the management, maintenance, and extensibility of the Kubernetes cluster itself.

This poses some challenges when considering the security posture needed (read: achievable) with current Kubernetes constructs.

The need to define “end-user” vs. “operational” Kubernetes workloads

While considering high-priority security issues such as shared namespaces, host access, and admin capabilities, we need to consider the context of the chart’s application. It’s especially important to differentiate between an end-user workload and a workload designed to be part of the operational lifecycle of the Kubernetes cluster itself.

Some examples of “operational” lifecycle workloads include:

  • Applications to monitor the health of the Kubernetes cluster (e.g., Prometheus).
  • Applications critical to a pluggable subsystem (e.g., Calico for networking).
  • Applications extending Kubernetes itself (e.g., Operators for CRDs).

For these workloads to operate correctly, they often require extra capabilities and deeper underlying system access than an end-user application and, thus, could skew the results of such a broad collection of Helm charts.

···

At Bridgecrew, our community depends on reusable pieces of IaC, whether they’re Terraform modules, CloudFormation templates, or Helm charts. 

We hope that by putting together this research and making Helm Scanner accessible, we can continue emphasizing the importance and nuances of open-source IaC security. We’d love to make the information in this report available—per repository and per chart—as information for the community when looking for Kubernetes resources.

We’d also be interested in extending the scanner to a wider range of tools, allowing for deeper Helm and Kubernetes coverage, and in providing an API for the results so they can be surfaced automatically from within the chart git repositories or Artifact Hub itself. We’re tracking all of these as GitHub issues.

For further discussion or to get involved, join our #CodifiedSecurity Slack channel and say hi in #helm-scanner!