We all know how easy it is to accidentally hard-code an AWS credential or Stripe payment key during development, add the wrong directory to .gitignore, or forget you’ve included hard-coded secrets altogether. One git push and it’s too late.
If important credentials are exposed online, bad actors can use those keys to grab sensitive information, shut down services, or run up exorbitant fees.
AWS provides a great service where they will detect and block access for exposed credentials in under fifteen minutes. The problem is, it takes threat actors just one minute to find and exploit a key found on GitHub.
That’s why one of the early features we added to the Bridgecrew platform was identifying secrets exposed in runtime environments.
However, deleting git history is not easy, and we all know that whatever shows up on the internet stays forever. So, we leveled up our Secrets Scanner and added it to Checkov to find secrets earlier in the development lifecycle.
What to do if you find your secrets exposed
Before we talk about how and where Checkov identifies secrets, I want to cover a very important topic: what to do when secrets are exposed.
If you find your secrets exposed anywhere online, follow these steps to remediate:
- Disable and revoke the key ASAP
- Ascertain which services have been accessed by the keys and start relevant infosec/data leakage processes
- Review logs for nefarious activities, such as VMs you didn’t spin up
- Clean your git history or log history where possible (it’s not easy)
- Create a new key and distribute it to the various services that need it
- Monitor your services for any that break because you haven’t updated the key for that service
Now, back to our very exciting announcement.
Secrets Scanning in code with Checkov
Including secrets in infrastructure as code (IaC) templates simplifies the development process where web-based secrets stores aren’t easy to access. However, it isn’t acceptable to keep those secrets in files pushed to a web-based service, such as a version control system (VCS), without using a secure secrets store.
We teamed up with Kevin Hock, creator of Yelp detect-secrets, to add Secrets Scanning to Checkov to prevent such behavior.
Checkov now uses three techniques to identify secrets in code:
- Regular expression scanning. Does the string follow the pattern of other secrets of that type, such as an AWS access key?
- Keyword-based scanning. Does the key contain a word associated with a secret, such as “password”?
- Entropy-based scanning. Does the value differ enough from a real language that it’s most likely a secret, such as EuN21!HHvaS%JPTQU&cx?
Let’s review each of these methods.
Regular expression secrets identification
As of today, Checkov comes out of the box with the following regular expression based secrets finders:
- Artifactory credentials
- AWS access keys
- Azure storage account access keys
- Basic auth credentials
- Cloudant credentials
- IBM Cloud IAM keys
- IBM COS HMAC credentials
- JSON web tokens
- MailChimp access keys
- NPM tokens
- Slack tokens
- SoftLayer credentials
- Square OAuth secrets
- Stripe access keys
- Twilio API keys
For example, below is the result of a Terraform template with AWS Access Keys in the provider file. Checkov will scan IaC templates for strings that follow the patterns of these various secrets and flag them for review.
Want to help out? If you’re a Checkov user, suggest your own secrets or contribute back to Checkov with signatures for secrets you worry about.
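To give a feel for what a regular expression signature of this kind might look like, here is a minimal Python sketch. The pattern is a simplified assumption (the AKIA prefix followed by 16 uppercase letters or digits), not the exact signature Checkov or detect-secrets ships, and the function name and sample template are made up for illustration.

```python
import re
from typing import List, Tuple

# Simplified, illustrative signature: "AKIA" followed by 16 uppercase
# letters or digits. This is an assumption for the sketch, not the exact
# pattern used by Checkov or detect-secrets.
AWS_ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def find_aws_access_keys(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, match) pairs for every candidate key in text."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in AWS_ACCESS_KEY_ID.finditer(line):
            findings.append((line_number, match.group(0)))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
SAMPLE_TERRAFORM = '''
provider "aws" {
  region     = "us-west-2"
  access_key = "AKIAIOSFODNN7EXAMPLE"
}
'''

if __name__ == "__main__":
    for line_number, key in find_aws_access_keys(SAMPLE_TERRAFORM):
        print(f"line {line_number}: possible AWS access key {key}")
```

A signature like this is cheap to run and rarely noisy, but it only catches secrets whose format is known in advance.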
A new secret detection plugin combining keyword and entropy
Keys whose names match common patterns such as password, pwd, api*_key, etc. have a high likelihood of holding a secret. Similarly, if a string looks very unlike any word in the English language, as measured by its Shannon entropy, it’s likely a password or secret.
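As a rough illustration of the entropy idea, the sketch below computes Shannon entropy in bits per character. It is a generic textbook calculation written for this post, not code taken from Checkov or detect-secrets, and the function name is our own.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Return the Shannon entropy of value in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    # Sum p * log2(1 / p) over the distinct characters in the string.
    return sum(
        (count / total) * math.log2(total / count) for count in counts.values()
    )


if __name__ == "__main__":
    for candidate in ("password", "EuN21!HHvaS%JPTQU&cx"):
        print(f"{candidate!r}: {shannon_entropy(candidate):.2f} bits per character")
```

A dictionary word with repeated letters scores noticeably lower than a random-looking string of similar length, which is the signal an entropy-based scanner relies on.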
After scanning thousands of repositories, we’ve seen that both keyword and entropy-based scanners can provide a ton of value. The downside of these types of checks is that they can produce some false positives. Sometimes I like to use very long names for variables such as redirectForOptimization. That’s not a secret, but it’s not a word.
To address that noise, we consulted with the expert and maintainer of detect-secrets, Kevin Hock. He suggested combining the two detectors to lower the false-positive ratio. We’ve made that change and contributed it back upstream to detect-secrets (open thread here).
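One plausible way to combine the two signals is to flag a value only when its key name looks secret-related and the value itself has high entropy. The sketch below assumes that rule; the keyword list, the 3.5 threshold, and the function names are hypothetical choices for illustration and may differ from what Checkov and detect-secrets actually do.

```python
import math
import re
from collections import Counter

# Hypothetical keyword list and threshold, chosen for illustration only.
SECRET_KEYWORDS = re.compile(r"pass(word)?|pwd|secret|token|api_?key", re.IGNORECASE)
ENTROPY_THRESHOLD = 3.5  # bits per character


def shannon_entropy(value: str) -> float:
    """Return the Shannon entropy of value in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return sum(
        (count / total) * math.log2(total / count) for count in counts.values()
    )


def looks_like_secret(key: str, value: str) -> bool:
    """Flag a key/value pair only when both signals agree."""
    keyword_hit = SECRET_KEYWORDS.search(key) is not None
    entropy_hit = shannon_entropy(value) >= ENTROPY_THRESHOLD
    return keyword_hit and entropy_hit


if __name__ == "__main__":
    pairs = [
        ("db_password", "EuN21!HHvaS%JPTQU&cx"),
        ("handler_name", "redirectForOptimization"),
    ]
    for key, value in pairs:
        print(f"{key}: flagged={looks_like_secret(key, value)}")
```

Under these assumptions, a long variable name such as redirectForOptimization clears the entropy threshold on its own but is not flagged, because its key carries no secret-related keyword.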
The result still finds secrets that regular expressions would not pick up on, with less noise. For example, Checkov identified a popular password in this Kubernetes manifest:
Tune your Secrets Scanning
We’re continuing to tune our Secrets Scanner to provide the most accurate feedback to developers. Remember that if Checkov provides any false positives, you can add a skip comment to that resource.
We’ve also added baselining functionality that, after an initial Checkov scan, will only identify new policy violations, such as newly detected secrets. The idea is that your existing IaC templates may be battle-hardened, and all secret-looking strings may already be vetted. So, if Checkov flags a few high-entropy strings that aren’t actually secrets or that you don’t care about being exposed, they will be skipped in the next Checkov run. This feature calls for judicious use and some fine-tuning to ensure you don’t skip over backlog issues, but it can be helpful if done well. Check out our baseline blog for more.
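Conceptually, baselining boils down to remembering what an initial scan found and reporting only what is new afterwards. The sketch below illustrates that idea under our own assumptions; the finding shape, the fingerprinting scheme, and the function names are hypothetical and do not reflect Checkov’s actual baseline file format.

```python
import hashlib
import json
from typing import Dict, Iterable, List, Set

# Hypothetical finding shape: {"file": ..., "check_id": ..., "value": ...}
Finding = Dict[str, str]


def fingerprint(finding: Finding) -> str:
    """Hash a finding so the baseline never stores the raw secret value."""
    canonical = json.dumps(finding, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def create_baseline(findings: Iterable[Finding]) -> Set[str]:
    """Record the fingerprints of everything found by the initial scan."""
    return {fingerprint(finding) for finding in findings}


def new_findings(findings: Iterable[Finding], baseline: Set[str]) -> List[Finding]:
    """Return only the findings that were not present in the baseline."""
    return [f for f in findings if fingerprint(f) not in baseline]


if __name__ == "__main__":
    first_scan = [{"file": "main.tf", "check_id": "EXAMPLE_1", "value": "abc123"}]
    baseline = create_baseline(first_scan)
    second_scan = first_scan + [
        {"file": "deploy.yaml", "check_id": "EXAMPLE_2", "value": "xyz789"}
    ]
    for finding in new_findings(second_scan, baseline):
        print(f"new finding in {finding['file']}: {finding['check_id']}")
```

The sketch also shows why baselining needs care: anything captured in the initial scan, including a real secret, stays silent in later runs.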
Secrets Scanning in the Bridgecrew platform
This Secrets Scanning for code capability is now available in the Bridgecrew platform as well. Bridgecrew will now identify secrets wherever you have integrations and surface them in the tool, as well as in the Bridgecrew UI (such as on the Projects page).
We’re very excited to bring this functionality to Checkov users and hope to lower the number of breaches caused by secrets theft. Try it out for yourself with Checkov and if you identify a secret in your code, please share your story (without the secret) with us in our Codified Community channel in Slack!