Our commitment to open source goes beyond contributing to our own open source tools—Checkov, AirIAM, and Yor—we’re also proud to contribute back to the tools that we leverage within the Bridgecrew platform.
One such project is Terraformer, a powerful reverse infrastructure as code (IaC) tool. It is a Go-based tool created by Waze (a subsidiary of Google) in 2018 that takes existing infrastructure and tool configurations and converts them into Terraform and JSON files. While producing the IaC files, it also records existing states in tfstate files.
Think of it as a `terraform import` that also generates Terraform files, or as a reverse `terraform apply` command. As of today, Terraformer supports 15 cloud providers such as AWS, Azure, and Google Cloud, and 16 other providers such as monitoring tools, networking tools, streams, and orchestrators.
To support this powerful tool, we made two recent additions—more resources and a new initialization method for performance improvement.
Before we dive into those contributions, here’s a little bit more on Terraformer.
How does Terraformer work?
Terraformer at its core follows a few steps:
- Use the infrastructure or tool provider’s SDK or API to initialize all of the resources selected by the Terraformer command.
- Gather the list of all resources.
- Iterate over the resources and take the ID of each one.
- Call the provider’s API for all fields of each resource.
- Use Terraform’s existing providers to convert the state of each resource from the tool’s language into Terraform files or JSON files and `tfstate`, in a directory structure.
You can even run `terraformer plan` to see what you’ll generate as code before it’s generated, similar to `terraform plan`.
To show how this works, let’s use an example. If you call:

`terraformer import aws --resources=vpc,subnet --regions=eu-west-1 --profile=prod`

Terraformer will:
- Call the AWS API using AWS’s Go SDK to gather all VPCs and subnets from the eu-west-1 region using the prod profile.
- For each VPC and subnet, gather the resource ID.
- For each resource ID, gather all configuration details in AWS’s JSON format.
- Take the Terraform AWS provider and convert those configurations into HCL, appending them to a file per type (vpc.tf and subnet.tf) with a map of the files in `terraform.tfstate`.
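For illustration, the generated vpc.tf might look roughly like this. The `tfer--` name prefix is Terraformer’s convention for imported resources, but the specific attribute values here are made up:

```hcl
resource "aws_vpc" "tfer--vpc-000--00000" {
  # Hypothetical values for illustration only
  cidr_block         = "10.0.0.0/16"
  enable_dns_support = "true"
  instance_tenancy   = "default"

  tags = {
    Name = "prod-vpc"
  }
}
```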
Turning around and entering `terraform apply` or `terraform plan` returns “No changes. Infrastructure is up-to-date.” because the `tfstate` file already reflects the running cloud state.
In the following Terraform state, you can see a code resource named `aws_vpc.tfer--vpc-000--00000` and its matching cloud-provisioned resource ID under the `arn` field, pointing us to the live instance. Now you can analyze both build-time and runtime configurations.
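A minimal illustrative snippet of such a state file (the account ID, VPC ID, and CIDR below are made-up values):

```json
{
  "resources": [
    {
      "type": "aws_vpc",
      "name": "tfer--vpc-000--00000",
      "instances": [
        {
          "attributes": {
            "arn": "arn:aws:ec2:eu-west-1:123456789012:vpc/vpc-00000",
            "cidr_block": "10.0.0.0/16",
            "id": "vpc-00000"
          }
        }
      ]
    }
  ]
}
```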
Pretty cool, right?
What did we contribute to Terraformer?
Two of the recent contributions we’ve made are adding more resources, such as Azure blob storage and containers, and a new initialization method for performance improvement. The second one really improved the logic of how the Terraformer process works.
While working with Terraformer, we noticed that certain processes could and should be parallelized. The initialize-resource step was running over each resource one at a time in a single thread. This took a long time for large environments, left compute resources idle, and risked hitting AWS rate limits. We took that process and parallelized it to reduce “hot spotting,” or hitting the same API frequently.
With that pull request accepted, Terraformer still follows the steps in the previous section, but when it gathers the resources, it randomizes them and adds them one by one to as many threads as the processing unit can handle. As one resource finishes initializing, another resource is added to that thread. Where before there was one line of resources being initialized, now there can be multiple, speeding up the initialization process and slipping in under AWS rate limits.
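As a rough sketch of that idea, here is a minimal Go worker pool that shuffles a resource list and initializes entries across a fixed number of goroutines. `initResource` and `initAll` are hypothetical stand-ins, not Terraformer’s actual functions:

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
	"sync"
)

// initResource stands in for the per-resource provider SDK call
// (hypothetical; Terraformer calls the real cloud SDK here).
func initResource(id string) string {
	return "initialized " + id
}

// initAll shuffles the resource list to avoid "hot spotting" a single API,
// then initializes resources across a fixed pool of worker goroutines.
func initAll(resources []string, workers int) []string {
	ids := append([]string(nil), resources...)
	rand.Shuffle(len(ids), func(i, j int) { ids[i], ids[j] = ids[j], ids[i] })

	jobs := make(chan string)
	results := make(chan string, len(ids))

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker pulls the next resource as soon as it finishes one.
			for id := range jobs {
				results <- initResource(id)
			}
		}()
	}

	for _, id := range ids {
		jobs <- id
	}
	close(jobs)
	wg.Wait()
	close(results)

	var out []string
	for r := range results {
		out = append(out, r)
	}
	sort.Strings(out) // sort for a deterministic result
	return out
}

func main() {
	fmt.Println(initAll([]string{"vpc-1", "vpc-2", "subnet-1"}, 2))
	// prints [initialized subnet-1 initialized vpc-1 initialized vpc-2]
}
```

In practice the worker count would be sized to the machine’s CPUs rather than hard-coded.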
We also changed the retry logic and timeouts to be configurable so they can be tuned per machine. That is especially useful when threading tasks as one can fail for reasons beyond an error. Adding in a retry ensures the whole Terraformer process doesn’t fail because of something solved by retrying.
The result is a significant performance improvement. It’s not quite a 1-to-1 thread comparison, but for very large environments expect the process to take significantly less time.
How do we use Terraformer at Bridgecrew?
The use cases for Terraformer are nearly limitless. For example, if you’ve set up a cloud environment purely from that cloud provider’s UI or CLI and you want to move to a GitOps model, you could use Terraformer to start that journey with your existing infrastructure configurations.
Bridgecrew uses Terraformer in two places—code-to-cloud fix suggestions and drift detection.
Code to cloud scanning and fix suggestions
Terraformer allows us to take running cloud infrastructure and convert it into HCL.
This creates a common language and consistent view of cloud configurations in Terraform and deployed cloud services. That way you can start to translate runtime misconfigurations into IaC fixes. It also creates a representation of the runtime environment that Checkov can scan.
Another powerful thing this combination provides is drift detection. With a little mapping, we take the produced `tfstate` representation of what is running and compare it with the `tfstate` from the `terraform apply` to identify any differences. Then we can tell you not only what has changed in your cloud environment that doesn’t match your code, but also what the difference looks like in code. You can then take the appropriate action: revert the changes made in runtime, or update the Terraform files to match the runtime environment.
If you want to see Terraformer and Checkov in action, sign up for Bridgecrew!