The premise of AWS Identity & Access Management (IAM) is simple. You have users (person to machine) and roles (machine to machine) that need controlled access to certain services. Using IAM, you assign policies that determine whether each user and role can access certain services or not.
As you probably know full well, IAM becomes complex very quickly as the number of IAM objects in an environment grows.
If you’re not asking yourself questions like these:
- Is the policy assigned to a user or to a group?
- Is it an inline policy?
- Who can assume the roles of other users and principals?
then you’ll probably run into gotchas like:
- Implicit denies, where a request is neither explicitly allowed nor explicitly denied and is therefore denied by default
- Service control policies that may come from outside the scope of your AWS account
- Resource-based policies and IAM permissions boundaries that may restrict access that otherwise appears to be allowed, and so on
Much like other DevOps challenges, IAM has only increased in complexity as we’ve moved to microservices and now, functions. S3 buckets and SNS topics weren’t really necessary when you could throw that data around within your monolith. With fewer components requiring their own security policies, access management wasn’t as much of an issue as it is today.
As a response to this challenge, we released our open source tool, AirIAM.
AirIAM manages, audits, and right-sizes IAM sprawl using just an IAM Read-Only permission for any given AWS account. By leveraging IAM and IAM Access Advisor data, AirIAM rapidly produces a list of unused keys, old accounts, and unbound roles within your IAM configuration. It also provides recommendations for reducing role permissions based on your actual historical usage.
While those features are extremely useful in their own right, they aren’t AirIAM’s only party tricks.
AirIAM takes your existing run-time configuration from your AWS account and converts it into usable Terraform, including a Terraform state file. From there, a terraform plan allows you to immediately manage your IAM configuration in a version-controlled, automated way!
Even if you don’t use infrastructure as code today, AirIAM gives you an offline, DevOps-friendly capture of your IAM configuration state with a single command.
Automating IAM drift detection with GitHub Actions
Codifying your run-time IAM configuration with a Terraform state file allows you to not only programmatically govern your IAM but also detect IAM drift. Paired with an automated workflow, this can be extremely useful as your infrastructure and team grow.
Using GitHub Actions, which we 💜 (who doesn’t love free compute!?), we’re going to set up a simple workflow that automatically notifies our developers if a new commit changes our infrastructure IAM setup.
The GitHub Action we wrote for this demo will:
- Use AirIAM to take a snapshot of our pre-deploy IAM configuration
- Use AirIAM to take a snapshot of our post-deploy IAM configuration
- Send a notification via Slack if there are differences between the two
To try this out for yourself, use the GitHub Action and configure the following GitHub Secrets in your repository:
- AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY: These can belong to a user or role with just the IAM Read-Only managed policy; nothing else is required
- SLACK_WEBHOOK_URL: The webhook URL of a Slack app configured in your Slack workspace to allow messages into a given channel for updates
Although your Slack webhook URL and AWS keys are protected from appearing in the public CI/CD run logs by GitHub Secrets, the artifacts we produce (Terraform snapshots of your IAM configuration) will be publicly available to anyone signed into GitHub. This data contains sensitive information, such as your AWS account ID used in ARNs. It can also give an attacker complete visibility into the layout, policies, and permissions of your IAM configuration.
For public repositories, we strongly advise using a private object store for generated data. Please consider this public-repo approach for demonstration purposes only!
Now, let’s walk through the workflow step by step.
Our demo environment has a simple Serverless Framework deployment, a single Python-based function, and some inline IAM configuration for access to an S3 Bucket dependency. This same setup would work for any IaC framework, such as Terraform or AWS CloudFormation.
To show the workflow in action, we’ve introduced a change to our serverless.yml, attaching a new role to our function.
Using CI/CD (GitHub Actions in this case), we’ve automated the sls deploy to push these changes to production (after a load of tests, of course).
Using the same pipeline, we use AirIAM to take a Terraform snapshot and write the output to a directory named pre- followed by the commit SHA:
airiam terraform --no-cache --without-import -l 30 -d pre-$GITHUB_SHA
We then save this directory structure as a .tgz, uploaded as a GitHub Actions artifact for our post-deployment workflow job.
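The snapshot-and-archive step might look something like the sketch below. The airiam command is the one shown above; the helper name archive_snapshot is a hypothetical name chosen for illustration, and the exact steps in the real workflow may differ.

```shell
# Archive a snapshot directory as <dir>.tgz so it can be uploaded as a
# workflow artifact. "archive_snapshot" is a hypothetical helper name.
archive_snapshot() {
  local dir="$1"
  tar -czf "${dir}.tgz" "$dir"
}

# In the pipeline (requires airiam and read-only AWS credentials):
#   airiam terraform --no-cache --without-import -l 30 -d "pre-$GITHUB_SHA"
#   archive_snapshot "pre-$GITHUB_SHA"
```

Naming both the directory and the archive after the commit SHA keeps each snapshot traceable to the commit that produced it.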
Following a normal sls deploy, we repeat the process, taking another snapshot and saving the output as an artifact. We then diff the two snapshots and use the exit status of diff and if-then logic to trigger a Slack update if changes are present.
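As a rough sketch of that logic: diff exits with 0 when its inputs are identical, 1 when they differ, and 2 on error, so a simple if-then can decide whether to notify. The function names and the message text below are assumptions made for illustration, not the code from the demo action; SLACK_WEBHOOK_URL is the secret described earlier.

```shell
# Write a recursive unified diff of two snapshot directories to a patch file.
# Exit status mirrors diff: 0 = identical, 1 = differences, 2 = trouble.
diff_snapshots() {
  local before="$1" after="$2" patch="$3"
  diff -ru "$before" "$after" > "$patch"
}

# Post a fixed message to a Slack incoming webhook.
# The text is interpolated without JSON escaping, so keep it free of quotes.
notify_slack() {
  local text="$1"
  curl -sS -X POST -H 'Content-type: application/json' \
    --data "{\"text\": \"${text}\"}" "$SLACK_WEBHOOK_URL"
}

# Compare the snapshots and notify only when diff reports differences.
check_iam_drift() {
  local before="$1" after="$2" patch="${3:-iam.patch}"
  local status=0
  diff_snapshots "$before" "$after" "$patch" || status=$?
  if [ "$status" -eq 1 ]; then
    echo "IAM changes detected, see $patch"
    # Skip the webhook call when no URL is configured (e.g. local runs).
    if [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
      notify_slack "IAM configuration changed in this deployment"
    fi
  elif [ "$status" -eq 0 ]; then
    echo "No IAM changes detected"
  fi
  return "$status"
}
```

In a workflow step this could be called as check_iam_drift "pre-$GITHUB_SHA" "post-$GITHUB_SHA" iam.patch, where the post- directory name is also an assumption. Because the function returns the diff status, the calling step should decide whether a status of 1 ought to fail the job or merely raise the alert.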
The “action” link in the Slack notification provides access to the specific run in question, containing a simple .patch style diff file and access to download both the pre- and post-deployment IAM snapshots generated by AirIAM.
This allows you to easily investigate the change and provides a handy archive of point-in-time snapshots to be used for future audit, drift, and investigation purposes. Below, we find examples of these resources.
The diff opened in a text editor:
The expanded archive of the pre-deployment AirIAM snapshot for this CI/CD run:
roles.tf file showing our policy and role name before our latest commit:
Catching drift not caused by deployments
By generating a pre- and post-deployment snapshot of IAM into Terraform—potentially mere seconds apart—our diff is likely caused by our deployment. But by keeping the snapshots as artifacts, we can also catch other forms of drift such as manual changes made via AWS APIs, the AWS web console, and even potential changes resulting from unauthorized activity.
To achieve that, the pipeline would look very similar. Instead of taking a pre-deploy snapshot, the baseline artifact would be retrieved from a previous workflow run, generated either by the previous commit or by a separate workflow configured to run on a given schedule. For the latter use case, scheduled GitHub Actions workflows can be used to check for IAM drift on a regular cadence.
We would then diff the baseline against a new set of Terraform state file snapshots, freshly generated by AirIAM, and alert on any changes.
The download-artifact GitHub Action we’re using does not currently support retrieving artifacts from previous workflow runs within the GitHub APIs, and there aren’t any fixed paths to easily access an artifact by name from a certain SHA or commit. You’ll need a mixture of the dynamic job-id of the run and the artifact’s name to request via the get-artifact API. If you’re interested in this much-requested feature, you can follow the discussion in detail here, which includes workaround scripts for using the API-filtering approach.
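One possible shape for the API-filtering approach is sketched below. It is not one of the workaround scripts from that discussion. It assumes the GitHub REST endpoints for listing a repository's artifacts and downloading one as a zip, a token with read access to Actions exported as GITHUB_TOKEN, and jq on the runner. The function names and the artifact name iam-baseline are hypothetical.

```shell
# Build the REST URL that lists a repository's artifacts filtered by name.
# $1 = "owner/repo" (GITHUB_REPOSITORY inside a workflow), $2 = artifact name.
# The name is not URL-encoded here, so keep it to URL-safe characters.
artifacts_url() {
  local repo="$1" name="$2"
  echo "https://api.github.com/repos/${repo}/actions/artifacts?name=${name}&per_page=100"
}

# Read the list response on stdin and print the id of the newest unexpired
# artifact, or nothing when there is no match (requires jq).
newest_artifact_id() {
  jq -r '[.artifacts[] | select(.expired == false)] | sort_by(.created_at) | last | .id // empty'
}

# Download an artifact archive (zip) by id, following the redirect that the
# API returns to the actual download location.
download_artifact() {
  local repo="$1" id="$2" out="$3"
  curl -sSL -H "Authorization: Bearer ${GITHUB_TOKEN}" \
    -H "Accept: application/vnd.github+json" \
    -o "$out" "https://api.github.com/repos/${repo}/actions/artifacts/${id}/zip"
}

# Example wiring (needs network access and a valid token):
#   id=$(curl -sS -H "Authorization: Bearer ${GITHUB_TOKEN}" \
#          "$(artifacts_url "$GITHUB_REPOSITORY" iam-baseline)" | newest_artifact_id)
#   download_artifact "$GITHUB_REPOSITORY" "$id" baseline.zip
```

Sorting by created_at rather than relying on response order is a defensive choice. If several workflows upload artifacts under the same name, you may also want to filter on the run's commit SHA before picking one.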