Nov 25, 2024

#12. Implementing Governance through Policy: Part One - Infrastructure as Code (IaC)

Implementing governance through cloud policy can be done at two levels. This article series discusses these options in detail.

In article #08, Fix the root cause, not the symptoms: Cloud Governance vs Cost Optimizations, I discussed the importance of cloud governance through policy. There are multiple ways to implement such governance, which I'll explore in this two-part article series.

In this first article, we'll examine the best practice of implementing governance using IaC (Infrastructure as Code), with Terraform as an example.

TL;DR

  • Use input validation blocks in Terraform to enforce governance.
  • Utilize policy-as-code frameworks such as Sentinel or Open Policy Agent (OPA).
  • Use professional, enterprise-grade FinOps governance tooling (e.g., Adaptive6).

Enforce your Governance Through Code

It is an absolute necessity for organizations to deploy their cloud infrastructure through code. Manual deployments are error-prone, lead to configuration drift, security vulnerabilities, and compliance issues, and make it difficult to maintain consistency across environments. But there is another reason to use IaC: governance of cloud resources.

Governance through policy is an important part of FinOps, ensuring proactive measures against cloud waste. Instead of creating oversized resources and going through endless cycles of cost optimizations through rightsizing, it's more efficient to prevent cloud waste from the start. This can be done by introducing cloud policies that allow or disallow specific categories, classes, properties, regions, and sizes of cloud resources.

So now we have the formula to eliminate cloud waste:

Cloud Waste Prevention = Cloud Infrastructure as Code * Governance Policies

How to implement Governance in IaC?

One of the most common tools for cloud infrastructure deployment is Terraform, so I will use Terraform code for most examples.

Governance should be a straightforward mechanism to prevent non-compliant infrastructure from being deployed. Terraform handles this effectively through its validation system. Here are some suggestions:

1. Input Validation

In your Terraform modules, add validation blocks to input variables to constrain the values they accept.

variable "instance_type" {
 type = string
 validation {
   condition     = contains(["t3.micro", "t3.small", "m5.large"], var.instance_type)
   error_message = "Instance type must be one of: t3.micro, t3.small, m5.large."
 }
}

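You can also keep the permissible values in a list or map and reference it from the check. This makes your code more readable and maintainable, especially with larger sets of allowed options. Note that validation blocks belong to variables; inside a resource, the equivalent check is a precondition in the lifecycle block (Terraform 1.2 and later).

variable "allowed_regions" {
 type    = list(string)
 default = ["us-east-1", "us-west-2", "eu-central-1"]
}

variable "region" {
 type = string
}

resource "aws_instance" "example" {
 # ... other configurations ...
 # The region itself is normally set on the provider block, not on the instance.
 lifecycle {
   # Fails the plan when the requested region is not in the allowed list
   precondition {
     condition     = contains(var.allowed_regions, var.region)
     error_message = "Region must be one of the allowed regions."
   }
 }
}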
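The same idea extends to a map when the allowed values depend on context. The following is a minimal sketch, assuming Terraform 1.2 or later; the environment names, the policy table, and the ami_id variable are hypothetical and only illustrate the pattern.

```hcl
variable "environment" {
  type = string

  validation {
    condition     = contains(["dev", "prod"], var.environment)
    error_message = "Environment must be one of: dev, prod."
  }
}

variable "instance_type" {
  type = string
}

variable "ami_id" {
  type = string # Hypothetical input, needed only to make the resource complete
}

locals {
  # Hypothetical policy table: environment => permitted instance types
  allowed_instance_types = {
    dev  = ["t3.micro", "t3.small"]
    prod = ["t3.small", "m5.large"]
  }
}

resource "aws_instance" "example" {
  ami           = var.ami_id
  instance_type = var.instance_type

  lifecycle {
    # Look up the list for the current environment and check the requested type against it
    precondition {
      condition     = contains(local.allowed_instance_types[var.environment], var.instance_type)
      error_message = "Instance type ${var.instance_type} is not permitted in ${var.environment}."
    }
  }
}
```

Keeping the table in one locals block means a policy change is a one-line edit instead of a hunt through several conditions.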

Another way is to create reusable modules that encapsulate your governance rules. This promotes consistency and reduces the need to repeat validation logic across different Terraform configurations.

# In your module:
variable "instance_type" {
 # ... validation as shown above ...
}

resource "aws_instance" "example" {
 instance_type = var.instance_type
 # ... other configurations ...
}

# In your main Terraform file:
module "my_governed_instance" {
 source = "./modules/governed_instance"
 instance_type = "t3.micro" # Valid
}
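For contrast, here is a sketch of a call that the module should reject, assuming the module's instance_type variable carries the validation from the first example. The module name oversized_instance is hypothetical.

```hcl
# In your main Terraform file:
module "oversized_instance" {
  source = "./modules/governed_instance"

  # Not in the allowed list, so terraform plan is expected to stop here
  # and report the error_message defined in the module's validation block.
  instance_type = "m5.24xlarge"
}
```

Because the rule lives inside the module, every team that consumes the module gets the same guardrail without copying any validation logic.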

2. Policy as Code with Sentinel or Open Policy Agent (OPA)

For more complex governance requirements, integrate policy-as-code tools like HashiCorp Sentinel or Open Policy Agent (OPA). These tools let you define policies as code and evaluate them against the Terraform plan, so a non-compliant change is rejected before it is applied.

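# Example using Sentinel (tfplan/v2 import)
import "tfplan/v2" as tfplan

# Select the EC2 instances that this plan creates or updates
ec2_instances = filter tfplan.resource_changes as _, rc {
 rc.type is "aws_instance" and
 rc.mode is "managed" and
 (rc.change.actions contains "create" or rc.change.actions contains "update")
}

# Every selected instance must use the t3.micro instance type
main = rule {
 all ec2_instances as _, rc {
   rc.change.after.instance_type is "t3.micro"
 }
}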

If you're using Terraform Cloud or Enterprise, you can enforce policies and run validation checks within your workspaces. The same checks can also run in your CI/CD pipelines when pull requests are validated.
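In Terraform Cloud or Enterprise, Sentinel policies are grouped into policy sets, and each set is described by a sentinel.hcl file that lists the policies and their enforcement levels. Here is a minimal sketch, assuming the Sentinel policy above is saved under the hypothetical file name restrict-instance-type.sentinel in the same directory:

```hcl
# sentinel.hcl: policy set configuration
policy "restrict-instance-type" {
  # Hypothetical file name; point this at your own policy file
  source = "./restrict-instance-type.sentinel"

  # advisory       = warn only
  # soft-mandatory = fails the run, but can be overridden by an authorized user
  # hard-mandatory = fails the run, no override
  enforcement_level = "hard-mandatory"
}
```

Starting a new policy at the advisory level and tightening it later is a common way to roll out governance without blocking teams on day one.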

3. Consider Enterprise FinOps Governance Tools

There is a more convenient way than writing your own code: paying for an enterprise tool. The market offers a variety of tools that govern cloud policies. While I aim to remain unbiased, some good examples include Stacklet and Adaptive6. These tools offer full-stack cloud governance mechanisms and even help execute recommendations.

Summary

It is an absolute necessity to deploy cloud infrastructure as code. This has many advantages, particularly in facilitating governance through policy.

You can implement governance through validation blocks, use a policy-as-code framework (e.g., Sentinel or OPA), or employ an enterprise FinOps governance tool.

However, what if you don't use IaC? While this isn't ideal, the next article (Part 2 in this series) will discuss a solution for that.

Thanks for reading! Share if you found it helpful. Have questions or suggestions for future topics? We'd love to hear from you!