Azure Cloud Adoption Framework in Real Infrastructure Projects
Last reviewed: June 2026
Scope
This article explains how I use the Azure Cloud Adoption Framework in real infrastructure projects, and it gets specific about the part that actually decides whether adoption succeeds: turning governance into code. CAF as a set of phase names is just vocabulary. CAF as a Terraform module that enforces a tagging standard before a single workload lands — that is delivery. The example below is a working landing-zone guardrail, with the tests and CI that make it safe to ship.
Where I would start
CAF is not the project; it is a structure for making better decisions, and it only helps when it becomes a delivery model with clear ownership, standards, guardrails, and an operational handover. Left as a slide deck, it does nothing for the migration. In real projects I use CAF to force the awkward conversations early: who owns subscriptions, who approves network architecture, who manages RBAC, how costs are controlled, what must be monitored, and who supports the workloads after go-live.
How Azure adoption fails quietly
Azure adoption rarely fails with a bang. It fails quietly: subscriptions appear without owners, networks grow without standards, RBAC drifts into inconsistency, costs surprise the business three months in, and operations inherit workloads with no monitoring or backup. CAF gives a structure to prevent that, but only if the team turns it into concrete, enforced deliverables — and "enforced" is the word that matters. A tagging standard in a wiki is a suggestion; a tagging policy assigned at the management group is a control.
Guardrails as code
The "Ready" and "Govern" areas of CAF are where I spend the most effort, because they are now code. Here is a real, if compact, landing-zone guardrail as a Terraform module: it defines a custom Azure Policy that denies resource groups created without a required tag, assigns that policy at a management group so it applies to every subscription beneath it, and stands up a central Log Analytics workspace for diagnostics.
# guardrails/main.tf
terraform {
  required_version = ">= 1.6.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

variable "management_group_id" {
  type = string
}

variable "required_tag" {
  type    = string
  default = "CostCenter"
}

variable "location" {
  type    = string
  default = "westeurope"
}

# Deny resource groups that are missing the required tag
resource "azurerm_policy_definition" "require_rg_tag" {
  name         = "require-rg-tag-${lower(var.required_tag)}"
  policy_type  = "Custom"
  mode         = "All"
  display_name = "Require tag '${var.required_tag}' on resource groups"

  # A custom definition must live at or above the scope it is assigned to,
  # so it is created in the same management group as the assignment.
  management_group_id = var.management_group_id

  policy_rule = jsonencode({
    if = {
      allOf = [
        { field = "type", equals = "Microsoft.Resources/subscriptions/resourceGroups" },
        { field = "tags['${var.required_tag}']", exists = "false" }
      ]
    }
    then = { effect = "deny" }
  })
}

resource "azurerm_management_group_policy_assignment" "require_rg_tag" {
  name                 = "require-rg-tag"
  management_group_id  = var.management_group_id
  policy_definition_id = azurerm_policy_definition.require_rg_tag.id
  description          = "Landing-zone guardrail: resource groups must carry ${var.required_tag}."
}

# Central workspace that diagnostic-settings policies target.
# Assumes the resource group below already exists and carries the required tag.
resource "azurerm_log_analytics_workspace" "core" {
  name                = "law-platform-core"
  resource_group_name = "rg-platform-monitoring"
  location            = var.location
  sku                 = "PerGB2018"
  retention_in_days   = 90
}
This is the difference between describing governance and enforcing it. The policy effect is deny, not audit, so a new resource group without the tag is rejected at creation time. Resource groups that already exist are not removed or modified; they are only reported as non-compliant. The assignment is at the management group, so it covers every current and future subscription beneath it without anyone remembering to re-apply it. And it is in version control, so a change to the tagging standard is a reviewed pull request, not an undocumented portal click.
Testing and shipping guardrails
Infrastructure code that enforces policy has to be tested like code, on three levels. Static checks catch mistakes before anything reaches Azure: format, validate, and a security/policy scan.
terraform fmt -check -recursive
terraform validate
checkov -d . # scans IaC for misconfigurations and policy violations
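Alongside these static checks, the module's intent can be pinned down with Terraform's native test framework. The following is a minimal sketch, not part of the original module: the file name tests/guardrails.tftest.hcl and the management group ID are hypothetical, and mock_provider assumes Terraform 1.7 or later, which is newer than the module's ">= 1.6.0" floor. It asserts that the policy effect stays deny and that the definition name follows the tag.

```hcl
# tests/guardrails.tftest.hcl (hypothetical file name)

# Mock the provider so the test needs no Azure credentials (Terraform 1.7+).
mock_provider "azurerm" {}

variables {
  # Placeholder ID, used only to satisfy the required input.
  management_group_id = "/providers/Microsoft.Management/managementGroups/mg-example"
}

run "tagging_guardrail_is_deny" {
  command = plan

  assert {
    condition     = jsondecode(azurerm_policy_definition.require_rg_tag.policy_rule)["then"]["effect"] == "deny"
    error_message = "The tagging guardrail must use the deny effect."
  }

  assert {
    condition     = azurerm_policy_definition.require_rg_tag.name == "require-rg-tag-costcenter"
    error_message = "The definition name should be derived from the lower-cased required tag."
  }
}
```

Run it with terraform test from the module directory. Because the provider is mocked, the assertions only cover values that are known at plan time, which is enough to catch someone quietly switching the effect to audit.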
Plan-as-review turns the change into something a human approves: terraform plan -out tfplan runs in the pull request, and the reviewer reads the plan — what will be created, changed, or destroyed — before merge. Destruction of a policy assignment in a plan is exactly the kind of thing you want a second pair of eyes on. A minimal CI stage:
# .github/workflows/guardrails.yml (excerpt)
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init -backend=false
      - run: terraform fmt -check -recursive
      - run: terraform validate
      # plan needs Azure credentials and a value for management_group_id (not shown in this excerpt)
      - run: terraform plan -out tfplan # reviewed before a separate, approved apply stage
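Review is the primary control for destructive changes, but the same concern can be backed by a hard stop in the configuration itself. As a sketch, assuming the team accepts that removing the guardrail becomes a deliberate two-step change, the policy assignment from the guardrails module could carry a lifecycle block:

```hcl
resource "azurerm_management_group_policy_assignment" "require_rg_tag" {
  name                 = "require-rg-tag"
  management_group_id  = var.management_group_id
  policy_definition_id = azurerm_policy_definition.require_rg_tag.id
  description          = "Landing-zone guardrail: resource groups must carry ${var.required_tag}."

  lifecycle {
    # Any plan that would destroy this assignment fails with an error
    # instead of showing up as one more line for the reviewer to notice.
    prevent_destroy = true
  }
}
```

This does not protect against someone deleting the resource block itself, because the setting disappears along with the block. Plan review still matters; prevent_destroy only covers replacements and accidental destroys.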
Post-deploy compliance confirms the control is actually doing its job. After apply, I check policy state with Az PowerShell rather than trusting that "the apply succeeded" means "the estate is compliant":
Connect-AzAccount -Identity | Out-Null # managed identity in CI
# @() forces an array so .Count is reliable for zero, one, or many results
$state = @(Get-AzPolicyState -PolicyAssignmentName 'require-rg-tag' |
    Where-Object ComplianceState -eq 'NonCompliant')
if ($state.Count -gt 0) {
    # Out-Host renders the table before the throw ends the script
    $state | Select-Object ResourceId, PolicyDefinitionName | Format-Table | Out-Host
    throw "$($state.Count) resource(s) are non-compliant with the tagging guardrail."
}
That last check is the one teams skip and regret: a deployed policy and a compliant estate are not the same thing, and the gap is where the audit finding lives.
Where CAF helps
Each CAF area maps to a concrete deliverable:
| CAF area | Real deliverable |
|---|---|
| Strategy | Business drivers, success criteria, and migration rationale |
| Plan | Workload backlog, dependency map, and wave plan |
| Ready | Landing zone, identity, connectivity, policy, and subscription design (as IaC) |
| Adopt | Migration factory, workload onboarding, and validation |
| Govern | RBAC, Azure Policy, tagging, cost controls, and exception handling (as IaC) |
| Manage | Monitoring, backup, patching, incident process, and handover |
| Secure | Security baseline, Defender, logging, and access-review process |
Where CAF often fails
The recurring gaps, every one of which is an ownership question in disguise:
- No subscription ownership model.
- No management-group structure.
- No RBAC standard or access-review process.
- No landing-zone network standard.
- No Azure Policy baseline.
- No cost tagging or budget alerts.
- No logging and monitoring model.
- No migration-wave plan.
- No operational handover.
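Several of these gaps can be closed in the same repository as the tagging guardrail. For the cost item, a minimal sketch of a subscription budget with alerts follows. The variable names, the amount, the start date, and the thresholds are illustrative assumptions, not values from a real project.

```hcl
variable "subscription_id" {
  type        = string
  description = "Full subscription resource ID, in the form /subscriptions/<guid> (hypothetical input)"
}

variable "budget_owner_emails" {
  type        = list(string)
  description = "Recipients of budget alerts (hypothetical input)"
}

resource "azurerm_consumption_budget_subscription" "monthly" {
  name            = "budget-monthly"
  subscription_id = var.subscription_id

  amount     = 5000 # illustrative figure, in the subscription's billing currency
  time_grain = "Monthly"

  time_period {
    start_date = "2026-07-01T00:00:00Z" # illustrative; must be the first day of a month
  }

  # Alert the owners when actual spend reaches 80% of the budget
  notification {
    enabled        = true
    operator       = "GreaterThanOrEqualTo"
    threshold      = 80
    threshold_type = "Actual"
    contact_emails = var.budget_owner_emails
  }

  # Alert earlier when the forecast says the budget will be exceeded
  notification {
    enabled        = true
    operator       = "GreaterThan"
    threshold      = 100
    threshold_type = "Forecasted"
    contact_emails = var.budget_owner_emails
  }
}
```

The useful side effect is that the contact list is an explicit input, so the question of who owns the cost review has to be answered before the budget can be deployed.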
Technical decision points
- Management groups: how subscriptions are organised and governed.
- Subscriptions: who owns them and how their lifecycle is managed.
- Networking: hub-and-spoke, Virtual WAN, firewall, DNS, private endpoints, and on-premises connectivity.
- Identity: Microsoft Entra ID, privileged access, RBAC, managed identities, and Conditional Access dependencies.
- Policy: required guardrails, deny vs audit effects, and the exception process.
- Operations: monitoring, backup, patching, incident response, and service ownership.
- Cost: budgets, tags, reserved capacity, ownership reports, and chargeback/showback expectations.
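The exception process named in the Policy item can also be code rather than a ticket that nobody revisits. A sketch, assuming a hypothetical sandbox subscription that needs a time-boxed waiver from the tagging guardrail assigned by the guardrails module; the variable name, description text, and expiry date are placeholders:

```hcl
variable "sandbox_subscription_id" {
  type        = string
  description = "Full resource ID of the subscription to exempt (hypothetical input)"
}

resource "azurerm_subscription_policy_exemption" "sandbox_tag_waiver" {
  name                 = "sandbox-tag-waiver"
  subscription_id      = var.sandbox_subscription_id
  policy_assignment_id = azurerm_management_group_policy_assignment.require_rg_tag.id
  exemption_category   = "Waiver" # or "Mitigated" when a compensating control exists
  description          = "Time-boxed waiver approved through the exception process."
  expires_on           = "2026-12-31T00:00:00Z" # placeholder; exemptions should expire
}
```

Keeping exemptions in the same repository means every exception goes through the same pull-request review as the guardrail it weakens, and the expiry date makes the follow-up visible.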
A worked example: bringing workloads into Azure
When an organisation wants to migrate a set of workloads to Azure, I don't start by moving virtual machines. I start with the platform — subscription design, management groups, naming, tagging, RBAC, connectivity, firewall rules, backup, monitoring, logging, and cost alerts — expressed as IaC like the module above so the standard is enforced rather than described. Only then do I group workloads into migration waves by business criticality, dependencies, data sensitivity, and support readiness. The order is the whole point: build the landing zone first and each workload lands into a governed home; move the workloads first and you spend the rest of the program retrofitting governance onto running systems.
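Starting with the platform usually means starting with the management-group hierarchy that the guardrails attach to. A minimal sketch, assuming a hypothetical two-branch layout (platform and landing zones) under an existing parent group; all names and the input variable are illustrative:

```hcl
variable "parent_management_group_id" {
  type        = string
  description = "Resource ID of the existing parent management group (hypothetical input)"
}

# Shared services: connectivity, identity, monitoring
resource "azurerm_management_group" "platform" {
  name                       = "mg-platform"
  display_name               = "Platform"
  parent_management_group_id = var.parent_management_group_id
}

# Home for workload subscriptions; guardrails are assigned here
resource "azurerm_management_group" "landing_zones" {
  name                       = "mg-landing-zones"
  display_name               = "Landing Zones"
  parent_management_group_id = var.parent_management_group_id
}

output "landing_zones_management_group_id" {
  value = azurerm_management_group.landing_zones.id
}
```

The output can be passed as management_group_id to the guardrails module, so every subscription later placed under the landing-zones group inherits the tagging policy before its first workload arrives.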
Operating model
The questions every CAF engagement has to answer out loud: Who owns the Azure platform? Who owns each subscription? Who approves network design? Who manages RBAC and privileged access? Who owns the monthly cost review? Who owns backup and monitoring? What is the handover process from project team to operations? And which decisions require architecture-board or CAB approval?
Practical CAF checklist
- Define business outcomes and success criteria.
- Build a workload inventory and dependency map.
- Define the management-group and subscription structure.
- Express landing-zone networking, policy, and RBAC as version-controlled IaC.
- Use deny policies for the guardrails that must never be violated, audit for the rest.
- Run fmt, validate, and a policy scanner (e.g. checkov) in CI; review the plan before apply.
- Verify compliance after deployment with Get-AzPolicyState, not just a green apply.
- Configure logging, monitoring, backup, and alert ownership.
- Create migration waves and validation criteria.
- Agree the operational handover and support model.
Final recommendation
Use Azure CAF as a delivery-control model, not a document exercise. The value is not in naming the phases — it is in converting the framework into accountable owners and reusable landing-zone standards expressed as tested, version-controlled code. Get the guardrails in as IaC, prove them with static checks and a post-deploy compliance check, and review every change as a plan before it applies. Do that and most of CAF's "quiet failures" never get the chance to happen, because the estate refuses to drift in the first place.