September 9, 2024 Clouds

Azure Cloud Adoption Framework in Real Infrastructure Projects

Last reviewed: June 2026

Scope

This article explains how I use the Azure Cloud Adoption Framework in real infrastructure projects, and it gets specific about the part that actually decides whether adoption succeeds: turning governance into code. CAF as a set of phase names is just vocabulary. CAF as a Terraform module that enforces a tagging standard before a single workload lands — that is delivery. The example below is a working landing-zone guardrail, with the tests and CI that make it safe to ship.

Where I would start

CAF is not the project; it is a structure for making better decisions, and it only helps when it becomes a delivery model with clear ownership, standards, guardrails, and an operational handover. Left as a slide deck, it does nothing for the migration. In real projects I use CAF to force the awkward conversations early: who owns subscriptions, who approves network architecture, who manages RBAC, how costs are controlled, what must be monitored, and who supports the workloads after go-live.

How Azure adoption fails quietly

Azure adoption rarely fails with a bang. It fails quietly: subscriptions appear without owners, networks grow without standards, RBAC drifts into inconsistency, costs surprise the business three months in, and operations inherit workloads with no monitoring or backup. CAF gives a structure to prevent that, but only if the team turns it into concrete, enforced deliverables — and "enforced" is the word that matters. A tagging standard in a wiki is a suggestion; a tagging policy assigned at the management group is a control.

Guardrails as code

The "Ready" and "Govern" areas of CAF are where I spend the most effort, because they are now code. Here is a real, if compact, landing-zone guardrail as a Terraform module: it defines a custom Azure Policy that denies resource groups created without a required tag, assigns that policy at a management group so it applies to every subscription beneath it, and stands up a central Log Analytics workspace for diagnostics.

# guardrails/main.tf
terraform {
  required_version = ">= 1.6.0"
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

variable "management_group_id" {
  type = string
}

variable "required_tag" {
  type    = string
  default = "CostCenter"
}

variable "location" {
  type    = string
  default = "westeurope"
}

# Deny resource groups that are missing the required tag.
# The definition is created at the management group so it can be assigned there;
# a definition stored in a subscription cannot be assigned at management-group scope.
resource "azurerm_policy_definition" "require_rg_tag" {
  name                = "require-rg-tag-${lower(var.required_tag)}"
  policy_type         = "Custom"
  mode                = "All"
  display_name        = "Require tag '${var.required_tag}' on resource groups"
  management_group_id = var.management_group_id

  policy_rule = jsonencode({
    if = {
      allOf = [
        { field = "type", equals = "Microsoft.Resources/subscriptions/resourceGroups" },
        { field = "tags['${var.required_tag}']", exists = "false" }
      ]
    }
    then = { effect = "deny" }
  })
}

resource "azurerm_management_group_policy_assignment" "require_rg_tag" {
  name                 = "require-rg-tag"
  management_group_id  = var.management_group_id
  policy_definition_id = azurerm_policy_definition.require_rg_tag.id
  description          = "Landing-zone guardrail: resource groups must carry ${var.required_tag}."
}

# Central workspace that diagnostic-settings policies target.
# Assumes the resource group rg-platform-monitoring already exists.
resource "azurerm_log_analytics_workspace" "core" {
  name                = "law-platform-core"
  resource_group_name = "rg-platform-monitoring"
  location            = var.location
  sku                 = "PerGB2018"
  retention_in_days   = 90
}

This is the difference between describing governance and enforcing it. The policy effect is deny, not audit, so a new resource group without the tag is rejected at creation time; resource groups that already exist are not removed, only reported as non-compliant. The assignment is at the management group, so it covers every current and future subscription beneath it without anyone remembering to re-apply it. And it is in version control, so a change to the tagging standard is a reviewed pull request, not an undocumented portal click.
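
A module like this is consumed from a root configuration. The following is a minimal sketch under stated assumptions, not the layout of any particular project: the module is assumed to sit in a ./guardrails directory, the management group name mg-landing-zones is hypothetical, and the subscription ID that azurerm 4.x requires is assumed to arrive through the ARM_SUBSCRIPTION_ID environment variable.

```hcl
# main.tf in the root configuration (hypothetical layout)
provider "azurerm" {
  features {}
  # azurerm 4.x requires a subscription ID; this sketch assumes it is set
  # via the ARM_SUBSCRIPTION_ID environment variable instead of hard-coding it.
}

# Look up an existing management group; "mg-landing-zones" is an assumed name.
data "azurerm_management_group" "landing_zones" {
  name = "mg-landing-zones"
}

module "guardrails" {
  source              = "./guardrails"
  management_group_id = data.azurerm_management_group.landing_zones.id
  required_tag        = "CostCenter"
  location            = "westeurope"
}
```

Passing the management group in as a variable keeps the module reusable: the same guardrail can be pointed at a sandbox management group first and promoted to the production hierarchy once the plan has been reviewed.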

Testing and shipping guardrails

Infrastructure code that enforces policy has to be tested like code, on three levels. Static checks catch mistakes before anything reaches Azure: format, validate, and a security/policy scan.

terraform fmt -check -recursive
terraform validate
checkov -d .            # scans IaC for misconfigurations and policy violations

Plan-as-review turns the change into something a human approves: terraform plan -out tfplan runs in the pull request, and the reviewer reads the plan — what will be created, changed, or destroyed — before merge. Destruction of a policy assignment in a plan is exactly the kind of thing you want a second pair of eyes on. A minimal CI stage:

# .github/workflows/guardrails.yml (excerpt)
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init -backend=false
      - run: terraform fmt -check -recursive
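      - run: terraform validate
      - run: pip install checkov && checkov -d .
      # plan needs Azure credentials and input variables, both omitted from this excerpt
      - run: terraform plan -out tfplan      # reviewed before a separate, approved apply stage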

Post-deploy compliance confirms the control is actually doing its job. After apply, I check policy state with Az PowerShell rather than trusting that "the apply succeeded" means "the estate is compliant":

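Connect-AzAccount -Identity | Out-Null      # managed identity in CI

# The assignment lives at the management group, so query policy state at that scope.
# MANAGEMENT_GROUP_NAME is an assumed CI variable holding the group's name (not its full resource ID).
$state = @(Get-AzPolicyState -ManagementGroupName $env:MANAGEMENT_GROUP_NAME `
    -Filter "PolicyAssignmentName eq 'require-rg-tag' and ComplianceState eq 'NonCompliant'")

if ($state.Count -gt 0) {
    $state | Select-Object ResourceId, PolicyDefinitionName | Format-Table
    throw "$($state.Count) resource(s) are non-compliant with the tagging guardrail."
}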

That last check is the one teams skip and regret: a deployed policy and a compliant estate are not the same thing, and the gap is where the audit finding lives.
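
Between the static checks and the plan review, Terraform's native test framework can assert on what the module would create without touching Azure. This is a sketch under assumptions rather than part of the original pipeline: the file name and the management group ID are hypothetical, and mock_provider needs Terraform 1.7 or later, which is newer than the 1.6.0 floor the module pins.

```hcl
# guardrails/tests/guardrail.tftest.hcl (hypothetical file name)
# Run with: terraform test

# Replace the real provider with a mock so no Azure credentials are needed.
mock_provider "azurerm" {}

variables {
  # Hypothetical ID, used only to satisfy the module's input variable.
  management_group_id = "/providers/Microsoft.Management/managementGroups/mg-example"
}

run "policy_denies_untagged_resource_groups" {
  command = plan

  assert {
    condition     = azurerm_policy_definition.require_rg_tag.name == "require-rg-tag-costcenter"
    error_message = "Policy name should be derived from the lower-cased tag name."
  }

  assert {
    condition     = jsondecode(azurerm_policy_definition.require_rg_tag.policy_rule)["then"]["effect"] == "deny"
    error_message = "The tagging guardrail must use the deny effect, not audit."
  }
}
```

The second assertion is the useful one: if someone softens the guardrail from deny to audit, the test fails in the pull request instead of the change being noticed months later.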

Where CAF helps

Each phase becomes a concrete deliverable:

CAF area	Real deliverable
Strategy	Business drivers, success criteria, and migration rationale
Plan	Workload backlog, dependency map, and wave plan
Ready	Landing zone, identity, connectivity, policy, and subscription design (as IaC)
Adopt	Migration factory, workload onboarding, and validation
Govern	RBAC, Azure Policy, tagging, cost controls, and exception handling (as IaC)
Manage	Monitoring, backup, patching, incident process, and handover
Secure	Security baseline, Defender, logging, and access-review process

Where CAF often fails

The recurring gaps, every one of which is an ownership question in disguise:

No subscription ownership model.
No management-group structure.
No RBAC standard or access-review process.
No landing-zone network standard.
No Azure Policy baseline.
No cost tagging or budget alerts.
No logging and monitoring model.
No migration-wave plan.
No operational handover.
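
Several of these gaps can be closed in the same codebase as the guardrails. As one illustration, the missing budget alert could look roughly like the sketch below. The variable names, the amount, and the thresholds are all assumptions chosen for the example, not values from a real engagement.

```hcl
variable "monthly_budget_amount" {
  type    = number
  default = 5000 # hypothetical figure, in the billing currency
}

variable "budget_start_date" {
  type        = string
  description = "RFC 3339 timestamp on the first day of a month, for example 2026-07-01T00:00:00Z."
}

variable "budget_contact_emails" {
  type        = list(string)
  description = "People who own the cost review for this subscription."
}

data "azurerm_subscription" "current" {}

resource "azurerm_consumption_budget_subscription" "monthly" {
  name            = "budget-monthly"
  subscription_id = data.azurerm_subscription.current.id
  amount          = var.monthly_budget_amount
  time_grain      = "Monthly"

  time_period {
    start_date = var.budget_start_date
  }

  # Notify the owners once actual spend reaches 80% of the budget.
  notification {
    enabled        = true
    operator       = "GreaterThanOrEqualTo"
    threshold      = 80
    threshold_type = "Actual"
    contact_emails = var.budget_contact_emails
  }

  # Notify earlier when the forecast predicts the budget will be exceeded.
  notification {
    enabled        = true
    operator       = "GreaterThan"
    threshold      = 100
    threshold_type = "Forecasted"
    contact_emails = var.budget_contact_emails
  }
}
```

The contact list is the part that matters: a budget with no named recipients repeats the ownership gap in a different place.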

Technical decision points

Management groups: how subscriptions are organised and governed.
Subscriptions: who owns them and how their lifecycle is managed.
Networking: hub-and-spoke, virtual WAN, firewall, DNS, private endpoints, and on-premises connectivity.
Identity: Microsoft Entra ID, privileged access, RBAC, managed identities, and Conditional Access dependencies.
Policy: required guardrails, deny vs audit effects, and the exception process.
Operations: monitoring, backup, patching, incident response, and service ownership.
Cost: budgets, tags, reserved capacity, ownership reports, and chargeback/showback expectations.
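
The first two decision points, management groups and subscriptions, translate directly into code once they are agreed. A minimal sketch follows, with an assumed three-group hierarchy; the names mg-contoso, mg-platform, and mg-landing-zones are hypothetical, and a real design will usually have more levels.

```hcl
variable "landing_zone_subscription_ids" {
  type        = list(string)
  default     = []
  description = "Subscription GUIDs to place under the landing-zones group."
}

# Top of the hierarchy, directly under the tenant root group.
resource "azurerm_management_group" "org" {
  name         = "mg-contoso"
  display_name = "Contoso"
}

# Shared platform services: connectivity, identity, monitoring.
resource "azurerm_management_group" "platform" {
  name                       = "mg-platform"
  display_name               = "Platform"
  parent_management_group_id = azurerm_management_group.org.id
}

# Workload subscriptions; guardrail policies are assigned at this level.
resource "azurerm_management_group" "landing_zones" {
  name                       = "mg-landing-zones"
  display_name               = "Landing Zones"
  parent_management_group_id = azurerm_management_group.org.id
  subscription_ids           = var.landing_zone_subscription_ids
}
```

Keeping subscription membership in a variable makes moving a subscription between groups a reviewed change, which is where the ownership question gets answered in practice.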

A worked example: bringing workloads into Azure

When an organisation wants to migrate a set of workloads to Azure, I don't start by moving virtual machines. I start with the platform (subscription design, management groups, naming, tagging, RBAC, connectivity, firewall rules, backup, monitoring, logging, and cost alerts), expressed as IaC like the module above so the standard is enforced rather than described. Only then do I group workloads into migration waves by business criticality, dependencies, data sensitivity, and support readiness. The order is the whole point: build the landing zone first and each workload lands into a governed home; move the workloads first and you spend the rest of the programme retrofitting governance onto running systems.
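
Once the platform is in place, the wave plan itself can be carried as data. The sketch below is illustrative only: the workload names, cost centres, and wave numbers are invented, and the extra MigrationWave and Criticality tags are an assumption layered on top of the required CostCenter tag.

```hcl
variable "workloads" {
  type = map(object({
    wave        = number
    criticality = string
    cost_center = string
  }))
  # Hypothetical workloads, for illustration only.
  default = {
    "intranet" = { wave = 1, criticality = "low", cost_center = "CC-1001" }
    "billing"  = { wave = 2, criticality = "high", cost_center = "CC-2002" }
  }
}

# One resource group per workload, tagged so it passes the tagging guardrail.
resource "azurerm_resource_group" "workload" {
  for_each = var.workloads

  name     = "rg-${each.key}-prod"
  location = "westeurope"

  tags = {
    CostCenter    = each.value.cost_center
    MigrationWave = tostring(each.value.wave)
    Criticality   = each.value.criticality
  }
}
```

Because every resource group carries the required tag from the start, each workload lands in a home that the deny policy accepts, which is the ordering argument above expressed as code.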

Operating model

The questions every CAF engagement has to answer out loud: Who owns the Azure platform? Who owns each subscription? Who approves network design? Who manages RBAC and privileged access? Who owns the monthly cost review? Who owns backup and monitoring? What is the handover process from project team to operations? And which decisions require architecture-board or CAB approval?

Practical CAF checklist

Define business outcomes and success criteria.
Build a workload inventory and dependency map.
Define the management-group and subscription structure.
Express landing-zone networking, policy, and RBAC as version-controlled IaC.
Use deny policies for the guardrails that must never be violated, audit for the rest.
Run fmt, validate, and a policy scanner (e.g. checkov) in CI; review the plan before apply.
Verify compliance after deployment with Get-AzPolicyState, not just a green apply.
Configure logging, monitoring, backup, and alert ownership.
Create migration waves and validation criteria.
Agree the operational handover and support model.
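
The deny-versus-audit item in the checklist can be made an explicit, reviewable input rather than a value buried in a policy rule. This is a sketch with assumed names (tag_policy_effect, require_resource_tag), not a definitive pattern; it targets taggable resources in general, which is why the mode is Indexed rather than All.

```hcl
variable "management_group_id" {
  type = string
}

variable "required_tag" {
  type    = string
  default = "CostCenter"
}

variable "tag_policy_effect" {
  type        = string
  default     = "audit"
  description = "Use deny for guardrails that must never be violated, audit for the rest."

  validation {
    condition     = contains(["deny", "audit"], var.tag_policy_effect)
    error_message = "The effect must be either deny or audit."
  }
}

resource "azurerm_policy_definition" "require_resource_tag" {
  name                = "require-resource-tag-${lower(var.required_tag)}"
  policy_type         = "Custom"
  mode                = "Indexed" # only resource types that support tags
  display_name        = "Require tag '${var.required_tag}' on resources"
  management_group_id = var.management_group_id

  policy_rule = jsonencode({
    if   = { field = "tags['${var.required_tag}']", exists = "false" }
    then = { effect = var.tag_policy_effect }
  })
}
```

Starting a new guardrail on audit and switching it to deny in a later pull request gives teams a window to fix existing resources before the control starts rejecting deployments.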

Final recommendation

Use Azure CAF as a delivery-control model, not a document exercise. The value is not in naming the phases — it is in converting the framework into accountable owners and reusable landing-zone standards expressed as tested, version-controlled code. Get the guardrails in as IaC, prove them with static checks and a post-deploy compliance check, and review every change as a plan before it applies. Do that and most of CAF's "quiet failures" never get the chance to happen, because the estate refuses to drift in the first place.
