Infrastructure teams face a challenge: supporting rapid software delivery while protecting their cloud deployments against breaking changes. As infrastructure code grows increasingly complex, teams often struggle to coordinate changes, manage state files, and ensure rigorous testing before production deployments. For example, when a developer modifies an autoscaling group’s launch template, they need confidence that their changes won’t inadvertently impact related resources or conflict with another team member’s updates.
Traditional Git workflows often prove inadequate for managing Terraform or OpenTofu configurations. Challenges such as Terraform state file locking, environment promotion, and validating changes across interconnected resources like VPCs, security groups, and IAM policies require more sophisticated approaches. GitFlow, with its well-defined branching patterns and merge strategies, offers a structured solution. However, adapting GitFlow for infrastructure code demands careful modifications to address backend configurations, concurrent development, and automated testing.
Before implementing GitFlow for infrastructure management, teams need to understand:
This guide explores how teams can leverage GitFlow to improve infrastructure management. It covers branching strategies, concurrent development solutions, and automation approaches to prevent conflicting deployments. Finally, we’ll evaluate where GitFlow’s structure benefits infrastructure teams and when simpler workflows may be more effective. We’ll be using simplified technical examples to help explain the core concepts. Examples are based on Terraform, with AWS as the cloud provider and GitHub/GitHub Actions for VCS and CI/CD, respectively.
GitFlow implementation for infrastructure code means mapping your branch hierarchy to actual infrastructure environments. Your production infrastructure code lives in the main branch, serving as the source of truth for what's actively running in your cloud environment. Changes flow through the develop branch first, which typically connects to a staging environment where teams can validate infrastructure modifications before they reach production. Setting this up successfully means tackling state file isolation, configuring granular access, and coordinating automated deployments.
The way you configure your state backend directly impacts how well GitFlow works for infrastructure code. Each environment needs its own state configuration to prevent any chance of staging changes accidentally affecting production resources. Many teams start with a straightforward S3 and DynamoDB combination for state management, though larger organizations often move to TACOS (Terraform Automation and Collaboration Software) platforms that handle the growing complexity of state management and access controls. Here's what a typical S3 backend configuration might look like:
This kind of setup encrypts state files with KMS, prevents conflicting changes through DynamoDB locking, and keeps environments separate using distinct state file paths. S3 also supports cross-region replication of the state bucket, which becomes important as your infrastructure footprint grows across multiple regions or accounts.
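As a minimal sketch, the staging backend might be declared as follows. The bucket name, lock table, KMS key ARN, and region are hypothetical placeholders rather than values from a real setup; production would use the same shape with its own key path (and often its own bucket or account).

```hcl
terraform {
  backend "s3" {
    # All names below are illustrative assumptions.
    bucket = "example-terraform-state"   # hypothetical state bucket
    key    = "staging/terraform.tfstate" # distinct path per environment
    region = "us-east-1"

    # Encrypt the state object at rest with a customer-managed KMS key.
    encrypt    = true
    kms_key_id = "arn:aws:kms:us-east-1:111122223333:key/REPLACE-ME" # placeholder ARN

    # DynamoDB table used for state locking; it needs a string hash key named "LockID".
    dynamodb_table = "terraform-locks"
  }
}
```

Recent Terraform releases also support S3-native locking through the `use_lockfile` argument, which may be worth evaluating as an alternative to the DynamoDB table.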
Feature branches need isolated state management to safely test infrastructure changes without affecting other environments. Teams typically choose between two main approaches: Terraform workspaces or branch-specific state files. Terraform workspaces provide a lightweight way to manage multiple states within the same backend configuration. Each feature branch maps to its own workspace, allowing isolated testing while sharing the same backend infrastructure. Here's how teams might implement workspace-based isolation:
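The following is a sketch of workspace-based isolation, not a definitive layout. The bucket, table, prefix, and resource names are assumptions made for illustration; the only built-in used is `terraform.workspace`, which evaluates to the name of the currently selected workspace.

```hcl
terraform {
  backend "s3" {
    bucket         = "example-terraform-state" # hypothetical
    key            = "network/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"         # hypothetical

    # Non-default workspaces are stored under
    # <workspace_key_prefix>/<workspace name>/<key>.
    workspace_key_prefix = "feature"
  }
}

variable "vpc_id" {
  description = "VPC in which the per-branch test resources are created."
  type        = string
}

locals {
  # "default" unless a workspace has been created and selected.
  environment = terraform.workspace
  name_prefix = "app-${local.environment}"
}

# Example resource whose name is unique per workspace, so two feature
# branches can be applied side by side without name collisions.
resource "aws_security_group" "app" {
  name   = "${local.name_prefix}-sg"
  vpc_id = var.vpc_id
}
```

A branch would then be tested by running `terraform workspace new <name>` (or `terraform workspace select <name>`) before `terraform plan`. Branch names are usually normalized first, for example `feature/login` to `feature-login`.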
The second approach uses completely separate state files for each feature branch, providing stronger isolation by maintaining distinct paths in the backend storage. This method works well for teams that need complete separation of state data:
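One way to sketch this, assuming a partial backend configuration: in Terraform, backend blocks cannot reference variables or locals, so the branch-specific `key` is left out of the block and passed at init time. The bucket, table, `BRANCH_SLUG`, and prefix names are hypothetical.

```hcl
terraform {
  backend "s3" {
    # "key" is deliberately omitted here and supplied by CI at init time, e.g.
    #   terraform init -backend-config="key=feature/$BRANCH_SLUG/terraform.tfstate"
    # where BRANCH_SLUG is a hypothetical, sanitized branch name.
    bucket         = "example-terraform-state" # hypothetical
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"         # hypothetical
  }
}

variable "branch_name" {
  description = "Sanitized Git branch name, set by CI (for example via TF_VAR_branch_name)."
  type        = string

  validation {
    condition     = can(regex("^[a-z0-9-]{1,32}$", var.branch_name))
    error_message = "branch_name must be 1-32 lowercase letters, digits, or hyphens."
  }
}

locals {
  # Used to keep resource names and tags unique per feature branch.
  name_prefix = "feat-${var.branch_name}"
  common_tags = {
    Branch    = var.branch_name
    ManagedBy = "terraform"
  }
}
```

Because every branch owns a separate state object, cleanup after a merge amounts to running `terraform destroy` against that state and then deleting the object.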
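When using branch-specific state files, teams typically generate the branch_name value in their CI/CD pipeline or local development scripts. Because Terraform backend blocks cannot reference variables or locals, the branch-specific state path is supplied when the backend is initialized (for example, with the -backend-config option of terraform init), while a branch_name variable interpolated in the locals block keeps resource names and tags unique per branch. This approach requires more setup but gives you full control over state isolation and makes it easier to clean up resources when feature branches are merged or deleted.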
When releasing infrastructure changes, teams need a well-defined approach for promoting code through environments while keeping tight control over state. Release branches use dedicated state configurations that track changes as they progress through testing environments before reaching production. Here's a typical release branch state configuration:
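A release configuration could take roughly the following shape. This is a sketch under stated assumptions: the state key, the `release_version` variable, and the tag names are all hypothetical and only illustrate keeping release state separate and traceable.

```hcl
terraform {
  required_version = ">= 1.5.0"

  backend "s3" {
    bucket         = "example-terraform-state"   # hypothetical
    key            = "release/terraform.tfstate" # dedicated path for release validation
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"           # hypothetical
  }
}

variable "release_version" {
  description = "Version label of the release branch being validated (hypothetical, set by CI)."
  type        = string
}

provider "aws" {
  region = "us-east-1"

  # Tag every resource with the release it came from, so plans and
  # later drift checks can be traced back to a specific release.
  default_tags {
    tags = {
      Environment = "release"
      Release     = var.release_version
    }
  }
}
```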
Release validation starts with automated security and policy scans using tools like tfsec, Checkov, or Open Policy Agent (OPA) to verify infrastructure compliance. Teams then run terraform plan against each target environment and capture the expected changes in version control for review. This creates an audit trail of infrastructure modifications and helps catch potential issues like state drift early in the release cycle.
Before production deployment, teams typically perform a final drift detection to identify any out-of-band changes that could affect the release. The production deployment itself often requires coordination with change management processes, including scheduling maintenance windows and notifying stakeholders of potential service impacts. Some teams use open-source tools like Terrateam to automate this orchestration, ensuring consistent release execution and proper state handling across environments.
Rollback planning is also critical: teams maintain snapshots of their state files before major releases and document the specific steps needed to revert changes if issues arise. This might include keeping the previous release branch active until the new release proves stable in production.
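One common way to keep recoverable copies of state, shown here as a sketch rather than a complete rollback procedure, is to enable versioning on the state bucket so that every write leaves the previous state object retrievable. The bucket name is hypothetical, and this would normally live in a separate bootstrap configuration rather than in the stack whose state it stores.

```hcl
# Bucket that holds Terraform state (hypothetical name).
resource "aws_s3_bucket" "state" {
  bucket = "example-terraform-state"

  lifecycle {
    # Guard against accidentally destroying the bucket that stores state.
    prevent_destroy = true
  }
}

# Keep prior versions of every state object so an earlier state
# can be restored if a release has to be rolled back.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}
```

Restoring an old state version does not revert the real infrastructure by itself; it still has to be paired with applying the previous release's code.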
When production issues require immediate attention, teams create hotfix branches directly from main. These branches need special handling for state management to ensure emergency fixes don't cause additional problems. A typical hotfix state configuration looks like this:
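One plausible shape, assuming the hotfix is planned and applied directly against the production state with locking enabled. The bucket, key, and table names are hypothetical.

```hcl
terraform {
  backend "s3" {
    # Assumption: hotfixes branch from main, so they target the same
    # state object that main uses for production.
    bucket  = "example-terraform-state"      # hypothetical
    key     = "production/terraform.tfstate" # hypothetical production path
    region  = "us-east-1"
    encrypt = true

    # Locking ensures only one engineer can apply at a time during an incident.
    dynamodb_table = "terraform-locks"       # hypothetical
  }
}
```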
Emergency changes still require key safety measures. Before applying hotfixes, teams capture the current state file as a backup and document specific rollback steps. Most organizations require explicit approval from service owners and security teams, even during incidents. The state backend's locking mechanism becomes especially important here, because it prevents multiple engineers from making concurrent changes that could compound the issue.
Many teams set up dedicated CI/CD pipelines for emergency changes that run abbreviated but essential checks. These typically include basic security scans and policy validation while skipping longer tests that might delay critical fixes. Once the hotfix proves successful in production, teams backport these changes to the develop branch to maintain consistency across environments.
GitFlow brings structure to infrastructure management, but it takes planning and the right automation to make it work. Terraform state, parallel development, and CI/CD all add complexity that needs to be handled properly. This isn’t the only way to manage infrastructure. Some teams prefer simpler branching models or trunk-based development because they’re easier to work with.
The right workflow depends on how your team operates and how complex your Terraform setup is. If GitFlow makes sense, open-source tools like Terrateam can help automate the process and keep things running smoothly.
In the next part of this series, we’ll break down where GitFlow helps, how it improves feature branch isolation and security, and where it might add unnecessary overhead.