
Global Backup Cloud Storage

Introduction

The Global Backup Cloud Storage solution provides Coldline storage for long-term retention of backup archives. The Global Backup solution uses locally installed appliances from Actifio, which are configured to store these archives on Google Cloud Storage (GCS).

This Terraform Stack deploys the cloud storage, service accounts, and access controls needed to support the Global Backup solution. Credentials for service accounts are also uploaded to Vault for downstream configuration of backup appliances.

Solution Overview

To support the setup of long-term retention for backup data, the following design has been implemented:

[Diagram: Global Backup Cloud Storage Architecture Overview]

  1. Each Actifio Sky Appliance is associated with a dedicated GCS bucket per Business Unit (each Actifio appliance may perform backups for one or more Business Units in a single location);
  2. Each GCS bucket is associated with a dedicated Service Account, used to grant access via Cloud IAM;
  3. Each Service Account has a dedicated Service Account Key which is used for authentication;
  4. Details relating to the GCS bucket, Service Account and Service Account Key are stored in a JSON variable referred to as the Service Account Secret;
  5. The Service Account Secret is uploaded to HashiCorp Vault as a generic secret;
  6. The Service Account Secret is pulled from HashiCorp Vault, used to generate a Client Authentication certificate, and to enable configuration of an OnVault Pool in the Actifio Sky Appliance;
  7. The Actifio Sky Appliance uses the OnVault Pool feature to upload backup data (archives for long term retention) to the GCS bucket;

The main reason for creating multiple GCS buckets mapped 1-to-1 with dedicated Service Accounts is to improve the isolation between data from different locations and different Business Units. As step 6 is performed manually and local teams are responsible for administering individual Actifio Sky Appliances, this design helps to reduce data exposure in the event of a compromised Service Account Key. Further protection is provided by the Actifio Sky Appliances, which shard and encrypt the data before uploading it to cloud storage.

Each Business Unit has been assigned a dedicated Landing Zone (GCP Project). This is to enable a future requirement to associate each Business Unit with a dedicated Billing Account (all Projects are currently linked to a central Billing Account for Global Backup). This also provides the ability to easily delegate administrative access on a per Business Unit level should the need arise.

Future sites are expected to use Storage Accounts hosted on Microsoft Azure to meet specific data residency requirements. These will be configured using the same design pattern and process, but using equivalent Azure Resources. This Terraform Stack should be extended to cover this requirement as appropriate.

Terraform Stack Overview

File Structure

All infrastructure is deployed as part of a single Terraform Stack, which consists of the following files:

| File | Description |
|------|-------------|
| backend.tf | Describes the backend configuration used to store the Terraform State file. In this case, we use GCS. |
| main.tf | Describes the configuration values used by the deployment, including:<br>• general variables used for naming conventions;<br>• variables used for Vault;<br>• a map of Project IDs to Business Unit names;<br>• a map containing configuration values for Global Backup instances. |
| outputs.tf | Describes the outputs generated by the stack and printed to the console upon successful completion. |
| providers.tf | Describes the providers used by the stack. |
| resources.tf | Describes the resources deployed by the stack, including:<br>• resource "google_storage_bucket" "global_backup_gcp_storage_coldline"<br>• resource "google_service_account" "global_backup_gcp_service_account"<br>• resource "google_service_account_key" "global_backup_gcp_service_account_key"<br>• resource "vault_generic_secret" "global_backup_gcp_service_account_key"<br>• data "google_iam_policy" "global_backup_gcp_service_account_iam_binding_policy"<br>• resource "google_storage_bucket_iam_policy" "global_backup_gcp_service_account_iam_binding" |

Deployment Logic

The Terraform Stack deploys resources based on the following high-level logic:

  1. Cloud Storage instances are created for each Actifio Sky Appliance;
  2. Cloud Storage instances are dedicated to a specific Business Unit and are located in a Business Unit-specific Landing Zone;
  3. A single Actifio Sky Appliance may perform backups locally for one or more Business Units, so it may be associated with multiple Cloud Storage instances;
  4. Service Accounts are created for each Cloud Storage instance to ensure isolation between both Business Units and individual Actifio Sky Appliance locations, minimising the attack surface for a given set of credentials;
  5. All Cloud Storage configuration data needed to set up the Actifio Sky Appliances is stored in Vault, enabling secure sharing with the local teams responsible for configuring them;
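
This logic is driven by the local variable global_backup_gcp_storage_config in main.tf (described in detail below). As a minimal hypothetical sketch of a single entry, where the entry name and values are illustrative and reuse the examples from the tables later in this guide:

    locals {
      global_backup_gcp_storage_config = {
        "bu4_na-ca-toronto-0218" = {
          appliance_location = "na-ca-toronto-0218"      # Source location (CMDB format)
          bucket_location    = "NORTHAMERICA-NORTHEAST1" # Target GCS bucket location
          project_id         = "clz-bu4-global-backup"   # Business Unit Landing Zone
          op_group_name      = "Business Unit 4"         # Referenced by the Vault Secret
        }
      }
    }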

Stack Components

Target Landing Zones

The following Landing Zones are used by the Global Backup solution:

| Business Unit | Cloud Platform | Landing Zone |
|---------------|----------------|--------------|
| Business Unit 1 | Google | organizations/{{organization_id}}/clz-bu1-global-backup |
| Business Unit 2 | Google | organizations/{{organization_id}}/clz-bu2-global-backup |
| Business Unit 3 | Google | organizations/{{organization_id}}/clz-bu3-global-backup |
| Business Unit 4 | Google | organizations/{{organization_id}}/clz-bu4-global-backup |
| Business Unit 5 | Google | organizations/{{organization_id}}/clz-bu5-global-backup |
| Business Unit 6 | Google | organizations/{{organization_id}}/clz-bu6-global-backup |

These Landing Zones are all pre-created and are only referenced by this Terraform Stack.

Cloud Storage

The code to generate GCS buckets is stored in resources.tf and uses the google_storage_bucket resource type from the Google Provider in Terraform.

GCS buckets are created using a for_each loop over the local variable global_backup_gcp_storage_config defined in main.tf, with the following naming convention:

{{appliance_location}}.{{project_id}}.gcp.mydomain.com

These variables were chosen to create the naming convention as they clearly identify what the bucket is being used for, and by whom, through the following components of each variable value:

| Variable Name | Example | Components |
|---------------|---------|------------|
| appliance_location | na-ca-toronto-0218 | Region (na)<br>Country (ca)<br>City (toronto)<br>Location ID (0218) |
| project_id | clz-bu4-global-backup | Generic Identifier (clz)<br>Business Unit (bu4)<br>Product / Service (global-backup) |

Because this solution uses domain-named buckets, the identity used to run the deployment must be a verified owner of the domain. For more information relating to domain-named buckets, please refer to the Domain-named bucket verification article.
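
As a rough sketch, the resource might look like the following in resources.tf (the naming convention and Coldline storage class are taken from this guide; any other arguments are assumptions):

    resource "google_storage_bucket" "global_backup_gcp_storage_coldline" {
      for_each = local.global_backup_gcp_storage_config

      # {{appliance_location}}.{{project_id}}.gcp.mydomain.com
      name          = "${each.value.appliance_location}.${each.value.project_id}.gcp.mydomain.com"
      project       = each.value.project_id
      location      = each.value.bucket_location
      storage_class = "COLDLINE"
    }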

Service Accounts

The code to generate the service accounts is stored in resources.tf and uses the google_service_account resource type from the Google Provider in Terraform.

Service Accounts are created using a for_each loop over the local variable global_backup_gcp_storage_config defined in main.tf, with the following naming convention:

{{appliance_location}}@{{project_id}}.iam.gserviceaccount.com

Similar to the GCS buckets, this naming convention was chosen as it clearly identifies what the Service Account is being used for, and by whom, through the components of each variable value (please refer to the table above in Cloud Storage).
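
A corresponding sketch (the account_id is derived from the naming convention above; any other arguments are assumptions):

    resource "google_service_account" "global_backup_gcp_service_account" {
      for_each = local.global_backup_gcp_storage_config

      # Yields {{appliance_location}}@{{project_id}}.iam.gserviceaccount.com
      account_id = each.value.appliance_location
      project    = each.value.project_id
    }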

Service Account Keys

The code to generate the service account keys is stored in resources.tf and uses the google_service_account_key resource type from the Google Provider in Terraform.

Service Account Keys are created using a for_each loop from the google_service_account resources defined in resources.tf.
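
This is a small resource; a minimal sketch based on the description above:

    resource "google_service_account_key" "global_backup_gcp_service_account_key" {
      # One key per Service Account created above
      for_each = google_service_account.global_backup_gcp_service_account

      service_account_id = each.value.name
    }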

Service Account Keys are auto-assigned a GUID, so they are not "named" like other resource types.

Service Account Keys are used to authenticate to the Google APIs under the identity of a Service Account. The Service Account Key is essentially a client authentication certificate which is generated, and available to download, only at the time of creation.

The Google Console supports generating these in PKCS12 format (where the public and private keys are bundled in a single password-protected binary file), or as a JSON output containing various connection details, including the public and private keys in Base64-encoded format.

The Actifio Sky Appliance only supports uploading Google credentials in PKCS12 format (with a default password of notasecret). As Terraform uses the Google API, the only supported format is JSON; this is also the only format supported for uploading the key to HashiCorp Vault. As such, a conversion process needs to be followed to convert the Base64-encoded public and private keys into a PKCS12 certificate file. This process has been documented separately and shared with TCS/IBM.

When Terraform first creates the Service Account Key, the secret values are stored in the Terraform State file. This allows Terraform to process the Service Account Key data on subsequent runs, but also means this highly sensitive data persists in the Terraform State file. As such, access to the State file must be carefully controlled and audited.

Vault Secrets

The Service Account Key, along with other data relating to setting up the OnVault Pool, needs to be made available to the local support teams for upload to the Actifio Sky Appliances.

Vault Secrets are created using a for_each loop over the google_service_account_key resources defined in resources.tf, but also pull in the following additional information:

| Custom Data | Example | Source |
|-------------|---------|--------|
| appliance_location | na-ca-toronto-0218 | global_backup_gcp_storage_config Local Value |
| bucket_id | na-ca-toronto-0218.clz-bu4-global-backup.gcp.mydomain.com | google_storage_bucket Resource |
| bucket_location | NORTHAMERICA-NORTHEAST1 | global_backup_gcp_storage_config Local Value |
| project_id | clz-bu4-global-backup | global_backup_gcp_storage_config Local Value |
| op_group_name | Business Unit 4 | global_backup_gcp_storage_config Local Value |
| public_key | -----BEGIN CERTIFICATE-----<br>{{public_key_value}}<br>-----END CERTIFICATE----- | google_service_account_key Resource |

When creating the Service Account Key, the public_key and private_key values are output as Base64 character sequences. Terraform provides a base64decode() function which is used to convert these back to their original string values. As the public_key contains just the public key value, no further action is required; the decoded private_key, however, produces a JSON object containing the actual private_key value along with a number of other fields.

To create the final JSON object to upload to Vault, we perform the following steps for each Service Account Key instance:

  1. Create a secret_custom_values_map object containing the custom values listed in the above table, including the Base64 decoded public_key value;
  2. Create a secret_private_key_map object by extracting the private_key value from the Service Account Key, converting from Base64, and then from JSON object to map object;
  3. Merge the secret_custom_values_map and secret_private_key_map objects into a single object, and then convert to JSON;
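
A sketch of these three steps for a single hypothetical instance key; the real code evaluates the equivalent expressions per for_each instance, and the intermediate names mirror those described above:

    locals {
      example_key = "bu4_na-ca-toronto-0218" # Hypothetical map key

      # Step 1: custom values, including the Base64-decoded public_key
      secret_custom_values_map = {
        appliance_location = local.global_backup_gcp_storage_config[local.example_key].appliance_location
        bucket_id          = google_storage_bucket.global_backup_gcp_storage_coldline[local.example_key].name
        bucket_location    = local.global_backup_gcp_storage_config[local.example_key].bucket_location
        project_id         = local.global_backup_gcp_storage_config[local.example_key].project_id
        op_group_name      = local.global_backup_gcp_storage_config[local.example_key].op_group_name
        public_key         = base64decode(google_service_account_key.global_backup_gcp_service_account_key[local.example_key].public_key)
      }

      # Step 2: Base64 -> JSON string -> map object
      secret_private_key_map = jsondecode(base64decode(google_service_account_key.global_backup_gcp_service_account_key[local.example_key].private_key))

      # Step 3: merge both maps and convert the result to a JSON string
      secret_data_json = jsonencode(merge(local.secret_custom_values_map, local.secret_private_key_map))
    }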

The generated JSON data object is uploaded to Vault using the vault_generic_secret resource type from the Vault Provider in Terraform. Each secret is stored using the following path / naming convention:

Actifio Backup Recovery/{{service_account_id}}

The service_account_id is generated automatically as part of the Service Account resource creation, and takes the format:

projects/{{project_id}}/serviceAccounts/{{appliance_location}}@{{project_id}}.iam.gserviceaccount.com

This format provides a useful structure for identifying everything you need to know about what the secret is used for.

The following is an example Vault Secret path generated by the code:

Actifio Backup Recovery/projects/clz-bu4-global-backup/serviceAccounts/na-ca-toronto-0218@clz-bu4-global-backup.iam.gserviceaccount.com
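
Putting this together, the upload might be sketched as follows (the secret/ mount prefix is inferred from the Vault permissions listed later in this guide; the data assembly repeats the three steps above, per instance):

    resource "vault_generic_secret" "global_backup_gcp_service_account_key" {
      for_each = google_service_account_key.global_backup_gcp_service_account_key

      # "Actifio Backup Recovery/{{service_account_id}}" - the name attribute of
      # google_service_account returns the fully-qualified service_account_id
      path = "secret/Actifio Backup Recovery/${google_service_account.global_backup_gcp_service_account[each.key].name}"

      data_json = jsonencode(merge(
        {
          appliance_location = local.global_backup_gcp_storage_config[each.key].appliance_location
          bucket_id          = google_storage_bucket.global_backup_gcp_storage_coldline[each.key].name
          bucket_location    = local.global_backup_gcp_storage_config[each.key].bucket_location
          project_id         = local.global_backup_gcp_storage_config[each.key].project_id
          op_group_name      = local.global_backup_gcp_storage_config[each.key].op_group_name
          public_key         = base64decode(each.value.public_key)
        },
        jsondecode(base64decode(each.value.private_key))
      ))
    }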

IAM Bindings

In order to grant each Service Account access to its associated GCS bucket, a set of IAM policies needs to be created to assign permissions. This is achieved via two steps:

  1. google_iam_policy data objects are created using a for_each loop from the google_service_account resources defined in resources.tf;
  2. google_storage_bucket_iam_policy IAM bindings are created using a for_each loop from the google_iam_policy data objects created above;

This combination of creating a custom IAM Policy and then applying it to a Storage Bucket ensures that each GCS bucket is accessible only by the associated Service Account, using the minimal permissions needed by the Actifio Sky Appliance.

For Actifio to work, the appliance must be granted storage.objectAdmin rights on the bucket but also needs the ability to read the GCS Bucket metadata (storage.buckets.get). As the latter is usually only granted via overly permissive predefined roles, a custom role named Custom Storage Bucket Metadata Viewer has been created in the GCP Organization. This role provides the necessary additional permissions and is added as part of the google_iam_policy definition.

Each IAM binding is applied directly to the GCS bucket rather than at the Project level. This is by design, to prevent Service Accounts from being able to access GCS buckets intended for other Actifio Sky Appliances.
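
A sketch of the two steps; the exact custom role ID is not given in this guide, so the role reference below is an assumption (only the display name Custom Storage Bucket Metadata Viewer is known):

    data "google_iam_policy" "global_backup_gcp_service_account_iam_binding_policy" {
      for_each = google_service_account.global_backup_gcp_service_account

      # Object-level read/write access needed by the Actifio Sky Appliance
      binding {
        role    = "roles/storage.objectAdmin"
        members = ["serviceAccount:${each.value.email}"]
      }

      # Custom organization-level role granting storage.buckets.get
      binding {
        role    = "organizations/{{organization_id}}/roles/CustomStorageBucketMetadataViewer" # Assumed role ID
        members = ["serviceAccount:${each.value.email}"]
      }
    }

    resource "google_storage_bucket_iam_policy" "global_backup_gcp_service_account_iam_binding" {
      for_each = data.google_iam_policy.global_backup_gcp_service_account_iam_binding_policy

      bucket      = google_storage_bucket.global_backup_gcp_storage_coldline[each.key].name
      policy_data = each.value.policy_data
    }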

Operations

Add/Update/Remove Global Backup Cloud Storage Instances

When adding Actifio appliances to the solution or removing them (for example, to support configuration of a new Actifio Sky Appliance, or to add a new Business Unit for an existing appliance), or when updating an existing instance, you must perform the following steps:

  1. Open the main.tf file
  2. Locate the global_backup_gcp_storage_config object - this is defined in Terraform as a Local Values map object
  3. To add a new Global Backup Cloud Storage instance, add a new entry in the global_backup_gcp_storage_config object using the following format:
    map_item_name = {                   # Must be a unique index within the map
        appliance_location  = string    # Source location for the backup appliance (as per CMDB format)
        bucket_location     = string    # Target location for the Google Storage Bucket
        project_id          = string    # Project ID for where to create the bucket
    }
    

    It should be possible to create a meaningful and unique value for the map_item_name by joining the Business Unit ID (taken from the associated project_id) with the appliance_location value in the format:

    {{op_group_id}}_{{appliance_location}}

    If not, you probably don't need to create a new instance for this Business Unit / Site combination.

  4. To update an existing Global Backup Cloud Storage instance, find the corresponding entry in the global_backup_gcp_storage_config object and update the necessary values within the map_item_name object
  5. To delete an existing Global Backup Cloud Storage instance, find the corresponding entry in the global_backup_gcp_storage_config object and remove the map_item_name object
  6. Having completed your updates to the global_backup_gcp_storage_config object, run terraform plan to ensure you'll get the expected result;
  7. Once happy with the proposed changes, run terraform apply to update the Cloud Storage infrastructure;

Permissions Required

In order to run the Terraform deployment, the following permissions are required:

| Platform | Permissions | Scope |
|----------|-------------|-------|
| Google Search Console | Verified/Delegated Owner | Domain property: gcp.mydomain.com<br>For more information: https://support.google.com/webmasters/answer/9008080 |
| GCP | roles/storage.objectAdmin | Bucket Object (project_id = mydomain-com-globalbackup-automation):<br>gs://tfstate.automation.eu.mydomain.com/mydomain_com/global_backup/default.tfstate |
| GCP | roles/storage.admin<br>roles/iam.serviceAccountAdmin<br>roles/iam.serviceAccountKeyAdmin | Google Projects:<br>• clz-bu1-global-backup<br>• clz-bu2-global-backup<br>• clz-bu3-global-backup<br>• clz-bu4-global-backup<br>• clz-bu5-global-backup<br>• clz-bu6-global-backup |
| Vault | create, read, update, delete, list | secret/Actifio Backup Recovery/* |
| Vault | create, read, update, list, sudo | auth/token/* |

The above permissions are believed to be correct; however, the Service Account used to deploy this stack was also used to create the Projects via a separate Terraform Stack. As such, this Service Account also held the primitive roles/owner permission on each Project. This is not believed to be necessary to deploy this stack, but has not been tested due to time constraints.

Terraform State (Remote Backend)

To ensure anyone deploying the Terraform Stack is working from the latest state, a remote backend has been configured. This uses the GCS Backend type.

To run this stack, you must be granted access to read and write against the following GCS bucket object:

| Field | Value |
|-------|-------|
| Project ID | mydomain-com-globalbackup-automation |
| Bucket path (gsutil) | gs://tfstate.automation.eu.mydomain.com |
| Bucket path (URL) | https://console.cloud.google.com/storage/browser/tfstate.automation.eu.mydomain.com |
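
Based on the State file object listed under Permissions Required, backend.tf presumably contains something close to the following (the prefix is inferred from that object path):

    terraform {
      backend "gcs" {
        bucket = "tfstate.automation.eu.mydomain.com"
        prefix = "mydomain_com/global_backup" # default workspace -> default.tfstate
      }
    }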

Additional Guidance

Throughout the Terraform template we have used a for_each loop to ensure resources can be easily added and removed without accidentally recreating other resources. This is because the count method uses a positional index to determine the resource ID used to identify a resource in the State file, which can make the code sensitive to list ordering.

By taking this approach, we reduce the risk of causing unexpected behavior when updating entries in the local variable global_backup_gcp_storage_config.
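
To illustrate the difference, with hypothetical state addresses:

    # With for_each, instances are tracked in the State file by map key, so
    # entries can be added or removed without affecting their neighbours:
    #   google_storage_bucket.global_backup_gcp_storage_coldline["bu4_na-ca-toronto-0218"]
    #
    # With count, the same instance would be tracked by position:
    #   google_storage_bucket.global_backup_gcp_storage_coldline[3]
    # Removing an earlier entry would shift later indexes, causing Terraform to
    # plan destroy/recreate operations for otherwise unchanged resources.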

For more information see:
https://blog.gruntwork.io/terraform-tips-tricks-loops-if-statements-and-gotchas-f739bbae55f9

