2nd July 2021

Snowflake in Azure using Terraform - How we fixed a client problem by creating our first Auto-Ingest Snowpipes!

Data Architect and Tech Lead at The Data Shed, George Verney, has been working with one of our clients, a retail, surveillance and IoT provider based in Europe, migrating their services into the cloud.

In this blog, he shares his experience of resolving a challenging problem by implementing Auto-Ingest Snowpipes in Azure using Terraform.

Recently, we’ve been working with a client who needed to migrate some of their on-premises services to Azure, and their analytical pipeline was leveraging Snowflake. My main role on the project was to get the infrastructure side of things sorted.

Wielding the powerful infrastructure-as-code (IaC) tool Terraform, I set out to make a repeatable and reliable repository of code.

The Terraform provider for Snowflake is adequate. It does the job, but the documentation is very limited. However, once you combine it with the documentation on Automating Snowpipe for Microsoft Azure Blob Storage, things become a lot simpler.

Anyway, I’m getting side-tracked.

The main purpose of this post is to highlight, and show you, how we fixed some frustrating client problems by creating our first auto-ingest snowpipes! So, let’s get into it…

Granting Snowflake Permission to Azure

Initial Setup

The first time you set things up, you will need to grant your Snowflake instance permission to your Azure Subscription. This is done manually by following a “consent URL” – you can find out how to obtain one here. If you know an automated way of doing this, please do share it!

What this essentially does is create an Enterprise Application in your Azure Active Directory, which we use like a service account: a non-interactive “user” that can be granted permissions to Azure resources.

Once you have done this, you should find a new application with a name starting with “Snowflake” in your Azure Active Directory.

Important note: You’ll need one Enterprise Application per Snowflake instance. If you use a separate Snowflake instance in development, test, and production then you will need to perform this authorisation step three times. If you share a single Snowflake instance for multiple environments, you only need to create one Enterprise Application.

Terraform

You can get the consent URL via Terraform, parse the client ID out of it, and use it to create a data resource pointing at your Enterprise Application.
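For context, the azure_consent_url used below is exposed as an attribute of the snowflake_storage_integration resource. Here is a minimal sketch of such an integration; the name, the azure_tenant_id variable and the storage container reference are illustrative assumptions, not our exact configuration:

resource "snowflake_storage_integration" "integration" {
  name             = "AZURE_INTEGRATION" // illustrative name
  type             = "EXTERNAL_STAGE"
  storage_provider = "AZURE"
  enabled          = true

  // Assumed variable holding your Azure AD tenant ID
  azure_tenant_id = var.azure_tenant_id

  // Assumed storage account and container; point this at your own blob location(s)
  storage_allowed_locations = [
    "azure://${azurerm_storage_account.snowflake.name}.blob.core.windows.net/${azurerm_storage_container.data.name}/"
  ]
}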

 

data "azuread_service_principal" "snowflake" {
  // Parsing the consent URL for the `client_id` (ApplicationId) is not ideal, but works!
  application_id = (
   replace(
    regex(
     "\\A?client_id=[^&]*",
     snowflake_storage_integration.integration.azure_consent_url
     ),
     "client_id=",
     ""
    )
   )
 }

 

This can then be used to grant Snowflake access to your Azure resources, such as the required permissions on your storage account:

 

resource "azurerm_role_assignment" "snowflake_storage" {
 scope = azurerm_storage_account.snowflake.id
 role_definition_name = "Storage Blob Data Reader"
 principal_id = data.azuread_service_principal.snowflake.object_id
}

resource "azurerm_role_assignment" "snowflake_queue" {
 scope = azurerm_storage_account.snowflake.id
 role_definition_name = "Storage Queue Data Contributor"
 principal_id = data.azuread_service_principal.snowflake.object_id
}
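
For completeness, this is roughly how the notification plumbing that the auto-ingest Snowpipe relies on fits together: a storage queue, an Event Grid subscription pushing Blob Created events into it, and a Snowflake notification integration polling it. This is a hedged sketch based on the Snowflake documentation linked above; all resource and variable names are illustrative assumptions:

// The storage queue that receives Blob Created events (illustrative names throughout).
resource "azurerm_storage_queue" "snowflake" {
  name                 = "snowqueue"
  storage_account_name = azurerm_storage_account.snowflake.name
}

// Event Grid pushes Blob Created events from the storage account into the queue.
resource "azurerm_eventgrid_event_subscription" "snowflake" {
  name  = "snowflake-blob-created"
  scope = azurerm_storage_account.snowflake.id

  included_event_types = ["Microsoft.Storage.BlobCreated"]

  storage_queue_endpoint {
    storage_account_id = azurerm_storage_account.snowflake.id
    queue_name         = azurerm_storage_queue.snowflake.name
  }
}

// The Snowflake notification integration that polls the queue.
resource "snowflake_notification_integration" "notification" {
  name                            = "AZURE_NOTIFICATION_INTEGRATION"
  type                            = "QUEUE"
  notification_provider           = "AZURE_STORAGE_QUEUE"
  azure_storage_queue_primary_uri = "https://${azurerm_storage_account.snowflake.name}.queue.core.windows.net/${azurerm_storage_queue.snowflake.name}"
  azure_tenant_id                 = var.azure_tenant_id
}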

The ‘Gotcha’

The Terraform code above works…sometimes.

You can regularly receive this hair-pulling error:

 

snowflake_pipe.main: Creating...
╷
│ Error: error creating pipe TEST_PIPE: 090040 (XX000): Pipe Notifications bind failure "Could not access queue:https://testresource.queue.core.windows.net/snowqueue; Message: This request is not authorized to perform this operation using this permission. (Status Code:403; Error Code:AuthorizationPermissionMismatch)"
│
│ with snowflake_pipe.main,
│ on snow-pipe.tf line 5, in resource "snowflake_pipe" "main":
│ 5: resource "snowflake_pipe" "main" {
│

It rarely works on the first apply against an empty environment, but it will then quite often work without modification a few minutes later. Once it has succeeded, it seems to work forevermore; or at least until you destroy your Terraform resources, anyway!

After much head-scratching, six cups of coffee, two naps and a third of a sleeve of McVitie’s Hobnobs, the penny finally dropped.

What appeared to be happening was that the role assignments in Azure were being provisioned (you could go and view them in the Azure Portal), but they were not immediately usable, by Snowflake at least. Waiting a few minutes and trying again meant that the 403 status previously received when provisioning the Snowpipe was gone.

So, what’s the fix?

We need to tell Terraform not to be so eager when creating the Snowpipe, and give Azure a few minutes to propagate the role assignments.

HashiCorp (the folks behind Terraform) have created a handy provider for performing this kind of action, conveniently called time.

 

terraform {
  required_providers {
    time = {
      source  = "hashicorp/time"
      version = "0.7.1"
    }
  }
}

 

And within this provider there is a resource called time_sleep!

Here is how we have used it to get stable, consistent and (so far) error-free Snowpipe creation:

 

// There appears to be a kind of timing issue when creating the necessary storage
// permissions in Azure that the Snowpipe depends on.
// The intention of this timer is to allow sufficient time for the initial resource
// creation to succeed, avoiding transient errors.
resource "time_sleep" "wait_5_minutes" {
  depends_on = [
    azurerm_role_assignment.snowflake_storage,
    azurerm_role_assignment.snowflake_queue
  ]

  create_duration = "5m"
}

resource "snowflake_pipe" "main" {
  ...

  depends_on = [
    time_sleep.wait_5_minutes
  ]
}
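
For reference, a fleshed-out version of that pipe might look something like the sketch below. The database, schema, table, stage and COPY statement are all illustrative assumptions; substitute your own. The integration attribute ties the pipe to the notification integration from the earlier sketch:

resource "snowflake_pipe" "main" {
  // Illustrative database, schema and pipe names; use your own.
  database    = "ANALYTICS"
  schema      = "RAW"
  name        = "TEST_PIPE"
  auto_ingest = true

  // Links the pipe to the notification integration watching the storage queue.
  integration = snowflake_notification_integration.notification.name

  // Assumed target table and external stage.
  copy_statement = "COPY INTO ANALYTICS.RAW.EVENTS FROM @ANALYTICS.RAW.AZURE_STAGE"

  // Give Azure time to propagate the role assignments before Snowflake
  // tries to bind to the queue.
  depends_on = [
    time_sleep.wait_5_minutes
  ]
}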

Some Caveats…

This approach is not ideal, but it is good enough.

Bear in mind that the Terraform documentation for time_sleep does state:

In many cases, this resource should be considered a workaround for issues that should be reported and handled in downstream Terraform Provider logic. Downstream resources can usually introduce or adjust retries in their code to handle time delay issues for all Terraform configurations or upstream resources can be improved to better wait for a resource to be fully ready and available.

We Love Data

Want to know more?

Drop us a line – we’re always happy to chat – we promise we’ll keep the geek speak to a minimum (unless that’s your bag in which case we’ll happily comply)!