Send data to Google Cloud Storage

Google Cloud Storage is an online file storage web service for storing and accessing data on Google Cloud Platform infrastructure.

This topic describes the steps that are required to send files to Google Cloud Storage from Amperity:

  1. Get details

  2. Add destination

  3. Add data template

Get details

Google Cloud Storage requires the following configuration details:

  1. A Google Cloud Storage service account key that is configured for the Storage Object Admin role.

  2. The name of the Google Cloud Storage bucket to which Amperity will send data and its prefix.

  3. The public key to use for PGP encryption.

Filedrop requirements

A Google Cloud Storage location requires the configuration details listed under Get details: a service account key configured for the Storage Object Admin role, the name of the bucket (and prefix) to which Amperity will send data, and a PGP public key for encryption.

Options

The following sections describe additional workflows that are available. After Amperity sends data to Cloud Storage, you can configure downstream applications to consume that data and make it available to additional workflows.

Google BigQuery

Google BigQuery is a fully managed, serverless data warehouse that provides scalable, cost-effective, fast analysis over petabytes of data and supports querying with ANSI SQL.

Amperity can be configured to send data to Google Cloud Storage, after which the data can be transferred to Google BigQuery. Applications can be configured to connect to Google BigQuery and use the Amperity output as a data source.

You must configure Amperity to send data to a Cloud Storage bucket that your organization manages directly.
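
For example, a minimal sketch of that transfer using the google-cloud-bigquery Python client might look like the following; the project, dataset, table, bucket, and prefix names are placeholders, and it assumes Amperity was configured to send Apache Parquet files.

from google.cloud import bigquery

# Placeholder project, dataset, table, bucket, and prefix names; replace with your own.
client = bigquery.Client(project="your-project-id")

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.PARQUET,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

# Load every Parquet file that Amperity sent under the configured prefix.
load_job = client.load_table_from_uri(
    "gs://your-bucket/amperity/*.parquet",
    "your_dataset.amperity_output",
    job_config=job_config,
)
load_job.result()  # Wait for the load job to complete.

print(client.get_table("your_dataset.amperity_output").num_rows, "rows loaded")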

Applications that support Google BigQuery as a data source can then be configured to load the Amperity output from BigQuery.

Dataflow, Pub/Sub

Dataflow is a fully managed service for transforming and enriching data in stream (real-time) and batch modes. It can be configured to use Pub/Sub to stream messages from Cloud Storage.

Note

Google Pub/Sub is a low-latency messaging service that can be configured within Google Cloud to stream data, including real-time data, to Google Cloud Storage.
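
For example, a downstream process might pull Cloud Storage object notifications from a Pub/Sub subscription. The following Python sketch uses the google-cloud-pubsub client with placeholder project and subscription names; it assumes the bucket has been configured to publish object notifications to the subscribed topic.

from concurrent.futures import TimeoutError
from google.cloud import pubsub_v1

# Placeholder project and subscription names; replace with your own.
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("your-project-id", "amperity-gcs-notifications")

def callback(message):
    # Cloud Storage notifications carry the bucket and object names as attributes.
    print("New object:", message.attributes.get("bucketId"), message.attributes.get("objectId"))
    message.ack()

streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
with subscriber:
    try:
        streaming_pull.result(timeout=60)  # Listen for one minute, then stop.
    except TimeoutError:
        streaming_pull.cancel()
        streaming_pull.result()  # Block until the shutdown is complete.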

Service account

A service account must be configured to allow Amperity to send data to the Cloud Storage bucket:

  1. A service account key must be created, and then downloaded for use when configuring Amperity.

  2. The Storage Object Admin role must be assigned to the service account.

Service account key

A service account key must be downloaded so that it can be used to configure the Google Cloud Storage destination in Amperity.

To configure the service account key

  1. Open the Cloud Platform console.

  2. Click IAM & Admin.

  3. Click the name of the project that is associated with the Cloud Storage bucket to which Amperity will send data.

  4. Click Service Accounts, and then select Create Service Account.

  5. In the Name field, give your service account a name. For example, “Amperity GCS Connection”.

  6. In the Description field, enter a description that will remind you of the purpose of the role.

  7. Click Create.

    Important

    Click Continue and skip every step that allows adding additional service account permissions. These permissions will be added directly to the bucket.

  8. From the Service Accounts page, click the name of the service account that was created for Amperity.

  9. Click Add Key, and then select Create new key.

  10. Select the JSON key type, and then click Create.

    The key is downloaded as a JSON file to your local computer. This key is required to connect Amperity to your Cloud Storage bucket. If necessary, provide this key to your Amperity representative using SnapPass.

    SnapPass allows secrets to be shared in a secure, ephemeral way. Input a single or multi-line secret, along with an expiration time, and then generate a one-time use URL that may be shared with anyone. Amperity uses SnapPass for sharing credentials to systems with customers.

Example

{
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "client_email": "<<GCS_BUCKET_NAME>>@<<GCS_PROJECT_ID>>.iam.gserviceaccount.com",
  "client_id": "redacted",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/<<GCS_BUCKET_NAME>>%40<<GCS_PROJECT_ID>>.iam.gserviceaccount.com",
  "private_key_id": "redacted",
  "private_key": "redacted",
  "project_id": "<<GCS_PROJECT_ID>>",
  "token_uri": "https://oauth2.googleapis.com/token",
  "type": "service_account"
}
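
As a quick check before sharing the key, the downloaded file can be loaded with the google-auth Python library; the file name below is a placeholder.

from google.oauth2 import service_account

# Placeholder file name for the downloaded key.
credentials = service_account.Credentials.from_service_account_file("amperity-key.json")

print(credentials.service_account_email)  # Matches the client_email value in the key file.
print(credentials.project_id)             # Matches the project_id value in the key file.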

Service account role

The Storage Object Admin role must be assigned to the service account.

To configure the service account role

  1. Open the Cloud Platform console.

  2. Click Storage, and then Browser.

  3. Click the name of the bucket to which Amperity will send data.

  4. Click the Permissions tab, and then click Add.

  5. Enter the email address of the Cloud Storage service account.

  6. Under Role, choose Storage Object Admin.

    Important

    Amperity requires the Storage Object Admin role to send data to Cloud Storage.

  7. Click Save.
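
To confirm the role assignment, one option is a short write-and-delete round trip with the google-cloud-storage Python client; the key file, bucket, and prefix names below are placeholders.

from google.cloud import storage

# Placeholder key file, bucket, and prefix names; replace with your own.
client = storage.Client.from_service_account_json("amperity-key.json")
bucket = client.bucket("your-bucket")

blob = bucket.blob("amperity/_permission_check.txt")
blob.upload_from_string("ok")  # Requires storage.objects.create (granted by Storage Object Admin).
blob.delete()                  # Requires storage.objects.delete (granted by Storage Object Admin).
print("Service account can write to and delete from the bucket.")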

Add destination

Google Cloud Storage is a destination that may be configured directly from Amperity.

Important

The bucket name must match the value of the <<GCS_BUCKET_NAME>> placeholder shown in the service account key example.

To add a destination

  1. From the Destinations tab, click Add Destination. This opens the Add Destination dialog box.

  2. Enter the name of the destination and a description. For example: “Google Cloud Storage” and “This sends query results to Google Cloud Storage”.

  3. From the Plugin drop-down, select Google Cloud Storage.

  4. The “gcs-service-account-key” credential type is selected automatically.

  5. From the Credential drop-down, select a credential that has already been configured for this destination, or click Create a new credential to open the Create New Credential dialog box. For a new credential, enter a name and the service account key, and then click Save.

    Note

    The service account key is the contents of the JSON file downloaded from Cloud Storage. Open the JSON file in a text editor, select all of the content in the JSON file, copy it, and then paste it into the Service Account Key field.

  6. Under Google Cloud Storage settings, add the prefix you want to use for exported files.

  7. From the File Format drop-down, select Apache Parquet (recommended), CSV, TSV, or PSV.

  8. Add a single character to be used as an escape character in the output file.

    Note

    If an escape character is not specified and quote mode is set to “None”, the output files may contain unescaped, unquoted values. If you do not specify an escape character, select an option other than “None” from the Quote Mode setting.

  9. Specify the encoding method. Encoding method options include “Tar”, “Tgz”, “Zip”, “GZip”, and “None”.

  10. Add the PGP public key that is used to encrypt files sent to Google Cloud Storage.

  11. Set the quote mode.

    Note

    If quote mode is set to “None” and the Escape Character setting is empty, the output files may contain unescaped, unquoted values. If you set quote mode to “None”, specify an escape character.

  12. Optional. Select Include success file upon completion to add a .DONE file to indicate when an orchestration has finished sending data.

    Tip

    If a downstream sensor is listening for files sent from Amperity, configure that sensor to listen for the presence of the .DONE file, as in the polling sketch after these steps.

  13. Optional. Select Include header row in output files to include column headers in the output files.

  14. Select Allow customers to use this destination.

  15. Select Allow orchestrations from users with limited PII access. (A user with limited PII access has been assigned the Restrict PII Access policy option.)

  16. Click Save.
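
The tip in step 12 mentions a downstream sensor. A minimal polling sketch with the google-cloud-storage Python client might look like the following; the key file, bucket, prefix, and .DONE file names are placeholders, and the actual file name depends on how the orchestration is configured.

import time
from google.cloud import storage

# Placeholder key file, bucket, prefix, and file names; replace with your own.
client = storage.Client.from_service_account_json("downstream-key.json")
bucket = client.bucket("your-bucket")
done_marker = bucket.blob("amperity/customer_profiles.csv.DONE")

# Poll once a minute until the success file appears, then hand off to downstream processing.
while not done_marker.exists():
    time.sleep(60)
print("Amperity finished sending data; the exported files are safe to process.")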

Add data template

A data template defines how columns in Amperity data structures are sent to downstream workflows. A data template is part of the configuration for sending query and segment results from Amperity to an external location.

You have two options for setting up data templates for Google Cloud Storage:

  1. For use with campaigns

  2. For use with orchestrations

For use with campaigns

You can configure Amperity to send campaigns to Google Cloud Storage. These results are sent from the Campaigns tab. Results default to a list of email addresses, but you may configure a campaign to send additional attributes to Google Cloud Storage.

To add a data template for campaigns

  1. From the Destinations tab, open the menu for a destination that is configured for Google Cloud Storage, and then select Add data template.

    This opens the Add Data Template dialog box.

  2. Enter the name of the data template and a description. For example: “Google Cloud Storage email list” and “Send email addresses to Google Cloud Storage.”

  3. Enable the Allow customers to use this data template option, and then enable the Make available to campaigns option. This allows users to send campaign results from Amperity to Google Cloud Storage.

  4. Verify all template settings and make any required updates.

  5. Click Save.

For use with orchestrations

You can configure Amperity to send query results to Google Cloud Storage. These results are sent using an orchestration and will include all columns that were specified in the query.

To add a data template for orchestrations

  1. From the Destinations tab, open the menu for a destination that is configured for Google Cloud Storage, and then select Add data template.

    This opens the Add Data Template dialog box.

  2. Enter the name of the data template and a description. For example: “Google Cloud Storage customer profiles” and “Send email addresses and customer profiles to Google Cloud Storage.”

  3. Enable the Allow customers to use this data template option. This allows users to build queries, and then configure orchestrations that send results from Amperity to a configured destination.

  4. Optional. Enable the Allow orchestrations from customers with limited PII access option. This allows users who have been assigned the Restrict PII Access policy option to send results from Amperity.

  5. Verify all template settings and make any required updates.

  6. Click Save.
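
After an orchestration runs, a downstream consumer can read the exported files from the bucket. As a minimal sketch, assuming Apache Parquet output and placeholder names, one approach is to download the file with the google-cloud-storage client and read it with pandas:

import pandas as pd
from google.cloud import storage

# Placeholder key file, bucket, prefix, and file names; replace with your own.
client = storage.Client.from_service_account_json("downstream-key.json")
bucket = client.bucket("your-bucket")
bucket.blob("amperity/customer_profiles.parquet").download_to_filename("customer_profiles.parquet")

df = pd.read_parquet("customer_profiles.parquet")  # Requires pyarrow or fastparquet.
print(df.columns.tolist())  # Includes every column specified in the query.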