Send data to Google Cloud Storage

Note

This topic contains information about configuring a destination that sends query results to Google Cloud Storage using orchestrations. To configure a destination that sends audiences to Google Cloud Storage using campaigns, see this topic.

Google Cloud Storage is an online file storage web service for storing and accessing data on Google Cloud Platform infrastructure.

Amperity can be configured to send Apache Parquet (recommended), CSV, JSON, NDJSON, PSV, or TSV files to any Google Cloud Storage bucket.

Get details

Review the following details before configuring credentials for Google Cloud Storage and before configuring Amperity to send Apache Parquet (recommended), CSV, JSON, NDJSON, PSV, or TSV files to any Google Cloud Storage bucket.

Detail 1.

Google Cloud Storage bucket details

You will need to know the following details about the Google Cloud Storage bucket to which Amperity will send data.

  1. The name of the Google Cloud Storage bucket. An object prefix is sometimes required.

    Important

    The bucket name must match the value of the <<GCS_BUCKET_NAME>> placeholder shown in the service account key example.

Detail 2.

Credential types and settings

A Google Cloud Storage service account key must be configured for the Storage Object Admin role.

Detail 3.

Required configuration settings

File format

Configure Amperity to send Apache Parquet (recommended), CSV, JSON, NDJSON, PSV, or TSV files to any Google Cloud Storage bucket.

Some file formats allow a custom delimiter. Choose the “Custom delimiter” file format, and then add a single character to represent the custom delimiter.

Note

All other Amperity file format settings for Google Cloud Storage are optional.

Configure credentials

Configure credentials for Google Cloud Storage before adding a destination.

An individual with access to Google Cloud Storage should use SnapPass to securely share “gcs-service-account-key” details with the individual who will configure Amperity.

To configure credentials for Google Cloud Storage

Step 1.

From the Settings page, select the Credentials tab, and then click the Add credential button.

Step 2.

In the Credentials settings dialog box, do the following:

From the Plugin dropdown, select Google Cloud Storage.

Assign the credential a name and description that ensure other users of Amperity can recognize when to use this credential.

Step 3.

The settings that are available for a credential are determined by the credential type. For the “gcs-service-account-key” credential type, configure settings, and then click Save.

Bucket name

Required

Buckets are basic containers that hold data in Google Cloud Storage. Use buckets to organize storage locations for your data, and then configure Amperity to send data to that bucket.

Important

The bucket name must match the value of the <<GCS_BUCKET_NAME>> placeholder shown in the service account key example.

Service account key

Required

Google Cloud uses service account key-pairs for authentication. A public service account key is stored in Google Cloud; a private service account key allows applications access to your instance of Google Cloud Storage.

The value of the private service account key is the contents of the JSON file downloaded from Google Cloud after creating the service account key-pair. Open the JSON file in a text editor, select all of the content in the JSON file, copy it, and then paste it into the Service account key field.

About service accounts

A service account must be configured to allow Amperity to send data to the Google Cloud Storage bucket:

  1. A service account key must be created, and then downloaded for use when configuring Amperity.

  2. The Storage Object Admin role must be assigned to the service account.

Service account key

A service account key must be downloaded so that it may be used to configure the destination in Amperity.

To configure the service account key

  1. Open the Google Cloud Platform console.

  2. Click IAM, and then Admin.

  3. Click the name of the project that is associated with the Google Cloud Storage bucket to which Amperity will send data.

  4. Click Service Accounts, and then select Create Service Account.

  5. In the Name field, give your service account a name. For example, “Amperity GCS Connection”.

  6. In the Description field, enter a description that will remind you of the purpose of the service account.

  7. Click Create.

    Important

    Click Continue and skip every step that allows adding additional service account permissions. These permissions will be added directly to the bucket.

  8. From the Service Accounts page, click the name of the service account that was created for Amperity.

  9. Click Add Key, and then select Create new key.

  10. Select the JSON key type, and then click Create.

    The key is downloaded as a JSON file to your local computer. This key is required to connect Amperity to your Google Cloud Storage bucket. If necessary, provide this key to your Amperity representative using SnapPass.

    SnapPass allows secrets to be shared in a secure, ephemeral way. Input a single or multi-line secret, along with an expiration time, and then generate a one-time use URL that may be shared with anyone. Amperity uses SnapPass for sharing credentials to systems with customers.

Example

{
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "client_email": "<<GCS_BUCKET_NAME>>@<<GCS_PROJECT_ID>>.iam.gserviceaccount.com",
  "client_id": "redacted",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/<<GCS_BUCKET_NAME>>%40<<GCS_PROJECT_ID>>.iam.gserviceaccount.com",
  "private_key_id": "redacted",
  "private_key": "redacted",
  "project_id": "<<GCS_PROJECT_ID>>",
  "token_uri": "https://oauth2.googleapis.com/token",
  "type": "service_account"
}
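Before pasting the key into the Service account key field, it can help to confirm that the copied text is complete. The following is a minimal sketch, not part of Amperity, that checks pasted text against the fields shown in the example above. The function name `check_service_account_key` is hypothetical, and the field list is an assumption taken from this example only.

```python
import json

# Field names taken from the example key above (an assumption, not an
# official schema).
EXPECTED_FIELDS = {
    "auth_provider_x509_cert_url",
    "auth_uri",
    "client_email",
    "client_id",
    "client_x509_cert_url",
    "private_key_id",
    "private_key",
    "project_id",
    "token_uri",
    "type",
}


def check_service_account_key(raw_text):
    """Return a list of problems found in the pasted key text (empty if none)."""
    try:
        key = json.loads(raw_text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    # Report every expected field that is absent from the pasted key.
    problems = [f"missing field: {name}" for name in sorted(EXPECTED_FIELDS - set(key))]
    if key.get("type") != "service_account":
        problems.append('"type" is not "service_account"')
    return problems
```

An empty list suggests that the full contents of the JSON file were copied. It does not confirm that the key is valid in Google Cloud.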

Service account role

The Storage Object Admin role must be assigned to the service account.

To configure the service account role

  1. Open the Google Cloud Platform console.

  2. Click Storage, and then Browser.

  3. Click the name of the bucket to which Amperity will send data.

  4. Click the Permissions tab, and then click Add.

  5. Enter the email address of the Google Cloud Storage service account.

  6. Under Role, choose Storage Object Admin.

    Important

    Amperity requires the Storage Object Admin role for the service account that is used to send data to Google Cloud Storage.

  7. Click Save.
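The steps above result in a role binding on the bucket. As a rough illustration, the sketch below expresses that binding as data in the shape used by Cloud IAM policies, and checks whether a list of bindings already grants the role. The role ID `roles/storage.objectAdmin` corresponds to Storage Object Admin. The helper names and the example email address are hypothetical.

```python
# Role ID for the Storage Object Admin role.
STORAGE_OBJECT_ADMIN = "roles/storage.objectAdmin"


def bucket_binding(service_account_email):
    """Build the binding that grants Storage Object Admin to a service account."""
    return {
        "role": STORAGE_OBJECT_ADMIN,
        "members": [f"serviceAccount:{service_account_email}"],
    }


def has_object_admin(bindings, service_account_email):
    """Return True if any binding grants Storage Object Admin to the account."""
    member = f"serviceAccount:{service_account_email}"
    return any(
        b.get("role") == STORAGE_OBJECT_ADMIN and member in b.get("members", [])
        for b in bindings
    )


# Hypothetical service account email, for illustration only.
email = "amperity-gcs@example-project.iam.gserviceaccount.com"
policy_bindings = [bucket_binding(email)]
granted = has_object_admin(policy_bindings, email)
```

A check like this can be run against the bindings of a bucket policy exported from Google Cloud to confirm that the role was saved for the expected account.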

Add destination

Use a sandbox to configure a destination for Google Cloud Storage. Before promoting your changes, send a test audience, and then verify the results in Google Cloud Storage. After the end-to-end workflow has been verified, push the destination from the sandbox to production.

To add a destination for Google Cloud Storage

Step 1.

Open the Destinations page, and then click the Add destination button.

To configure a destination for Google Cloud Storage, do one of the following:

  1. Click the row in which Google Cloud Storage is located. Destinations are listed alphabetically and you can scroll up and down the list.

  2. Search for Google Cloud Storage. Start typing “google”. The list will filter to show only matching destinations. Select “Google Cloud Storage”.

Step 2.

Select the credential for Google Cloud Storage from the Credential drop-down, and then click Continue.

Tip

Click the “Test connection” link on the “Configure destination” page to verify that Amperity can connect to Google Cloud Storage.

Step 3.

In the “Destination settings” dialog box, assign the destination a name and description that ensure other users of Amperity can recognize when to use this destination.

Configure business user access

By default a destination is available to all users who have permission to view personally identifiable information (PII).

Enable the Admin only checkbox to restrict access to only users assigned to the Datagrid Operator and Datagrid Administrator policies.

Enable the PII setting checkbox to allow users with limited access to PII to use this destination.

Restricted PII access is enabled by the Restrict PII access policy option, which prevents users who are assigned to that option from viewing data that is marked as PII anywhere in Amperity and from sending that data to any downstream workflow.

Step 4.

Configure the following settings, and then click “Save”.

Compression

The compression format to apply to the file. May be one of “GZIP”, “None”, “TAR”, “TGZ”, or “ZIP”.

Escape character

The escape character to use in the file output. Applies to CSV, TSV, PSV, and custom delimiter file types.

When an escape character is not specified and the quote mode is “None”, files may be sent with unescaped and unquoted data. When an escape character is not specified, select a quote mode other than “None”.

File format

Required

Configure Amperity to send Apache Parquet (recommended), CSV, JSON, NDJSON, PSV, or TSV files to any Google Cloud Storage bucket.

Some file formats allow a custom delimiter. Choose the “Custom delimiter” file format, and then add a single character to represent the custom delimiter.

Apache Parquet files only

The extension for Apache Parquet files may be excluded from the directory name.

Filename template

A filename template defines the naming pattern for files that are sent from Amperity. Specify the name of the file, and then use Jinja-style string formatting to append a date or timestamp to the filename.
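As a loose illustration of the idea, the sketch below replaces `{{ name }}` placeholders in a filename with supplied values. It is not Amperity's template engine. The placeholder name `today`, the template string, and the function `render_filename` are all hypothetical, and the variables and filters that Amperity supports are not listed in this topic.

```python
import re
from datetime import date

# Matches placeholders of the form {{ name }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_filename(template, values):
    """Replace each {{ name }} placeholder with its value from `values`."""
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)


# Hypothetical template and placeholder name, for illustration only.
name = render_filename(
    "customers_{{ today }}.csv",
    {"today": date(2024, 1, 31).isoformat()},
)
# name is "customers_2024-01-31.csv"
```

Appending a date or timestamp this way keeps each delivered file distinct, so a later run does not overwrite an earlier one.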

Header

Enable to include header rows in output files.

Object prefix

Required. The prefix for the name of the cloud storage object for your instance of Google Cloud Storage.

PGP public key

The PGP public key that Amperity will use to encrypt files.

Quote mode

The quote mode to use within the file. May be one of “all fields”, “all non-NULL fields”, “fields with special characters only”, “all non-numeric fields” or “None”.

Unescaped, unquoted files may occur when quote mode is set to “None” and an escape character is not specified.
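The interaction between quote mode and escape character can be seen with any CSV writer. The sketch below uses Python's standard `csv` module as a stand-in, which is an assumption made for illustration and not how Amperity writes files. The mapping is approximate: `csv.QUOTE_ALL` resembles “all fields”, `csv.QUOTE_MINIMAL` resembles “fields with special characters only”, and `csv.QUOTE_NONE` resembles “None”.

```python
import csv
import io


def render(rows, quoting, escapechar=None):
    """Write rows to a string using the given quote mode and escape character."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer, quoting=quoting, escapechar=escapechar, lineterminator="\n"
    )
    writer.writerows(rows)
    return buffer.getvalue()


# The second row contains the delimiter inside a field value.
rows = [["id", "name"], [1, "Smith, Jane"]]

all_fields = render(rows, csv.QUOTE_ALL)
special_only = render(rows, csv.QUOTE_MINIMAL)
none_with_escape = render(rows, csv.QUOTE_NONE, escapechar="\\")

# With no quoting and no escape character, the embedded comma cannot be
# represented safely. Python's writer raises an error in this case.
try:
    render(rows, csv.QUOTE_NONE)
    none_without_escape = "written"
except csv.Error:
    none_without_escape = "rejected"
```

Python refuses to write the ambiguous row, whereas this topic states that Amperity may send the file with unescaped, unquoted data. In either case, a downstream parser would read “Smith, Jane” as two fields, which is why a quote mode other than “None” or an escape character is recommended.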

Success file

Enable to send a “.DONE” file when Amperity has finished sending data.

If a downstream sensor is listening for files sent from Amperity, configure that sensor to listen for the presence of the “.DONE” file.
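A downstream sensor can treat the “.DONE” file as a signal that the other files are complete. The sketch below shows that check against a plain list of object names, leaving out the call that lists objects in the bucket. The function name, the object names, and the assumption that the marker's name ends with “.DONE” are hypothetical.

```python
def ready_files(object_names, prefix):
    """Return data files under `prefix` if a .DONE marker is present, else []."""
    under_prefix = [n for n in object_names if n.startswith(prefix)]
    # Without a marker, the delivery may still be in progress.
    if not any(n.endswith(".DONE") for n in under_prefix):
        return []
    return sorted(n for n in under_prefix if not n.endswith(".DONE"))


# Hypothetical object names, for illustration only.
in_progress = ready_files(["exports/a.csv", "exports/b.csv"], "exports/")
complete = ready_files(
    ["exports/b.csv", "exports/a.csv", "exports/run.DONE"], "exports/"
)
# in_progress is []
# complete is ["exports/a.csv", "exports/b.csv"]
```

Waiting for the marker avoids reading a file that is still being written.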

Use Zip64?

Enable to apply Zip64 data compression to very large files.

Row Number

Select to include a row number column in the output file. Applies to CSV, TSV, PSV, and custom delimiter file types.

If Row Number is enabled you may use the Column name setting to specify the name of the row number column in the output file. The name of this column must be less than 1028 characters and may only contain numbers, letters, underscores, and hyphens. Default value: “row_number”.
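The naming rules for the row number column can be checked before saving the destination. The sketch below is one possible check, assuming that “letters” means ASCII letters, which this topic does not state. The function name is hypothetical.

```python
import re

# Numbers, letters, underscores, and hyphens only (ASCII is an assumption).
VALID_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_row_number_column(name):
    """Return True if `name` satisfies the length and character rules above."""
    return len(name) < 1028 and VALID_NAME.fullmatch(name) is not None
```

For example, the default value “row_number” passes, while a name that contains a space, such as “row number”, does not.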

Step 5.

After this destination is configured, users may configure Amperity to send the following to Google Cloud Storage:

  • Query results, using orchestrations

  • Audiences, using orchestrations and campaigns

  • Offline events, using orchestrations and campaigns