Send data to Amazon S3

Note

This topic contains information about configuring a destination that sends query results to Amazon S3 using orchestrations. To configure a destination that sends audiences to Amazon S3 using campaigns, see this topic.

Amazon Simple Storage Service (Amazon S3) can store data files of any size for any file format that is supported by Amperity.

Amperity can be configured to send Apache Parquet (recommended), CSV, JSON, NDJSON, PSV, or TSV files to any Amazon S3 bucket.

Get details

Review the following details before configuring credentials for Amazon S3 and before configuring Amperity to send Apache Parquet (recommended), CSV, JSON, NDJSON, PSV, or TSV files to any Amazon S3 bucket.

Detail 1.

Amazon S3 bucket details

You will need to know the following details about the Amazon S3 bucket to which Amperity will send data.

  1. The name of the Amazon S3 bucket. An S3 prefix is sometimes required.

Detail 2.

Credential types and settings

Amperity supports the following credential types for Amazon S3:

  1. IAM role-to-role (recommended)

    For cross-account role assumption you will need the value for the IAM role ARN that allows Amperity to add data to an Amazon S3 bucket that is managed by your brand.

    The values for the Amperity Role ARN and the External ID fields are provided by Amperity.

  2. IAM credentials

Detail 3.

Required configuration settings

File format

Configure Amperity to send Apache Parquet (recommended), CSV, JSON, NDJSON, PSV, or TSV files to any Amazon S3 bucket.

Some file formats allow a custom delimiter. Choose the “Custom delimiter” file format, and then add a single character to represent the custom delimiter.

Note

All other Amperity file format settings for Amazon S3 are optional.

Configure credentials

Configure credentials for Amazon S3 before adding a destination.

Amperity supports the following credential types for Amazon S3:

  1. IAM role-to-role (recommended)

  2. IAM credentials

An individual with access to Amazon S3 should use SnapPass to securely share “iam-credential” or “iam-role-to-role” details with the individual who will configure Amperity.

IAM role-to-role

Amperity prefers to pull data from and send data to customer-managed cloud storage.

Amperity recommends using cross-account role assumption to manage access to Amazon S3. This ensures that your brand manages the security policies that control access to your data.

Using cross-account role assumption helps ensure that customers can:

  • Directly manage the IAM policies that control access to data

  • Directly manage the files that are available within the Amazon S3 bucket

  • Modify access without requiring involvement by Amperity; access may be revoked at any time by either Amazon AWS account, after which data sharing ends immediately

  • Directly troubleshoot incomplete or missing files

Note

After setting up cross-account role assumption, a list of files (by filename and file type), along with any sample files, must be made available to allow for feed creation. These files may be placed directly into the shared location after cross-account role assumption is configured.

Can I use an Amazon AWS Access Point?

Yes, but with the following limitations:

  1. The direction of access is Amperity accessing files that are located in a customer-managed Amazon S3 bucket

  2. A credential-free role-to-role access pattern is used

  3. Traffic is not restricted to VPC-only

To configure an S3 bucket for cross-account role assumption

The following steps describe how to configure Amperity to use cross-account role assumption to pull data from (or push data to) a customer-managed Amazon S3 bucket.

Important

These steps require configuration changes to customer-managed Amazon AWS accounts and must be done by users with administrative access.

Step 1.

From the Settings page, select the Credentials tab, and then click the Add credential button.

Step 2.

In the Credentials settings dialog box, do the following:

From the Plugin drop-down, select Amazon S3.

Assign the credential a name and description that ensures other users of Amperity can recognize when to use this credential.

From the Credential type drop-down, select iam-role-to-role.

Step 3.

The settings that are available for a credential are determined by the credential type. For the iam-role-to-role credential type, configure the following settings, and then click Save.

You must provide the values for the Target Role ARN and S3 Bucket Name fields. Enter the target role ARN (Amazon Resource Name) for the IAM role that Amperity will use to access the customer-managed Amazon S3 bucket, and then enter the name of the Amazon S3 bucket.

Amazon S3 bucket name

Required

The name of the Amazon S3 bucket.

Note

The complete trust policy is available from a link at the bottom of the credential configuration page.

Target role ARN

Required

The IAM role ARN (Amazon Resource Name) that is used by Amperity to access a customer-managed Amazon S3 bucket.

The values for the Amperity Role ARN and External ID fields – the Amazon Resource Name (ARN) for your Amperity tenant and its external ID – are provided automatically.

Amperity role ARN

Provided by Amperity

The intermediate IAM role ARN (Amazon Resource Name) that is used to assume the target role. Amperity provides this value.

External ID

Provided by Amperity

The external ID that is used to assume the target IAM role.

An external ID is an alphanumeric string of 2 to 1224 characters (without spaces) and may include the following symbols: plus (+), equal (=), comma (,), period (.), at (@), colon (:), forward slash (/), and hyphen (-).
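The format rule above can be checked before a value is pasted into a policy. The following is a minimal sketch that encodes only the rule as stated in this topic. The function name is hypothetical, and this check is not part of Amperity or of any Amazon AWS SDK.

```python
import re

# Built from the rule stated above: letters, digits, and the symbols
# + = , . @ : / - with a total length of 2 to 1224 and no spaces.
EXTERNAL_ID_PATTERN = re.compile(r"[A-Za-z0-9+=,.@:/-]{2,1224}")


def is_valid_external_id(value: str) -> bool:
    """Return True when the value matches the external ID rule above."""
    return EXTERNAL_ID_PATTERN.fullmatch(value) is not None


print(is_valid_external_id("01234567890123456789"))  # True
print(is_valid_external_id("has space"))  # False
```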

Step 4.

Review the following sample policy, and then add a similar policy that allows Amperity access to the customer-managed Amazon S3 bucket. Add this policy as a trust policy to the IAM role that is used to manage access to the customer-managed Amazon S3 bucket.

The policy for the customer-managed Amazon S3 bucket is unique, but will be similar to:

{
  "Statement": [
    {
      "Sid": "AllowAmperityAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::account:role/resource"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "01234567890123456789"
        }
      }
    }
  ]
}
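If the policy is generated by a script rather than edited by hand, a minimal sketch might look like the following. It only reproduces the shape of the sample policy above. The function name is hypothetical, and the ARN and external ID shown are the placeholders from the sample, so substitute the Amperity Role ARN and External ID values that are shown on the credential page.

```python
import json


def build_trust_policy(amperity_role_arn: str, external_id: str) -> dict:
    """Build a trust policy with the same shape as the sample above."""
    return {
        "Statement": [
            {
                "Sid": "AllowAmperityAccess",
                "Effect": "Allow",
                "Principal": {"AWS": amperity_role_arn},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"sts:ExternalId": external_id}
                },
            }
        ]
    }


# Placeholder values copied from the sample policy, not real identifiers.
policy = build_trust_policy(
    "arn:aws:iam::account:role/resource",
    "01234567890123456789",
)
print(json.dumps(policy, indent=2))
```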

The value for the role ARN is similar to:

arn:aws:iam::123456789012:role/prod/amperity-plugin

Step 5.

Click Continue to test the configuration (and validate the connection) to the customer-managed Amazon S3 bucket, after which you will be able to continue the steps for adding a destination.

IAM credentials

IAM credentials require an access key, which is in two parts:

  1. An access key ID

  2. A secret access key

Both parts are required to authenticate requests to Amazon AWS resources.

To configure an S3 bucket for IAM credentials

Step 1.

From the Settings page, select the Credentials tab, and then click the Add credential button.

Step 2.

In the Credentials settings dialog box, do the following:

From the Plugin drop-down, select Amazon S3.

Assign the credential a name and description that ensures other users of Amperity can recognize when to use this credential.

From the Credential type drop-down, select iam-credential.

Step 3.

The settings that are available for a credential are determined by the credential type. For the iam-credential credential type, configure the following settings, and then click Save.

IAM access key

Required

The IAM access key is one part (of two) that allows Amperity to authenticate to an Amazon S3 bucket. The value for this part of the access key is the access key ID. For example: “AKIAIOSFODNN7EXAMPLE”.

IAM secret key

Required

The IAM secret key is one part (of two) that allows Amperity to authenticate to an Amazon S3 bucket. The value for this part of the access key is the secret access key. For example: “wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY”.

IAM role ARN

The IAM role ARN (Amazon Resource Name) that is used by Amperity to access a customer-managed Amazon S3 bucket.

Amazon S3 bucket name

Required

The name of the Amazon S3 bucket.

Add destination

Use a sandbox to configure a destination for Amazon S3. Before promoting your changes, send a test audience, and then verify the results in Amazon S3. After the end-to-end workflow has been verified, push the destination from the sandbox to production.

To add a destination for Amazon S3

Step 1.

Open the Destinations page, and then click the Add destination button.

To configure a destination for Amazon S3, do one of the following:

  1. Click the row in which Amazon S3 is located. Destinations are listed alphabetically and you can scroll up and down the list.

  2. Search for Amazon S3. Start typing “amaz”. The list will filter to show only matching destinations. Select “Amazon S3”.

Step 2.

Select the credential for Amazon S3 from the Credential drop-down, and then click Continue.

Tip

Click the “Test connection” link on the “Configure destination” page to verify that Amperity can connect to Amazon S3.

Step 3.

In the “Destination settings” dialog box, assign the destination a name and description that ensures other users of Amperity can recognize when to use this destination.

Configure business user access

By default a destination is available to all users who have permission to view personally identifiable information (PII).

Enable the Admin only checkbox to restrict access to only users assigned to the Datagrid Operator and Datagrid Administrator policies.

Enable the PII setting checkbox to allow users with limited access to PII to access this destination.

Restricted PII access is enabled by the Restrict PII access policy option, which prevents users who are assigned to that option from viewing data that is marked as PII anywhere in Amperity and from sending that data to any downstream workflow.

Step 4.

Configure the following settings, and then click “Save”.

Compression

The compression format to apply to the file. May be one of “GZIP”, “None”, “TAR”, “TGZ”, or “ZIP”.

Escape character

The escape character to use in the file output. Applies to CSV, TSV, PSV, and custom delimiter file types.

When an escape character is not specified and the quote mode is “None”, files may be sent with unescaped and unquoted data. When an escape character is not specified, select a quote mode other than “None”.

File format

Required

Configure Amperity to send Apache Parquet (recommended), CSV, JSON, NDJSON, PSV, or TSV files to any Amazon S3 bucket.

Some file formats allow a custom delimiter. Choose the “Custom delimiter” file format, and then add a single character to represent the custom delimiter.
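To show what a single-character custom delimiter means for the resulting file, the following minimal sketch writes two rows with Python's standard csv module. The function name and sample rows are hypothetical, and the sketch does not describe how Amperity itself produces files.

```python
import csv
import io


def write_delimited(rows, delimiter):
    """Write rows as delimited text using a single-character delimiter."""
    if len(delimiter) != 1:
        raise ValueError("custom delimiter must be a single character")
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample rows, delimited with a semicolon.
output = write_delimited([["id", "email"], ["1", "a@example.com"]], ";")
print(output)
```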

Apache Parquet files only

The extension for Apache Parquet files may be excluded from the directory name.

Filename template

A filename template defines the naming pattern for files that are sent from Amperity. Specify the name of the file, and then use Jinja-style string formatting to append a date or timestamp to the filename.
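The idea of appending a date to a filename can be sketched as follows. This sketch uses a hypothetical "{{ date }}" placeholder and a plain string replacement. It does not reproduce the template syntax that Amperity accepts, so check the filename template reference for the supported expressions.

```python
from datetime import date


def render_filename(template: str, run_date: date) -> str:
    """Replace a hypothetical {{ date }} placeholder with an ISO date."""
    # The placeholder name is an assumption made for this illustration.
    return template.replace("{{ date }}", run_date.strftime("%Y-%m-%d"))


filename = render_filename("query_results_{{ date }}.csv", date(2024, 1, 15))
print(filename)  # query_results_2024-01-15.csv
```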

Header

Enable to include header rows in output files.

PGP public key

The PGP public key that Amperity will use to encrypt files.

Quote mode

The quote mode to use within the file. May be one of “all fields”, “all non-NULL fields”, “fields with special characters only”, “all non-numeric fields”, or “None”.

Unescaped, unquoted files may occur when quote mode is set to “None” and an escape character is not specified.
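The interaction between quote mode and escape character can be illustrated with Python's standard csv module. In this minimal sketch, the csv quoting constants stand in for the quote modes listed above. That mapping is an assumption made for illustration, and the sample rows are hypothetical. Unlike the behavior described above, Python's writer raises an error instead of emitting an unquoted, unescaped field, which makes the risk easy to see.

```python
import csv
import io


def render(rows, quoting, escapechar=None):
    """Write rows as CSV text with the given quoting and escape settings."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer, quoting=quoting, escapechar=escapechar, lineterminator="\n"
    )
    writer.writerows(rows)
    return buffer.getvalue()


# The second row contains the delimiter inside a field.
rows = [["id", "note"], [1, "red, white"]]

all_fields = render(rows, csv.QUOTE_ALL)        # similar to "all fields"
special_only = render(rows, csv.QUOTE_MINIMAL)  # similar to "fields with special characters only"
escaped = render(rows, csv.QUOTE_NONE, escapechar="\\")  # "None" plus an escape character

try:
    render(rows, csv.QUOTE_NONE)  # "None" with no escape character
    unquoted_error = None
except csv.Error as error:
    unquoted_error = str(error)

print(all_fields, special_only, escaped, unquoted_error, sep="\n")
```

A downstream parser reading a file with neither quoting nor escaping would see three fields in the second row instead of two.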

S3 prefix

Required. The S3 prefix is a string that is used to filter results to include only objects whose names begin with this prefix. When this value is set, the names of objects that may be returned in the response are relative to the root of the bucket.
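Prefix matching is a plain "starts with" comparison on object names. The following minimal sketch applies that comparison to a list of hypothetical object keys, with no call to Amazon S3.

```python
def filter_by_prefix(object_keys, prefix):
    """Return only the object keys whose names begin with the prefix."""
    return [key for key in object_keys if key.startswith(prefix)]


# Hypothetical object keys, relative to the root of the bucket.
keys = [
    "amperity/exports/results.csv",
    "amperity/archive/old.csv",
    "other/results.csv",
]
matches = filter_by_prefix(keys, "amperity/exports/")
print(matches)  # ['amperity/exports/results.csv']
```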

Success file

Enable to send a “.DONE” file when Amperity has finished sending data.

If a downstream sensor is listening for files sent from Amperity, configure that sensor to listen for the presence of the “.DONE” file.
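A downstream sensor of this kind can be sketched as a check for a “.DONE” object under the destination prefix. In this minimal sketch the function name and object keys are hypothetical. Only the “.DONE” suffix comes from the description above, and the exact name of the success file is an assumption.

```python
def batch_is_complete(object_keys, prefix):
    """Return True when a key ending in .DONE exists under the prefix."""
    return any(
        key.startswith(prefix) and key.endswith(".DONE")
        for key in object_keys
    )


# Hypothetical listings taken before and after Amperity finishes sending.
in_progress = ["exports/results.csv"]
finished = ["exports/results.csv", "exports/results.DONE"]

print(batch_is_complete(in_progress, "exports/"))  # False
print(batch_is_complete(finished, "exports/"))  # True
```

Waiting for the success file avoids reading a data file that is still being written.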

Use Zip64?

Enable to apply Zip64 data compression to very large files.

Row Number

Select to include a row number column in the output file. Applies to CSV, TSV, PSV, and custom delimiter file types.

If Row Number is enabled you may use the Column name setting to specify the name of the row number column in the output file. The name of this column must be less than 1028 characters and may only contain numbers, letters, underscores, and hyphens. Default value: “row_number”.
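The naming rule for the row number column can be checked with a short validator. This minimal sketch encodes only the rule stated above (fewer than 1028 characters, with only numbers, letters, underscores, and hyphens). The function name is hypothetical.

```python
import re

# Only numbers, letters, underscores, and hyphens are allowed.
COLUMN_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")
MAX_LENGTH_EXCLUSIVE = 1028  # the name must be shorter than this


def is_valid_row_number_column(name: str) -> bool:
    """Return True when the name follows the rule stated above."""
    return (
        len(name) < MAX_LENGTH_EXCLUSIVE
        and COLUMN_NAME_PATTERN.fullmatch(name) is not None
    )


print(is_valid_row_number_column("row_number"))  # True
print(is_valid_row_number_column("row number"))  # False
```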

Step 5.

After this destination is configured, users may configure Amperity to send the following to Amazon S3:

  • Query results, using orchestrations

  • Audiences, using orchestrations and campaigns

  • Offline events, using orchestrations and campaigns