Send data to Amazon S3

Amazon Simple Storage Service (Amazon S3) stores customer data files of any size in many file formats.

This topic describes the steps that are required to send files to Amazon S3 from Amperity:

  1. Get details

  2. Add destination

  3. Add data template

Get details

Amperity can be configured to send data to Amazon S3. This may be done using cross-account role assumption (recommended) or by using IAM credentials.

Use cross-account roles

Amperity prefers to send data to customer-managed cloud storage. This approach ensures that customers can:

  • Use cross-account role assumption to manage access to data

  • Directly manage the files that are made available

  • Modify access without requiring involvement by Amperity; access may be revoked by either AWS account at any time, after which data sharing ends immediately

  • Directly troubleshoot incomplete or missing files

Amperity recommends using cross-account role assumption to manage access to customer-managed cloud storage in Amazon S3. This allows managed security policies to control access to data.

Note

If you have already configured cross-account role assumption for an Amazon S3 data source you may use the “Amperity S3 Cross-Account Storage” credential for this destination. If you have not configured cross-account role assumption, ask your Amperity representative to help you with those configuration steps.
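Cross-account role assumption works by attaching a trust policy to a customer-managed IAM role that allows Amperity's AWS account to assume it. The sketch below builds such a trust policy document; the account ID and external ID are hypothetical placeholders, and the actual values come from your Amperity representative during configuration:

```python
import json

# Hypothetical values: your Amperity representative provides the actual
# account ID and external ID during cross-account configuration.
AMPERITY_ACCOUNT_ID = "123456789012"
EXTERNAL_ID = "example-external-id"

# Trust policy attached to the customer-managed role that Amperity assumes.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{AMPERITY_ACCOUNT_ID}:root"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": EXTERNAL_ID}},
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```

The external ID condition is a standard AWS safeguard against the confused-deputy problem; revoking access is as simple as removing this trust policy from the role.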

Use credentials

Amazon S3 requires the following configuration details:

  1. The IAM access key.

  2. The IAM secret key.

  3. The Amazon Resource Name (ARN) for a role with cross-account access. (This is the recommended way to define access to customer-managed Amazon S3 buckets.)

  4. The name of the Amazon S3 bucket to which Amperity will send data, along with the object prefix.

  5. The public key to use for PGP encryption.
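Collected together, the configuration details above might look like the following sketch. Every value is a hypothetical placeholder, not a real credential; the bucket and prefix combine into the S3 location where files will land:

```python
# Hypothetical placeholder values; none of these are real credentials.
s3_destination = {
    "iam_access_key": "<IAM-ACCESS-KEY>",
    "iam_secret_key": "<IAM-SECRET-KEY>",
    "role_arn": "arn:aws:iam::123456789012:role/amperity-delivery",  # hypothetical
    "bucket": "acme-amperity-output",  # hypothetical
    "prefix": "upload",
    "pgp_public_key": "<PGP-PUBLIC-KEY-BLOCK>",
}

# Files land under s3://<bucket>/<prefix>/
destination_uri = f"s3://{s3_destination['bucket']}/{s3_destination['prefix']}/"
print(destination_uri)  # s3://acme-amperity-output/upload/
```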

Optional workflows

The following sections describe additional workflows that are available. After Amperity sends data to an Amazon S3 bucket, you can configure downstream applications to consume that data and make it available to additional workflows.

Amazon Redshift

Amazon Redshift is a data warehouse within Amazon Web Services that can handle massive sets of column-oriented data.

Amperity can be configured to send data to an Amazon S3 bucket, after which Amazon Redshift can be configured to load that data. Applications can be configured to connect to Amazon Redshift and use the Amperity output as a data source.

You may use the Amazon S3 bucket that comes with your Amperity tenant for the intermediate step (if your tenant runs on AWS), or you may configure Amperity to send data to an Amazon S3 bucket that your organization manages directly.

A variety of downstream applications can be configured to load data from Amazon Redshift as a data source.

AWS Lambda

AWS Lambda runs code for any type of application or backend service. It can be configured to run automatically from within Amazon Web Services to support downstream workflows.

For example, you can configure Amperity to send CSV data to Amazon S3, after which AWS Lambda processes that output and loads it into Amazon Redshift using the COPY command.
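A minimal sketch of that Lambda function is shown below. It parses the standard S3 ObjectCreated event and builds the Redshift COPY statement; the role ARN and table name are hypothetical, and actually executing the statement (for example, with the Amazon Redshift Data API) is omitted:

```python
# Hypothetical role ARN; substitute the role your Redshift cluster uses
# to read from Amazon S3.
ROLE_ARN = "arn:aws:iam::123456789012:role/redshift-s3-load"


def build_copy_statement(table, bucket, key, iam_role):
    """Build a Redshift COPY statement for a CSV file stored in Amazon S3."""
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV IGNOREHEADER 1;"
    )


def lambda_handler(event, context):
    """Handle an S3 ObjectCreated event and return the COPY statement.

    Executing the statement against the cluster is intentionally left out
    of this sketch.
    """
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]
    return build_copy_statement("public.customer_profiles", bucket, key, ROLE_ARN)
```

IGNOREHEADER 1 assumes the destination was configured to include a header row in output files; drop it otherwise.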

Add destination

Amazon S3 is a destination that may be configured directly from Amperity.

To add a destination

  1. From the Destinations tab, click Add Destination. This opens the Add Destination dialog box.

  2. Enter the name of the destination and a description. For example: “Amazon S3” and “This sends query results to Amazon S3”.

  3. From the Plugin drop-down, select Amazon S3.

  4. The “iam-credential” credential type is selected automatically.

  5. From the Credential drop-down, select a credential that has already been configured for this destination or click Create a new credential, which opens the Create New Credential dialog box. For new credentials, enter a name for the credential, the IAM access key, and the IAM secret key. Click Save.

  6. Under Amazon S3 settings, add the name of the Amazon S3 bucket and the prefix. For example: “Amazon S3” and “upload”.

  7. From the File Format drop-down, select Apache Parquet (recommended), CSV, TSV, or PSV.

  8. Add a single character to be used as an escape character in the output file.

    Note

    If an escape character is not specified and quote mode is set to “None”, the output files may be unescaped and unquoted. When an escape character is not specified, select a non-“None” option for the Quote Mode setting.

  9. Specify the encoding method: “Tar”, “Tgz”, “Zip”, “GZip”, or “None”.

  10. Add the PGP public key that is used to encrypt files sent to Amazon S3.

  11. Set the quote mode.

    Note

    If quote mode is set to “None” and the Escape Character setting is empty, the output files may be unescaped and unquoted. When quote mode is set to “None”, specify an escape character.

  12. Optional. Select Include success file upon completion to add a .DONE file to indicate when an orchestration has finished sending data.

    Tip

    If a downstream sensor is listening for files sent from Amperity, configure that sensor to listen for the presence of the .DONE file.

  13. Optional. Select Include header row in output files to include column headers in the output.

  14. Select Allow customers to use this destination.

  15. Select Allow orchestrations from users with limited PII access. (A user with limited PII access has been assigned the Restrict PII Access policy option.)

  16. Click Save.
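The interaction between quote mode and the escape character (steps 8 and 11) can be illustrated with Python's csv module. This is not Amperity's implementation, only an analogous sketch: with no quoting and no escape character, a field containing the delimiter cannot be written safely, while supplying an escape character (or a non-“None” quote mode) resolves it:

```python
import csv
import io

row = ["Bonnie", "Portland, OR"]  # second field contains the delimiter

# Quote mode "None" with no escape character: the writer cannot produce
# a safe output file and raises an error.
buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_NONE)
try:
    writer.writerow(row)
except csv.Error as exc:
    print(f"error: {exc}")

# Supplying an escape character fixes it.
buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_NONE, escapechar="\\")
writer.writerow(row)
print(buf.getvalue().strip())  # Bonnie,Portland\, OR
```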

Add data template

A data template defines how columns in Amperity data structures are sent to downstream workflows. A data template is part of the configuration for sending query and segment results from Amperity to an external location.

You have two options for setting up data templates for Amazon S3:

  1. For use with campaigns

  2. For use with orchestrations

For campaigns

You can configure Amperity to send campaigns to Amazon S3. These results are sent from the Campaigns tab. Results default to a list of email addresses, but you may configure a campaign to send additional attributes to Amazon S3.

To add a data template for campaigns

  1. From the Destinations tab, open the menu for a destination that is configured for Amazon S3, and then select Add data template.

    This opens the Add Data Template dialog box.

  2. Enter the name of the data template and a description. For example: “Amazon S3 email list” and “Send email addresses to Amazon S3.”

  3. Enable the Allow customers to use this data template option, and then enable the Make available to campaigns option. This allows users to send campaign results from Amperity to Amazon S3.

  4. Verify all template settings and make any required updates.

  5. Click Save.

For orchestrations

You can configure Amperity to send query results to Amazon S3. These results are sent using an orchestration and will include all columns that were specified in the query.

To add a data template for orchestrations

  1. From the Destinations tab, open the menu for a destination that is configured for Amazon S3, and then select Add data template.

    This opens the Add Data Template dialog box.

  2. Enter the name of the data template and a description. For example: “Amazon S3 customer profiles” and “Send email addresses and customer profiles to Amazon S3.”

  3. Enable the Allow customers to use this data template option. This allows users to build queries, and then configure orchestrations that send results from Amperity to a configured destination.

  4. Optional. Enable the Allow orchestrations from customers with limited PII access option. This allows users who have been assigned the Restrict PII Access policy option to send results from Amperity.

  5. Verify all template settings and make any required updates.

  6. Click Save.