Send data to Amazon S3

Amazon Simple Storage Service (Amazon S3) stores customer data files of any size in any file format.

This topic describes the steps that are required to send files to Amazon S3 from Amperity:

  1. Get details

  2. Add destination

  3. Add data template

Get details

The Amazon S3 destination requires the following configuration details:

Detail one.

The name of the S3 bucket to which Amperity will send data.

Detail two.

For cross-account role assumption you will need the value for the Target Role ARN, which enables Amperity to access the customer-managed Amazon S3 bucket.

Note

The values for the Amperity Role ARN and the External ID fields are provided automatically.

Review the following sample policy, and then add a similar policy that allows Amperity to access the customer-managed Amazon S3 bucket. Add this policy as a trust policy to the IAM role that is used to manage access to the customer-managed Amazon S3 bucket.

The policy for the customer-managed Amazon S3 bucket is unique, but will be similar to:

{
  "Statement": [
    {
      "Sid": "AllowAmperityAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::account:role/resource"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "01234567890123456789"
        }
      }
    }
  ]
}

The value for the role ARN is similar to:

arn:aws:iam::123456789012:role/prod/amperity-plugin

An external ID is an alphanumeric string of between 2 and 1224 characters (without spaces) that may include the following symbols: plus (+), equal (=), comma (,), period (.), at (@), colon (:), forward slash (/), and hyphen (-).
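
The external ID rules above can be checked before a value is pasted into a policy. The following is a minimal sketch, assuming only the character and length rules stated in this topic. The function name is hypothetical and is not part of Amperity or AWS tooling.

```python
import re

# Rules as stated above: 2 to 1224 characters, alphanumeric, no spaces,
# plus the symbols + = , . @ : / and -
EXTERNAL_ID_PATTERN = re.compile(r"[A-Za-z0-9+=,.@:/-]{2,1224}")


def is_valid_external_id(value: str) -> bool:
    """Return True when value satisfies the external ID rules described above."""
    return EXTERNAL_ID_PATTERN.fullmatch(value) is not None


print(is_valid_external_id("01234567890123456789"))  # True
print(is_valid_external_id("has space"))             # False
```

A check like this only confirms the shape of the value. The external ID in the policy must still match the one that is provided automatically in the credential settings.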

Configure cross-account roles

Amperity prefers to pull data from and send data to customer-managed cloud storage.

Amperity requires using cross-account role assumption to manage access to Amazon S3 to ensure that customer-managed security policies control access to data.

This approach ensures that customers can:

  • Directly manage the IAM policies that control access to data

  • Directly manage the files that are available within the Amazon S3 bucket

  • Modify access without requiring involvement by Amperity; access may be revoked at any time by either Amazon AWS account, after which data sharing ends immediately

  • Directly troubleshoot incomplete or missing files

Note

After setting up cross-account role assumption, a list of files (by filename and file type), along with any sample files, must be made available to allow for feed creation. These files may be placed directly into the shared location after cross-account role assumption is configured.

Can I use an Amazon AWS Access Point?

Yes, but with the following limitations:

  1. The direction of access is Amperity accessing files that are located in a customer-managed Amazon S3 bucket

  2. A credential-free role-to-role access pattern is used

  3. Traffic is not restricted to VPC-only

To configure an S3 bucket for cross-account role assumption

The following steps describe how to configure Amperity to use cross-account role assumption to pull data from (or push data to) a customer-managed Amazon S3 bucket.

Important

These steps require configuration changes to customer-managed Amazon AWS accounts and must be done by users with administrative access.

Step 1.

Open the Destinations tab to configure credentials for Amazon S3.

Click the Add destination button to open the Add destination dialog box.

Select Amazon S3 from the Plugin drop-down.

From the Credentials dialog box, enter a name for the credential, select the iam-role-to-role credential type, and then select “Create new credential”.

Step 2.

Next configure the settings that are specific to cross-account role assumption.

The values for the Amperity Role ARN and External ID fields – the Amazon Resource Name (ARN) for your Amperity tenant and its external ID – are provided automatically.

You must provide the values for the Target Role ARN and S3 Bucket Name fields. Enter the target role ARN for the IAM role that Amperity will use to access the customer-managed Amazon S3 bucket, and then enter the name of the Amazon S3 bucket.
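
A malformed target role ARN is an easy mistake to make at this step. The sketch below splits an IAM role ARN into its parts so that the account ID and role path can be inspected before the value is entered. The pattern is an assumption based on the general shape of the sample role ARN in this topic, and the helper names are hypothetical.

```python
import re
from typing import NamedTuple

# Assumed shape: arn:<partition>:iam::<12-digit account ID>:role/<path and name>
ROLE_ARN_PATTERN = re.compile(r"arn:(aws[a-z-]*):iam::(\d{12}):role/(.+)")


class RoleArn(NamedTuple):
    partition: str
    account_id: str
    role_path: str  # optional path plus role name, e.g. "prod/amperity-plugin"


def parse_role_arn(arn: str) -> RoleArn:
    """Split an IAM role ARN into its parts, or raise ValueError."""
    match = ROLE_ARN_PATTERN.fullmatch(arn.strip())
    if match is None:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    return RoleArn(*match.groups())


print(parse_role_arn("arn:aws:iam::123456789012:role/prod/amperity-plugin"))
```

The placeholder value `arn:aws:iam::account:role/resource` from the sample policy is rejected by this check, which is the intent: placeholders must be replaced with real values.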

Step 3.

Review the following sample policy, and then add a similar policy that allows Amperity to access the customer-managed Amazon S3 bucket. Add this policy as a trust policy to the IAM role that is used to manage access to the customer-managed Amazon S3 bucket.

The policy for the customer-managed Amazon S3 bucket is unique, but will be similar to:

{
  "Statement": [
    {
      "Sid": "AllowAmperityAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::account:role/resource"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "01234567890123456789"
        }
      }
    }
  ]
}

The value for the role ARN is similar to:

arn:aws:iam::123456789012:role/prod/amperity-plugin

An external ID is an alphanumeric string of between 2 and 1224 characters (without spaces) that may include the following symbols: plus (+), equal (=), comma (,), period (.), at (@), colon (:), forward slash (/), and hyphen (-).
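
If the policy is generated rather than edited by hand, a small helper can fill in the two values that vary. This is a sketch that mirrors the sample policy in this topic; the function name is hypothetical, and the ARN and external ID shown are the sample values from this topic, not real ones.

```python
import json


def build_trust_policy(amperity_role_arn: str, external_id: str) -> dict:
    """Return a policy document with the same structure as the sample above."""
    return {
        "Statement": [
            {
                "Sid": "AllowAmperityAccess",
                "Effect": "Allow",
                "Principal": {"AWS": amperity_role_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ]
    }


policy = build_trust_policy(
    "arn:aws:iam::123456789012:role/prod/amperity-plugin",  # sample value
    "01234567890123456789",                                 # sample value
)
print(json.dumps(policy, indent=2))
```

Use the Amperity Role ARN and External ID values from the credential settings as the inputs. Your own policy may need additional elements that the sample omits.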

Step 4.

Click Continue to test the configuration (and validate the connection) to the customer-managed Amazon S3 bucket, after which you will be able to continue the steps for adding a destination.
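
For readers who want to see what a role-to-role pattern relies on, the sketch below shows the kind of AWS STS call involved. It is not a description of how Amperity validates the connection. The function and session names are hypothetical, and the STS client is passed in so that the block has no third-party imports. The call only succeeds when it is made by a principal that the trust policy allows.

```python
from typing import Any, Dict


def assume_target_role(sts_client: Any, target_role_arn: str, external_id: str) -> Dict[str, str]:
    """Assume the target role and return temporary credentials as client keyword arguments."""
    response = sts_client.assume_role(
        RoleArn=target_role_arn,
        RoleSessionName="cross-account-check",  # hypothetical session name
        ExternalId=external_id,
    )
    credentials = response["Credentials"]
    return {
        "aws_access_key_id": credentials["AccessKeyId"],
        "aws_secret_access_key": credentials["SecretAccessKey"],
        "aws_session_token": credentials["SessionToken"],
    }


# Usage with boto3 (assumed to be installed):
#
#   import boto3
#   kwargs = assume_target_role(boto3.client("sts"), target_role_arn, external_id)
#   boto3.client("s3", **kwargs).head_bucket(Bucket=bucket_name)
```

If the external ID in the request does not match the condition in the trust policy, the assume-role call is denied, which is the behavior the policy is designed to enforce.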

Add destination

Configure Amperity to send files directly to Amazon S3.

To add a destination

Step 1.

Open the Destinations tab to configure a destination for Amazon S3. Click the Add Destination button to open the Destination dialog box.

Enter a name for the destination and provide a description. For example: “Amazon S3” and “This sends files to Amazon S3”.

From the Plugin drop-down, start typing “ama” to filter the list, and then select Amazon S3.

Step 2.

Credentials allow Amperity to connect to Amazon S3 and must exist before Amperity can be configured to send data to Amazon S3. Select an existing credential from the Credential drop-down, and then click Continue.

Step 3.

Each destination has settings that define how Amperity will deliver data to Amazon S3. These settings are listed under the Settings section of the Destination dialog box.

Complete the following Amazon S3 Settings:

  • The S3 prefix.

    The S3 prefix is a string that is used to filter results to include only objects whose names begin with this prefix. When this value is set, the names of objects that may be returned in the response are relative to the root of the bucket.

  • The File format. Select the file format – Apache Parquet (recommended), CSV, TSV, or PSV – from the drop-down list.

  • Optional. The Escape character that is required by Amazon S3.

    Note

    If an escape character is not specified and quote mode is set to “None”, this may result in unescaped, unquoted files. When an escape character is not specified, you should select a non-“None” option from the Quote Mode setting.

  • Optional. The Compression format. Encoding method options include “Tar”, “Tgz”, “Zip”, “GZip”, and “None”.

  • Optional. The PGP public key that is used to encrypt files that are sent to Amazon S3.

  • Optional. The Quote mode that should be used within the file. From the drop-down, select one of “all fields”, “all non-NULL fields”, “fields with special characters only”, “all non-numeric fields” or “None”.

    Note

    If the quote mode is set to “None” and the Escape Character setting is empty, this may result in unescaped, unquoted files. When quote mode is set to “None”, you should specify an escape character.

  • Optional. Select Include success file upon completion to add a .DONE file that indicates when an orchestration has finished sending data.

    Tip

    If a downstream sensor is listening for files sent from Amperity, configure that sensor to listen for the presence of the .DONE file.

  • Optional. Select Include header row in output files if headers should be included in the output.

  • Optional. Select Row number to include a row number column in the output file. Applies to CSV, TSV, PSV, and custom delimiter file types.

    When enabled, you may specify the name of the row number column in the output file.

  • Optional. Select Exclude Parquet extension from the directory name for managing how Apache Parquet files are added to directories.
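
The interaction between the Quote mode and Escape character settings is easier to see with a concrete row. The sketch below uses the csv module from the Python standard library as an analogy only. The quoting constants are Python's, not Amperity's, and the sample row is made up.

```python
import csv
import io


def render(row, quoting, escapechar=None):
    """Write one row with the given quoting rule and return the resulting text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=quoting, escapechar=escapechar, lineterminator="\n")
    writer.writerow(row)
    return buffer.getvalue()


row = ["Ada Lovelace", "Seattle, WA"]  # the second field contains the delimiter

print(render(row, csv.QUOTE_ALL))      # "Ada Lovelace","Seattle, WA"
print(render(row, csv.QUOTE_MINIMAL))  # Ada Lovelace,"Seattle, WA"
print(render(row, csv.QUOTE_NONE, escapechar="\\"))  # Ada Lovelace,Seattle\, WA

try:
    render(row, csv.QUOTE_NONE)  # no quoting and no escape character
except csv.Error as error:
    print("cannot write row:", error)
```

Python's writer refuses to write the last case. According to the notes above, a destination that is configured the same way may instead produce unescaped, unquoted files, in which a value such as "Seattle, WA" would be read downstream as two fields.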

Step 4.

Business users are assigned to the Amp360 User and/or AmpIQ User policies. (Amp360 User allows access to queries and orchestrations and AmpIQ User allows access to segments and campaigns.) A business user cannot select a destination that is not visible to them.

Business users – including users assigned to the DataGrid Operator policy – may have restricted access to PII.

What is restricted access to PII?

Restricted PII access is enabled by the Restrict PII access policy option, which prevents users who are assigned that option from viewing data that is marked as PII anywhere in Amperity and from sending that data to any downstream workflow.

You can make this destination visible to orchestrations and allow users with restricted access to PII to use this destination by enabling one (or both) of the following options:

Allow business users access to this destination.

Note

To allow business users to use this destination with campaigns, you must enable the Available to campaigns option within the data template. This allows users to send campaign results from Amperity to Amazon S3.

The other two settings may be configured within the data template instead of the destination.

Step 5.

Review all settings, and then click Save.

Important

You must configure a data template for this destination before you can send data to Amazon S3.

Add data template

A data template defines how columns in Amperity data structures are sent to downstream workflows. A data template is part of the configuration for sending query and segment results from Amperity to an external location.

To add a data template

Step 1.

From the Destinations tab, open the menu for a destination that is configured for Amazon S3, and then select Add data template.

This opens the Add Data Template dialog box.

Enter the name of the data template and a description. For example: “Amazon S3” and “Send files to Amazon S3”.

Step 2.

Verify business user access to queries and orchestrations and access to segments and campaigns.

A business user may also have restricted access to PII, which prevents them from viewing and sending customer profile data.

If business user access was not configured as part of the destination, you may configure access from the data template.

Important

To allow business users to use this destination with campaigns, you must enable the Available to campaigns option. This allows users to send campaign results from Amperity to Amazon S3.

If you enable this option, the data extension settings require using campaign name and group name template variables to associate the name of the data extension to your campaign.

Step 3.

Verify all configuration settings.

Note

When the settings required by Amazon S3 were not configured as part of the destination, you must configure them as part of the data template before making this destination available to campaigns.

Step 4.

Review all settings, and then click Save.

After you have saved the data template, and depending on how you configured it, business users can send query results and/or send campaigns to Amazon S3.
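
Once files are being delivered, a downstream process can use the optional .DONE success file (the Include success file upon completion setting on the destination) to decide when a delivery is complete. The sketch below works on a plain list of object keys. The key names and the assumption that the marker is any key that ends in ".DONE" are illustrative and not confirmed by this topic.

```python
from typing import Iterable, List


def completed_files(keys: Iterable[str], prefix: str = "") -> List[str]:
    """Return data file keys under prefix, or an empty list if no .DONE marker exists yet."""
    under_prefix = [key for key in keys if key.startswith(prefix)]
    if not any(key.endswith(".DONE") for key in under_prefix):
        return []  # delivery is still in progress, or has not started
    return sorted(key for key in under_prefix if not key.endswith(".DONE"))


# Hypothetical listings of a bucket, as a sensor might see them
listing = ["exports/customers.csv", "exports/customers.DONE", "other/file.csv"]
print(completed_files(listing, prefix="exports/"))                  # ['exports/customers.csv']
print(completed_files(["exports/customers.csv"], prefix="exports/"))  # []
```

In practice the list of keys would come from listing the bucket with the same S3 prefix that is configured for the destination.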

Workflow actions

A workflow will occasionally show an error that describes what prevented a workflow from completing successfully. These first appear as alerts in the notifications pane. The alert describes the error, and then links to the Workflows tab.

Open the Workflows page to review a list of workflow actions, choose an action to resolve the workflow error, and then follow the steps that are shown.

Step one.

You may receive a notification error for a configured Amazon S3 destination. This appears as an alert in the notifications pane on the Destinations tab.

If you receive a notification error, review the details, and then click the View Workflow link to open this notification error in the Workflows page.

Step two.

On the Workflows page, review the individual steps to determine which step(s) have errors that require your attention, and then click Show Resolutions to review the list of workflow actions that were generated for this error.

Step three.

A list of individual workflow actions is shown. Review the list to identify which action you should take.

Some workflow actions are common across workflows and will often be available, such as retrying a specific task within a workflow or restarting a workflow. These types of actions can often resolve an error.

In certain cases, actions are specific and are shown when certain conditions exist in your tenant. These types of actions typically must be resolved and may require steps that must be done upstream or downstream from your Amperity workflow.

Amperity provides a series of workflow actions that can help resolve specific issues that may arise with Amazon S3, including invalid bucket names and invalid credentials. Both are described at the end of this topic.

Step four.

Select a workflow action from the list of actions, and then review the steps for resolving that error.

After you have completed the steps in the workflow action, click Continue to rerun the workflow.

Invalid bucket name

The name of the Amazon S3 bucket to which Amperity pushes data must be correctly specified in the configuration for the destination on the Destinations page.

To resolve this error, do the following.

  1. Open the AWS management console and verify the name of the Amazon S3 bucket.

  2. Open the Destinations page in Amperity, and then open the destination that is associated with this workflow.

  3. Update the destination with the correct Amazon S3 bucket name.

  4. Return to the workflow action, and then click Resolve to retry.
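
Before updating the destination, it can help to confirm that the name is at least well formed. The sketch below checks a subset of the Amazon S3 naming rules for general purpose buckets: length, allowed characters, adjacent periods, and IP address formatting. It is not the full rule set, and the function name is hypothetical.

```python
import re
from typing import List

# Lowercase letters, numbers, periods, and hyphens; starts and ends with a letter or number
_ALLOWED = re.compile(r"[a-z0-9]([a-z0-9.-]*[a-z0-9])?")
_IPV4_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def bucket_name_problems(name: str) -> List[str]:
    """Return a list of rule violations; an empty list means no problem was found."""
    problems = []
    if not 3 <= len(name) <= 63:
        problems.append("must be 3 to 63 characters long")
    if not _ALLOWED.fullmatch(name):
        problems.append(
            "must use only lowercase letters, numbers, periods, and hyphens, "
            "and must start and end with a letter or number"
        )
    if ".." in name:
        problems.append("must not contain adjacent periods")
    if _IPV4_LIKE.fullmatch(name):
        problems.append("must not be formatted as an IP address")
    return problems


print(bucket_name_problems("my-amperity-exports"))  # []
print(bucket_name_problems("My_Bucket "))
```

A well-formed name can still be the wrong name, so the value should also be compared with the bucket name shown in the AWS management console, as described in the steps above.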

Invalid credentials

The credentials that are defined in Amperity are invalid.

To resolve this error, verify that the credentials required by this workflow are valid.

  1. Open the Credentials page.

  2. Review the details for the credentials used with this workflow. Update the credentials for Amazon S3 if required.

  3. Return to the workflow action, and then click Resolve to retry this workflow.