Pull from Heap¶
Heap is a digital insights platform that helps you understand how and why customers engage with your product. Heap automatically collects all customer data from your site or app, then gives direction on the improvements that you can make.
This topic describes the steps that are required to pull clickstream events for users, page views, and sessions to Amperity from Heap:
How this source works¶
Amperity can pull clickstream events for users, page views, and sessions from Heap.

A Heap data source works like this:
Clickstream data is generated when your customers visit your websites and apps.
Heap is configured to capture this data, and then make it available for use outside of Heap using the Heap Connector for S3.
Clickstream data is loaded to a customer-managed Amazon S3 bucket.
Amperity pulls data from the customer-managed Amazon S3 bucket, assigning semantic tags for clickstream events and for customer profile data.
Domain tables within Amperity are refreshed.
Customer profiles are made available to Stitch. All data is passed to your customer 360 database. The Amperity ID links records across data sources for each unique customer.
Get details¶
Amperity can be configured to pull data from Heap using Amazon S3. This requires the following configuration details:
Heap must be configured to use Heap Connect for S3 . This will send data from Heap to a customer-managed Amazon S3 bucket.
The Amazon Resource Name (ARN) for a role with cross-account access.
The name of the customer-managed Amazon S3 bucket.
A list of objects (by filename and file type) in the customer-managed Amazon S3 bucket to be pulled to Amperity.
A sample for each file to simplify feed creation.
Note
Amperity supports using cross-account role assumption with Amazon S3 buckets when Heap supports the use of cross-account roles and your tenant uses the Amazon S3 data source.
Add data source and feed¶
Add a data source that pulls data from Heap.
Configure Amperity to pull one or more files, and then for each file review the settings, define the schema, activate the courier, and then run a manual workflow. Review the data that is added to the domain table.
To add a data source for Heap
![]() |
Open the Sources page to configure Heap. Click the Add courier button to open the Add courier dialog box. Select Heap. Do one of the following:
|
![]() |
Credentials allow Amperity to connect to Heap and must exist before a courier can be configured to pull data from Heap. Select an existing credential from the Credential dropdown, and then click Continue. Tip A courier that has credentials that are configured correctly will show a “Connection successful” status, similar to: ![]() |
![]() |
Select the file that will be pulled to Amperity, either directly (by going into the SFTP site and selecting it) or by providing a filename pattern. ![]() Click Browse to open the File browser. Select the file that will be pulled to Amperity, and then click Accept. Use a filename pattern to define files that will be loaded on a recurring basis, but will have small changes to the filename over time, such as having a datestamp appended to the filename. Note For a new feed, this file is also used as the sample file that is used to define the schema. For an existing feed, this file must match the schema that has already been defined. ![]() Use the PGP credential setting to specify the credentials to use for an encrypted file. ![]() |
![]() |
Review the file. ![]() The contents of the file may be viewed as a table and in the raw format. Switch between these views using the Table and Raw buttons, and then click Refresh to view the file in that format. Note PGP encrypted files can be previewed. Apache Parquet PGP encrypted files must be less than 500 MB to be previewed. Amperity will infer formatting details, and then adds these details to a series of settings located along the left side of the file view. File settings include:
Review the file, and then update these settings, if necessary. |
![]() |
A feed defines the schema for a file that is loaded to Amperity, after which that data is loaded into a domain table and ready for use with workflows within Amperity. New feed To use a new feed, choose the Create new feed option, select an existing source from the Source dropdown or type the name of a new data source, and then enter the name of the feed. ![]() After you choose a load type and save the courier configuration, you will configure the feed using the data within the sample file. Existing feed To use an existing feed, choose the Use existing feed option to use an existing schema. ![]() This option requires this file to match all of the feed-specific settings, such as incoming field names, field types, and primary keys. The data within the file may be different Pull data Define how Amperity will pull data from Heap and how it is loaded to a domain table. ![]() Use the Upsert option to use the selected file update existing records and insert records that do not exist. Use the Truncate and upsert option to delete all records in the existing table, and then insert records. Note When a file is loaded to a domain table using an existing file, the file that is loaded must have the same schema as the existing feed. The data in the file may be new. |
![]() |
Use the feed editor to do all of the following:
When finished, click Activate. |
![]() |
Find the courier related to the feed that was just activated, and then run it manually. On the Sources page, under Couriers, find the courier you want to run and then select Run from the actions menu. ![]() Select a date from the calendar picker that is before today, but after the date on which the file was added to the Heap bucket. ![]() Leave the load options in the Run courier dialog box unselected, and then click Run. After the courier has run successfully, inspect the domain table that contains the data that was loaded to Amperity. After you have verified that the data is correct, you may do any of the following:
|