Pull to Amperity SFTP¶
Every Amperity tenant includes an SFTP site with a hostname of <tenant>.sftp.amperity.com. For example, if your company name is ACME, then your tenant’s SFTP hostname is acme.sftp.amperity.com. (The hostname is always all lowercase.)
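The hostname rule can be expressed in a few lines. This is a minimal sketch; `sftp_hostname` is a hypothetical helper written for illustration and is not part of any Amperity tooling.

```python
def sftp_hostname(tenant: str) -> str:
    """Build the Amperity SFTP hostname for a tenant name.

    Hypothetical helper for illustration only.
    """
    name = tenant.strip().lower()  # the hostname is always all lowercase
    if not name:
        raise ValueError("tenant name must not be empty")
    return f"{name}.sftp.amperity.com"


print(sftp_hostname("ACME"))  # acme.sftp.amperity.com
```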
The SFTP site is provisioned by an Amperity administrator after the initial tenant creation. As such, the SFTP site is not immediately available, but this is not a long process. When the SFTP site is ready, Amperity will use SnapPass to send you the connection details. If you wish to use RSA key-based authentication, please provide the public key to your Amperity administrator when requesting SFTP access.
Once provisioned, you may configure the SFTP site to support any desired SFTP workflow. External customer processes can be configured to connect to the site using SFTP, after which they can add data to or pick up data from the site to support any upstream or downstream workflow.
Note
The SFTP server has a 30-day limit on data. After 30 days, data may be moved automatically to an archive location. This location is still accessible to Amperity in case the data needs to be reused.
The hostname for the SFTP site is always <tenant-name>.sftp.amperity.com. Some older tenants may still be using the legacy address sftp.amperity.com. If so, please contact your Amperity administrator about migrating.
Data sources¶
Any data source that can send data to an SFTP site can be configured to send data to the Amperity SFTP site. File paths must begin with the tenant name and must be lowercase. For example, if the tenant name is ACME then all file paths must be prefixed with /acme/.
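A sending process can check the path prefix before uploading. The sketch below uses a hypothetical helper, `has_tenant_prefix`, and assumes that the lowercase requirement applies to the tenant prefix rather than to the whole path.

```python
def has_tenant_prefix(path: str, tenant: str) -> bool:
    """Return True if `path` starts with the lowercase /<tenant>/ prefix.

    Assumption: only the tenant prefix is checked for case, not the
    remainder of the path.
    """
    prefix = f"/{tenant.lower()}/"
    return path.startswith(prefix)


print(has_tenant_prefix("/acme/inbound/customers.csv", "ACME"))  # True
print(has_tenant_prefix("/ACME/inbound/customers.csv", "ACME"))  # False
```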
Get details¶
An Amperity SFTP site requires the following configuration details:
The username.
The passphrase.
The host public key.
The hostname. This is always [tenant].sftp.amperity.com. For example, if your tenant name is ACME the hostname is acme.sftp.amperity.com.
A list of objects (by filename and file type, e.g. “accounts.csv”, “customers.ndjson”, “email-list.tsv”, and so on) in the SFTP location to be sent to Amperity.
A sample for each file to simplify feed creation.
Hint
Ask your Amperity representative for the username, passphrase, and public key or ask your Amperity representative to configure a courier that uses the Amperity SFTP site on your behalf, after which you can copy the settings and add additional couriers for data sources as required.
Filedrop requirements¶
An SFTP location requires the following:
Credentials that allow Amperity to access, and then read data from, an SFTP location
An optional RSA key for public key credentials
Files provided in a supported file format
Files provided with the correct date format
Support for the desired file compression and/or archive method
The ability to encrypt files before they are added to the location using PGP encryption; the encryption key must be configured so that files can be decrypted by Amperity prior to loading them
Tip
Use SnapPass to securely share your organization’s credentials and encryption keys with your Amperity representative.
Add courier¶
A courier brings data from an external system to Amperity. A courier relies on a feed to know which fileset to bring to Amperity for processing.
Tip
You can run a courier without load operations. Use this approach to get files to upload during feed creation, as a feed requires knowing the schema of a file before you can apply semantic tagging and other feed configuration settings.
Example entities list
An entities list defines the list of files to be pulled to Amperity, along with any file-specific details (such as file name, file type, if header rows are required, and so on).
For example:
[
  {
    "object/type": "file",
    "object/file-pattern": "'/path/to/CustomerRecords.csv'",
    "object/land-as": {
      "file/header-rows": 1,
      "file/tag": "customer-records-2019",
      "file/content-type": "text/csv"
    }
  },
  {
    "object/type": "file",
    "object/file-pattern": "'/path/to/TransactionRecords.csv'",
    "object/land-as": {
      "file/header-rows": 1,
      "file/tag": "transaction-records-2019",
      "file/content-type": "text/csv"
    }
  }
]
Note
You may configure files as required ("object/optional": false) or optional ("object/optional": true). A courier will fail if a required file is not available or, if all files in the fileset are optional, at least one of those files is not available.
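When a fileset contains many files, it can be easier to generate the entities list than to write it by hand. The following is a minimal sketch: the key names are copied from the example above, while the `file_entity` helper and the placement of "object/optional" next to the other "object/" keys are assumptions made for illustration.

```python
import json


def file_entity(path, tag, content_type="text/csv", header_rows=1, optional=False):
    """Build one entry for an entities list.

    Hypothetical helper; it is not an Amperity API.
    """
    return {
        "object/type": "file",
        # The example above wraps the path in single quotes inside the string.
        "object/file-pattern": f"'{path}'",
        # Assumed placement: alongside the other "object/" keys.
        "object/optional": optional,
        "object/land-as": {
            "file/header-rows": header_rows,
            "file/tag": tag,
            "file/content-type": content_type,
        },
    }


entities = [
    file_entity("/path/to/CustomerRecords.csv", "customer-records-2019"),
    file_entity("/path/to/TransactionRecords.csv", "transaction-records-2019", optional=True),
]

# Print JSON that can be pasted into the Entities List field.
print(json.dumps(entities, indent=2))
```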
To add a courier
From the Sources tab, click Add Courier. The Add Source page opens.
Find, and then click the icon for SFTP. The Add Courier page opens.
This automatically selects passphrase as the Credential Type. Add the hostname for the location from which data is pulled. For example: <tenant-name>.sftp.amperity.com.
Enter the name of the courier. For example: “Amperity SFTP”.
From the Credential drop-down, select Create a new credential. This opens the Create New Credential dialog box. Enter a name for the credential (typically “Amperity SFTP”), and then enter the username and password required to access this location.
Under Settings configure the list of files to pull to Amperity. Configure the Entities List for each file to be loaded to Amperity.
Note
If the file is contained within a ZIP archive, you may need to specify the fully qualified filename within the ZIP archive. For example, to import a file named “items.csv” you may need to specify “exportitems.csv”.
Under Settings set the load operations to a string that is obviously incorrect, such as df-xxxxxx. (You may also set the load operation to empty: {}.)
Tip
If you use an obviously incorrect string, the load operation settings will be saved in the courier configuration. After the feed is configured and activated you can edit the courier, and then update the feed ID with the correct identifier.
Caution
If load operations are not set to {} the validation test for the courier configuration settings will fail.
Click Save.
Get sample files¶
Every Amperity SFTP file that is pulled to Amperity must be configured as a feed. Before you can configure each feed you need to know the schema of that file. Run the courier without load operations to bring sample files from Amperity SFTP to Amperity, and then use each of those files to configure a feed.
To get sample files
From the Sources tab, open the menu for a courier configured for Amperity SFTP with empty load operations, and then select Run. The Run Courier dialog box opens.
Select Load data from a specific day, and then select today’s date.
Click Run.
Important
The courier run will fail, but this process will successfully return a list of files from Amperity SFTP.
These files will be available for selection as an existing source from the Add Feed dialog box.
Wait for the notification for this courier run to return an error similar to:
Error running load-operations task Cannot find required feeds: "df-xxxxxx"
Add feeds¶
A feed defines how data should be loaded into a domain table, including specifying which columns are required and which columns should be associated with a semantic tag that indicates that column contains customer profile (PII) and transactions data.
Note
A feed must be added for each file that is pulled from Amperity SFTP, including all files that contain customer records and interaction records, along with any other files that will be used to support downstream workflows.
To add a feed
From the Sources tab, click Add Feed. This opens the Add Feed dialog box.
Under Data Source, select Create new source, and then enter “Amperity SFTP”.
Enter the name of the feed in Feed Name. For example: “CustomerRecords”.
Tip
The name of the domain table will be “<data-source-name>:<feed-name>”. For example: “Amperity SFTP:CustomerRecords”.
Under Sample File, select Select existing file, and then choose from the list of files. For example: “filename_YYYY-MM-DD.csv”.
Tip
The list of files that is available from this drop-down menu is sorted from newest to oldest.
Select Load sample file on feed activation.
Click Continue. This opens the Feed Editor page.
Select the primary key.
Apply semantic tags to customer records and interaction records, as appropriate.
Under Last updated field, specify which field best describes when records in the table were last updated.
Tip
Choose Generate an “updated” field to have Amperity generate this field. This is the recommended option unless there is a field already in the table that reliably provides this data.
For feeds with customer records (PII data), select Make available to Stitch.
Click Activate. Wait for the feed to finish loading data to the domain table, and then review the sample data for that domain table from the Data Explorer.
Add load operations¶
After the feeds are activated and domain tables are available, add the load operations to the courier used for Amperity SFTP.
Example load operations
Load operations must specify each file that will be pulled to Amperity from Amperity SFTP.
For example:
{
  "CUSTOMER-RECORDS-FEED-ID": [
    {
      "type": "truncate"
    },
    {
      "type": "load",
      "file": "customer-records"
    }
  ],
  "TRANSACTION-RECORDS-FEED-ID": [
    {
      "type": "load",
      "file": "transaction-records"
    }
  ]
}
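The same structure can be generated from a list of feed IDs, which helps when the placeholder IDs are replaced with real ones. This is a sketch only: `load_operations` is a hypothetical helper, and the feed IDs and file names are the placeholders from the example above.

```python
import json


def load_operations(feeds):
    """Build a load-operations mapping.

    `feeds` maps a feed ID to a (file_name, truncate_first) pair.
    Hypothetical helper; it is not an Amperity API.
    """
    ops = {}
    for feed_id, (file_name, truncate_first) in feeds.items():
        steps = []
        if truncate_first:
            # Mirrors the "truncate" step shown before "load" in the example.
            steps.append({"type": "truncate"})
        steps.append({"type": "load", "file": file_name})
        ops[feed_id] = steps
    return ops


ops = load_operations({
    "CUSTOMER-RECORDS-FEED-ID": ("customer-records", True),
    "TRANSACTION-RECORDS-FEED-ID": ("transaction-records", False),
})

# Print JSON that can be pasted into the load operations setting.
print(json.dumps(ops, indent=2))
```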
To add load operations
From the Sources tab, open the menu for the courier that was configured for Amperity SFTP, and then select Edit. The Edit Courier dialog box opens.
Edit the load operations for each of the feeds that were configured for Amperity SFTP so they have the correct feed ID.
Click Save.
Run courier manually¶
Run the courier again. This time, because the load operations are present and the feeds are configured, the courier will pull data from Amperity SFTP.
To run the courier manually
From the Sources tab, open the menu for the courier with updated load operations that is configured for Amperity SFTP, and then select Run. The Run Courier dialog box opens.
Select the load option, either for a specific time period or all available data. Actual data will be loaded to a domain table because the feed is configured.
Click Run.
This time the notification will return a message similar to:
Completed in 5 minutes 12 seconds
Add to courier group¶
A courier group is a list of one (or more) couriers that are run as a group, either ad hoc or as part of an automated schedule. A courier group can be configured to act as a constraint on downstream workflows.
To add the courier to a courier group
From the Sources tab, click Add Courier Group. This opens the Create Courier Group dialog box.
Enter the name of the courier group. For example: “Amperity SFTP”.
Add a cron string to the Schedule field to define a schedule for the courier group.
A schedule defines the frequency at which a courier group runs. All couriers in the same courier group run as a unit and all tasks must complete before a downstream process can be started. The schedule is defined using cron.
Cron syntax specifies the fixed time, date, or interval at which cron will run. Each line represents a job, and is defined like this:
┌───────── minute (0 - 59)
│ ┌─────────── hour (0 - 23)
│ │ ┌───────────── day of the month (1 - 31)
│ │ │ ┌────────────── month (1 - 12)
│ │ │ │ ┌─────────────── day of the week (0 - 6) (Sunday to Saturday)
│ │ │ │ │
│ │ │ │ │
│ │ │ │ │
* * * * * command to execute
For example, 30 8 * * * represents “run at 8:30 AM every day” and 30 8 * * 0 represents “run at 8:30 AM every Sunday”. Amperity validates your cron syntax and shows you the results. You may also use crontab guru to validate cron syntax.
Set Status to Enabled.
Specify a time zone.
A courier group schedule is associated with a time zone. The time zone determines the point at which a courier group’s scheduled start time begins. A time zone should be aligned with the time zone of the system from which the data is being pulled.
Note
The time zone that is chosen for a courier group schedule should consider every downstream business process that requires the data and also the time zone(s) in which the consumers of that data will operate.
Set SLA? to False. (You can change this later after you have verified the end-to-end workflows.)
Add at least one courier to the courier group. Select the name of the courier from the Courier drop-down. Click + Add Courier to add more couriers.
Click Add a courier group constraint, and then select a courier group from the drop-down list.
A wait time is a constraint placed on a courier group that defines an extended time window for data to be made available at the source location.
A courier group typically runs on an automated schedule that expects customer data to be available at the source location within a defined time window. However, in some cases, the customer data may be delayed and isn’t made available within that time window.
For each courier group constraint, apply any offsets.
An offset is a constraint placed on a courier group that defines a range of time that is older than the scheduled time, within which a courier group will accept customer data as valid for the current job. Offset times are in UTC.
A courier group offset is typically set to be 24 hours. For example, it’s possible for customer data to be generated with a correct file name and datestamp appended to it, but for that datestamp to represent the previous day because of the customer’s own workflow. An offset ensures that the data at the source location is recognized by the courier as the correct data source.
Warning
An offset affects couriers in a courier group whether or not they run on a schedule.
Click Save.
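Two ideas from the steps above, the cron schedule and the offset, can be illustrated with a short sketch. Both functions are hypothetical and deliberately simplified. `cron_matches` only understands `*` and single integers in each field. `within_offset` assumes an offset means "accept data dated between the scheduled time minus the offset and the scheduled time". Neither reflects how Amperity implements scheduling internally.

```python
from datetime import datetime, timedelta


def cron_matches(expr: str, when: datetime) -> bool:
    """Check whether `when` matches a five-field cron expression.

    Simplified: each field must be "*" or a single integer. Ranges, lists,
    and steps (1-5, 1,3, */15) are not handled, and every field must match
    (standard cron treats day-of-month and day-of-week differently when
    both are restricted).
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five fields: minute hour day month weekday")
    actual = [
        when.minute,
        when.hour,
        when.day,
        when.month,
        when.isoweekday() % 7,  # cron counts Sunday as 0; isoweekday() gives 7
    ]
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


def within_offset(data_time: datetime, scheduled: datetime,
                  offset: timedelta = timedelta(hours=24)) -> bool:
    """Return True if `data_time` is in the window [scheduled - offset, scheduled]."""
    return scheduled - offset <= data_time <= scheduled


sunday = datetime(2024, 1, 7, 8, 30)   # 7 January 2024 was a Sunday
monday = datetime(2024, 1, 8, 8, 30)

print(cron_matches("30 8 * * *", monday))  # True: 8:30 AM every day
print(cron_matches("30 8 * * 0", sunday))  # True: 8:30 AM every Sunday
print(cron_matches("30 8 * * 0", monday))  # False: Monday is not day 0

# A file stamped on the previous day is still accepted with a 24-hour offset.
print(within_offset(datetime(2024, 1, 7, 9, 0), monday))  # True
```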
Workflow actions¶
A workflow will occasionally show an error that describes what prevented a workflow from completing successfully. These first appear as alerts in the notifications pane. The alert describes the error, and then links to the Workflows tab.
Open the Workflows tab to review a list of workflow actions, choose an action to resolve the workflow error, and then follow the steps that are shown.
Bad archive¶
Sometimes the contents of an archive are corrupted and cannot be loaded to Amperity.
To resolve this error, do the following.
Upload a new file to Amperity.
After the file is uploaded, return to the workflow action, and then click Resolve to retry this workflow.
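Before uploading a replacement, it can help to confirm locally that the new archive is readable. The sketch below assumes the archive is a ZIP file and uses Python's standard `zipfile` module. `first_corrupt_member` is a hypothetical helper name.

```python
import io
import zipfile


def first_corrupt_member(archive):
    """Return the name of the first corrupt member of a ZIP archive, or None.

    `archive` may be a path or a binary file-like object. Raises
    zipfile.BadZipFile if the data is not a readable ZIP archive.
    """
    with zipfile.ZipFile(archive) as zf:
        return zf.testzip()  # reads every member and checks its CRC


# Demonstration with a small archive built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("items.csv", "id,name\n1,widget\n")

print(first_corrupt_member(buffer))  # None
```

A return value of `None` means every member passed its checksum test. Any other value is the name of the first member that failed.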
Invalid credentials¶
The credentials that are defined in Amperity are invalid.
To resolve this error, verify that the credentials required by this workflow are valid.
Open the Credentials page.
Review the details for the credentials used with this workflow. Update the credentials for Amperity SFTP if required.
Return to the workflow action, and then click Resolve to retry this workflow.
Missing file¶
A workflow error is returned when an archive does not contain a file that is expected to be in it; Amperity will be unable to complete the workflow until the issue is resolved.
To resolve this error, do the following.
Add the required file to the archive.
or
Update the configuration for the courier that is attempting to load the missing file to not require that file.
After the file is added to the archive or removed from the courier configuration, click Resolve to retry this workflow.
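To see which expected files are absent before retrying, list the archive's contents locally. This is a minimal sketch, assuming a ZIP archive. `missing_members` and the file names used below are hypothetical.

```python
import io
import zipfile


def missing_members(archive, expected):
    """Return the expected file names that are absent from a ZIP archive."""
    with zipfile.ZipFile(archive) as zf:
        # namelist() returns fully qualified names, including any folder.
        present = set(zf.namelist())
    return [name for name in expected if name not in present]


# Demonstration with a small archive built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("data/items.csv", "id,name\n1,widget\n")

print(missing_members(buffer, ["data/items.csv", "orders.csv"]))  # ['orders.csv']
print(missing_members(buffer, ["items.csv"]))                     # ['items.csv']
```

The second call shows why the fully qualified name matters: a file stored inside a folder in the archive is not found by its bare file name.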
PGP error¶
A workflow action is created when a file cannot be decrypted using the provided PGP key.
To resolve this error, verify the PGP key.
Open the Sources page.
Review the details for the PGP key.
If the PGP key is correct, verify that the file that is associated with this workflow error was encrypted using the correct PGP key. If necessary, upload a new file.
Return to the workflow action, and then click Resolve to retry this workflow.
Unable to decompress archive¶
An archive that cannot be decompressed will return a workflow error; Amperity will be unable to complete the workflow until the issue is resolved.
This issue may be shown when the name of the archive doesn’t match the name of the configured archive or when Amperity is attempting to decompress a file (and not an archive). In some cases, the contents of the archive file may be the reason why Amperity is unable to decompress the archive.
To resolve this error, do the following.
Verify the configuration for the archive, and then verify the contents of the archive.
Update the configuration, if necessary. For example, when Amperity is attempting to decompress a file, update the configuration to specify a file and not an archive.
In some cases, re-loading the archive to the location from which Amperity is attempting to pull the archive is necessary.
Return to the workflow action, and then click Resolve to retry this workflow.
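One quick way to tell whether an object is an archive or a plain file is to inspect its contents rather than its name. The sketch below covers only ZIP and gzip, using the standard `zipfile` module and the gzip magic bytes. `detect_container` is a hypothetical helper, and other archive types would need additional checks.

```python
import gzip
import io
import zipfile


def detect_container(data: bytes) -> str:
    """Classify raw bytes as "zip", "gzip", or "plain file"."""
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "zip"
    if data[:2] == b"\x1f\x8b":  # gzip streams start with these two bytes
        return "gzip"
    return "plain file"


csv_bytes = b"id,name\n1,widget\n"

zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as zf:
    zf.writestr("items.csv", csv_bytes)

print(detect_container(zip_buffer.getvalue()))    # zip
print(detect_container(gzip.compress(csv_bytes)))  # gzip
print(detect_container(csv_bytes))                 # plain file
```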