About feeds

A feed defines how data should be loaded into a domain table, including which columns are required and which columns should be associated with a semantic tag that indicates the column contains customer profile (PII) or transactions data.

What a feed does:

  1. Associates columns in customer data with semantics in Amperity.

  2. Assigns column renames, when necessary.

  3. Assigns the data source types Amperity needs for data consistency during the Stitch process.

  4. Is grouped by data source when more than one feed exists for a data source category.

What a feed needs:

  1. An input data source, typically (but not always) managed by a courier.

  2. A 1:1 relationship with a courier for large data sources; couriers are run in parallel whereas feeds run sequentially.

  3. Correctly tagged foreign keys, especially if the field associated with that key is shared across feeds. (This is because most tenants deterministically match on foreign keys.)

    Tip

    You can have any number of feeds populated in a courier. However, it is recommended to keep one feed per courier if possible, because couriers can be parallelized, but feeds within a courier are run sequentially.

Feed Editor

The Feed Editor is the second step in the process of adding a new data source to Amperity. The Feed Editor is the interface in which you assign field types, apply semantic tags and primary keys, and indicate if the data associated with this feed should be made available to Stitch.

Field types

All columns from all data sources that are loaded to Amperity must be aligned to the following field types:

Boolean

Use the Boolean field type for incoming fields that contain either/or values, such as “true” or “false”, “yes” or “no”, “0” or “1”, or “NULL”.
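
As an illustration of what "either/or values" look like in practice, the following Python sketch normalizes the example values listed above. The function name and the choice to treat "NULL" and empty values as missing are assumptions made for this example; this is not Amperity's loading logic.

```python
# Value sets are taken from the examples in this topic; matching is case-insensitive.
TRUE_VALUES = {"true", "yes", "1"}
FALSE_VALUES = {"false", "no", "0"}
NULL_VALUES = {"null", ""}  # assumption: empty values are treated like NULL


def parse_boolean(raw):
    """Return True, False, or None for a raw incoming value."""
    text = raw.strip().lower()
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    if text in NULL_VALUES:
        return None
    raise ValueError(f"not an either/or value: {raw!r}")
```

A column whose values all pass through a check like this without raising is a reasonable candidate for the Boolean field type.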

Date

Use the date field type for incoming fields that contain ISO-8601 compliant date values, such as a birthdate. For example: “January 18, 1972” or “1972-01-18”.

Note

The date field uses the same formatting options as the datetime field.

Datetime

Use the datetime field type for incoming fields that contain ISO-8601 compliant date and time values, such as a purchase or transaction, the time at which data was last updated, or a campaign launch date.

For example, a datetime of “1970-01-15T08:00:00-08:00” matches the pattern “yyyy-MM-dd'T'HH:mm:ssZZ”.

Datetime formats

The datetime format should be configured to match how datetimes are formatted in the raw data. This pattern is often similar to “yyyy-MM-dd-HH-mm-ss”, but patterns vary by data source. Amperity uses Joda-Time pattern-based formatting to configure the exact datetime pattern that is present in your data.

  1. It is recommended to configure datetime values using Joda-Time pattern-based formatting for all datetime fields for all of your data sources.

    You can configure Amperity to apply the same pattern for all datetime fields in a feed or you can configure each datetime field individually.

  2. You may allow Amperity to infer the correct datetime, and then apply pattern-based formatting, but this is not the recommended approach.

    Tip

    This approach can help speed up initial implementations because Amperity will often infer the correct pattern.

    It is recommended to revisit all datetime fields for which automatic patterns are enabled, and then configure them for the correct Joda-Time pattern.

Recommended datetime format patterns

Datetime formats in Amperity use a combination of symbols that are described in the following table:

Symbol

Description

y

The year. Numeric presentation for year fields displays single-character outputs as a zero-based year:

  • y outputs 01

  • yy outputs 01

  • yyyy outputs 2001

D

The day of the year. For example, January 18th is 18 and February 1st is 32.

M

The month of the year. For example: August, Aug, or 8. Use text when 3 (or more) pattern characters are present, otherwise use numbers:

  • M outputs 8

  • MM outputs 08

  • MMM outputs Aug

  • MMMM (or more) outputs August

d

The day of the month. For example, January 18th is 18.

H

The hour of the day, between 0-23. 1:00 PM is 13.

  • H outputs 9

  • HH outputs 09

h

The clockhour of halfday, between 1-12. 1:00 PM is 1.

m

The minute of the hour. For example, 6:10 PM is 10.

s

The second of the minute.

S

The fraction of the second.

z

The time zone. For example: Pacific Standard Time or PST.

Note

Time zone names cannot be parsed.

Z

The time zone and offset.

  • Z outputs the offset without a colon. For example: -0800.

  • ZZ outputs the offset with a colon. For example: -08:00.

  • ZZZ (or more) outputs the zone ID. For example: America/Los_Angeles.

Note

An apostrophe (') is an escape character in Joda-Time. To include a literal single quote in a pattern, use two apostrophes ('').

Any characters in a pattern that are not in the ranges of [a-z] and [A-Z] will be treated as quoted text. For example, the following characters

  • : (colon)

  • . (period)

  • , (comma)

  • (space)

  • # (number sign or hash)

  • ? (question mark)

will appear in the output even when they are not enclosed in single quotes.

Other symbols may be available. Refer to Joda-Time documentation for more information about symbols that are not described in the previous table.
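
To check a pattern against sample values before configuring a feed, one option is to test it locally. The following Python sketch translates a small subset of the symbols in the previous table into `strptime` directives and parses one value. The mapping table and the `joda_to_strptime` function are hypothetical helpers written for this example; they are not part of Amperity or Joda-Time, and they only approximate Joda-Time behavior for the symbols shown.

```python
from datetime import datetime

# Hypothetical mapping for a subset of the symbols described above.
JODA_TO_STRPTIME = {
    "yyyy": "%Y", "yy": "%y",
    "MMMM": "%B", "MMM": "%b", "MM": "%m",
    "dd": "%d", "HH": "%H", "hh": "%I",
    "mm": "%M", "ss": "%S",
    "Z": "%z", "ZZ": "%z",
}


def joda_to_strptime(pattern):
    """Translate a simple Joda-Time style pattern to a strptime format string."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "'":
            # Two apostrophes in a row stand for one literal apostrophe.
            if pattern[i + 1:i + 2] == "'":
                out.append("'")
                i += 2
                continue
            # Otherwise copy the quoted text through as a literal.
            end = pattern.index("'", i + 1)
            out.append(pattern[i + 1:end].replace("%", "%%"))
            i = end + 1
        elif ch.isascii() and ch.isalpha():
            # A run of the same letter is one pattern symbol, such as "yyyy".
            j = i
            while j < len(pattern) and pattern[j] == ch:
                j += 1
            run = pattern[i:j]
            if run not in JODA_TO_STRPTIME:
                raise ValueError(f"symbol not covered by this sketch: {run}")
            out.append(JODA_TO_STRPTIME[run])
            i = j
        else:
            # Punctuation and spaces are treated as literal text.
            out.append("%%" if ch == "%" else ch)
            i += 1
    return "".join(out)


fmt = joda_to_strptime("yyyy-MM-dd'T'HH:mm:ssZZ")
parsed = datetime.strptime("1970-01-15T08:00:00-08:00", fmt)
```

An unquoted letter that is not a known symbol, such as a bare T, raises an error in this sketch, which is a quick way to spot literal text that still needs single quotes.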

Decimal

Use the decimal field type for fields that require precision, such as prices or message sizes. Decimal type precision is automatically set to 38. Scale is the number of digits following the decimal point and is configurable from 0 to 37.

You may configure a field for a specific scale, and you may apply that scale to all decimal field types in the same feed. For example:

  • 1.50 (prices)

  • 1874.38 (message sizes)

  • 141.48 (order subtotals)

Note

Fields in which every value contains trailing zeroes, such as quantities and totals, are automatically converted to an integer field type.

Fields that end in non-zero decimals cannot be set to the integer field type.
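
The relationship between precision and scale can be sketched with Python's `decimal` module. This is a minimal illustration, assuming half-up rounding when a value has more digits than the configured scale; the rounding mode and the function name are assumptions for this example, not documented Amperity behavior.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP, localcontext

PRECISION = 38  # fixed precision for the decimal field type, as described above


def to_scaled_decimal(raw, scale):
    """Convert a raw string to a Decimal with exactly `scale` fractional digits."""
    if not 0 <= scale <= 37:
        raise ValueError("scale must be between 0 and 37")
    with localcontext() as ctx:
        ctx.prec = PRECISION
        try:
            # Decimal(1).scaleb(-2) is Decimal("0.01"), the quantum for scale 2.
            quantum = Decimal(1).scaleb(-scale)
            return Decimal(raw).quantize(quantum, rounding=ROUND_HALF_UP)
        except InvalidOperation:
            raise ValueError(f"cannot represent {raw!r} at scale {scale}") from None
```

With a scale of 2, a price of "1.5" becomes 1.50 and an order subtotal of "141.485" becomes 141.49.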

Float

Use the float field type for fields that contain fractional parts, but do not require precision. For example:

  • 3.14

  • 3.14159

Important

Use decimal for fields that contain prices.

Integer

Use the integer field type for fields that contain whole-number values, such as the number of items purchased. For example:

  • 1

  • 12345

Important

Fields that end in non-zero decimals, such as prices, should use the decimal field type.

String

Use the string field type for incoming fields that contain strings and that do not contain obvious matches with other field types, such as all identifiers (account IDs, loyalty IDs, customer IDs), first and last names, email addresses, physical addresses, UUIDs (and other IDs), phone numbers, zip codes, product names, descriptions, and so on. For example:

  • John

  • Smith

  • John Smith

  • johnsmith@domain.com

  • 123 Main Street

  • 206-555-1111

  • 00002ac0-0915-3cb4-b7c7-5ee192b3bd49

  • ACME

  • pants

  • “ “ (may be empty)

  • “A data source that pulls from an Amazon S3 bucket.” (may contain punctuation)

Semantic tags

A semantic is a way to apply a common understanding to individual points of data across multiple data sources, even when data sources have different schemas, naming conventions, and levels of data quality.

A semantic tag standardizes profile (PII), transaction, and other important customer details across all columns in all data tables.

Semantic tagging is defined as part of every feed. This ensures that data from multiple sources is brought into Amperity in a consistent manner to improve the outcome of the Stitch process and identity resolution.

Semantic tagging works like this:

  1. A field in the customer’s system named “fname” stores an individual’s given name (or first name).

  2. A field in the customer’s system named “lname” stores the same individual’s surname (or last name).

  3. A field in the customer’s system named “primary-phone” stores a phone number.

  4. A field in the customer’s system named “date” stores an individual’s birthdate.

  5. A field in the customer’s system named “email_address1” stores an individual’s primary email address.

  6. And so on.

A feed would apply these semantic tags similar to:

Input field       Semantic tag
fname             given-name
lname             surname
primary-phone     phone
date              birthdate
email_address1    email

Amperity has built-in semantic tags for personally-identifiable information (PII), transactions, and behaviors. Custom semantic tags may be applied to specific fields to account for the unique data that may exist within your brand’s data.

Apply a similar pattern to every data source your tenant chooses to bring into Amperity. This will ensure that all semantically-tagged fields are analyzed by Amperity in exactly the same way.
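
The effect of tagging can be sketched in Python. In this illustration two data sources use different column names for the same details, and re-keying each record by semantic tag makes them directly comparable. The second source, its column names, and the `by_semantic` function are hypothetical and exist only for this example.

```python
# Mapping from incoming field name to semantic tag, per data source.
CRM_TAGS = {
    "fname": "given-name",
    "lname": "surname",
    "primary-phone": "phone",
    "date": "birthdate",
    "email_address1": "email",
}
# Hypothetical second source with a different schema.
ECOMMERCE_TAGS = {
    "FirstName": "given-name",
    "LastName": "surname",
    "Email": "email",
}


def by_semantic(record, tags):
    """Re-key a record by semantic tag, dropping fields that are not tagged."""
    return {tags[field]: value for field, value in record.items() if field in tags}


crm_row = {"fname": "John", "lname": "Smith", "email_address1": "johnsmith@domain.com"}
shop_row = {"FirstName": "John", "LastName": "Smith",
            "Email": "johnsmith@domain.com", "cart_id": "42"}

crm_view = by_semantic(crm_row, CRM_TAGS)
shop_view = by_semantic(shop_row, ECOMMERCE_TAGS)
```

After tagging, both records expose the same given-name, surname, and email keys even though the source schemas differ.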

Tip

Use semantic tags to define consistency across all of the data sources that your brand chooses to make available to Amperity.

Many brands choose to provide a wide range of data sources, including those with quality and/or completeness issues; applying semantic tags to all data sources, including lower quality and/or incomplete data sources, makes them usable for building complete and accurate customer profiles.

Caution

The semantic tagging process is not strictly enforced. For example:

  • You may choose not to apply a semantic tag to a column that contains first names.

  • You may use custom semantic tags instead of the default (and recommended) semantic tags. For example phone-1, phone-2, and phone-3 instead of phone.

Non-standard semantic tagging should be done carefully because it often lowers the accuracy of Amperity ID assignment and increases the frequency of incomplete customer profiles.

Semantic types

Amperity supports several types of semantic tags, in addition to supporting the use of keys, Stitch labels, and blocklists to further define how data should be understood by Amperity.

Troubleshoot ingest errors

When a file shows an error on ingest you can troubleshoot those errors by downloading CSV files that contain error logs from the Ingest Details page.

The Ingest Details page is accessible from the Recent activity pane in the Sources page. Error logs are only available for download when errors are present.

There are two types of error log files:

  1. A feed-level log file that describes the types and frequency of errors.

    From the Ingest Details page, open the actions menu for the feed, and then click Download feed error log.

  2. A file-level log that describes each error.

    From the Ingest Details page, expand the table row to show the file, open the actions menu for the file, and then click Download file error log.

How-tos

The Sources page shows the status of all feeds, including when they last ran or updated, how many records were added, and their current status.

This section describes individual tasks that are related to managing feeds:

Activate feed

A feed must be activated before it can be used to populate domain tables and be associated with a courier. The Activate button is located in the upper right of the Feed Editor. Click the Activate button as the final step when adding or editing a feed.

Add feeds

Use the Add Feed button to add a feed to Amperity. A feed must be created for each individual data source that will be processed by Amperity. A feed may be based on a sample data file (recommended), be based on an existing file, or be defined by hand.

Tip

Group feeds by source, where “source” represents a category more than the actual location from which data is made available to Amperity. Use the source as a folder name, under which any number of feeds that support individual data sources may be grouped.

For example:

  • A source named “Online” can be a category that represents all customer records for all online transactions. For example, a tenant could pull data from Shopify for Brand A and from Adobe Commerce for Brand B, but group both of them together under the “Online” source.

  • A source named “Mobile” can categorize customer records for more than one mobile application, such as Attentive or Listrak.

  • A source named “Purchases” can categorize all transaction records for all store brands, all catalogs, and all mobile applications.

  • A source named “Klaviyo” can categorize the variety of records that are provided to Amperity from Klaviyo.

There are two common scenarios for adding feeds to Amperity:

  1. Add a source grouping

  2. Use an existing source grouping

Add a source grouping

If a source category does not exist at the time a feed is added, you can add a new source at the same time you add the feed.

When adding a feed and a source, the sample file for the feed may be:

  1. A new sample file

  2. An existing sample file

  3. No sample file

New sample file

Every feed requires a schema. Apply semantic tags to individual fields in the schema. Use one (or more) individual fields in the schema to define the primary key.

To add a feed using a new sample file

  1. From the Sources page, click Add Feed. This opens the Add Feed dialog box.

  2. Under Data Source select Create new source, and then enter the name for the source.

  3. Under Sample File select Upload new file, and then choose a file.

    Information that shows the file type, compression type, and header rows is shown.

  4. Click Continue. This opens the Feed Editor page.

Existing sample file

Files that are used as sample files for feeds are uploaded to Amperity and persist in a location that is accessible to the Add Feed dialog box. You may use these files as sample files when adding feeds.

To add a feed using an existing sample file

  1. From the Sources page, click Add Feed. This opens the Add Feed dialog box.

  2. Under Data Source select Create new source, and then enter the name for the source.

  3. Under Sample File select Select existing file, and then choose a file.

    Information that shows the date and time at which the existing file was uploaded to Amperity is shown, along with the file ID.

  4. Click Continue. This opens the Feed Editor page.

No sample file

A sample file is not required to define a feed. You can add fields directly in the Feed Editor as if they were loaded from a sample file, and then apply semantic tags, field renames, and all of the rest of the possible feed configuration options.

To add a feed with no sample file

  1. From the Sources page, click Add Feed. This opens the Add Feed dialog box.

  2. Under Data Source select Create new source, and then enter the name for the source.

  3. Under Sample File select Don’t use sample file.

  4. Click Continue. This opens the Feed Editor page.

Use existing source grouping

In many cases, a useful source category already exists at the time a feed is added. You should try to group feeds under an existing source category whenever possible.

When adding a feed to an existing source, the sample file for the feed may be:

  1. A new sample file

  2. An existing sample file

  3. No sample file

New sample file

Every feed requires a schema to which semantic tags are applied, primary keys are selected, and other configuration details can be specified. Often this requires adding a source and a feed.

To add a feed using a new sample file

  1. From the Sources page, click Add Feed. This opens the Add Feed dialog box.

  2. Under Data Source select Select existing source, and then choose a source.

  3. Under Sample File select Upload new file, and then choose a file.

    Information that shows the file type, compression type, and header rows is shown.

  4. Click Continue. This opens the Feed Editor page.

Existing sample file

Files that are used as sample files for feeds are uploaded to Amperity and persist in a location that is accessible to the Add Feed dialog box. You may use these files as sample files when adding feeds.

To add a feed using an existing sample file

  1. From the Sources page, click Add Feed. This opens the Add Feed dialog box.

  2. Under Data Source select Select existing source, and then choose a source.

  3. Under Sample File select Select existing file, and then choose a file.

    Information that shows the date and time at which the existing file was uploaded to Amperity is shown, along with the file ID.

  4. Click Continue. This opens the Feed Editor page.

No sample file

A sample file is not required to define a feed. You can add fields directly in the Feed Editor as if they were loaded from a sample file, and then apply semantic tags, field renames, and all of the rest of the possible feed configuration options.

To add a feed without a sample file

  1. From the Sources page, click Add Feed. This opens the Add Feed dialog box.

  2. Under Data Source select Select existing source, and then choose a source.

  3. Under Sample File select Don’t use sample file.

  4. Click Continue. This opens the Feed Editor page.

Apply semantic tags

A data schema defines the structure of customer data in the form of a data table with columns and rows. It is not uncommon for a collection of data sources to have completely different data schemas. Amperity maps certain columns within customer data into a set of categories, such as personally identifiable information (PII) and transaction attributes, to ensure consistency for customer records across all data sources.

Customer profile (PII)

Personally identifiable information (PII) is any data that could potentially identify a specific individual. PII data includes details like names, addresses, email addresses, and other profile attributes, but can also include attributes like a loyalty number, customer relationship management (CRM) system identifiers, and foreign keys in customer data.

A PII semantic assigns consistency to customer data to ensure that PII data is more easily discovered across many sets of data.

Amperity will assign PII semantic tags automatically when the incoming field name closely matches a known pattern.

Review the complete list of semantics (including profile semantics and keys related to semantics) before verifying the PII semantics that were applied automatically to customer records.

Tip

Apply the update-dt semantic tag to columns in customer records that identify when the data was last updated in the source system. The field to which this semantic is applied must be a datetime field type. At least one customer record must have this semantic tag applied to ensure that the update_dt column is created in the Unified Coalesced table and to ensure that the Merged Customers table behaves correctly.

The field that is associated with the update-dt semantic tag should also be selected as the field that is associated with the LAST UPDATED FIELD drop-down list.

To apply profile semantics to a feed

  1. From the Sources page, open the menu for a feed, and then select Edit. The Feed Editor page opens.

    Note

    The Feed Editor page opens automatically after a file has completed loading when setting up a new feed.

  2. Review all of the profile semantics that were automatically tagged by Amperity to ensure that each tag was matched correctly.

  3. Click Activate.

Transactions

Review the complete list of semantics (including transactions semantics and foreign keys related to transactions semantics) before applying them to interaction records.

An itemized transactions semantic is a way to identify brands, channels, stores, orders, products, quantities, per-item costs, total costs, and so on. Use itemized transactions semantics when a data source contains one row per item.

To apply transactions semantics to a feed

  1. From the Sources page, open the menu for a feed, and then select Edit. The Feed Editor page opens.

  2. Apply transactions semantic tags to any column in the data schema that has matching data.

  3. Click Activate.

Custom semantics

Custom semantic values may be applied to any column in Amperity. Amperity has many built-in semantics for PII, transactions, customer preferences, loyalty programs, and email attributes.

Amperity may not have all of the semantics required by a customer. Before adding custom semantics to Amperity you should work with the customer to define which custom semantic tags are required, what their purpose is, and what types of customized SQL queries will be required to present the outcome of the tagging effort in a way that is usable from a queries editor.

Caution

If a semantic tag that contains a typo is added to Amperity, such as emali instead of email, Amperity treats both as unique semantic tags.

To apply custom semantic tags

  1. From the Sources page, open the menu for a feed, and then select Edit. The Feed Editor page opens.

  2. In the Semantic column, enter the names of custom semantics. Custom semantic names should use hyphens (not underscores). Enter them the same way as a PII semantic, even though they do not appear in the drop-down pick-list.

  3. Click Activate.

Note

This may also be done while adding a feed.
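
Because a mistyped tag is accepted as a new, unique tag, it can help to review custom tag names before entering them. The following Python sketch flags underscores and names that are very close to an existing tag. The tag list is a small illustrative subset taken from this topic, and the function name and similarity cutoff are assumptions for this example.

```python
import difflib

# Illustrative subset of existing tags; extend with the tags your tenant uses.
KNOWN_TAGS = ["given-name", "surname", "phone", "birthdate", "email", "update-dt"]


def review_custom_tag(tag, known=KNOWN_TAGS):
    """Return a list of warnings for a proposed custom semantic tag."""
    warnings = []
    if "_" in tag:
        warnings.append(f"use hyphens, not underscores: {tag.replace('_', '-')}")
    if tag not in known:
        # A near match to an existing tag is more often a typo than a new tag.
        close = difflib.get_close_matches(tag, known, n=1, cutoff=0.75)
        if close:
            warnings.append(f"close to existing tag: {close[0]}")
    return warnings
```

For example, `review_custom_tag("emali")` reports that the name is close to email, while an exact match such as email produces no warnings.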

Copy feeds

Copy a feed on the Sources page.

To copy a feed

  1. From the Sources page, open the menu for a feed, and then select Make a copy. The Copy Feed page opens.

  2. Select the data source from the Data Source dropdown menu.

  3. Enter the feed name in the Feed Name field.

  4. Click Submit.

The copied feed appears in the Feeds section of the Sources page.

Configure feed schema

A feed defines the schema for that data source, associates semantic tags with specific columns in the data, and assigns a primary key. A feed must be activated, after which the feed loads the data to a domain table. A feed that contains customer records for PII data must be made available to the Stitch process.

If changes are made to a feed and/or the data schema in the data source itself changes, the feed must reload the data. In some cases, this also requires the domain table itself be purged, and then rebuilt upon feed reactivation.

The Feed Editor

The Feed Editor is the second step in the process of adding a new data source to Amperity. The Feed Editor is the interface in which you assign field types, apply semantic tags and primary keys, and indicate if the data associated with this feed should be made available to Stitch.

Add fields

You may add fields to a feed even when they are not present in a sample file.

To add a field to a feed

  1. From the Sources page, open the menu for a feed, and then select Edit.

  2. Click + Add New Field. A field is added to the bottom of the list of fields for the feed.

  3. Add the name of the field, select its field type, associate semantic tags, and apply any renames.

  4. Deselect Required.

  5. Click Activate.

Delete fields

A field that is deleted may also be deleted everywhere else it appears in Amperity. The process that updates the domain table occurs automatically; additional steps may be required to remove the field from a database table, query, or segment that uses the field. A field may not be deleted when it is a dependency for a primary key.

To delete a field

  1. From the Sources page, open the menu for a feed, and then select Edit. The Feed Editor page opens.

  2. Click the trash can icon to delete the field.

    Warning

    Amperity does not ask for confirmation, so use this carefully.

Edit domain table name

You may edit the name of the domain table from the Feed Editor.

Warning

Changing the table name will reference the newly specified table for subsequent data ingestion. This table will be created if it does not exist. All data previously ingested will be in the prior table.

To edit the name of a domain table from the Feed Editor

  1. From the Sources page, open the menu for a feed, and then select Edit. The Feed Editor page opens.

  2. Under Domain Table select Edit. This makes the text box that contains the name of the domain table editable.

  3. Enter the new name for the domain table with the pattern <source-name>:<feed-name>, e.g. “Customers:Online”.

  4. Click Activate.

Incoming field names

The Incoming Field column matches the column name in the source data file.

Amperity will apply the casing in the sample file to the values of the Incoming Field column when a sample file is used to define a feed schema. Changes to the casing of column names will have no effect when loading updates to Amperity.

For example, a customer_id column name in the sample file is added to the Incoming Field column as customer_id. A column name change to CUSTOMER_ID or Customer_ID will not affect how the feed loads data or updates to the domain table.
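
The case-insensitive behavior described above can be sketched in Python as follows. The function name and the use of lowercase comparison are assumptions for this illustration, not a description of how Amperity implements matching.

```python
def match_incoming_fields(feed_fields, file_header):
    """Map each feed field to the file column that matches it, ignoring case."""
    by_lower = {column.lower(): column for column in file_header}
    # Fields with no matching column map to None.
    return {field: by_lower.get(field.lower()) for field in feed_fields}


matches = match_incoming_fields(
    ["customer_id", "email"],
    ["CUSTOMER_ID", "Email", "phone"],
)
```

Here customer_id in the feed still lines up with a CUSTOMER_ID column in a later file.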

Rename fields

A field name may contain only letters, numbers, and underscores. If an incoming field contains unsupported characters, such as hyphens or spaces, use the Rename To column to replace the unsupported characters with letters, numbers, or underscores.

For example, a sample CSV file:

name,email,first-name,last-name,phone_home,phone_mobile
Justin Currie,justincurrie@mail.com,Justin,Currie,222-555-1212,505-404-1234
David Zearfoss,davidzearfoss@mail.com,David,Zearfoss,222-555-1313,505-606-8765

will display an alert in the Feed Editor similar to:

Field names may only contain letters, numbers, and underscores. If an
incoming field name has unsupported characters, add a rename.

This is because hyphens are present in two fields. Use the Rename To column to rename first-name to first_name and last-name to last_name.
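
To find the fields that need a rename before uploading a sample file, you can check the header row locally. This Python sketch applies the rule stated above (letters, numbers, and underscores only) to the sample CSV file and proposes underscore replacements. The `suggest_renames` function is a hypothetical helper for this example.

```python
import csv
import io
import re

SAMPLE = (
    "name,email,first-name,last-name,phone_home,phone_mobile\n"
    "Justin Currie,justincurrie@mail.com,Justin,Currie,222-555-1212,505-404-1234\n"
)

UNSUPPORTED = re.compile(r"[^A-Za-z0-9_]")


def suggest_renames(header):
    """Return {incoming name: suggested rename} for names with unsupported characters."""
    return {
        name: UNSUPPORTED.sub("_", name)
        for name in header
        if UNSUPPORTED.search(name)
    }


header = next(csv.reader(io.StringIO(SAMPLE)))
renames = suggest_renames(header)
```

Only the two hyphenated fields are reported; the remaining field names already follow the rule.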

Set primary key

A primary key is a column in a data table that uniquely identifies each row in a data source or data table.

At least one field must be set as a primary key. Any feed that contains customer records or interaction records must have a field that can be associated with a primary key. This is typically an obvious field, like a customer ID or transaction ID, but some data sources are not as clear. You may tag more than one field as the primary key.

To set the primary key

  1. From the Sources page, open the menu for a feed, and then select Edit. The Feed Editor page opens.

  2. From the Primary Key drop-down, select a field from the list.

  3. Click Activate.

Tip

The number of records in a domain table may not match the number of records loaded by Amperity after loading data. Amperity uses an UPSERT process when loading data and determines priority based on the Last Updated Field. If a large difference exists, take a close look at the primary key and determine if the primary key is the cause.
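
The following Python sketch shows why record counts can differ after an UPSERT. It assumes that a row replaces an existing row with the same primary key when its last updated value is the same or newer; that tie-breaking rule, the function name, and the sample data are assumptions for this illustration.

```python
def upsert(domain_table, incoming, primary_key, last_updated_field):
    """Insert or replace rows keyed by the primary key columns."""
    for row in incoming:
        key = tuple(row[column] for column in primary_key)
        current = domain_table.get(key)
        # Assumption: the row with the newest last-updated value wins.
        if current is None or row[last_updated_field] >= current[last_updated_field]:
            domain_table[key] = row
    return domain_table


incoming = [
    {"customer_id": "C1", "email": "old@mail.com", "updated": "2024-01-01"},
    {"customer_id": "C2", "email": "two@mail.com", "updated": "2024-01-02"},
    {"customer_id": "C1", "email": "new@mail.com", "updated": "2024-02-01"},
]
table = upsert({}, incoming, primary_key=["customer_id"], last_updated_field="updated")
```

Three records are loaded, but the table holds two rows because two records share the same primary key. If the chosen key is not unique enough, many more rows collapse in the same way.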

Delete feeds

Use the Delete option to remove a feed from Amperity. This should be done carefully. Verify that both upstream and downstream processes no longer depend on this feed prior to deleting it. This action will not delete the associated data file.

To delete a feed

  1. From the Sources page, open the menu for a feed, and then select Delete. The Delete Feed dialog box opens.

  2. Click Delete.

Edit feeds

Use the Edit option in the row for a specific feed to make changes, including to feed details, column names, column types, semantics, required columns, and so on. A very common scenario for editing a feed is to apply foreign key assignments to tables with customer records so that they may be associated to tables with interaction records.

Add columns

Warning

Adding columns to a feed should be done after considering the downstream effects of those changes. Changes may have unintended consequences: semantics, column names, pick-lists, and so on, may not be immediately available to a downstream process, which may cause that downstream process to fail.

A column may be added to a feed. Before adding columns, be sure to also add the column to any customer 360 database tables that are using the domain table associated with this feed. Columns on passthrough database tables are opt-in, which means you will need to update the passthrough table to ensure all columns are loaded; however, this should only be done after the feed has successfully loaded this data to its domain table.

Warning

Editing column types, column semantics, and column names without considering the downstream effects of those changes may have unintended consequences.

It is not always possible to update downstream processes in advance. For example, a query cannot be updated for changes until the data tables against which that query runs have been updated. Because of the additional complexity related to editing column types, column semantics, and column names, it is recommended to take steps to ensure that downstream processes will not run while making these changes, and then:

  1. Update the feed.

  2. Update the domain table, if necessary.

  3. Update Stitch configuration, if necessary.

  4. Update customer 360 database tables, if necessary.

  5. Update queries, if necessary.

  6. Update destinations, if necessary.

Add foreign keys

To associate interaction records that are identified by transactions and transaction items semantic tagging with an Amperity ID, you need to add a foreign key that associates primary keys in tables with customer records to the primary key in tables with interaction records.

Caution

You must apply the same foreign key to customer records and interaction records.

To add a foreign key

  1. From the Sources page, open the menu for a feed, and then select Edit. The Feed Editor page opens.

  2. For the field identified as the primary key, add a custom semantic that starts with fk-, such as fk-customer.

  3. Click Activate.
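
The reason the same foreign key must be applied to both kinds of records can be sketched in Python. In this illustration each feed records which semantic tags are applied to each column, and records are linked by looking up the column that carries the shared tag. The feed layouts, column names, and functions are hypothetical; in Amperity the association is made during processing, not by code like this.

```python
def column_for_tag(feed_tags, tag):
    """Return the name of the column that carries the given semantic tag."""
    for column, tags in feed_tags.items():
        if tag in tags:
            return column
    raise KeyError(f"no column is tagged {tag}")


def link_by_foreign_key(customers, customer_tags, orders, order_tags, tag="fk-customer"):
    """Pair each interaction record with the customer record that shares the key."""
    customer_col = column_for_tag(customer_tags, tag)
    order_col = column_for_tag(order_tags, tag)
    by_key = {row[customer_col]: row for row in customers}
    return [(order, by_key.get(order[order_col])) for order in orders]


customer_tags = {"customer_id": ["fk-customer"], "email_address1": ["email"]}
order_tags = {"order_id": [], "cust_ref": ["fk-customer"]}
customers = [{"customer_id": "C1", "email_address1": "johnsmith@domain.com"}]
orders = [{"order_id": "O-100", "cust_ref": "C1"}, {"order_id": "O-101", "cust_ref": "C9"}]

linked = link_by_foreign_key(customers, customer_tags, orders, order_tags)
```

The column names differ between the two feeds, but the shared fk-customer tag is what allows the records to be associated. An order whose key has no matching customer record is paired with None.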

Change primary keys

Warning

Be careful when making changes to columns in a feed to ensure that all keys (primary, foreign, and customer) are maintained correctly. In general, changing a key can have downstream implications for Stitch, including (but not limited to) potentially affecting stable Amperity IDs that depend on this primary key.

A primary key is a column in a data table that uniquely identifies each row in a data source or data table.

A primary key may be updated in situations that allow for better associations with other data sources and/or situations that result in improvements to the quality of the identity resolution process.

Changes to the primary key require truncating data from the domain table upon feed reactivation.

You may edit the primary key. Pick a new one or add more fields to be primary keys.

Important

Changes to primary keys may have unintended downstream effects. If you change the primary key for data that has already been processed by Stitch, a mismatch may be created in the domain model definition and existing domain tables.

To edit the primary key

  1. From the Sources page, open the menu for a feed, and then select Edit. The Feed Editor page opens.

  2. From the Primary Key drop-down, select a field from the list.

  3. If prompted, select Yes, purge the data from my domain table when I activate.

  4. Click Activate.

Delete columns

Warning

Deleting columns in a feed should not be done without considering the downstream effects of those changes. Changes may have unintended consequences: semantics, column names, pick-lists, and so on, may not be immediately available to a downstream process, which may cause that downstream process to fail.

A column may be deleted from a feed. The timing of this deletion is important. If a column is deleted during feed creation, that column will not be loaded to a domain table, be part of any Stitch processes, be a column in a customer 360 database table, be referenced by a SQL query, or be a data point sent to an external destination.

However, a column that has already been loaded to a domain table, been part of the Stitch process, become a column in a customer 360 database table, been referenced by a SQL query, or been sent to an external destination as a data point must be deleted carefully.

Before deleting any column that has been loaded to a domain table, be sure to identify if that column is part of any downstream workflows. Every reference to that column will need to be removed from Amperity. This is especially important for customer 360 database tables and for queries.
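
One way to find references is to search the SQL text of saved queries for the column name. The sketch below assumes that the SQL for each query has been collected into a dictionary by hand; the query names, table names, and function name are hypothetical, and the search is a plain text match rather than a SQL parser.

```python
import re

def find_column_references(column, queries):
    """Return the names of queries whose SQL mentions the column as a whole word."""
    pattern = re.compile(rf"\b{re.escape(column)}\b", re.IGNORECASE)
    return sorted(name for name, sql in queries.items() if pattern.search(sql))

# Hypothetical saved queries, keyed by query name.
queries = {
    "loyalty_export": "SELECT amperity_id, loyalty_id FROM Customer360",
    "email_export": "SELECT amperity_id, email FROM Customer360",
    "tier_report": "SELECT loyalty_id_tier FROM Loyalty",
}

print(find_column_references("loyalty_id", queries))
```

The whole-word match keeps a column such as loyalty_id_tier from being reported as a reference to loyalty_id.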

Edit columns

Warning

Editing columns in a feed should not be done without considering the downstream effects of those changes. Changes may have unintended consequences: semantics, column names, pick-lists, and so on, may not be immediately available to a downstream process, which may cause that downstream process to fail.

Changes to columns in feed data should be done carefully. Depending on the type and scope of the changes to be made, take one of the following approaches:

  1. Remove references to the column, re-load the data via the feed, and then re-add the column with the updated metadata.

  2. Remove any databases and queries that reference the column, re-load the data via the feed, and then recreate the databases and queries.

Load new data

Use the Load new data option in the row for a specific feed to upload a new data file. The file must have the same schema, but may contain new data. The domain table is updated automatically.

To load new data to an existing feed

  1. From the Sources page, open the menu for a feed, and then select Load new data. The Load New Data dialog box opens.

  2. Select a data source.

    Important

    The selected file must have a schema that matches the schema already defined for the feed.

  3. Click Load.
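
A quick local check can catch a schema mismatch before a file is uploaded. The sketch below compares the header row of a CSV file with the field names expected by the feed. It assumes that matching field names is sufficient and does not check field types or column order; the function name and sample data are hypothetical.

```python
import csv
import io

def schema_mismatch(expected_fields, csv_file):
    """Compare a file's header row with the expected field names.

    Returns a (missing, unexpected) pair of sorted lists.
    """
    header = next(csv.reader(csv_file), [])
    missing = sorted(set(expected_fields) - set(header))
    unexpected = sorted(set(header) - set(expected_fields))
    return missing, unexpected

# Hypothetical feed schema and new file.
expected = ["name", "email", "phone"]
new_file = io.StringIO("name,email,mobile\nJustin Currie,justincurrie@mail.com,222-555-1212\n")

missing, unexpected = schema_mismatch(expected, new_file)
print("missing:", missing)
print("unexpected:", unexpected)
```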

Generate fields

A field may be generated using one (or more) fields that already exist in a feed. There are three primary use cases:

  1. To match a customer key that must be built from a group of fields.

  2. To apply a foreign key to a field or group of fields.

  3. To combine fields into new columns to support downstream workflows.

Add customer keys

A customer key is a column in a data table with a value that represents a unique ID for that data as it exists in the customer’s data outside of Amperity. Some scenarios require Amperity to keep track of that unique identifier so that it may be used with downstream workflows that rely on it.

A customer key that is represented by a single field does not need a generated field. A customer key that is built from more than one field will require a generated field.

For example, a customer might use a combination of a customer’s first name, last name, and email address as a way to uniquely identify that customer within a data source. Use a generated field to build a field that contains those elements of the customer key, and then apply a UUID transform against it. The UUID will be available in the customer 360 database and to queries, as well as being a field that can be sent from Amperity to any downstream workflow.

To generate a customer key

  1. From the Sources page, expand Generated Fields.

  2. Click Add generated field. The New Field dialog box opens.

  3. Add the name for the generated field.

  4. From the Fields drop-down menu, select the fields that will be used to create the generated field.

  5. Select the Apply a UUID transform to this field option.

  6. From the Semantics drop-down menu, select the ck semantic tag.

  7. Click Activate.

    When data is loaded to the domain table, a column is added to the table that has the same name as the generated field and the value of the column is a UUID.
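
The idea of building one stable identifier from several fields can be sketched in Python. This is an illustration only: it assumes a name-based UUID (version 5) over the hyphen-joined field values, which is not necessarily how Amperity computes the UUID transform. The namespace and function name are hypothetical.

```python
import uuid

# Assumption: a fixed namespace chosen for this illustration only.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

def customer_key(first_name, last_name, email):
    """Derive a repeatable UUID from the fields that make up the customer key."""
    combined = "-".join([first_name, last_name, email])
    return str(uuid.uuid5(NAMESPACE, combined))

key_a = customer_key("Justin", "Currie", "justincurrie@mail.com")
key_b = customer_key("Justin", "Currie", "justincurrie@mail.com")
key_c = customer_key("David", "Zearfoss", "davidzearfoss@mail.com")

print(key_a == key_b)  # the same inputs always produce the same UUID
print(key_a == key_c)  # different inputs produce a different UUID
```

A name-based UUID is used here because it is repeatable: the same combination of values produces the same key each time data is loaded.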

Add foreign keys

A foreign key is a column in a data table that identifies a field that is to be associated to a primary key in another data source or data table. A foreign key is one of the ways Amperity associates interaction records to customer records.

To generate a foreign key

  1. From the Sources page, expand Generated Fields.

  2. Click Add generated field. The New Field dialog box opens.

  3. Add the name for the generated field.

  4. From the Fields drop-down menu, select the fields that will be used to create the generated field.

  5. Select the Apply a UUID transform to this field option.

  6. In the Semantics field, type the fk-[namespace] semantic tag, and then press RETURN. (An fk- semantic tag is not available from the drop-down menu, so it must be typed.)

  7. Click Activate.

    When data is loaded to the domain table, a column is added to the table that has the same name as the generated field and the value of the column is a UUID.

Combine fields

Sometimes it is necessary to create a synthetic key for use with downstream workflows. For example, table A might have a column named “full_name” and table B might have two columns named “first_name” and “last_name”:

Table A:                  Table B:

---------------          ------------ -----------
 full_name                first_name   last_name
---------------          ------------ -----------
 John Smith               John         Smith
 Justin Currie            Justin       Currie
 Mary Green               Mary         Green
---------------          ------------ -----------

To support a downstream workflow where queries against either table can look for full names, you can add a generated field to table B:

Table A:                  Table B:

---------------          ------------ ----------- ---------------
 full_name                first_name   last_name   full_name
---------------          ------------ ----------- ---------------
 John Smith               John         Smith       John-Smith
 Justin Currie            Justin       Currie      Justin-Currie
 Mary Green               Mary         Green       Mary-Green
---------------          ------------ ----------- ---------------

To combine fields into a single column

  1. From the Sources page, expand Generated Fields.

  2. Click Add generated field. The New Field dialog box opens.

  3. Add the name for the generated field.

  4. From the Fields drop-down menu, select the fields that will be used to create the generated field.

  5. If semantic tags should continue to be associated with this generated field, select the appropriate semantic for the value that will be the result of this combination.

  6. Deselect the Apply a UUID transform to this field option.

    Warning

    Associating a field that has a UUID transform to a field that also has a semantic tag could result in “John Smith” being associated with a value similar to “123e4567-e89b-12d3-a456-426614174000”.

  7. Click Activate.

    When data is loaded to the domain table, a column is added to the table that has the same name as the generated field. The value of the column is a hyphenated value that is based on the associated fields.
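
The resulting values can be pictured with a short Python sketch that joins the selected fields with a hyphen, using the rows from table B above. The function name is hypothetical and the code only illustrates the shape of the output.

```python
def combine_fields(row, fields, separator="-"):
    """Join the values of the selected fields into one hyphenated value."""
    return separator.join(row[f] for f in fields)

table_b = [
    {"first_name": "John", "last_name": "Smith"},
    {"first_name": "Justin", "last_name": "Currie"},
    {"first_name": "Mary", "last_name": "Green"},
]

# Add the generated column to each row.
for row in table_b:
    row["full_name"] = combine_fields(row, ["first_name", "last_name"])

print([row["full_name"] for row in table_b])
```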

Make available to Stitch

A domain table with semantic tags applied to records that contain PII data should be made available to Stitch. A domain table that is made available to Stitch is used by Stitch for customer identity resolution.

Domain table data is made available to Stitch in two steps:

  1. Selecting the Make available to Stitch option when configuring a feed or a custom domain table.

    When selected, the name of the domain table that is associated with the feed or custom domain table is added to a list of domain tables that are accessible as a Stitch configuration setting.

  2. Configuring each of those domain tables within the Stitch configuration. Each table in the list of domain tables that have been made available to Stitch must be configured for Stitch before it can be part of the identity resolution process.

Tip

Only tables that contain PII data should be made available to Stitch. Tables that are later associated with Amperity IDs, but do not contain PII data, such as those that contain transactions, should use a foreign key to associate those records with an Amperity ID.

To make data available to Stitch

  1. From the Sources page, open the menu for a feed, and then select Edit. The Feed Editor page opens.

  2. Under Domain Table select Make available to Stitch.

  3. Click Activate.

Replace data source

In certain cases a data source needs to be replaced in a way that allows a new schema to be added to Amperity, while limiting the number of changes that must be applied to downstream workflows.

A data source may need to be replaced when there are minor schema changes or when an upstream workflow changes the manner in which that data source is provided to Amperity.

To replace a data source

  1. Add it as a second version of the original. For example: source_data and source_data_v2.

  2. Add a foreign key that is shared only by the original and second version.

    Tip

    Examine the data sources.

    If the primary key field is identical in both data sources, assign the foreign key to the primary key fields. Otherwise, identify a field in each source that can be assigned a shared foreign key.

  3. Make both tables available to Stitch when the data source contains PII data.

  4. Run Stitch.

    Note

    Be sure that the second version is added to the Stitch configuration before running Stitch.

  5. Step through the Stitch QA process, and then examine both versions for foreign key validation.

    Important

    The foreign key that is shared by the original and second version should not show multiple Amperity IDs within either the original or second versions. The same individual across both versions should not show different Amperity IDs.
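
The validation described above amounts to checking that every shared foreign key value maps to exactly one Amperity ID across both versions. A minimal sketch, assuming the foreign key and Amperity ID pairs have been exported from both tables; the function name and sample values are hypothetical.

```python
from collections import defaultdict

def keys_with_multiple_ids(records):
    """Return foreign keys that are associated with more than one Amperity ID.

    records: iterable of (foreign_key, amperity_id) pairs from both versions.
    """
    ids_by_key = defaultdict(set)
    for foreign_key, amperity_id in records:
        ids_by_key[foreign_key].add(amperity_id)
    return {fk: sorted(ids) for fk, ids in ids_by_key.items() if len(ids) > 1}

# Hypothetical exports from source_data and source_data_v2.
source_data = [("fk-001", "amp-a"), ("fk-002", "amp-b")]
source_data_v2 = [("fk-001", "amp-a"), ("fk-002", "amp-c")]

conflicts = keys_with_multiple_ids(source_data + source_data_v2)
print(conflicts)  # foreign keys that need a closer look
```

An empty result means that no foreign key is split across Amperity IDs in the exported rows.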

Resolve name conflicts

A domain table requires each column name to be unique. However, a sample file may contain duplicate field names. If this occurs, use the Incoming Field column to apply a namespace, an ordinal, or some other naming convention to ensure that column names in the domain table will be unique.

For example, a sample CSV file:

name,email,fname,lname,phone,phone
Justin Currie,justincurrie@mail.com,Justin,Currie,222-555-1212,505-404-1234
David Zearfoss,davidzearfoss@mail.com,David,Zearfoss,222-555-1313,505-606-8765

will display an alert in the Feed Editor similar to:

There is a name conflict in the resulting fields of this feed. Make sure
that none of the field names are duplicated (including any generated fields).

This is because the phone field appears twice. Use the Incoming Field column to rename the first instance of phone to phone_home and the second instance of phone to phone_mobile.
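
Duplicate field names can also be spotted before a sample file is uploaded. The sketch below reads a header row and applies an ordinal suffix to any repeated name, which is one of the naming conventions mentioned above. The function name is hypothetical, and the sketch does not guard against a suffixed name colliding with a field that already exists.

```python
from collections import Counter

def dedupe_field_names(header):
    """Append an ordinal suffix to every field name that appears more than once."""
    totals = Counter(header)
    seen = Counter()
    result = []
    for name in header:
        if totals[name] > 1:
            seen[name] += 1
            result.append(f"{name}_{seen[name]}")
        else:
            result.append(name)
    return result

header = "name,email,fname,lname,phone,phone".split(",")
print(dedupe_field_names(header))
```

Descriptive names, such as phone_home and phone_mobile, are usually easier to work with downstream than ordinals when the meaning of each column is known.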

Set error thresholds

Amperity processes and checks all rows. The error limit is the percentage of rows that may contain errors when compared to the total number of rows. Data is not ingested when the error limit is exceeded. Adjust the error limit higher or lower to match acceptable percentages for various types of data and data quality.
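
The arithmetic behind the error limit can be sketched as follows. The function name is hypothetical, and the sketch assumes that a load is rejected only when the error percentage is strictly greater than the limit.

```python
def exceeds_error_limit(error_rows, total_rows, limit_percent=1):
    """Return True when the share of rows with errors is above the limit.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    if total_rows == 0:
        return False
    return error_rows * 100 > limit_percent * total_rows

print(exceeds_error_limit(15, 1000))     # 1.5% is above the default 1% limit
print(exceeds_error_limit(10, 1000))     # exactly 1% does not exceed the limit
print(exceeds_error_limit(15, 1000, 5))  # 1.5% is within a 5% limit
```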

To set the error threshold

  1. From the Sources page, open a feed.

  2. In the Feed Editor, expand Error Thresholds.

  3. Under MAX, enter an integer between 0 and 100. (The default limit is 1%.)

  4. Click Activate.

Set last updated field

Amperity requires each feed to specify a field that describes when each record was last updated. If multiple records in the incoming data and/or the existing domain table have the same primary key, the record with the most recent “last updated” field will be retained. This may be associated with a field that has a datetime field type, or an integer (such as for unix timestamps).

Note

Amperity does not use a field with a date data type because that value is not granular enough to determine priority.

If you have no such updated field, you can choose to autogenerate a field, in which case the following logic is used to determine which record to keep when a primary key appears more than once:

  1. Records from newly-ingested data will always overwrite records that already exist in the domain table.

  2. If couriers are run over a date range, records from files associated with later dates will be retained.

  3. If multiple files are loaded for the same date, records for the latest-loaded file are retained. File loading order depends on the behavior of the source system, but is generally deterministic.

  4. If the same primary key appears on multiple records in the same text-based file, the last row in the file is retained.

Note

When using ingest queries, the above tiebreakers are unavailable, so upserting behavior can be nondeterministic. Ensure that you either specify a “last updated” field, or that your ingest query only returns a single record for each primary key, to ensure deterministic results.
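
The effect of a "last updated" field during loading can be sketched as an upsert keyed on the primary key. This is an illustration, not Amperity's implementation. The field names are hypothetical, timestamps are integers (such as unix timestamps), and the sketch assumes that on a tie the incoming record replaces the existing one.

```python
def upsert(domain_table, incoming, key="id", last_updated="updated_at"):
    """Merge incoming records, keeping the most recently updated record per key.

    domain_table: dict that maps a primary key value to a record.
    """
    result = dict(domain_table)
    for record in incoming:
        existing = result.get(record[key])
        # Assumption: on a tie, the incoming record wins.
        if existing is None or record[last_updated] >= existing[last_updated]:
            result[record[key]] = record
    return result

domain_table = {"1": {"id": "1", "email": "old@mail.com", "updated_at": 100}}
incoming = [
    {"id": "1", "email": "new@mail.com", "updated_at": 200},
    {"id": "1", "email": "stale@mail.com", "updated_at": 150},
    {"id": "2", "email": "other@mail.com", "updated_at": 120},
]

merged = upsert(domain_table, incoming)
print(len(incoming), "records loaded,", len(merged), "records in the table")
print(merged["1"]["email"])
```

Three records are loaded but only two remain, because two of the incoming records share a primary key with each other and with an existing record.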

To set the last updated field

  1. From the Sources page, open the menu for a feed, and then select Edit. The Feed Editor page opens.

  2. Locate the Last Updated Field setting, which is above the field list in the center of the page.

  3. Under Last Updated Field, choose how Amperity will determine priority: automatically generated, a field with an integer data type, or a field with a datetime data type (often the same field to which the update-dt merge rules semantic tag is applied).

  4. Click Activate.

Tip

The number of records in a domain table may not match the number of records loaded by Amperity after loading data. Amperity uses an UPSERT process when loading data and determines priority based on the Last Updated Field. If a large difference exists, take a close look at the primary key and determine if the primary key is the cause.

Truncate feed

You can empty the contents of a feed directly from the Sources page.

To truncate a feed

  1. From the Sources page, open the menu for a feed, and then select Truncate. The Truncate Feed dialog box opens.

  2. Click Truncate.

View sample data

You can view sample data while building a feed. Click the View sample file link to open the View Sample Page. This page shows up to 200 rows of data based on the current state of the feed schema.