About Semantic Tags

A semantic is a way to apply a common understanding to individual points of data across multiple data sources, even when data sources have different schemas, naming conventions, and levels of data quality. Assigning a semantic tag to individual columns in customer data is an important prerequisite to the Stitch process.

A semantic type is directly associated with data values that appear in customer data tables. Semantic types exist for columns that contain values like first names, email addresses, home addresses, cities, phone numbers, and so on. Amperity has many built-in semantic types, including groupings for personally identifiable information (PII), transactions, itemized transactions, and other consumer behaviors.

The following semantic groups are available for tagging fields in customer records and interaction records:

How semantic tags work

Semantic tags must be defined for every feed that will provide profile data to Stitch. This ensures that data from rich sources of profile data are brought into Amperity in a consistent manner, which improves the outcome of the Stitch process.

Semantic tagging works like this:

  1. A field in the customer’s system named “fname” stores an individual’s given name.

  2. A field in the customer’s system named “lname” stores the same individual’s last name.

  3. A field in the customer’s system named “primary-phone” stores a phone number.

  4. A field in the customer’s system named “date” stores an individual’s birthdate.

  5. And so on.

For those fields, the feed should apply semantic tags like this:

Input Field      Semantic Tag
fname            given-name
lname            surname
primary-phone    phone
date             birthdate

This same pattern is applied to every customer data source that is brought into Amperity and it results in every single semantically-tagged field being analyzed by Amperity during the Stitch process in exactly the same way.
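The mapping in the table above can be sketched in a few lines of Python. This is only an illustration of the idea; it is not how feeds are configured in Amperity, and the names SEMANTIC_TAGS and tag_record are hypothetical.

```python
# Hypothetical mapping from source field names to semantic tags,
# copied from the example table above.
SEMANTIC_TAGS = {
    "fname": "given-name",
    "lname": "surname",
    "primary-phone": "phone",
    "date": "birthdate",
}


def tag_record(record: dict) -> dict:
    """Re-key a source record by semantic tag; untagged fields are dropped."""
    return {
        SEMANTIC_TAGS[field]: value
        for field, value in record.items()
        if field in SEMANTIC_TAGS
    }


# A made-up source row; "notes" has no semantic tag and is ignored.
source_row = {"fname": "Ada", "lname": "Lovelace", "date": "1815-12-10", "notes": "vip"}
tagged = tag_record(source_row)
```

Once every data source is re-keyed this way, downstream logic can refer to given-name or birthdate without knowing what each source system called the field.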

Amperity has built-in semantic tags for personally identifiable information (PII), transactions, and behaviors. In addition, custom semantic tags may be applied to fields when doing so helps identify unique individuals across massive data sets.

Profile semantic tags are used by Amperity for identity resolution, which is the process that builds a unified customer profile for all of your unique customers. All other semantic tags, such as for transactions and itemized transactions, are used to associate your customer’s interactions with your brand to their individual customer profiles.

What semantic tags does Stitch rely on?

Stitch relies on the following semantic tags to be applied to customer records:

  • given-name (first name) and surname (last name). In some cases, a full-name is inferred (if not available).

  • Other important profile details, such as birthdate, email, and phone.

  • The address, address2, city, state, and postal tags are combined to represent a complete physical address.

  • Other location details, such as country and company.

  • Additional profile details, when available, such as gender, generational-suffix (Jr., Sr., III, etc.), and title.

Stitch uses foreign keys to associate individual customers to their interactions with your brands.

Blocklist

A bad-values blocklist contains known values that appear frequently in data and should be excluded from the Stitch process.

The following table describes recommended patterns to use when defining semantic tags for a bad-values blocklist.

Semantic Name     Description
blv/datasource    Apply to the datasource column in the bad-values blocklist table.
blv/semantic      Apply to the semantic column in the bad-values blocklist table.
blv/value         Apply to the value column in the bad-values blocklist table.
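As a rough illustration of how the three blocklist columns work together, the following Python sketch drops a value when the combination of data source, semantic, and value appears in the blocklist. The rows are made up, the names BlocklistEntry and scrub are hypothetical, and exact per-source matching is an assumption; the matching rules Stitch actually applies are not described here.

```python
from typing import NamedTuple, Optional


class BlocklistEntry(NamedTuple):
    datasource: str  # column tagged blv/datasource
    semantic: str    # column tagged blv/semantic
    value: str       # column tagged blv/value


# Made-up rows for illustration only.
BLOCKLIST = {
    BlocklistEntry("loyalty", "email", "noreply@example.com"),
    BlocklistEntry("pos", "phone", "555-555-5555"),
}


def scrub(datasource: str, semantic: str, value: Optional[str]) -> Optional[str]:
    """Return None when the value is blocklisted for this source and semantic."""
    if value is not None and BlocklistEntry(datasource, semantic, value) in BLOCKLIST:
        return None
    return value
```

For example, scrub("pos", "phone", "555-555-5555") returns None, while the same phone number from a data source that is not listed passes through unchanged.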

Compliance

The following table describes the semantic tags that are used for CCPA and/or GDPR compliance workflows.

Semantic Name

Description

compliance/request-email

The email address for the customer. This is used to find their records in Amperity.

compliance/request-id

The tracking identifier for the customer’s compliance workflow. This ID should be provided by the customer and must be unique.

compliance/request-strategy

The compliance request strategy used for matching exact email data, semantic tags, and Amperity IDs.

compliance/request-type

The type of compliance request. Possible values: delete or data subject access request (DSAR).

Custom

A custom semantic is a string that is applied as a semantic tag when configuring a feed. Some use cases for custom semantics include specifying keys (primary, customer, and foreign), assigning ordinals or namespaces to email and phone PII semantics, or arbitrary strings to capture specific customer use cases.

The following table describes recommended patterns to use when defining custom semantics:

Custom Semantic Pattern

Description

itemized transactions

All custom semantics that are associated with itemized transactions must be prefixed with txn-item/.

keys

Keys are used to identify signals in source data that can be applied during the Stitch process.

loyalty-id

The identifier for a loyalty program that is associated with an individual customer record.

A loyalty ID may be associated with a customer key (ck) or a foreign key (fk-[namespace]), but otherwise follows all patterns associated with PII semantics.

Tip

Use additional custom semantic tags when the data contains more information about loyalty programs. Keep the prefix loyalty-, and then append an appropriate string to improve the user experience with downstream workflows. For example, if the data contains a field for loyalty points, use a custom semantic named loyalty-points to tag that field.

PII

All custom semantics that are associated with PII should be prefixed with the PII semantic to which the custom semantic is most closely associated. For example: email-personal and email-work are most closely associated with the email semantic.

Note

In general, the use of custom semantics for PII is limited to namespace or ordinal variation of email and phone.

product

All custom semantics that are associated with products must be prefixed with pc/.

transaction

All custom semantics that are associated with order-level transactions must be prefixed with txn/.
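The prefix conventions in this table lend themselves to a simple lookup. The sketch below is a hypothetical helper (custom_semantic_group is not part of Amperity) that reports which pattern a custom semantic name follows, based only on the prefixes listed above.

```python
# Prefix conventions from the table above. The group labels are illustrative.
# This helper is meant for custom semantic names only; built-in groups such as
# email-event/ are not considered.
PREFIX_GROUPS = [
    ("txn-item/", "itemized transactions"),
    ("txn/", "transaction"),
    ("pc/", "product"),
    ("loyalty-", "loyalty"),
    ("email-", "PII (email)"),
    ("phone-", "PII (phone)"),
]


def custom_semantic_group(tag: str) -> str:
    """Return the pattern a custom semantic follows, or "other"."""
    for prefix, group in PREFIX_GROUPS:
        if tag.startswith(prefix):
            return group
    return "other"
```

A check like this could be run over a list of proposed custom semantics before a feed is configured, to catch names such as item-gift-wrap that were meant to be txn-item/gift-wrap.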

Database

A field in a database table may be flagged as required, as unique, or as both required and unique. These flags are validated by Amperity. When the validation conditions are not met, a warning is raised. These flags should be used for specific types of fields in a table to help ensure that data within Amperity remains healthy and that downstream workflows are built on top of the correct data. Database field semantics are preceded by db/ in the drop-down menu for semantics in the Database Editor.

Warning

Validation warnings appear in the Notifications pane as part of the notification for a database update. Each validation warning specifies the table name and the field name that failed validation.

The following semantics may be used to tag fields as required, as unique, or as both required and unique in database tables:

Semantic Name

Datatype

Icon

Description

required

Indicates the field is required to have a non-NULL value.

Note

This tag is assigned automatically to all fields that contain the Amperity ID.

A field that is assigned the required semantic requires every value for that field within the same table to have a non-NULL value, but does not require values to be unique. NULL values will cause an error during validation. All other values, including zero-length strings, will pass validation.

Note

A field may be assigned the required and unique semantics. Use this only for fields that must be present and unique, such as for the Amperity ID.

unique

Indicates the field is required to be a unique field in the customer 360 database.

A field that is assigned the unique semantic requires every value for that field within the same table to be unique. Fields with NULL values are ignored by validation, but all other values, including zero-length strings, must pass.

Note

A field may be assigned the required and unique semantics. Use this only for fields that must be present and unique, such as for the Amperity ID.
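The NULL and zero-length string rules for required and unique can be easy to misread, so here is a minimal Python sketch of both checks. None stands in for NULL, and the function names are hypothetical; this mirrors the rules described above rather than Amperity's actual validator.

```python
from collections import Counter
from typing import Any, Iterable, List


def required_violations(values: Iterable[Any]) -> List[int]:
    """Row indexes that fail `required`: only NULL (None) fails."""
    return [i for i, v in enumerate(values) if v is None]


def unique_violations(values: Iterable[Any]) -> List[Any]:
    """Values that fail `unique`: duplicates among non-NULL values."""
    counts = Counter(v for v in values if v is not None)
    return [v for v, n in counts.items() if n > 1]


# "" is a zero-length string: it passes `required`, but two of them
# violate `unique`. The two NULLs are ignored by `unique`.
column = ["a1", "", None, "a1", "", None]
```

With this column, the required check flags rows 2 and 5 (the NULLs), and the unique check flags "a1" and the zero-length string.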

Email engagement

Email engagement semantic tags capture email events data, such as clicks, opens, bounces, opt-ins, opt-outs, and conversions from any email service provider (ESP) data source.

  1. Use email events semantic tags when raw email events data is sent directly to Amperity.

    Caution

    The data volume for email events data can be very large. Talk with your Amperity representative before applying email events semantic tags to raw email events data.

  2. Use email summary semantic tags when data is aggregated prior to sending it to Amperity.

Email events

Email events associate email summary statistics to brands, email addresses, regions, event types, event dates and times, and sender IDs.

Apply email event semantic tags to data sources that contain data for raw email events. Use the built-in list of semantics when building a feed or custom domain table. Email event semantics are prefixed with email-event/ in the semantics drop-down menu in the Feed Editor.

Important

Email events semantic tags should only be applied to data sources that provide at least 15 months of raw email events data. The storage requirements for this type of data can be significant. Talk with your Amperity representative about your downstream use cases prior to applying email events semantic tags to raw email events data sources.

The following table lists the tags available to this semantic group (with required semantic tags noted by “Required.”):

Semantic Name

Datatype

Description

brand

String

Required.

The brand or company from which an email was sent.

email

String

Required.

The email address to which an email was sent.

event-datetime

String

Required.

The date and time at which the email event occurred.

event-type

String

Required.

The type of email event. Possible values:

  • Open

  • Click

  • Sent

  • Opt-in

  • Bounced

  • Converted

region

String

The region or location from which an email was sent. The region or location is typically associated to a single brand.

send-id

String

Required.

The unique identifier for the email that was sent to an email address at a specific date and time. If a data source does not provide a send ID, a unique key is generated.
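Before applying email event semantics, it can help to confirm that a data source covers every required tag. The following sketch is a hypothetical pre-check written against the table above: the tag names and event types come from the table, while the function names are illustrative.

```python
# Tags marked "Required." in the table above.
REQUIRED_EVENT_TAGS = {"brand", "email", "event-datetime", "event-type", "send-id"}

# Possible values listed for event-type.
EVENT_TYPES = {"Open", "Click", "Sent", "Opt-in", "Bounced", "Converted"}


def missing_event_tags(applied_tags) -> list:
    """Required tags that have not been applied to any field."""
    return sorted(REQUIRED_EVENT_TAGS - set(applied_tags))


def is_known_event_type(value: str) -> bool:
    """True when the value is one of the listed event types."""
    return value in EVENT_TYPES
```

For example, a feed that tags only brand, email, event-type, and region is missing event-datetime and send-id.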

Email summary

Email summary statistics provide fields that summarize customer engagement with your brand. Individual statistics include brand, email address, counts for opens and clicks by day (1, 3, 5, 7, and 14) and by month (3, 6, 9, and 12), engagement frequency, and engagement status.

Apply email summary semantic tags to data sources that contain email summary data for how customers interact with emails sent to them from your brands. Use the built-in list of semantics when building a feed. Email summary semantics are prefixed with email-summary/ in the semantics drop-down menu in the Feed Editor.

Warning

Email summary semantic tags cannot be applied to raw email events data.

The following table lists the tags available to this semantic group (with required semantic tags noted by “Required.”):

Semantic Name

Datatype

Description

brand

String

Required.

The brand or company from which an email was sent.

email

String

Required.

The email address to which an email was sent.

email-clicks-last-x-days

Integer

The number of email clicks in the last 1, 3, 5, 7, or 14 days.

email-clicks-last-x-months

Integer

The number of email clicks in the last 3, 6, 9, or 12 months.

email-opens-last-x-days

Integer

The number of email opens in the last 1, 3, 5, 7, or 14 days.

email-opens-last-x-months

Integer

The number of email opens in the last 3, 6, 9, or 12 months.

first-email-open-datetime

Datetime

The date and time at which an email was first opened.

first-email-send-datetime

Datetime

The date and time at which an email was first sent.

most-recent-bounce-datetime

Datetime

The date and time for the most recent bounced email.

most-recent-email-click-datetime

Datetime

The date and time at which a customer most recently clicked a link or offer within an opened email.

most-recent-email-open-datetime

Datetime

The date and time at which a customer most recently opened an email.

most-recent-email-optin-datetime

Datetime

The date and time at which a customer most recently opted-in to receiving email.

most-recent-email-optout-datetime

Datetime

The date and time at which a customer most recently opted-out from receiving email.

most-recent-email-send-datetime

Datetime

The date and time at which an email was most recently sent.

region

String

The region or location from which an email was sent. The region or location is typically associated to a single brand.
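Because email summary data is aggregated before it is sent to Amperity, the upstream system has to compute the rolling counts itself. The sketch below shows one way the email-clicks-last-x-days values might be derived from click timestamps. The window boundaries (a trailing window that ends at a reference time) and the function name are assumptions, not a documented Amperity definition.

```python
from datetime import datetime, timedelta

# Day windows listed for email-clicks-last-x-days.
DAY_WINDOWS = (1, 3, 5, 7, 14)


def clicks_last_x_days(click_times, as_of):
    """Count clicks that fall within each trailing window ending at as_of."""
    return {
        days: sum(1 for t in click_times if as_of - timedelta(days=days) < t <= as_of)
        for days in DAY_WINDOWS
    }


as_of = datetime(2024, 3, 15, 12, 0)
clicks = [
    datetime(2024, 3, 15, 9, 0),    # a few hours before as_of
    datetime(2024, 3, 13, 12, 30),  # just under 2 days before as_of
    datetime(2024, 3, 9, 8, 0),     # about 6 days before as_of
    datetime(2024, 2, 20, 8, 0),    # outside every window
]
summary = clicks_last_x_days(clicks, as_of)
```

Here summary holds 1 click for the 1-day window, 2 for the 3-day and 5-day windows, and 3 for the 7-day and 14-day windows. The same approach applies to opens and to the month-based windows.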

Itemized transactions

An itemized transactions semantic is a way to identify brands, channels, stores, orders, products, quantities, per-item costs, total costs, and so on. Use itemized transactions semantics when a data source contains one row per item.

Itemized transaction semantics should be applied to data sources that contain interaction records for customer transactions that contain details for each individual item in a transaction. Itemized transaction semantics may be applied alongside other semantics, depending on the data source, and are often applied alongside transaction semantics in the same data source. Use the built-in list of semantics when building a feed. Itemized transaction semantics are prefixed with txn-item/ in the semantics drop-down menu in the Feed Editor.

Important

This collection of semantic tags is used by Amperity to build the Unified_Itemized_Transactions table. Each semantic tag is directly associated with a column in that table. For example, values identified by the is-cancellation, item-cost, and order-id semantic tags are added to the is_cancellation, item_cost, and order_id columns, respectively.

The Unified_Itemized_Transactions table contains every row of every stitched table with all transactional data summarized to the item level, and then coalesced into a single row for each unique combination of order ID and product ID. The order ID is associated with an Amperity ID.

Carefully review the data in the Unified_Itemized_Transactions table, including column values that are calculated from values in other columns in this table or the Unified_Transactions table, to verify their accuracy and to ensure that associated semantic tags have been applied correctly.
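When reviewing the table, it helps to know what one entry per combination of order ID and product ID looks like. The following Python sketch collapses item-level rows that share the same order ID and product ID by summing quantity and revenue. It is an approximation for checking source data by hand, not the logic Amperity runs, and it does not treat returns or cancellations separately.

```python
from collections import defaultdict


def collapse_items(rows):
    """Combine rows that share (order_id, product_id), summing the amounts."""
    totals = defaultdict(lambda: {"item_quantity": 0, "item_revenue": 0.0})
    for row in rows:
        key = (row["order_id"], row["product_id"])
        totals[key]["item_quantity"] += row["item_quantity"]
        totals[key]["item_revenue"] += row["item_revenue"]
    return [
        {"order_id": order_id, "product_id": product_id, **amounts}
        for (order_id, product_id), amounts in totals.items()
    ]


# Made-up rows: SKU-1 appears twice in order A1 and is collapsed into one entry.
rows = [
    {"order_id": "A1", "product_id": "SKU-1", "item_quantity": 1, "item_revenue": 10.0},
    {"order_id": "A1", "product_id": "SKU-1", "item_quantity": 2, "item_revenue": 20.0},
    {"order_id": "A1", "product_id": "SKU-2", "item_quantity": 1, "item_revenue": 5.0},
]
collapsed = collapse_items(rows)
```

After collapsing, order A1 has two entries: SKU-1 with a quantity of 3 and revenue of 30.0, and SKU-2 with a quantity of 1 and revenue of 5.0.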

The following table lists the tags available to this semantic group (with required semantic tags noted by “Required.”):

Semantic Name

Datatype

Description

[custom-semantic]

String

Required

Use a foreign key (recommended) or a custom semantic tag (such as customer-id) within interaction records to associate them to the Amperity ID.

Important

See fk-[namespace]. At least one field must have the [custom-semantic] or fk-[namespace] semantic tags applied to it to support downstream processing requirements for interaction records. You may apply more than one, or use a combination, of these semantic tags.

When a custom semantic tag is added to itemized transactions data it:

  • Must be a unique customer identifier that can be used to join interaction records (transactions and itemized transactions) to tables that contain the Amperity ID.

  • Must be unique for each order ID in the Unified_Itemized_Transactions table.

currency

String

Optional

Currency represents the type of currency that was used to pay for an item. For example: dollar.

Note

Currency must be consistent across all orders from the same data source.

digital-channel

String

Optional

The digital channel by which a transaction was made. For example: Facebook, Google Ads, email, etc.

Note

This semantic tag should only be used when purchase-channel specifies an online channel.

fk-[namespace]

String

Required

The fk-[namespace] semantic tag identifies a field as a foreign key. A foreign key semantic tag must be namespaced. For example: fk-customer, fk-interaction, fk-audience, or fk-brand.

A namespaced foreign key must be present in interaction records that contain transactions data. A foreign key may be used along with a customer ID.

Important

See [custom-semantic]. At least one field must have the fk-[namespace] or [custom-semantic] semantic tags applied to it to support downstream processing requirements for interaction records. You may apply more than one, or use a combination, of these semantic tags.

When a foreign key is added to transactions data it:

  • Must match a foreign key in a table that is output by Stitch.

  • Must be well-distributed across the data source (a high percentage of values must not be 0).

  • Must be unique for each order ID in the Unified_Itemized_Transactions table.

  • May contain a NULL value.

is-cancellation

Boolean

Required

A flag that indicates if the item was cancelled.

Important

The field to which the is-cancellation semantic is applied must represent a value that is TRUE when items are cancelled, FALSE when items are purchases, and NULL when the value is unknown.

Note

The is-cancellation and is-return semantic tags may not be applied to the same field.

is-return

Boolean

Required

A flag that indicates if the item was returned.

Important

The field to which the is-return semantic is applied must represent a value that is TRUE when items are returns, FALSE when items are purchases, and NULL when the value is unknown.

Note

The is-cancellation and is-return semantic tags may not be applied to the same field.

item-cost

Decimal

Optional

Item cost is the cost to produce all units of an item.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

item-discount-amount

Decimal

Optional

Item discount amount is the discount amount that is applied to all units that are associated with a single item within a single transaction.

This value should equal item quantity multiplied by unit discount amount.

This value is used by Amperity for discount sensitivity analysis.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

item-discount-percent

Decimal

Optional

Item discount percent is the percentage discount that is applied to all units that are associated with a single item within a single transaction.

This value is used by Amperity for discount sensitivity analysis.

Note

This value must be between 0 and 1.

item-list-price

Decimal

Optional

The manufacturer’s suggested retail price (MSRP) for all units of this item.

The manufacturer’s suggested retail price (MSRP) is the price before shipping costs, taxes, and/or discounts have been applied. MSRP is sometimes referred to as the base price.

This value should equal item revenue plus item discount amount.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

item-profit

Decimal

Optional

Item profit represents the amount of profit that is earned when all units of an item are sold.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

item-quantity

Integer

Required

Item quantity is the total number of items in an order. When an item has been returned or an order has been cancelled, item quantity is the total number of items that were returned and/or cancelled.

Note

This value must be less than or equal to 0 when is-return or is-cancellation are true.

item-revenue

Decimal

Required

The total revenue for all units of an item, after discounts are applied. When an item has been returned or the order has been cancelled, the total revenue for all items that were returned and/or cancelled.

This value should equal item list price minus item discount amount.

Note

This value must be less than or equal to 0 when is-return or is-cancellation are true.

item-subtotal

Decimal

Optional

An item subtotal is the amount for an item, before discounts are applied.

This value should equal unit list price times item quantity.

This value is used by Amperity to calculate discounts for discount sensitivity analysis.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

item-tax-amount

Decimal

Optional

An item tax amount is the total amount of taxes that are associated with the purchase of an item.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

order-datetime

Datetime

Required

Order datetime is the date (and time) on which an order was placed.

The order date:

  • Must have a consistent time zone across all dates in the transactions data.

  • Should be a local time zone.

  • Should be a timestamp, which is converted to datetime automatically when a date is present in the timestamp.

  • When is-return is TRUE, the date and time on which the order was returned.

  • When is-cancellation is TRUE, the date and time on which the order was cancelled.

Note

Other dates associated with an order that are not specific to a transaction, such as dates associated with hotel stays and reservations, should be added to the Unified_Product_Catalog table.

order-id

String

Required

An order ID is the unique identifier for the order and links together all of the items that were part of the same transaction. When an item has been returned or when an order has been cancelled, the order ID is the unique identifier for the original order, including the returned or cancelled items.

Note

The order ID should never change, even when an item in the order is returned or cancelled.

Important

If order IDs are recycled and/or are otherwise not guaranteed to be unique over time, the unique identifier for the order must be updated to be a combination of the order ID and the date on which the order occurred. This must be done using domain SQL similar to: CONCAT(order_id, order_date).

This field is the primary key. Each unique order ID:

  • Must be associated with the pk semantic tag.

  • Must be the unique ID for the original order when items in the order have been returned or when the order is cancelled.

Note

For data that contains itemized transactions, where a single transaction includes more than one of the same item, the order ID will appear more than once.

payment-method

String

Optional

A payment method is how a customer chose to pay for the items they have purchased. For example: credit card, gift card, or cash.

product-id

String

Required

The unique identifier for a product.

A stock keeping unit (SKU) is an identifier that captures all of the unique details of any individual product, including specific attributes that differentiate by color, size, material, and so on.

For example, a shirt with the same color and material, but with three different sizes would be represented by three unique SKUs and would also be represented by three unique product IDs.

For data that contains itemized transactions, where a single transaction includes more than one of the same product, the product ID must appear only once per order ID in the Unified_Itemized_Transactions table. Multiple instances of the same product must be added to the item quantity in the same row.

Caution

Every customer has their own definition for SKUs and product IDs. Be sure to understand this definition before applying semantic tags to fields with product IDs to ensure they accurately reflect the customer’s definition.

purchase-brand

String

Optional

The brand for which a transaction was made.

Caution

This semantic tag should only be used when interaction records contain transaction data for more than one brand.

purchase-channel

String

Optional

A purchase channel is the channel from which a transaction was made. For example: in-store or online.

store-id

String

Required

A store ID is a unique identifier that is associated with the location of a store.

unit-cost

Decimal

Optional

Unit cost is the cost to produce a single unit of one item.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

unit-discount-amount

Decimal

Optional

Unit discount amount is the discount amount that is applied to a single unit of one item.

This discount is often applied to all units of the same item within a single transaction.

This value is used by Amperity for discount sensitivity analysis.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

unit-list-price

Decimal

Optional

The manufacturer’s suggested retail price (MSRP) for a single unit of an item.

The manufacturer’s suggested retail price (MSRP) is the price before shipping costs, taxes, and/or discounts have been applied. MSRP is sometimes referred to as the base price.

This value should equal the unit discount amount plus the unit subtotal.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

unit-profit

Decimal

Optional

Unit profit represents the amount of profit that is earned when a single unit of an item is sold.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

unit-revenue

Decimal

Optional

The total revenue for a single unit of an item. When an item has been returned or the order has been cancelled, the total revenue for a single unit of an item that was returned and/or cancelled.

Note

This value must be less than or equal to 0 when is-return or is-cancellation are true.

unit-subtotal

Decimal

Optional

A unit subtotal is the amount for a single unit of one item, before discounts have been applied.

This value is used by Amperity to calculate discounts for discount sensitivity analysis.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

unit-tax-amount

Decimal

Optional

A unit tax amount is the total amount of taxes that are associated with a single unit.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.
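Several descriptions in this table state how one amount should relate to another, and most carry a sign rule. The sketch below turns a few of those statements into checks that could be run against sample rows before tagging. Column names use underscores, as in the Unified_Itemized_Transactions table. The function names are hypothetical, and the relationship check is intended for purchase rows only. For item quantity and item revenue, the table states only the rule for returns and cancellations; the purchase side is assumed here by symmetry.

```python
from decimal import Decimal


def relationship_problems(row):
    """Check the "should equal" relationships from the table for a purchase row."""
    problems = []
    if row["item_discount_amount"] != row["item_quantity"] * row["unit_discount_amount"]:
        problems.append("item-discount-amount != item-quantity * unit-discount-amount")
    if row["item_revenue"] != row["item_list_price"] - row["item_discount_amount"]:
        problems.append("item-revenue != item-list-price - item-discount-amount")
    if row["item_subtotal"] != row["unit_list_price"] * row["item_quantity"]:
        problems.append("item-subtotal != unit-list-price * item-quantity")
    return problems


SIGNED_COLUMNS = (
    "item_quantity", "item_revenue", "item_discount_amount",
    "item_list_price", "item_subtotal",
)


def sign_problems(row):
    """Amounts are >= 0 for purchases and <= 0 for returns or cancellations."""
    reversal = bool(row.get("is_return") or row.get("is_cancellation"))
    problems = []
    for name in SIGNED_COLUMNS:
        value = row.get(name)
        if value is None:
            continue  # Missing optional amounts are not checked.
        if reversal and value > 0:
            problems.append(f"{name} must be <= 0 for returns or cancellations")
        elif not reversal and value < 0:
            problems.append(f"{name} must be >= 0 for purchases")
    return problems


# Made-up purchase: 2 units at a 25.00 list price with 5.00 off each unit.
purchase = {
    "is_return": False, "is_cancellation": False, "item_quantity": 2,
    "unit_list_price": Decimal("25.00"), "unit_discount_amount": Decimal("5.00"),
    "item_subtotal": Decimal("50.00"), "item_list_price": Decimal("50.00"),
    "item_discount_amount": Decimal("10.00"), "item_revenue": Decimal("40.00"),
}

# Made-up return with a positive revenue, which breaks the sign rule.
bad_return = {"is_return": True, "item_quantity": -2, "item_revenue": Decimal("40.00")}
```

The purchase row passes both checks. The return row is flagged because its revenue is positive even though is_return is TRUE.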

Keys

Keys are used to identify signals in source data that can be applied during the Stitch process. For example, a table that contains customer records automatically assigns the pk semantic to any field identified as a primary key. For tables that contain interaction records, a foreign key is often used to associate important fields for interaction records to primary keys for customer records. This allows interaction records to be correlated with the Amperity ID as an outcome of the Stitch process even though interaction records are (typically) not processed by Stitch for the purpose of identity resolution.

Blocking keys (bk)

A blocking key defines a specific combination of characters to be used as a blocking strategy. For example, the first three characters in given-name, the first character in surname, and birthdate represent a blocking key.

You can define custom blocking labels using bk-[label], and then use them as a blocking strategy for Stitch.
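To make the example concrete, the following Python sketch builds the blocking key described above from the first three characters of given-name, the first character of surname, and birthdate. The lowercasing, the concatenation format, and the function name are assumptions made for illustration; the key format Stitch actually produces is not described here.

```python
from typing import Optional


def blocking_key(given_name: Optional[str], surname: Optional[str],
                 birthdate: Optional[str]) -> Optional[str]:
    """First 3 chars of given-name + first char of surname + birthdate."""
    if not (given_name and surname and birthdate):
        return None  # Assumption: no key when any component is missing.
    return f"{given_name[:3]}{surname[:1]}{birthdate}".lower()
```

Records for "Michael Smith" and "Mick Smith" born on the same date would only share a key when their first three characters match, which is why blocking strategies trade recall for speed: "mic" + "s" + "1981-09-08" is the same for both.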

Caution

Use blocking keys carefully and be sure to verify that Stitch results contain the desired outcome.

Customer keys (ck)

The ck semantic tag may be applied to a column that contains pre-existing, tenant-specific customer IDs. When customer keys are applied, Amperity compares them to the Amperity ID as part of the deduplication process.

Tip

What happens to customer keys in the Unified_Coalesced table?

  • Records may have NULL customer keys.

  • There may be only one customer key per data source.

  • There may be multiple customer keys per Amperity ID. This is because customer keys may also be tagged as foreign keys.

Foreign keys (fk)

A foreign key is a column in a data table that acts as a primary key and can be used for deterministic matching of records. A record pair is assigned an exact match score (5.0) when foreign keys contain identical values during pairwise comparison.

The fk-[namespace] semantic tag identifies a field as a foreign key. A foreign key semantic tag must be namespaced. For example: fk-customer, fk-interaction, fk-audience, or fk-brand.

A foreign key semantic tag may be applied to any column in any data source, but should be associated with a field that can also act as a primary key for that data source and is present in other tables.

A foreign key may be used once within a table. A table may have more than one foreign key. For example, if a data source contains customer and audience identifiers, apply fk-customer to the customer identifier and fk-audience to the audience identifier.

Amperity is configured by default to prioritize foreign key matching over separation key unmatching.

The most common use cases for foreign keys associate fields that act like primary keys within interaction records to the primary keys within customer records, such as:

  • A customer identifier for transactions and itemized transactions associated to the primary key in a loyalty table.

  • A strong identifier within clickstream data to the primary key in a customer profile table.

Use foreign keys to define meaningful connections across all types of data sources to enable deterministic matching of record pairs during pairwise comparison.

Tip

What happens to foreign keys in the Unified_Coalesced table?

  • Records may have NULL foreign keys.

  • There may be multiple foreign keys in the data source, but there may not be duplicate foreign keys.

  • There may be multiple foreign keys per Amperity ID.

  • There should not be multiple Amperity IDs per foreign key.

Note

If foreign keys are linked together by a trivial duplicate, they will appear in the Unified_Preprocessed_Raw table as a comma-separated list.

Important

A foreign key may also be tagged as a separation key. A foreign key applies when two records have the same value for the key. A separation key applies when two records have different values for the key.

Tagging the same field as both foreign and separation keys can be useful when customer data has a strong identifier that is also associated with an important profile semantic tag, such as phone or email.

Tip

In the unusual case where a foreign key is associated with a field to which a profile (PII) semantic tag is applied, be sure to configure the column created by the foreign key in customer 360 database tables to hide values from users without permission to view PII.

Primary keys (pk)

A primary key is a column in a data table that uniquely identifies each row in a data source or data table.

The combination of data source and primary key allows Amperity to uniquely identify every row in every data table across the entirety of customer data input to Amperity.

Tip

What happens to primary keys in the Unified_Coalesced table?

  • Each record in the Unified_Coalesced table must have a primary key.

  • A primary key is unique within a data source, but that primary key may not be unique across all data sources.

  • There can be only one primary key per data source; each record in the Unified_Coalesced table can be uniquely identified by the pair of values defined in the “datasource” and “pk” columns.

  • Each record in the Unified_Coalesced table may only be associated with a single Amperity ID.
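The pairing of data source and primary key can be checked with a few lines of Python. This sketch looks for records that share the same ("datasource", "pk") pair; the function name and sample records are hypothetical.

```python
from collections import Counter


def duplicate_record_ids(records):
    """Return (datasource, pk) pairs that appear more than once."""
    counts = Counter((r["datasource"], r["pk"]) for r in records)
    return [pair for pair, n in counts.items() if n > 1]


# pk "1" is reused across data sources, which is fine;
# it is repeated within "crm", which is not.
records = [
    {"datasource": "crm", "pk": "1"},
    {"datasource": "pos", "pk": "1"},
    {"datasource": "crm", "pk": "1"},
]
duplicates = duplicate_record_ids(records)
```

Here only ("crm", "1") is reported, because the same primary key in a different data source still identifies a different record.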

Separation keys (sk)

A separation key (sk) is used for deterministic unmatching of records.

The sk-[semantic] semantic tag is a separation key that is applied to profile data.

Important

By default, Amperity derives separation keys for sk-given-name and sk-generational-suffix.

More than one separation key may be applied within a table. All separation key semantic tags must be namespaced to match the profile semantic for the same field. For example: sk-birthdate matches birthdate and sk-surname matches surname.

A separation key may not be applied more than once within the same table.
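The two rules above (each sk-[semantic] tag matches the profile semantic on the same field, and no separation key is applied twice in a table) can be checked before a feed is saved. The following is a minimal sketch under assumptions: the mapping of field names to tag lists and the function name check_separation_keys are hypothetical and are not part of the Feed Editor.

```python
def check_separation_keys(tags_by_field):
    """Check sk- tags for one table.

    tags_by_field maps a field name to the list of semantic tags
    applied to that field.
    """
    problems = []
    seen = set()
    for field, tags in tags_by_field.items():
        for tag in tags:
            if not tag.startswith("sk-"):
                continue
            profile_tag = tag[len("sk-"):]
            # The separation key must match the profile semantic
            # that is applied to the same field.
            if profile_tag not in tags:
                problems.append(f"{field}: {tag} has no matching {profile_tag} tag")
            # A separation key may not be applied twice within a table.
            if tag in seen:
                problems.append(f"{field}: {tag} is applied more than once")
            seen.add(tag)
    return problems
```

For example, a field tagged with both birthdate and sk-birthdate passes, while a field tagged with birthdate and sk-surname is reported.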

Warning

Use this semantic only when the classifier for Stitch model selection is set to :general-ordinal-sk-priority to help address overclustering problems with similar names and similar households.

A record pair is assigned a no conflict score (0.0) when separation keys contain conflicting values during pairwise comparison. A record pair is split into two clusters when both records in the pair contain a non-NULL value.

Note

The following separation keys do not consider approximately matched values to be conflicting values:

  • sk-given-name For example, Mike and Michael are not conflicting values.

  • sk-birthdate For example, 1981-09-08 and 1981-08-09 are not conflicting values.

  • sk-generational-suffix
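The conflict rule above can be sketched as a small predicate: two values conflict only when both are non-NULL, they differ, and they are not approximate matches. This is an illustration under assumptions, not the logic Stitch uses. The function names and the one-entry nickname table are hypothetical, and how Amperity decides that two values match approximately is not described here.

```python
def is_conflict(a, b, approx_match=None):
    """Return True when two separation key values conflict."""
    if a is None or b is None:
        return False  # a NULL value never conflicts
    if a == b:
        return False
    if approx_match is not None and approx_match(a, b):
        return False  # approximately matched values do not conflict
    return True


# Hypothetical lookup used only for this example.
NICKNAMES = {frozenset({"mike", "michael"})}


def nickname_match(a, b):
    """Treat case-insensitive equality and known nicknames as a match."""
    a, b = a.lower(), b.lower()
    return a == b or frozenset({a, b}) in NICKNAMES
```

With this sketch, Mike and Michael are not conflicting values, while Mike and Sarah are.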

Use separation keys to prevent Stitch from matching records during pairwise comparison.

Important

A separation key may also be tagged as a foreign key. Tagging the same field as a foreign and separation key can be useful when customer data has a strong identifier that is also associated with an important profile semantic tag, such as phone or email.

Amperity is configured by default to prioritize foreign key matching over separation key unmatching.

Loyalty

The following table lists the tags available to this semantic group (with required semantic tags noted by “ Required.”):

Product catalog

Product catalog semantics should be applied to data sources that contain product catalog data. Product semantics may be applied alongside other semantics, depending on the data source. Use the built-in list of semantics when building a feed. Product semantics are prefixed with pc/ in the semantics drop-down menu in the Feed Editor.

Note

What is the Unified_Product_Catalog table? The Unified_Product_Catalog table represents the taxonomy for your products and brands. You may apply product catalog semantic tags to any table that contains the taxonomy for your products and brands, and then use the product identifier as the basis for AmpIQ predictive modeling.

The following table lists the semantic tags for products:

Semantic Name

Datatype

Description

product-category

String

Optional

A category to which the product belongs. Use this semantic tag to identify how a customer categorizes individual products within their product catalog.

product-gender

String

Optional

Apply this as a custom semantic tag to fields that contain a list of gender options for products. For example: F, M, unisex, NULL (for unknown).

product-id

String

Optional

The unique identifier for a product.

Important

AmpIQ predictive modeling requires a product catalog to contain between 20 and 2000 unique product IDs. A product ID is often associated with a stock keeping unit (SKU).

A stock keeping unit (SKU) is an identifier that captures all of the unique details of any individual product, including specific attributes that differentiate by color, size, material, and so on.

For example, a shirt with the same color and material, but with three different sizes would be represented by three unique SKUs and would also be represented by three unique product IDs.

Each customer has their own definition for product IDs and SKUs. Be sure to understand this definition before applying semantic tags to fields with product IDs to ensure they accurately reflect the customer’s definition and meet the requirements for AmpIQ predictive modeling (if enabled).
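A quick count of unique product IDs can show whether a catalog falls inside the range required for AmpIQ predictive modeling. This is a minimal sketch: the function name is hypothetical, and treating both ends of the 20 to 2000 range as inclusive is an assumption.

```python
def product_id_count_ok(product_ids, low=20, high=2000):
    """Return True when the number of unique, non-NULL IDs is in range."""
    unique_ids = {p for p in product_ids if p is not None}
    return low <= len(unique_ids) <= high


# One shirt in three sizes is three SKUs and three product IDs,
# which on its own is too few for modeling.
shirt_ids = ["shirt-blue-S", "shirt-blue-M", "shirt-blue-L"]
```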

product-description

String

Optional

A description of the product.

product-subcategory

String

Optional

A subcategory or secondary variant to which a product belongs.

Profile (PII)

Personally identifiable information (PII) is any data that could potentially identify a specific individual. PII data includes details like names, addresses, email addresses, and other profile attributes, but can also include attributes like a loyalty number, customer relationship management (CRM) system identifiers, and foreign keys in customer data.

A PII semantic assigns consistency to customer data to ensure that PII data is more easily discovered across many sets of data.

Profile semantics should be applied to customer records that contain three (or more) good sources of PII data. Profile semantics should be applied to interaction records only when customer records are stored alongside transaction details and when there are three (or more) good sources of PII data.

The following table lists the tags available to this semantic group:

Semantic Name

Datatype

Description

address

String

The address that is associated with the location of an individual customer record. For example: 123 Main Street.

address2

String

Additional address information, such as an apartment number or a post office box, that is associated with the location of an individual customer record. For example: Apt #9.

birthdate

Date

The date of birth that is associated with an individual customer record.

Tip

A field that is tagged with the birthdate semantic tag will return an error when the feed is saved and the data type is not set to Date.

city

String

The city that is associated with the location of an individual customer record.

company

String

The company, typically an employer or small business, that is associated with an individual customer record.

country

String

The country that is associated with the location of an individual customer record.

Important

A field to which the country semantic tag is applied is added to the Unified_Coalesced table, but is otherwise ignored by Stitch.

create-dt

Apply the create-dt semantic tag to columns in customer records that identify when the data was created. The field to which this semantic is applied must be a datetime field type.

email

String

The email address that is associated with an individual customer record. A customer record may be associated with multiple email addresses.

full-name

String

A combination of given name (first name) and surname (last name) that is associated with an individual customer record and is stored as a combined value in a single field within customer data. A full name may include a middle name or initial.

gender

String

The gender that is associated with an individual customer record.

Supported values for fields associated with the gender semantic tag include:

  • F

  • FEMALE (is normalized to F)

  • M

  • MALE (is normalized to M)

  • MAN (is normalized to M)

  • NONE (is treated as NULL)

  • WOMAN (is normalized to F)
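The supported values above amount to a small lookup table. The following sketch mirrors that list. Matching case-insensitively, trimming whitespace, and returning None for values that are not in the list are assumptions made for illustration, and normalize_gender is a hypothetical name.

```python
GENDER_MAP = {
    "F": "F",
    "FEMALE": "F",
    "WOMAN": "F",
    "M": "M",
    "MALE": "M",
    "MAN": "M",
    "NONE": None,  # treated as NULL
}


def normalize_gender(value):
    """Map a raw gender value to F, M, or None (NULL)."""
    if value is None:
        return None
    # Assumption: unsupported values are also returned as None.
    return GENDER_MAP.get(value.strip().upper())
```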

generational-suffix

String

The suffix that identifies to which family generation a customer record belongs. For example: Jr., Sr., II, and III.

Caution

The generational-suffix semantic tag should only be applied once per feed and only to a field that contains the suffix separated from the first and last names.

given-name

String

The first name that is associated with an individual customer record.

Caution

The given-name semantic tag may only be applied once per feed.

phone

String

The phone number that is associated with an individual customer record. A customer record may be associated with multiple phone numbers.

Tip

A field that is tagged with the phone semantic tag will return an error when the feed is saved and the data type is not set to String.

postal

String

The zip code or postal code that is associated with the location of an individual customer record.

A full 9-digit zip code is derived from fields that contain zip code data.

Tip

A field that is tagged with the postal semantic tag will return an error when the feed is saved and the data type is not set to String.
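The data type requirements noted in the tips for birthdate, phone, and postal can be checked in one pass before saving a feed. This is a sketch under assumptions: the dictionary layout for a field (name, semantic, datatype) and the function name datatype_errors are hypothetical, and the error text is not what the Feed Editor displays.

```python
# Data types named in the tips for these semantic tags.
REQUIRED_TYPES = {
    "birthdate": "Date",
    "phone": "String",
    "postal": "String",
}


def datatype_errors(fields):
    """Return one message per field whose data type does not fit its tag."""
    errors = []
    for f in fields:
        expected = REQUIRED_TYPES.get(f["semantic"])
        if expected is not None and f["datatype"] != expected:
            errors.append(
                f"{f['name']}: {f['semantic']} requires {expected}, "
                f"found {f['datatype']}"
            )
    return errors
```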

state

String

The state or province that is associated with the location of an individual customer record.

surname

String

The last name that is associated with an individual customer record.

Caution

The surname semantic tag may only be applied once per feed.

title

String

The title that precedes a full name that is associated with an individual customer record. For example: Mr., Mrs., and Dr.

update-dt

Apply the update-dt semantic tag to columns in customer records that identify when the data was last updated. The field to which this semantic is applied must be a datetime field type. At least one customer record must have this semantic tag applied to ensure that the update_dt column is created in the Unified_Coalesced table and to ensure that the Merged_Customers table behaves correctly.

Stitch labels

Stitch labels identify when customer records were incorrectly merged together (overclustered) or incorrectly split apart (underclustered).

The following table describes recommended patterns to use when defining semantic tags for Stitch labels.

Stitch Labels Semantic Pattern

Description

sl/datasource

Apply this semantic tag to the datasource column.

sl/label-id

Apply this semantic tag to the label_id column.

sl/partition-id

Apply this semantic tag to the partition_id column.

sl/semantic

Apply this semantic tag to columns that are associated with the matching value, for example to a column that matches the email semantic tag.

sl/value

Apply this semantic tag to the value column.

Transactions

A transactions semantic is a way to identify brands, channels, stores, orders, products, quantities, per-item costs, total costs, and so on. Use transactions semantics when a data source contains one row per order.

Transactions semantics should be applied to data sources that contain interaction records for customer transactions. Transactions semantics may be applied alongside other semantics, depending on the data source. Use the built-in list of semantics when building a feed. Transactions semantics are prefixed with txn/ in the semantics drop-down menu in the Feed Editor.

Important

This collection of semantic tags is used by Amperity to build the Unified_Transactions table. Each semantic tag is directly associated with a column in that table. For example, values identified by the digital-channel, order-discount-amount, and payment-method semantic tags are added to the digital_channel, order_discount_amount, and payment_method columns, respectively.
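The naming pattern in the example above, where hyphens in a semantic tag become underscores in the column name, can be sketched in one line. The helper name is hypothetical, and applying the pattern to tags other than the three listed is an assumption.

```python
def tag_to_column(tag):
    """Derive a column name from a semantic tag by replacing hyphens."""
    return tag.replace("-", "_")


columns = [
    tag_to_column(t)
    for t in ("digital-channel", "order-discount-amount", "payment-method")
]
```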

The Unified_Transactions table contains one row for each unique transaction record, with each order ID associated to an Amperity ID.

Carefully review the data in the Unified_Transactions table, including column values that are calculated from values in other columns in this table or the Unified_Itemized_Transactions table, to verify their accuracy and to ensure that associated semantic tags have been applied correctly.

The following table lists the tags available to this semantic group (with required semantic tags noted by “ Required.”):

Semantic Name

Datatype

Description

currency

String

Optional

Currency represents the type of currency that was used to pay for an item. For example: dollar.

Note

Currency must be consistent across all orders from the same data source.

customer-id

String

Required

A custom semantic tag that is applied to interaction records to identify a field that is used in downstream processes to associate interaction records to the Amperity ID.

A customer ID may appear once for each order ID in the transactions table. A customer ID may be used along with a foreign key.

Important

See fk-[namespace]. At least one field must have the customer-id or fk-[namespace] semantic tags applied to it to support downstream processing requirements for interaction records. You may apply more than one, or use a combination, of these semantic tags.

When a customer key is added to transactions data it:

  • Must be a unique customer ID that can be used to join interaction records (transactions and itemized transactions) to tables that contain the Amperity ID.

  • Must be unique for each order ID in the Unified_Transactions table.
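The requirement that at least one field carries the customer-id or fk-[namespace] semantic tag can be expressed as a simple check over the tags applied to a feed. This is a minimal sketch with a hypothetical function name, and it only looks at tag names, not at the values in the tagged fields.

```python
def has_customer_link(tags):
    """Return True when customer-id or any fk- tag is present."""
    return any(t == "customer-id" or t.startswith("fk-") for t in tags)
```

For example, a feed tagged with order-id, order-datetime, and fk-customer satisfies the requirement, while a feed tagged with only order-id and order-datetime does not.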

digital-channel

String

Optional

The digital channel by which a transaction was made. For example: Facebook, Google Ads, email, etc.

Note

This semantic tag should only be used when purchase-channel specifies an online channel.

fk-[namespace]

String

Required

The fk-[namespace] semantic tag identifies a field as a foreign key. A foreign key semantic tag must be namespaced. For example: fk-customer, fk-interaction, fk-audience, or fk-brand.

A namespaced foreign key must be present in interaction records that contain transactions data. A foreign key may be used along with a customer ID.

Important

See customer-id. At least one field must have the fk-[namespace] or customer-id semantic tags applied to it to support downstream processing requirements for interaction records. You may apply more than one, or use a combination, of these semantic tags.

When a foreign key is added to transactions data it:

  • Must match a foreign key in a table that is output by Stitch.

  • Must be well-distributed across the data source (a high percentage of values must not be 0).

  • Must be unique for each order ID in the Unified_Transactions table.

order-cancelled-quantity

Integer

Required

The total number of items in the original transaction that were cancelled.

This value should match the sum of all items in itemized transactions that were cancelled for the same order ID.

Note

This value must be less than or equal to 0 when is_cancelled is TRUE.

order-cancelled-revenue

Decimal

Required

The total amount of revenue for all cancelled items in the transaction.

Note

This value must be less than or equal to 0.

order-cost

Decimal

Optional

Order cost represents the total cost of goods sold (COGS) for a single transaction.

Cost of goods sold (COGS) are the direct costs of producing goods that are sold by a brand, including the costs of materials and labor to produce the item, but excluding indirect expenses like distribution or sales.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

Warning

Only one of order-profit and order-cost may be present for a transaction.
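The note and warning for order cost can be combined into a single row-level check. The following is a sketch, not a built-in validation: the row is assumed to be a dictionary keyed by column names that follow the tag naming pattern (order_cost, order_profit), the flags is_return and is_cancelled are assumed to be booleans, and check_order_cost is a hypothetical name.

```python
def check_order_cost(row):
    """Return a list of problems found for one transaction row."""
    problems = []
    cost = row.get("order_cost")
    profit = row.get("order_profit")

    # Only one of order-profit and order-cost may be present.
    if cost is not None and profit is not None:
        problems.append("only one of order_profit and order_cost may be present")

    if cost is not None:
        reversal = bool(row.get("is_return") or row.get("is_cancelled"))
        if reversal and cost > 0:
            problems.append("order_cost must be <= 0 for returns or cancellations")
        if not reversal and cost < 0:
            problems.append("order_cost must be >= 0 for purchases")
    return problems
```

The same sign rule appears on several other order-level amounts, so the check could be generalized by passing the column name as a parameter.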

order-datetime

Datetime

Required

Order datetime is the date (and time) on which an order was placed.

The order date:

  • Must have a consistent time zone across all dates in the transactions data.

  • Should be a local time zone.

  • Should be a timestamp, which is converted to datetime automatically when a date is present in the timestamp.

Note

Other dates associated with an order that are not specific to a transaction, such as dates associated with hotel stays and reservations, should be added to the Unified_Product_Catalog table.

order-discount-amount

Decimal

Optional

Order discount amount is the total discount amount that is applied to the entire order.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

This value is used by Amperity for discount sensitivity analysis.

Caution

This value should match your definition of an order-level discount amount. For example, this value may be associated with order value and it may be associated with a subtotal. Use domain SQL to update this field for any desired calculation when this semantic tag cannot be applied to a single field.

order-discount-percent

Decimal

Optional

Order discount percent is the percentage discount that is applied to the order value for the entire transaction, in addition to any item or unit-specific discount percentages.

This value may be used as an input to order discount amount.

Note

This value must be between 0 and 1.

This value is used by Amperity for discount sensitivity analysis.

Caution

This value should match your definition of an order-level discount percentage. For example, this value may be associated with order value and it may be associated with a subtotal. Use domain SQL to update this field for any desired calculation when this semantic tag cannot be applied to a single field.

order-id

String

Required

An order ID is the unique identifier for the order and links together all of the items that were part of the same transaction. When an item has been returned or when an order has been cancelled, the order ID is the unique identifier for the original order, including the returned or cancelled items.

Note

The order ID should never change, even when an item in the order is returned or cancelled.

Important

If order IDs are recycled and/or are otherwise not guaranteed to be unique over time, the unique identifier for the order must be updated to be a combination of the order ID and the date on which the order occurred. This must be done using domain SQL similar to: CONCAT(order_id, order_date).
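The effect of combining the order ID with the order date can be illustrated outside of domain SQL. This sketch mirrors the CONCAT pattern above by joining the two values with no separator. The function name is hypothetical and the ISO date format is an assumption about how the date is rendered as text.

```python
from datetime import date


def unique_order_id(order_id, order_date):
    """Combine an order ID with its order date into a single identifier."""
    return f"{order_id}{order_date.isoformat()}"


# A recycled order ID produces two distinct identifiers when the
# orders occurred on different dates.
first = unique_order_id("A100", date(2021, 3, 14))
second = unique_order_id("A100", date(2023, 1, 5))
```

Two orders that reuse an ID on the same date would still collide, so this pattern only helps when IDs are not recycled within a single day.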

This field is the primary key. Each unique order ID:

  • Must be associated with the pk semantic tag.

  • Must appear only once in the Unified_Transactions table.

  • Must match at least one order ID from the Unified_Itemized_Transactions table. (The order ID in that table may be associated to multiple items within a single transaction.)

order-list-price

Decimal

Optional

The total value for a transaction, as defined by the manufacturer’s suggested retail price (MSRP) for all units of all items in the transaction.

The manufacturer’s suggested retail price (MSRP) is the price before shipping costs, taxes, and/or discounts have been applied. MSRP is sometimes referred to as the base price.

This value should match the sum of item list price amounts in itemized transactions that are associated with the same order ID.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

order-profit

Decimal

Optional

Order profit is the amount of profit that is earned from a single transaction.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

Warning

Only one of order-profit and order-cost may be present for a transaction.

order-quantity

Integer

Required

Order quantity is the total number of individual items associated with the transaction.

This value should match the sum of all items in itemized transactions that have not been returned or cancelled for the same order ID.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.
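The expectation that order quantity matches the itemized rows for the same order ID can be verified with a small reconciliation. This is a sketch under assumptions: orders and items are lists of dictionaries, and the itemized field names used here (item_quantity, is_return, is_cancellation) are hypothetical stand-ins for the corresponding columns in itemized transactions.

```python
def quantity_mismatches(orders, items):
    """Map order_id to (order_quantity, itemized_sum) where they differ."""
    totals = {}
    for item in items:
        # Returned and cancelled items are excluded from order quantity.
        if item["is_return"] or item["is_cancellation"]:
            continue
        order_id = item["order_id"]
        totals[order_id] = totals.get(order_id, 0) + item["item_quantity"]

    mismatches = {}
    for order in orders:
        itemized = totals.get(order["order_id"], 0)
        if order["order_quantity"] != itemized:
            mismatches[order["order_id"]] = (order["order_quantity"], itemized)
    return mismatches
```

An empty result means every order quantity agrees with its itemized rows.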

order-returned-quantity

Integer

Required

Order returned quantity is the total number of items in the original transaction that were returned.

This value should match the sum of all items in itemized transactions that were returned for the same order ID.

Note

This value must be less than or equal to 0 when is_return is TRUE.

order-returned-revenue

Decimal

Required

Order returned revenue is the total amount of revenue for all returned items in a transaction.

This value should match the sum of revenue for all items in itemized transactions that were returned for the same order ID.

Note

This value must be less than or equal to 0 when is_return is TRUE.

order-revenue

Decimal

Required

The total amount of revenue for all items in a transaction after discounts are applied, ignoring returns and/or cancellations.

This value should match the sum of revenue for all items in itemized transactions that were not returned or cancelled for the same order ID.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

order-shipping-amount

Decimal

Optional

The order shipping amount is the total cost of shipping all items in the same transaction.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

order-subtotal

Decimal

Optional

An order subtotal is the amount for an order, before discounts are applied.

This value should match the sum of item subtotal revenue in itemized transactions for the same order ID.

order-tax-amount

Decimal

Optional

An order tax amount is the total amount of taxes that are associated with an entire order.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

payment-method

String

Optional

A payment method is how a customer chose to pay for the items they have purchased. For example: credit card, gift card, or cash.

purchase-brand

String

Optional

The brand for which a transaction was made.

Caution

This semantic tag should only be used when interaction records contain transaction data for more than one brand.

purchase-channel

String

Optional

A purchase channel is the channel from which a transaction was made. For example: in-store or online.

store-id

String

Required

A store ID is a unique identifier that is associated with the location of a store.

sum-item-discount-amount

Decimal

Optional

The sum of discount amounts is the total of all discount amounts that were applied to each item within a transaction.

This value should match the sum of item discount amounts in the itemized transactions for the same order ID.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.

sum-item-revenue

Decimal

Optional

The sum of itemized revenue for the original order, not including returns and/or cancellations.

This value may be used as an input to order revenue. This value should match the sum of item revenue in itemized transactions, not including returns or cancellations, for the same order ID.

Note

This value must be greater than or equal to 0 for purchases, but less than or equal to 0 for returns or cancellations.