Enable predictive models

Predictive models are a feature of Amperity that predict customer behavior, such as customer lifetime value (pCLV), churn propensity, product affinity, and lifecycle events.

Each predictive model outputs a collection of predicted attributes. A predicted attribute is represented by a column in a table that can be used during segmentation. Use predicted attributes to define segments that filter customers based on predicted behavior.

Caution

Predictive models are an optional component of Amperity. If you plan to enable predictive models it is recommended to review the data requirements, and then take steps to configure columns and tables for predictive models at the same time.

Predictive models workflow

Predictive models are built outside of Amperity and require data from certain tables in the customer 360 database to be made available to it on a daily basis, after which the results are then pulled to Amperity and the related data science tables are updated.

The steps in the predictive models workflow include:

  1. The Merged Customers, Unified Transactions, and Unified Itemized Transactions tables are exported on a daily basis from Amperity to an Amazon S3 bucket that is used for predictive models.

  2. Predictive models are run against the Merged Customers, Unified Transactions, and Unified Itemized Transactions tables and the results are returned to the Amazon S3 bucket.

  3. A set of couriers are configured to pull these results from the Amazon S3 bucket to your tenant on a daily basis.

  4. Predictive models tables are loaded to Amperity using feeds, after which the associated predictive models domain tables are refreshed.

  5. After the Stitch process is complete, refreshed predictive models tables are passed through to your customer 360 database.

  6. Queries and segments that rely on predictive models table attributes will use the refreshed data the next time they are run and any new queries and segments will have access to the refreshed data.

  7. A campaign will use refreshed data the next time it is scheduled to run.

Predictive models components

What components of predictive models are visible within Amperity?

Tab

Description

Sources

Feeds that define the data model for predictive modeling tables within Amperity:

  • PCLV

  • PCLV - Changes

  • Predicted Affinity ProductAttribute

  • Predicted Affinity Audience ProductAttribute

Domain tables that are refreshed on a daily basis:

  • Predicted Pclv

  • Predicted PclvChanges

  • Predicted Affinity ProductAttribute

  • Predicted Affinity Audience ProductAttribute.

Couriers to pull the results of predictive modeling to Amperity on a daily basis:

  • PCLV (including PCLV Changes and PCLV Metadata)

  • PCLV - Validation

  • Predicted Affinity

  • Predicted Affinity Audience

Customer 360

A series of passthrough database tables:

  • Predicted CLV Attributes

  • Predicted Affinity ProductAttribute

Queries

and

Segments

Access to predictive models attributes from the following tables: Predictive pCLV Attributes and Affinity (for recommended audiences). Updated attributes are applied automatically the next time you run the segment or query.

Destinations

A series of orchestrations that use database exports to send the following tables to predictive models on a daily basis:

  • Merged Customers

  • Unified Itemized Transactions

  • Unified Transactions

Verify prerequisites

Verify that predictive models have met all prerequisites.

Data requirements

Predictive models have the same requirements as segments and campaigns, along with the following additional requirements:

  1. Data sources must provide at least four years of historical data to support predictive models; five (or more) years is recommended. Providing as much historical data as possible improves model performance and prediction quality.

  2. A defined product catalog with 20-2000 distinct values is required to support recommended audience sizes for product affinity.

    Important

    SKU-level recommendations are not supported.

  3. If required fields are not available in customer data, maintain the column on the table with NULL values.

The following sections describe requirements for columns and tables, and fields.

Columns and tables

Predictive models use specific columns across the following tables:

  • Merged Customers

  • Transaction Attributes

  • Unified Customer

  • Unified Itemized Transactions

  • Unified Transactions

The Unified Customer table is built as part of Stitch output when data sources with customer records have profile (PII) semantic tags applied.

The Transaction Attributes, Unified Itemized Transactions, and Unified Transactions are built as part of Stitch output when data sources with interactions records have transactions and itemized transactions semantic tags applied.

The Merged Customers table is a version of the Unified Customer table that you must build in the customer 360 database to filter profile data. The Merged Customers table must be updated to roll-up transactions and itemized transactions.

These tables contain fields that are important components of predictive models.

Fields

The following columns are required by predictive models and must exist in the following tables as an output of the Stitch process or be part of the configuration of the customer 360 database:

Column Name

Required for …

Description

amperity_id

Predictive models

The unique identifier that is assigned to clusters of customer records that all represent the same individual. The Amperity ID does not replace primary and foreign keys, but exists alongside them within unified profiles.

This column must be available from the following tables in the customer 360 database:

  • Merged Customers

  • Unified Customer

  • Unified Itemized Transactions

  • Unified Transactions

  • Transaction Attributes

birthdate

Predictive models

The date of birth that is associated with a customer.

This column must be available from the following tables in the customer 360 database:

  • Merged Customers

  • Unified Customer

city

Predictive models

The city that is associated with the location of a customer.

This column must be available from the following tables in the customer 360 database:

  • Merged Customers

  • Unified Customer

country

Predictive models

The country that is associated with the location of a customer.

This column must be available from the following tables in the customer 360 database:

  • Merged Customers

  • Unified Customer

digital_channel

Predictive models

The digital channel through which a transaction was made. For example: Facebook, Google Ads, email, etc.

This column must be available from the following tables in the customer 360 database:

  • Unified Itemized Transactions

  • Unified Transactions

email

Predictive models

The email address that is associated with a customer. A customer may have more than one email address.

This column must be available from the following table in the customer 360 database:

  • Merged Customers

gender

Predictive models

The gender that is associated with a customer.

This column must be available from the following tables in the customer 360 database:

  • Merged Customers

  • Unified Customer

given_name

Predictive models

The first name that is associated with a customer.

This column must be available from the following tables in the customer 360 database:

  • Merged Customers

  • Unified Customer

order_id

Predictive models

An order ID is the unique identifier for the order and links together all of the items that were part of the same transaction. When an item has been returned or when an order has been canceled, the order ID is the unique identifier for the original order, including the returned or canceled items.

This column must be available from the following tables in the customer 360 database:

  • Unified Itemized Transactions

  • Unified Transactions

order_datetime

Predictive models

Order datetime is the date (and time) on which an order was placed.

This column must be available from the following tables in the customer 360 database:

  • Unified Itemized Transactions

  • Unified Transactions

order_revenue

Predictive models

The total amount of revenue for all items in a transaction after discounts are applied, ignoring returns and/or cancellations.

This column must be available from the following table in the customer 360 database:

  • Unified Transactions

postal

Predictive models

The zip code or postal code that is associated with the location of a customer.

This column must be available from the following tables in the customer 360 database:

  • Merged Customers

  • Unified Customer

purchase_brand

Predictive models

The brand associated with a customer interaction.

This column must be available from the following table in the customer 360 database:

  • Unified Itemized Transactions

  • Unified Transactions

purchase_channel

Predictive models

A channel is the purchase method used by a customer. For example: online, point-of-sale, loyalty, in-store, and so on.

This column must be available from the following table in the customer 360 database:

  • Unified Itemized Transactions

  • Unified Transactions

state

Predictive models

The state or province that is associated with the location of a customer.

This column must be available from the following tables in the customer 360 database:

  • Merged Customers

  • Unified Customer

surname

Predictive models

The last name that is associated with a customer.

This column must be available from the following table in the customer 360 database:

  • Merged Customers

Request to enable

After you have configured Amperity for the recommended patterns and workflows for data sources that contain interactions records, segments and campaigns are enabled, and all verification steps have been completed, contact your Amperity representative and request to enable predictive models.

Important

By default, predictive models are fit across all brands for a tenant, even if a tenant has multiple brands. To fit predictions for each brand:

  1. A customer must purchase predictive models for each individual brand.

  2. The tenant must be configured to support brand-specific databases, with a database for each brand with dedicated Merged Customers, Unified Itemized Transactions, and Unified Transactions tables.

  3. A taxonomy that defines product categories and brands.

    This taxonomy must be defined in the Customer 360 database by extending the Unified Itemized Transactions table to include all of the columns that define the taxonomy.

The typical process for setting up predictive models follows a series of general steps:

  1. (~1 week) Ensure that data is made available to Amperity that meets the semantic tagging requirements (along with any predicted-models-specific fields), including for transactions, itemized transactions, and product catalog. Use domain SQL to perform pre-Stitch data transforms as necessary.

    Verify that data is being processed correctly to the following tables in the customer 360 database: Unified Transactions, Unified Itemized Transactions, Transaction Attributes, and Transaction Attributes Extended.

  2. (~1 week) Amperity begins the process of training the models, validating backtesting, and optimizing predictions.

  3. (~1-2 weeks) Amperity performs stability testing on the models and the predictive results. This step may require some tuning of the model and will take longer for very large data sets.

Once this process is complete, the predicted attributes are available on a 1-day delayed rolling basis. These tables are added to your tenant as a feed in the Sources page and are available for use with segments and queries.

Configure predictive tables

A series of predictive tables is added to your tenant under a feed in the Sources page named Predicted. This is done as part of the enablement process for predictive models. These tables are refreshed automatically on a daily basis, are not processed by Stitch, and are made available to your customer 360 database, the Queries page, and the Segments page as passthrough tables:

  • Predicted Affinity ProductAttribute

  • Predicted Pclv

  • Predicted PclvChanges

  • Predicted Affinity Audience ProductAttribute

where ProductCategory is a string that best represents the unique products and brands within your tenant.

  1. The Predicted Pclv series of tables provides data to the Predicted CLV Attrbutes table, which represents the current view of your customer and transaction data.

  2. The Predicted Affinity ProductAttribute and Predicted Affinity Audience ProductAttribute tables are used for product affinity and are associated with the Affinity table.

Make the following updates in your customer 360 database:

Add Affinity table

An Affinity table associates individual customers to the products they are most likely to purchase. Use an Affinity table to help deliver personalized experiences to your customers.

To add a Predicted Affinity table you must extend the customer 360 database to add a table that joins the Predicted Affinity ProductAttribute passthrough table to the Predicted Affinity Audience ProductAttribute passthrough table.

To add the Affinity table

  1. From the Customer 360 page, under All Databases, select the menu for the customer 360 database, and then click Edit.

  2. From the Database Editor, click Add Table.

  3. Name the table “Predicted Affinity” (or some other name that clearly identifies this table as the product affinity table for your tenant).

  4. Set the build mode to SQL.

  5. Add the following SQL:

    SELECT
      r.product_attribute AS `product_sub_category`
      ,r.amperity_id
      ,r.score
      ,r.ranking
      ,r.ranking <= s.audience_size_small AS audience_size_small
      ,r.ranking <= s.audience_size_medium AS audience_size_medium
      ,r.ranking <= s.audience_size_large AS audience_size_large
    FROM Predicted_Affinity_ProductAttribute AS r
    LEFT JOIN Predicted_Affinity_Audience_ProductAttribute AS s
    ON r.product_attribute = s.product_attribute
    
  6. Click Validate to verify the SQL runs without error.

  7. Click Next. This opens the Database Table Definition page.

  8. Add a table description. This enables a tooltip that is visible from other areas in Amperity.

  9. Verify that the db/required and db/unique database field semantics were applied to the amperity_id column.

  10. Under Version History, select Enable table version history.

  11. Click Save.

Add pCLV attributes table

The Predicted CLV Attributes table contains information, for each individual Amperity ID, about predicted future spend and the probability of churn.

Add the Predicted CLV Attributes table to the customer 360 database.

To add the Predicted CLV Attributes table

  1. From the Customer 360 page, under All Databases, select the menu for a database, and then click Edit.

  2. Click Add Table.

  3. Set Build Mode to “Passthrough”, and then select the Predicted CLV Attributes table from the drop-down menu.

  4. Hide the table from the Visual Segment Editor by verifying that Show in VSE? is unselected.

  5. Click Activate to update the customer 360 database with your changes.

Define product catalog

A product catalog must be defined, and then joined to the Unified Itemized Transactions table.

  1. A data source must be available with product catalog semantic tags applied.

    Note

    Amperity documentation refers to this data source as the Unified Product Catalog table, but it may have a different name in your tenant.

  2. Apply the product-id semantic tag to product catalog data as required by predictive modeling; the product-category, product-description, and product-subcategory semantic tags are optional.

    Important

    Product affinity modeling requires the field that defines product categories – product-category in the Unified Product Catalog table – to contain between 20 and 2000 unique values.

  3. A list of custom attributes may be defined. These must be available as columns in the Customer 360 database and must be added to the Unified Itemized Transactions table.

  4. Tables that define product catalog semantic tags and custom attributes are managed as a data source (and associated feed) and should be maintained as part of your organization’s regular Amperity workflow.

Extend the Unified Itemized Transactions to include the taxonomy that defines your products and brands:

  1. Add the column to which the product-id semantic tag was applied

  2. Add other columns to which product catalog semantic tags were applied

  3. Add any columns that define custom attributes, as necessary

The following example extends the Unified Itemized Transactions table using a table named Unified Product Catalog that has columns for each of the optional product catalog semantic tags:

SELECT
  uit.*
  ,upc.product_category AS `product_category`
  ,upc.product_description AS `product_description`
  ,upc.product_subcategory AS `product_subcategory`
FROM Unified_Itemized_Transactions uit
LEFT JOIN Unified_Product_Catalog upc ON uit.product_id = upc.product_id

Some brands require a single parameter to define a taxonomy and some require more than one. The following example shows a taxonomy that defines custom attributes for a class and subclass:

SELECT
  uit.*,
  ,product_class AS `product_class`
  ,product_subclass AS `product_subclass`
FROM Unified_Itemized_Transactions uit
LEFT JOIN Product_Metadata pm ON uit.product_id = pm.product_id
WHERE uit.is_return IS NULL AND uit.is_cancellation IS NULL
AND uit.item_revenue >= 0

Important

Every brand is unique. Discuss with your Amperity representative how to extend the Unified Itemized Transactions table to support the taxonomy that is required for your products and brands.

Add pCLV attributes to Customer360 table

The Customer 360 table is the unified view of the customer across all points of engagement, including attributes that cross systems. This table does not exist by default and must be created within the customer 360 database. Each row represent a complete record for a unique individual, including their Amperity ID, merged PII data, and summary attributes.

You must edit the SELECT statement in the Customer 360 table to select attributes from the Predicted CLV Attributes table.

To add predictive models table attributes to the Customer 360 table

  1. From the Customer 360 page, under All Databases, open the menu for the customer 360 database, and then select Edit. The Database Editor page opens.

  2. Select the Customer 360 table.

  3. Edit the Customer 360 table to update the SELECT statement for attributes from the Predicted CLV Attributes table, similar to:

    -- Predicted Attributes from Predicted_CLV_Attributes pa
    ,pa.predicted_probability_of_transaction_next_365d AS `predicted_probability_of_transaction_next_365d`
    ,pa.predicted_order_frequency_next_365d AS `predicted_order_frequency_next_365d`
    ,pa.predicted_average_order_revenue_next_365d AS `predicted_average_order_revenue_next_365d`
    ,pa.predicted_clv_next_365d AS `predicted_clv_next_365d`
    ,pa.historical_order_frequency_lifetime AS `historical_order_frequency_lifetime`
    ,pa.days_since_last_order AS `days_since_last_order`
    ,pa.predicted_customer_lifecycle_status AS `predicted_customer_lifecycle_status`
    ,pa.predicted_customer_lifetime_value_tier AS `predicted_customer_lifetime_value_tier`
    
  4. Add a LEFT JOIN for the Predicted CLV Attributes table below the FROM clause, similar to:

    LEFT JOIN Predicted_CLV_Attributes pa ON pa.amperity_id = mc.amperity_id
    
  5. Click Next.

  6. Click the icon or empty space in the Icon column to open the Select an Icon dialog box, after which you can choose an icon or search for additional icons on the Font Awesome website.

    You may use any icon in the Font Awesome library.

    Icon identifiers start with fa-, and then a unique string. For example: fa-home (for   ), fa-star (for   ), and fa-project-diagram for (  ).

    Icons are available in two styles: regular (far) and solid (fas).

    Use a style prefix in front of an icon identifier to apply that style. For example: far fa-home for a regular icon or fas fa-home for a solid icon.

  7. Click Save.

  8. Click Activate.

  9. Run the customer 360 database.