About databases¶
The Customer 360 page allows databases to be created from any combination of stitched output, passed-through domain tables, and custom domain tables. At least one database must be designated as your “Customer 360” database, but there is no limit to the number of databases you may configure for use with any downstream workflow. For example:
Use a passthrough database to separate raw source data from stitched customer data.
Use a QA database to build tables for use with validating Stitch quality and for ensuring that interactions records (for orders and items) are being measured correctly.
Use a custom database for experimentation.
Important
This topic assumes that you have completed the processes for adding customer profiles and adding transactions to Amperity and have created your customer 360 database.
About Spark SQL¶
Spark SQL is a high performance SQL query engine that is used by Amperity to ingest data, create domain tables, and extend the outcome of the Stitch process in your customer 360 database.
Use Spark SQL to build database tables.
About table types¶
Tables are organized into the following categories:
Standard tables¶
A standard database table forms the foundation of your databases. Standard tables are output by Stitch and provide unified data structures for all data sources to which semantic tags were applied.
Standard tables include:
Campaign Recipients
Customer 360
Customer Attributes
Email Engagement Attributes
Email Engagement Summary
Lifecycle Events
Merged Customers
Transaction Attributes
Transaction Attributes Extended
Unified Customer
Unified Coalesced
Unified Email Events
Unified Itemized Transactions
Unified Product Catalog
Unified Scores
Unified Transactions
along with tables that are associated with predictive modeling and AmpIQ, such as Predicted CLV Attributes.
Source and custom tables¶
Source and custom tables make raw source data and custom domain tables available to your customer 360 database as a series of passthrough tables. Source and custom tables will vary from tenant to tenant, but typically represent data that can provide useful context to your customer profile data and to support a variety of upstream and downstream workflows, such as:
Behavioral data
Clickstream data
Marketing campaign response data
Display advertising performance data
Passthrough tables that contain original customer data
And so on …
Any domain table or custom domain table can be configured as a passthrough table for any database.
QA tables¶
A QA database table is output by Stitch for use with the Stitch QA process.
QA tables include:
Detailed Examples
Stitch Blocking Keys
Stitch Scores
Unified Changes Clusters
Unified Changes PKS
Unified Coalesced
Unified Preprocessed Raw
Unified Scores
Define customer profile¶
Customer profile details are pulled from the Customer 360 table, which represents all of your unified customer profiles, including:
Names (first names, last names), email addresses, physical addresses, phone numbers
Transaction details (first purchases, last purchases, total purchases, etc)
Other custom profile values that are unique to your company
You can choose to summarize these customer profile details directly on the Customer 360 page.
Customer profile details are displayed in the following order:
Fields with an associated icon
Fields marked as a favorite
Fields without an associated icon or not marked as a favorite
Note
Profile details that appear in the Customer Profile section are selected during the Configure and Save step when configuring the Customer 360 table.
Manage databases¶
The Customer 360 page may define any number of individual databases, in addition to your customer 360 database. For example, it is common to add a dedicated database from which you can manage the Stitch QA process.
The All Databases section of the Customer 360 page lists all of the databases that are available, including both active and inactive databases.
Note
Only databases in an Active state may be run.
Add empty database¶
An empty database contains no tables. Use any combination of passthrough, SQL, or SQL template tables to build a custom database.
To add an empty database
From the Customer 360 page click Create Database.
Enter the name of the database.
Click Create.
Add database from template¶
Use the Customer 360, Passthrough, or Stitch QA database templates to add a database that contains the set of tables for use with that type of database.
To add a database from a template
From the Customer 360 page click Create Database.
Enter the name of the database.
From the Template drop-down, select “Customer 360”, “Passthrough”, or “Stitch QA”.
Click Create.
Customer 360 template¶
The “Customer 360” template adds tables based on the types of records that are available to Stitch. It is recommended to use this template after at least one data source for customer records and interactions records have been configured in the Sources page.
The following tables are added by the “Customer 360” template when profile semantic tags are applied to customer records:
Customer 360
Merged Customers
Unified Coalesced
Unified Customer
Unified Scores
The following tables are added by the “Customer 360” template when transaction and itemized transaction semantic tags are applied to interactions records:
Customer Attributes
Transaction Attributes
Transaction Attributes Extended
Unified Itemized Transactions
Unified Transactions
These tables link brand interactions to customer profiles and are a specific requirement of AmpIQ.
If the “Customer 360” template is used to create the customer 360 database before interactions records are also available, you must add those tables to the customer 360 database manually.
Passthrough template¶
A database that is created using the passthrough template will add a table for each domain table or custom domain table that can be passed through to the database.
Stitch QA template¶
The following tables are added by the “Customer 360” template when profile semantic tags are applied to customer records:
Detailed Examples
Unified Changes Clusters
Unified Changes PKS
Unified Coalesced
Unified Preprocessed Raw
Unified Scores
Use this database to validate the quality of Stitch output.
Use multiple databases¶
You may configure more than one database to be a customer 360 database. Each customer 360 database will contain a set of standard tables, and then may have an optional set of source and custom tables added as passthrough tables.
Apply flexible merge rules¶
Amperity allows multiple databases to exist within the same tenant. Each database may define its own unique set of rules for merging customer profile data. These merge rules are configured using Spark SQL and each field within the merge rules can be customized.
For example, a tenant may have data sources from call centers, online transactions, and email platforms that may contain slightly different sets of customer profile data:
After loading this data to Amperity and assigning the Amperity ID to each of your customers, you can use flexible merge rules to support multiple customer 360 databases.
Your operations teams can combine prioritizing the most common values for each customer with deterministic matching
Your email marketing team can combine prioritizing customer profile values from your email platform with probabilistic matching
Your paid media team can combine all possible values to improve match rates on platforms like Google Ads and Facebook
Tip
Ask your Amperity implementation team for recommendations and best practices for how you can configure flexible merge rules to support all of your use cases.
Manage tables¶
Use any of the following methods to add tables to databases:
Add as passthrough¶
A passthrough table adds a domain table to your database. Some domain tables are replaced by stitched domain tables, which include the Amperity ID, but otherwise are identical to the source domain table.
If a table already contains an Amperity ID, you may use a passthrough table.
To add a table as a passthrough table
From the Customer 360 page, under All Databases, open the menu for a database, and then click Edit.
Click Add Table.
Set Build Mode to “Passthrough”, and then select a table from the drop-down menu.
Note
The name of the table is assigned automatically and defaults to the name of the table as shown in the drop-down menu. You may change the name of the table after selecting a table.
Update the name of the passthrough table, if necessary.
Verify semantic tags and ensure that fields that contain PII are marked correctly.
Make the table available to the Segment editor in AmpIQ by selecting the Make available in Visual Segment Editor option.
Note
Only tables that contain an Amperity ID may be made available to the Segment editor in AmpIQ and/or used with campaigns.
Click Activate to update the database with your changes.
Add as SQL¶
A SQL table is added to a database using Spark SQL and a SELECT
statement that defines the table’s schema.
Note
If a table was not stitched and/or does not have an Amperity ID, you must use SQL to associate the unique ID in that table to the Amperity ID. For example, CCUST associates the unique ID in a table to the Amperity ID:
SELECT
CCUST.amperity_id
,CTX.customer_id
,CTX.orderid
,CTX.purchasedate
,CTX.transactiontotal
,CTX.pointsearned
,CTX.numberofitems
,CTX.avgitemprice
,CTX.productcode
,CTX.productcategory
,CTX.storeorwebid
FROM POS_transactions AS CTX
LEFT JOIN POS_customers AS CCUST ON (CCUST.customer_id = CTX.customer_id)
To add a table as a SQL table
From the Customer 360 page, under All Databases, open the menu for a database, and then click Edit.
Click Add Table.
Set Build Mode to “SQL”, and then define a SQL query using Spark SQL.
Click Validate to verify that the SQL query runs correctly.
Verify semantic tags and ensure that fields that contain PII are marked correctly.
Make the table available to the Segment editor in AmpIQ by selecting the Make available in Visual Segment Editor option.
Note
Only tables that contain an Amperity ID may be made available to the Segment editor in AmpIQ and/or used with campaigns.
Click Activate to update the database with your changes.
Add from table template¶
Use table templates to add tables to your database that use Amperity standard tables as their starting points. Each template provides complete SQL that matches the default use case for the table. Some templates require additional configuration within the SQL, including a series of steps that are required for using table templates with multi-brand databases.
To add a table using a SQL template
From the Customer 360 page, under All Databases, open the menu for a database, and then click Edit.
Click Add Table.
Set Build Mode to “SQL”, open the Apply template drop-down menu, and then select a table template.
Update the name of the table template, if necessary.
Update the SQL in the table template to support your use case.
Important
Refer to the individual topic for each standard table template for more information about how to configure any required and/or recommended steps.
Verify semantic tags and ensure that fields that contain PII are marked correctly.
Make the table available to the Segment editor in AmpIQ by selecting the Make available in Visual Segment Editor option.
Note
Only tables that contain an Amperity ID may be made available to the Segment editor in AmpIQ and/or used with campaigns.
Click Activate to update the database with your changes.
Standard table templates¶
Table templates are available for all standard tables:
Other templates¶
Table templates are available for other tables:
Customer360 that joins Transaction Attributes
Merged Households; this table requires access to address standardization lookup tables.
Validation alerts¶
The SQL editor shows a validation alert when syntax is detected in your query that has the potential to cause performance issues with database generation, the length of time it may take to complete running a query, or situations where you can improve the quality of your SQL syntax.
What should I do if my query has a validation alert?
A validation alert does not mean your SQL is invalid. If your query has a validation alert, review the alert, review your SQL, and consider alternates that can help you avoid potential performance issues or avoid the situation the alert describes. There are situations where the SQL you need is the SQL that is causing the validation warning. You may activate a query even when it contains a validation alert.
Examples of validation alerts include:
Implicit CROSS JOIN¶
An implicit CROSS JOIN occurs when a query unintentionally returns a Cartesian product. A Cartesian product combines every item in the first table with every item in the second. For example, if table A has three items and table B has three items, the Cartesian product is 9 pairs.
A Cartesian product in Amperity between any two tables is often a very, very large number of pairs and, as a result, is an expensive operation. It is recommended to optimize your Spark SQL queries to avoid an implicit CROSS JOIN.
The database editor will show a validation alert for an implicit CROSS JOIN in situations like:
A join statement does not use an ON() or USING() clause to specify one (or more) column names as the join criteria.
A SELECT statement returns too many columns from two tables. For example, using
SELECT * FROM Table_A, Table_B
can behave like a CROSS JOIN. In this situation you should first filter out NULL values to avoid performance issues, or refactor your query to be more specific about which columns from which tables are to be joined.
OVER without PARTITION BY¶
A window function with an OVER statement that does not include a PARTITION BY clause often leads to performance issues when the OVER statement is asked to run across a large number of rows.
If you see this alert, add the PARTITION BY clause to your window function.
Unescaped backslashes in regular expressions¶
Unescaped backslashes can cause errors with functions that use regular expressions, such as RLIKE(), that are difficult to diagnose.
Caution
A backslash ( \
) contained within a regular expression MUST be escaped using another backslash. For example, a regular expression that matches a single numeric digit should be \\d
and not \d
.
Unintentional broadcast joins¶
A broadcast join sends the smaller table in a join operation to all Spark executors, and then evaluates the larger table across each executor’s partitions. When a broadcast join is small, it’s fast. Above a certain threshold they can performance issues.
A broadcast join often looks like:
WHERE column_name NOT IN values
When values is not a small list of constants a broadcast join may occur.
Note
In some situations you may want to use a LEFT ANTI JOIN, which returns values from the left-side table when they do not have matches on the right-side table. Use a LEFT ANTI JOIN to avoid performance issues that may be caused by a broadcast join.
Extend databases¶
Extend databases to support any use case or requirement.
The following topics represent starting points for some of the ways you can extend your customer 360 database:
This list represents a small percentage of the use cases you can enable with Amperity. Talk with your Amperity representative about how you can best enable these (and other) use cases for your tenant.
Export databases and tables¶
A database may be configured to export one (or more) tables (or even the entire database) from Amperity. Each database export must be assigned a unique name, and then is configured to select one (or more) tables to be included in the export. A database export must be associated with a configured destination and must be added to an orchestration.
To add a database export
From the Customer 360 page click Configure Exports. This opens the Database Exports page.
Click Create New Export. This opens the Add Export dialog box.
Add the name of the database export, and then click Add. This will add a draft database export to the Database Exports page.
Open the menu for the draft database export, and then select Edit.
From the Database drop-down, select a database.
From the Entities list, select one (or more) database tables to add to the database export.
For a single table, click the table. For multiple tables, use click + command for each table to be selected. For all tables click the first table, hold shift, and then click the last table.
Click Activate.
Assign the database export to an orchestration.