Build queries¶
A query is SQL that is run from the Queries page against database tables in the Customer 360 page. A query returns a refined and filtered subset of useful customer data.
Queries page¶
The Queries page provides an overview of the status of every query. A table shows the status and details. Queries are listed by row. The details include the date and time at which this query last ran, along with the number of records (rows) that were returned during the last completed run.
Query editor¶
The SQL Query Editor is the user interface for a full SQL query engine based on Presto SQL that interacts with customer database tables in Amperity. The SQL Query Editor relies primarily on using the SELECT statement, along with common table expressions, joins, functions, and other components of Presto SQL to build and design advanced queries.
Use the Query Editor to build SQL queries against tables and columns in your customer 360 database to support any downstream workflow. The Query Editor uses Presto SQL.
Queries may be authored using the visual Query Editor or the SQL Query Editor. Click Create, and then select the query editor to open. Queries that are already created have an icon that shows from which query editor they were authored.
indicates the query was created using the visual Query Editor.
indicates the query was created using the SQL Query Editor.
All queries must be activated before they can run as part of a scheduled workflow.
Note
Amperity is a multi-user system and the set of queries for your company is shared across all users. That means that if one user creates a draft query, another can open and edit it, so work can be easily passed between people on your team.
However, it also means that if 2 users are editing the same thing at the same time, their changes will collide. Amperity resolves this by applying the last set of changes saved as a whole. This will always keep the query in a consistent state (it will never be half-implemented). But changes that were saved first will be overwritten. As a result, we strongly recommend that you coordinate changes to specific objects in Amperity with others on your team.
About Presto SQL¶
Presto is a distributed SQL query engine that is designed to efficiently query vast amounts of data using distributed queries. Presto is used within the Queries and Segments pages in the Amperity user interface.
Amperity queries are built using Presto SQL to define a SELECT statement. Please refer to the Presto SQL reference.
All queries that are built via the SQL Query Editor are done using a SELECT statement. In some cases, a WITH is used along with the SELECT statement. Each select statement can additional functionality, such as WHERE, LEFT JOIN, GROUP BY, ORDER BY, LIMIT clauses, CASE expressions, functions, operators, and other components that are part of Presto SQL, which is the underlying SQL engine for both the visual Query Editor and SQL Query Editor.
Tip
Follow the recommendations and patterns for indentation, naming conventions, reserved words, and whitespace.
Build queries¶
Queries may be authored using the visual Query Editor or the SQL Query Editor. Click Create and then select the query editor to open. Queries that are already created have an icon that shows from which query editor they were authored. All queries must be activated before they can run as part of a scheduled workflow.
Add query¶
The SQL Query Editor is the user interface for a full SQL query engine based on Presto SQL that interacts with customer database tables in Amperity. The SQL Query Editor relies primarily on using the SELECT statement, along with common table expressions, joins, functions, and other components of Presto SQL to build and design advanced queries.
To add a query using the SQL Query Editor
From the Queries page click Create, and then select SQL Query. This opens the SQL Query Editor.
Under Database, select a database. The Customer 360 database is selected by default.
Define the query against the selected database.
Click Run Query and debug any issues that may arise.
Click Activate.
Preview results¶
You can preview the results of a SQL query by clicking the Run Query button. This will do one of the following things:
Return the first 100 results of the query to the preview pane directly below the query editor.
Return an empty table.
Return some type of error.
Use the preview results pane to fine-tune your queries, to make sure they return the data you want, and to make sure they don’t contain any errors.
Tip
Query editor SQL queries often evaluate millions of records. This means they may take a few minutes to run. You may use other areas of Amperity while a query is being run.
For draft queries, setting a the LIMIT clause to “100” while developing a query is often enough for testing and validating query results against very large data sets.
To preview query results
From the Queries page, open the menu for a query, and then select View. This opens a query editor.
Click Run Query to run the query. Wait for it to return results.
Example the columns and the data that is in them.
Adjust your query as necessary until it runs correctly.
Click Activate.
Validate query¶
SQL queries are validated from the SQL Query Editor by clicking the Run Query button. The results of the query are returned in the results window. The quality of the results can be inspected, and then fine-tuned. Errors in the syntax are reported in the results window.
Add to orchestration¶
Use the Orchestration option to define a schedule for a query.
To add a query to an orchestration
From the Queries page, open the menu for a query, and then select View. This opens a query editor.
Tip
The query does not need to be in edit mode to configure an orchestration.
Under Being Sent To click Add one now. This opens the Add Orchestration dialog box.
Follow the steps to add an orchestration. The steps will vary depending on the destination, the data template, and the orchestration.
Click Save.
Example queries¶
This section contains examples of queries that you can add to your tenant:
Important
These queries are not meant to be copy/paste queries. Use them as examples. Most will require some customization to be used effectively within your tenant.
Cohort analysis¶
The following SQL builds a cohort analysis against the Transaction Attributes Extended table that returns a month-by-month view of customers acquired, split by channel (store vs. online), and then for each monthly cohort, how many repurchased within 60, 90, 180, and 365 days, and the channel on which customers made their repeat purchases.
Tip
Build cohort analysis queries for your tenant, and then send the results downstream to your favorite analytics and/or BI tool.
SELECT
YEAR(tae.first_order_datetime) AS first_order_year
,MONTH(tae.first_order_datetime) AS first_order_month
,tae.first_order_purchase_channel
,COUNT(*) AS num_amp_id
,SUM(CASE
WHEN tae.second_order_datetime <= tae.first_order_datetime + interval '60' day
THEN 1
ELSE 0
END) AS repeat_60d
,SUM(CASE
WHEN tae.second_order_purchase_channel = 'web'
AND tae.second_order_datetime <= tae.first_order_datetime + interval '60' day
THEN 1
ELSE 0
END) AS repeat_60d_web
,SUM(CASE
WHEN tae.second_order_purchase_channel = 'store'
AND tae.second_order_datetime <= tae.first_order_datetime + interval '60' day
THEN 1
ELSE 0
END) AS repeat_60d_store
,SUM(CASE
WHEN tae.second_order_purchase_channel = 'web'
AND tae.second_order_datetime <= tae.first_order_datetime + interval '90' day
THEN 1
ELSE 0
END) AS repeat_90d_web
,SUM(CASE
WHEN tae.second_order_purchase_channel = 'store'
AND tae.second_order_datetime <= tae.first_order_datetime + interval '90' day
THEN 1
ELSE 0
END) AS repeat_90d_store
,SUM(CASE
WHEN tae.second_order_purchase_channel = 'web'
AND tae.second_order_datetime <= tae.first_order_datetime + interval '180' day
THEN 1
ELSE 0
END) AS repeat_180d_web
,SUM(CASE
WHEN tae.second_order_purchase_channel = 'store'
AND tae.second_order_datetime <= tae.first_order_datetime + interval '180' day
THEN 1
ELSE 0
END) AS repeat_180d_store
,SUM(CASE
WHEN tae.second_order_purchase_channel = 'web'
AND tae.second_order_datetime <= tae.first_order_datetime + interval '365' day
THEN 1
ELSE 0
END) AS repeat_365d_web
,SUM(CASE
WHEN tae.second_order_purchase_channel = 'store'
AND tae.second_order_datetime <= tae.first_order_datetime + interval '365' day
THEN 1
ELSE 0
END) AS repeat_365d_store
FROM Transaction_Attributes_Extended tae
GROUP BY 1,2,3
ORDER BY 1,2,3
Count loyalty by state¶
The following example counts customers in the United States, and then also in California, Oregon, Washington, Alaska, and Hawaii who also belong to the loyalty program (which is indicated when loyalty_id is not NULL):
SELECT
state
,COUNT(amperity_id) AS TotalCustomers
FROM
Customer360
WHERE (UPPER("country") = 'US'
AND UPPER("state") in ('AK', 'CA', 'HI', 'OR', 'WA')
AND LOWER("loyalty_id") IS NOT NULL)
GROUP BY state
Customer acquisition¶
The following examples show how to track customer acquisition by day for single- and multi-brand tenants.
Tip
Build customer acquisition queries for your tenant, and then send the results downstream to your favorite analytics and/or BI tool.
For single-brand tenants
SELECT
DATE(first_order_datetime) AS first_order_date
,COUNT(DISTINCT amperity_id) AS total_customers
,SUM(CASE
WHEN one_and_done
THEN 1
ELSE 0
END) AS total_one_and_done
,AVG(lifetime_order_revenue) AS avg_clv
,SUM(CASE
WHEN L12M_order_frequency > 0
THEN 1
ELSE 0
END) AS L12M_total_orders
FROM Transaction_Attributes_Extended
GROUP BY 1,2,3
ORDER BY 1,2,3
For multi-brand tenants
SELECT
multi_purchase_brand
,multi_purchase_channel
,DATE(first_order_datetime) AS first_order_date
,COUNT(DISTINCT amperity_id) AS total_customers
,SUM(CASE
WHEN one_and_done
THEN 1
ELSE 0
END) AS total_one_and_done
,AVG(lifetime_order_revenue) AS avg_clv
,SUM(CASE
WHEN L12M_order_frequency > 0
THEN 1
ELSE 0
END) AS L12M_total_orders
FROM Transaction_Attributes_Extended
GROUP BY 1,2,3
ORDER BY 1,2,3
Days to second purchase¶
The following example shows how to return the days to second purchase starting from a date range for the first order.
SELECT
days_to_second_purchase
,COUNT(DISTINCT amperity_id) AS "customer_count"
FROM (
SELECT
amperity_id
,first_order_datetime
,second_order_datetime
,DATE_DIFF('day', first_order_datetime, second_order_datetime) AS "days_to_second_purchase"
FROM Transaction_Attributes
WHERE (
(
"amperity_id" IN (
SELECT DISTINCT "t0"."amperity_id"
FROM "Transaction_Attributes_Extended" "t0"
WHERE (
"t0"."first_order_datetime" BETWEEN TIMESTAMP '2019-01-01'
AND TIMESTAMP '2022-01-01'
AND "t0"."second_order_datetime" IS NOT NULL
)
)
)
)
GROUP BY amperity_id, first_order_datetime, second_order_datetime
)
GROUP BY 1
ORDER BY 1 ASC
Tip
You can extend the WHERE statement to show days to second purchase by brand:
WHERE (
"t0"."first_order_datetime" BETWEEN timestamp '2019-12-01'
AND timestamp '2023-12-28'
AND "t0"."second_order_datetime" IS NOT NULL
AND "t0"."first_order_purchase_brand" = 'Brand A'
--AND "t0"."first_order_purchase_brand" = 'Brand B'
--AND "t0"."first_order_purchase_brand" = 'Brand C'
--AND "t0"."first_order_purchase_brand" = 'Brand ...'
--Comment brands out when not needed
)
Google Analytics reports¶
The following example shows how build acquisition channel reports using the GoogleAnalytics4_TransactionalAnalytics4 table that is created by the Google Analytics data source.
WITH first_order AS (
SELECT DISTINCT
ut.amperity_id
,order_ID
,first_order_datetime
FROM Unified_Transactions ut
INNER JOIN (
SELECT
amperity_id
,first_order_datetime
FROM Transaction_Attributes
WHERE first_order_datetime
BETWEEN timestamp '2022-09-05 00:00:00'
AND timestamp '2022-09-13 00:00:00') ta
ON ta.amperity_id = ut.amperity_id
AND ta.first_order_datetime = ut.order_datetime
)
SELECT
sessionSource
,COUNT(amperity_id) new_customers
FROM first_order
LEFT JOIN (
SELECT
transactionId
,sessionSource
FROM GoogleAnalytics4_TransactionalAnalytics4
) GA
ON first_order.order_ID = GA.transactionId
GROUP BY 1
ORDER BY 2 DESC
What columns are available?
You can use any column in the GoogleAnalytics4_TransactionalAnalytics4 table to build reports:
itemRevenue
transactionId
browser
sessionCampaignName
sessionSource
sessionGoogleAdsAdGroupName
date
deviceCategory
operatingSystem
sessionGoogleAdsQuery
operatingSystemVersion
sessionMedium
sessionGoogleAdsAdGroupId
Leaky bucket ratio¶
A leaky bucket ratio tracks the difference between the customers your brand wins and loses over time.
The following query shows how to build a leaky bucket ratio by using columns in the Customer Attributes and Transaction Attributes Extended tables to find the difference between lost and won customers during the past month and the past year.
WITH leaky_bucket AS (
SELECT
YEAR(churn_trigger_start_datetime) AS year
,MONTH(churn_trigger_start_datetime) AS month
,SUM(IF(churn_trigger = 'Lost',1,0)) AS lost
,0 AS won
FROM Customer_Attributes
GROUP BY 1,2
UNION ALL
SELECT
YEAR(first_order_datetime) AS year
,MONTH(first_order_datetime) AS month
,0 AS lost
,COUNT(first_order_id) AS won
FROM Transaction_Attributes_Extended
GROUP BY 1,2
)
SELECT
year
,month
,SUM(lost) AS lost
,SUM(won) AS won
,SUM(won) - SUM(lost) AS ratio
FROM leaky_bucket
GROUP BY 1,2
ORDER BY year DESC, month DESC
Missing Amperity IDs¶
Use the following example to find missing transaction IDs. The query calculates the percentage of missing Amperity IDs and customer IDs in the Unified Transactions table.
SELECT
COUNT(DISTINCT CASE WHEN amperity_id IS NULL THEN order_id END) orders_missing_amp_id
,COUNT(DISTINCT CASE WHEN customer_id IS NULL THEN order_id END) orders_missing_cust_id
,1.0000*COUNT(DISTINCT CASE WHEN amperity_id IS NULL THEN order_id END)/COUNT(distinct order_id) pct_orders_missing_amp_id
,1.0000*COUNT(DISTINCT CASE WHEN customer_id IS NULL THEN order_id END)/COUNT(distinct order_id) pct_orders_missing_cust_id
FROM (
SELECT DISTINCT
,order_id
,amperity_id
,customer_id AS customer_id_value
FROM Unified_Transactions
)
Update the following line and replace “customer_id_value” with an actual customer ID:
customer_id AS customer_id_value
Month-over-month revenue¶
The following query returns month-over-month revenue.
SELECT
MONTH(order_datetime) order_month
,COUNT(DISTINCT ut.amperity_id) number_customers
,COUNT(DISTINCT new_cust.amperity_id) number_new_customers
,CAST(AVG(order_revenue) AS decimal) aov
,1.00*SUM(order_revenue)/SUM(order_quantity) aor
,CAST(SUM(order_revenue) AS DECIMAL) total_revenue
FROM (
SELECT
amperity_id
,order_datetime
,SUM(item_revenue) order_revenue
,SUM(item_quantity) order_quantity
FROM Unified_Itemized_Transactions
WHERE YEAR(order_datetime)=2020
GROUP BY 1,2) ut
LEFT JOIN (
SELECT amperity_id
FROM Transaction_Attributes
WHERE year(first_order_datetime)=2020) new_cust
ON ut.amperity_id=new_cust.amperity_id
GROUP BY 1
One and dones, by category¶
The following example shows how to return all one-and-done purchasers for a single calendar year by product category:
WITH new_in_21 AS (
SELECT
amperity_id
,one_and_done
FROM Transaction_Attributes
WHERE YEAR(first_order_datetime) = 2021
),
product_categories AS (
SELECT DISTINCT
new_in_21.amperity_id
,one_and_done
,product_category
FROM Unified_Itemized_Transactions uit
INNER JOIN new_in_21 ON new_in_21.amperity_id=uit.amperity_id
)
SELECT
product_category
,COUNT(distinct amperity_id) customer_count
,1.0000*SUM(CASE when one_and_done THEN 1 ELSE 0 END) / COUNT(DISTINCT amperity_id) pct_one_done
FROM product_categories
GROUP BY 1
ORDER BY 3 DESC
One and dones, by year¶
The following example shows how to return all one-and-done purchasers for a single calendar year:
WITH one_and_dones_2022 AS (
SELECT
amperity_id
FROM Transaction_Attributes
WHERE one_and_done AND YEAR(first_order_datetime) = 2022
)
SELECT
COUNT(*) one_and_dones_2022
FROM
one_and_dones_2022
Order revenue, 7-day rolling window¶
The following example shows a rolling seven day window for order revenue.
SELECT
*
FROM (
SELECT
purchase_channel
,order_day
,SUM(order_revenue) OVER (PARTITION BY purchase_channel ORDER BY order_day ROWS BETWEEN 6 preceding AND current row) rolling_7_day_revenue
FROM (
SELECT
purchase_channel
,DATE(order_datetime) order_day
,SUM(order_revenue) order_revenue
FROM Unified_Transactions
WHERE amperity_id IS NOT NULL
AND order_datetime > (CURRENT_DATE - interval '36' day)
GROUP BY 1,2
)
)
WHERE order_day > (CURRENT_DATE - interval '30' day)
ORDER BY 1,2
Partition pCLV by brand¶
Use the NTILE() function to partition by brand, and then order by predicted customer lifetime value.
NTILE(100) OVER (PARTITION BY brand ORDER BY predicted_clv desc, _uuid_pk)
Percent rank of purchases¶
Use the PERCENT_RANK() function to return the percent rank of all purchases.
SELECT DISTINCT
amperity_id
,order_revenue
,PERCENT_RANK() OVER (ORDER BY order_revenue) pct_rank
FROM Unified_Transactions
WHERE order_datetime > DATE('2021-04-01') AND amperity_id IS NOT NULL
ORDER BY pct_rank DESC
Purchasers, 12-month rolling window¶
The following query returns a distinct count of purchasers during the previous 12 months.
WITH periods AS (
SELECT DISTINCT
YEAR(order_datetime) order_year
,MONTH(order_datetime) order_month
FROM Unified_Transactions
)
,amp_base AS (
SELECT
amperity_id
,order_year
,order_month
FROM (
SELECT DISTINCT
amperity_id
FROM Unified_Transactions
) ut
CROSS JOIN periods
)
,purchase_by_month AS (
SELECT
YEAR(order_datetime) purchase_year
,MONTH(order_datetime) purchase_month
,amperity_id AS purchase_amperity_id
,COUNT(distinct order_id) orders
FROM Unified_Transactions
GROUP BY 1,2,3
)
,all_together AS (
SELECT
order_year
,order_month
,amperity_id
,SUM(orders) order_month_total
,SUM(SUM(orders)) OVER (
PARTITION BY amperity_id
ORDER BY order_year, order_month
ROWS BETWEEN 11 preceding AND current row
) preceding_12_month_orders
FROM amp_base
LEFT JOIN purchase_by_month
ON
amp_base.amperity_id=purchase_by_month.purchase_amperity_id
AND amp_base.order_year=purchase_by_month.purchase_year
AND amp_base.order_month=purchase_by_month.purchase_month
GROUP BY 1,2,3
)
,last12_month_purchasers AS (
SELECT
order_year
,order_month
,COUNT(DISTINCT CASE
WHEN preceding_12_month_orders > 0
THEN amperity_id
END) last_12_month_purchasers
FROM all_together
GROUP BY 1,2
)
SELECT *
FROM last12_month_purchasers
ORDER BY 1 DESC, 2 DESC
Product replenishment by date¶
The following query shows how to return results for product replenishment use cases. The query uses first purchase dates and next purchase dates to build timeframes around which a replenishment campaign can be run.
Important
You must configure this query to match your brand’s product catalog.
The product_description field represents the name of a specific product in your product catalog. Update this field to match the field in your product catalog that defines product descriptions. For example, this may be a field that returns a SKU value. Update the “’<string>’” value to match the specific name of a product.
The product_group field represents a category of products within your product catalog. Update this field to match the field in your product catalog that defines product categories.
Use the “AND product_group <> ‘<group>’” line to specify one (or more) groups. Add a line for each group. Remove the line when product categories should not returned for your product replenishment use case.
Explore the query results and tune the query parameters to align to your brand’s product catalog and replenishment use case.
WITH product_purchases AS (
SELECT
amperity_id
,product_description
,order_datetime
FROM Unified_Itemized_Transactions
WHERE product_description = '<string>'
AND item_list_price > 0
AND product_group <> '<group>'
AND item_revenue > 0
)
,product_purchase_pairs AS (
SELECT
t1.amperity_id
,t1.product_description
,t1.order_datetime AS first_purchase_date
,MIN(t2.order_datetime) AS next_purchase_date
FROM product_purchases t1
JOIN product_purchases t2 ON t1.amperity_id = t2.amperity_id
AND t1.product_description = t2.product_description
AND t1.order_datetime < t2.order_datetime
GROUP BY t1.amperity_id
,t1.product_description
,t1.order_datetime
)
SELECT *
FROM (
SELECT
DATE_DIFF(
'day'
,first_purchase_date
,next_purchase_date
) AS days_between_purchases
,COUNT(*) AS number_of_customers
FROM product_purchase_pairs
GROUP BY DATE_DIFF ('day', first_purchase_date, next_purchase_date)
ORDER BY days_between_purchases
)
WHERE days_between_purchases > 0
Product replenishment by decile¶
The following query shows how to return results for product replenishment use cases. The query uses first purchase dates and next purchase dates to build timeframes, and then groups customers into deciles.
Important
You must configure this query to match your brand’s product catalog.
The product_description field represents the name of a specific product in your product catalog. Update this field to match the field in your product catalog that defines product descriptions. For example, this may be a field that returns a SKU value. Update the “’<string>’” value to match the specific name of a product.
The product_group field represents a category of products within your product catalog. Update this field to match the field in your product catalog that defines product categories.
Use the “AND product_group <> ‘<group>’” line to specify one (or more) groups. Add a line for each group. Remove the line when product categories should not returned for your product replenishment use case.
Explore the query results and tune the query parameters to align to your brand’s product catalog and replenishment use case.
WITH product_purchases AS (
SELECT
amperity_id
,product_description
,order_datetime
FROM Unified_Itemized_Transactions
WHERE item_list_price > 0
AND product_group <> 'SAMPLE'
AND item_revenue > 0
)
,product_purchase_pairs AS (
SELECT
t1.amperity_id
,t1.product_description
,t1.order_datetime AS first_purchase_date
,MIN(t2.order_datetime) AS next_purchase_date
FROM product_purchases t1
JOIN product_purchases t2 ON t1.amperity_id = t2.amperity_id
AND t1.product_description = t2.product_description
AND t1.order_datetime < t2.order_datetime
GROUP BY t1.amperity_id
,t1.product_description
,t1.order_datetime
)
,days_diff AS (
SELECT
amperity_id
,product_description
,DATE_DIFF(
'day'
,first_purchase_date
,next_purchase_date
) AS days_between_purchases
FROM product_purchase_pairs
)
,deciles AS (
SELECT
amperity_id
,product_description
,days_between_purchases
,NTILE(10) OVER (
PARTITION BY product_description
ORDER BY days_between_purchases ASC
) AS decile
FROM days_diff
WHERE days_between_purchases > 0
)
SELECT *
FROM (
SELECT
product_description
,AVG(days_between_purchases) AS average_days_between
,COUNT(amperity_id) AS count_replenishments
,COUNT(DISTINCT amperity_id) AS count_customers
,MAX(CASE
WHEN decile = 1 THEN days_between_purchases
END) AS "Maximum days (1st decile)"
,MAX(CASE
WHEN decile = 2 THEN days_between_purchases
END) AS "2nd decile"
,MAX(CASE
WHEN decile = 3 THEN days_between_purchases
END) AS "3rd decile"
,MAX(CASE
WHEN decile = 4 THEN days_between_purchases
END) AS "4th decile"
,MAX(CASE
WHEN decile = 5 THEN days_between_purchases
END) AS "5th decile"
,MAX(CASE
WHEN decile = 6 THEN days_between_purchases
END) AS "6th decile"
,MAX(CASE
WHEN decile = 7 THEN days_between_purchases
END) AS "7th decile"
,MAX(CASE
WHEN decile = 8 THEN days_between_purchases
END) AS "8th decile"
,MAX(CASE
WHEN decile = 9 THEN days_between_purchases
END) AS "9th decile"
,MAX(CASE
WHEN decile = 10 THEN days_between_purchases
END) AS "Minimum days (10th decile)"
FROM deciles
GROUP BY 1
ORDER BY 3 DESC
)
Products by revenue¶
Use the following example to return your top 5 products by revenue.
SELECT
product_id
,SUM(item_revenue) AS total_revenue
FROM Unified_Itemized_Transactions
WHERE
DATE_TRUNC('month', order_datetime) = DATE_TRUNC('month', CURRENT_DATE)
GROUP BY product_id
ORDER BY total_revenue DESC
LIMIT 5
Use the LIMIT clause to configure the top N products.
Rank by product affinity¶
The following query shows an example of how to rank customers by product affinity:
WITH rankings AS (
SELECT
amperity_id
,product_attribute
,RANK() OVER (PARTITION BY amperity_id ORDER BY ranking) rec_ranking
FROM Predicted_Affinity_Table
WHERE ranking < 10000000
)
,top_5_by_customers AS (
SELECT
amperity_id
,MAX(CASE WHEN rec_ranking=1 THEN product_attribute END) prod_ranking_1
,MAX(CASE WHEN rec_ranking=2 THEN product_attribute END) prod_ranking_2
,MAX(CASE WHEN rec_ranking=3 THEN product_attribute END) prod_ranking_3
,MAX(CASE WHEN rec_ranking=4 THEN product_attribute END) prod_ranking_4
,MAX(CASE WHEN rec_ranking=5 THEN product_attribute END) prod_ranking_5
FROM rankings
GROUP BY 1
)
SELECT *
FROM top_5_by_customers
Use the WHERE clause to set the maximum threshold for product affinity:
WHERE ranking < some_value
For example, if the threshold is “10000000” then a customer is more than ten millionth in product affinity, then that customer will not be included in the ranking.
Rank by Amperity ID¶
Use the RANK() function to find the largest transactions by Amperity ID, and then return them in ascending order:
WITH ranked_transactions AS (
SELECT
t.orderid
,t.amperity_id
,t.transactiontotal
,RANK() OVER (
PARTITION BY t.amperity_id
ORDER BY t.transactiontotal DESC
) AS rank
FROM TransactionsEcomm t
ORDER BY t.amperity_id, rank ASC
LIMIT 100
)
SELECT * FROM ranked_transactions WHERE rank = 1
Reachable email, by state¶
The following example shows how to return a list of customers with reachable email addresses, by state:
SELECT
COUNT(amperity_id) AS count
,state
,SUM(email_count) AS has_email
FROM (
SELECT
c.amperity_id
,state
,IF(e.amperity_id IS NULL,0,1) AS email_count
FROM Customer_360 c
LEFT JOIN (
SELECT DISTINCT amperity_id FROM Email_Table agg
WHERE agg.email_opt_out_field = 'No'
AND agg.email_out_status = 'opt-in'
) e ON e.amperity_id = c.amperity_id
WHERE state IS NOT NULL)
GROUP BY state
Return percentiles¶
Return 10th percentiles:
SELECT
tier*.1 AS tier
,MIN(lifetime_order_revenue) lifetime_revenue_amount
FROM (
SELECT
amperity_id
,lifetime_order_revenue
,NTILE(10) OVER (ORDER BY lifetime_order_revenue) tier
FROM Transaction_Attributes
)
GROUP BY 1
ORDER BY 1
Return percentiles by 4:
SELECT
tier*.25 tier
,MIN(lifetime_order_revenue) lifetime_revenue_amount
FROM (
SELECT DISTINCT
amperity_id
,lifetime_order_revenue
,NTILE(4) OVER (ORDER BY lifetime_order_revenue) tier
FROM Transaction_Attributes
)
GROUP BY 1
ORDER BY 1
Revenue by month¶
The following query returns three values for each month in a calendar year: revenue for customers who are in your top 20 percent, total revenue, and the percent of total revenue your top 20 percent customers represent.
SELECT
month
,SUM(CASE
WHEN top_20
THEN revenue
ELSE NULL
END) revenue
,SUM(revenue) total_revenue
,1.00*SUM(CASE
WHEN top_20
THEN revenue
ELSE NULL
END)/SUM(revenue) percent
FROM (
SELECT
month
,amperity_id
,revenue
,CASE
WHEN rank IN (9,10) THEN TRUE
ELSE FALSE
END AS top_20
FROM (
SELECT
MONTH(order_datetime) AS month
,amperity_id
,SUM(order_revenue) revenue
,NTILE(10) OVER (ORDER BY SUM(order_revenue)) rank
FROM Unified_Transactions
WHERE YEAR(order_datetime)=2022
GROUP BY 1,2
)
)
GROUP BY 1
ORDER BY month ASC
Revenue opportunity, 1 group¶
Use the following example to size revenue opportunity for a single group, such as for “one-and-done” or for “repeat purchaser”. This query is most effective when run against the Transaction Attributes Extended table, but may be useful when run against other tables. Customize the WHERE clause in the WITH statement to define the group. Customize the main SELECT statement to define the time periods to measure.
WITH customers AS (
SELECT
amperity_id
FROM Transaction_Attributes_Extended
WHERE one_and_done
)
SELECT
COUNT(DISTINCT tae.amperity_id) AS num_customers
,AVG(lifetime_average_order_value) AS lt_aov
,AVG(L12M_average_order_value) AS l12m_aov
,AVG(lifetime_order_revenue) AS lt_rev
,AVG(l12m_order_revenue) AS l12m_rev
,AVG(lifetime_average_item_price) AS lt_aur
,AVG(l12m_average_item_price) AS l12m_aur
,AVG(lifetime_order_frequency) AS avg_lt_orders
,AVG(l12m_order_frequency) AS avg_l12m_orders
,SUM(l12m_order_revenue) AS total_l12m_revenue
FROM customers cust
INNER JOIN Transaction_Attributes_Extended tae
ON cust.amperity_id=tae.amperity_id
Revenue opportunity, many groups¶
Use the following example to size revenue opportunity for multiple groups, such as moving from “one-and-done” to “repeat purchaser”. This query is most effective when run against the Transaction Attributes Extended table, but may be useful when run against other tables. Customize the WITH statement to define a list of fields to be used for comparing movements between groups. Customize the main SELECT statement to define the time periods for which this movement will be measured.
WITH customers AS (
SELECT
amperity_id
,multi_purchase_channel
FROM Transaction_Attributes_Extended
)
SELECT
cust.multi_purchase_channel
,COUNT(DISTINCT tae.amperity_id) AS num_customers
,AVG(lifetime_average_order_value) AS lt_aov
,AVG(L12M_average_order_value) AS l12m_aov
,AVG(lifetime_order_revenue) AS lt_rev
,AVG(l12m_order_revenue) AS l12m_rev
,AVG(lifetime_average_item_price) AS lt_aur
,AVG(l12m_average_item_price) AS l12m_aur
,AVG(lifetime_order_frequency) AS avg_lt_orders
,AVG(l12m_order_frequency) AS avg_l12m_orders
,SUM(l12m_order_revenue) AS total_l12m_revenue
FROM customers cust
INNER_JOIN Transaction_Attributes_Extended tae
ON cust.amperity_id=tae.amperity_id
GROUP BY 1
Top 20% revenue during previous year¶
Which customers are your top 20-percent for revenue during the previous year?
Use the monetary component of approximate RFM scores to quickly find customers who are in the top 20-percent, along with the percentage of total revenue those customers represent.
Tip
Divide percentage of revenue by the number of customers in the top 20 to find the average percentage of revenue that is generated by each customer across all of your brands, stores, channels, and regions.
The following example returns your top 20 percent of customers by revenue:
SELECT
amperity_id
,L12M_monetary
FROM Transaction_Attributes_Extended
WHERE L12M_monetary >= 9
Totals by brand¶
Use the SUM() function to build totals for quantity, returns, and revenue:
Total quantity by brand¶
Use the SUM() function to return total quantity by brand, and then sort in descending order:
SELECT
purchase_brand
,SUM(order_quantity) AS total_quantity
FROM Unified_Transactions
GROUP BY 1
ORDER BY 1 DESC
Total returns by brand¶
Use the SUM() function to return total returns by brand, and then sort in descending order:
SELECT
purchase_brand
,SUM(order_returned_revenue) AS total_returns
FROM Unified_Transactions
GROUP BY 1
ORDER BY 1 DESC
Total revenue by brand¶
Use the SUM() function to return total revenue by brand, and then sort in descending order:
SELECT
purchase_brand
,SUM(order_revenue) AS total_revenue
FROM Unified_Transactions
GROUP BY 1
ORDER BY 1 DESC
Transactions by day¶
Use the following example to generate a breakdown of customer and order metrics by day. This query must be run against the Unified Transactions table. Export the results of this query to BI tools like Tableau or desktop tools like Google Sheets or Microsoft Excel.
SELECT
DATE(order_datetime) AS day_of_order
,purchase_brand
,purchase_channel
,SUM(order_revenue) AS total_revenue
,AVG(order_revenue) AS aov
,COUNT(DISTINCT order_id) AS orders
,COUNT(DISTINCT amperity_id) AS customers
,SUM(order_quantity) AS total_items
,AVG(order_quantity) AS avg_items
,SUM(order_returned_revenue) AS total_returned_revenue
FROM Unified_Transactions
GROUP BY 1,2,3
ORDER BY 1,2,3