Detailed examples¶
Use this query to return a detailed set of examples that show you what is in your customer 360 database. Use this data to identify which types of features lead to scores with the largest effect on overall Stitch results, including how those features are associated with various combinations of fields that contain PII data.
When to use¶
Run this query on a regular basis.
Review the results of this query (typically ~30-60 minutes).
Look for clusters with anomalies above the threshold.
Investigate each individual cluster in which anomalies are discovered.
Configure query¶
From the Queries page, open the Stitch QA folder, and then select this query.
Tip
Add the Stitch QA queries template folder if it does not already exist.
Update the list of matched records to reflect those used for your tenant. The list of matched records is an A/B pair of semantic fields. The default template includes given-name, surname, email, phone, address, birthdate, city, state, and postal.
Add matched records for one (or more) of the full-name, generational-suffix, gender, address2, company, and title semantic fields. For example:
1 ,t1.given_name AS given_name_a 2 ,t2.given_name AS given_name_b 3 ,t1.surname AS surname_a 4 ,t2.surname AS surname_b 5 ,t1.full_name AS full_name_a 6 ,t2.full_name AS full_name_b 7 ,t1.generational_suffix AS generational_suffix_a 8 ,t2.generational_suffix AS generational_suffix_b 9 ,t1.title AS title_a 10 ,t2.title AS title_b 11 ,t1.email AS email_a 12 ,t2.email AS email_b 13 ,t1.phone AS phone_a 14 ,t2.phone AS phone_b 15 ,t1.address AS address_a 16 ,t2.address AS address_b 17 ,t1.address2 AS address2_a 18 ,t2.address2 AS address2_b 19 ,t1.birthdate AS birthdate_a 20 ,t2.birthdate AS birthdate_b 21 ,t1.city AS city_a 22 ,t2.city AS city_b 23 ,t1.state AS state_a 24 ,t2.state AS state_b 25 ,t1.country AS country_a 26 ,t2.country AS country_b 27 ,t1.postal AS postal_a 28 ,t2.postal AS postal_b 29 ,t1.company AS company_a 30 ,t2.company AS company_b 31 ,t1.gender AS gender_a 32 ,t2.gender AS gender_b
When customer keys are present, uncomment the matching records:
1,t1.pk AS pk_a 2,t2.pk AS pk_b 3,t1.ck AS ck_a 4,t2.ck AS ck_b
Update the matching records for all foreign keys:
1-- ,t1.fk_field_name AS fk_field_name_a 2-- ,t2.fk_field_name AS fk_field_name_b
For example:
1t1.fk_customer_id AS fk_customer_id_a, 2t2.fk_customer_id AS fk_customer_id_b, 3t1.fk_campaign AS fk_campaign_a 4t2.fk_campaign AS fk_campaign_b
Rows may be numbered by case (default) or by Amperity ID.
1ROW_NUMBER() OVER (PARTITION BY scores.score 2 ORDER BY scores.amperity_id 3 ,scores.pk1 4 ,scores.pk2) 5 AS case_count 6-- ROW_NUMBER() OVER (PARTITION BY scores.amperity_id 7-- ORDER BY scores.amperity_id, 8-- scores.pk1, 9-- scores.pk2) 10-- AS amperity_id_count
Set the case count limit to be greater than “1” to return more examples of each feature:
1Filtered_Examples AS ( 2 SELECT * FROM Detailed_Examples 3 WHERE case_count <= 1 4 AND score_count > 1000 5)
Important
Set case_count to “1” when calculating coverage.
Set the score count limit:
1Filtered_Examples AS ( 2 SELECT * FROM Detailed_Examples 3 WHERE case_count <= 1 4 AND score_count > 1000 5)
Update the list of fields for filtered examples for the list of semantic values you want included in the results.
For example, to include email, given_name, address, and phone:
1SELECT * 2 FROM Filtered_Examples 3 WHERE email_a <> email_b 4 AND given_name_a <> given_name_b 5 AND address_a <> address_b 6 AND phone_a <> phone_b 7ORDER BY score DESC
Hint
This should be the same list as specified for matched records.
To return coverage statistics, uncomment the SELECT statement under coverage stats:
1SELECT 2 SUM(score_count) AS example_coverage 3 ,(SELECT COUNT(1) FROM Unified_Scores) AS total_scores, 4 ,CAST(SUM(score_count) as DECIMAL(12,3)) 5 / CAST((SELECT COUNT(1) FROM Unified_Scores) AS DECIMAL(12,3)) 6 * CAST(100 AS DECIMAL) AS coverage_percentage 7FROM Filtered_Examples
If using the bad-values blocklist you may apply those blocklist values to the results of this query. This requires uncommenting two sections of this query. First, uncomment the matching records:
1,t1.has_blv as has_blv_a 2,t2.has_blv as has_blv_b 3,t1.blv_given_name as blv_given_name_a 4,t2.blv_given_name as blv_given_name_b 5,t1.blv_surname as blv_surname_a 6,t2.blv_surname as blv_surname_b 7,t1.blv_email as blv_email_a 8,t2.blv_email as blv_email_b 9,t1.blv_phone as blv_phone_a 10,t2.blv_phone as blv_phone_b 11,t1.blv_address as blv_address_a 12,t2.blv_address as blv_address_b
and then enable them to be part of the LEFT JOIN operations for unified scores:
1AND has_blv IS null 2AND blv_surname IS null 3AND blv_given_name IS null 4AND blv_email IS null 5AND blv_phone IS null 6AND blv_address IS null
Click Run Query and debug any issues that may arise.
Click Activate.