Detailed examples

Use this query to return a detailed set of examples that show you what is in your customer 360 database. Use this data to identify which types of features lead to scores with the largest effect on overall Stitch results, including how those features are associated with various combinations of fields that contain PII data.

When to use

  1. Run this query on a regular basis.

  2. Review the results of this query (typically ~30-60 minutes).

  3. Look for clusters with anomalies above the threshold.

  4. Investigate each individual cluster in which anomalies are discovered.

Configure query

  1. From the Queries page, open the Stitch QA folder, and then select this query.

    Tip

    Add the Stitch QA queries template folder if it does not already exist.

  2. Update the list of matched records to reflect those used for your tenant. The list of matched records is an A/B pair of semantic fields. The default template includes given-name, surname, email, phone, address, birthdate, city, state, and postal.

    Add matched records for one (or more) of the full-name, generational-suffix, gender, address2, company, and title semantic fields. For example:

     1 ,t1.given_name AS given_name_a
     2 ,t2.given_name AS given_name_b
     3 ,t1.surname AS surname_a
     4 ,t2.surname AS surname_b
     5 ,t1.full_name AS full_name_a
     6 ,t2.full_name AS full_name_b
     7 ,t1.generational_suffix AS generational_suffix_a
     8 ,t2.generational_suffix AS generational_suffix_b
     9 ,t1.title AS title_a
    10 ,t2.title AS title_b
    11 ,t1.email AS email_a
    12 ,t2.email AS email_b
    13 ,t1.phone AS phone_a
    14 ,t2.phone AS phone_b
    15 ,t1.address AS address_a
    16 ,t2.address AS address_b
    17 ,t1.address2 AS address2_a
    18 ,t2.address2 AS address2_b
    19 ,t1.birthdate AS birthdate_a
    20 ,t2.birthdate AS birthdate_b
    21 ,t1.city AS city_a
    22 ,t2.city AS city_b
    23 ,t1.state AS state_a
    24 ,t2.state AS state_b
    25 ,t1.country AS country_a
    26 ,t2.country AS country_b
    27 ,t1.postal AS postal_a
    28 ,t2.postal AS postal_b
    29 ,t1.company AS company_a
    30 ,t2.company AS company_b
    31 ,t1.gender AS gender_a
    32 ,t2.gender AS gender_b
    
  3. When customer keys are present, uncomment the matching records:

    1,t1.pk AS pk_a
    2,t2.pk AS pk_b
    3,t1.ck AS ck_a
    4,t2.ck AS ck_b
    
  4. Update the matching records for all foreign keys:

    1-- ,t1.fk_field_name AS fk_field_name_a
    2-- ,t2.fk_field_name AS fk_field_name_b
    

    For example:

    1t1.fk_customer_id AS fk_customer_id_a,
    2t2.fk_customer_id AS fk_customer_id_b,
    3t1.fk_campaign AS fk_campaign_a
    4t2.fk_campaign AS fk_campaign_b
    
  5. Rows may be numbered by case (default) or by Amperity ID.

     1ROW_NUMBER() OVER (PARTITION BY scores.score
     2                   ORDER BY scores.amperity_id
     3                            ,scores.pk1
     4                            ,scores.pk2)
     5                   AS case_count
     6-- ROW_NUMBER() OVER (PARTITION BY scores.amperity_id
     7--                    ORDER BY scores.amperity_id,
     8--                             scores.pk1,
     9--                             scores.pk2)
    10--                    AS amperity_id_count
    
  6. Set the case count limit to be greater than “1” to return more examples of each feature:

    1Filtered_Examples AS (
    2  SELECT * FROM Detailed_Examples
    3  WHERE case_count <= 1
    4  AND score_count > 1000
    5)
    

    Important

    Set case_count to “1” when calculating coverage.

  7. Set the score count limit:

    1Filtered_Examples AS (
    2  SELECT * FROM Detailed_Examples
    3  WHERE case_count <= 1
    4  AND score_count > 1000
    5)
    
  8. Update the list of fields for filtered examples for the list of semantic values you want included in the results.

    For example, to include email, given_name, address, and phone:

    1SELECT *
    2  FROM Filtered_Examples
    3  WHERE email_a <> email_b
    4  AND given_name_a <> given_name_b
    5  AND address_a <> address_b
    6  AND phone_a <> phone_b
    7ORDER BY score DESC
    

    Hint

    This should be the same list as specified for matched records.

  9. To return coverage statistics, uncomment the SELECT statement under coverage stats:

    1SELECT
    2  SUM(score_count) AS example_coverage
    3  ,(SELECT COUNT(1) FROM Unified_Scores) AS total_scores,
    4  ,CAST(SUM(score_count) as DECIMAL(12,3))
    5        / CAST((SELECT COUNT(1) FROM Unified_Scores) AS DECIMAL(12,3))
    6        * CAST(100 AS DECIMAL) AS coverage_percentage
    7FROM Filtered_Examples
    
  10. If using the bad-values blocklist you may apply those blocklist values to the results of this query. This requires uncommenting two sections of this query. First, uncomment the matching records:

     1,t1.has_blv as has_blv_a
     2,t2.has_blv as has_blv_b
     3,t1.blv_given_name as blv_given_name_a
     4,t2.blv_given_name as blv_given_name_b
     5,t1.blv_surname as blv_surname_a
     6,t2.blv_surname as blv_surname_b
     7,t1.blv_email as blv_email_a
     8,t2.blv_email as blv_email_b
     9,t1.blv_phone as blv_phone_a
    10,t2.blv_phone as blv_phone_b
    11,t1.blv_address as blv_address_a
    12,t2.blv_address as blv_address_b
    

    and then enable them to be part of the LEFT JOIN operations for unified scores:

    1AND has_blv IS null
    2AND blv_surname IS null
    3AND blv_given_name IS null
    4AND blv_email IS null
    5AND blv_phone IS null
    6AND blv_address IS null
    
  11. Click Run Query and debug any issues that may arise.

  12. Click Activate.