Detailed examples

Use this query to return a detailed set of examples that show you what is in your customer 360 database. Use this data to identify which types of features lead to scores with the largest effect on overall Stitch results, including how those features are associated with various combinations of fields that contain PII data.

When to use

  1. Run this query on a regular basis.

  2. Review the results of this query (typically ~30-60 minutes).

  3. Look for clusters with anomalies above the threshold.

  4. Investigate each individual cluster in which anomalies are discovered.

Configure query

  1. From the Queries page, open the Stitch QA folder, and then select this query.

    Tip

    Add the Stitch QA queries template folder if it does not already exist.

  2. Update the list of matched records to reflect those used for your tenant. The list of matched records is an A/B pair of semantic fields. The default template includes given-name, surname, email, phone, address, birthdate, city, state, and postal.

    Add matched records for one (or more) of the full-name, generational-suffix, gender, address2, company, and title semantic fields. For example:

     ,t1.given_name AS given_name_a
     ,t2.given_name AS given_name_b
     ,t1.surname AS surname_a
     ,t2.surname AS surname_b
     ,t1.full_name AS full_name_a
     ,t2.full_name AS full_name_b
     ,t1.generational_suffix AS generational_suffix_a
     ,t2.generational_suffix AS generational_suffix_b
     ,t1.title AS title_a
     ,t2.title AS title_b
     ,t1.email AS email_a
     ,t2.email AS email_b
     ,t1.phone AS phone_a
     ,t2.phone AS phone_b
     ,t1.address AS address_a
     ,t2.address AS address_b
     ,t1.address2 AS address2_a
     ,t2.address2 AS address2_b
     ,t1.birthdate AS birthdate_a
     ,t2.birthdate AS birthdate_b
     ,t1.city AS city_a
     ,t2.city AS city_b
     ,t1.state AS state_a
     ,t2.state AS state_b
     ,t1.country AS country_a
     ,t2.country AS country_b
     ,t1.postal AS postal_a
     ,t2.postal AS postal_b
     ,t1.company AS company_a
     ,t2.company AS company_b
     ,t1.gender AS gender_a
     ,t2.gender AS gender_b
    
  3. When customer keys are present, uncomment the matching records:

    ,t1.pk AS pk_a
    ,t2.pk AS pk_b
    ,t1.ck AS ck_a
    ,t2.ck AS ck_b
    
  4. Update the matching records for all foreign keys:

    -- ,t1.fk_field_name AS fk_field_name_a
    -- ,t2.fk_field_name AS fk_field_name_b
    

    For example:

    t1.fk_customer_id AS fk_customer_id_a,
    t2.fk_customer_id AS fk_customer_id_b,
    t1.fk_campaign AS fk_campaign_a
    t2.fk_campaign AS fk_campaign_b
    
  5. Rows may be numbered by case (default) or by Amperity ID.

    ROW_NUMBER() OVER (PARTITION BY scores.score
                       ORDER BY scores.amperity_id
                                ,scores.pk1
                                ,scores.pk2)
                       AS case_count
    -- ROW_NUMBER() OVER (PARTITION BY scores.amperity_id
    --                    ORDER BY scores.amperity_id,
    --                             scores.pk1,
    --                             scores.pk2)
    --                    AS amperity_id_count
    
  6. Set the case count limit to be greater than “1” to return more examples of each feature:

    Filtered_Examples AS (
      SELECT * FROM Detailed_Examples
      WHERE case_count <= 1
      AND score_count > 1000
    )
    

    Important

    Set case_count to “1” when calculating coverage.

  7. Set the score count limit:

    Filtered_Examples AS (
      SELECT * FROM Detailed_Examples
      WHERE case_count <= 1
      AND score_count > 1000
    )
    
  8. Update the list of fields for filtered examples for the list of semantic values you want included in the results.

    For example, to include email, given_name, address, and phone:

    SELECT *
      FROM Filtered_Examples
      WHERE email_a <> email_b
      AND given_name_a <> given_name_b
      AND address_a <> address_b
      AND phone_a <> phone_b
    ORDER BY score DESC
    

    Hint

    This should be the same list as specified for matched records.

  9. To return coverage statistics, uncomment the SELECT statement under coverage stats:

    SELECT
      SUM(score_count) AS example_coverage
      ,(SELECT COUNT(1) FROM Unified_Scores) AS total_scores,
      ,CAST(SUM(score_count) as DECIMAL(12,3))
            / CAST((SELECT COUNT(1) FROM Unified_Scores) AS DECIMAL(12,3))
            * CAST(100 AS DECIMAL) AS coverage_percentage
    FROM Filtered_Examples
    
  10. If using the bad-values blocklist you may apply those blocklist values to the results of this query. This requires uncommenting two sections of this query. First, uncomment the matching records:

    ,t1.has_blv as has_blv_a
    ,t2.has_blv as has_blv_b
    ,t1.blv_given_name as blv_given_name_a
    ,t2.blv_given_name as blv_given_name_b
    ,t1.blv_surname as blv_surname_a
    ,t2.blv_surname as blv_surname_b
    ,t1.blv_email as blv_email_a
    ,t2.blv_email as blv_email_b
    ,t1.blv_phone as blv_phone_a
    ,t2.blv_phone as blv_phone_b
    ,t1.blv_address as blv_address_a
    ,t2.blv_address as blv_address_b
    

    and then enable them to be part of the LEFT JOIN operations for unified scores:

    AND has_blv IS null
    AND blv_surname IS null
    AND blv_given_name IS null
    AND blv_email IS null
    AND blv_phone IS null
    AND blv_address IS null
    
  11. Click Run Query and debug any issues that may arise.

  12. Click Activate.