Detailed Examples table

The Detailed Examples table has detailed examples of Stitch results. Use these examples to help identify which features lead to scores with the biggest effect on Stitch results, including how they associate with various combinations of fields that contain PII data.

The Detailed Examples table is automatically added to the Stitch QA database as a SQL table when it is generated using the “Stitch QA” database template.

Use with Stitch QA

Use the Detailed Examples table as part of the Stitch QA process to return a detailed set of examples that show you what is in your customer 360 database. Use this data to identify which types of features lead to scores with the largest effect on overall Stitch results, including how those features are associated with various combinations of fields that contain profile (PII) data.

Configure query

The Detailed Examples query should be configured to match the fields that are in your tenant. Open this table in the Database Editor in edit mode and make the following updates:

  1. Uncomment rows of matching a/b pairs or add new rows so that the uncommented rows match your tenant.

  2. Adjust case count and score count limits, if necessary.

  3. Add customer keys (ck) if they are available (and update this query if customer keys are added).

  4. Uncomment rows for the bad-values blocklist after it is enabled for your tenant.

Column reference

The Detailed Examples table contains the following columns, as its starting point. This table is typically updated to add more pairs, enable use for blocklisted values, and to support additional tenant-specific use cases.

Column name

Data type

Description

Amperity ID

String

The unique identifier assigned to clusters of customer profiles that all represent the same individual. The Amperity ID does not replace primary, foreign, or other unique customer keys, but exists alongside them within unified profiles.

Note

The Amperity ID is a universally unique identifier (UUID) that is 36 characters spread across five groups separated by hyphens: 8-4-4-4-12.

For example:

123e4567-e89b-12d3-a456-426614174000

Address A

Address B

String

The address connected with the location of a customer, such as “123 Main Street”.

Birthdate A

Birthdate B

String

The date of birth connected with a customer.

Case Count

String

City A

City B

String

The city connected with the location of a customer.

Country A

Country B

String

The country connected with the location of a customer.

Datasource A

Datasource B

String

The name of the data source from which this customer profile originated.

Tip

The combination of PK and Datasource uniquely identifies a row in the Unified Coalesced table, which correlates to a single row in a domain table.

Email A

Email B

String

The email address connected with a customer. A customer may have more than one email address.

Given Name A

Given Name B

String

The first name connected with a customer.

Match Category

String

A match category is a classifier that applies to an individual record-pair within a cluster of record-pairs. The match category is the result of this classification.

Match Category

Description

Exact

Amperity has the highest confidence that these records represent the same person because all profile data exactly matches.

Excellent

Amperity has near perfect confidence that these records belong to the same person, despite select types of profile data not matching.

High

Using deductive reasoning, Amperity has high confidence that these records match, despite some profile data not matching.

Moderate

Amperity has moderate confidence that these records match, due to weak or fuzzy matches between unique customer attributes, such as email, phone, or address.

Weak

Amperity lacks confidence, but if asked to guess, Amperity would assert these records do belong to the same individual, because they match on non-unique customer attributes, such as name, state, zip code.

Non-match

Amperity has high confidence that these records do NOT match, because core profile data is in conflict.

Also in: Unified Scores

Phone A

Phone B

String

The phone number connected with a customer. A customer may have more than one phone number.

PK A

PK B

String

The combination of data source and primary key allows Amperity to uniquely identify every row in every data table across the entirety of customer data input to Amperity.

Tip

The combination of PK and Datasource uniquely identifies a row in the Unified Coalesced table, which correlates to a single row in a domain table.

Postal A

Postal B

String

The zip code or postal code connected with the location of a customer.

Score

Float

A score has a value from “0.0” to “5.0” that represents the combined score assigned to the record pair by Stitch. A score has two parts: the score is on the left side and the score’s strength is on the right.

The record pair score correlates to the match category, which is a classifier applied by Amperity to individual record pairs. The record pair score corresponds to the classification:

  • 5 for exact matches

  • 4 for excellent matches

  • 3 for high matches

  • 2 for moderate matches

  • 1 for weak matches

  • 0 for no matches

The record pair strength represents the strength of the record pair score. It is a two digit number. For example: .31 is a lower strength and .93 is a higher strength.

Note

Scores are shown for records that end up in the same cluster, including any scores that are below threshold. Scores are not shown for records that do not end up in the same cluster.

Also in: Unified Scores

Score Count

String

State A

State B

String

The state or province connected with the location of a customer.

Surname A

Surname B

String

The last name connected with a customer.

SQL reference

  1WITH Filtered_Scores AS (
  2  SELECT
  3    score
  4    ,COUNT(1) AS c
  5    ,500/COUNT(1) AS p
  6  FROM Unified_Scores
  7  GROUP BY score HAVING COUNT(1) > 10000
  8),
  9Sampled_Pairs AS (
 10  SELECT
 11    us.amperity_id
 12    ,us.match_category
 13    ,us.source1
 14    ,us.source2
 15    ,us.pk1
 16    ,us.pk2
 17    ,us.score
 18  FROM Filtered_Scores fs
 19  JOIN Unified_Scores us
 20  ON us.score = fs.score
 21  WHERE rand() < p
 22)
 23
 24SELECT
 25  scores.match_category
 26  ,scores.score
 27  ,fs.c AS score_count
 28  ,scores.amperity_id
 29  ,t1.datasource AS datasource_a
 30  ,t2.datasource AS datasource_b
 31  ,t1.given_name AS given_name_a
 32  ,t2.given_name AS given_name_b
 33  ,t1.surname AS surname_a
 34  ,t2.surname AS surname_b
 35  -- UNCOMMENT BELOW IF full_name IS AVAILABLE
 36  -- ,t1.full_name AS full_name_a
 37  -- ,t2.full_name AS full_name_b
 38  -- UNCOMMENT BELOW IF generational_suffix IS AVAILABLE
 39  -- ,t1.generational_suffix AS generational_suffix_a
 40  -- ,t2.generational_suffix AS generational_suffix_b
 41  -- UNCOMMENT BELOW IF title IS AVAILABLE
 42  -- ,t1.title AS title_a
 43  -- ,t2.title AS title_b
 44  ,t1.email AS email_a
 45  ,t2.email AS email_b
 46  ,t1.phone AS phone_a
 47  ,t2.phone AS phone_b
 48  ,t1.address AS address_a
 49  ,t2.address AS address_b
 50  -- UNCOMMENT BELOW IF address2 IS AVAILABLE
 51  -- ,t1.address2 AS address2_a
 52  -- ,t2.address2 AS address2_b
 53  ,t1.birthdate AS birthdate_a
 54  ,t2.birthdate AS birthdate_b
 55  ,t1.city AS city_a
 56  ,t2.city AS city_b
 57  ,t1.state AS state_a
 58  ,t2.state AS state_b
 59  ,t1.country AS country_a
 60  ,t2.country AS country_b
 61  ,t1.postal AS postal_a
 62  ,t2.postal AS postal_b
 63  -- UNCOMMENT BELOW IF company IS AVAILABLE
 64  -- ,t1.company AS company_a
 65  -- ,t2.company AS company_b
 66  -- UNCOMMENT BELOW IF gender IS AVAILABLE
 67  -- ,t1.gender AS gender_a
 68  -- ,t2.gender AS gender_b
 69  ,t1.pk AS pk_a
 70  ,t2.pk AS pk_b
 71  -- UNCOMMENT BELOW IF ck IS AVAILABLE
 72  -- ,t1.ck AS ck_a
 73  -- ,t2.ck AS ck_b
 74  -- OPTIONAL: UNCOMMENT FOR BAD-VALUES BLACKLIST
 75  -- ,t1.has_blv AS has_blv_a
 76  -- ,t2.has_blv AS has_blv_b
 77  -- ,t1.blv_given_name AS blv_given_name_a
 78  -- ,t2.blv_given_name AS blv_given_name_b
 79  -- ,t1.blv_surname AS blv_surname_a
 80  -- ,t2.blv_surname AS blv_surname_b
 81  -- ,t1.blv_email AS blv_email_a
 82  -- ,t2.blv_email AS blv_email_b
 83  -- ,t1.blv_phone AS blv_phone_a
 84  -- ,t2.blv_phone AS blv_phone_b
 85  -- ,t1.blv_address AS blv_address_a
 86  -- ,t2.blv_address AS blv_address_b
 87  -- USE AS DESIRED FOR namespaced fks
 88  -- ,t1.fk_field_name AS fk_field_name_a
 89  -- ,t2.fk_field_name AS fk_field_name_b
 90  ,ROW_NUMBER() OVER (PARTITION BY scores.score
 91                      ORDER BY scores.amperity_id, scores.pk1, scores.pk2
 92                      ) AS case_count
 93  -- ,ROW_NUMBER() OVER (PARTITION BY scores.amperity_id
 94                         ORDER BY scores.amperity_id, scores.pk1, scores.pk2
 95                         ) AS amperity_id_count
 96FROM Sampled_Pairs AS scores
 97LEFT JOIN Filtered_Scores
 98  AS fs ON fs.score = scores.score
 99LEFT JOIN Unified_Coalesced
100  AS t1 ON scores.pk1 = t1.pk AND scores.source1 = t1.datasource
101LEFT JOIN Unified_Coalesced
102  AS t2 ON scores.pk2 = t2.pk AND scores.source2 = t2.datasource
103-- OPTIONAL: UNCOMMENT FOR BAD-VALUES BLOCKLIST
104-- AND has_blv IS NULL
105-- AND blv_surname IS NULL
106-- AND blv_given_name IS NULL
107-- AND blv_email IS NULL
108-- AND blv_phone IS NULL
109-- AND blv_address IS NULL