Unified_Coalesced table¶
The Unified_Coalesced table contains all the PII data that has been processed through Stitch. The data is organized by the semantic tag as the column heading and then coalesced into one single table. A unique Amperity ID may appear in more than one row.
The Unified_Coalesced table is added to databases in the Customer 360 tab when:
The Stitch QA database is added using the “Stitch QA” database template.
A customer 360 database is added using the “Customer 360” database template.
A table is added to a custom database using the Unified_Coalesced table as a passthrough table.
Use with Stitch QA¶
Use the Unified_Coalesced table as part of the Stitch QA process to:
Ensure semantic tags are selected and visible to the Queries page
Investigate relationships between records, especially when records appear to be assigned (not assigned) to an incorrect (correct) Amperity ID, which defines a single cluster of records
Investigate blocklisted values
Tip
The Unified_Coalesced and Unified_Preprocessed_Raw tables are frequently used together during Stitch QA because many investigation patterns will compare the values in one table to the values in the other.
Profile semantics and foreign keys¶
Use the Customer 360 page to verify that rows with profile semantic tags and foreign keys are selected to be available to the Queries page. Has the list of profile and/or foreign key semantics changed since the last time you have performed Stitch QA? This will occur when a data source is added that requires a new foreign key or in a situation where one of the less frequently used profile semantic tags is applied to a new data source.
Open the table in the Database Explorer and verify that all rows in the table that are associated with a semantic tag or a foreign key have a checkmark in the left column. Rows without a checkmark will not make the associated FIELD available to the Queries page. If rows do not have a checkmark, edit the table and apply the checkmark, save the table, activate the Stitch QA database, and then run the database to refresh the table.
Relationships between records¶
Use the Queries page to investigate unexpected relationships between individual records that share the same Amperity ID and to investigate why records that look like they should belong to the same cluster of records were split.
Use a combination of the Unified_Coalesced and Unified_Preprocessed_Raw tables to review the data that is associated with these records to help understand why groups of records were matched or why certain records were assigned to a cluster.
In some cases, you may need to investigate and compare entire clusters to better understand why (or why not) these groups of records were maintained (or split) during hierarchical comparison, a step in the Stitch process that occurs after pairwise comparison and scoring.
Global blocklist values¶
The Unified_Coalesced table provides additional data about individual records, such as blocklisted values, component IDs, or if a record belongs to a supersized cluster.
In some cases, groups of records may have values removed by the global bad-values blocklist instead of the tenant-specific bad-values blocklist.
Compare records in the Unified_Coalesced table to records in the Unified_Preprocessed_Raw table to identify which values were removed.
Column reference¶
The Unified_Coalesced table contains the following columns:
Column Name |
Data type |
Description |
---|---|---|
amperity_id |
String |
The unique identifier that is assigned to clusters of customer records that all represent the same individual. The Amperity ID does not replace primary and foreign keys, but exists alongside them within unified profiles. Note The Amperity ID is a universally unique identifier (UUID) that is represented by 36 characters spread across five groups separated by hyphens: 8-4-4-4-12. For example: 123e4567-e89b-12d3-a456-426614174000
|
address |
String |
The address that is associated with the location of an individual customer record. For example: 123 Main Street. Values in this column depend on fields that are tagged with the address semantic. Also in: Customer_360, Merged_Customers, Unified_Customer, Unified_Preprocessed_Raw |
address2 |
String |
Additional address information, such as an apartment number or a post office box, that is associated with the location of an individual customer record. For example: Apt #9. Values in this column depend on fields that are tagged with the address2 semantic. Also in: Merged_Customers, Unified_Customer |
birthdate |
Date |
The date of birth that is associated with an individual customer record. Values in this column depend on fields that are tagged with the birthdate semantic. Also in: Customer_360, Merged_Customers, Unified_Customer, Unified_Preprocessed_Raw |
blv_address |
Boolean |
When true, the address on this customer record matches a blocklist value. |
blv_email |
Boolean |
When true, the email on this customer record matches a blocklist value. |
blv_given_name |
Boolean |
When true, the given-name on this customer record matches a blocklist value. |
blv_phone |
Boolean |
When true, the phone on this customer record matches a blocklist value. |
blv_surname |
Boolean |
When true, the surname on this customer record matches a blocklist value. |
city |
String |
The city that is associated with the location of an individual customer record. Values in this column depend on fields that are tagged with the city semantic. Also in: Customer_360, Merged_Customers, Unified_Customer, Unified_Preprocessed_Raw |
ck |
String |
The ck semantic tag may be applied to a column that contains pre-existing, tenant-specific customer IDs. When customer keys are applied, Amperity compares them to the Amperity ID as part of the deduplication process. Tip What happens to customer keys in the Unified_Coalesced table?
Also in: Unified_Customer |
component_id |
Integer |
An identifier that represents a set of records that are transitively connected with a score above threshold as an outcome of blocking and initial scoring. Records that share a component ID, but have different Amperity IDs, were split during hierarchical comparison. Tip Records with different Also in: Unified_Preprocessed_Raw |
country |
String |
The country that is associated with the location of an individual customer record. Values in this column depend on fields that are tagged with the country semantic. Important The country field is added to the Unified_Coalesced table when fields are tagged with the country profile semantic. Also in: Merged_Customers, Unified_Customer |
create_dt |
String |
Apply the create-dt semantic tag to columns in customer records that identify when the data was created. The field to which this semantic is applied must be a datetime field type. Also in: Merged_Customers, Unified_Customer |
datasource |
String |
The name of the data source from which this customer record originated. Tip The combination of pk and datasource uniquely identifies a row in the Unified_Coalesced table, which correlates to a single row in a domain table. |
String |
The email address that is associated with an individual customer record. A customer record may be associated with multiple email addresses. Values in this column depend on fields that are tagged with the email semantic. Also in: Customer_360, Merged_Customers, Unified_Customer, Unified_Preprocessed_Raw |
|
fk_[name] |
String |
The fk-[namespace] semantic tag identifies a field as a foreign key. A foreign key semantic tag must be namespaced. For example: fk-customer, fk-interaction, fk-audience, or fk-brand. A column is added for each foreign key that is defined in the Sources tab. Tip What happens to foreign keys in the Unified_Coalesced table?
Also in: Unified_Customer, Unified_Preprocessed_Raw |
full_name |
String |
A combination of given name (first name) and surname (last name) that is associated with an individual customer record and is stored as a combined value in a single field within customer data. A full name may include a middle name or initial. Values in this column depend on fields that are tagged with the full-name semantic. Also in: Merged_Customers, Unified_Customer |
gender |
String |
The gender that is associated with an individual customer record. Values in this column depend on fields that are tagged with the gender semantic. Also in: Customer_360, Merged_Customers, Unified_Customer, Unified_Preprocessed_Raw |
generational_suffix |
String |
The suffix that identifies to which family generation a customer record belongs. For example: Jr., Sr. II, and III. Also in: Merged_Customers, Unified_Customer |
given_name |
String |
The first name that is associated with an individual customer record. Values in this column depend on fields that are tagged with the given-name semantic. Also in: Customer_360, Merged_Customers, Unified_Customer, Unified_Preprocessed_Raw |
has_blv |
Boolean |
The |
is_supersized |
Boolean |
Indicates when a rough heuristic is applied to the first grouping of records (rep_pk) to partition supersized records into smaller components. Supersized records occur when more than 500 groups are associated with the first grouping of records. |
loyalty_id |
String |
The identifier for a loyalty program that is associated with an individual customer record. Also in: Merged_Customers |
phone |
String |
The phone number that is associated with an individual customer record. A customer record may be associated with multiple phone numbers. Values in this column depend on fields that are tagged with the phone semantic. Also in: Customer_360, Merged_Customers, Unified_Customer, Unified_Preprocessed_Raw |
pk |
String |
The combination of data source and primary key allows Amperity to uniquely identify every row in every data table across the entirety of customer data input to Amperity. Tip The combination of pk and datasource uniquely identifies a row in the Unified_Coalesced table, which correlates to a single row in a domain table. |
postal |
String |
The zip code or postal code that is associated with the location of an individual customer record. Values in this column depend on fields that are tagged with the postal semantic. Also in: Customer_360, Merged_Customers, Unified_Customer, Unified_Preprocessed_Raw |
rep_ds |
Integer |
The rep_ds column shows the datasource that is associated with the rep_pk column. |
rep_pk |
Integer |
The rep_pk column is an identifier that represents the first grouping of records done by Stitch. This grouping is based on identical semantic patterns. Tip The combination of rep_ds and rep_pk represent qualified trivial duplications, which are records with enough identical PII to score 3.0 (or greater) and were grouped together by Stitch early in the identity resolution process. All qualified trivial duplicates are treated as a single record by downstream Stitch processes. The rep_ds and rep_pk fields are included in the Unified_Coalesced table to help with situations where it’s necessary to understand why two records were not clustered together. |
state |
String |
The state or province that is associated with the location of an individual customer record. Values in this column depend on fields that are tagged with the state semantic. Also in: Customer_360, Merged_Customers, Unified_Customer, Unified_Preprocessed_Raw |
supersized_id |
Integer |
An identifier that represents supersized records that were partitioned into smaller components. Also in: Unified_Preprocessed_Raw |
surname |
String |
The last name that is associated with an individual customer record. Values in this column depend on fields that are tagged with the surname semantic. Also in: Customer_360, Merged_Customers, Unified_Customer, Unified_Preprocessed_Raw |
title |
String |
The title that precedes a full name that is associated with an individual customer record. For example: Mr., Mrs, and Dr. Also in: Merged_Customers, Unified_Customer |
update_dt |
String |
Apply the update-dt semantic tag to columns in customer records that identify when the data was last updated. The field to which this semantic is applied must be a datetime field type. At least one customer record must have this semantic tag applied to ensure that the update_dt column is created in the Unified_Coalesced table and to ensure that the Merged_Customers table behaves correctly. Also in: Merged_Customers, Unified_Customer |