Unified Coalesced table¶
The Unified Coalesced table contains all the PII data that has been processed through Stitch. The data is organized by the semantic tag as the column heading and then coalesced into one single table. A unique Amperity ID may appear in more than one row.
The Unified Coalesced table is added to databases in the Customer 360 page when:
The Stitch QA database is added using the “Stitch QA” database template.
A customer 360 database is added using the “Customer 360” database template.
A table is added to a custom database using the Unified Coalesced table as a passthrough table.
Use with Stitch QA¶
Use the Unified Coalesced table as part of the Stitch QA process to:
Ensure semantic tags are selected and visible to the Queries page
Investigate relationships between records, especially when records appear to be assigned (not assigned) to an incorrect (correct) Amperity ID, which defines a single cluster of records
Investigate blocklisted values
Tip
The Unified Coalesced and Unified Preprocessed Raw tables are frequently used together during Stitch QA because many investigation patterns will compare the values in one table to the values in the other.
Profile semantics and foreign keys¶
Use the Customer 360 page to verify that rows with profile semantic tags and foreign keys are selected to be available to the Queries page. Has the list of profile and/or foreign key semantics changed since the last time you have performed Stitch QA? This will occur when a data source is added that requires a new foreign key or in a situation where one of the less frequently used profile semantic tags is applied to a new data source.
Open the table in the Database Explorer and verify that all rows in the table that are associated with a semantic tag or a foreign key have a checkmark in the left column. Rows without a checkmark will not make the associated FIELD available to the Queries page. If rows do not have a checkmark, edit the table and apply the checkmark, save the table, activate the Stitch QA database, and then run the database to refresh the table.
Relationships between records¶
Use the Queries page to investigate unexpected relationships between individual records that share the same Amperity ID and to investigate why records that look like they should belong to the same cluster of records were split.
Use a combination of the Unified Coalesced and Unified Preprocessed Raw tables to review the data that is associated with these records to help understand why groups of records were matched or why certain records were assigned to a cluster.
In some cases, you may need to investigate and compare entire clusters to better understand why (or why not) these groups of records were maintained (or split) during hierarchical comparison, a step in the Stitch process that occurs after pairwise comparison and scoring.
Global blocklist values¶
The Unified Coalesced table provides additional data about individual records, such as blocklisted values, component IDs, or if a record belongs to a supersized cluster.
In some cases, groups of records may have values removed by the global bad-values blocklist instead of the tenant-specific bad-values blocklist.
Compare records in the Unified Coalesced table to records in the Unified Preprocessed Raw table to identify which values were removed.
Column reference¶
The Unified Coalesced table contains the following columns:
Column name |
Data type |
Description |
---|---|---|
Amperity ID |
String |
The unique identifier that is assigned to clusters of customer records that all represent the same individual. The Amperity ID does not replace primary and foreign keys, but exists alongside them within unified profiles. Note The Amperity ID is a universally unique identifier (UUID) that is represented by 36 characters spread across five groups separated by hyphens: 8-4-4-4-12. For example: 123e4567-e89b-12d3-a456-426614174000
|
Address |
String |
The address that is associated with the location of a customer, such as “123 Main Street”. Values in this column depend on fields that are tagged with the Address semantic. Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw |
address2 |
String |
Additional address information, such as an apartment number or a post office box, that is associated with the location of a customer, such as “Apt #9”. Values in this column depend on fields that are tagged with the Address2 semantic. Also in: Merged Customers, Unified Customer |
Birthdate |
Date |
The date of birth that is associated with a customer. Values in this column depend on fields that are tagged with the birthdate semantic. Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw |
BLV Address |
Boolean |
When true, the address on this customer record matches a blocklist value. |
BLV Email |
Boolean |
When true, the email on this customer record matches a blocklist value. |
BLV Given Name |
Boolean |
When true, the given-name on this customer record matches a blocklist value. |
BLV Phone |
Boolean |
When true, the phone on this customer record matches a blocklist value. |
BLV Surname |
Boolean |
When true, the surname on this customer record matches a blocklist value. |
City |
String |
The city that is associated with the location of a customer. Values in this column depend on fields that are tagged with the City semantic. Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw |
CK |
String |
The ck semantic tag may be applied to a column that contains pre-existing, tenant-specific customer IDs. When customer keys are applied, Amperity compares them to the Amperity ID as part of the deduplication process. Tip What happens to customer keys in the Unified Coalesced table?
Also in: Unified Customer |
Component ID |
Integer |
An identifier that represents a set of records that are transitively connected with a score above threshold as an outcome of blocking and initial scoring. Records that share a component ID, but have different Amperity IDs, were split during hierarchical comparison. Tip Records with different Also in: Unified Preprocessed Raw |
Country |
String |
The country that is associated with the location of a customer. Values in this column depend on fields that are tagged with the Country semantic. Important The country field is added to the Unified Coalesced table when fields are tagged with the country profile semantic. Also in: Merged Customers, Unified Customer |
Create DT |
String |
Apply the create-dt semantic tag to columns in customer records that identify when the data was created. The field to which this semantic is applied must be a datetime field type. Also in: Merged Customers, Unified Customer |
Datasource |
String |
The name of the data source from which this customer record originated. Tip The combination of PK and Datasource uniquely identifies a row in the Unified Coalesced table, which correlates to a single row in a domain table. |
String |
The email address that is associated with a customer. A customer may have more than one email address. Values in this column depend on fields that are tagged with the Email semantic. Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw |
|
FK [Name] |
String |
The fk-[namespace] semantic tag identifies a field as a foreign key. A foreign key semantic tag must be namespaced. For example: fk-customer, fk-interaction, fk-audience, or fk-brand. A column is added for each foreign key that is defined in the Sources page. Tip What happens to foreign keys in the Unified Coalesced table?
Also in: Unified Customer, Unified Preprocessed Raw |
Full Name |
String |
A combination of given name (first name) and surname (last name) for a customer. May include a middle name or initial. Values in this column depend on fields that are tagged with the Full Name semantic. Also in: Merged Customers, Unified Customer |
Gender |
String |
The gender that is associated with a customer. Values in this column depend on fields that are tagged with the Gender semantic. Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw |
Generational Suffix |
String |
The suffix that identifies to which family generation a customer record belongs. For example: Jr., Sr. II, and III. Also in: Merged Customers, Unified Customer |
Given Name |
String |
The first name that is associated with a customer. Values in this column depend on fields that are tagged with the Given Name semantic. Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw |
Has BLV |
Boolean |
The |
Is Supersized |
Boolean |
Indicates when a rough heuristic is applied to the first grouping of records (rep_pk) to partition supersized records into smaller components. Supersized records occur when more than 500 groups are associated with the first grouping of records. |
Loyalty ID |
String |
The identifier for a loyalty program that is associated with a customer. This column is added when the loyalty-id semantic tag is applied to customer profiles. Also in: Merged Customers |
Phone |
String |
The phone number that is associated with a customer. A customer may bhave more than one phone number. Values in this column depend on fields that are tagged with the Phone semantic. Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw |
PK |
String |
The combination of data source and primary key allows Amperity to uniquely identify every row in every data table across the entirety of customer data input to Amperity. Tip The combination of PK and Datasource uniquely identifies a row in the Unified Coalesced table, which correlates to a single row in a domain table. |
Postal |
String |
The zip code or postal code that is associated with the location of a customer. Values in this column depend on fields that are tagged with the postal semantic. Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw |
Rep DS |
Integer |
The rep_ds column shows the datasource that is associated with the rep_pk column. |
Rep PK |
Integer |
The rep_pk column is an identifier that represents the first grouping of records done by Stitch. This grouping is based on identical semantic patterns. Tip The combination of Rep DS and Rep PK represent qualified trivial duplications, which are records with enough identical PII to score 3.0 (or greater) and were grouped together by Stitch early in the identity resolution process. All qualified trivial duplicates are treated as a single record by downstream Stitch processes. The Rep DS and Rep PK fields are included in the Unified Coalesced table to help with situations where it’s necessary to understand why two records were not clustered together. |
State |
String |
The state or province that is associated with the location of a customer. Values in this column depend on fields that are tagged with the state semantic. Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw |
Supersized ID |
Integer |
An identifier that represents supersized records that were partitioned into smaller components. Also in: Unified Preprocessed Raw |
Surname |
String |
The last name that is associated with a customer. Values in this column depend on fields that are tagged with the Surname semantic. Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw |
Title |
String |
The title that precedes a full name that is associated with a customer, such as “Mr.”, “Mrs”, and “Dr”. Also in: Merged Customers, Unified Customer |
Update DT |
String |
Apply the update-dt semantic tag to columns in customer records that identify when the data was last updated. The field to which this semantic is applied must be a datetime field type. At least one customer record must have this semantic tag applied to ensure that the update_dt column is created in the Unified Coalesced table and to ensure that the Merged Customers table behaves correctly. Also in: Merged Customers, Unified Customer |