Unified Coalesced table

The Unified Coalesced table contains all the PII data that has been processed through Stitch. The data is organized by the semantic tag as the column heading and then coalesced into one single table. A unique Amperity ID may appear in more than one row.

The Unified Coalesced table is added to databases in the Customer 360 page when:

  1. The Stitch QA database is added using the “Stitch QA” database template.

  2. A customer 360 database is added using the “Customer 360” database template.

  3. A table is added to a custom database using the Unified Coalesced table as a passthrough table.

Use with Stitch QA

Use the Unified Coalesced table as part of the Stitch QA process to:

  • Ensure semantic tags are selected and visible to the Queries page

  • Investigate relationships between records, especially when records appear to be assigned (not assigned) to an incorrect (correct) Amperity ID, which defines a single cluster of records

  • Investigate blocklisted values

Tip

The Unified Coalesced and Unified Preprocessed Raw tables are frequently used together during Stitch QA because many investigation patterns will compare the values in one table to the values in the other.

Profile semantics and foreign keys

Use the Customer 360 page to verify that rows with profile semantic tags and foreign keys are selected to be available to the Queries page. Has the list of profile and/or foreign key semantics changed since the last time you have performed Stitch QA? This will occur when a data source is added that requires a new foreign key or in a situation where one of the less frequently used profile semantic tags is applied to a new data source.

Open the table in the Database Explorer and verify that all rows in the table that are associated with a semantic tag or a foreign key have a checkmark in the left column. Rows without a checkmark will not make the associated FIELD available to the Queries page. If rows do not have a checkmark, edit the table and apply the checkmark, save the table, activate the Stitch QA database, and then run the database to refresh the table.

Relationships between records

Use the Queries page to investigate unexpected relationships between individual records that share the same Amperity ID and to investigate why records that look like they should belong to the same cluster of records were split.

Use a combination of the Unified Coalesced and Unified Preprocessed Raw tables to review the data that is associated with these records to help understand why groups of records were matched or why certain records were assigned to a cluster.

In some cases, you may need to investigate and compare entire clusters to better understand why (or why not) these groups of records were maintained (or split) during hierarchical comparison, a step in the Stitch process that occurs after pairwise comparison and scoring.

Global blocklist values

The Unified Coalesced table provides additional data about individual records, such as blocklisted values, component IDs, or if a record belongs to a supersized cluster.

In some cases, groups of records may have values removed by the global bad-values blocklist instead of the tenant-specific bad-values blocklist.

Compare records in the Unified Coalesced table to records in the Unified Preprocessed Raw table to identify which values were removed.

Column reference

The Unified Coalesced table contains the following columns:

Column name

Data type

Description

Amperity ID

String

The unique identifier that is assigned to clusters of customer records that all represent the same individual. The Amperity ID does not replace primary and foreign keys, but exists alongside them within unified profiles.

Note

The Amperity ID is a universally unique identifier (UUID) that is represented by 36 characters spread across five groups separated by hyphens: 8-4-4-4-12.

For example:

123e4567-e89b-12d3-a456-426614174000

Address

String

The address that is associated with the location of an individual customer record. For example: 123 Main Street.

Values in this column depend on fields that are tagged with the Address semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

address2

String

Additional address information, such as an apartment number or a post office box, that is associated with the location of an individual customer record. For example: Apt #9.

Values in this column depend on fields that are tagged with the Address2 semantic.

Also in: Merged Customers, Unified Customer

Birthdate

Date

The date of birth that is associated with an individual customer record.

Values in this column depend on fields that are tagged with the birthdate semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

BLV Address

Boolean

When true, the address on this customer record matches a blocklist value.

BLV Email

Boolean

When true, the email on this customer record matches a blocklist value.

BLV Given Name

Boolean

When true, the given-name on this customer record matches a blocklist value.

BLV Phone

Boolean

When true, the phone on this customer record matches a blocklist value.

BLV Surname

Boolean

When true, the surname on this customer record matches a blocklist value.

City

String

The city that is associated with the location of an individual customer record.

Values in this column depend on fields that are tagged with the City semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

CK

String

The ck semantic tag may be applied to a column that contains pre-existing, tenant-specific customer IDs. When customer keys are applied, Amperity compares them to the Amperity ID as part of the deduplication process.

Tip

What happens to customer keys in the Unified Coalesced table?

  • Records may have NULL customer keys.

  • There may be only one customer key per data source.

  • There may be multiple customer keys per Amperity ID. This is because customer keys may also be tagged as foreign keys.

Also in: Unified Customer

Component ID

Integer

An identifier that represents a set of records that are transitively connected with a score above threshold as an outcome of blocking and initial scoring. Records that share a component ID, but have different Amperity IDs, were split during hierarchical comparison.

Tip

Records with different component_id values may show as having been blocked together. This can occur when a connecting record pair was removed because it scored below the pairwise comparison threshold.

Also in: Unified Preprocessed Raw

Country

String

The country that is associated with the location of an individual customer record.

Values in this column depend on fields that are tagged with the Country semantic.

Important

The country field is added to the Unified Coalesced table when fields are tagged with the country profile semantic.

Also in: Merged Customers, Unified Customer

Create DT

String

Apply the create-dt semantic tag to columns in customer records that identify when the data was created. The field to which this semantic is applied must be a datetime field type.

Also in: Merged Customers, Unified Customer

Datasource

String

The name of the data source from which this customer record originated.

Tip

The combination of PK and Datasource uniquely identifies a row in the Unified Coalesced table, which correlates to a single row in a domain table.

Email

String

The email address that is associated with an individual customer record. A customer record may be associated with multiple email addresses.

Values in this column depend on fields that are tagged with the Email semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

FK [Name]

String

The fk-[namespace] semantic tag identifies a field as a foreign key. A foreign key semantic tag must be namespaced. For example: fk-customer, fk-interaction, fk-audience, or fk-brand.

A column is added for each foreign key that is defined in the Sources page.

Tip

What happens to foreign keys in the Unified Coalesced table?

  • Records may have NULL foreign keys.

  • There may be multiple foreign keys in the data source, but there may not be duplicate foreign keys.

  • There may be multiple foreign keys per Amperity ID.

  • There should not be multiple Amperity IDs per foreign key.

Also in: Unified Customer, Unified Preprocessed Raw

Full Name

String

A combination of given name (first name) and surname (last name) that is associated with an individual customer record and is stored as a combined value in a single field within customer data. A full name may include a middle name or initial.

Values in this column depend on fields that are tagged with the Full Name semantic.

Also in: Merged Customers, Unified Customer

Gender

String

The gender that is associated with an individual customer record.

Values in this column depend on fields that are tagged with the Gender semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

Generational Suffix

String

The suffix that identifies to which family generation a customer record belongs. For example: Jr., Sr. II, and III.

Also in: Merged Customers, Unified Customer

Given Name

String

The first name that is associated with an individual customer record.

Values in this column depend on fields that are tagged with the Given Name semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

Has BLV

Boolean

The has_blv column indicates if blocklist values for address, email, phone, given-name, or surname are present in customer records.

Is Supersized

Boolean

Indicates when a rough heuristic is applied to the first grouping of records (rep_pk) to partition supersized records into smaller components. Supersized records occur when more than 500 groups are associated with the first grouping of records.

Loyalty ID

String

The identifier for a loyalty program that is associated with an individual customer record.

Also in: Merged Customers

Phone

String

The phone number that is associated with an individual customer record. A customer record may be associated with multiple phone numbers.

Values in this column depend on fields that are tagged with the Phone semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

PK

String

The combination of data source and primary key allows Amperity to uniquely identify every row in every data table across the entirety of customer data input to Amperity.

Tip

The combination of PK and Datasource uniquely identifies a row in the Unified Coalesced table, which correlates to a single row in a domain table.

Postal

String

The zip code or postal code that is associated with the location of an individual customer record.

Values in this column depend on fields that are tagged with the postal semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

Rep DS

Integer

The rep_ds column shows the datasource that is associated with the rep_pk column.

Rep PK

Integer

The rep_pk column is an identifier that represents the first grouping of records done by Stitch. This grouping is based on identical semantic patterns.

Tip

The combination of Rep DS and Rep PK represent qualified trivial duplications, which are records with enough identical PII to score 3.0 (or greater) and were grouped together by Stitch early in the identity resolution process.

All qualified trivial duplicates are treated as a single record by downstream Stitch processes.

The Rep DS and Rep PK fields are included in the Unified Coalesced table to help with situations where it’s necessary to understand why two records were not clustered together.

State

String

The state or province that is associated with the location of an individual customer record.

Values in this column depend on fields that are tagged with the state semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

Supersized ID

Integer

An identifier that represents supersized records that were partitioned into smaller components.

Also in: Unified Preprocessed Raw

Surname

String

The last name that is associated with an individual customer record.

Values in this column depend on fields that are tagged with the Surname semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

Title

String

The title that precedes a full name that is associated with an individual customer record. For example: Mr., Mrs, and Dr.

Also in: Merged Customers, Unified Customer

Update DT

String

Apply the update-dt semantic tag to columns in customer records that identify when the data was last updated. The field to which this semantic is applied must be a datetime field type. At least one customer record must have this semantic tag applied to ensure that the update_dt column is created in the Unified Coalesced table and to ensure that the Merged Customers table behaves correctly.

Also in: Merged Customers, Unified Customer