Unified Coalesced table

The Unified Coalesced table has all PII data processed through Stitch. Each semantic tag is a column header. All data is coalesced into a single table. A unique Amperity ID may appear in more than one row.

The Unified Coalesced table is added to databases in the Customer 360 page when:

  1. The Stitch QA database is added using the “Stitch QA” database template.

  2. A customer 360 database is added using the “Customer 360” database template.

  3. A table is added to a custom database using the Unified Coalesced table as a passthrough table.

Use with Stitch QA

Use the Unified Coalesced table as part of the Stitch QA process to:

  • Ensure semantic tags are selected and visible to the Queries page

  • Investigate relationships between records, especially when records appear to be assigned (not assigned) to an incorrect (correct) Amperity ID, which defines a single cluster of records

  • Investigate blocklisted values

Tip

The Unified Coalesced and Unified Preprocessed Raw tables are frequently used together during Stitch QA because many investigation patterns will compare the values in one table to the values in the other.

Profile semantics and foreign keys

Use the Customer 360 page to verify that rows with profile semantic tags and foreign keys are selected to be available to the Queries page. Has the list of profile and/or foreign key semantics changed since the last time you have performed Stitch QA? This will occur when a data source is added that requires a new foreign key or in a situation where one of the less frequently used profile semantic tags is applied to a new data source.

Open the table in the Database Explorer and verify that all rows in the table that are associated with a semantic tag or a foreign key have a checkmark in the left column. Rows without a checkmark will not make the associated FIELD available to the Queries page. If rows do not have a checkmark, edit the table and apply the checkmark, save the table, activate the Stitch QA database, and then run the database to refresh the table.

Relationships between records

Use the Queries page to investigate unexpected relationships between individual records that share the same Amperity ID and to investigate why records that look like they should belong to the same cluster of records were split.

Use a combination of the Unified Coalesced and Unified Preprocessed Raw tables to review the data that is associated with these records to help understand why groups of records were matched or why certain records were assigned to a cluster.

In some cases, you may need to investigate and compare entire clusters to better understand why (or why not) these groups of records were maintained (or split) during hierarchical comparison, a step in the Stitch process that occurs after pairwise comparison and scoring.

Global blocklist values

The Unified Coalesced table provides additional data about individual records, such as blocklisted values, component IDs, or if a record belongs to a supersized cluster.

In some cases, groups of records may have values removed by the global bad-values blocklist instead of the tenant-specific bad-values blocklist.

Compare records in the Unified Coalesced table to records in the Unified Preprocessed Raw table to identify which values were removed.

Column reference

The Unified Coalesced table contains the following columns:

Column name

Data type

Description

Amperity ID

String

The unique identifier assigned to clusters of customer profiles that all represent the same individual. The Amperity ID does not replace primary, foreign, or other unique customer keys, but exists alongside them within unified profiles.

Note

The Amperity ID is a universally unique identifier (UUID) that is 36 characters spread across five groups separated by hyphens: 8-4-4-4-12.

For example:

123e4567-e89b-12d3-a456-426614174000

Address

String

The address connected with the location of a customer, such as “123 Main Street”.

Values in this column depend on fields that are tagged with the Address semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

Address2

String

Address information, such as an apartment number or a post office box, connected with the location of a customer, such as “Apt #9”.

Values in this column depend on fields that are tagged with the Address2 semantic.

Also in: Merged Customers, Unified Customer

Birthdate

Date

The date of birth connected with a customer.

Values in this column depend on fields that are tagged with the birthdate semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

BLV Address

Boolean

When true, the address on this customer profile matches a blocklist value.

BLV Email

Boolean

When true, the email on this customer profile matches a blocklist value.

BLV Given Name

Boolean

When true, the given-name on this customer profile matches a blocklist value.

BLV Phone

Boolean

When true, the phone on this customer profile matches a blocklist value.

BLV Surname

Boolean

When true, the surname on this customer profile matches a blocklist value.

City

String

The city connected with the location of a customer.

Values in this column depend on fields that are tagged with the City semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

CK

String

The ck semantic tag identifies pre-existing, tenant-specific customer IDs. Amperity compares customer keys to the Amperity ID as part of the deduplication process.

Tip

What happens to customer keys in the Unified Coalesced table?

  • Records may have NULL customer keys.

  • There may be only one customer key per data source.

Also in: Unified Customer

Component ID

Integer

An identifier that represents a set of records that are transitively connected with a score above threshold as an outcome of blocking and initial scoring. Records that share a component ID, but have different Amperity IDs, are split during hierarchical comparison.

Tip

Records with different component_id values may show as having blocked together. This can occur after removing a connecting record pair that scored below the pairwise comparison threshold.

Also in: Unified Preprocessed Raw

Country

String

The country connected with the location of a customer.

Values in this column depend on fields that are tagged with the Country semantic.

Important

The country field is added to the Unified Coalesced table when fields are tagged with the country profile semantic.

Also in: Merged Customers, Unified Customer

Create DT

String

Apply the create-dt semantic tag to columns that identify the creation date or time. The field must be a datetime field type.

Also in: Merged Customers, Unified Customer

Datasource

String

The name of the data source from which this customer profile originated.

Tip

The combination of PK and Datasource uniquely identifies a row in the Unified Coalesced table, which correlates to a single row in a domain table.

Email

String

The email address connected with a customer. A customer may have more than one email address.

Values in this column depend on fields that are tagged with the Email semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

FK [Name]

String

The fk-[namespace] semantic tag identifies a field as a foreign key. A foreign key semantic tag must use a namespace. For example: fk-customer, fk-interaction, fk-audience, or fk-brand.

A column is added for each foreign key that is defined in the Sources page.

Tip

What happens to foreign keys in the Unified Coalesced table?

  • Records may have NULL foreign keys.

  • There may be multiple foreign keys in the data source, but there may not be duplicate foreign keys.

  • There may be multiple foreign keys per Amperity ID.

  • There should not be multiple Amperity IDs per foreign key.

Also in: Unified Customer, Unified Preprocessed Raw

Full Name

String

A combination of given name and surname–or first name and last name–for a customer. Amperity selects the first non-nil value:

  1. A value tagged with full-name.

  2. A concatenation of values tagged with given-name and surname if they both exist.

  3. A value tagged with given-name.

  4. A value tagged with surname.

Also in: Merged Customers, Unified Customer

Gender

String

The gender connected with a customer.

Values in this column depend on fields that are tagged with the Gender semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

Generational Suffix

String

The suffix that identifies to which family generation a customer profile belongs. For example: Jr., Sr. II, and III.

Also in: Merged Customers, Unified Customer

Given Name

String

The first name connected with a customer.

Values in this column depend on fields that are tagged with the Given Name semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

Has BLV

Boolean

The has_blv column indicates if blocklist values for address, email, phone, given-name, or surname are present in customer profiles.

Is Supersized

Boolean

Indicates when a rough heuristic applies to the first grouping of records–the rep_pk field–to partition supersized records into smaller parts. Supersized records occur when more than 500 groups associate with the first grouping of records.

Loyalty ID

String

The identifier for a loyalty program connected with a customer.

This column is added when the loyalty-id semantic tag is applied to customer profiles.

Also in: Merged Customers

Phone

String

The phone number connected with a customer. A customer may have more than one phone number.

Values in this column depend on fields that are tagged with the Phone semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

PK

String

The combination of data source and primary key allows Amperity to uniquely identify every row in every data table across the entirety of customer data input to Amperity.

Tip

The combination of PK and Datasource uniquely identifies a row in the Unified Coalesced table, which correlates to a single row in a domain table.

Postal

String

The zip code or postal code connected with the location of a customer.

Values in this column depend on fields that are tagged with the postal semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

Rep DS

Integer

The rep_ds column shows the data source connected with the rep_pk column.

Rep PK

Integer

The rep_pk column is an identifier that represents the first grouping of records done by Stitch. This grouping relies on semantic patterns.

Tip

The combination of Rep DS and Rep PK represent qualified trivial duplications, which are records with enough identical PII to score 3.0 (or greater) and were grouped together by Stitch early in the identity resolution process.

All qualified trivial duplicates are treated as a single record by downstream Stitch processes.

The Rep DS and Rep PK fields are included in the Unified Coalesced table to help with situations where it’s necessary to understand why two records were not clustered together.

State

String

The state or province connected with the location of a customer.

Values in this column depend on fields that are tagged with the state semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

Supersized ID

Integer

An identifier represents supersized records partitioned into smaller parts.

Also in: Unified Preprocessed Raw

Surname

String

The last name connected with a customer.

Values in this column depend on fields that are tagged with the Surname semantic.

Also in: Customer 360, Merged Customers, Unified Customer, Unified Preprocessed Raw

Title

String

The title that precedes a full name connected with a customer, such as “Mr”, “Mrs”, and “Dr”.

Also in: Merged Customers, Unified Customer

Update DT

String

Apply the update-dt semantic tag to datetime fields in customer profiles that identify the most recent update in the source system. At least one customer profile must have this semantic tag applied to ensure that the update_dt column exists in the Unified Coalesced table.

Also in: Merged Customers, Unified Customer