Foreign key validation

Use this query to compare two versions of a data source or two distinct data sources and return coverage and distribution data for foreign keys across multiple data sources.

Note

This query should not be used to compare distinct foreign keys across data sources, such as fk-name-id and fk-customer-id.

When to use

  1. Looking for name IDs with multiple Amperity IDs.

  2. Looking for name IDs with different Amperity IDs.

  3. Inspecting name IDs between tables.

  4. Checking for foreign keys with multiple Amperity IDs between tables.

  5. Checking for foreign keys with different Amperity IDs between tables.

  6. Inspecting foreign keys between tables.

Configure query

  1. From the Queries page, open the Stitch QA folder, and then select this query.

    Tip

    Add the Stitch QA queries template folder if it does not already exist.

  2. Uncomment the specific SELECT statement to run, and then update that statement for the foreign key names and table names.

  3. Click Run Query and debug any issues that may arise.

  4. Click Activate.

Names, multiple IDs

Use any of the following SELECT statements to look for name IDs with multiple Amperity IDs. Update name_id to the name ID, and data_source and/or data_source_v2 to the name of the table and/or table version:

SELECT
  name_id
  ,COUNT(DISTINCT amperity_id)
FROM
  data_source
GROUP BY 1
HAVING COUNT(DISTINCT amperity_id) > 1

SELECT
  name_id
  ,COUNT(DISTINCT amperity_id)
FROM
  data_source_v2
GROUP BY 1
HAVING COUNT(DISTINCT amperity_id) > 1

Names, different IDs

Use any of the following SELECT statements to look for name IDs with different Amperity IDs. Update name_id to the name ID, and data_source and/or data_source_v2 to the name of the table and/or table version:

SELECT
  v1.name_id
  ,v1.amperity_id
  ,v2.amperity_id
FROM (
  SELECT DISTINCT
    name_id
    ,amperity_id
  FROM
    data_source_v2)
  AS v1
FULL OUTER JOIN (
  SELECT DISTINCT
    name_id
    ,amperity_id
  FROM
    data_source)
  AS v2
ON v1.name_id = v2.name_id
WHERE v1.amperity_id <> v2.amperity_id
LIMIT 100

Inspect Names

Use any of the following SELECT statements to inspect name IDs. Update name_id to the name ID, and data_source and/or data_source_v2 to the name of the table and/or table version:

SELECT COUNT(DISTINCT name_id)
FROM data_source
SELECT COUNT(DISTINCT name_id)
FROM data_source_v2
SELECT COUNT(DISTINCT name_id)
FROM data_source
WHERE name_id IN (
  SELECT name_id
  FROM data_source_v2
)
SELECT COUNT(DISTINCT name_id)
FROM data_source_v2
WHERE name_id NOT IN (
  SELECT name_id
  FROM data_source
)

Multiple Amperity IDs

Use any of the following SELECT statements to look for foreign keys that have multiple Amperity IDs. Update fk_field_name to the name of a foreign key, and data_source, data_source_1 and/or data_source_2 to the name of the table:

SELECT
 fk_field_name
 ,COUNT(DISTINCT amperity_id)
FROM
  data_source_1
GROUP BY 1
HAVING COUNT(DISTINCT amperity_id) > 1

SELECT
  fk_field_name
  ,COUNT(DISTINCT amperity_id)
FROM
  data_source_2
GROUP BY 1
HAVING COUNT(DISTINCT amperity_id) > 1

Different Amperity IDs

Use any of the following SELECT statements to look for foreign keys that have different Amperity IDs. Update fk_field_name to the name of a foreign key, and data_source, data_source_1 and/or data_source_2 to the name of the table:

SELECT
  v1.fk_field_name
  ,v1.amperity_id
  ,v2.amperity_id
FROM (
  SELECT DISTINCT
    fk_field_name
    ,amperity_id
  FROM
    data_source_2
  )
AS v1 FULL OUTER JOIN (
  SELECT DISTINCT
    fk_field_name
    ,amperity_id
  FROM
    data_source_1
  )
AS v2 ON v1.fk_field_name = v2.fk_field_name
WHERE v1.amperity_id <> v2.amperity_id
LIMIT 100

Inspect keys between tables

Use any of the following SELECT statements to inspect foreign keys between tables. Update fk_field_name to the name of a foreign key, and data_source, data_source_1 and/or data_source_2 to the name of the table:

SELECT
  COUNT(DISTINCT fk_field_name)
FROM
  data_source
SELECT
  COUNT(DISTINCT fk_field_name)
FROM
  data_source_1
WHERE fk_field_name IN (
  SELECT fk_field_name
  FROM data_source_2
)
SELECT
  COUNT(DISTINCT fk_field_name)
FROM data_source_1
WHERE fk_field_name NOT IN (
  SELECT fk_field_name
  FROM data_source_2
)
SELECT
  COUNT(DISTINCT fk_field_name)
FROM data_source_2
WHERE fk_field_name NOT IN (
  SELECT fk_field_name
  FROM data_source_1
)