Configure Amperity Bridge for Google BigQuery

Amperity Bridge for Google BigQuery is a first-class integration enables inbound data access to Amperity from Google BigQuery without copying data or scheduling ETL workloads.

What is BigQuery Sharing?

BigQuery Sharing publishes and shares datasets between organizations. Create an exchange, and then invite organizations to publish or subscribe to data in the exchange.

Tip

Amperity Bridge shares data directly with Google BigQuery. A bridge does not require a courier or a database export to be configured. A bridge can be configured to run automatically as part of any scheduled workflow.

Data types

Most Google BigQuery data types are supported by Amperity Bridge.

Important

If an upstream data type changes, edit the bridge in Amperity, accept the changes, and then save and sync the bridge.

Note

BYTES and RANGE data types are unsupported.

The following table describes how Google BigQuery data types map to Amperity data types.

Google BigQuery data type

Amperity data type

ARRAY

An ordered list of zero or more elements of non-array values.

Each element must be a Google BigQuery type that maps to a supported Amperity type.

Array

An ordered list of zero or more elements of non-array values by field name and value.

Fields within an Array must contain values for data types supported by Amperity.

BIGINT

A numeric value and an alias for INT64.

Note

BIGINT, BYTEINT, INT, INTEGER, SMALLINT, and TINYINT data types are aliases for INT64 and represent values less than 64 bits.

Integer

A numeric value, such as the quantity of items purchased. For example:

  • 1

  • 12345

BIGNUMERIC

A decimal value with precision of 76.

Important

Amperity supports precision for BIGNUMERIC data types when they are less than or equal to 38. Precision that is greater than 38 is not supported.

Decimal

A fixed point number, such as for prices or message sizes. The number of characters in the decimal value is configurable. For example:

  • 1.50 (prices)

  • 1874.380 (message sizes)

  • 141.48042 (order subtotals)

BOOLEAN , BOOL

A value that can be TRUE, FALSE, or NULL.

Boolean

A value that can be TRUE, FALSE, or NULL.

BYTES

Variable-length binary data.

Warning

The Google BigQuery BYTES data type is unsupported. Exclude fields with BYTES data types from tables before sharing them with Amperity.

BYTEINT

A numeric value and an alias for INT64.

Note

BIGINT, BYTEINT, INT, INTEGER, SMALLINT, and TINYINT data types are aliases for INT64 and represent values less than 64 bits.

Integer

A numeric value, such as the quantity of items purchased. For example:

  • 1

  • 12345

DATE

A Gregorian calendar date, independent of time zone.

Date

An ISO-8601 compliant date value, such as a birthdate. For example:

  • 2021-11-04

DATETIME

A Gregorian date and time, as they might be displayed on a watch, independent of time zone.

Datetime

ISO-8601 compliant date and time values, such as a purchase or transaction, the time at which data was last updated, or a campaign launch date. For example:

  • Mon Nov 30 2020 16:00:00 GMT-0800 (Pacific Standard Time)

  • Sat Sep 02 2017 14:36:19 GMT-0700 (Pacific Daylight Time)

DECIMAL (p,s)

A decimal value with a precision of 38 digits.

Decimal

A fixed point number, such as for prices or message sizes. The number of characters in the decimal value is configurable. For example:

  • 1.50 (prices)

  • 1874.380 (message sizes)

  • 141.48042 (order subtotals)

FLOAT64

An approximate double precision numeric value.

Float

A floating point number. For example:

  • 3.14

  • 3.14159

GEOGRAPHY

A collection of points, linestrings, and polygons that represent a set or subset of the surface of the Earth.

String

A string of characters.

INT

A numeric value and an alias for INT64.

Note

BIGINT, BYTEINT, INT, INTEGER, SMALLINT, and TINYINT data types are aliases for INT64 and represent values less than 64 bits.

Integer

A numeric value, such as the quantity of items purchased. For example:

  • 1

  • 12345

INT64

A numeric value up to 64 bits.

Note

BIGINT, BYTEINT, INT, INTEGER, SMALLINT, and TINYINT data types are aliases for INT64 and represent values less than 64 bits.

Integer

A numeric value, such as the quantity of items purchased. For example:

  • 1

  • 12345

INTEGER

A numeric value and an alias for INT64.

Note

BIGINT, BYTEINT, INT, INTEGER, SMALLINT, and TINYINT data types are aliases for INT64 and represent values less than 64 bits.

Integer

A numeric value, such as the quantity of items purchased. For example:

  • 1

  • 12345

JSON

Represents JSON, a lightweight data-interchange format.

String

A string of characters.

NUMERIC (p,s)

A decimal value with a precision of up to 38 digits.

Decimal

A fixed point number, such as for prices or message sizes. The number of characters in the decimal value is configurable. For example:

  • 1.50 (prices)

  • 1874.380 (message sizes)

  • 141.48042 (order subtotals)

RANGE

A contiguous range between two dates, datetimes, or timestamps.

Warning

The Google BigQuery RANGE data type is unsupported. Exclude fields with RANGE data types from tables before sharing them with Amperity.

SMALLINT

A numeric value and an alias for INT64.

Note

BIGINT, BYTEINT, INT, INTEGER, SMALLINT, and TINYINT data types are aliases for INT64 and represent values less than 64 bits.

Integer

A numeric value, such as the quantity of items purchased. For example:

  • 1

  • 12345

STRING

Variable-length character data.

String

A sequence of characters, such as first and last names, email addresses, physical addresses, UUIDs and other IDs, phone numbers, zip codes, product names, and descriptions. May be empty. For example:

  • John

  • Smith

  • John Smith

  • johnsmith @ domain.com

  • 123 Main Street

  • 206-555-1111

  • 00002ac0-0915-3cb4-b7c7-5ee192b3bd49

  • ACME

  • pants

  • “A data source that pulls from an Amazon S3 bucket.”

STRUCT

A container of ordered fields. Fields within a STRUCT must contain values for data types supported by Amperity.

Struct

A container of ordered fields by name and type.

TIME

A time of day, as might be displayed on a clock, independent of a specific date and time zone.

String

A sequence of characters that represents the time of day.

TIMESTAMP

An absolute point in time, independent of any time zone or convention, such as daylight saving time (DST).

Datetime

ISO-8601 compliant date and time values, such as a purchase or transaction, the time at which data was last updated, or a campaign launch date. For example:

  • Mon Nov 30 2020 16:00:00 GMT-0800 (Pacific Standard Time)

  • Sat Sep 02 2017 14:36:19 GMT-0700 (Pacific Daylight Time)

TINYINT

A numeric value and an alias for INT64.

Note

BIGINT, BYTEINT, INT, INTEGER, SMALLINT, and TINYINT data types are aliases for INT64 and represent values with fewer than 64 bits.

Integer

A numeric value, such as the quantity of items purchased. For example:

  • 1

  • 12345

From Google BigQuery

A sync from Google BigQuery to Amperity requires configuration steps to be made in both Amperity and Google BigQuery.

  1. Get details

  2. Configure Google BigQuery

  3. Add inbound bridge

Get details

Before you can create inbound sharing between Google BigQuery and Amperity you will need to collect the following information.

Requirement 1.

The individual who configures Amperity to share data with Google BigQuery must be assigned the Datagrid Administrator policy in Amperity or have the ability to configure a bridge for Google BigQuery in a sandbox.

Requirement 2.

A Google Cloud account with the BigQuery API enabled .

Requirement 3.

The user who creates the data exchange and listing in Google BigQuery must have permission to access to the Google Analytics Hub API. The following permissions are required:

  • serviceUsage.services.get

  • serviceUsage.services.list

  • serviceUsage.services.enable

Note

roles/serviceusage.serviceUsageAdmin is a predefined IAM role that includes the permissions required to access the Analytics Hub API.

Requirement 4.

The user who creates the data exchange and listing in Google BigQuery must have permission to create listings. The following permissions are required:

  • bigquery.datasets.get

  • bigquery.datasets.update

Note

roles/bigquery.dataOwner and roles/bigquery.admin are predefined IAM roles that include the permissions required to create listings in Google BigQuery.

Requirement 5.

The following values are required to connect Amperity to a Google BigQuery data listing:

  1. Share link (for the data exchange)

    This value is available after creating the data exchange and data listing in Google BigQuery. Copy this value. It is required when adding the inbound share.

    https://console.cloud.google.com/bigquery/ \
    analytics-hub/exchanges/projects/123456789012/ \
    locations/<location>/dataExchanges/<exchange>/ \
    listings/<listing>
    
  2. Subscriber

    Amperity generates an account identifer to use for activating sharing in Google BigQuery. Copy this value and use it to configure permissions to the data exchange. Add a principal using this value and assign it to the Analytics Hub Subscriber role.

    The account identifier generated by Amperity is similar to:

    amperity-bridge-<bridge-id>@ \
    amperity.iam.gserviceaccount.com
    

Configure Google BigQuery

To configure Google BigQuery to sync data with Amperity do the following:

  1. Create data exchange

  2. Create data listing

  3. Configure subscriber and share link

Create data exchange

A data exchange in Google BigQuery is private by default. Only users and groups that have access to the data exchange can view or subscribe to data listings within the data exchange.

Create a data exchange in Google BigQuery and then allow Amperity access to that data exchange.

To create a Google BigQuery data exchange

Step 1.

Open the BigQuery Console to the Sharing page.

On the Sharing page click Create exchange.

Step 2.

In the Exchange configuration pane, do the following:

  1. Select the project and region for the data exchange.

  2. Enter a display name.

  3. Add your Amperity account locator.

Note

If you are configuring a new bridge, leave the Amperity account locator field blank. This will be added after the account identifier is created.

Step 3.

Click Create exchange.

Important

Do not configure permissions for the data exchange. The value for the Add principal field must be the value that is generated by Amperity.

Click Skip.

Create data listing

A data listing in Google BigQuery is a reference to a shared dataset that exists in a data exchange. A data listing is private by default.

A data listing within a private data exchange may be shared with Amperity.

To create a Google BigQuery data listing

Step 1.

On the Sharing page click the name of the exchange.

On the Listings tab for the exchange click Create listing.

Step 2.

From the Resource type dropdown select “BigQuery Dataset”, and then select a dataset.

Note

Configuring data egress controls is optional.

Step 3.

Configure a series of details for the data listing.

Click Next.

  • Required. Add a display name.

  • Optional. Configure listing details, discoverability, and email logging.

Click Next

  • Optional. Configure the contact information for primary, provider, and publisher contacts.

Step 4.

Click Publish.

On the Listings tab, in the row for the listing that will be shared with Amperity, from the ellipses menu select Copy share link.

Amperity generates a URL similar to:

https://console.cloud.google.com/bigquery/analytics-hub/exchanges/projects/123456789012/locations/<location>/dataExchanges/<exchange>/listings/<listing>

Configure this identifier in Google BigQuery as the subscriber for the data listing.

Add inbound bridge

Configure an inbound bridge to sync data from Google BigQuery to Amperity.

To add an inbound bridge for Google BigQuery

Step 1.

Open the Sources page. Under Inbound shares click Add bridge.

Choose Google BigQuery.

Add a bridge for a sync.

This opens the Add bridge dialog box.

Add a bridge for a sync.

Add a name and description for the bridge or select an existing bridge, and then click Confirm.

Step 2.

Add the share link that was copied from the Google BigQuery data exchange.

Amperity generates the subscriber link after the share link has been configured. The share link is a URL similar to:

https://console.cloud.google.com/bigquery/analytics-hub/exchanges/projects/123456789012/locations/<location>/dataExchanges/<exchange>/listings/<listing>

Configure this identifier in Google BigQuery as the subscriber for the data listing.

Step 3.

Use the Select tables dialog box to select any combination of schemas and tables to be synced to Amperity.

Select schemas and tables to be shared.

If you select a schema, all tables in that schema will be synced. Any new tables added later will need to be manually added to the sync.

When finished, click Next. This will open the Domain table mapping dialog box.

Step 4.

Map the tables that are synced from Google BigQuery to domain tables in Amperity.

Map synced tables to domain tables.

Tables that are synced with Amperity are added as domain tables.

  • The names of synced tables must be unique among all domain tables.

  • Primary keys are not assigned.

  • Semantic tags are not applied.

Tip

Use a custom domain table to assign primary keys, apply semantic tags, and shape data within synced tables to support any of your Amperity workflows.

When finished, click Save and sync. This will start a workflow that synchronizes data from Google BigQuery to Amperity and will create the mapped domain table names.

You can manually sync tables that are synced with Amperity using the Sync option from the Actions menu for the inbound bridge.