Amperity Bridge

Amperity Bridge allows users to share data between Amperity and a data lakehouse using industry-standard data formats. Each bridge can be quickly configured to enable inbound and/or outbound connections that give your brand access to shared tables without replicating data.

Advantages of Amperity Bridge include:

  • Fast setup: Connect Amperity to a lakehouse in minutes using sharing keys instead of integrations.

  • Zero copy: Control access to shared tables without replicating data across platforms. Build pipelines faster and consolidate your brand’s storage costs into a single location.

  • Scalable processing: Enrich massive volumes of data quickly. Data is not moved or transformed from where it resides. Model customer data directly in the lakehouse or model it in Amperity.

  • Live data: View customer data at rest in a lakehouse or in Amperity through a shared catalog. Explore and query data without waiting for refreshes or updates.

Inbound shares

Delta Sharing is an open protocol for simple and secure sharing of live data between organizations without copying data to another system and regardless of which computing platforms are used.

A bridge represents a connection between Amperity and an external lakehouse. Each bridge may be configured for one inbound and one outbound connection.

An inbound share represents the configuration for how a shared dataset is made available from a lakehouse to Amperity, such as using the Delta Sharing open protocol to share data with Databricks.

An inbound share is configured in a series of steps across Databricks and Amperity.

  1. Inbound prerequisites

  2. Configure Databricks

  3. Add bridge

Inbound prerequisites

Before you can create inbound sharing between Databricks and Amperity, a recipient and a share must be created in Databricks, after which tables are added to the share and access to the share is granted to the recipient. The user who performs these actions may use the Databricks CLI or the Databricks Catalog Explorer and must have the CREATE RECIPIENT, CREATE SHARE, USE CATALOG, USE SCHEMA, and SELECT permissions, along with the ability to grant the recipient access to the share.

Requirement 1.

The user who will create a recipient for sharing data from Databricks to Amperity must have CREATE RECIPIENT permissions in Databricks.

Note

If a Databricks notebook is used to create the recipient, the cluster must use Databricks Runtime 11.3 LTS (or higher) and must run in shared or single-user access mode.

Requirement 2.

The user who will create a share in the Unity Catalog metastore must have CREATE SHARE permissions in Databricks.

Requirement 3.

The user who will add tables to a share must:

  • Be a share owner; Databricks recommends using a group as the share owner.

  • Have USE CATALOG and USE SCHEMA permissions on the catalog and schema in which the tables are located.

  • Have SELECT permissions to each table.

Requirement 4.

The user who grants the recipient access to the share must be one of the following:

  • A metastore administrator.

  • A user with delegated permissions or ownership on both the share and recipient objects.

    If the user created the recipient and share, they are the share owner and recipient owner.

    If the user did not create the recipient and share, they will need USE SHARE and SET SHARE PERMISSION on the share and USE RECIPIENT on the recipient.
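The ownership and delegation rules in Requirement 4 can be read as a single boolean check. The following Python sketch is illustrative only: the function name and its arguments are hypothetical, and it does not call Databricks.

```python
def can_grant_share_access(is_metastore_admin, owns_share, owns_recipient, privileges):
    """Return True when the rules in Requirement 4 are met (a sketch, not a Databricks API)."""
    if is_metastore_admin:
        return True
    privileges = set(privileges)
    # Share side: ownership, or both delegated share privileges.
    share_ok = owns_share or {"USE SHARE", "SET SHARE PERMISSION"} <= privileges
    # Recipient side: ownership, or the delegated recipient privilege.
    recipient_ok = owns_recipient or "USE RECIPIENT" in privileges
    return share_ok and recipient_ok


# A user who created both objects owns both, so no delegated privileges are needed.
print(can_grant_share_access(False, True, True, []))
# A user who owns neither object and lacks SET SHARE PERMISSION does not qualify.
print(can_grant_share_access(False, False, False, ["USE SHARE", "USE RECIPIENT"]))
```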

Requirement 5.

The IP address for Amperity may need to be added to an allowlist.

Most connections are made directly to your Amperity tenant. Use one of the following Amperity IP addresses for an allowlist that is required by an upstream system. The specific IP address to use depends on the location in which your tenant is hosted:

  • On Amazon AWS use “52.42.237.53”

  • On Amazon AWS (Canada) use “3.98.199.97”

  • On Microsoft Azure use “104.46.106.84”

  • On Microsoft Azure (EU) use “20.123.127.54”
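If allowlist entries are managed in a script, the addresses above can be kept in a small lookup table. This is a minimal Python sketch: the location keys and the function name are hypothetical labels chosen for illustration, and the /32 suffix assumes that the upstream system expects CIDR notation for a single address.

```python
import ipaddress

# Addresses copied from the list above, keyed by hypothetical location labels.
AMPERITY_ALLOWLIST_IPS = {
    "aws": "52.42.237.53",
    "aws-canada": "3.98.199.97",
    "azure": "104.46.106.84",
    "azure-eu": "20.123.127.54",
}


def allowlist_entry(location):
    """Return the Amperity IP address for a hosting location as a /32 CIDR string."""
    # ip_address() raises ValueError if a table entry is ever mistyped.
    ip = ipaddress.ip_address(AMPERITY_ALLOWLIST_IPS[location])
    return f"{ip}/32"


print(allowlist_entry("azure-eu"))
```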

Configure Databricks

To configure Databricks to share data with Amperity you will need to create a share and add tables to that share, create a recipient, grant the recipient access to the share, and then get an activation link. The activation link allows a user to download a credential file that is required to configure inbound sharing in Amperity.

Note

The following section briefly describes using the Databricks Catalog Explorer to configure Databricks to be ready to share data with Amperity, along with links to Databricks documentation for each step. You may use the Databricks CLI if you prefer. Instructions for using the Databricks CLI are available from the linked pages.

To configure Databricks for inbound sharing to Amperity

Step 1.

A share is a securable object in Unity Catalog that can be configured to share tables with Amperity.

Open the Databricks Catalog Explorer. Under Delta Sharing, choose Shared by me, then select Share data, and then create a share.

After you have created the share you may add tables to the share. Click Add assets, and then select the tables to share.

Step 2.

A recipient in Databricks represents the entity that will consume shared data: Amperity. Configure the recipient for open sharing and to use token-based authentication.

Open the Databricks Catalog Explorer. Under Delta Sharing, choose Shared by me, and then click New recipient to create a recipient.

After the recipient is created, grant the recipient access to the share.
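The Catalog Explorer actions in Steps 1 and 2 also have Databricks SQL equivalents. The following Python sketch only assembles the statements as strings so they can be reviewed before being run in a notebook or SQL editor. The share name, recipient name, and table names are hypothetical, and creating the recipient without a sharing identifier is assumed to produce an open sharing (token-based) recipient.

```python
def inbound_share_sql(share, recipient, tables):
    """Build Databricks SQL statements for a share, its tables, and a recipient."""
    statements = [f"CREATE SHARE IF NOT EXISTS {share};"]
    # Each table is identified as catalog.schema.table.
    statements += [f"ALTER SHARE {share} ADD TABLE {table};" for table in tables]
    statements.append(f"CREATE RECIPIENT IF NOT EXISTS {recipient};")
    statements.append(f"GRANT SELECT ON SHARE {share} TO RECIPIENT {recipient};")
    return statements


for statement in inbound_share_sql(
    "amperity_share", "amperity", ["main.sales.orders", "main.sales.customers"]
):
    print(statement)
```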

Step 3.

Open sharing uses token-based authentication.

The credentials file that contains the token is available from an activation link. Use a secure channel to share the activation link with the user who will download the credentials file, and then configure Amperity for inbound sharing.

Important

You can download the credential file only once. Recipients should treat the downloaded credential as a secret and must not share it outside of their organization. If you have concerns that a credential may have been handled insecurely, you can rotate credentials at any time.
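Because the credential file can be downloaded only once, it can help to confirm that the file is complete before uploading it. The sketch below assumes the file follows the Delta Sharing profile format (a JSON document with shareCredentialsVersion, endpoint, and bearerToken fields, plus an optional expirationTime). The function name and sample values are hypothetical, and the check never prints the token.

```python
import json
from datetime import datetime, timezone

# Field names assumed from the Delta Sharing profile file format.
REQUIRED_FIELDS = ("shareCredentialsVersion", "endpoint", "bearerToken")


def check_profile(text, now=None):
    """Return a list of problems found in a profile document (empty if none)."""
    now = now or datetime.now(timezone.utc)
    profile = json.loads(text)
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in profile]
    expires = profile.get("expirationTime")
    if expires:
        # Normalize a trailing "Z" so datetime.fromisoformat accepts the value.
        expiry = datetime.fromisoformat(expires.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)  # treat naive values as UTC
        if expiry <= now:
            problems.append("token has expired")
    return problems


sample = json.dumps({
    "shareCredentialsVersion": 1,
    "endpoint": "https://sharing.example.com/delta-sharing/",
    "bearerToken": "<token>",
    "expirationTime": "2999-01-01T00:00:00.000Z",
})
print(check_profile(sample))
```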

Add inbound bridge

A bridge represents a connection between Amperity and an external lakehouse. Each bridge may be configured for one inbound and one outbound connection.

To add an inbound bridge

Step 1.

Open the Sources page. Under Inbound shares click Add bridge. This opens the Create bridge dialog box.

Add a bridge for an inbound share.

Add the name for the bridge and a description, or select an existing bridge, and then click Confirm.

Step 2.

Connect the bridge to Databricks by uploading the credential file that was downloaded from the activation link. There are two ways to upload the credential file:

  1. Uploading the credentials as the second step when adding a bridge. Drop the file into the dialog box or browse to a location on your local machine.

  2. Choosing the Upload credential option from the Actions menu for an inbound share.

After the credential file is uploaded, click Continue.

Important

You can download the credential file only once. Recipients should treat the downloaded credential as a secret and must not share it outside of their organization. If you have concerns that a credential may have been handled insecurely, you can rotate credentials at any time.

When finished, click Continue. This will open the Select tables to share dialog box.

Step 3.

Use the Select tables to share dialog box to select any combination of schemas and tables to be synced to Amperity.

Select schemas and tables to be shared.

If you select a schema, all tables in that schema will be synced. Any new tables added later will need to be manually added to the sync.

When finished, click Next. This will open the Domain table mapping dialog box.

Step 4.

Map the tables that are shared from Databricks to domain tables in Amperity.

Map inbound shared tables to domain tables.

Tables that are shared with Amperity are added as domain tables.

  • The names of shared tables must be unique among all domain tables.

  • Primary keys are not assigned.

  • Semantic tags are not applied.
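The uniqueness rule above can be checked before saving the mapping. This is a minimal Python sketch with hypothetical table names. Treating names as case-insensitive is an assumption made here for safety, not documented Amperity behavior.

```python
def find_name_conflicts(shared_tables, existing_domain_tables):
    """Return shared table names that collide with existing domain tables or each other."""
    existing = {name.lower() for name in existing_domain_tables}
    seen = set()
    conflicts = []
    for name in shared_tables:
        key = name.lower()  # assumption: compare names without regard to case
        if key in existing or key in seen:
            conflicts.append(name)
        seen.add(key)
    return conflicts


print(find_name_conflicts(
    ["Orders", "Customers", "orders"],
    ["Customers", "Loyalty_Members"],
))
```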

Tip

Use a custom domain table to assign primary keys, apply semantic tags, and shape data within shared tables to support any of your Amperity workflows.

When finished, click Save and sync. This will start a workflow that synchronizes data from Databricks to Amperity and will create the mapped domain table names.

You can manually sync tables that are shared with Amperity using the Sync option from the Actions menu for the inbound bridge.

Outbound shares

Delta Sharing is an open protocol for simple and secure sharing of live data between organizations without copying data to another system and regardless of which computing platforms are used.

A bridge represents a connection between Amperity and an external lakehouse. Each bridge may be configured for one inbound and one outbound connection.

An outbound share represents the configuration for how a shared dataset is made available to a lakehouse from Amperity, such as using the Delta Sharing open protocol to share data with Databricks.

An outbound share is configured in a series of steps across Databricks and Amperity.

Tip

If you have already installed and configured the Databricks CLI and have permission to configure catalogs and providers in Databricks, the configuration process for outbound shares takes about 5 minutes.

  1. Outbound prerequisites

  2. Add bridge

  3. Select tables to share

  4. Download credential file

  5. Add provider

  6. Add catalog from share

  7. Verify table sharing

Outbound prerequisites

Before you can create outbound sharing between Amperity and Databricks, the Databricks CLI must be installed and configured on your workstation and you must have permission to create providers and catalogs in Databricks.

Requirement 1.

The Databricks CLI must be installed and configured on your workstation.

For new users …

If you have not already set up and configured the Databricks CLI you will need to do the following:

  1. Install the Databricks CLI.

  2. Get a personal access token.

  3. Configure the Databricks CLI for your local machine.

    Run the databricks configure command, after which you will be asked to enter the hostname for your instance of Databricks along with your personal access token.

Requirement 2.

The user who will run the Databricks CLI and add a schema to Databricks for outbound sharing from Amperity must have CREATE PROVIDER permissions in Databricks.

Requirement 3.

The user who will add the schema to a catalog in Databricks must have CREATE CATALOG permissions in Databricks.

Requirement 4.

A user who will run queries against tables in a schema must have SELECT permissions in Databricks. SELECT permissions may be granted on a specific table, on a schema, or on a catalog.
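The three scopes at which SELECT may be granted correspond to three forms of the Databricks SQL GRANT statement. The Python sketch below builds the statement text only. The function name, principal, and object names are hypothetical, and the statement would still need to be run in Databricks by a user who is allowed to grant the permission.

```python
def grant_select_sql(principal, catalog, schema=None, table=None):
    """Build a GRANT SELECT statement at table, schema, or catalog scope."""
    if table and not schema:
        raise ValueError("a table requires a schema")
    if table:
        target = f"TABLE {catalog}.{schema}.{table}"
    elif schema:
        target = f"SCHEMA {catalog}.{schema}"
    else:
        target = f"CATALOG {catalog}"
    # Backticks quote principals that contain characters such as "@".
    return f"GRANT SELECT ON {target} TO `{principal}`;"


print(grant_select_sql("analyst@socktown.com", "amperity_share", "c360", "customer_360"))
print(grant_select_sql("analyst@socktown.com", "amperity_share", "c360"))
print(grant_select_sql("analyst@socktown.com", "amperity_share"))
```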

Add outbound bridge

A bridge represents a connection between Amperity and an external lakehouse. Each bridge may be configured for one inbound and one outbound connection.

To add an outbound bridge

Step 1.

Open the Destinations page. Under Outbound shares click Add bridge. This opens the Create bridge dialog box.

Step 2.

Add the name for the bridge and a description, and then set the duration for which the token will remain active.

Add a bridge for an outbound share.

Optional. You may restrict access to specific IP addresses or to a valid CIDR (for a range of IP addresses). Place each entry on its own line. Expand Advanced Settings to restrict access.
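Entries can be checked locally before they are pasted into the dialog box. The following Python sketch uses the standard ipaddress module. The function name is hypothetical, and rejecting CIDR values that have host bits set is a strictness choice made for this example, not a documented Amperity rule.

```python
import ipaddress


def parse_allowed_entries(text):
    """Split one-entry-per-line text into valid networks and rejected lines."""
    valid, rejected = [], []
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue  # ignore blank lines
        try:
            # strict=True rejects CIDR values with host bits set, such as 10.0.0.5/24.
            valid.append(str(ipaddress.ip_network(entry, strict=True)))
        except ValueError:
            rejected.append(entry)
    return valid, rejected


valid, rejected = parse_allowed_entries(
    "203.0.113.7\n198.51.100.0/24\n10.0.0.5/24\nnot-an-ip"
)
print(valid)
print(rejected)
```

A single address is normalized to a /32 network, so both forms can be handled the same way.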

When finished, click Create. This will open the Select tables to share dialog box, in which you will configure any combination of schemas and tables to share with Databricks.

Select tables to share

A shared dataset represents all databases and/or database tables that are configured for outbound sharing with another organization.

You can configure Amperity to share any combination of schemas and tables that are available from the Customer 360 page.

To select schemas and tables to share

Step 1.

After you have configured the settings for the bridge, click Next to open the Select tables to share dialog box.

Select schemas and tables to be shared.

You may select any combination of schemas and tables.

If you select a schema, all tables in that schema will be shared, including all changes made to all tables in that schema.

When finished, click Save. This will open the Download credential dialog box, from which you will download the credentials.share file that is required by the Databricks CLI when creating a catalog in Databricks.

Step 2.

When a bridge is already configured, you may edit the list of schemas and tables that are shared. From the Destinations page, under Outbound shares, open the Actions menu for a bridge, and then click Edit. This will open the Select tables to share dialog box.

Download credential file

There are two ways to download the credential file:

Step 1.

Click the Download credential button as part of the steps shown when you configure a bridge by clicking the Add bridge button located under Outbound shares on the Destinations page.

Step 2.

Choose the Download credential option from the Actions menu for an outbound share.

Add provider

Databricks supports a variety of methods for adding a provider to a catalog. Use the method that works best for your organization:

Databricks UI

You can create a provider directly from the Databricks user interface. Upload the Amperity share credentials directly as part of this process.

Step 1.

Open the Databricks user interface. Open Catalog Explorer, then Delta Sharing, and then Shared with me.

Step 2.

At the bottom of the Shared with me page, click the Import provider directly button. This opens the Import Provider dialog.

Add a provider using the Databricks user interface.

Give the provider a name, and then upload the credential for the Amperity share.

Click Import. This opens the providers page.

Step 3.

On the providers page, click Create catalog to add a catalog for the data that is shared from Amperity.

Databricks CLI

You can use the Databricks CLI to create a provider in Databricks. Attach the credentials that were downloaded from Amperity to the schema as part of the command that creates the bridge between Amperity and the provider in Databricks.

Step 1.

Open the Databricks CLI in a command window.

Step 2.

Run the databricks providers create command:

$ databricks providers create socktown \
  TOKEN \
  --recipient-profile-str "$(< path/to/config.share)"

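where socktown is the name of the provider, TOKEN is the authentication type (entered literally; it matches the "authentication_type" value in the response), and “path/to/config.share” represents the path to the location into which the Amperity credentials file was downloaded. Your Databricks personal access token is not part of this command; it is supplied when you configure the Databricks CLI.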
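When the command is launched from a script, passing the arguments as a list avoids shell-specific quoting of the credential file contents. This Python sketch assumes the same command and flag shown above. The function name is hypothetical, and the subprocess call is left as a comment because it requires a configured Databricks CLI.

```python
from pathlib import Path


def providers_create_args(provider, credential_path):
    """Build the argument list for `databricks providers create`."""
    # The flag takes the contents of the credential file, not its path.
    profile = Path(credential_path).read_text(encoding="utf-8").strip()
    return [
        "databricks", "providers", "create", provider, "TOKEN",
        "--recipient-profile-str", profile,
    ]


# To run the command (requires a configured Databricks CLI):
#   import subprocess
#   subprocess.run(providers_create_args("socktown", "path/to/config.share"), check=True)
```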

Databricks CLI and Windows environments

If you are running the Databricks CLI using PowerShell, the command is similar to the following. PowerShell uses a backtick, not a backslash, to continue a command on the next line:

databricks providers create socktown `
  TOKEN `
  --recipient-profile-str `
    (Get-Content -Raw path\to\config.share)

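If you are running the Databricks CLI using CMD, save commands similar to the following in a batch file, and then run that file. Delayed expansion and the %%a loop variable syntax apply to batch files, not to commands typed at the prompt:

@echo off
setlocal enabledelayedexpansion
rem Read the credential file into a single string.
set "str="
for /f "delims=" %%a in (path\to\config.share) do set "str=!str!%%a"
databricks providers create socktown TOKEN --recipient-profile-str "!str!"
endlocal

Step 3.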

A successful response from Databricks is similar to:

{
  "authentication_type":"TOKEN",
  "created_at":1714696789105,
  "created_by":"user@socktown.com",
  "name":"socktown",
  "owner":"user@socktown.com",
  "recipient_profile": {
    "endpoint":"URL for Amperity bridge endpoint",
    "share_credentials_version":1
  },
  "updated_at":1714696789105,
  "updated_by":"user@socktown.com"
}
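A script can confirm the result by reading a few fields from the response. The Python sketch below parses a shortened copy of the response shown above. The function name is hypothetical, and the timestamps are assumed to be milliseconds since the Unix epoch.

```python
import json
from datetime import datetime, timezone

# Shortened copy of the example response.
response_text = """
{
  "authentication_type": "TOKEN",
  "created_at": 1714696789105,
  "created_by": "user@socktown.com",
  "name": "socktown"
}
"""


def summarize_provider(text):
    """Return the provider name, authentication type, and creation time (UTC)."""
    body = json.loads(text)
    # Assumption: created_at is expressed in milliseconds since the Unix epoch.
    created = datetime.fromtimestamp(body["created_at"] / 1000, tz=timezone.utc)
    return body["name"], body["authentication_type"], created


name, auth_type, created = summarize_provider(response_text)
print(name, auth_type, created.date().isoformat())
```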

You must have CREATE PROVIDER permissions

An error message is returned when a user who runs the databricks providers create command does not have CREATE PROVIDER permissions to the Databricks metastore.

This error is similar to:

Error: User does not have CREATE PROVIDER \
on Metastore '<metastore>'.

If you receive this error message:

  1. Ask your Databricks administrator to assign the CREATE PROVIDER permission to your Databricks user account.

  2. Rerun the databricks providers create command.

Python

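You can use Python to create a provider by calling the Databricks REST API, for example from a notebook in the Databricks UI. This requires the same information to be provided to Databricks as the CLI, including the contents (not the path) of the credential file, and is similar to:

import requests

# Placeholders: replace with your own values.
ACCESS_TOKEN = 'DATABRICKS_PERSONAL_ACCESS_TOKEN'
workspace = 'WORKSPACE_NAME'

headers = {
  'Authorization': f'Bearer {ACCESS_TOKEN}'
}
endpoint = "api/2.1/unity-catalog/providers"
url = f"https://{workspace}.cloud.databricks.com/{endpoint}"

# Send the contents of the credential file, not its path.
with open("path/to/config.share") as f:
  recipient_profile = f.read()

data = {
  "name": "BRIDGE_NAME",
  "authentication_type": "TOKEN",
  "comment": "Amperity Bridge",
  "recipient_profile_str": recipient_profile
}

response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # stop on an HTTP error such as a missing permission
print(response.json())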

Add catalog from share

A catalog is the first layer in a Unity Catalog namespace and is used to organize data assets within Databricks.

To add a schema to a catalog in Databricks

Step 1.

Log in to Databricks, and then open the Catalog Explorer.

Step 2.

In the Catalog Explorer, expand Delta Sharing, and then select Shared with me.

This will display the list of schemas to which you have access.

Step 3.

From the list of schemas, select the schema you just created.

Click the Create catalog button, and then in the Create a new catalog dialog add the catalog name. A catalog name should clearly identify that data tables are shared from Amperity. For example: “amperity_socktown_outbound_share”. A catalog name cannot include a period, space, or forward slash. When finished, click Create.
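The naming rule can be checked before the dialog is submitted. The Python sketch below flags the three forbidden characters and, as an alternative to the dialog, builds a Databricks SQL CREATE CATALOG ... USING SHARE statement. The function names and the catalog, provider, and share names are hypothetical, and the statement is only assembled as text.

```python
FORBIDDEN_CHARACTERS = (".", " ", "/")


def invalid_catalog_characters(name):
    """Return the forbidden characters found in a proposed catalog name."""
    return [char for char in FORBIDDEN_CHARACTERS if char in name]


def create_catalog_sql(catalog, provider, share):
    """Build a CREATE CATALOG ... USING SHARE statement for a valid catalog name."""
    bad = invalid_catalog_characters(catalog)
    if bad:
        raise ValueError(f"catalog name contains forbidden characters: {bad}")
    return f"CREATE CATALOG IF NOT EXISTS {catalog} USING SHARE {provider}.{share};"


print(invalid_catalog_characters("socktown outbound/share"))
print(create_catalog_sql("amperity_socktown", "socktown", "c360"))
```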

You must have CREATE CATALOG permissions

An error message is returned when a user who attempts to add a schema to a catalog does not have CREATE CATALOG permissions to the Databricks metastore.

This error is similar to:

Requires permission CREATE CATALOG \
on Metastore '<metastore>'.

If you receive this error message:

  1. Ask your Databricks administrator to assign the CREATE CATALOG permission to your Databricks user account.

  2. Click the Create catalog button and retry adding the schema to the catalog.

Verify table sharing

Verify that the tables shared from Amperity are available from a catalog in Databricks.

To verify that tables were shared from Amperity to Databricks

Step 1.

From the Catalog Explorer in Databricks, expand Catalog, and then find the catalog that was created for sharing Amperity data.

Step 2.

Open the catalog, and then verify that the tables you shared from Amperity are available in the catalog.

Amperity data in a Databricks Unity Catalog.