Amperity Bridge

Amperity Bridge allows users to share data between Amperity and a data lakehouse using industry-standard data formats. Each bridge can be quickly configured to enable inbound and/or outbound connections that give your brand access to shared tables without replicating data.

Advantages of Amperity Bridge include:

  • Fast setup. Connect Amperity to a lakehouse in minutes using sharing keys instead of integrations.

  • Zero copy. Control access to shared tables without replicating data across platforms. Build pipelines faster and consolidate your brand’s storage costs into a single location.

  • Scalable processing. Enrich massive volumes of data quickly. Data stays where it resides; it is not moved or transformed. Model customer data directly in the lakehouse or model it in Amperity.

  • Live data. View customer data at rest in a lakehouse or in Amperity through a shared catalog. Explore and query data without waiting for refreshes or updates.

Outbound shares

Delta Sharing is an open protocol for simple and secure sharing of live data between organizations without copying data to another system and regardless of which computing platforms are used.

A bridge represents a connection between Amperity and an external lakehouse. Each bridge may be configured for one inbound and one outbound connection.

An outbound share represents the configuration for how a shared dataset is made available to a lakehouse from Amperity, such as using the Delta Sharing open protocol to share data with Databricks.

An outbound share is configured in a series of steps across Databricks and Amperity.

Tip

If you have already installed and configured the Databricks CLI and have permission to configure catalogs and providers in Databricks, the configuration process for outbound shares takes about 5 minutes.

  1. Prerequisites

  2. Add bridge

  3. Select tables to share

  4. Download credential file

  5. Add provider

  6. Add catalog from share

  7. Verify table sharing

Prerequisites

Before you can create outbound sharing between Amperity and Databricks, the Databricks CLI must be installed and configured on your workstation, and you must have permission to create providers and catalogs in Databricks.

Requirement 1.

The Databricks CLI must be installed and configured on your workstation.

For new users

If you have not already set up and configured the Databricks CLI, you will need to do the following:

  1. Install the Databricks CLI.

  2. Get a personal access token.

  3. Configure the Databricks CLI for your local machine.

    Run the databricks configure command. You will be prompted to enter the hostname for your instance of Databricks along with your personal access token, as shown in the example below.
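The exchange looks similar to the following, assuming a current version of the Databricks CLI and a hypothetical workspace URL:

$ databricks configure
Databricks host: https://your-workspace.cloud.databricks.com
Personal access token: ********

The CLI saves this configuration as a profile in ~/.databrickscfg and reads it on subsequent commands.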

Requirement 2.

The user who will run the Databricks CLI and add a schema to Databricks for outbound sharing from Amperity must have CREATE PROVIDER permissions in Databricks.

Requirement 3.

The user who will add the schema to a catalog in Databricks must have CREATE CATALOG permissions in Databricks.

Requirement 4.

A user who will run queries against tables in a schema must have SELECT permissions in Databricks. SELECT permissions may be granted on a specific table, on a schema, or on a catalog.
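A SELECT grant can also be applied programmatically through the Unity Catalog permissions REST API. The following is a minimal sketch that mirrors the Python example later in this topic; the access token, workspace name, principal, and securable names are placeholder assumptions:

import requests

# Placeholder values; replace with your workspace details.
ACCESS_TOKEN = 'PERSONAL_ACCESS_TOKEN'
workspace = 'WORKSPACE_NAME'

headers = {
  'Authorization': f'Bearer {ACCESS_TOKEN}'
}

# Grant SELECT on a schema; securable_type may instead be
# "table" or "catalog" to grant at those levels.
securable_type = "schema"
full_name = "CATALOG_NAME.SCHEMA_NAME"
endpoint = f"api/2.1/unity-catalog/permissions/{securable_type}/{full_name}"
url = f"https://{workspace}.cloud.databricks.com/{endpoint}"

data = {
  "changes": [
    {
      "principal": "user@socktown.com",
      "add": ["SELECT"]
    }
  ]
}

response = requests.patch(url, headers=headers, json=data)
response.json()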

Add outbound bridge

A bridge represents a connection between Amperity and an external lakehouse. Each bridge may be configured for one inbound and one outbound connection.

To add an outbound bridge

Step 1.

Open the Destinations page. Under Outbound shares, click Add bridge. This opens the Create bridge dialog box.

Step 2.

Add the name for the bridge and a description, and then set the duration for which the token will remain active.

Add a bridge for an outbound share.

Optional. You may restrict access to specific IP addresses or to a valid CIDR block (for a range of IP addresses). To restrict access, expand Advanced Settings and place each entry on its own line, as shown in the example below.
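For example, the following entries (reserved documentation addresses, used here as placeholders) would allow access from one specific IP address and from one range:

192.0.2.10
198.51.100.0/24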

When finished, click Create. This will open the Select tables to share dialog box, in which you will configure any combination of databases and tables to share with Databricks.

Select tables to share

A shared dataset represents all databases and/or database tables that are configured for outbound sharing with another organization.

You can configure Amperity to share any combination of databases and tables that are available from the Customer 360 page.

To select databases and tables to share

Step 1.

After you have configured the settings for the bridge, click Next to open the Select tables to share dialog box.

Select databases and tables to be shared.

You may select any combination of databases and tables.

If you select a database, all tables in that database will be shared, including all changes made to all tables in that database.

When finished, click Save. This will open the Download credential dialog box, from which you will download the credentials.share file that is required by the Databricks CLI when creating a provider in Databricks.

Step 2.

When a bridge is already configured, you may edit the list of databases and tables that are shared. From the Destinations page, under Outbound shares, open the Actions for a bridge, and then click Edit. This will open the Select tables to share dialog box.

Download credential file

There are two ways to download the credential file:

Option 1.

Click the Download credential button that appears as part of the configuration steps when you add a bridge (click the Add bridge button under Outbound shares on the Destinations page).

Option 2.

Choose the Download credential option from the Actions menu for an outbound share.
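The downloaded credentials.share file is a Delta Sharing profile file. Its contents are JSON; per the Delta Sharing open protocol, the structure looks similar to the following (all values shown here are placeholders):

{
  "shareCredentialsVersion": 1,
  "endpoint": "URL for Amperity bridge endpoint",
  "bearerToken": "TOKEN",
  "expirationTime": "2025-01-01T00:00:00.000Z"
}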

Add provider

Use the Databricks CLI to create a provider in Databricks. Attach the credential file that was downloaded from Amperity as part of the command that creates the provider; this establishes the bridge between Amperity and Databricks.

To add a provider to Databricks

Step 1.

Open a command window from which you can run the Databricks CLI.

Step 2.

Run the databricks providers create command:

$ databricks providers create socktown \
  TOKEN \
  --recipient-profile-str "$(< path/to/config.share)"

where TOKEN is the authentication type for the provider (use the literal value TOKEN, which also appears in the response shown below) and path/to/config.share represents the path to the location into which the Amperity credential file was downloaded.

Step 3.

A successful response from Databricks is similar to:

{
  "authentication_type":"TOKEN",
  "created_at":1714696789105,
  "created_by":"user@socktown.com",
  "name":"socktown",
  "owner":"user@socktown.com",
  "recipient_profile": {
    "endpoint":"URL for Amperity bridge endpoint",
    "share_credentials_version":1
  },
  "updated_at":1714696789105,
  "updated_by":"user@socktown.com"
}
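To confirm that the provider was created, you can retrieve it again with the Databricks CLI:

$ databricks providers get socktown

This returns the same provider details shown in the response above.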

You must have CREATE PROVIDER permissions

An error message is returned when a user who runs the databricks providers create command does not have CREATE PROVIDER permissions on the Databricks metastore.

This error is similar to:

Error: User does not have CREATE PROVIDER \
on Metastore '<metastore>'.

If you receive this error message:

  1. Ask your Databricks administrator to assign the CREATE PROVIDER permission to your Databricks user account.

  2. Rerun the databricks providers create command.

Create a provider using Python

You can use Python to create a provider by calling the Databricks REST API instead of using the Databricks CLI. This requires the same information to be provided to Databricks as the CLI command and is similar to:

import requests

headers = {
  'Authorization': f'Bearer {ACCESS_TOKEN}'
}
workspace = 'WORKSPACE_NAME'
endpoint = "api/2.1/unity-catalog/providers"
url = f"https://{workspace}.cloud.databricks.com/{endpoint}"

# The recipient_profile_str field expects the contents of the
# credential file that was downloaded from Amperity, not its path.
with open("path/to/config.share") as f:
  share_credentials = f.read()

data = {
  "name": "BRIDGE_NAME",
  "authentication_type": "TOKEN",
  "comment": "Amperity Bridge",
  "recipient_profile_str": share_credentials
}

response = requests.post(url, headers=headers, json=data)
response.json()

Add catalog from share

A catalog is the first layer in a Unity Catalog namespace and is used to organize data assets within Databricks.

To add a schema to a catalog in Databricks

Step 1.

Log in to Databricks, and then open the Catalog Explorer.

Step 2.

In the Catalog Explorer, expand Delta Sharing, and then select Shared with me.

This will display the list of shares to which you have access.

Step 3.

From the list of shares, select the share associated with the provider you just created.

Click the Create catalog button, and then, in the Create a new catalog dialog, add the catalog name. A catalog name should clearly identify that its tables are shared from Amperity and cannot include a period, space, or forward slash. For example: amperity_socktown_outbound_share. When finished, click Create.
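If you prefer to script this step, a catalog can also be created from a share by calling the Unity Catalog REST API. The following is a minimal sketch in the style of the Python example above; the access token, workspace name, and share name are placeholder assumptions (the actual share name is listed under Shared with me):

import requests

# Placeholder values; replace with your workspace details.
ACCESS_TOKEN = 'PERSONAL_ACCESS_TOKEN'
workspace = 'WORKSPACE_NAME'
endpoint = "api/2.1/unity-catalog/catalogs"
url = f"https://{workspace}.cloud.databricks.com/{endpoint}"

headers = {
  'Authorization': f'Bearer {ACCESS_TOKEN}'
}

data = {
  "name": "amperity_socktown_outbound_share",
  "comment": "Data tables shared from Amperity",
  "provider_name": "socktown",   # the provider created earlier
  "share_name": "SHARE_NAME"     # the share listed under Shared with me
}

response = requests.post(url, headers=headers, json=data)
response.json()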

You must have CREATE CATALOG permissions

An error message is returned when a user who attempts to add a schema to a catalog does not have CREATE CATALOG permissions on the Databricks metastore.

This error is similar to:

Requires permission CREATE CATALOG \
on Metastore '<metastore>'.

If you receive this error message:

  1. Ask your Databricks administrator to assign the CREATE CATALOG permission to your Databricks user account.

  2. Click the Create catalog button and retry adding the schema to the catalog.

Verify table sharing

Verify that the tables shared from Amperity are available from a catalog in Databricks.

To verify that tables were shared from Amperity to Databricks

Step 1.

From the Catalog Explorer in Databricks, expand Catalog, and then find the catalog that was created for sharing Amperity data.

Step 2.

Open the catalog, and then verify that the tables you shared from Amperity are available in the catalog.

Amperity data in a Databricks Unity Catalog.
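This check can also be scripted by listing the tables in the shared catalog through the Unity Catalog REST API. The following is a minimal sketch in the style of the Python examples above; the access token, workspace name, catalog name, and schema name are placeholder assumptions:

import requests

# Placeholder values; replace with your workspace details.
ACCESS_TOKEN = 'PERSONAL_ACCESS_TOKEN'
workspace = 'WORKSPACE_NAME'
endpoint = "api/2.1/unity-catalog/tables"
url = f"https://{workspace}.cloud.databricks.com/{endpoint}"

headers = {
  'Authorization': f'Bearer {ACCESS_TOKEN}'
}

params = {
  "catalog_name": "amperity_socktown_outbound_share",
  "schema_name": "SCHEMA_NAME"
}

response = requests.get(url, headers=headers, params=params)

# Print the name of each table shared from Amperity.
for table in response.json().get("tables", []):
  print(table["name"])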