Connect Databricks to Snowflake

Some organizations choose to store their data in Snowflake, but use Databricks to give the data scientists, engineers, developers, and data analysts within their organization access to that data. Using a combination of Databricks SQL, R, Scala, and/or Python, those teams build models and tools that support external BI applications and domain-specific tools, so that end users can consume the data through the interface they are most comfortable with.

You may send data (tables and/or entire databases) from Amperity to Snowflake, and then connect to that data from Databricks.

What is Snowflake?

Snowflake is an analytic data warehouse that is fast, easy to use, and flexible. Snowflake uses a SQL database engine that is designed for the cloud. In this workflow, Snowflake receives tables and databases from Amperity, and then makes that data available to Databricks.

Add workflow

Amperity can be configured to share data (tables and/or entire databases) directly with Snowflake. Databricks can be configured to connect to a Snowflake data warehouse, and then use that data as a data source.

Important

You must configure Amperity to send data to an instance of Snowflake that your organization manages directly.


To connect Databricks to Snowflake

Configuring Amperity to send data that is accessible to Databricks from a Snowflake data warehouse requires completing a series of short workflows, some of which must be done outside of Amperity.

Step 1.

Configure Snowflake objects for the correct database, tables, roles, and users. (Refer to the Amazon S3 or Azure tutorial, as appropriate for your tenant. A sketch of this configuration follows the note below.)

Note

Snowflake can be configured to run in Amazon AWS or Azure. When using the Amperity Data Warehouse, you will use the same cloud platform as your Amperity tenant. When using your own instance of Snowflake, you should use the same Amazon S3 bucket or Azure Blob Storage container that is included with your tenant when configuring Snowflake for data sharing, but then connect Databricks directly to your own instance of Snowflake.
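
The tutorial for your tenant is the authoritative reference for this step, but a minimal sketch of the kind of objects it creates, using the snowflake-connector-python package, might look like the following. Every account identifier, credential, object name, and grant below is a placeholder, not required configuration:

    # A minimal sketch of creating the Snowflake database, role, and user for
    # this workflow, using snowflake-connector-python. All names, credentials,
    # and grants are placeholders; follow the tutorial for your tenant.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="your_account",     # placeholder Snowflake account identifier
        user="ADMIN_USER",          # placeholder administrative user
        password="admin-password",  # placeholder; use a secrets manager in practice
        role="SYSADMIN",
    )

    statements = [
        # Database that will hold the tables Amperity sends.
        "CREATE DATABASE IF NOT EXISTS AMPERITY",
        # Role and user that Databricks will use to read that data.
        "CREATE ROLE IF NOT EXISTS AMPERITY_READER",
        "CREATE USER IF NOT EXISTS DATABRICKS_USER "
        "PASSWORD = 'databricks-password' DEFAULT_ROLE = AMPERITY_READER",
        "GRANT USAGE ON DATABASE AMPERITY TO ROLE AMPERITY_READER",
        "GRANT USAGE ON SCHEMA AMPERITY.PUBLIC TO ROLE AMPERITY_READER",
        "GRANT SELECT ON ALL TABLES IN SCHEMA AMPERITY.PUBLIC TO ROLE AMPERITY_READER",
        "GRANT ROLE AMPERITY_READER TO USER DATABRICKS_USER",
    ]

    cur = conn.cursor()
    try:
        for stmt in statements:
            cur.execute(stmt)
    finally:
        cur.close()
        conn.close()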

Step 2.

Send data to Snowflake from Amperity. (Refer to the Amazon S3 or Azure tutorial, as appropriate for your tenant.)
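
After Amperity reports a successful send, you can spot-check that the tables landed in Snowflake. A minimal sketch, reusing the placeholder credentials and database name from the Step 1 sketch:

    # Confirm that the tables Amperity sent are visible in Snowflake.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="your_account",     # placeholder values from the Step 1 sketch
        user="ADMIN_USER",
        password="admin-password",
    )

    cur = conn.cursor()
    try:
        cur.execute("SHOW TABLES IN DATABASE AMPERITY")
        for row in cur.fetchall():
            print(row[1])  # table name is the second column of SHOW TABLES output
    finally:
        cur.close()
        conn.close()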

Step 3.

Connect Databricks to Snowflake, and then access the data sent from Amperity.

Note

The URL for the Snowflake data warehouse, the Snowflake username, the password, and the name of the Snowflake data warehouse are sent to the Databricks user within a SnapPass link. Request this information from your Amperity representative prior to attempting to connect Databricks to Snowflake.
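
Once you have those credentials, a minimal sketch of the connection from a Databricks notebook, using the Databricks Snowflake connector, might look like the following. The secret scope and keys, database, schema, warehouse, and table names are placeholders; substitute the values provided in the SnapPass link:

    # A minimal sketch of reading an Amperity-sent table from Databricks.
    # The spark and dbutils objects are provided automatically in Databricks
    # notebooks. The secret scope ("snowflake") and its keys, plus the
    # database, schema, warehouse, and table names, are placeholders.
    options = {
        "sfUrl": "your_account.snowflakecomputing.com",  # Snowflake data warehouse URL
        "sfUser": dbutils.secrets.get(scope="snowflake", key="username"),
        "sfPassword": dbutils.secrets.get(scope="snowflake", key="password"),
        "sfDatabase": "AMPERITY",      # database that Amperity sends data to
        "sfSchema": "PUBLIC",
        "sfWarehouse": "YOUR_WAREHOUSE",
    }

    # Load one of the tables Amperity sent as a Spark DataFrame.
    df = (
        spark.read.format("snowflake")
        .options(**options)
        .option("dbtable", "CUSTOMER_360")  # placeholder table name
        .load()
    )

    df.printSchema()

Storing the Snowflake username and password in a Databricks secret scope, rather than in the notebook itself, keeps the SnapPass credentials out of notebooks and source control.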

Step 4.

Validate the workflow within Amperity and the data within Databricks.
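
For the Databricks side of this validation, one simple check is to compare the row count visible from Databricks against the count Amperity reports for the send. A minimal sketch, assuming the options dictionary and placeholder table name from the Step 3 sketch:

    # Compare the Snowflake row count against the count Amperity reports.
    counts = (
        spark.read.format("snowflake")
        .options(**options)
        .option("query", "SELECT COUNT(*) AS ROW_COUNT FROM CUSTOMER_360")
        .load()
    )
    counts.show()

    # Spot-check a few records and the expected columns.
    df.show(5)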

Step 5.

Configure Amperity to automate this workflow for a regular (daily) refresh of data.