Connect Databricks to Azure Synapse Analytics

Some organizations store their data in Azure Synapse Analytics and use Databricks to make that data available to data scientists, engineers, developers, and data analysts. These teams use a combination of Databricks SQL, R, Scala, and Python to build models and tools that support external BI applications and domain-specific tools, so that end users can consume the data through the interface they are most comfortable with.

You may send an Apache Parquet, Apache Avro, CSV, or JSON file from Amperity to Azure Synapse Analytics, and then connect to that data from Databricks.

What is Azure Synapse Analytics?

Azure Synapse Analytics is a limitless analytics service and data warehouse. Azure Synapse Analytics has four components: SQL analytics, Apache Spark, hybrid data integration, and a unified user experience.

Add workflow

Amperity can be configured to send data to Azure Blob Storage, after which Azure Synapse Analytics is configured to load that data from Azure Blob Storage. Databricks can be configured to connect to Azure Synapse Analytics and use the Amperity output as a data source.

Important

If your Amperity tenant is running on Azure, you may use the Azure Blob Storage container that comes with your tenant for the intermediate step. Alternatively, you may configure Amperity to send data to an Azure container (Azure Blob Storage or Azure Data Lake Storage) that your organization manages directly.
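The load from Azure Blob Storage into Azure Synapse Analytics is often done with a COPY statement run against a dedicated SQL pool. The following is a minimal sketch of how that statement might be assembled, not a definitive implementation. The function name, table name, and storage URL are hypothetical, and the sketch assumes the SQL pool authenticates to the container with a managed identity. It only handles CSV and Parquet files. Check the Azure Synapse Analytics documentation for the file types the COPY statement accepts before sending Avro or JSON files through this path.

```python
def build_copy_statement(table: str, file_url: str, file_type: str) -> str:
    """Build a COPY INTO statement that loads files from an Azure container.

    Hypothetical helper: the table and URL are supplied by the caller and are
    only lightly escaped, so do not pass untrusted input to this function.
    """
    file_type = file_type.upper()
    if file_type not in {"CSV", "PARQUET"}:
        raise ValueError(f"This sketch does not handle file type: {file_type}")

    options = [
        f"FILE_TYPE = '{file_type}'",
        # Assumption: the SQL pool's managed identity can read the container.
        "CREDENTIAL = (IDENTITY = 'Managed Identity')",
    ]
    if file_type == "CSV":
        # Assumption: the CSV file sent from Amperity has a header row.
        options.append("FIRSTROW = 2")

    escaped_url = file_url.replace("'", "''")  # escape single quotes for SQL
    option_lines = ",\n    ".join(options)
    return f"COPY INTO {table}\nFROM '{escaped_url}'\nWITH (\n    {option_lines}\n)"


if __name__ == "__main__":
    # Hypothetical storage account, container, and table names.
    print(build_copy_statement(
        "dbo.amperity_customers",
        "https://myaccount.blob.core.windows.net/amperity/customers/*.parquet",
        "parquet",
    ))
```

Run the generated statement from a SQL client connected to the dedicated SQL pool, or from a pipeline that your organization already uses to load data into Azure Synapse Analytics.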


To connect Databricks to Azure Synapse Analytics

Configuring Amperity to send data that Databricks can access from Azure Synapse Analytics requires completing a series of short workflows, some of which must be done outside of Amperity.

Step 1.

Use a query to return the data you want to send to Databricks.

Step 2.

Send an Apache Parquet, Apache Avro, CSV, or JSON file to an Azure Blob Storage container.

Step 3.

Load the file from the Azure container to Azure Synapse Analytics.

Step 4.

Connect Databricks to Azure Synapse Analytics, and then access the data sent from Amperity.

Step 5.

Validate the workflow within Amperity and the data within Databricks.

Step 6.

Configure Amperity to automate this workflow for a regular (daily) refresh of data.
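The Databricks side of this workflow (steps 4 and 5) can be sketched in a notebook using the Azure Synapse connector that ships with Databricks, which is selected with the format name com.databricks.spark.sqldw. The following is a minimal sketch, not a definitive implementation. The helper names, JDBC URL, staging path, and table name are hypothetical. The sketch assumes that the cluster is already configured with credentials for the staging storage account and that those credentials can be forwarded to Azure Synapse Analytics. Your organization may require a different authentication method.

```python
from typing import Dict

# Format name of the Azure Synapse connector in Databricks.
SYNAPSE_FORMAT = "com.databricks.spark.sqldw"


def synapse_read_options(jdbc_url: str, temp_dir: str, table: str) -> Dict[str, str]:
    """Collect the connector options for reading one Synapse table."""
    return {
        "url": jdbc_url,        # JDBC connection string for the SQL pool
        "tempDir": temp_dir,    # staging location used by the connector
        # Assumption: the storage credentials are set in the Spark session.
        "forwardSparkAzureStorageCredentials": "true",
        "dbTable": table,       # table that holds the data sent from Amperity
    }


def read_synapse_table(spark, options: Dict[str, str]):
    """Load the table as a Spark DataFrame.

    Pass the `spark` session that a Databricks notebook provides.
    """
    return spark.read.format(SYNAPSE_FORMAT).options(**options).load()


def row_count_matches(df, expected_rows: int) -> bool:
    """Compare the loaded row count with the count returned by the Amperity query."""
    return df.count() == expected_rows


# Example use in a Databricks notebook (all values are hypothetical):
#
# options = synapse_read_options(
#     "jdbc:sqlserver://myworkspace.sql.azuresynapse.net:1433;database=mydb",
#     "abfss://staging@myaccount.dfs.core.windows.net/tmp",
#     "dbo.amperity_customers",
# )
# df = read_synapse_table(spark, options)
# row_count_matches(df, expected_rows=125000)
```

Comparing row counts is only a first check. You may also want to compare a few known records in Databricks against the query results in Amperity before you automate the workflow.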