About DataGrid

DataGrid pulls all your customer data from any source (online, offline, historical, or streaming) without requiring schema planning or ETL. DataGrid sends results of any size and type to any of your downstream applications and workflows.

Important

DataGrid is configured by users of Amperity with administrative privileges. These users may belong to your organization and manage Amperity directly, or they may be representatives from Amperity who manage DataGrid on your behalf. This documentation is written for the administrative users who belong to your organization. Some steps, such as configuring Google Ads and Facebook Ads as destinations or configuring the SFTP site that is included with Amperity, require additional work by Amperity representatives. Initiate these steps by filing a support ticket or by asking your Amperity representative to start the process on your behalf.

DataGrid is the multi-patented infrastructure that powers Amperity. DataGrid handles your customer data with speed, scale, and accuracy and provides optimal performance and interoperability when using any combination of AmpID, Amp360, and AmpIQ.

DataGrid is a cloud-native high-performance infrastructure that runs at scale on your choice of Amazon AWS or Microsoft Azure.

  • 100+ integration options to handle any type of data.

  • Pull unlimited amounts of structured and semi-structured raw data to Amperity from any source without the need for schema planning or pre-shaping data.

  • Use multiple databases, SQL database querying, and the Amperity data explorer to manage your workflows and data transformation options.

  • Send data shaped for any destination in any format, such as sending full databases to analytics environments, segments to campaign tools, or attributes to personalization engines.

  • Use the sandbox environment to safely make changes with zero downtime to the production environment, including data sources, data models, and workflows.

  • Rely on security features, such as SOC2 certification, SSO integration, PII obfuscation, user actions auditing, and more, to keep your data safe.

Common workflows

The most common workflows for DataGrid involve pulling data to Amperity from a customer data source or sending data from Amperity to a downstream workflow. There are five main areas:

  • Pulling data to Amperity

  • Configuring and running Stitch

  • Configuring the customer 360 database, along with any other custom databases and tables your data requires

  • Defining queries that interact with databases and tables in the Customer 360 page, some of which are used for QA purposes and others to generate results that are sent to downstream workflows outside of Amperity

  • Sending data from Amperity

After configuring DataGrid to pull data to and send data from Amperity, use these components to configure more complex and more valuable use cases, such as:

  • Consolidate data across brands

  • Consolidate historical data

  • Enterprise change management

  • Manage customer data directly

  • Reshape data for downstream workflows

Configure DataGrid

The following sections provide an overview of configuring DataGrid. The person who configures DataGrid depends on how you have chosen to run Amperity. This might be a member of your organization who has completed the DataGrid certification process, or it might be a representative from Amperity who works closely with you to configure Amperity exactly as you require, both during implementation and over time as Amperity runs on a daily basis.

Tip

Some information about your configuration must be shared with Amperity, such as a username and passcode required to authenticate to and access various cloud storage services or REST APIs. When this information must be shared with an Amperity representative, share it using SnapPass.

SnapPass allows secrets to be shared in a secure, ephemeral way. Input a single-line or multi-line secret, along with an expiration time, and then generate a one-time-use URL that may be shared with anyone. Amperity uses SnapPass for sharing credentials to systems with customers.

Open SnapPass to send information to your Amperity representative.

Sandboxes

A sandbox is a snapshot of the configuration state of your production tenant that is made available as a copy. Use a sandbox to safely make configuration changes, and then promote those changes back to your production tenant.

The Allow sandbox administration policy allows full access to all sandboxes in a tenant, including the ability to view details for any sandbox, access any sandbox, promote changes from any sandbox to production, and delete any sandbox.

Assign this policy to one (or more) users who are assigned the DataGrid Operator policy so those users can manage all sandboxes that exist for your production tenant.

Important

A DataGrid Operator can create a sandbox, and then open that sandbox to make configuration changes. While working in a sandbox, a DataGrid Operator is assigned the DataGrid Administrator policy, which allows a user to have full access to the configuration state of the sandbox.

A user must be assigned the Allow sandbox administration policy option to do any of the following:

  1. View details for all sandboxes

  2. Access any sandbox

  3. Promote changes from a sandbox to production

  4. Delete a sandbox from the Users and Activity page or by selecting the Promote and delete sandbox option while promoting changes from a sandbox.
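
The policy rules above can be summarized in a short sketch. The following Python is illustrative only and is not part of any Amperity API: the two policy names come from this topic, while the action names and the can_perform function are hypothetical. The sketch also assumes that holding the relevant policy is sufficient for an action, which this topic does not state explicitly.

```python
# Policy names as they appear in this topic.
OPERATOR = "DataGrid Operator"
SANDBOX_ADMIN = "Allow sandbox administration"

# Hypothetical action names, grouped by the policy this topic associates with them.
OPERATOR_ACTIONS = {"create_sandbox", "open_own_sandbox"}
ADMIN_ACTIONS = {
    "view_all_sandboxes",
    "access_any_sandbox",
    "promote_to_production",
    "delete_sandbox",
}


def can_perform(policies, action):
    """Return True if a user holding `policies` may perform `action`.

    Assumption: holding the relevant policy is treated as sufficient.
    """
    if action in OPERATOR_ACTIONS:
        return OPERATOR in policies
    if action in ADMIN_ACTIONS:
        return SANDBOX_ADMIN in policies
    return False  # unknown actions are denied


operator_only = {OPERATOR}
operator_and_admin = {OPERATOR, SANDBOX_ADMIN}

print(can_perform(operator_only, "create_sandbox"))
print(can_perform(operator_only, "promote_to_production"))
print(can_perform(operator_and_admin, "promote_to_production"))
```

In this model, a user with only the DataGrid Operator policy can create and open a sandbox but cannot promote or delete one until the Allow sandbox administration policy is also assigned.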

Amperity recommends the following patterns when working with sandboxes:

Pull data to Amperity

Pulling data to Amperity falls into four broad categories:

  • Pull data from cloud storage

  • Pull data from file transfer

  • Pull data from REST API

  • Pull data from warehouse

Another option for specific use cases is to use the Streaming Ingest API to pull data to Amperity.

The process of pulling data to Amperity is managed from the Sources page in the Amperity user interface.

The Sources page.

The Sources page contains the following components:

  • Saved queries reshape data after pulling it to Amperity and before making it available to a feed.

  • Feeds define the schema for each individual data source.

  • Domain tables represent each data source after it has been processed against a feed.

  • Domain transforms use existing domain tables and Spark SQL to add a custom domain table.

  • Couriers define how data is pulled to Amperity, along with specifying the location from which it is pulled.

  • Courier groups define schedules for pulling data to Amperity.
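
To show how these components relate to each other, the following Python sketch models the flow from a courier, through a saved query and a feed, to a domain table. This is a conceptual model only, not how Amperity is configured or implemented: the sample file, the function names, and the schema are all hypothetical.

```python
import csv
import io

# Hypothetical raw file at the location a courier is configured to pull from.
RAW_FILE = "id,Email\n1,A@Example.com\n2,b@example.com\n"


def courier():
    """Pull raw rows from the configured location (here, an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(RAW_FILE)))


def saved_query(rows):
    """Reshape pulled data before it is made available to a feed."""
    return [{"id": row["id"], "email": row["Email"].lower()} for row in rows]


# A feed defines the schema for one data source: field name -> type.
FEED_SCHEMA = {"id": int, "email": str}


def apply_feed(rows, schema):
    """Process rows against a feed; the result stands in for a domain table."""
    return [
        {field: cast(row[field]) for field, cast in schema.items()}
        for row in rows
    ]


domain_table = apply_feed(saved_query(courier()), FEED_SCHEMA)
print(domain_table)
```

A courier group would sit one level above this sketch and decide when courier() runs. A domain transform would take one or more domain tables as input and produce a custom domain table.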

Configure and run Stitch

Stitch must be configured to run in a way that ensures that all data sources that contain customer records (names, email addresses, physical addresses, and other PII) are made available to the Stitch process. The Stitch process generates an Amperity ID for each unique customer record across all of your data.

Stitch is largely a transparent process, but there are ways to tune how it understands your data, and you can explore the results of the Stitch process against your data directly from the Stitch page.

The Stitch page.

Build databases

The customer 360 database starts with the output of the Stitch process, which is a collection of database tables from which you can build your customer 360.

The tables that the Stitch process outputs include:

  • Transaction Attributes

  • Unified Transactions

  • Unified Itemized Transactions

  • Unified Profiles

  • Unified Scores

  • Unified Customer

In addition to these tables, you must build a Merged Customers table that defines certain rollup behaviors for profile data and, if using AmpIQ, transactions, itemized transactions, and customer profile data.

The databases and tables that may be present in the Customer 360 page are not limited to only those output by the Stitch process. You can configure domain tables to be directly passed through to the customer 360 database and, using Spark SQL, you can build any custom database or table that you require.

All databases are managed from the Customer 360 page.

The Customer 360 page.

After you have built your customer 360 database and it is running against a representative collection of your data sources, you can start to look at extending the database for additional workflows and use cases, such as:

  1. Blocklisting values from Stitch or from customer 360 data

  2. Applying specific labels to data

  3. Adding calculations

  4. Extending data to focus on address-based householding, first-party data, or third-party data

  5. Adding CCPA or GDPR privacy rights workflows

  6. Extending for customer interactions, such as order-level and item-level transactions data and product catalogs

  7. Building workflow-specific databases or tables to support PII consolidation or master data management (MDM) use cases

  8. Adding support to enable additional Amperity features

Run queries

The Queries page uses Presto SQL to interact with any database and table that is present in the Customer 360 database. You can use a visual SQL editor for simple queries and a SQL editor for more complex queries. Amperity supports nearly all of the functionality of Presto SQL that you would use when building a SELECT statement. See the Amperity Presto SQL reference for specifics, and refer to the Presto SQL documentation for anything not covered in the Amperity reference.

Use the Queries page to review the quality of Stitch output, to review the quality of transactions and itemized transactions data, and to build queries, the results of which can be sent from Amperity to any downstream workflow.
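
As an illustration of a QA-style query, the following Python sketch runs a SELECT statement that counts how many source records share each Amperity ID. It is a minimal sketch under stated assumptions: the table name unified_records, its columns, and the sample rows are hypothetical, and the sqlite3 module from the Python standard library stands in for the Presto SQL engine. The statement uses only standard SQL constructs (GROUP BY, HAVING, ORDER BY).

```python
import sqlite3

# In-memory database that stands in for a customer 360 database.
conn = sqlite3.connect(":memory:")

# Hypothetical table and columns; real table names will differ.
conn.execute("CREATE TABLE unified_records (amperity_id TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO unified_records VALUES (?, ?)",
    [
        ("amp-1", "a@example.com"),
        ("amp-1", "a.alt@example.com"),
        ("amp-2", "b@example.com"),
    ],
)

# QA query: find IDs that group more than one source record.
QA_QUERY = """
SELECT amperity_id, COUNT(*) AS record_count
FROM unified_records
GROUP BY amperity_id
HAVING COUNT(*) > 1
ORDER BY record_count DESC
"""

rows = conn.execute(QA_QUERY).fetchall()
for amperity_id, record_count in rows:
    print(amperity_id, record_count)
```

With the sample rows above, only amp-1 is returned because it is the only ID that groups more than one record.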

The Queries page.

Send data from Amperity

Sending data from Amperity falls into similar categories:

  • Send data to cloud storage

  • Send data to file transfer

  • Send data to REST API

  • Send data to warehouse

The process of sending data from Amperity is managed from the Destinations page in the Amperity user interface.

The Destinations page.

The Destinations page contains the following components:

  • Destinations, which define how data is sent from Amperity and the location to which it is sent

  • Data templates, which map fields in the customer 360 database to the fields that are required by the downstream workflow

  • Orchestrations, which define schedules for sending data from Amperity
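
The field mapping that a data template performs can be pictured with a short Python sketch. This is a conceptual illustration, not the way data templates are configured in Amperity: the field names on both sides of the mapping and the apply_template function are hypothetical.

```python
# Hypothetical mapping: field in the customer 360 database -> field the
# downstream workflow requires.
TEMPLATE = {
    "email": "EMAIL_ADDRESS",
    "given_name": "FIRST_NAME",
}


def apply_template(rows, template):
    """Keep only the mapped fields and rename them for the destination."""
    return [
        {dest: row[src] for src, dest in template.items()}
        for row in rows
    ]


# Hypothetical query results to be sent downstream.
query_results = [
    {"amperity_id": "amp-1", "email": "a@example.com", "given_name": "Ana"},
]

payload = apply_template(query_results, TEMPLATE)
print(payload)
```

Fields that are not in the mapping, such as amperity_id in this example, are left out of the payload. An orchestration would then decide when this payload is sent to the destination.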