Load to Databricks Action Type
This article helps admins configure the Load to Databricks action type using Horizon Rules in the Rules Engine.
Overview
Admins can configure the Load to Databricks Action to sync data from Gainsight to Databricks. Rules Engine provides the ability to prepare the dataset from various standard and custom objects of Gainsight, and the Load to Databricks action type sends the data to the tables in Databricks.
Gainsight recommends understanding the configuration of the Rules Details, Data Setup, and Schedule steps before reading this article. For more information, refer to the Create New Rule section of the Rules Engine Horizon Experience Overview article.
Example Business Use Case:
Consider that a CSM wants to send customer touchpoints back to Databricks so that the data is available across the organization.
The Load to Databricks Action type is used to upsert, update, and insert these values in the target fields of the Databricks table.
IMPORTANT: Ensure that the Databricks connection created has the tables to which you want to write back the data.
Limitations
The following are known limitations of Databricks:
- No Partial Success State: If any data in the CSV file is incorrect when the rule is run, Databricks fails the entire process. Users must identify and rectify the error, and then re-run the rule.
Note: The writeback can either succeed or fail entirely; it cannot be a partial success like other rules.
- Float Data Rounding: Float values are automatically rounded by Databricks, for example: 12.34 becomes 12.34000015258789, and 12.341 becomes 12.340999603271484.
- Timestamp Field Formatting: When ingesting data into a timestamp field, it appends .000 to the time, for example: 2022-02-07 17:35:24.0 becomes 2022-02-07T17:35:24.000.
- Timestamp NTZ Field Formatting: When data is ingested into a timestamp NTZ field, the appended timestamp (.000) is removed, for example: 2022-02-07 17:35:24.000 becomes 2022-02-07 17:35:24.
- Decimal Place Truncation: Decimal values are truncated based on the column configuration. For example, if a column is set for two decimal places, 123.123 becomes 123.12 and 123.126 becomes 123.13.
- Date Format Inconsistency: Writeback to Databricks fails if date fields are in different formats. For example, if YYYY-MM-DD works, then DD-MM-YYYY fails.
- Duplicate Upsert Keys in CSV: If the input data has duplicate records with the same identifier key, Databricks fails the ingestion, citing multiple rows in the source table matching a single row in the target table, which causes ambiguity.
- Identifiers: For writeback to Databricks tables, fields of boolean, float, decimal, and double data types cannot be selected as identifiers.
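The duplicate-key and date-format limitations above can be caught before the rule runs. The following is a minimal pre-validation sketch, not a Gainsight feature; the column names (`account_id`, `touchpoint_date`) and the list of accepted input formats are hypothetical placeholders for your own dataset:

```python
from datetime import datetime

# Hypothetical column names; replace with your dataset's fields.
ID_KEY = "account_id"
DATE_COL = "touchpoint_date"
# Input formats you expect to encounter; all are normalized to YYYY-MM-DD.
ACCEPTED_FORMATS = ["%Y-%m-%d", "%d-%m-%Y", "%m/%d/%Y"]

def normalize_date(value: str) -> str:
    """Coerce a date string to YYYY-MM-DD so all rows share one format."""
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {value!r}")

def prevalidate(rows: list[dict]) -> list[dict]:
    """Deduplicate on the identifier key and normalize dates.

    Keeping only the last row per key removes the upsert ambiguity
    Databricks reports when multiple source rows match one target row.
    """
    deduped = {}
    for row in rows:
        row[DATE_COL] = normalize_date(row[DATE_COL])
        deduped[row[ID_KEY]] = row  # later rows win
    return list(deduped.values())
```

Running this over the exported rows before the writeback surfaces bad dates immediately (via the raised `ValueError`) instead of failing the entire Databricks ingestion.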
Configure Load to Databricks Action Type
To configure the Load to Databricks Action type:
- In the Action Setup step, click the Add Action icon. The Add Criteria slide-out panel appears.
- (Optional) Click Add Criteria to define filter criteria for the dataset.
- Click Skip This Step to navigate to the Add Actions step.
- From the Create Action drop-down list, select the Load to Databricks Action type.
- In the Load to Databricks section, provide the following details:
- From the Connectors drop-down list, select the Databricks connection.
Note: The Databricks connection must be authenticated on the Connectors 2.0 page.
- From the Object Name drop-down list, select the tables to which you want to transfer the data.
- From the Operation drop-down list, select one of the following:
- Update: Updates the existing record as per the field mapping.
- Upsert: Updates any matching records and creates new records if no matching records are found.
- Insert: Inserts new data from the source fields into the target fields.
- (Optional) Description: Enter a description for the rule action.
- To load data into tables, field mapping is required. To map fields, click Add Fields.
- From the Add Fields drop-down list, select either of the following types of fields to add to the action.
- Source Field: This option lets you select source field(s) from the dataset prepared from Gainsight objects while configuring the rule and map them to the target fields in the Databricks object. To select source field(s):
- Click Source Field. The Select Fields slide-out panel appears.
- Select the checkboxes of the source fields from which the data must be fetched.
- Click Select. The selected source fields are added to the Field Mappings section.
- From the Target Fields drop-down, select the target field.
- (Optional) Add a Default Value.
Note: This value is loaded in the target field in case the source field value is invalid or not available.
- Custom Field: This option lets you provide a custom value to the target field in the Databricks object. To configure a custom value:
- Click Custom Field.
- From the Target Fields drop-down list, select the target field.
- Enter a value in the Source Field.
- Click Save Actions. The Load to Databricks Action Type is saved successfully.
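The three operations differ only in how they treat records whose identifier key already exists in the target table. The following is a conceptual Python sketch of those semantics, not how Databricks actually performs the merge; the target table is modeled as a dict keyed by the identifier, so Insert is shown as adding only keys not already present:

```python
def apply_operation(target: dict, source_rows: list[dict], key: str, op: str) -> dict:
    """Model Update / Upsert / Insert against a target table keyed by `key`."""
    result = dict(target)
    for row in source_rows:
        k = row[key]
        if op == "update":
            if k in result:          # only existing records are modified
                result[k] = row
        elif op == "upsert":
            result[k] = row          # update if present, otherwise create
        elif op == "insert":
            if k not in result:      # only new records are added
                result[k] = row
        else:
            raise ValueError(f"Unknown operation: {op}")
    return result
```

For example, with a target containing key A1 and source rows for A1 and A2: Update changes only A1, Upsert changes A1 and adds A2, and Insert adds only A2.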
Schedule the Rule
Once the dataset and required action setup are complete, the rule can be scheduled using time-based or event-based schedules. You can schedule the execution of an individual rule in chronological order.
For more information on how to schedule a rule, refer to the Schedule and Execute Rules article.