Formula Fields in Rules Engine
IMPORTANT - Articles Impacted due to 6.42 July CS Release
Due to the v6.42 July, 2024 release, this article has been impacted. Steps, images, and playable GIFs in this article will soon be updated to reflect the latest changes.
This article reviews the Formula fields supported in Rules Engine. After reading this article, you can apply various formulas to data.
Supported Formulas
This feature/functionality allows you to create different types of formulas so that you don’t have to create the formula directly on the Object. Formula Field allows you to perform the following tasks:
- Build a numerical expression on fields with all arithmetic operators.
- Generate a formula on the columns of a single record.
Numeric Formulas
- Admin Page Views + Config page views
- Current Score - Previous Score
- NPS® Avg * 10
- ((Current Score - Previous Score) / Previous Score) * 100
String Formulas
- First Name + Last Name
- CTA Comments + Custom Text
- Convert to Upper Case (Company ID)
- LEFT(field, substring) - extracts the left part of the string from the appearance of string character.
- RIGHT(field, substring) - extracts the right part of the string with the appearance of a string character.
- POSITION(field, substring) - provides the position of the substring in a string. For example, POSITION(“doe” in “johndoe”) returns 5. Counting of the position starts with an index 1.
- LENGTH(field) - provides the length of the string. For example, LENGTH(“string”) returns 6.
- SUBSTRING(field, position, length) - finds the substring based on the position and length.
Date Formulas
- (Last Modified Date - Created Date) in days
- Convert to Date time (Created Date)
Period Over Period Comparison
- Moving average of a metric over a time window
- Period Over Period Comparison in % (time window, Average out past N windows).
Limitations
- An expression cannot be added when the GroupBy field is added to a task.
- Constants in Formula Field are not considered as a token.
- The formula time span cannot exceed 2 years.
- Period Over Period Comparison or Period Over Period Comparison in % formula cannot be created unless a date identifier is present.
- Period Over Period Comparison and Period Over Period Comparison in % accept only integers, not decimal values.
- Negative values are not allowed in the Function for Period Over Period Comparison and Period Over Period Comparison in %.
- ((A-B) * (A+B)) cannot be simplified to (A-B) * (A+B).
- The Concat function allows only 10 arguments.
- A maximum of only 10 formulas can be added.
- You cannot use Period Over Period comparison Formula and String Concat formula simultaneously in the same rule.
- Add single fields in parentheses when setting up numerical fields in data preparation. With parentheses, the formula works as expected.
Create a Rule
To create a Rule:
- Navigate to Administration > Rules Engine.
- Click Create Rule.
- In the Create Rule screen, enter the following information:
- Rule For: Company
- Rule Name: Enter a name for the rule.
- Folder: Select a folder for the rule.
- (Optional) Description: Select a folder for the rule.
- Click NEXT. The Setup Rule screen is displayed.
- Click DATASET TASK.
- Enter a Task Name.
- Select Company as the Source Object.
- Populate the Show and Filter sections as shown in the image below (after Step 6).
- Click SAVE.
- Click <-, to navigate to the Setup Rule screen.
If you do not find any fields in your environment, you should add the same as a custom field. To learn more about adding custom fields, refer to the Gainsight Data Management article.
It is assumed that every Company’s data has been modified on the rule run date (today). So the Modified Date field is the same as the current date.
Application of Formulas
- Click + TASK and select Transformation.
Note: You can create a Formula Builder only for Transformation Task.
- Enter a Task Name.
- Select the Source Object (here, Select fields for formula).
String Formulas
The First name and Last Name fields of CEO and CSM for every Company is available. Instead of viewing the first name and last name separately, you can use the String function to concatenate these fields and merge them into a single field called CEO and CSM.
- Expand Formula Fields.
- Expand String Formulas.
- Perform Concat Operation to create CEO field as shown in the following image.
Similarly, you can create the CSM field by applying Concat function on CSM First Name and CSM Last Name fields.
The complete list of String formulas supported by Rules Engine and their description is given below:
- Concat: Concatenates the selected arguments. A maximum of 10 arguments can be added and one string can be used multiple times.
- To Lower Case: Converts the selected argument values to Lowercase.
- To Upper Case: Converts the selected argument values to Uppercase.
- LEFT(field, substring) - extracts the left part of the string from the appearance of string character. For example, LEFT(johndoe@gainsight.com, @) will return “johndoe” as the output.
- RIGHT(field, substring) - extracts the right part of the string with the appearance of a string character. For example, RIGHT(johndoe@gainsight.com, @) will return “gainsight.com” as the output.
- POSITION(field, substring) - provides the position of the substring in a string. For example, POSITION(“doe” in “johndoe”) returns 5. Counting of the position starts with an index 1.
- LENGTH(field) - provides the length of the string. For example, LENGTH(“string”) returns 6.
- SUBSTRING(field, position, length) - finds the substring based on the position and length. For example, Substring(“johndoe”, 2, 3) returns “ohn”.
Date Formulas
It is assumed that the Modified Date is equal to the current date. You can now use Date formula to calculate the number of days left until the renewal date.
- Expand Date Formulas.
- Drag Date Diff to Show section.
- Perform the following action in the Date Diff window.
The complete list of Date formulas supported by Rules Engine and their description is given below:
- Date: Converts Date-time argument to Date.
- DateDiff: Returns difference between the two selected dates in the selected unit.
IMPORTANT: When calculating the difference between two date time fields, the time component of the Run Date starts at the beginning of the day. The difference between the dates is always rounded down to the nearest whole number after dividing by 24 hours. For example, a difference of 2.5 days is rounded down to 2 days.
However, a difference of 29 days and 17 hours is still rounded down to 29 days instead of 30. This is due to a limitation in the database's date diff function.
The following functions allow only Date and Datetime data types as input arguments:
- First Day of Calendar Quarter: Returns the first day of the Calendar Quarter.
- First Day of Current Month: Returns the first day of Current Month.
- First Day of Current Week: Returns the first day of the Current Week.
- Last Day of Calendar Quarter: Returns the last day of the Calendar Quarter. For example, last day of calender quarter (3/27/2019) will return "3/31/2019" as the output.
- Last Day of Current Month: Returns the last day of Current Month.
- Last Day of Current Week: Returns the last day of Current Week. For example, last day of the week (6/2/2022) will return "5/29/2022" as the output.
- Add/Subtract Date: You can use this function to perform addition and subtraction operations on a date. These operations can be performed either on a date used in the dataset, or on the rule execution date. You can add or subtract any number of days or weeks to a date.
Note:
The international standard ISO 8601 for date and time representation states that Sunday is the seventh and last day of the week.
To use this function:
- Expand Date formulas.
- Drag and drop Add / Subtract Date to Show section.
- Enter a name in the Output Field Label field
- Select a date field either form the Dataset or Rule Date.
- Select either the + or - operator.
- Enter a number in the Value field.
- Select either Days or Weeks in the Value field.
- Click SAVE.
- Rule Date: The Rule Date function does not accept any arguments. It only returns the Current Date on which the rule is being executed. You can also use the Rule Date function as an argument in any of the above functions. However, for the Date Diff function, you can use Rule Date as an argument only in the end datetime field, and not in the start datetime.
You can see the following example on creating Date Formula.
You can use Date formula to calculate the number of days left for license renewal. To perform this in a Transformation task:
- Expand Formula Fields on the left pane.
- Expand Date Formulas.
- Drag and drop Date Diff to Show section. Date Diff formula field window appears.
- In the Date Diff formula field window, provide the following information:
- Output Field Label: Enter only alphanumeric and underscore characters in this field. The label should always start with an alphabetical character.
- Output Header: This name is used as a CSV Header to process the action. This is view only and is generated based on the Output Field Label (but without space).
- Decimal Places: Enter a number between 0 and 9.
- Get Signed Value: This option is applicable only for Date Diff function. If you select this option, the output of Date Diff function can be a negative integer. However, if you do not select this option, and if Date Diff function has a negative output, it is automatically converted to a positive integer.
For example, if your start datetime value is 5 Jan 2018 and end datetime value is 2 Jan 2018, Date Diff function subtracts start datetime from end datetime. So in this case, the output would be -3 days. If you do not select the Get signed value option, the output is displayed as 3 days. However, when you select the Get signed value option, the output is returned as -3.
- Perform Date Diff operation as shown in the following image.
In this example, a new column Days left till renewal is created. This column contains the difference value (in days) after subtracting the Renewal Date from Rule Date (assumed to be today's date). If any negative values are displayed in the output, it implies that the renewal date for that customer has been missed.
-
Renewal Date: The Date on which the Customer license is scheduled to renew.
- Rule Date: The date on which the rule is being executed. Generally, this reflects today's date. However, if you are running the rule historically, this field will not display today's date.
-
Click SAVE. Clicking RESET erases the data that you entered and allows you to refill the field with new information.
-
In the Setup Rule page, you can see the newly created Formula Field in the Show section as shown in the image.
Note:
The Add/Subtract Date Formula field in the Date Formulas accepts dynamic fields. The number of days you add to or subtract from a date dynamically changes with the value of the selected field.
You can select the unit of the field as Days only. The Weeks option is not available.
Number Formulas
You can use Number Formulas to the Formula Fields in Rules Engine. These enable you to perform powerful calculations in your Transformation Tasks while setting up a rule.
The following number formulas are available:
-
Abs: The Abs formula returns the absolute value of a number. You can provide your input in the formula field as a positive or negative integer, but the output is always a 7.3890561 positive integer. For more information on using the Abs formula, refer to the ABS function article.
-
Ceiling: The Ceiling formula maps a number to the least integer greater than or equal to the number.For more information on using the Ceiling formula, refer to the Ceiling function article.
-
Exp: The Exp formula returns Euler's number, e (~2.718) raised to a power.
Example: EXP(1) = Approximate value of e = 2.71828183
EXP(2) = Base of the natural logarithm e raised to the power of 2 = 7.3890561 -
Floor: The Floor formula maps a number to the greatest integer less than or equal to the number. For more information on using the Floor formula, refer to the Floor Function article.
-
Log: The Log formula returns the logarithm of a number given a base. For more information on using the Log formula, refer to the Log Function article.
-
Sqrt: The Sqrt formula returns the positive square root of a positive number.
-
In: The ln formula returns the natural logarithm of a number, base e (Euler's number). For more information on using the ln formula, refer to the LN function article.
Note: For Sqrt, Log and ln, formulas require positive integers as input values. However, the rule runs successfully and no error message is thrown for negative input.
If you enter negative values, these formulas return the following values:
Formula | Value |
---|---|
Sqrt | Zero |
ln | Zero |
Log | Null |
Usage of Period Over Period Comparison Functions
The following functions are available in Period Over Period Comparison:
- Period Over Period Comparison
- Period Over Period Comparison in %
These functions cannot be created unless a date identifier is present. Arguments are consumed inside a function. To use Period over Period comparisons, a Date field must be included in the Group By section.
We use Period over Period Comparison to demonstrate a situation in which CSM can create a table that shows MRR value in percentage over the past year.
To use Period Over Period Comparisons, you must have a Date field in the Group By section. But String functions do not work, if any field is inserted to the Group By section).
- Drag Modified Date field to GroupBy section.
- Drag Period Over period Comparision In % to Show section.
- Enter the values as shown in the below image
The Period Over Period Comparison function also performs similar tasks. But it considers the actual values for calculation and not percentile values.
The syntax to use Period over Period comparison functions is:
- Period Over Period Comparison ( Period([ DAY / WEEK / MONTH / YEAR ], <Duration>),
- <Metric [ field of type numeric ]>,
- <Date Identifier [ field of type date ]>,
- <Aggregation Method [ SUM,AVG,COUNT, COUNT DISTINCT, MIN, MAX ]>,
- <Num of Prior Periods>,< Treat Missing Data as Zero [ true/false ]>);
Statistical Functions
Gainsight introduces Statistical Functions in Rules engine. This is a first step towards introducing Data Science capabilities in the Rules Engine. Using Statistical Functions, you can derive insights from your datasets that will help you make more informed decisions regarding your product and customers.
- Correlation - The correlation function shows you how two variables are related. This function looks at the strength of the relationship between two variables and helps you make a decision based on the results.
- Covariance - Covariance is a type of correlation. It indicates whether there is a positive or negative correlation but does not indicate the strength of the correlation between two variables.
- Standard Deviation and Variance - Both Standard deviation and variance measure volatility/variability. A high value indicates that the dataset is spread out; a low standard deviation means that the dataset values are close to the mean.
- Variance - Variance measures how far a set of numbers are spread out from their average value.
- Median - Median is more useful than an average when your dataset has outliers; meaning if the dataset has a high standard deviation.
You can refer to the Application of Statistical Functions in Rules Engine article, to understand how Statistical functions can be applied in Rules Engine.
Best Practices in Statistical Functions
- Correlation and Median functions, if used together within the same task, may not provide accurate results if the data has nulls.
- Add filters with not null checks: if you are unsure if the dataset has null values or not and want to use correlation along with other statistical functions within the same task, it is suggested to add filters with not null checks. This will filter out the null data records and produce accurate results.
- Correlation/Median with other aggregations: It is suggested not to use correlation or median functions in conjunction with any other aggregation (like sum, min, max, avg) or function if your data has null values.
- For Nulls: Use "Treat null as Zero" for all the relevant metrics. You can use the formula field here to pick each of these fields and select the Treat null as Zero option.
Case Expression
You can use the Case Expression formula field to categorize data, based on a specific set of requirements. You can use this Formula function to create an output Field. This output field has values for the records that match your specific set of requirements.
Anatomy of Case Expression:
- Case Expression function is made up of multiple cases (a maximum of 10).
- Every Case consists of multiple Criteria (a maximum of 5). Each criteria is a specific requirement that any record should match. For example, a Criteria for a customer to be classified as Detractor can be NPS® score between 0-6.
Every Case has an associated Action when any record matches with the given criteria. A value (Ex: Detractor) is populated in the Output field. This value can be a custom value or fetched from another field in the source dataset.
Execution of the Case Expression in detail:
- Execution of the Case Expression begins with the evaluation of the first case on a record. If all the criteria, in this case, are satisfied by the record, the action associated with this case is executed. The execution of Case expression halts here for this record and none of the other cases are evaluated.
- However, if the first case is not satisfied, the system evaluates the second case on the same record, and so on. If none of the available cases are satisfied by the record, the default case is executed.
This process is applied to all the records.
The Default Case: The Case Expression also has a default case. This default case does not have any criteria. It only has an action; Default action. You cannot delete the default case. When a record does not match any of the specified criteria, the action associated with the default case is executed.
The various fields present in Case Expression formula field are:
- Output Field Label: Enter the name for the Output column. This name is displayed on the final output page for the Case Expression result.
- Output Header: This is a view-only field. The name is derived from the name given in the above field.
- Output Data Type: Select the data type for the output. The options present in Case Output depends on the Output data type. Three data types are available:
- Number
- String
- Boolean
- Decimal Places: If you select Number in the Output Data Type, select the number of decimal places up to which the output must be displayed, for floating-point numbers.
- Case: Case consists of the Criteria(s). You can add multiple criteria by clicking +
- Criteria. The various fields present in the Criteria are:
- Field: Select the field on which the Criteria must be applied. All the fields included in the Dataset are displayed here.
- Operator: Select an Operator to be applied in the field. The operators are displayed based on the field selected.
- Matching Criteria: Select the matching criteria which must be satisfied, for these criteria to be successful. You can either select:
- Value: Enter a value that must match the field value.
- Field: Select a field whose value must match the value of the field, selected in the first field.
- Null records checkbox: Select this checkbox to include records which are Null.
- Advanced Logic: When you have multiple criteria, select the logical operator to be applied between those criteria(s). By Default, the AND logic is applied.
- Then: If all the criteria(s) are matched, the action specified in this field is executed. The options available in this field depend on the value selected in the Output Data Type field.
- Default: This field represents the default case. There are no criteria for this case. It only has action. If all the cases fail, this case is executed.
In the below example, an output field called Customer on Verge of Churn is created. The output type for this is Boolean. So if there is a customer whose MRR value less than 1000 AND Trend Indicator field shows Down, this field displays the true value for such customers. For the rest of the customers, it displays false, which is the Default case Action.
The source input for this test scenario is as shown below
In this example, only 12 customers are included and two evaluation fields are chosen; MRR and Trend indicator. However, in the real-world with multiple customers and more number of evaluation fields, it becomes cumbersome to identify which customers are on the verge of getting churned. This task is easily accomplished by using Gainsight’s Case Expression formula field.
Case Working method: Case Expression evaluates all the 12 customers to check who matches the case. The first customer Abb Corp Ltd. is evaluated to check if the customer satisfies this case. Since ABB corp does not match any of the two given criteria (MRR not less than 1000 and Trend not equal to Down), this customer does not satisfy Case 1. If there were multiple cases, Case Expression would have had evaluated Abb Corp Ltd. to check if it satisfies some other case. Since there are no further cases to be evaluated, the default case is executed for Abb Corp Ltd., and a false value is assigned in the Output column for this customer. This process is repeated for all 12 customers.
Note: You can include the Customer Name field also in the Show section of Transformation task to easily map customers and his Churn status. In the output image below, you can see that five customers are on the verge of getting churned.
NPS, Net Promoter, and Net Promoter Score are registered trademarks of Satmetrix Systems, Inc., Bain & Company and Fred Reichheld. |