This connector fetches data from CSV files in Amazon S3 and loads it into the Gainsight MDA repository. When you use S3 buckets, the job still pushes the data to MDA objects/subject areas: the destination of the data does not change, only the method used to ingest it. S3 is a place to store files; it is not a database that you can use for Rules or Reporting. The S3 connector lets you set up mappings to read those files and bring them into MDA objects, which you can then use for Rules and Reporting. Once the raw usage data resides in Gainsight MDA, the system performs aggregations on it to achieve optimal performance while generating reports. The S3 Connector is available at Administration > Operations > Connectors > S3 Connector.
This article describes how to:
- Integrate Amazon S3 with Gainsight
- Create Projects for Data Insertion
- Upsert data into a subject area
- View execution history and S3 configuration
- Troubleshoot data load operations
- Use the Cyberduck tool
- Work within the S3 connector limitations
- Make sure that the formats of the Date and DateTime values in the CSV file are supported in the MDA. For the list of supported formats, refer to Gainsight Data Management.
- All your projects will need their CSVs to be placed under: s3://gsext-lr7yqwhf1o0laliaqiqhiwemn...a-Ingest/Input.
Note: When using Cyberduck, copy only the part of the Bucket Access Path that follows s3://.
- Before using this connector, you must contact Gainsight Customer Support and obtain credentials to integrate Amazon S3 bucket with Gainsight. For more information, refer to the Integrating Amazon S3 with Gainsight section in this article.
- Use S3-SDK, S3cmd, or an S3 browser to copy the CSV file into the input folder to perform data load operations. On Windows, use either Cyberduck or S3 Browser. On Mac, use Cyberduck version 5.2.2 (the newest version of Cyberduck removes the ability to enter a path). For more information, refer to the How to Use Cyberduck Tool section.
- Multiple files with different column formats require creation of corresponding projects for the desired Matrix Data-Object.
- We recommend that the CSV file size not exceed 200MB.
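The size guidance can be checked before a file is copied to the input folder. The following is a minimal sketch, not a Gainsight utility: the function names are hypothetical, and it assumes that "MB" means 1,000,000 bytes, which this article does not specify. It combines the 200MB recommendation above with the 500 MB ceiling listed under the S3 limitations later in this article.

```python
import os

# Assumption: "MB" is read as 1,000,000 bytes; the article does not say
# whether decimal or binary megabytes are meant.
MB = 1_000_000
RECOMMENDED_MAX_BYTES = 200 * MB  # recommended ceiling stated in this article
HARD_MAX_BYTES = 500 * MB         # upper limit listed under the S3 limitations


def classify_size(size_bytes: int) -> str:
    """Return 'ok', 'above_recommended', or 'too_large' for a file size."""
    if size_bytes > HARD_MAX_BYTES:
        return "too_large"
    if size_bytes > RECOMMENDED_MAX_BYTES:
        return "above_recommended"
    return "ok"


def check_csv(path: str) -> str:
    """Classify a local CSV file by its size on disk (hypothetical helper)."""
    return classify_size(os.path.getsize(path))
```

A file reported as above_recommended could be split into smaller CSV files before upload.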
Integrating Amazon S3 with Gainsight
To integrate Amazon S3 with Gainsight:
- Go to Administration > Operations > Connectors > S3 Connector.
- Click on S3 Connector; then click NEXT. The Gainsight application will automatically populate the Access Key and Security Token for the S3 bucket created for you. You might have to reload the page to view these credentials; then click the View S3 Config link. Optionally, you can click on the Reset button to reset Access Key and Security Token.
- Click Test to check whether S3 has been successfully integrated with Gainsight.
- Click NEXT to perform different data load operations on S3 data. For more information on operations that you can perform on S3 data, refer to Creating Projects for Data Insertion and How to Upsert Data into Subject Area.
Creating Projects for Data Ingest
Once you have integrated the S3 connector with Gainsight, you can create a project.
- You can create multiple projects on an existing MDA custom object (Matrix Data-Object).
- To edit existing projects, click the individual project in the project list page.
To create a project:
- Navigate to Administration > Operations > Connectors > S3 Connector; then click + DATA INGEST JOB.
- Under Data Ingest Job Setup tab, enter the following details:
- Data Ingest Job Name: The desired data ingest job name.
- Matrix Data-Object: Select an existing Matrix Data-Object.
- Input: The path for the input file.
- Archived: The path for archiving the file.
- Failed: The path where the file is placed if the operation fails.
- Key Encryption: To encrypt the file to be uploaded.
- Note: If you are unable to see Key Encryption, contact Gainsight Support.
- Recommended/Verified tools to encrypt file: GPG Keychain and OpenPGP Studio.
- Select Type: The type of encryption. (This field appears only when you select the Key Encryption check box.)
- Write to error folder: Select this if you want to write the error file to the path specified in the Failed field. (This field appears only when you select the Key Encryption check box.)
- Source CSV file: Enter the CSV file name that you would like to be picked from the Amazon S3 input folder.
- Select data load operation: Select the Insert or Upsert check box. Upsert is available only after you choose a Matrix Data-Object. When you select Upsert, you must also select the key fields that identify unique records.
- CSV Properties: Select the appropriate CSV properties. Gainsight recommends the following CSV properties:
- Char (Character) Encoding: UTF-8
- Separator: , (Comma)
- Quote Char: " (Double Quote)
- Escape Char: Backslash
- Header Line: 1 (Mandatory)
- You must select a Character Encoding format in the S3 job configuration. UTF-8 is selected by default, but you can change it as required.
- Use the same separator in the job configuration as in the CSV file being uploaded. Comma (,) is selected by default, but you can change it as required.
- The Quote Character is used to import a value that is enclosed in quotation marks, along with any special characters it contains. We recommend using the same Quote Character in the job configuration as in the CSV file being uploaded. Double Quote is selected by default, but you can change it to Single Quote as required.
- The Escape Character is used to include a special character in a value. By default, Backslash is used as the Escape Character before a special character in the value. We recommend using Backslash in the CSV file to avoid discrepancies in the data after loading.
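As an illustration of the recommended CSV properties, the sketch below writes a file with Python's standard csv module using a comma separator, double-quote quote character, backslash escape character, and a header on line 1. The column names and sample values are made up, and the helper names are hypothetical; this shows one way to produce a matching file, not how Gainsight parses it.

```python
import csv

# Settings that mirror the recommended CSV properties above.
CSV_OPTIONS = {
    "delimiter": ",",      # Separator: comma
    "quotechar": '"',      # Quote Char: double quote
    "escapechar": "\\",    # Escape Char: backslash
    "doublequote": False,  # escape embedded quotes with the backslash
}


def write_rows(rows, stream):
    """Write rows (header first) to an open text stream."""
    writer = csv.writer(stream, quoting=csv.QUOTE_MINIMAL, **CSV_OPTIONS)
    writer.writerows(rows)


def read_rows(stream):
    """Read the rows back with the same settings."""
    return list(csv.reader(stream, **CSV_OPTIONS))


# Hypothetical sample data: the header occupies line 1.
SAMPLE_ROWS = [
    ["Account Name", "Notes"],
    ["Acme, Inc.", 'Customer said "renew"'],
]
```

To write a real file, open it with UTF-8 encoding, for example `open("usage.csv", "w", encoding="utf-8", newline="")`, and pass the file object to `write_rows`.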
- Click PROCEED TO FIELD MAPPING; under the Field Mapping tab, map Target Object Field with Source CSV Field appropriately.
- You can map all or a few object fields with the header fields in the CSV file. You can choose multiple object fields and then click the Field Mapping icon to map the selected object fields with the CSV headers.
- You can click Select All to map all object fields with the CSV headers.
- Click the UnMap icon for a specific field mapping or UnMap All to unmap all the fields that you set for mapping.
- While mapping Date and DateTime fields between the Source CSV field and the Target MDA object, click the Clock icon. The Select a Timezone dialog box appears.
- Select a timezone from the dropdown list and click Ok. This assigns a timezone to the Date and DateTime values. The values are then converted from the selected timezone into UTC and stored in the MDA object. If you do not select a timezone, the records are considered to be in the Gainsight Timezone, and the Date and DateTime values are converted from the Gainsight Timezone into UTC before they are stored in the MDA object. For more information on timezone standardization, refer to Timezone Standardization at Gainsight.
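The conversion described above can be pictured with a short sketch. It is illustrative only: the fixed UTC+05:30 offset stands in for whichever timezone is selected in the dialog, the date format string is an assumed example rather than a confirmed MDA format, and `to_utc` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Assumption: a fixed UTC+05:30 offset stands in for the selected timezone.
SELECTED_TZ = timezone(timedelta(hours=5, minutes=30))

# Assumption: example format only; check the supported MDA formats.
EXAMPLE_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_utc(value: str, fmt: str, source_tz: tzinfo) -> datetime:
    """Interpret a CSV value in source_tz and return the equivalent UTC time."""
    local = datetime.strptime(value, fmt).replace(tzinfo=source_tz)
    return local.astimezone(timezone.utc)
```

With these assumptions, a CSV value of 2024-03-01 10:00:00 is stored as 04:30 UTC on the same day.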
- For Derived Field mappings, click the Show import lookup icon. The Data import lookup configuration dialog appears. Use it to look up the same or another standard object and match fields, so that Gainsight IDs (GSIDs) are fetched from the looked-up object and populated in the target field. Derived mappings can be performed only for target fields of the GSID data type.
There are two types of lookups: Direct and Self. Direct lookup enables admins to look up another MDA standard object and fetch the GSIDs of records from the lookup object. Self lookup enables admins to look up the same standard object and fetch the GSID of another record into the target field. For more information, refer to Data Import Lookup.
- In the following example using Direct import lookup, we look up the User object, match the CSV file header CSM with User::CSM, and bring the correct GSID from the lookup object User into the target field Company::CSM. Click the + button to match multiple fields between the CSV file and the lookup object to import the correct GSID from the standard object. When there are multiple matches or no match is found, you can select from the given options as needed. Click Apply.
Note: If there are multiple Account and User Identifiers (multiple mappings), Admins can use Data import lookup.
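Conceptually, a Direct lookup matches a CSV value against a field on the lookup object and returns that record's GSID. The sketch below illustrates the three possible outcomes (one match, no match, multiple matches). The record layout, the field names `Name` and `Gsid`, and the function itself are hypothetical, and the handling of the no-match and multiple-match cases is left to the caller because the available options are not listed in this article.

```python
from typing import Dict, List, Optional, Tuple


def lookup_gsid(
    csv_value: str,
    lookup_records: List[Dict[str, str]],
    match_field: str = "Name",  # hypothetical field on the lookup object
    id_field: str = "Gsid",     # hypothetical field holding the GSID
) -> Tuple[str, Optional[str]]:
    """Return (status, gsid) for one CSV value.

    status is 'matched', 'no_match', or 'multiple_matches'.
    """
    matches = [r[id_field] for r in lookup_records if r.get(match_field) == csv_value]
    if len(matches) == 1:
        return ("matched", matches[0])
    if not matches:
        return ("no_match", None)
    return ("multiple_matches", None)


# Made-up User records for illustration.
SAMPLE_USERS = [
    {"Name": "Ana Ruiz", "Gsid": "U-001"},
    {"Name": "Sam Lee", "Gsid": "U-002"},
    {"Name": "Sam Lee", "Gsid": "U-003"},
]
```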
- When Field Mappings and Derived Field Mappings are completed, click NEXT; then under the Schedule tab, enter the following details and click RUN NOW or set a recurring schedule.
- On Success: If a job has partial/full data import success, a success notification email is sent to the email ID entered here.
- On Failure: When all records fail to import, a failure notification email is sent to the email ID entered here.
Note: When you click Run Now, the data ingest configuration is saved automatically.
(Optional) If you do not want to schedule your data ingest job, you can choose to execute it whenever the file is uploaded to the Input folder using the Post file upload option. The following are the limitations for using this option.
- The file name/file cannot be used in other data ingestion projects. An error occurs if such an operation is performed.
- While editing an existing Data Ingest Job, you cannot modify the existing Matrix Data-Object. You need to create another Data Ingest Job with a different Matrix Data-Object for data ingestion.
- On any given day, you can upload up to five files with a maximum size of up to 200MB. Each file has to be uploaded with a minimum gap of two hours.
- Time-based schedule: Schedule the job to run daily, weekly, or monthly.
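The Post file upload limits above (up to five files per day, at least two hours apart) can be checked against a planned upload schedule before any files are sent. This is a rough sketch with a hypothetical function name; it assumes that "day" means the calendar day of the timestamps you pass in, which this article does not define.

```python
from datetime import datetime, timedelta
from typing import List

MAX_FILES_PER_DAY = 5          # stated daily limit
MIN_GAP = timedelta(hours=2)   # stated minimum gap between uploads


def schedule_problems(upload_times: List[datetime]) -> List[str]:
    """Return a list of human-readable violations; empty means the plan fits."""
    problems = []
    times = sorted(upload_times)

    # Count uploads per calendar day (assumption: day of the given timestamps).
    per_day = {}
    for t in times:
        per_day[t.date()] = per_day.get(t.date(), 0) + 1
    for day, count in per_day.items():
        if count > MAX_FILES_PER_DAY:
            problems.append(f"{day}: {count} uploads exceed the limit of {MAX_FILES_PER_DAY}")

    # Check the gap between each pair of consecutive uploads.
    for earlier, later in zip(times, times[1:]):
        if later - earlier < MIN_GAP:
            problems.append(f"{later}: only {later - earlier} after the previous upload")
    return problems
```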
Note: When you use the S3 Connector to upload data (files) into MDA, you can learn whether the data load succeeded or failed through the notification mechanism. A webhook notification is sent to the Callback URL that you enter. The Callback URL must use HTTPS, support the POST method, and return a success response of 2XX. Header values are submitted as key-value pairs. Admins can test the URL by using the TEST IT ONCE button, which sends the message “TestMessage” to the endpoint.
Users receive two messages at the endpoint:
i. TestMessage, which is used for validating the URL.
ii. The notification at the endpoint that contains the following fields:
- S3 Job Id (Project Id)
- S3 Project Name
- Time taken (in milliseconds)
- Total number of rows
- Succeeded rows
- Failed rows
- S3 error file name
- Status (Failure, success, or partial success)
- Status Id
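A receiving endpoint only needs to accept the POST and answer with a 2XX status. The sketch below uses Python's standard http.server module and is an assumption-laden illustration: the article does not document the body format, so the code assumes the test call carries the text TestMessage and that real notifications arrive as a JSON object; no payload key names are assumed. It serves plain HTTP, so in practice it would sit behind something that terminates HTTPS. `parse_callback`, `CallbackHandler`, and `make_server` are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_callback(body: bytes):
    """Classify a callback body as 'test', 'notification', or 'unparsed'."""
    text = body.decode("utf-8", errors="replace").strip()
    if text.strip('"') == "TestMessage":
        return ("test", None)
    try:
        payload = json.loads(text)  # assumption: notifications are JSON
    except ValueError:
        return ("unparsed", text)
    if not isinstance(payload, dict):
        return ("unparsed", text)
    return ("notification", payload)


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        kind, payload = parse_callback(self.rfile.read(length))
        self.log_message("received %s callback: %r", kind, payload)
        self.send_response(200)  # any 2XX status signals success
        self.end_headers()


def make_server(port: int = 8080) -> HTTPServer:
    """Build a local server; call serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), CallbackHandler)
```

Calling `make_server().serve_forever()` starts the listener; the TEST IT ONCE button can then be used to confirm that the endpoint responds.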
- Click SAVE. A success message appears once the data ingest job is saved successfully. In addition, you can check the execution history using View Execution History.
In case of failure, you can click on the status of a particular data ingest job to view the cause. Also, the Failed column contains a link to download the error file which contains the failure reason of a job.
How to Upsert Data into Matrix Data-Object
You must create a data ingest job to perform an Upsert on an existing Matrix Data-Object.
To upsert data into a Matrix Data-Object:
- Navigate to S3 Connector > [Click on the desired data ingest job].
- Select the Upsert check box; then add appropriate fields in Select key fields to identify unique records.
- Click PROCEED TO FIELD MAPPING.
- In the Field mapping tab, map Target Object Field to Source CSV Field appropriately, if required.
- In the Schedule tab, click RUN NOW, or set a recurring schedule using the Set recurring schedule check box.
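The role of the key fields in an Upsert can be shown with a generic sketch: a row whose key-field values match an existing record updates that record, and any other row is inserted. This is a plain illustration of upsert semantics with made-up field names, not Gainsight's implementation.

```python
from typing import Dict, List, Tuple


def upsert(
    existing: List[Dict[str, str]],
    incoming: List[Dict[str, str]],
    key_fields: List[str],
) -> Tuple[List[Dict[str, str]], int, int]:
    """Return (merged_rows, inserted_count, updated_count)."""
    # Index copies of the existing rows by their key-field values.
    index = {tuple(row[k] for k in key_fields): dict(row) for row in existing}
    inserted = updated = 0
    for row in incoming:
        key = tuple(row[k] for k in key_fields)
        if key in index:
            index[key].update(row)  # same key: overwrite the supplied fields
            updated += 1
        else:
            index[key] = dict(row)  # new key: add the record
            inserted += 1
    return list(index.values()), inserted, updated
```

For example, with `key_fields=["AccountId", "Date"]` (hypothetical names), a CSV row for an account and date that already exist replaces the stored values instead of creating a duplicate record.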
Viewing Execution History, S3 Configuration
- Once you have created a data ingest job and have performed a data load operation, you can view the execution history using View Execution History as shown in the image below.
- You can see successful and failed jobs on the S3 Execution Logs page as shown below. For a failed job, you can click the Failure status of that data ingest job to view the cause. The Failed column also contains a link to download the error file, which contains the reason the job failed.
- Click View S3 Config to view or to configure Amazon S3 for Gainsight as shown below. It provides Bucket Access Path, Access Key, and Security Token for S3 bucket connection.
- Click TEST to test the S3 bucket connection. When the connection is good, the message Connection Successful appears.
Troubleshoot Data Load Operation
You can check the data load operation details on Amazon S3:
- archived: Once the CSV file has been used for a data load operation, the file is moved from the input folder of the S3 bucket to the archived folder.
- error: This folder contains the error files, which list the records that failed to load.
- input: This folder contains the CSV file to be used for data import.
How to Use Cyberduck Tool
1. Download and install the Cyberduck tool from https://cyberduck.io/?l=en
2. Click Open Connection and fill in the required info as shown in the image below:
3. You will find the list of all the folders, one each for each gainsight123 configured in S3 connector. Inside this folder, there will be 3 subfolders: input, archived, and error.
4. Navigate to the “input” folder and upload files to it by selecting the “upload” option in the File menu or by right-clicking.
S3 Limitations
- Unique file name for post file upload: You must specify a unique file name for each Event Based Data Ingest project (Post File Upload). In addition, the file name must not be a suffix of a file name that already exists.
- 500 MB file size limitation: The S3 connector supports files up to 500 MB only. The suggested size is 200 MB.
- Set up the project before uploading the file: Files that exist in the bucket before "post file upload" is set up are not picked up.
- Upload the file with the exact file name: A file that is uploaded and then renamed to match the file name set in the project is not ingested.
- No delay time can be configured: There is no provision to configure a delay time for "post file upload." The file is ingested immediately.
- Upload a new file only after processing of the previous file has started, or after the previous file has been moved to the archive folder. If the previous file is still in the input folder, the new file overwrites the older file.
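The file-name rule for Post File Upload projects (unique, and not a suffix of a file name that already exists) can be checked with a few lines of code. This is a minimal sketch with a hypothetical helper name and made-up file names.

```python
from typing import List


def suffix_conflicts(new_name: str, existing_names: List[str]) -> List[str]:
    """Return existing file names that end with new_name.

    An exact duplicate also ends with new_name, so it is reported as well.
    """
    return [name for name in existing_names if name.endswith(new_name)]
```

For instance, `usage.csv` conflicts with an existing `daily_usage.csv`, while `usage_v2.csv` does not.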