Version: v1.13

Dataload Limits

Note

The dataset file upload process depends entirely on the functionality described below. It applies ONLY to datasets of type append and update.

How dataload limits behave depends on whether Dataload throttling is enabled.

  • If Dataload throttling is enabled, dataset files are processed through the dataload limits queue and complete over time, depending on the number of files the user uploaded.
  • If Dataload throttling is disabled, dataset files are processed and completed immediately.

Users can enable or disable Dataload throttling with the toggle button.

Dataload Throttling

Important Information

No manual intervention is required to enable or disable Dataload throttling. The Amorphic system manages it automatically based on the dataload, i.e. the number of files being processed across the entire application at that point in time. Users can still enable or disable the throttling setting manually if required.

Dataload Throttling automatic process

  • If the number of files being processed is within the throttle limit, the Amorphic system disables throttling.
  • If the number of files being processed reaches the throttle limit, the Amorphic system enables throttling automatically and files are processed through the queue.
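
The two rules above amount to a single threshold check. The following is a minimal sketch of that decision, not Amorphic's actual implementation: the function name `should_throttle` and its parameters are hypothetical, and it assumes that "reaches the limit" means the count is equal to or greater than the limit.

```python
def should_throttle(files_in_progress: int, throttle_limit: int) -> bool:
    """Return True when throttling should be enabled (hypothetical helper).

    Assumption: throttling turns on as soon as the number of files being
    processed reaches the limit, and stays off while it is below the limit.
    """
    if files_in_progress < 0 or throttle_limit <= 0:
        raise ValueError("counts must be non-negative and the limit positive")
    return files_in_progress >= throttle_limit


if __name__ == "__main__":
    print(should_throttle(120, 300))  # below the limit: throttling off
    print(should_throttle(300, 300))  # limit reached: throttling on
```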

Dataload Limits

Dataload limits let users set the batch limit for processing files uploaded to a dataset. The limits are different for each target location (S3, S3Athena, Lakeformation, Dynamodb, AuroraMySQL).

Example: If a user uploads 1000 files to an S3 dataset, the files are processed according to the S3 limit. If the S3 limit is 300, at most 300 files are processed in parallel at the system level and the remaining files wait in the queue. The system polls the queue every 3 minutes and triggers file processing according to the configured limits. The allowed ranges for each target location are based on AWS limits and performance tests.
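
The arithmetic in this example can be sketched as a small simulation. This is an illustration only, not how Amorphic is implemented: `simulate_queue` is a hypothetical helper, and it assumes that the first batch starts at minute 0 and that every batch finishes before the next poll, so each poll can release a full batch. In practice a poll would release only as many files as there are free slots under the limit.

```python
from collections import deque

POLL_INTERVAL_MINUTES = 3  # polling interval stated in the example


def simulate_queue(total_files: int, limit: int,
                   poll_interval: int = POLL_INTERVAL_MINUTES):
    """Return a list of (minute, files_started) tuples, one per poll.

    Assumption: all files started at one poll complete before the next
    poll, so each poll starts up to `limit` files from the queue.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    queue = deque(range(total_files))  # file ids waiting in the queue
    schedule = []
    elapsed = 0
    while queue:
        batch_size = min(limit, len(queue))
        for _ in range(batch_size):
            queue.popleft()  # hand the file over for processing
        schedule.append((elapsed, batch_size))
        elapsed += poll_interval
    return schedule


if __name__ == "__main__":
    for minute, started in simulate_queue(1000, 300):
        print(f"minute {minute}: started {started} files")
```

Under these assumptions, 1000 files with a limit of 300 are released in four polls (300, 300, 300, and 100 files), with the last batch starting at minute 9.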

Users can also view the count of recent dataload executions and the number of messages waiting in the SQS queues to be processed.

Dataload Limits is available on the 'Infra Management' page under the 'Management' section of the side menu. Click the 'Dataload Limits' tab to open it.

When the tab loads, it shows the dataload limits for all applicable target locations.

Dataload Limits Homepage

Update Dataload Limits

Users can update the dataload limits for all applicable target locations using the 'Set Limits' option under the more options icon (the vertical ellipsis).

In the 'Set Dataload Limits' popup, enter the new values in the fields for the respective target locations and click the 'Update Limits' button to apply them.

The following picture depicts how to set the dataload limits:

Update Dataload Limits

Note
  • The minimum and maximum values for each limit are displayed in the respective helper tooltips.
  • If the page does not show the new limits after a successful update, refresh the limits after a couple of seconds. The delay might be caused by the AWS SSM (Parameter Store) service.

View Lambda Concurrency and Dataload Statistics

Users can view the account-level Lambda concurrency and refresh it to get the latest value. The UI fetches the latest value on every page load, so a manual refresh is needed only when a newer value is required.

Users can view the following dataload statistics for all applicable target locations:

  • Recent Executions: Number of recent file load executions across the application. Hover over the 'time' icon beside the count to view the time of the most recent execution.
  • Current Messages in Queue: Number of messages available in the queue across the application.

Users can get the latest dataload statistics at any time using the 'Refresh' option beside the section labels.

The following picture displays the lambda concurrency and dataload statistics:

View Dataload Statistics