Version: v1.13

Workflows

Amorphic Workflows help you visualize and orchestrate complex analytical pipelines using Amorphic jobs (ETL), machine learning model inference tasks, email notification tasks, and Textract, Translate, Comprehend and Medical Comprehend tasks.

Workflows

Amorphic Workflows manage the execution and monitoring of all their components. You can create a dependency chain (a directed acyclic graph) of components of type ETL job, ML model inference job, email notification, Textract, Translate, Comprehend and Medical Comprehend to perform complex analytical tasks.

The Amorphic Workflows page provides options to list workflows or create a new workflow. You can sort the workflow list by fields such as name, created by, and creation time.

Create Workflow

You can create new workflows using the "Create Workflow" functionality of the Amorphic application.

A new workflow needs at least one node.

To create a node, import a pre-existing module provided by Amorphic. The following fields are needed to create a node:

| Attribute | Description |
| --- | --- |
| Module Type | A module is a predefined entity on which a node is built. Amorphic currently supports these module types: ETL Jobs, ML model inference jobs, Email notifications, Textract, Translate, Comprehend, Medical Comprehend, Sync To S3 and File Load Validation. |
| Resource | A list of resources is shown based on the selected module type. For example, if the module type ETL Job is selected, all the ETL jobs the user has access to are displayed for the user to choose from. |
| Node Name | Name given to the node for quick and easy identification. |
| Input Configurations | Arguments that can be used in the job. |
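
To make the four fields concrete, here is a minimal sketch of how a node definition could be modelled and checked in Python. The `WorkflowNode` class, its field names and the validation rules are assumptions made for illustration only; they are not Amorphic's API or its request format. The module type names are taken from the table above.

```python
from dataclasses import dataclass, field
from typing import Dict

# Module types listed in the table above.
SUPPORTED_MODULE_TYPES = {
    "ETL Jobs",
    "ML model inference jobs",
    "Email notifications",
    "Textract",
    "Translate",
    "Comprehend",
    "Medical Comprehend",
    "Sync To S3",
    "File Load Validation",
}


@dataclass
class WorkflowNode:
    """Hypothetical in-memory model of a workflow node (not an Amorphic type)."""

    module_type: str
    resource: str
    node_name: str
    input_configurations: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject module types that the table does not list.
        if self.module_type not in SUPPORTED_MODULE_TYPES:
            raise ValueError(f"Unsupported module type: {self.module_type}")
        # A node name is needed to identify the node.
        if not self.node_name.strip():
            raise ValueError("node_name must not be empty")


# Example node built on an ETL job; the resource name is illustrative.
node = WorkflowNode(
    module_type="ETL Jobs",
    resource="ReadCustomerDetails",
    node_name="read_customers",
    input_configurations={"batch_size": "500"},
)
```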

The image below shows how to create a new workflow:

Create_workflow

Users can also create a workflow using the "Navigator", which takes the user to the workflow creation page from anywhere in the application. To display the Navigator, press the "Ctrl" key twice.

Below is a simple graphic to demonstrate Navigator.

Navigator

Workflow execution properties

Workflow execution properties are the key-value properties that can be defined while creating a workflow or editing an existing workflow. The properties can be retrieved and optionally modified programmatically during the workflow execution.

Workflow execution properties

Retrieving workflow execution properties:

```python
import sys

import boto3
from awsglue.utils import getResolvedOptions

# The workflow name and run ID are read from the job arguments.
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])
workflow_name = args['WORKFLOW_NAME']
workflow_run_id = args['WORKFLOW_RUN_ID']

# Fetch the execution properties of the current workflow run.
glue_client = boto3.client("glue")
workflow_params = glue_client.get_workflow_run_properties(
    Name=workflow_name, RunId=workflow_run_id
)["RunProperties"]

# Read the individual properties defined on the workflow.
email_to = workflow_params['email_to']
email_body = workflow_params['email_body']
email_subject = workflow_params['email_subject']
file_name_ml_model_inference = workflow_params['file_name_ml_model_inference']
```
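
If a property is not defined on the workflow, the dictionary lookups above raise a bare `KeyError`. The following is a minimal sketch, assuming you prefer to fail early with a message that names every missing key. `require_properties` is a hypothetical helper, not part of Amorphic or boto3, and the sample dictionary stands in for the value returned under "RunProperties".

```python
from typing import Dict, Iterable


def require_properties(params: Dict[str, str], required: Iterable[str]) -> Dict[str, str]:
    """Return the required properties, or raise KeyError listing the missing ones."""
    required_keys = list(required)
    missing = sorted(key for key in required_keys if key not in params)
    if missing:
        raise KeyError("Missing workflow execution properties: " + ", ".join(missing))
    return {key: params[key] for key in required_keys}


# Plain dictionary standing in for workflow_params (values are illustrative).
sample_params = {"email_to": "team@example.com", "email_subject": "Daily report"}
email_settings = require_properties(sample_params, ["email_to", "email_subject"])
```

In a job, `sample_params` would be replaced by the `workflow_params` dictionary retrieved from the workflow run.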

Modifying workflow execution properties:

```python
import sys

import boto3
from awsglue.utils import getResolvedOptions

# The workflow name and run ID are read from the job arguments.
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])
workflow_name = args['WORKFLOW_NAME']
workflow_run_id = args['WORKFLOW_RUN_ID']

# Fetch the current execution properties of the workflow run.
glue_client = boto3.client("glue")
workflow_params = glue_client.get_workflow_run_properties(
    Name=workflow_name, RunId=workflow_run_id
)["RunProperties"]

# Change a property and write the full set of properties back.
workflow_params['email_subject'] = 'Coupon: Grab and go!'
glue_client.put_workflow_run_properties(
    Name=workflow_name, RunId=workflow_run_id, RunProperties=workflow_params
)
```
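
Glue workflow run properties are string-to-string pairs, so a structured value such as a list has to be serialized before it is written. The sketch below assumes JSON is an acceptable encoding. `encode_property` and `decode_property` are hypothetical helper names, and `recipients` is an illustrative key that Amorphic does not define.

```python
import json
from typing import Any, Dict


def encode_property(params: Dict[str, str], key: str, value: Any) -> Dict[str, str]:
    """Return a copy of params with value stored under key as a JSON string."""
    updated = dict(params)
    updated[key] = json.dumps(value)
    return updated


def decode_property(params: Dict[str, str], key: str) -> Any:
    """Parse the JSON string stored under key back into a Python value."""
    return json.loads(params[key])


# Plain dictionary standing in for the retrieved run properties.
run_properties = {"email_subject": "Coupon: Grab and go!"}
run_properties = encode_property(
    run_properties, "recipients", ["a@example.com", "b@example.com"]
)
recipients = decode_property(run_properties, "recipients")
```

The updated dictionary contains only strings, so it can be passed as `RunProperties` to `put_workflow_run_properties`; a downstream node then calls `decode_property` after retrieving the properties.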

Workflow nodes

View list of workflow nodes

Edit Workflow

Workflow metadata can be changed, and nodes can be added to or deleted from a workflow, by clicking the Edit Workflow button.

Edit_workflow

Run Workflow

Amorphic workflows can be triggered on demand or on a schedule. The Run Workflow button is on the workflow details page.

Workflow_execution

On-demand execution

A workflow can be triggered on demand using the Run button. Executions are listed under the Executions tab, as shown below:

Ondemand_workflow_execution

Scheduled execution

A schedule can be created to trigger a workflow periodically. The schedule can be enabled or disabled at any time.

Schedule_workflow_execution

Stop Workflow execution

A workflow execution can be stopped using the 'Stop Execution' option under the more options icon (vertical ellipsis).

Stop workflow execution

Once the workflow execution has stopped successfully, the user can restart it using the 'Resume Execution' option under the more options icon (vertical ellipsis).

Resume workflow execution

Flexibility to trigger nodes

With this feature, users can choose whether a node runs on the success or the failure of the preceding node. This flexibility covers a wide variety of use cases, from triggering an email node when an ETL job fails to orchestrating complex ETL workflows.

The following graphic shows a sample workflow. Here the ETL job SendPromotionalEmails is configured to run after ReadCustomerDetails succeeds. If the ReadCustomerDetails job fails, an email node called FailureAlertGenerator sends emails to the concerned entities.

Workflow demonstrating flexible node trigger

In the following workflow execution, the job ReadCustomerDetails failed, so the email node FailureAlertGenerator was triggered and the job SendPromotionalEmails stayed in the not_started state.

Workflow execution demonstrating flexible node trigger

The graphic below shows a sample complex ETL process:

Workflow demonstrating complex etl process

In the above workflow:

  • node_one runs when the workflow starts.
  • node_two runs only when node_one succeeds.
  • node_three runs only when node_two succeeds.
  • node_four runs only when all of the following are true: node_six succeeds, node_two succeeds and node_three fails.
  • node_five runs when the workflow starts.
  • node_six runs only when both of the following are true: node_five fails and node_one succeeds.
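
The trigger rules in the list above can be sketched as a small simulation. This is an illustration of the rules only, not Amorphic's scheduler: the `TRIGGERS` table, the `simulate` function and the outcome names "succeeded" and "failed" are assumptions, while `not_started` is the state mentioned earlier for nodes whose conditions are not met.

```python
from typing import Dict, List, Tuple

# node -> list of (upstream node, required outcome).
# An empty list means the node runs when the workflow starts.
TRIGGERS: Dict[str, List[Tuple[str, str]]] = {
    "node_one": [],
    "node_two": [("node_one", "succeeded")],
    "node_three": [("node_two", "succeeded")],
    "node_four": [
        ("node_six", "succeeded"),
        ("node_two", "succeeded"),
        ("node_three", "failed"),
    ],
    "node_five": [],
    "node_six": [("node_five", "failed"), ("node_one", "succeeded")],
}


def simulate(outcomes: Dict[str, str]) -> Dict[str, str]:
    """Return the final state of every node.

    `outcomes` gives the result a node would have if it ran; nodes that are
    not listed are assumed to succeed.
    """
    state: Dict[str, str] = {}
    pending = set(TRIGGERS)
    progressed = True
    while pending and progressed:
        progressed = False
        for node in sorted(pending):
            conditions = TRIGGERS[node]
            # Wait until every upstream node has reached a final state.
            if any(upstream not in state for upstream, _ in conditions):
                continue
            if all(state[upstream] == required for upstream, required in conditions):
                state[node] = outcomes.get(node, "succeeded")
            else:
                state[node] = "not_started"
            pending.discard(node)
            progressed = True
    return state


# node_three and node_five fail, so node_six and then node_four are triggered.
result = simulate({"node_three": "failed", "node_five": "failed"})
```

When every node succeeds, node_six and node_four stay in `not_started`, because node_six requires node_five to fail and node_four requires node_six to succeed.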

All existing workflows and workflow executions will work as usual without user intervention. Users can edit existing workflows and set up node triggers of their choice.

Execution Logs

Amorphic supports log retrieval only for nodes of type ETL Job. Logs can be downloaded from the execution details of the workflow. Users can download the execution output logs and error logs, if any, through the more options (three dots) menu. There are three log options.

If the user opts to download the full logs, log file creation is initiated and the status is set to 'triggered'. Once the log file is created, the status changes to 'available' and the user receives an email. The user can then download the full logs using the same 'Output Logs (All)' option.

  • Output Logs (latest 1 MB): The latest 1 MB of the output logs for the job execution. This option downloads the logs immediately. If no logs are available, the message 'No output logs available for the execution' is displayed.
  • Output Logs (All): All of the output logs for the job execution. This option initiates log file creation. If there are no output logs, the log file will be empty.
  • Error Logs: Error logs for the job execution. If no logs are available, the message 'No error logs available for the execution' is displayed.

Download workflow execution logs

View Workflow executions

All executions of a workflow are listed under the "Executions" tab on the workflow details page.

Workflow_executions

Clicking on execution details shows the details of a particular execution, such as the visual workflow, execution statistics, and the execution status of each node.

Workflow_execution_details

Node-level details such as node execution time, error messages, workflow ID (in the case of a workflow node), start time and end time can be found by clicking on more details in the node list grid below the visual workflow. Also, ChildResourceName is displayed as WorkflowName and ChildResourceExecutionId as WorkflowExecutionId.

Workflow_node_details

List Workflows

Users can see the list of workflows they have access to. They can limit the number of results shown per page using the Results Per Page option, and sort them by the desired field and order.

List_of_workflows

View Workflow Details

Authorized Users

This tab shows the list of users authorized to perform operations on the workflow. The owner, that is, the user who created the workflow or has owner access to it, can grant workflow access to any other user in the system.

There are two access types:

| Access Type | Description |
| --- | --- |
| Owner | This user has permission to edit the workflow and to grant other users access to the workflow. |
| Read-only | This user has limited permissions on the workflow, such as viewing the details of the selected workflow. |

Authorized Groups

This tab shows the list of groups authorized to perform operations on the workflow. A group is a list of users given access to a resource. Groups are created under User Profile -> Profile & Settings -> Groups.

There are two access types:

| Access Type | Description |
| --- | --- |
| Owner | This group of users has permission to edit the resources and to grant other users or groups access to the resources. |
| Read-only | This group has limited permissions on the resources, such as viewing the details. |

Clone Workflow

Users can clone a workflow in Amorphic by clicking the clone button in the top right corner of the workflow details page.

The clone workflow page is auto-populated with the metadata of the workflow being cloned, which reduces the effort of filling in every field required to register the workflow.

The only field the user must change is the "Workflow Name", because a workflow cannot be created with a name that already exists. The user can edit any other field before clicking the "Submit" button in the bottom right corner of the form.

Below is the graphic pointing to the populated fields in clone workflow form.

Clone workflow

Once the user clicks the "Submit" button, a new workflow will be created. The created workflow will show up in the workflows page.

Delete Workflow

A workflow can be deleted using the "Delete" (trash) icon in the right corner of the page. Once deletion is triggered, all the related metadata is deleted immediately.

Delete workflow