Version: v1.13

Endpoints

An Endpoint is an environment in which you can develop, test, and run AWS Glue scripts. It is the platform on which Jupyter notebooks are installed and scripts are run, and it provides the connectivity between your machine and AWS.

The following picture depicts the Glue Endpoint page in Amorphic:

Glue Endpoint Homepage

Amorphic Endpoints contain the following information:

Endpoints Information

Type | Description
Glue Endpoint Name | Name that uniquely identifies the Glue endpoint
Description | A brief explanation of the Glue endpoint
Glue Endpoint Status | Status of the Glue endpoint, for example provisioning or ready
Capacity | The number of AWS Glue Data Processing Units (DPUs) allocated to this endpoint
Glue Python Version | Python version supported for running your ETL scripts on development endpoints
Auto Terminate | Status of auto-termination, for example Enabled or Disabled
Network Configuration | Subnet (public or private) in which the endpoint is deployed and provisioned
Remaining Time | Amount of time (in hours) left until auto-termination
Auto Termination Time | Time at which the system auto-terminates the Glue endpoint
Public Keys | A list of public keys used by the endpoint for authentication
Extra Jars S3 Path | The path to one or more Java .jar files in an S3 bucket that should be loaded in the endpoint
Extra Python Libs S3 Path | The paths to one or more Python libraries in an Amazon S3 bucket that should be loaded in the endpoint
CreatedBy | User who created the Glue endpoint
LastModifiedBy | User who most recently updated the Glue endpoint

Endpoint Operations

Amorphic provides the following operations for a Glue Endpoint.

Create Endpoint

Endpoint Create 1

To create a Glue development endpoint in the platform, the following information is required:

  • Endpoint Name: Name of the endpoint which uniquely identifies the functionality of the endpoint.

  • Description: Brief description of the endpoint.

  • Capacity: Number of AWS Glue Data Processing Units (DPUs) to allocate to this endpoint.

  • Glue Python Version: Python version for Glue. Select either 2 or 3.

  • Auto Terminate: Whether to enable or disable auto-termination on the endpoint. When enabled, the endpoint is terminated based on the termination time provided by the user, which saves resource costs. The auto-termination process is triggered every hour, looks for endpoints that need a notification or deletion, and sends an email when one of the criteria below is met.

    The user receives a notification email when:

    • the termination time is less than 30 minutes away from an auto-termination process run (which happens every whole hour).
    • the auto-termination process successfully deleted the endpoint after the termination time.
    • the auto-termination process could not delete the endpoint because of a dependent ETL Notebook or any other error.
  • Auto Termination Time: The time at which the user wants the endpoint to be auto-terminated. The termination time must be less than 168 hours (7 days) away. Once the current time is later than the termination time, the termination process deletes the endpoint at the next whole hour. The user can modify the termination time by selecting "Edit Endpoint" in the details page, where the same limit of less than 168 hours (7 days) applies.

    Note
    • The auto-termination process is scheduled to run every hour on the hour (for example 6:00, 7:00, 8:00, 9:00).
    • The user receives an email notification only when subscribed to alerts. Refer to Alert Preferences to enable alerts.
    • When the termination time elapses, the auto-termination process deletes the endpoint together with all the metadata related to it. This cannot be undone.
  • Network Configuration: There are three types of network configuration: Public, App-Public, and App-Private.

    • Public and App-Public endpoints have direct access to the internet.
    • App-Public endpoints are deployed in the public subnet of the Amorphic application, whereas Public endpoints are deployed in the AWS default VPC subnets.
    • App-Private endpoints do not have direct access to the internet. They are deployed in the private subnet of the Amorphic application VPC.
  • Extra Python Libs S3Path: Path or paths to one or more Python libraries in an S3 bucket that should be loaded in the endpoint. Separate multiple paths with commas.

  • Extra Jars S3Path: Path or paths to one or more Java jars in an S3 bucket that should be loaded in the endpoint. Separate multiple paths with commas. Only pure Java/Scala libraries can be used.

  • Datasets Write Access: Datasets to which the endpoint requires write access.

  • Datasets Read Access: Datasets to which the endpoint requires read access.

  • Keywords: Keywords to associate with the endpoint.

  • Public Keys: The user can specify a list of public keys which are used by the Endpoints for authentication. This is an optional field.

    You can generate the key using:

    ssh-keygen -t rsa -C your_email@example.com

    The generated public key file has the following format:

    ssh-rsa <key> <email>

    Use <key> as the public key in the platform when creating an endpoint.
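As an illustration of the public key format described above, the following minimal Python sketch pulls the <key> field out of a line of the form `ssh-rsa <key> <email>`. The function name `extract_public_key` and the sample line are hypothetical and not part of Amorphic; the sample key is a shortened placeholder, not a usable key.

```python
def extract_public_key(pub_line: str) -> str:
    """Return the <key> field from a line of the form 'ssh-rsa <key> <email>'.

    Hypothetical helper for illustration; not an Amorphic API.
    """
    parts = pub_line.strip().split()
    # The first field is the key type, the second is the key itself;
    # the trailing comment (the email passed with -C) is optional.
    if len(parts) < 2 or parts[0] != "ssh-rsa":
        raise ValueError("expected a line of the form 'ssh-rsa <key> <email>'")
    return parts[1]


# Placeholder line standing in for the contents of the generated .pub file.
sample_line = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAB your_email@example.com\n"
print(extract_public_key(sample_line))  # prints AAAAB3NzaC1yc2EAAAADAQABAAAB
```

With the command shown above and default settings, ssh-keygen typically writes the public key to `~/.ssh/id_rsa.pub`, so the content of that file is what would be passed to the helper.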

Endpoint Create 2

View Endpoint

A user with sufficient permissions can view all the endpoint information by clicking the endpoint name on the "Endpoints" page under the ETL section.

All the information specified while creating the endpoint is displayed in the details page. In addition, a Message field is displayed in the following scenarios:

  • If the endpoint status is failed, the failure information is shown in the Message field.
  • If the user does not have access to all the datasets required by the endpoint, the IP address of the endpoint is hidden and the missing dataset access is listed in the Message field.

The following details are displayed on the endpoint details page:

View Endpoint

The following details are displayed when the user enables auto-termination on the endpoint. Remaining Time denotes the amount of time left until auto-termination, rounded up to the next whole hour.

In the image below, the auto-termination time is set to 02 Jun, 2021 07:18 PM, but the endpoint will be deleted at 02 Jun, 2021 08:00 PM because the termination process runs on the whole hour.
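The whole-hour rule in this example can be worked through with a short Python sketch. All function names here are hypothetical and only illustrate the arithmetic; they are not Amorphic APIs. Two assumptions are made: a termination time that falls exactly on the hour is handled by that hour's run, and Remaining Time is measured against the termination time rather than the deletion time.

```python
import math
from datetime import datetime, timedelta

# Limit stated in this page: the termination time must be less than 168 hours away.
MAX_TERMINATION_HOURS = 168


def next_whole_hour(moment: datetime) -> datetime:
    """Return the first whole hour at or after `moment`.

    Assumption: a time exactly on the hour is picked up by that hour's run.
    """
    floored = moment.replace(minute=0, second=0, microsecond=0)
    return floored if floored == moment else floored + timedelta(hours=1)


def remaining_hours(now: datetime, termination_time: datetime) -> int:
    """Hours left until the termination time, rounded up, never negative."""
    seconds_left = (termination_time - now).total_seconds()
    return max(0, math.ceil(seconds_left / 3600))


def is_within_limit(now: datetime, termination_time: datetime) -> bool:
    """Check that the termination time is in the future and under 168 hours away."""
    return now < termination_time < now + timedelta(hours=MAX_TERMINATION_HOURS)


# The example from this page: termination set to 02 Jun 2021, 07:18 PM.
termination = datetime(2021, 6, 2, 19, 18)
print(next_whole_hour(termination))  # prints 2021-06-02 20:00:00
```

Under these assumptions, an endpoint viewed at 05:30 PM on the same day would show a Remaining Time of 2 hours, since 1 hour 48 minutes rounds up to 2.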

The details page also displays the Estimated Cost of the endpoint, which is the approximate cost incurred since its creation.

Endpoint Termination

Edit Endpoint

Endpoint details can be edited using the Edit Endpoint button. Some changes are reflected in the Details page immediately, while others are applied asynchronously in the backend. The lists below show which fields fall into each group:

Fields that get updated immediately:

  • Description
  • Auto Terminate and Auto Termination Time
  • Keywords
  • Datasets Write/Read Access

Fields that are updated asynchronously:

  • Glue Python Version
  • Extra Python Libs S3Path
  • Extra Jars S3Path
  • Public Keys

When any of the asynchronously updated fields is edited, the endpoint status changes to update_in_progress. Refreshing the page after a few minutes shows the status as ready once the update has completed.
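The same wait-and-refresh behaviour can be sketched in Python for anyone scripting around it. This is only an illustration under stated assumptions: `get_status` is a hypothetical callable supplied by the caller that returns the current endpoint status string; this page does not describe an Amorphic API for fetching it. Only the status values update_in_progress and ready come from the text above.

```python
import time
from typing import Callable


def wait_for_update(
    get_status: Callable[[], str],
    interval_seconds: float = 60.0,
    max_checks: int = 30,
) -> str:
    """Poll until the status leaves 'update_in_progress' or checks run out.

    `get_status` is a hypothetical caller-supplied function; how the status
    is actually fetched is outside the scope of this sketch.
    Returns the last status observed (for example 'ready').
    """
    status = get_status()
    checks = 1
    while status == "update_in_progress" and checks < max_checks:
        time.sleep(interval_seconds)  # wait before looking again
        status = get_status()
        checks += 1
    return status


# Simulated status sequence standing in for repeated page refreshes.
simulated = iter(["update_in_progress", "update_in_progress", "ready"])
print(wait_for_update(lambda: next(simulated), interval_seconds=0))  # prints ready
```

Returning the last observed status, rather than a boolean, lets the caller distinguish a completed update from one that is still in progress after the final check.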

The Edit Endpoint page is divided into two sections:

  • Basic Info: Use this section to update the basic details of an endpoint.

    Edit Endpoint details (Basic Info)

  • Datasets: Use this section to update the datasets to which the endpoint requires access.

    Edit Endpoint details (Datasets)

Delete Endpoint

A user with sufficient permissions can delete an endpoint using the Delete (trash) button on the right side.

Note

The endpoint must be in the ready state before it can be deleted.

Update Extra Resource Access

To give an endpoint access to a large number of parameters, shared libraries, or datasets, refer to the documentation on How to provide large number of resources access to an ETL Entity in Amorphic.