This documentation is for version current of the product.
For the latest version(v2.7) documentation click here

Use Glue Sessions in Amorphic Studio

Users can leverage Glue Sessions in AWS Glue Studio using JupyterLab with the Glue PySpark kernel to streamline data processing and transformation workflows.

Usage

  1. Create a SageMaker Studio: Follow the detailed instructions in Create a Studio to set up your Studio environment.

  2. Copy the Studio ID: Once your studio is created, make sure to copy the unique studio ID for future reference.

  3. Locate the User Profile Role: In the AWS console, search IAM for the user profile role associated with your SageMaker Studio. The role name has the format: {ProjectShortName}-custom-{studio_id}-usr-Role.

IAM user profile role for studio

  4. Update Trust Relationship: Add glue.amazonaws.com to the trust relationship of the user profile role so that the Glue service can assume the role.

IAM Trust relationship for user profile role
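
As an illustration of the trust relationship change, the following minimal Python sketch adds glue.amazonaws.com to a trust policy document without touching the principals already present. The starting policy shown here (a single statement trusting sagemaker.amazonaws.com) is an assumption for the example; the actual trust policy on your user profile role may differ. The helper name add_glue_to_trust_policy is hypothetical, and the sketch only builds the JSON document; it does not call AWS.

```python
import copy
import json

GLUE_SERVICE = "glue.amazonaws.com"


def add_glue_to_trust_policy(trust_policy: dict) -> dict:
    """Return a copy of trust_policy in which the Glue service may assume the role.

    Hypothetical helper for illustration; it edits the policy document only.
    """
    updated = copy.deepcopy(trust_policy)
    for statement in updated.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict) or "Service" not in principal:
            continue
        # "Service" may be a single string or a list of strings.
        services = principal["Service"]
        if isinstance(services, str):
            services = [services]
        if GLUE_SERVICE not in services:
            services = services + [GLUE_SERVICE]
        principal["Service"] = services
        return updated
    # No service-principal statement found: append a dedicated one.
    updated.setdefault("Statement", []).append(
        {
            "Effect": "Allow",
            "Principal": {"Service": GLUE_SERVICE},
            "Action": "sts:AssumeRole",
        }
    )
    return updated


# Assumed starting trust policy; check the real one in the IAM console.
existing = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

updated_trust_policy = add_glue_to_trust_policy(existing)
print(json.dumps(updated_trust_policy, indent=2))
```

The printed document can be pasted into the trust relationship editor of the role. Running the helper a second time leaves the policy unchanged, so it is safe to apply to a role that already trusts Glue.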

  5. Modify Inline Policy: Update the custom inline policy to include the permissions required for Glue operations. Add the following statements to the inline policy, and set the Resource of the iam:PassRole action to the ARN of the user profile role you identified earlier:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "GlueSessionPermissions",
          "Effect": "Allow",
          "Action": [
            "glue:RunStatement",
            "glue:GetStatement",
            "glue:ListStatements",
            "glue:CancelStatement",
            "glue:StopSession",
            "glue:DeleteSession",
            "glue:GetSession",
            "glue:CreateSession",
            "glue:ListSessions",
            "glue:TagResource",
            "glue:UntagResource"
          ],
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "iam:PassRole",
          "Resource": "arn:aws:iam::{account_id}:role/{ProjectShortName}-custom-{studio_id}-usr-Role"
        }
      ]
    }
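
To avoid typos when filling in the placeholders by hand, the policy above can be rendered from its three inputs. This is a minimal sketch using only the Python standard library; the function name build_glue_session_policy and the sample account ID, project short name, and studio ID are hypothetical values chosen for illustration.

```python
import json

# Glue actions copied from the inline policy statements in this guide.
GLUE_SESSION_ACTIONS = [
    "glue:RunStatement",
    "glue:GetStatement",
    "glue:ListStatements",
    "glue:CancelStatement",
    "glue:StopSession",
    "glue:DeleteSession",
    "glue:GetSession",
    "glue:CreateSession",
    "glue:ListSessions",
    "glue:TagResource",
    "glue:UntagResource",
]


def build_glue_session_policy(account_id: str, project_short_name: str, studio_id: str) -> dict:
    """Fill the {account_id}, {ProjectShortName} and {studio_id} placeholders."""
    role_name = f"{project_short_name}-custom-{studio_id}-usr-Role"
    role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GlueSessionPermissions",
                "Effect": "Allow",
                "Action": GLUE_SESSION_ACTIONS,
                "Resource": "*",
            },
            {
                "Effect": "Allow",
                "Action": "iam:PassRole",
                "Resource": role_arn,
            },
        ],
    }


# Hypothetical sample values; replace them with your own.
policy = build_glue_session_policy("123456789012", "myproj", "abc123")
print(json.dumps(policy, indent=2))
```

The printed JSON can then be pasted into the inline policy editor for the user profile role.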
  6. Create a JupyterLab Notebook: Open the studio and create a new JupyterLab notebook to start your data processing tasks.

  7. Launch Glue PySpark Kernel: Finally, launch the Glue PySpark kernel within your JupyterLab notebook to begin utilizing Glue Sessions for your data workflows.

Note: After the 3.0.1 release, Glue Sessions will be available in studios without any additional setup required. For a comprehensive guide on how to use Glue Sessions, refer to the provided link.