Version: v2.2

S3 Connections

Info

From version 2.2, encryption (in-flight and at-rest) is enabled for all jobs and the catalog. All existing jobs (both user-created and system-created) were updated with encryption-related settings, and all newly created jobs will have encryption enabled automatically.

S3 Connections are used to migrate data from a remote S3 bucket to Amorphic Data Cloud. There are two types of S3 Connections available: Bucket Policy and Access Keys.

How to create an S3 Connection?


To create an S3 Connection using a Bucket Policy, first select the bucket from which the data needs to be migrated. Once the connection is created, a bucket policy is made available for download. Attach the generated bucket policy to the source bucket that was added to the connection during creation.
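For illustration, a policy of the following shape could be attached to the source bucket with boto3. This is a minimal sketch: the actual policy generated by Amorphic may differ, and the principal role ARN and bucket name below are placeholders.

```python
import json

import boto3

SOURCE_BUCKET = "my-source-bucket"  # placeholder: the bucket added to the connection

# Hypothetical read-only policy; download the real one from the connection
# details after creation. The principal ARN here is a placeholder.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAmorphicRead",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:role/amorphic-ingestion-role"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{SOURCE_BUCKET}",
                f"arn:aws:s3:::{SOURCE_BUCKET}/*",
            ],
        }
    ],
}

# Attaching the policy requires s3:PutBucketPolicy on the source bucket.
s3 = boto3.client("s3")
s3.put_bucket_policy(Bucket=SOURCE_BUCKET, Policy=json.dumps(policy))
```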

To create an S3 Connection using Access Keys, select the bucket from which the data has to be migrated and provide the access key and secret access key of a user who has permission to read the data from the bucket.
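Before creating the connection, it can help to verify that the key pair can actually read the bucket. A minimal boto3 sketch, assuming placeholder credentials and bucket name:

```python
import boto3
from botocore.exceptions import ClientError

# Placeholders: the access key pair of the user who can read the source bucket.
s3 = boto3.client(
    "s3",
    aws_access_key_id="AKIAEXAMPLE",
    aws_secret_access_key="secretExample",
)

# Sanity check: the credentials should be able to list and read the bucket.
try:
    resp = s3.list_objects_v2(Bucket="my-source-bucket", MaxKeys=5)
    for obj in resp.get("Contents", []):
        print(obj["Key"], obj["Size"])
except ClientError as err:
    print("Credentials cannot read the bucket:", err.response["Error"]["Code"])
```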

Connection Name: A unique name for your connection.
Connection Type: The type of connection; in this case, S3.
Description: The purpose of the connection and any important information about it.
Authorized Users: Amorphic users who can access this connection.
S3 Bucket: Name of the bucket from which the dataset files are to be imported.
Connection Access Type: There are two access types for this connection: Access Keys and Bucket Policy.
Version: The version of the ingestion scripts to use (Amorphic-specific). Whenever a new feature or Glue version is added to the underlying ingestion script, a new version is added to Amorphic.
S3 Bucket Region: Region where the source S3 bucket was created. If the source bucket is in one of the opt-in regions (eu-south-1, af-south-1, me-south-1, ap-east-1), this property must be provided and the region must be enabled in Amorphic; otherwise, ingestion fails.
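As a rough sketch, the attributes above could be assembled into a payload like the one below; the field names are hypothetical and are not the actual Amorphic API schema.

```python
# Hypothetical connection payload illustrating the attributes described above.
connection = {
    "ConnectionName": "sales-data-s3",          # unique name
    "ConnectionType": "S3",
    "Description": "Ingest daily sales exports from the partner bucket",
    "AuthorizedUsers": ["alice", "bob"],
    "S3Bucket": "partner-sales-exports",
    "ConnectionAccessType": "Bucket Policy",    # or "Access Keys"
    "Version": "1.8",                           # ingestion script version
    # Required only for the opt-in regions listed above; the region must
    # also be enabled in Amorphic, or ingestion fails.
    "S3BucketRegion": "eu-south-1",
}
```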
Note

For Redshift use cases with a large number of incoming files, turn on data load throttling and set the maximum limit to 90 for Redshift.

Upgrade S3 Connection

You can upgrade a connection when a new version is available. Upgrading a connection updates the underlying Glue version and brings new features to the data ingestion script.

Downgrade S3 Connection

You can downgrade a connection to a previous version if the upgrade does not meet your needs. A connection can only be downgraded if it has previously been upgraded. The downgrade option is available in the top-right corner if the connection is downgrade-compatible.

Connection Versions

1.6

In this version of the S3 connection, data ingestion works by comparing the ETags of the files in the source and the target.

First, the file name is checked. If a file with the same name and the same size already exists in the dataset, the ETags of the source file and the existing file are compared, and a file whose ETag also matches is treated as already ingested. The file name is checked first because a file's ETag does not change when the file is renamed: if only the ETag were considered, a user who intentionally duplicates files by renaming them would not be able to ingest them.
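The check order can be sketched as follows, assuming the dataset's copy of the file lives at a known target bucket and key; this illustrates the comparison described above, not Amorphic's actual implementation.

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def already_ingested(src_bucket: str, src_key: str, dst_bucket: str, dst_key: str) -> bool:
    """Sketch of the v1.6 check: name first, then size, then ETag."""
    try:
        dst = s3.head_object(Bucket=dst_bucket, Key=dst_key)
    except ClientError as err:
        if err.response["Error"]["Code"] == "404":
            return False  # no file with this name in the dataset -> ingest it
        raise

    src = s3.head_object(Bucket=src_bucket, Key=src_key)
    if src["ContentLength"] != dst["ContentLength"]:
        return False  # same name but different size -> ingest it

    # The ETag is compared only after name and size match: renaming a file
    # does not change its ETag, so checking the ETag alone would wrongly
    # block users who duplicate files under new names.
    return src["ETag"] == dst["ETag"]
```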

In this version, only files stored in the S3 Standard storage class are supported for S3 data ingestion. If files from other storage classes exist, the ingestion process will fail.

1.7

In this version of the S3 connection, the storage class of a file no longer affects the ingestion flow.

This means the ingestion process is not terminated even if files in S3 Glacier-type storage classes exist: those files are skipped, the details of the skipped files are reported, and the ingestion of all other files completes successfully (see the sketch after the note below).

Note

Files stored in the S3 Glacier and S3 Glacier Deep Archive storage classes will be skipped during ingestion.
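A minimal sketch of this behaviour, assuming a placeholder source bucket and using the storage class reported by the S3 listing API; this illustrates the skip logic, not Amorphic's actual code.

```python
import boto3

s3 = boto3.client("s3")

# Storage classes skipped in v1.7 instead of failing the whole ingestion.
SKIPPED_CLASSES = {"GLACIER", "DEEP_ARCHIVE"}

ingestible, skipped = [], []
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-source-bucket"):
    for obj in page.get("Contents", []):
        # list_objects_v2 reports the class as "STANDARD", "GLACIER", etc.
        if obj.get("StorageClass", "STANDARD") in SKIPPED_CLASSES:
            skipped.append(obj["Key"])
        else:
            ingestible.append(obj["Key"])

print(f"Skipping {len(skipped)} archived file(s):", skipped)
# ...continue ingesting the files in `ingestible` as usual...
```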

1.8

In this version of the S3 connection, support for the Skip LZ feature was added.

This feature enables users to upload data directly to the data lake zone, skipping data validation. Please refer to the Skip LZ documentation for more details.