Version: v2.5

External API Connections

info

From version 2.2, encryption (in-flight and at-rest) is enabled for all jobs and the catalog. All existing jobs (both user-created and system-created) were updated with encryption-related settings, and all newly created jobs have encryption enabled automatically.

External API connections are used to import data from APIs into Amorphic Datasets. Currently, only API authentication of type BASIC is supported.

An External API connection can be created in the following way:

BASIC

To create an External API connection, the user has to enter the API Endpoint, HTTP Method and Query String Parameters. The image below shows how to create an External API connection.

External API basic connection

| Attribute | Description |
| --- | --- |
| Connection Name | Name of the connection in Amorphic |
| Connection Type | Type of connection. In this case it is ExternalAPI |
| Description | Connection-related information the user wants to store |
| Authorized Users | Amorphic users who should have access to this connection |
| API Endpoint | Endpoint URL from which data needs to be extracted |
| API Authentication | As of version 1.1.3, only BASIC is supported |
| Method | HTTP method. As of version 1.1.3, only GET and POST are allowed |
| Query Params | Query string parameters that the API URL takes as input |
| Version | Lets the user select which version of the ingestion scripts to use (Amorphic specific). Whenever a new feature or Glue version is added to the underlying ingestion script, a new version is added to Amorphic. |
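
To illustrate how the API Endpoint and Query Params attributes relate, here is a minimal Python sketch that appends query string parameters to an endpoint URL. The function name `build_request_url` is hypothetical, and this is only one plausible way the fields combine; it is not Amorphic's actual ingestion script.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_request_url(endpoint: str, query_params: dict) -> str:
    """Append query string parameters to an endpoint URL.

    Hypothetical helper for illustration only; how Amorphic combines
    these fields internally is an assumption.
    """
    parts = urlsplit(endpoint)
    extra = urlencode(query_params)
    # Keep any query string already present on the endpoint.
    query = "&".join(q for q in (parts.query, extra) if q)
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )


url = build_request_url("https://example.com/data", {"format": "csv", "page": 1})
print(url)
```

With an empty parameter dictionary, the endpoint is returned unchanged.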

Additionally, the timeout for the ingestion process can be set during connection creation by adding the key IngestionTimeout to ConnectionDetails in the input payload. The value is expressed in minutes and must be between 1 and 2880. If no value is provided, the default of 480 (8 hours) is used. Note that this feature is available only via the API.

{
  "ConnectionDetails": {
    "url": "https://example.com/datafile.csv",
    "auth_mechanism": "basic",
    "query_parameters": {},
    "method": "GET",
    "IngestionTimeout": 222
  }
}
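
The constraints on this payload can be checked on the client side before the API is called. The following Python sketch builds the ConnectionDetails object and validates the method and timeout range described on this page. The function name `build_connection_details` and the constants are hypothetical, and the checks mirror only the rules stated here; the Amorphic API may enforce additional validation.

```python
import json

# Limits taken from the text above; the names themselves are illustrative.
MIN_TIMEOUT_MINUTES = 1
MAX_TIMEOUT_MINUTES = 2880
ALLOWED_METHODS = {"GET", "POST"}


def build_connection_details(url, method="GET", query_parameters=None,
                             ingestion_timeout=None):
    """Build the ConnectionDetails part of a connection-creation payload.

    Hypothetical helper: IngestionTimeout is omitted when not given, in
    which case the documented default of 480 minutes applies server-side.
    """
    if method not in ALLOWED_METHODS:
        raise ValueError(f"method must be one of {sorted(ALLOWED_METHODS)}")

    details = {
        "url": url,
        "auth_mechanism": "basic",
        "query_parameters": query_parameters or {},
        "method": method,
    }

    if ingestion_timeout is not None:
        if isinstance(ingestion_timeout, bool) or not isinstance(ingestion_timeout, int):
            raise TypeError("ingestion_timeout must be an integer (minutes)")
        if not MIN_TIMEOUT_MINUTES <= ingestion_timeout <= MAX_TIMEOUT_MINUTES:
            raise ValueError("ingestion_timeout must be between 1 and 2880 minutes")
        details["IngestionTimeout"] = ingestion_timeout

    return {"ConnectionDetails": details}


payload = build_connection_details(
    "https://example.com/datafile.csv", ingestion_timeout=222
)
print(json.dumps(payload, indent=2))
```

Rejecting out-of-range values locally gives faster feedback than waiting for the API to refuse the request.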
info

This timeout can be overridden during schedule creation and at schedule run by providing the argument MaxTimeOut.

External API details

External API Details

On the details page, the Estimated Cost of the Connection is also displayed, showing the approximate cost incurred since creation.

Edit

An External API Connection can be edited by clicking the edit button in the right corner.

Only the Description and Authorized Users of an External API Connection can be changed.

Upgrade

Users can upgrade a connection when a new version is available. The upgrade option appears among the available options only in that case; otherwise, it is not shown.

Upgrading a connection upgrades the underlying Glue version and the data ingestion script with new features.

Downgrade

Users have the capability to downgrade a connection to a previous version if they believe the upgrade isn't meeting their requirements. It's important to note that a connection can only be downgraded if it has previously been upgraded. For connections that have been newly created, the option to downgrade is not available. If a connection is compatible with downgrading, you will find the downgrade option in the top right corner.

Deletion

In the upper right corner, there is a button featuring an icon of a trash can. Click on it to delete.

Connection Versions

1.1

In this version of External API connections, we added an auto-reload feature for datasets of type reload.

From this version onwards, the data reload process is triggered automatically as soon as a file upload through an External API connection finishes, so users no longer need to trigger the reload manually after the upload completes.

1.2

In this version we made code changes in the underlying Glue script to support dataset custom partitioning.

From this version onwards, data is loaded into the S3 LZ with a prefix containing the partition keys (if any were specified) for targets that support dataset partitioning.

E.g., for the partition keys KeyA and KeyB with the values ValueA and ValueB respectively, the S3 prefix will be in the format Domain/DatasetName/KeyA=ValueA/KeyB=ValueB/upload_date=Unix_Timestamp/UserName/FileType/.
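
The prefix format above can be sketched in Python as follows. The function name `build_s3_prefix` and its parameters are hypothetical and chosen only to mirror the placeholders in the format string; the actual Glue script may construct the prefix differently.

```python
import time


def build_s3_prefix(domain, dataset_name, partitions, user_name, file_type,
                    upload_ts=None):
    """Build an S3 prefix in the documented partitioned layout.

    Hypothetical helper: `partitions` is an ordered mapping of partition
    key to value; `upload_ts` is a Unix timestamp (defaults to now).
    """
    if upload_ts is None:
        upload_ts = int(time.time())

    segments = [domain, dataset_name]
    # One "Key=Value" segment per partition key, in the order given.
    segments.extend(f"{key}={value}" for key, value in partitions.items())
    segments.extend([f"upload_date={upload_ts}", user_name, file_type])
    return "/".join(segments) + "/"


prefix = build_s3_prefix(
    "Domain", "DatasetName",
    {"KeyA": "ValueA", "KeyB": "ValueB"},
    "UserName", "FileType",
    upload_ts=1700000000,
)
print(prefix)
# Domain/DatasetName/KeyA=ValueA/KeyB=ValueB/upload_date=1700000000/UserName/FileType/
```

When no partition keys are specified, the partition segments are simply omitted in this sketch.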

To understand more about custom data partitioning, read the documentation on dataset custom partitioning.

1.3

In this version of External API connections, we added support for the Skip LZ feature.

This feature enables users to upload data directly to the data lake zone by skipping data validation. Please refer to the Skip LZ documentation for more details.

1.4

No major changes were made to the underlying Glue script or design, but the logging has been enhanced.

1.5

The update in this version is specifically to ensure FIPS compliance, with no changes made to the script.