
ML Models

Amorphic's ML Models portal is a tool that helps you create and import machine learning models using Amazon Web Services (AWS) SageMaker. The resulting machine learning model object can be used to make predictions on datasets.

How to Create an ML Model?


To create an ML Model:

  1. Click on + New ML Model
  2. Fill in the details shown in the tables below:
| Attribute | Description |
| --- | --- |
| Model Name | The model name in the Amorphic portal. |
| Description | Describe the model's purpose and important details. |
| Model Resource | There are three ways to integrate a SageMaker model with the Amorphic portal: |
| Existing Model Resource | To import a model from the Amazon SageMaker Marketplace into the Amorphic portal, submit a request to the administrator, who will create a support ticket for the AWS Marketplace model via support@amorphicdata.com. The Amorphic team will then make the model available for selection. |
| Artifact Location | You can use Notebooks to create models in an S3 location and upload a model file directly from that location. Refer to the Notebooks section for the respective bucket details. |
| Select file | Upload a SageMaker model tar file (tar or tar.gz) directly into the Amorphic portal (a packaging sketch follows the tables below). |
| Output Type | You have two options: Dataset Data or Metadata. Select Dataset Data when you need to run a model on a dataset file; select Metadata when you want to view AI/ML results, such as metadata on dataset files (explained later). Most of the time, you should use Dataset Data. |
Dataset Data requires two additional inputs: Input Schema and Output Schema.
| Attribute | Description |
| --- | --- |
| Input Schema | Identifies the schema of the dataset on which the pre-processing ETL job or the model will be run. |
| Output Schema | Identifies the dataset where the post-processing job or model output will be saved. |
Both schemas should follow the format below, matching their respective Datasets:

```json
[
  {
    "type": "Date",
    "name": "CheckoutDate",
    "Description": "description"
  },
  {
    "type": "String",
    "name": "MajorProdDesc",
    "Description": "description"
  },
  {
    "type": "Double",
    "name": "counts",
    "Description": "description"
  }
]
```

:::info Note
You can import the schema of an Amorphic dataset using the "Import from Dataset" functionality.
:::
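If you would rather derive the schema from a file than import it from a dataset, a short script can produce JSON in this shape. This is a minimal sketch, assuming pandas, a hypothetical sample.csv, and an assumed mapping from pandas dtypes to the schema types shown above:

```python
import json
import pandas as pd

# Assumed mapping from pandas dtypes to the schema types shown above.
TYPE_MAP = {"object": "String", "float64": "Double", "datetime64[ns]": "Date"}

# "sample.csv" and its columns are hypothetical placeholders.
df = pd.read_csv("sample.csv", parse_dates=["CheckoutDate"])

# Build one schema entry per column, defaulting unknown dtypes to String.
schema = [
    {"type": TYPE_MAP.get(str(dtype), "String"),
     "name": col,
     "Description": "description"}
    for col, dtype in df.dtypes.items()
]
print(json.dumps(schema, indent=2))
```

The printed JSON can then be pasted into the Input Schema or Output Schema field, with the descriptions adjusted as needed.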
| Attribute | Description |
| --- | --- |
| Algorithm Used | The platform currently supports all the major AWS SageMaker models. |
| Supported file formats | Select the appropriate file type for predictions. If you need a file format other than the available options, select "Others", which defaults to no file type being required for batch predictions. Note: a model created with the "Others" file type can only be run on an "Others" file type dataset. |
| Preprocess Glue Job | Select the pre-processing ETL job created using Amorphic ETL functionality. |
| Postprocess Glue Job | Select the post-processing ETL job created using Amorphic ETL functionality. |
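For the Artifact Location and Select file options, the model must be packaged the way SageMaker expects: the serialized model artifact inside a tar.gz archive. A minimal packaging sketch, assuming boto3 and using hypothetical artifact, bucket, and key names:

```python
import tarfile
import boto3

# Package the trained artifact as SageMaker expects: a tar.gz archive.
# "model.joblib" is a hypothetical artifact name.
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("model.joblib")

# Upload to the S3 Artifact Location. The bucket and key are placeholders;
# use the bucket details shown in the Notebooks section of your deployment.
boto3.client("s3").upload_file(
    "model.tar.gz", "example-amorphic-bucket", "models/model.tar.gz"
)
```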

Apply Amorphic Model


Once an Amorphic model object is created, you can run a model on a Dataset file in the Amorphic portal by following these steps:

  1. Select a Dataset in the Amorphic portal.
  2. Go to the Files tab and select the file on which you want to run the model.
  3. Click the options menu at the top right of the file.
  4. Click on Apply ML.
  5. Select the ML model from the model dropdown. All Amorphic model objects whose input schema matches the Dataset will be available for selection.
  6. Select the required instance types. Note: certain AWS Marketplace subscribed models run on specific instance family types.
  7. Select the Target Dataset. The Datasets matching the output schema of the Amorphic model object will be available for selection.
  8. Click on “Submit”.
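Behind the scenes, batch predictions like this correspond to SageMaker batch transform. Purely as an illustration of what the platform manages for you (the job, model, and bucket names below are hypothetical, and Amorphic's actual implementation may differ), such a job can be started with boto3 like so:

```python
import boto3

sm = boto3.client("sagemaker")

# Hypothetical names and paths; Amorphic creates and tracks the real job.
sm.create_transform_job(
    TransformJobName="example-batch-prediction",
    ModelName="example-sagemaker-model",
    TransformInput={
        "DataSource": {"S3DataSource": {"S3DataType": "S3Prefix",
                                        "S3Uri": "s3://example-bucket/input/"}},
        "ContentType": "text/csv",
        "SplitType": "Line",  # split the input file line by line
    },
    TransformOutput={"S3OutputPath": "s3://example-bucket/output/"},
    # Instance type corresponds to the selection made in step 6.
    TransformResources={"InstanceType": "ml.m5.xlarge", "InstanceCount": 1},
)
```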


How Does the ML Pipeline Work in Amorphic?

The figure below shows what a typical ML pipeline on the Amorphic platform looks like. During Amorphic model object creation, the pre-processing and post-processing ETL job functionality provides drag-and-drop ETL workflows for smooth user access.

Model Pipeline
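The pre- and post-processing stages are ordinary Amorphic ETL (Glue) jobs. As a minimal sketch of what a pre-processing script might look like, assuming a PySpark Glue job and hypothetical S3 paths:

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Standard Glue job boilerplate: resolve the job name and set up contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Hypothetical pre-processing: read the raw dataset, drop incomplete rows,
# and write the cleaned file where the model expects its input.
df = spark.read.csv("s3://example-bucket/raw/", header=True)
df.dropna().write.mode("overwrite").csv("s3://example-bucket/clean/", header=True)

job.commit()
```

A post-processing job follows the same skeleton, reading the model output and writing it into the Dataset that matches the model's Output Schema.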