Before we introduce Managed Online Endpoints in Azure Machine Learning, let’s revisit how real-time model deployment has traditionally worked there. One needs to perform the following steps:
- Create a scoring script
- Define the environment config
- Create an inference config
- Create the deployment config
- Deploy the model
For more details, refer to Section 1.4, Model Serving, in our earlier article. One key challenge is provisioning and maintaining the deployment targets, such as Azure Kubernetes Service or Azure ML Compute Clusters. Fortunately, Microsoft has introduced Managed Online Endpoints in Azure Machine Learning, which let us bypass the hassle of maintaining deployment targets and environments ourselves.
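For context, here is a minimal sketch of that classic flow using the v1 Python SDK, assuming a model and environment have already been registered (an ACI target is used here for brevity; AKS works similarly with a deployment_target argument):

from azureml.core import Workspace, Environment
from azureml.core.model import Model, InferenceConfig
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()
model = ws.models['california-housing-prices']        # previously registered model
env = Environment.get(ws, 'california-housing-env')   # previously registered environment

# Inference config ties the scoring script to the environment
inference_config = InferenceConfig(entry_script='score.py', environment=env)
# Deployment config describes the compute for the chosen target (here, ACI)
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(ws, 'california-housing-service', [model],
                       inference_config, deployment_config)
service.wait_for_deployment(show_output=True)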
As in the aforementioned article, we will create a real-time (online) endpoint and test it. But first, there are some prerequisites.
Prerequisites
- An Azure Machine Learning workspace
- Azure CLI (v2) installed locally
Step 0: Register a Model to Azure ML Workspace
Before deployment, a model has to be registered in the Azure ML workspace. Refer to the aforementioned article, Azure Databricks and Azure Machine Learning make a great pair, for detailed steps. Here is an example:
model_name = 'california-housing-prices'
model_description = 'Model to predict housing prices in California.'
# aml_run is the Run object from the training experiment
model_tags = {"Type": "GradientBoostingRegressor",
              "Run ID": aml_run.id,
              "Metrics": aml_run.get_metrics()}

registered_model = Model.register(model_path=model_file_path,  # Path to the saved model file
                                  model_name=model_name,
                                  tags=model_tags,
                                  description=model_description,
                                  workspace=ws)
Step 1: Install Azure CLI ml extension
Apart from Azure CLI (v2), you need to install the new ‘ml’ extension. It enables you to train and deploy models from the command line, with features that speed up data science while tracking the model lifecycle. Follow this document for step-by-step instructions. Once the extension is added, open the command prompt and run az -v. Make sure that the ml extension version is the latest (2.2.3 at the time of writing).
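For reference, installing the extension from the command line typically looks like this (the exact steps are in the linked document):

az extension add --name ml

If an older version is already present, az extension update --name ml brings it up to date.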
Step 2: Create the Online Endpoint Configuration (YAML).
There are two steps for creating a Managed Online Endpoint:
- Create an Endpoint
- Create the Deployment
The following YAML, saved as endpoint.yml, defines the endpoint:
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: california-housing-service
auth_mode: key
Step 3: Create the Online Deployment Configuration (YAML).
The next step is creating the deployment. Below is an example of the deployment YAML, saved as deployment.yml:
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: default
endpoint_name: california-housing-service
model: azureml:california-housing-prices:1
code_configuration:
  code: ./
  scoring_script: score.py
environment: azureml:california-housing-env:1
instance_type: Standard_F2s_v2
instance_count: 1
There are two aspects to note here: the scoring script and the environment. A scoring script has two functions, init() and run(). The init() function loads the registered model, while run() executes the scoring logic on incoming requests. The environment captures the scoring dependencies, such as the Python libraries required at inference time.
The scoring script has the following structure:
import os
import json
import numpy as np
import pandas as pd
import sklearn  # ensures scikit-learn is present for deserializing the model
import joblib

columns = ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms',
           'Population', 'AveOccup', 'Latitude', 'Longitude']

def init():
    global model
    model_filename = 'california-housing.pkl'
    model_path = os.path.join(os.environ['AZUREML_MODEL_DIR'], model_filename)
    model = joblib.load(model_path)

def run(input_json):
    # Parse the incoming request and shape it into the expected columns
    inputs = json.loads(input_json)
    data_df = pd.DataFrame(np.array(inputs['data']).reshape(-1, len(columns)), columns=columns)
    # Make prediction
    predictions = model.predict(data_df)
    # You can return any data type as long as it is JSON-serializable
    return {'predictions': predictions.tolist()}
As for the environment, you can use either a curated (pre-built) one or a custom one. In Azure Machine Learning, you can define custom environments and register them in the AML workspace. In this example, we use a custom environment. Run the following script, for example in a Jupyter notebook on an Azure Machine Learning compute instance, to create and register the environment:
from azureml.core import Workspace, Environment
from azureml.core.conda_dependencies import CondaDependencies

ws = Workspace.from_config()

# Clone a minimal curated environment as the starting point
my_env_name = "california-housing-env"
myenv = Environment.get(workspace=ws, name='AzureML-Minimal').clone(my_env_name)

# Pin the scoring dependencies
conda_dep = CondaDependencies()
conda_dep.add_pip_package("numpy==1.18.1")
conda_dep.add_pip_package("pandas==1.1.5")
conda_dep.add_pip_package("joblib==0.14.1")
conda_dep.add_pip_package("scikit-learn==0.24.1")
conda_dep.add_pip_package("sklearn-pandas==2.1.0")
conda_dep.add_pip_package("azure-ml-api-sdk")
myenv.python.conda_dependencies = conda_dep

print("Review the deployment environment.")
print(myenv)

# Register the environment
myenv.register(workspace=ws)
Step 4: Create the Online Endpoint.
Now we have all the pieces in place. Based on the files created above, the folder structure looks something like this:
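california-housing/          (the directory name itself is arbitrary)
├── endpoint.yml
├── deployment.yml
├── score.py
└── Sample_Request.json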
Open a Windows command prompt and change directory to the above folder. Use the following command to log in to the appropriate Azure tenant:
az login
Further, set the subscription using the following command:
az account set --subscription <name or id>
Now, create the Online Endpoint using the following command:
az ml online-endpoint create -f endpoint.yml --resource-group <your-resource-group> --workspace-name <your-azureml-workspace>
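Optionally, you can confirm that the endpoint was provisioned with:

az ml online-endpoint show --name california-housing-service --resource-group <your-resource-group> --workspace-name <your-azureml-workspace>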
Step 5: Create the Online Endpoint Deployment.
The previous step creates an endpoint, which is just a shell. To associate it with an appropriate compute environment, you create a deployment, using the deployment configuration defined in deployment.yml. Note that you can create multiple deployments for an endpoint, as sketched after the command below. Here is the command to create the online deployment:
az ml online-deployment create -f deployment.yml --resource-group <your-resource-group> --workspace-name <your-azureml-workspace>
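For example, assuming you later create a second deployment named green (a hypothetical name) against the same endpoint, traffic could be split between the two with a command like:

az ml online-endpoint update --name california-housing-service --resource-group <your-resource-group> --workspace-name <your-azureml-workspace> --traffic "default=90 green=10"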
Once the deployment is complete, the endpoint appears in Azure ML Studio. Note the secure HTTPS endpoint URI.
Step 6: Test the Endpoint
Once the deployment is complete, we can test the endpoint in two steps:
- Create a request JSON
- Use Az CLI to invoke the endpoint and get back results
Here is the sample JSON data, saved as Sample_Request.json:
{ "data": [ [8.1, 41,4.04, 1.2, 900.0, 3.560606, 37.50, -127.00], [1.5603, 25, 5.045455, 1.133333, 845.0, 2.560606, 39.48, -121.09] ] }
Next is the Az CLI command to invoke the endpoint:
az ml online-endpoint invoke --name california-housing-service --deployment default --resource-group <your-resource-group> --workspace-name <your-azureml-workspace> --request-file Sample_Request.json
And, here are the results:
"{\"predictions\": [4.731896241931953, 0.6704102705036317]}"
However, it is not practical to score the endpoint with CLI commands. Hence, we will use a Python script; a template for it can be found in the Consume tab of the endpoint in Azure ML Studio. Before that, though, the deployment has to be updated to accept 100% of the traffic. Here is the CLI command:
az ml online-endpoint update --name california-housing-service --resource-group <your-resource-group> --workspace-name <your-azureml-workspace> --traffic "default=100"
Finally, here is the Python script to infer from Managed Online Endpoints in Azure Machine Learning:
import urllib.request
import json
import os
import ssl

def allowSelfSignedHttps(allowed):
    # Bypass the server certificate verification on the client side
    if allowed and not os.environ.get('PYTHONHTTPSVERIFY', '') and getattr(ssl, '_create_unverified_context', None):
        ssl._create_default_https_context = ssl._create_unverified_context

allowSelfSignedHttps(True)  # This line is needed if you use a self-signed certificate in your scoring service

X = {
    "data": [[8.1, 41, 4.04, 1.2, 900.0, 3.560606, 37.50, -127.00],
             [1.5603, 25, 5.045455, 1.133333, 845.0, 2.560606, 39.48, -121.09]]
}
body = str.encode(json.dumps(X))

url = '<your-managed-online-endpoint>'
api_key = '<your-online-endpoint-key>'  # Replace this with the API key for the web service
headers = {'Content-Type': 'application/json', 'Authorization': ('Bearer ' + api_key)}

req = urllib.request.Request(url, body, headers)

try:
    response = urllib.request.urlopen(req)
    result = response.read()
    print(result)
except urllib.error.HTTPError as error:
    print("The request failed with status code: " + str(error.code))
    # Print the headers - they include the request ID and the timestamp, which are useful for debugging the failure
    print(error.info())
    print(error.read().decode("utf8", 'ignore'))
Here are the results:
b'{"predictions": [4.731896241931953, 0.6704102705036317]}'
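Equivalently, since the endpoint is a plain HTTPS REST service, any HTTP client can call it; for example, with curl (using the same placeholder URL and key):

curl -X POST <your-managed-online-endpoint> -H "Content-Type: application/json" -H "Authorization: Bearer <your-online-endpoint-key>" -d @Sample_Request.json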