Model inference

After the model is trained and stored in an S3 bucket, the next step is to use it for inference.

This chapter explains how to use the previously trained model and run inference using TensorFlow and Keras on Amazon EKS.

Run inference pod

The trained model was stored in the S3 bucket in the previous section. Make sure the S3_BUCKET and AWS_REGION environment variables are set correctly.

curl -LO
envsubst <mnist-inference.yaml | kubectl apply -f -
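
envsubst replaces unset variables with empty strings without warning, so a missing S3_BUCKET or AWS_REGION only shows up later as a broken manifest. The following is a minimal sketch of a pre-check to run before the envsubst step; the helper name missing_vars is hypothetical and not part of the workshop.

```python
import os

# Variables the manifest is expected to reference (taken from the text above).
REQUIRED_VARS = ("S3_BUCKET", "AWS_REGION")


def missing_vars(env, names=REQUIRED_VARS):
    """Return the names that are unset or empty in the given mapping."""
    return [name for name in names if not env.get(name)]


# Report the result for the current shell environment.
missing = missing_vars(os.environ)
print("missing variables:", ", ".join(missing) if missing else "none")
```

If the script reports any missing variables, export them and rerun the envsubst command.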

Wait for the containers to start:

kubectl get pods -l app=mnist,type=inference
NAME                    READY   STATUS      RESTARTS   AGE
mnist-96fb6f577-k8pm6   1/1     Running     0          116s

Port-forward the inference endpoint for local testing:

kubectl port-forward `kubectl get pods -l=app=mnist,type=inference -o jsonpath='{.items[0].metadata.name}' --field-selector=status.phase=Running` 8500:8500 &

Run inference

Use the script to make a prediction request. It randomly picks one image from the test dataset and runs a prediction on it.

curl -LO
$ python --endpoint http://localhost:8500/v1/models/mnist:predict

Data: {"instances": [[[[0.0], [0.0], [0.0], [0.0], [0.0] ... 0.0], [0.0]]]], "signature_name": "serving_default"}
The model thought this was a Ankle boot (class 9), and it was actually a Ankle boot (class 9)
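
To show what such a request involves, here is a minimal sketch of a client that builds the same JSON body as in the output above and reads the result. It is not the workshop's script. It assumes the server answers in the TensorFlow Serving REST row format, with a "predictions" list, and that the labels are the Fashion-MNIST class names, which matches class 9 being "Ankle boot". The function names build_payload, top_class, and predict are hypothetical.

```python
import json
import urllib.request

# Fashion-MNIST label names in class-index order (assumption: inferred from
# the sample output, where class 9 is "Ankle boot").
CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]


def build_payload(image):
    """Wrap one image (rows of pixel values in [0, 1]) in the request body.

    Each pixel becomes a one-element list, giving the nesting seen in the
    "Data:" line: instances -> image -> row -> [pixel].
    """
    instance = [[[float(pixel)] for pixel in row] for row in image]
    return {"instances": [instance], "signature_name": "serving_default"}


def top_class(scores):
    """Return (index, label) of the highest score in a list of class scores."""
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, CLASS_NAMES[index]


def predict(endpoint, image):
    """POST one image to the predict endpoint and return its top class."""
    body = json.dumps(build_payload(image)).encode("utf-8")
    request = urllib.request.Request(
        endpoint, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # Assumed response shape: {"predictions": [[score_0, ..., score_9]]}
    return top_class(result["predictions"][0])
```

With the port-forward running, calling predict("http://localhost:8500/v1/models/mnist:predict", image) on a 28x28 image would return the predicted class index and label.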