Hey everyone!

When I first started learning MLOps, the number of tools (MLflow, Docker, CI/CD pipelines) felt overwhelming.

So I want to share what I learned. Maybe it can help you too.

What is MLOps?

Think about it like this. You build a machine learning model on your laptop and it works! But then what? How do you put it somewhere so other people can use it? How do you update it when you have better data?

That’s what MLOps is about. It’s like DevOps, but for machine learning. In this project, I built a system that:

Trains a model automatically
Puts it in a container (it contains everything the model needs)
Deploys it to staging first (to test it)
Then promotes it to production

Here’s the architecture of what we’re building:

Google Cloud Services

The main services I used on Google Cloud for this project are:

Vertex AI Pipelines: A managed service that orchestrates ML workflows using Kubeflow Pipelines under the hood. It runs the training workflow. Think of it as a recipe that tells Google Cloud: “First, load the data. Then, train the model. Then, save it.”
Cloud Deploy: Google’s managed continuous delivery service that handles rollouts, approvals, and rollbacks automatically. It moves the model from staging to production — like a delivery service for your model.
GKE (Google Kubernetes Engine): A managed Kubernetes service that runs containerized applications with auto-scaling and load balancing. This is where the model runs — like the house where your model lives.
Cloud Build: A serverless CI/CD platform that builds Docker images and pushes them to Artifact Registry. Think of it as a factory that produces the containers for your model.

Prerequisites

Before we start, make sure you have:

A Google Cloud account (you can use free credits if you’re new)
Basic Python knowledge
Google Cloud SDK installed on your computer (or use Cloud Shell)

Step 1: Set Up Your Environment

First, let’s set up everything we need. Open your terminal (or Cloud Shell) and run these commands:

# Set your project ID
export PROJECT_ID=$(gcloud config get-value project)
export REGION="us-central1"
export BUCKET_NAME="${PROJECT_ID}-mlops-lab"

# Clone the repository
git clone https://github.com/misskecupbung/mlops-vertex-ai-cloud-deploy.git
cd mlops-vertex-ai-cloud-deploy

Now run the setup script:

chmod +x scripts/setup.sh
./scripts/setup.sh

What does this script do?

Enables all the Google Cloud APIs we need

Creates a storage bucket for our model files

Creates an Artifact Registry to store our containers

Creates a GKE cluster with staging and production namespaces

Sets up permissions using IAM and service accounts, for example configuring Workload Identity so your GKE pods can access Cloud Storage to load the model file

This takes about 5–10 minutes.

Step 2: Build the Training Container

After the setup script finishes, you should have a GKE cluster running, a storage bucket created, and Artifact Registry ready to store your container images.

Now let’s build the container that will train our model. The training code is in src/train/py. It uses the Iris dataset — a classic machine learning dataset about flowers. The script loads the data, trains a Random Forest classifier, evaluates the accuracy, and saves the model to Cloud Storage.
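If you're curious what a training script like that looks like, here's a minimal sketch of the four steps (load, train, evaluate, save). It is not the repository's actual code: the function name `train_and_save`, the 80/20 split, the hyperparameters, and the `model.joblib` file name are my assumptions, and it saves to a local folder instead of Cloud Storage.

```python
from pathlib import Path

import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_and_save(output_dir: str) -> float:
    """Train a Random Forest on Iris, save it, and return the test accuracy."""
    # Load the 150 iris samples (4 features each, 3 classes).
    X, y = load_iris(return_X_y=True)

    # Hold out 20% of the data for evaluation (assumed split, not from the repo).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    # Train the classifier (assumed hyperparameters).
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    # Evaluate on the held-out samples.
    accuracy = accuracy_score(y_test, model.predict(X_test))

    # Save locally; the real script writes the model to Cloud Storage instead.
    joblib.dump(model, Path(output_dir) / "model.joblib")
    return accuracy
```

Calling `train_and_save("/tmp")` trains the model, writes `/tmp/model.joblib`, and returns the accuracy on the held-out samples.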

Run this command to build the training container:

gcloud builds submit --config=cloudbuild-training.yaml

You can watch the build progress in the terminal. Or go to Cloud Console → Cloud Build → History.

When it’s done, your training container is stored in Artifact Registry.

Step 3: Build the Serving Container

Next, we build the serving container. This is different from the training container. The serving container runs a FastAPI server that accepts prediction requests.

gcloud builds submit --config=cloudbuild-serving.yaml

You can watch the build progress in the terminal. Or go to Cloud Console → Cloud Build → History.

Now we have both containers ready:

Training container → trains the model
Serving container → serves predictions

Step 4: Run the Vertex AI Pipeline

Now let’s run the training pipeline on Vertex AI.

First, set up a Python virtual environment and install the requirements:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Now compile and submit the pipeline:

# Compile the pipeline
python src/compile_pipeline.py

# Submit the pipeline
python src/submit_pipeline.py

Go to Vertex AI → Pipelines in the console. You’ll see your pipeline running!

Click on it to see the details and watch each step:

Data Preparation: Loads the iris dataset
Model Training: Trains a Random Forest model
Model Evaluation: Checks how good the model is
Model Upload: Saves the model to Cloud Storage
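To make the "recipe" idea concrete, here's a small plain-Python sketch of how an orchestrator decides the order of these four steps from their dependencies. This is not Kubeflow Pipelines code and not what the repository does internally. The step names and the assumption that each step depends only on the one before it come from the list above.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it can start.
# A strictly linear chain is assumed here, following the order in the console.
STEP_DEPENDENCIES = {
    "data_preparation": set(),
    "model_training": {"data_preparation"},
    "model_evaluation": {"model_training"},
    "model_upload": {"model_evaluation"},
}


def execution_order(dependencies: dict) -> list:
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for number, step in enumerate(execution_order(STEP_DEPENDENCIES), start=1):
        print(number, step)
```

A real pipeline works the same way in spirit: you declare which step consumes the output of which, and the service works out what can run and when.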

Wait for the pipeline to finish. It takes about 5 minutes.

Once it finishes, you can find the trained model in your Cloud Storage bucket.

Step 5: Set Up Cloud Deploy

Cloud Deploy will move our model through environments. First, let’s create the delivery pipeline:

# Get your project number
export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT_ID} --format='value(projectNumber)')

# Create the Cloud Deploy config
envsubst < clouddeploy.yaml > clouddeploy-rendered.yaml

# Apply it
gcloud deploy apply --file=clouddeploy-rendered.yaml --region=${REGION}
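If you wonder what `envsubst` does here: it replaces placeholders like `${PROJECT_ID}` in the file with the values of your environment variables. Below is a rough Python equivalent as a sketch. The YAML fragment and the cluster name `example-cluster` are hypothetical, so the real `clouddeploy.yaml` in the repository may look different.

```python
from string import Template

# Hypothetical fragment for illustration only; not copied from the repository.
TEMPLATE = """\
gke:
  cluster: projects/${PROJECT_ID}/locations/${REGION}/clusters/example-cluster
"""


def render(template: str, variables: dict) -> str:
    """Fill ${NAME} placeholders, failing loudly if a variable is missing."""
    # substitute() raises KeyError for a missing variable, whereas envsubst
    # silently replaces an unset variable with an empty string.
    return Template(template).substitute(variables)


if __name__ == "__main__":
    print(render(TEMPLATE, {"PROJECT_ID": "my-project", "REGION": "us-central1"}))
```

One thing worth knowing: if `PROJECT_ID` or `REGION` is not exported in your shell, `envsubst` will not complain, so it's a good idea to open `clouddeploy-rendered.yaml` and check it before applying.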

Now prepare the namespaces. This creates ConfigMaps that tell the serving pods where to find the model:

chmod +x scripts/prepare-namespaces.sh
./scripts/prepare-namespaces.sh

Go to Cloud Deploy in the console. You should see your pipeline with two targets: staging and production.

Step 6: Create Your First Release

Now let’s deploy! Create a release to push the serving container to staging:

gcloud deploy releases create release-001 \
  --project=${PROJECT_ID} \
  --region=${REGION} \
  --delivery-pipeline=mlops-model-pipeline \
  --images=serving-image=${REGION}-docker.pkg.dev/${PROJECT_ID}/mlops-lab/serving:v1
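The image path in that command follows the Artifact Registry pattern of region, project, repository, image name, and tag. Here's a tiny sketch that builds the same string, which can help you spot a typo. The function names are hypothetical helpers of mine, and the values are taken from the command above.

```python
def image_uri(region: str, project_id: str, repository: str, image: str, tag: str) -> str:
    """Build an Artifact Registry Docker image URI."""
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"


def images_flag(name: str, uri: str) -> str:
    """Build the NAME=URI value passed to --images."""
    return f"--images={name}={uri}"


if __name__ == "__main__":
    uri = image_uri("us-central1", "my-project", "mlops-lab", "serving", "v1")
    print(images_flag("serving-image", uri))
```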

Cloud Deploy will automatically deploy to staging. Watch the progress:

gcloud deploy releases describe release-001 \
  --delivery-pipeline=mlops-model-pipeline \
  --region=${REGION}

You can also check the rollout in the Cloud Deploy console.

Step 7: Test the Staging Deployment

Let’s make sure our model works! First, check if the pods are running:

kubectl get pods -n staging

You should see 1/1 in the READY column. If it shows 0/1, wait a moment and check again.
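If you'd rather script this check than read the table by eye, here's a small sketch that parses the text output of `kubectl get pods` and tells you whether every pod is ready. The function name is my own, and it assumes the default table output with a READY column.

```python
def all_pods_ready(kubectl_output: str) -> bool:
    """Return True only if every listed pod shows N/N in the READY column."""
    lines = [line for line in kubectl_output.splitlines() if line.strip()]
    if len(lines) < 2:
        # Only a header (or nothing at all): no pods are running yet.
        return False
    ready_col = lines[0].split().index("READY")
    for line in lines[1:]:
        ready, total = line.split()[ready_col].split("/")
        if int(total) == 0 or int(ready) != int(total):
            return False
    return True


# Possible usage (requires kubectl and cluster access):
#   import subprocess
#   out = subprocess.run(["kubectl", "get", "pods", "-n", "staging"],
#                        capture_output=True, text=True, check=True).stdout
#   print(all_pods_ready(out))
```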

Now get the staging IP and test it:

# Get the IP
STAGING_IP=$(kubectl get svc model-serving -n staging -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "Staging IP: $STAGING_IP"

# Test prediction
curl -X POST http://${STAGING_IP}:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"features": [5.1, 3.5, 1.4, 0.2]}'

You should get a response like this:

{
  "prediction": "setosa",
  "confidence": 1.0,
  "probabilities": {
    "setosa": 1.0,
    "versicolor": 0.0,
    "virginica": 0.0
  },
  "model_version": "v1"
}

It works! The model predicted a setosa flower with 100% confidence.
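The same request can be sent from Python if you want to call the model from your own code. This is a minimal sketch using only the standard library. The `/predict` path, port 8080, and the `features` field come from the curl command above, while the function names and the timeout are my assumptions.

```python
import json
import urllib.request


def build_request(host: str, features: list, port: int = 8080) -> urllib.request.Request:
    """Build the POST request for the /predict endpoint."""
    body = json.dumps({"features": list(features)}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(host: str, features: list, timeout: float = 10.0) -> dict:
    """Send the features to the serving endpoint and return the parsed JSON."""
    request = build_request(host, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Possible usage, with the IP from the kubectl command above:
#   result = predict("<STAGING_IP>", [5.1, 3.5, 1.4, 0.2])
#   print(result["prediction"])
```

The four numbers are the iris measurements in the dataset's usual order: sepal length, sepal width, petal length, and petal width, all in centimeters.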

Step 8: Promote to Production

Staging looks good. Let’s promote to production:

gcloud deploy releases promote \
  --release=release-001 \
  --delivery-pipeline=mlops-model-pipeline \
  --region=${REGION}

Production requires approval, which is a safety feature. You can approve the rollout from the console.

Or you can approve it from the command line:

gcloud deploy rollouts approve release-001-to-production-0001 \
  --delivery-pipeline=mlops-model-pipeline \
  --release=release-001 \
  --region=${REGION}

Wait for the deployment to finish. You can follow its progress in the Cloud Deploy console.

Step 9: Test Production

Let’s test production the same way:

# Get production IP
PROD_IP=$(kubectl get svc model-serving -n production -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Test with a different flower
curl -X POST http://${PROD_IP}:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"features": [6.7, 3.0, 5.2, 2.3]}'

You should get a response like this:

{
  "prediction": "virginica",
  "confidence": 0.99,
  "probabilities": {
    "setosa": 0.0,
    "versicolor": 0.0004,
    "virginica": 0.99
  },
  "model_version": "v1"
}

This time the model predicts virginica, a different type of iris flower!
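For an automated smoke test, you could also check that a response is internally consistent: the predicted label should be the class with the highest probability, and the confidence should match that probability. Here's a sketch of such a check. The field names come from the sample response above, while the function name and the tolerance (to allow for rounded numbers) are my assumptions.

```python
def check_response(response: dict, tolerance: float = 0.05) -> list:
    """Return a list of problems found in a prediction response (empty if none)."""
    problems = []
    probabilities = response["probabilities"]
    top_label = max(probabilities, key=probabilities.get)

    if response["prediction"] != top_label:
        problems.append(
            f"prediction is {response['prediction']!r} but the top class is {top_label!r}"
        )
    if abs(response["confidence"] - probabilities[top_label]) > tolerance:
        problems.append("confidence does not match the top probability")
    if abs(sum(probabilities.values()) - 1.0) > tolerance:
        problems.append("probabilities do not sum to roughly 1")
    return problems


if __name__ == "__main__":
    sample = {
        "prediction": "virginica",
        "confidence": 0.99,
        "probabilities": {"setosa": 0.0, "versicolor": 0.0004, "virginica": 0.99},
        "model_version": "v1",
    }
    print(check_response(sample))
```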

The full code is here: https://github.com/misskecupbung/mlops-vertex-ai-cloud-deploy

If you have questions or feedback, feel free to reach out.

Keep Learning!

MLOps Pipeline with Vertex AI and Cloud Deploy on Google Cloud was originally published in Google Cloud - Community on Medium.