Integrating Virtual Machines into the mesh

Istio service mesh primarily provides its features to Kubernetes-based resources. However, in some cases it makes sense to integrate bare metal machines or virtual machines into the mesh:

  • Temporary integration: To migrate non-Kubernetes-native workloads into the mesh, providing temporary access to the machine's network resources until the migration is done.
  • Long-term integration: Sometimes it is impractical to migrate a given workload into Kubernetes due to its size (for example, when huge bare metal machines are required), or when the workload is stateful and hard to support on Kubernetes.

Service Mesh Manager provides support for both use cases, building on top of Istio's support for virtual machines.

For an overview of how Service Mesh Manager implements VM Integration based on Istio’s framework, see Istio resources.

Architecture

Service Mesh Manager takes an automation-friendly approach to managing the virtual machines by providing an agent that runs on the machine. This component enables Service Mesh Manager to provide the same observability features for virtual machines as for native Kubernetes workloads, such as Topology view, Service/Workload overview, integrated tracing, or traffic tapping.

The agent continuously maintains the configuration of the machine so that any change in the upstream cluster is reflected in its configuration. This behavior ensures that if the meshexpansion-gateway's IP addresses change, the machine retains connectivity to the mesh.

If the machine is available for an extended period of time, Istio must be upgraded on the machine. The upgrade flow is aligned with the Canary control plane upgrades that Service Mesh Manager uses for the Istio control plane: the agent ensures that the host has the latest version of Istio installed and provides a validation warning in case the istio process needs to be restarted.

When the virtual machine is part of the mesh, it behaves like a Kubernetes pod: it belongs to a specific namespace and cannot communicate with other namespaces. The name of the pod is the hostname of the virtual machine.

Ease of use

After a virtual machine has been integrated into the mesh, Service Mesh Manager automatically updates the configuration of the virtual machine to ensure that it remains a part of the mesh and receives every configuration update it needs to operate in the mesh. In addition, the observability features available for Kubernetes pods, such as the topology view, service/workload overview, integrated tracing, and traffic tapping, are available for the virtual machines as well.

Getting started

To try out VM integration, we highly recommend using the VM integration quickstart guide.

For more details on Service Mesh Manager’s capabilities for handling machines, see Istio resources.

For detailed examples for more complex use-cases such as migrating an existing workload into the mesh, see the Use-cases section.

1 - Prerequisites

Before trying to attach a virtual machine to your mesh, make sure that you meet the following prerequisites.

You can attach VMs only to an active Istio cluster that's running the Service Mesh Manager control plane.

Configuration prerequisites

To attach external machines, the Service Mesh Manager dashboard needs to be exposed so that smm-agent can fetch the required configuration data. For details, see Exposing the Dashboard.

Supported operating systems

Currently, the following operating systems are verified to work when added to the mesh:

  • Ubuntu 20.04+ (64-bit)
  • RedHat Enterprise Linux 8 (64-bit)

However, any operating system that uses Deb or RPM packages and systemd as its init system should be able to follow the same procedure.

Package dependencies

OS       Required packages                Example install command
Ubuntu   curl, iptables, sudo, hostname   apt-get install -y curl iptables sudo hostname
RHEL     curl, iptables, sudo, hostname   yum install -y curl hostname iptables sudo

Network prerequisites

Because of the way Istio operates, the VM is only able to resolve services and DNS names from the same Kubernetes namespace as it’s attached to. This means that communication from the VM to other Kubernetes namespaces is not possible.

Cluster access to VM

The cluster must be able to access the following ports exposed from the VM:

  • TCP ports 16400, 16401
  • Every port you define for the WorkloadGroup

The Kubernetes clusters in the mesh must be able to access every port on the VM that is used to serve mesh traffic. For example, if the VM runs a web server on port 80, then port 80 must be accessible from every pod in the member clusters. (The WorkloadGroup defined for the service should indicate that the service is available on port 80.)
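
How you open these ports depends on your environment. For example, if the VM's firewall is managed directly with iptables, rules like the following sketch could open them (port 8080 is an assumption matching the example workload used later in this guide; persist the rules in the way appropriate for your distribution):

# Allow the mesh clusters to reach the ports used for VM integration
iptables -A INPUT -p tcp --dport 16400 -j ACCEPT
iptables -A INPUT -p tcp --dport 16401 -j ACCEPT
# Allow the workload port(s) defined in the WorkloadGroup (8080 in this example)
iptables -A INPUT -p tcp --dport 8080 -j ACCEPT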

Determining the VM’s IP address

From the clusters' point of view, the VM's IP address may not be the IP address that appears on the network interfaces in the VM's operating system. For example, if the VM is exposed via a load balancer instance of a cloud service provider, then the Service Mesh Manager clusters can reach the VM via the IP address (or IP addresses) of the load balancer.

While administrators integrating VMs into the service mesh are expected to be able to identify the VM's IP address from the service mesh's point of view, smm-agent provides a fallback behavior: it queries the https://ifconfig.me/ip site to determine the IP that the public internet sees for the VM. If the IP that the site returns is not the IP that the clusters in the service mesh should use to reach the VM, set the VM's IP address to use for the service mesh communication during the smm-agent setup.
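
For example, you can check what this fallback detection returns, and override it later during the smm-agent setup (the smm-agent set node-ip command is described in Add a virtual machine to the mesh):

# Check the IP address the public internet sees for the VM
curl -s https://ifconfig.me/ip
# If this differs from the IP the clusters should use, override it during setup:
smm-agent set node-ip <VM's IP>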

Note: This document is not a comprehensive guide on how to expose VMs via an IP address.

VM access to cluster

Istio can work in two distinct ways when it comes to network topologies:

  • If the virtual machine has no direct connection to the pods' IP addresses, it can rely on a meshexpansion gateway and use the different network approach. Unless latency is of utmost importance, we highly recommend this approach, as it allows more flexibility when attaching VMs from multiple separate networks.
  • If the virtual machine can access the pods' IP addresses directly, then you can use the same network approach.

Different network

To configure the different network model, the WorkloadGroup’s .spec.network field must be set to a different network than the networks used by the current Istio deployment.

To check which network the existing Istio control planes are attached to, run the following command:

kubectl get istiocontrolplanes -A

The output should be similar to:

NAMESPACE      NAME       MODE     NETWORK    STATUS      MESH EXPANSION   EXPANSION GW IPS                 ERROR   AGE
istio-system   cp-v115x   ACTIVE   network1   Available   true             ["13.48.73.61","13.51.88.187"]           9d

Istio uses the network1 network name, so set the WorkloadGroup’s network setting to something different, such as vm-network-1.
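
For example, the relevant part of a WorkloadGroup using the different network model could look like this (a sketch based on the WorkloadGroup example used later in this guide):

apiVersion: networking.istio.io/v1alpha3
kind: WorkloadGroup
metadata:
  name: analytics-v0
  namespace: smm-demo
spec:
  template:
    # Must differ from the network of the Istio control plane (network1 above)
    network: vm-network-1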

Firewall settings

From the networking perspective, the machines should be able to access:

  • the meshexpansion-gateways, and
  • the exposed dashboard ports.
  1. To get the IP addresses of meshexpansion-gateways, check the services in the istio-system namespace:

    kubectl get services -n istio-system istio-meshexpansion-cp-v115x
    

    The output should be similar to:

    NAME                                    TYPE           CLUSTER-IP     EXTERNAL-IP                                                               PORT(S)                                                                                           AGE
    istio-meshexpansion-cp-v115x            LoadBalancer   10.10.82.80    a4b01735600f547ceb3c03b1440dd134-690669273.eu-north-1.elb.amazonaws.com   15021:30362/TCP,15012:31435/TCP,15017:30627/TCP,15443:32209/TCP,50600:31545/TCP,59411:32614/TCP   9d
    
  2. To get the IP addresses of exposed dashboard ports, check the services in the smm-system namespace:

    kubectl get services -n smm-system smm-ingressgateway-external
    

    The output should be similar to:

    smm-ingressgateway-external      LoadBalancer   10.10.153.139   a4dcb5db6b9384585bba6cd45c2a0959-1520071115.eu-north-1.elb.amazonaws.com                   80:31088/TCP
    
  3. Configure your firewalls: make sure that the DNS names shown in the EXTERNAL-IP column are accessible from the VM instances.

Same network

To configure the same network model, the WorkloadGroup’s .spec.network field must be set to the same network as the one used by the current Istio deployment.

To check which network the existing Istio control planes are attached to, run the following command:

kubectl get istiocontrolplanes -A

The output should be similar to:

NAMESPACE      NAME       MODE     NETWORK    STATUS      MESH EXPANSION   EXPANSION GW IPS                 ERROR   AGE
istio-system   cp-v115x   ACTIVE   network1   Available   true             ["13.48.73.61","13.51.88.187"]           9d

Istio is using the network1 network name, so set the WorkloadGroup’s network setting to network1, too.
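
For example, the relevant part of the WorkloadGroup would be:

spec:
  template:
    # Matches the NETWORK column of the istiocontrolplane listed above
    network: network1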

2 - Quickstart

To get started with integrating virtual machines into your service mesh using Service Mesh Manager, we strongly recommend adding the first virtual machine by following the examples in this documentation and using the Service Mesh Manager demo application. This lets you get familiar with the process without the added complexity of dealing with a real application.

The high-level steps of the procedure are the following:

  1. If the dashboard of your Service Mesh Manager deployment is not available via a public URL, expose it. Otherwise, the virtual machines won’t be able to retrieve their Istio configuration from the control plane. For details, see Exposing the Dashboard.
  2. Complete the rest of the Prerequisites using the different network approach, and set up your firewall.
  3. Complete the procedure described in Add a virtual machine to the mesh.

3 - Dashboard

The virtual machines that you integrate into the mesh also become available on the Service Mesh Manager dashboard. Workloads running on virtual machines are treated as regular Kubernetes workloads, with the following differences.

  • On the MENU > TOPOLOGY page, workloads running on virtual machines are shown as workloads with the Virtual machine workload icon in their corner.

    VM Topology

  • On the MENU > WORKLOADS page, workloads running on virtual machines are marked with the Virtual machine workload icon.

    VM Workloads

  • When drilling down into the details of workloads, workloads running on virtual machines have a WorkloadEntry level instead of the Pod and Node levels of Kubernetes workloads.

  • On the HEALTH details of the workloads, CPU and memory saturation data is labeled as VM SATURATION: CPU and VM SATURATION: MEMORY.

  • When using traffic tapping on a workload running on a virtual machine, the name of the pod in the output is actually the hostname of the virtual machine.

4 - Use-cases

4.1 - Add a virtual machine to the mesh

This guide shows you how to manually add a virtual machine to the mesh, using the analytics service of the Service Mesh Manager demo application as an example.

If you already understand the procedure and want to configure your virtual machine to be automatically added to the mesh, see Autoscaling VM groups.

Prerequisites

  • You already have a virtual machine available.
  • You have root access to the virtual machine.
  • You have completed the Prerequisites.
  • If you are using the different network approach, make sure to set up your firewall.
  • If you want to exactly replicate the steps of this guide for testing purposes, install the Service Mesh Manager demo application on your cluster, and the jq tool on your computer.

If you are performing this procedure on a clean installation of the Service Mesh Manager demo application, the topology view of the smm-demo namespace should look similar to this:

Topology of the demo application running on a single cluster

Scale down the analytics service

Note: If you are not using the demo application to test VM integration, skip this step. You can install the demo application on your Service Mesh Manager cluster by running the smm demoapp install command.

This step and the examples in other steps of this guide rely on the analytics-v1 workload of the Service Mesh Manager demo application.

  1. Scale it down to have zero replicas:

    kubectl scale deploy -n smm-demo analytics-v1 --replicas=0
    
  2. Verify that there are no pods belonging to the analytics deployment. The following command should return an empty response:

    kubectl get pods -n smm-demo | grep analytics
    

Add an external workload to the mesh

The attached machines behave as Kubernetes workloads. This means that each machine has a set of labels assigned that Services can use to match the machine. All Kubernetes Pods have a service account assigned (usually the default one of the namespace, if not specified otherwise). The machine uses this service account to authenticate to the Istio control plane.

To add an external workload to the mesh, create a WorkloadGroup in the namespace that the machine will be attached to. This object represents a group of machines serving the same service, and is analogous to the Kubernetes concept of a Deployment.

For example, to add a virtual machine serving the analytics traffic in the demo application, use the following object:

  apiVersion: networking.istio.io/v1alpha3
  kind: WorkloadGroup
  metadata:
    labels:
      app: analytics
      version: v0
    name: analytics-v0
    namespace: smm-demo
  spec:
    metadata:
      labels:
        app: analytics
        version: v0
    probe:
      httpGet:
        path: /
        host: 127.0.0.1
        port: 8080
        scheme: HTTP
    template:
      network: vm-network-1
      ports:
        http: 8080
        grpc: 8082
        tcp: 8083
      serviceAccount: default

For details on these settings, see Istio resources.

mTLS settings (optional)

After Istio is started on the virtual machine, Istio takes over the service ports defined in the WorkloadGroup resource. Depending on your settings, it will also start enforcing mTLS on those service ports.

If external (non-mesh) services communicate with the virtual machine, ensure that communication without encryption is permitted on the service ports. To do so, create a PeerAuthentication object in the smm-demo namespace. Make sure that the matchLabels selector only matches the WorkloadGroup, and not any other Kubernetes deployment, to avoid permitting unencrypted communication where it’s not needed. In the following example, the matchLabels selector includes both the app and the version labels.

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: analytics
  namespace: smm-demo
spec:
  mtls:
    mode: PERMISSIVE
  selector:
    matchLabels:
      app: analytics
      version: v0

We recommend setting the mTLS mode to PERMISSIVE, as it allows unencrypted traffic from outside the mesh, while in-mesh traffic still uses mTLS.

Note: In case of any compatibility issues, you can set the mode to DISABLE, but in this case all traffic will be unencrypted.

Set up the virtual machine

Required packages on the virtual machines

In addition to the OS package dependencies prerequisites, this example requires python3. To install the python3 package, run the following command:

  • On Ubuntu:

    apt-get update && apt-get install -y python3
    
  • On RHEL:

    yum install -y python3
    

Executing the example VM-based service workload

Before you can register the virtual machine, the workload must already be running on the VM. The following instructions start an example HTTP server workload on the virtual machine.

In the example using the demo application, open a terminal on the machine (for example, using SSH), and start a simple web server that serves files from the empty-dir directory (the demo application requires only the availability of http://<pod-id>:8080/**).

Note: The nohup shell command keeps the python3 http.server process running after you log out of the shell.

mkdir -p empty-dir
cd empty-dir
nohup python3 -m http.server 8080 &

Collect the required data

To attach the VM to the mesh, you’ll need the following information:

  • The URL of the dashboard
  • The namespace and name of the WorkloadGroup (the analytics-v0 WorkloadGroup in the smm-demo namespace in this example)
  • The bearer token of the service account referenced in the .spec.template.serviceAccount of the WorkloadGroup
  • (Optional) The IP address that the clusters in the service mesh can use to access the VM. If this is the same as the IP the public internet sees for the VM, then Service Mesh Manager detects the VM’s IP automatically.

To acquire the bearer token of the ServiceAccount, complete the following steps.

  1. On Kubernetes 1.24 and newer, the token secrets for service accounts are not created automatically. Create the token manually. For details, see the Kubernetes documentation.
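
    For example, a Secret similar to the following sketch creates a token for the default service account in the smm-demo namespace (the Secret name is an arbitrary choice):

    apiVersion: v1
    kind: Secret
    metadata:
      name: default-token
      namespace: smm-demo
      annotations:
        kubernetes.io/service-account.name: default
    type: kubernetes.io/service-account-token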

  2. Download and run the following script. This script fetches the bearer token for the service account in namespace SA_NAMESPACE with the name of SA_SERVICEACCOUNT and saves it into the ~/bearer-token file.

    #!/bin/bash -e

    SA_NAMESPACE="smm-demo"
    SA_SERVICEACCOUNT="default"
    SA_BEARER_TOKEN_FILE=~/bearer-token

    # Find the secret that holds the token of the given service account
    SA_SECRET_NAME=$(kubectl get secret -n ${SA_NAMESPACE} -o jsonpath="{.items[?(@.metadata.annotations.kubernetes\.io/service-account\.name==\"${SA_SERVICEACCOUNT}\")].metadata.name}")
    if [ -z "$SA_SECRET_NAME" ]; then
            echo "Cannot find secret that contains the token for the service account"
            exit 1
    fi

    # Decode the token and save it into the token file
    mkdir -p $(dirname $SA_BEARER_TOKEN_FILE)
    if ! kubectl get secret -n $SA_NAMESPACE ${SA_SECRET_NAME} -o json | jq -r '.data.token | @base64d' > $SA_BEARER_TOKEN_FILE ; then
            echo "cannot get service account bearer token"
            exit 1
    fi
    

Prepare the virtual machine

To prepare the virtual machine to be attached to the mesh, complete the following steps.

  1. Open a terminal (for example, SSH) to the virtual machine.

  2. Install smm-agent on the virtual machine. The agent ensures that the machine’s Istio configuration is always up-to-date. Run the following command as the root user:

    curl http://<dashboard-url>/get/smm-agent | bash
    

    The output should be similar to:

    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                    Dload  Upload   Total   Spent    Left  Speed
    100  1883  100  1883    0     0  13744      0 --:--:-- --:--:-- --:--:-- 13744
    Detecting host properties:
    - OS: linux
    - CPU: amd64
    - Packager: deb
    - SMM Base URL: http://a6bc8072e26154e5c9084e0d7f5a9c92-2016650592.eu-north-1.elb.amazonaws.com
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                    Dload  Upload   Total   Spent    Left  Speed
    100 20.3M    0 20.3M    0     0  22.2M      0 --:--:-- --:--:-- --:--:-- 22.2M
    Selecting previously unselected package smm-agent.
    (Reading database ... 63895 files and directories currently installed.)
    Preparing to unpack /tmp/smm-agent-package ...
    Unpacking smm-agent (1.9.1~snapshot.202203311042-SNAPSHOT-ab2e8684a) ...
    Setting up smm-agent (1.9.1~snapshot.202203311042-SNAPSHOT-ab2e8684a) ...
    Created symlink /etc/systemd/system/multi-user.target.wants/smm-agent.service → /lib/systemd/system/smm-agent.service.
    ✓ dashboard url set url=<dashboard-url>
    
  3. Specify which WorkloadGroup and namespace you want to attach the machine to by running the following command:

    smm-agent set workload-group <namespace> <workloadgroup>
    

    For example:

    smm-agent set workload-group smm-demo analytics-v0
    

    The output should be similar to:

    ✓ target workload group set namespace=smm-demo, name=analytics-v0
    
  4. Specify the bearer token you have acquired in a previous step (replace <token> with the actual token, not the filename):

    smm-agent set bearer-token <token>
    

    The output should be similar to:

    ✓ bearer token set
    
  5. (Optional) Set the IP address the service mesh should use to access the VM.

    smm-agent set node-ip <VM's IP>
    

    The output should be similar to:

    ✓ node-ip is set ip=<VM's IP>
    
  6. Validate the configuration of smm-agent by running the following command. If the configuration is invalid, an error is shown.

    smm-agent show-config
    

    The output should be similar to:

    ✓ dashboard url=http://a6bc8072e26154e5c9084e0d7f5a9c92-2016650592.eu-north-1.elb.amazonaws.com
    ✓ target workload-group namespace=smm-demo, name=analytics-v0
    ✓ no additional labels set
    ✓ bearer token set
    ✓ node-ip is set
    ✓ configuration is valid
    

Attach the virtual machine to the mesh

Now that you have started the workload (HTTP server) and configured smm-agent, you can attach the VM to the mesh. To do so, run a reconciliation on this host. This step will:

  • configure and start Istio, so the virtual machine becomes part of the mesh
  • ensure that the cluster configuration is properly set
  • start smm-agent in the background so that the system is always up-to-date

Run the following command:

smm-agent reconcile

The output should be similar to:

✓ reconciling host operating system
✓ configuration loaded config=/etc/smm/agent.yaml
✓ install-pilot-agent ❯ downloading and installing OS package component=pilot-agent, platform={linux amd64 deb 0xc00000c168}
✓ install-pilot-agent ❯ downloader reconciles with exponential backoff downloader={pilot-agent {linux amd64 deb 0xc00000c168} true  0xc0002725b0}
...
✓ systemd-ensure-smm-agent-running/systemctl ❯ starting service args=[smm-agent]
✓ systemd-ensure-smm-agent-running/systemctl/start ❯ executing command command=systemctl, args=[start smm-agent], timeout=5m0s
✓ systemd-ensure-smm-agent-running/systemctl/start ❯ command executed successfully command=systemctl, args=[start smm-agent], stdout=, stderr=
✓ changes were made to the host operating system
✓ reconciled host operating system

Verify connectivity

If the attachment was successful, a new WorkloadEntry has been created for the new node in the namespace of the WorkloadGroup (if you followed the examples, in the smm-demo namespace). Verify it by completing the following steps.

  1. Check that the new WorkloadEntry exists:

    kubectl get workloadentries -n smm-demo
    

    The output should be similar to:

    NAME                                    AGE     ADDRESS
    analytics-v0-3.68.232.96-vm-network-1   2m40s   3.68.232.96
    
  2. Check the healthiness of the service:

    kubectl describe workloadentries analytics-v0-3.68.232.96-vm-network-1
    

    The output should be similar to:

    Name:         analytics-v0-3.68.232.96-vm-network-1
    Namespace:    smm-demo
    Labels:       app=analytics
    ...
    Status:
      Conditions:
        Last Probe Time:       2022-04-01T05:47:47.472143851Z
        Last Transition Time:  2022-04-01T05:47:47.472144917Z
        Status:                True
        Type:                  Healthy
    
    
  3. On the Service Mesh Manager dashboard, navigate to MENU > TOPOLOGY and verify that the VM is visible and that it is receiving traffic. If you have performed this procedure on a clean installation of the Service Mesh Manager demo application, the difference on the topology view of the smm-demo namespace is that the analytics workload is now running on a virtual machine (indicated by the blue icon on the workload), and should look similar to this:

    Topology page with VMs

4.2 - VM to Kubernetes migration

When migrating an existing workload to the mesh (and Kubernetes), you have to complete the following main steps:

  1. Add the virtual machine to the mesh, so the original workload that is running in the virtual machine is available in the mesh.
  2. Configure traffic shifting that will allow you to route traffic from the virtual machine to the Kubernetes workload.
  3. Add the Kubernetes workload that will replace the virtual machine.
  4. Shift traffic to Kubernetes gradually and test that the Kubernetes workload works properly, even under high load.
  5. Remove the virtual machine when you have successfully completed the migration.

Note: The configuration examples use the analytics-v1 workload of the demo application. Adjust them as needed for your environment.

Add the VM to the mesh

Complete the prerequisites and attach the virtual machine to the mesh as described in Add a virtual machine to the mesh.

Set up traffic shifting

The migration must be a controlled process, especially in the case of a production system. This step ensures that all traffic keeps going to the virtual machine even when the Kubernetes workload is started. This avoids service disruptions and allows you to test the Kubernetes workload and transfer the traffic gradually. Otherwise, traffic would be split in a round-robin fashion between the VM and the Pod.

For details on creating the routing rule, see Routing.

Make sure to set the weight of the routing rule corresponding to the virtual machine to 100.

Traffic shifting

Alternatively, create and apply Kubernetes resources similar to the following:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: analytics
  namespace: smm-demo
spec:
  host: analytics.smm-demo.svc.cluster.local
  subsets:
  - labels:
      version: v0
    name: v0
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: analytics-smm-demo-lmj7m
  namespace: smm-demo
spec:
  hosts:
  - analytics.smm-demo.svc.cluster.local
  http:
  - route:
    - destination:
        host: analytics.smm-demo.svc.cluster.local
        port:
          number: 8080
        subset: v0
      weight: 100

Add the Kubernetes workload

Now that you have guaranteed that traffic keeps flowing to the virtual machine, you can deploy the new workload that is going to replace the virtual machine.

In this example, we simply scale up the analytics-v1 deployment (which was scaled down as part of Add a virtual machine to the mesh):

kubectl scale deploy -n smm-demo analytics-v1 --replicas=1

Wait until it’s up:

kubectl get pods -n smm-demo

The output should be similar to:

NAME                                READY   STATUS    RESTARTS   AGE
analytics-v1-7b96898ddc-9czpp       2/2     Running   0          18s
bombardier-66786577f7-tnjll         2/2     Running   0          18h
bookings-v1-7d8d76cd6b-68h6s        2/2     Running   0          18h
catalog-v1-5864c4b7d7-fvnqs         2/2     Running   0          18h
database-v1-65678c5dd6-lr2hh        2/2     Running   0          18h
frontpage-v1-776d76965-zbx67        2/2     Running   0          18h
movies-v1-6f7958c8c4-76ksk          2/2     Running   0          18h
movies-v2-568d4c4f4b-nrtkm          2/2     Running   0          18h
movies-v3-84b4887764-h2bzv          2/2     Running   0          18h
mysql-58458785-d4wx7                2/2     Running   0          18h
notifications-v1-544d6f77f7-jcdq6   2/2     Running   0          18h
payments-v1-7c955bccdd-l2czq        2/2     Running   0          18h
postgresql-75b94cdc9c-h6w64         2/2     Running   0          18h

Note: The workload will not show up on the topology view, as it does not receive any traffic yet.

For production systems, verify that the workload functions as expected before routing traffic to it.
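
For example, you could smoke-test the new pod directly before any mesh traffic reaches it (a quick sketch using port-forwarding; expects an HTTP 200 from the demo workload):

kubectl port-forward -n smm-demo deploy/analytics-v1 8080:8080 &
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/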

Shifting traffic

Now that the new Kubernetes workload is functional, route a portion of the traffic to it, and verify that it is working as expected. For details on creating the routing rule, see Routing.

Traffic shifting

Alternatively, adjust the related Kubernetes resources, for example:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: analytics
  namespace: smm-demo
spec:
  host: analytics.smm-demo.svc.cluster.local
  subsets:
  - labels:
      version: v0
    name: v0
  - labels:
      version: v1
    name: v1
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: analytics-smm-demo-lmj7m
  namespace: smm-demo
spec:
  hosts:
  - analytics.smm-demo.svc.cluster.local
  http:
  - route:
    - destination:
        host: analytics.smm-demo.svc.cluster.local
        port:
          number: 8080
        subset: v0
      weight: 90
    - destination:
        host: analytics.smm-demo.svc.cluster.local
        port:
          number: 8080
        subset: v1
      weight: 10

Repeat this step to gradually increase the traffic to the Kubernetes workload.

Completing the migration

If you have verified that the mixed setup works, change the traffic shifting to route 100% of the traffic to the Kubernetes workload.
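
Following the earlier example, the final VirtualService route would send all traffic to the v1 subset:

  http:
  - route:
    - destination:
        host: analytics.smm-demo.svc.cluster.local
        port:
          number: 8080
        subset: v1
      weight: 100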

If the Kubernetes workload is handling 100% of the traffic without problems, you can remove the virtual machine from the mesh. For details, see Remove VM from the mesh.

4.3 - Upgrading Istio

Istio regularly gets security updates (patch version updates) and new features (minor/major version updates). Regarding upgrades, Service Mesh Manager uses the same approach for virtual machines integrated into the mesh as for Kubernetes workloads. For details, see Canary control plane upgrades.

Patch version updates

If the Kubernetes deployment is upgraded to a new version of Service Mesh Manager that contains a newer (patch) version of Istio, the smm-agent running on the host will:

  • Automatically upgrade itself and restart.
  • Automatically upgrade Istio, but not restart it.

Upgrading and restarting smm-agent ensures that Service Mesh Manager configures Istio in the best possible way according to the latest tests. Since smm-agent does not serve live traffic, restarting it does not endanger the availability of the production environment.

Restarting Istio would cause a service disruption that is not acceptable in production environments. Given that there's no standard way of determining how to temporarily drain traffic from a VM, or even to check whether the VM is part of a highly available setup, you must restart Istio when you see fit, for example, during a dedicated maintenance window.

The Service Mesh Manager dashboard shows the virtual machines that you need to restart as a validation error for the given WorkloadEntry.

The old Istio version keeps running until you restart the VM (or Istio itself). The new version starts automatically after the restart.

To restart Istio, run the following command on the virtual machine (smm-agent automatically starts the new Istio version when it detects that Istio is not running):

systemctl stop istio

Minor/major version updates

When the namespace hosting the VM is migrated to a new version of the control plane (see Canary control plane upgrades), smm-agent automatically notices that a new version of Istio is available.

At this point it executes the same steps as with patch version updates, but you must restart Istio (or the virtual machine) when traffic characteristics allow for that downtime.

4.4 - Autoscaling VM groups

Add a virtual machine to the mesh details how to add a VM manually to the mesh. However, Service Mesh Manager also allows for automated addition as part of any autoscaling activity, such as relying on AWS’s AutoScaling Groups or Google Cloud’s Managed Instance Groups.

One way to achieve mesh membership is to add the commands mentioned in Add a virtual machine to the mesh to the init/cloud-init script of the VM so that they run at boot time, as shown in the sketch after this paragraph. Alternatively, if the VM image is custom-built using packer or any other solution, you can embed an already configured smm-agent into the image.
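
For example, a minimal cloud-init sketch could look like the following (replace the placeholders with values suitable for your environment; smm-agent reconcile installs and starts Istio on first boot):

#cloud-config
runcmd:
  - curl http://<dashboard-url>/get/smm-agent | bash
  - smm-agent set workload-group <namespace> <workloadgroup-name>
  - smm-agent set bearer-token <token>
  - smm-agent reconcile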

Prerequisites

Build a VM image for automatic Istio attachment

During the build process of the virtual machine image, execute the following commands as the root user. Replace the parameters between brackets with suitable values for your environment (replace <token> with the actual token, not the filename).

curl http://<dashboard-url>/get/smm-agent | bash
smm-agent set workload-group <namespace> <workloadgroup-name>
smm-agent set bearer-token <token>
systemctl enable smm-agent

This is a subset of the steps required by the manual registration. These steps ensure that:

  • smm-agent is installed on the VM image
  • smm-agent is configured to be able to connect to the mesh
  • smm-agent is scheduled to be started on the next startup (systemctl enable smm-agent)

When a new instance of this VM image starts, smm-agent contacts the Kubernetes cluster running in the mesh, downloads the current version of Istio, and starts it as soon as it’s fully configured.

This approach ensures that newly started VMs run the right version of Istio.

4.5 - Maintenance

To perform scheduled maintenance on the virtual machine (for example, a restart), you can use one of the following methods to stop traffic to the machine. If you want to completely remove the VM from the mesh, see Remove VM from the mesh.

Shut down the service

The first approach is to shut down the service that the WorkloadGroup has health checks defined against. As a result, Istio will not route any traffic to the virtual machine.
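
For example, if the VM runs the example HTTP server used in this guide, stopping it causes the health check to fail, so Istio stops routing traffic to the VM:

# Stop the example python3 web server so the WorkloadGroup probe fails
pkill -f 'http.server 8080'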

De-register the VM from the mesh

Another approach is to open a terminal to the virtual machine, and stop the following services to de-register the virtual machine from the mesh:

systemctl stop smm-agent    # smm-agent would restart istio automatically if it's not running
systemctl stop istio

After you have finished the maintenance, re-register the VM.

Re-register the VM to the mesh

After you have performed the required maintenance, complete the following steps.

  1. Run the following command to re-register the VM to the mesh:

    systemctl start smm-agent
    

    The smm-agent will automatically synchronize the Istio configuration from the mesh’s cluster and start Istio.

  2. Verify that the WorkloadEntry for the virtual machine has been re-created in the namespace of the workload.

    Check that the new WorkloadEntry exists:

    kubectl get workloadentries -n smm-demo
    

    The output should be similar to:

    NAME                                    AGE     ADDRESS
    analytics-v0-3.68.232.96-vm-network-1   2m40s   3.68.232.96
    

4.6 - Remove VM from the mesh

If you have successfully migrated your workload from a virtual machine to a Kubernetes workload, or if the virtual machine is not needed in the mesh anymore, you can uninstall Istio and smm-agent from the virtual machine.

Disconnect the VM from the mesh

Note: If you want to decommission the virtual machine, you can simply delete the instance. Disconnecting from the mesh is only needed if you want to keep using the virtual machine without having smm-agent and Istio running.

  1. To remove Istio from the virtual machine, stop the background services:

    systemctl stop smm-agent # Prevents auto restart of istio and node-exporter
    systemctl stop smm-node-exporter
    systemctl stop istio
    

    The last command will not just stop Istio, but will also cause the VM’s WorkloadEntry to be removed from the VM’s namespace.

  2. Use the package manager of the virtual machine’s operating system to remove the istio-sidecar and the smm-agent packages. For example:

    • On Ubuntu-based systems:

      dpkg -r istio-sidecar smm-agent
      
    • On RedHat-based systems:

      rpm -e istio-sidecar smm-agent
      
  3. Remove the smm-agent download cache:

    rm -f /var/cache/smm-agent/downloads/*
    

Remove Kubernetes resources

Remove the associated WorkloadGroup and PeerAuthentication objects from your workload’s namespace.

5 - Istio resources

When adding an external workload to the mesh, two crucial Istio resources are used.

  • A WorkloadGroup needs to be created in the namespace that the machine will be attached to. This object represents a group of machines serving the same service. This is analogous to the Kubernetes concept of a Deployment.
  • Each virtual machine attached to the mesh will be represented by a WorkloadEntry object in the workload’s namespace. This is analogous to the Pod concept of Kubernetes.

The VM attachment flow used in Service Mesh Manager relies on the PILOT_ENABLE_WORKLOAD_ENTRY_AUTOREGISTRATION and PILOT_ENABLE_WORKLOAD_ENTRY_HEALTHCHECKS features.

Autoregistration

To understand the autoregistration feature, first take a look at a WorkloadGroup resource:

  apiVersion: networking.istio.io/v1alpha3
  kind: WorkloadGroup
  metadata:
    labels:
      app: analytics
      version: v1
    name: analytics-v1
    namespace: smm-demo
  spec:
    metadata:
      labels:
        app: analytics
        version: v1
    template:
      ports:
        http: 8080
      serviceAccount: default

If autoregistration is enabled, the Istio pilot-agent running on the virtual machine connects to the istio-meshexpansion-gateway in the istio-system namespace and presents the specified ServiceAccount's bearer token (and some registration details that Service Mesh Manager sets automatically) to authenticate itself to the Istio control plane. If the authentication is successful, the Istio control plane creates a WorkloadEntry in the cluster, like this:

apiVersion: networking.istio.io/v1beta1
kind: WorkloadEntry
metadata:
  annotations:
    istio.io/autoRegistrationGroup: analytics-v1
    istio.io/connectedAt: "2022-03-31T06:52:14.739292073Z"
    istio.io/workloadController: istiod-cp-v115x-df9f5d556-9kvqs
  labels:
    app: analytics
    hostname: ip-172-31-22-226
    istio.io/rev: cp-v115x.istio-system
    service.istio.io/canonical-name: analytics
    service.istio.io/canonical-revision: v1
    topology.istio.io/network: vm-network-1
  name: analytics-v1-3.67.91.181-vm-network-1
  namespace: smm-demo
  ownerReferences:
  - apiVersion: networking.istio.io/v1alpha3
    controller: true
    kind: WorkloadGroup
    name: analytics-v1
    uid: d01777d5-4294-44e7-a311-3596c2f63bb1
spec:
  address: 1.2.3.4
  labels:
    app: analytics
    hostname: ip-172-31-22-226
    istio.io/rev: cp-v115x.istio-system
    service.istio.io/canonical-name: analytics
    service.istio.io/canonical-revision: v1
    topology.istio.io/network: vm-network-1
  locality: eu-central-1/eu-central-1a
  network: vm-network-1
  serviceAccount: default

Any attached machine that has a corresponding WorkloadEntry resource behaves as a Kubernetes workload and has a set of labels assigned that Services can use to match the machine.

For example, the following Service will route traffic to the virtual machine due to the .spec.selector matching the WorkloadEntry’s labels (.metadata.labels):

apiVersion: v1
kind: Service
metadata:
  name: analytics
  namespace: smm-demo
spec:
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: analytics
  sessionAffinity: None
  type: ClusterIP

Autoregistration is also crucial when removing workloads from the mesh. If the istio sidecar process is stopped on the host, the Istio control plane automatically removes the related WorkloadEntry custom resource. This can be used to temporarily remove a VM from the mesh for maintenance or troubleshooting purposes, but it also ensures that if Istio is uninstalled from the node, the node automatically de-registers itself without the need to manually update any Kubernetes resources.

Health checks

The PILOT_ENABLE_WORKLOAD_ENTRY_HEALTHCHECKS setting provided by Istio allows health checks to be defined for VMs. If the health check fails, Istio will not route any traffic to the workload.

In the case of Service Mesh Manager, the health checks are defined in the WorkloadGroup resource, and our agent running on the VM ensures that Istio uses that setting. For example, the following WorkloadGroup defines an HTTP health check:

apiVersion: networking.istio.io/v1alpha3
kind: WorkloadGroup
metadata:
  labels:
    app: analytics
    version: v1
  name: analytics-v1
  namespace: smm-demo
spec:
  metadata:
    labels:
      app: analytics
      version: v1
  probe:
    httpGet:
      host: 127.0.0.1
      path: /
      port: 8080
      scheme: HTTP
  template:
    network: vm-network-1
    serviceAccount: default

The .spec.probe definition is the same as the Probe object of the official Kubernetes API. The defined probe is analogous to the liveness probe of a Pod: it is checked continuously while Istio is running on the machine. The only difference is that Istio does not restart the VM if the probe fails; instead, it stops routing traffic to the WorkloadEntry.

You can query the status of the health checks from Kubernetes by checking the machine’s WorkloadEntry:

apiVersion: networking.istio.io/v1beta1
kind: WorkloadEntry
metadata:
  annotations:
    istio.io/autoRegistrationGroup: analytics-v1
    istio.io/connectedAt: "2022-03-31T06:52:14.739292073Z"
    istio.io/workloadController: istiod-cp-v115x-df9f5d556-9kvqs
  labels:
    app: analytics
    hostname: ip-172-31-22-226
    istio.io/rev: cp-v115x.istio-system
    service.istio.io/canonical-name: analytics
    service.istio.io/canonical-revision: v1
    topology.istio.io/network: vm-network-1
  name: analytics-v1-3.67.91.181-vm-network-1
  namespace: smm-demo
  ownerReferences:
  - apiVersion: networking.istio.io/v1alpha3
    controller: true
    kind: WorkloadGroup
    name: analytics-v1
    uid: d01777d5-4294-44e7-a311-3596c2f63bb1
spec:
  address: 1.2.3.4
  labels:
    app: analytics
    hostname: ip-172-31-22-226
    istio.io/rev: cp-v115x.istio-system
    service.istio.io/canonical-name: analytics
    service.istio.io/canonical-revision: v1
    topology.istio.io/network: vm-network-1
  locality: eu-central-1/eu-central-1a
  network: vm-network-1
  serviceAccount: default
status:
  conditions:
  - lastProbeTime: "2022-03-31T07:23:07.236758604Z"
    lastTransitionTime: "2022-03-31T07:23:07.236759090Z"
    status: "True"
    type: Healthy

In the status field of the custom resource, the conditions array contains an entry whose type field is set to Healthy. If that condition's status is True, the machine is considered healthy and receives traffic.
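
For example, the following command prints the health status of the example WorkloadEntry (a one-liner sketch using a JSONPath filter):

kubectl get workloadentry -n smm-demo analytics-v1-3.67.91.181-vm-network-1 -o jsonpath='{.status.conditions[?(@.type=="Healthy")].status}'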

6 - Known limitations

Before trying to attach a virtual machine to the mesh, make sure that you understand the following limitations of the solution:

  • Because of the way Istio operates, the VM is only able to resolve services and DNS names from the same Kubernetes namespace as it’s attached to. This means that communication from the VM to other Kubernetes namespaces is not possible.

  • When you are installing Istio on the node for the first time, network connections might be disrupted (TCP reconnections will happen) as Istio initializes its iptables rules. Prepare for such a micro-outage when first provisioning Istio on a production node.
  • You can attach VMs only to an active Istio cluster that's running the Service Mesh Manager control plane.

VMs in active-active scenarios

If you are running Service Mesh Manager in a multi-cluster scenario with multiple active Istio control planes, note that:

  • You can attach VMs only to an active Istio cluster that's running the Service Mesh Manager control plane.

  • The VMs can’t automatically reconnect to the mesh expansion gateway when there is an outage on the cluster to which the VM is connected.

    After the outage, outdated IP addresses remain in the /etc/hosts file of the VM, so the VM can't connect to any of the mesh expansion gateways. To restore traffic to the VM, manually update the IP addresses in the /etc/hosts file of the VM to the current IPs of a mesh expansion gateway. You can get the ingress IP address of an IstioMeshGateway by running the following command:

    kubectl get istiomeshgateways.servicemesh.cisco.com -n <namespace-of-the-meshgateway> <name-of-the-meshgateway> -o jsonpath='{.status.GatewayAddress[0]}'