Multi-cluster - single mesh

Multi-cluster overview

Service Mesh Manager can construct an Istio service mesh that spans multiple clusters. In this scenario, you combine multiple clusters into a single service mesh that you can manage from a single Istio control plane or from multiple ones.

Single mesh scenarios are best suited to use cases where clusters are configured together, sharing resources, and are generally treated as one infrastructural component within an organization.

Istio clusters and SMM clusters

When you are working with Service Mesh Manager in a multi-cluster scenario, you must understand the following concepts:

  1. Every Istio cluster you attach to the mesh is either a remote Istio cluster or a primary Istio cluster. Remote Istio clusters don’t have a separate Istio control plane, while primary Istio clusters do. To understand the difference between the remote Istio and primary Istio clusters, see the Istio control plane models document.
  2. When you install Service Mesh Manager on a cluster, it deploys a primary Istio control plane, making that cluster a primary Istio cluster. This cluster is effectively the primary Service Mesh Manager cluster.
  3. Even if you add multiple primary Istio clusters to the mesh, Service Mesh Manager runs only on the primary Service Mesh Manager cluster (even though some of its components are replicated to the other clusters).
  4. You can deploy Service Mesh Manager in an active-passive model. The active Service Mesh Manager control plane has all components installed on a primary Istio cluster. The passive Service Mesh Manager control plane has only a limited number of components installed on a primary or remote Istio cluster. Only one Service Mesh Manager control plane is active; all other Service Mesh Manager control planes are passive.

This means that when using the Service Mesh Manager CLI (for example, to attach or detach a new cluster), you must run it in the context of the active Service Mesh Manager cluster, even if there are multiple primary Istio clusters in the mesh.

Creating a multi-cluster mesh

Read the multi-cluster installation guide for details on how to set up a multi-cluster mesh.

1 - Cluster network

A multi-cluster mesh connects multiple clusters into a single service mesh. The topology of the mesh – how the different clusters are grouped into networks and how each cluster connects to the mesh – determines how the clusters connect to each other and how the pods, services, and workloads can access resources in other clusters.

Communication between clusters

In a multi-cluster mesh, every cluster belongs to a specific network. Clusters belonging to the same mesh can access each other's services, but how this happens depends on which networks the clusters belong to.

  • If the clusters belong to the same network, their pods can access each other directly over a flat network, without using a cluster gateway.
  • If the clusters belong to different networks, a cluster's services can be accessed only through the gateway of that cluster. Since Service Mesh Manager assigns each cluster to its own network by default, this is the default behavior.

The networkName label of the cluster determines which network the cluster belongs to. By default, every cluster belongs to its own network, where the name of the network is the name of the cluster.

Note: If the name of the cluster cannot be used as a Kubernetes resource name (for example, because it contains an underscore, colon, or another special character), you must manually specify a name to use when you are attaching the cluster to the service mesh. For example:

smm istio cluster attach <PEER-CLUSTER-KUBECONFIG-FILE> --name <KUBERNETES-COMPLIANT-CLUSTER-NAME> --active-istio-control-plane

Otherwise, the following error occurs when you try to attach the cluster:

could not attach peer cluster: graphql: Secret "example-secret" is invalid: metadata.name: Invalid value: "gke_gcp-cluster_region": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.'

You can specify the network of the cluster when you are attaching the cluster to the mesh.

Assigning clusters to different networks allows you to optimize the topology of your mesh network. Depending on your cloud provider, there might be differences in cross-cluster latencies and transfer costs between the different connection types.
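
For example, assuming the pods of two peer clusters can reach each other directly over a flat network (for instance, VPC-peered clusters), you could place both clusters on the same mesh network by attaching them with the same --network-name value. The kubeconfig placeholders and the network name below are illustrative:

smm istio cluster attach <PEER-CLUSTER-1-KUBECONFIG-FILE> --network-name shared-network
smm istio cluster attach <PEER-CLUSTER-2-KUBECONFIG-FILE> --network-name shared-network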

Network connectivity requirements

This section lists the networking configuration necessary for a multi-cluster scenario.

  • If the clusters belong to the same network, connectivity is already in place and nothing else needs to be done.
  • If the clusters belong to different networks, and the endpoints in the networks are publicly accessible without restrictions, nothing else needs to be done either.
  • If the clusters belong to different networks, but there are restrictions on which endpoints can be accessed, at least the following endpoints must be accessible for a proper multi-cluster setup with Service Mesh Manager (see the lookup sketch after this list):
    • From all clusters:
    • From the primary cluster(s):
      • The Kubernetes API server address of every peer cluster
      • The IP addresses or host names of the meshexpansion-gateway LoadBalancer type services on the peer clusters, on port 15443
    • From peer clusters:
      • The IP address or host name of the meshexpansion-gateway LoadBalancer type service on the primary cluster(s), on ports 15443 and 15012
      • The IP address or host name of the meshexpansion-gateway LoadBalancer type service on the primary cluster where Service Mesh Manager is installed, on ports 50600 and 59411
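
To find the address a peer must be able to reach, you can look up the external address of the meshexpansion gateway service on the given cluster. A minimal lookup, assuming kubectl access to that cluster:

kubectl get services --all-namespaces | grep meshexpansion

The EXTERNAL-IP (or host name) column of the LoadBalancer service shows the address that must be reachable on the ports listed above.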

CAUTION:

To change the network of a cluster already attached to the mesh, you have to detach and then re-attach the cluster. Simply updating the networkName label is NOT enough. To detach a cluster, see Detach a cluster from the mesh.

2 - Attach a new cluster to the mesh

Service Mesh Manager automates the process of creating the resources necessary for the peer cluster, generates and sets up the kubeconfig for that cluster, and attaches the cluster to the mesh.

Note: If you are using Service Mesh Manager with a commercial license in a multi-cluster scenario, Service Mesh Manager automatically synchronizes the license to the attached clusters. If the peer cluster already has a license, it is automatically deleted and replaced with the license of the primary Service Mesh Manager cluster. Detaching a peer cluster automatically deletes the license from the peer cluster.

To attach a new cluster to the service mesh managed by Service Mesh Manager, complete the following steps. For an overview of the network settings of the cluster, see Cluster network.

Prerequisites

  • The Service Mesh Manager CLI tool installed on your computer.
  • Access to the KUBECONFIG file of the cluster you want to attach to the service mesh.
  • Access to the KUBECONFIG file of the cluster that runs the primary Service Mesh Manager service.
  • Network connectivity properly configured between the participating clusters.

Steps

  1. Find out the name of the network you want to attach the cluster to.

    • By default, every cluster belongs to its own network, where the name of the network is the name of the cluster.
    • If you want to attach the cluster to an existing network, you must manually specify the name of the network when you are attaching the cluster to the service mesh using the --network-name option in the next step.

    If you have to specify the network name manually, note the name of the network you want to use. You can check the existing network names using the smm istio cluster status command.

  2. On the primary Service Mesh Manager cluster, attach the peer cluster to the mesh using one of the following commands.

    Note: To understand the difference between the remote Istio and primary Istio clusters, see the Istio control plane models section in the official Istio documentation. The short summary is that remote Istio clusters do not have a separate Istio control plane, while primary Istio clusters do.

    The following commands automate the process of creating the resources necessary for the peer cluster, generate and set up the kubeconfig for that cluster, and attach the cluster to the mesh.

    • To attach a remote Istio cluster with the default options, run:

      smm istio cluster attach <PEER_CLUSTER_KUBECONFIG_FILE>
      
    • To attach a primary Istio cluster (one that has an active Istio control plane installed), run:

      smm istio cluster attach <PEER_CLUSTER_KUBECONFIG_FILE> --active-istio-control-plane
      

      Note: If the name of the cluster cannot be used as a Kubernetes resource name (for example, because it contains an underscore, colon, or another special character), you must manually specify a name to use when you are attaching the cluster to the service mesh. For example:

      smm istio cluster attach <PEER-CLUSTER-KUBECONFIG-FILE> --name <KUBERNETES-COMPLIANT-CLUSTER-NAME> --active-istio-control-plane
      

      Otherwise, the following error occurs when you try to attach the cluster:

      could not attach peer cluster: graphql: Secret "example-secret" is invalid: metadata.name: Invalid value: "gke_gcp-cluster_region": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.'
      
    • To override the name of the cluster, run:

      smm istio cluster attach <PEER_CLUSTER_KUBECONFIG_FILE> --name <kubernetes-compliant-cluster-name>
      
    • To specify the network name, run:

      smm istio cluster attach <PEER_CLUSTER_KUBECONFIG_FILE> --network-name <network-name>
      

    Note: If you are using Service Mesh Manager with a commercial license in a multi-cluster scenario, Service Mesh Manager automatically synchronizes the license to the attached clusters. If the peer cluster already has a license, it is automatically deleted and replaced with the license of the primary Service Mesh Manager cluster. Detaching a peer cluster automatically deletes the license from the peer cluster.

  3. Wait until the peer cluster is attached. Attaching the peer cluster takes some time, because it can complete only after the ingress gateway address of the peer cluster becomes available. You can verify that the peer cluster is attached successfully with the following command:

    smm istio cluster status
    

    The process is finished when you see Available in the Status field of all clusters.

  4. (Optional) Open the Service Mesh Manager dashboard and verify that the new peer cluster is visible on the MENU > TOPOLOGY page.

3 - Deploy applications on multiple clusters

After you have one or more clusters attached to the mesh, here are some best practices to deploy applications on multiple clusters.

Deploy demo application

If you just want to get started with a demo application in a multi-cluster mesh, the easiest way is to install the built-in Service Mesh Manager demo application.

  1. You can deploy the demo application in a distributed way to multiple clusters with the following commands:

    smm demoapp install -s frontpage,catalog,bookings
    smm -c <PEER_CLUSTER_KUBECONFIG_FILE> demoapp install -s movies,payments,notifications,analytics,database --peer
    

    After installation, the demo application automatically starts generating traffic, and the dashboard draws a picture of the data flow. (If it doesn't, run the smm demoapp load start command, or select Generate load on the UI. If you want to stop generating traffic, run smm demoapp load stop.)

  2. Open the dashboard and look around.

    smm dashboard
    

Deploy custom application

Here is how you can deploy your own application on multiple clusters with Service Mesh Manager.

  1. Create the namespace where you would like to run your applications on every cluster:

    kubectl create ns test
    
  2. In the cluster where Service Mesh Manager is installed, enable sidecar injection in that namespace:

    smm sidecar-proxy auto-inject on test
    

    This places an istio.io/rev label on the namespace and sets it to the appropriate Istio control plane (if there are multiple control planes, you can choose which one). Sidecar injection can also be enabled from the Service Mesh Manager dashboard.

    Service Mesh Manager (more precisely, the Istio operator) takes care of adding the same label to this namespace on all other clusters. (If it doesn't, check the istio-operator pod logs on the particular cluster for any potential issues.)
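
    To confirm that the label has been propagated, you can check the namespace labels on each cluster (the actual revision value depends on your Istio control plane):

    kubectl get namespace test --show-labels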

  3. Deploy your application on the clusters as you would usually do:

    One caveat: you should deploy all Kubernetes Service resources on all clusters, even if the pods are present only on a subset of the clusters. Istio needs this to route traffic properly across clusters, as shown in the example below.
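
    For example, if the pods of a hypothetical frontend service run only on one of the clusters, a Service resource similar to the following sketch should still be applied on every cluster (the name, port, and labels are illustrative):

    apiVersion: v1
    kind: Service
    metadata:
      name: frontend
      namespace: test
    spec:
      selector:
        app: frontend
      ports:
      - name: http
        port: 8080
        protocol: TCP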

  4. Make sure that the sidecar proxies are indeed injected into your application pods.

    If not, check the official Istio documentation for potential issues.
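
    A quick way to verify the injection is to list the containers of a pod; pods with an injected sidecar show an additional istio-proxy container:

    kubectl get pods -n test
    kubectl get pod <POD_NAME> -n test -o jsonpath='{.spec.containers[*].name}'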

  5. Send traffic to your applications, then open the dashboard and look around.

    smm dashboard
    

4 - Detach a cluster from the mesh

To detach a cluster from the service mesh managed by Service Mesh Manager, complete the following steps.

Prerequisites

  • The Service Mesh Manager CLI tool installed on your computer.
  • Access to the KUBECONFIG file of the cluster you want to detach from the service mesh.
  • Access to the KUBECONFIG file of the cluster that runs the primary Service Mesh Manager service.

Steps

  1. On the primary Service Mesh Manager cluster, detach the peer cluster from the mesh by running the following command.

    smm istio cluster detach <PEER_CLUSTER_KUBECONFIG_FILE>
    
  2. Wait until the peer cluster is detached. You can check the status of peer clusters by running the following command:

    smm istio cluster status
    
  3. (Optional) Navigate to the MENU > MESH page of the Service Mesh Manager dashboard and verify that the cluster you have detached is not shown in the Clusters list.

5 - Cluster registry controller

Service Mesh Manager uses the cluster registry controller to synchronize any Kubernetes resources across the clusters in a multi-cluster setup. That way, the necessary resources are automatically synchronized, so the multi-cluster topologies of Istio and the multi-cluster features (for example, observability, multi-cluster topology view, tracing, traffic tapping) of Service Mesh Manager work in a multi-cluster environment.

In addition, you can use the resource synchronization capabilities of Service Mesh Manager to synchronize any Kubernetes resources on demand between the clusters of your mesh.

Overview

When installing Service Mesh Manager in imperative mode from the command line, Service Mesh Manager automatically deploys the cluster registry controller to every cluster of the mesh and creates the Cluster CRs with default values that are suitable for most common scenarios.

The Cluster resource represents a Kubernetes cluster. The cluster registry controller fills the status of the Cluster CR with cluster related metadata, and distributes the Cluster CRs to all participating Kubernetes clusters. In addition, the credentials for all clusters are automatically distributed to all clusters (these are usually stored in Kubernetes secrets) to help bootstrap the cluster group itself.

Note: You have to manually configure the Cluster CR or the operator’s Helm values file if your clusters have some unique networking requirements, for example, by setting the KubernetesAPIEndpoints of the cluster.
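
For example, a Cluster resource that specifies an explicit API server endpoint could look like the following sketch. The kubernetesApiEndpoints and serverAddress field names are assumptions to verify against the Cluster CRD of your cluster registry controller version, and the address is a placeholder:

apiVersion: clusterregistry.k8s.cisco.com/v1alpha1
kind: Cluster
metadata:
  name: demo-cluster
spec:
  kubernetesApiEndpoints:
  - serverAddress: https://demo-cluster-api.example.org:6443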

In such a multi-cluster setup, here is how the cluster registry controller works:

  • The controller writes only to the local cluster where it is deployed
  • The controller reads only from peer clusters

By default, the required resources are kept in sync between all clusters. You can define your own ResourceSyncRule resources to sync other Kubernetes resources between these clusters. The ResourceSyncRules can be further adjusted to specify from which clusters and to which clusters certain resources are synced.

Service Mesh Manager operator mode

When you are using Service Mesh Manager in operator mode in a multi-cluster environment, note the following points:

  1. You must explicitly enable the cluster registry in the ControlPlane CR or the operator’s Helm values file.

    Replace <cluster-name> with the name of your cluster. The cluster name format must comply with the RFC 1123 DNS subdomain/label format (alphanumeric string without “_” or “.” characters). Otherwise, you get an error message starting with: Reconciler error: cannot determine cluster name controller=controlplane, controllerGroup=smm.cisco.com, controllerKind=ControlPlane

    spec:
      clusterName: <cluster-name>
      clusterRegistry:
        enabled: true
        namespace: cluster-registry
    
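    For reference, a minimal sketch of a complete ControlPlane resource with the cluster registry enabled might look like the following. The group and kind are taken from the error message above, but the API version (v1alpha1) and the metadata values are assumptions to adapt to your installation:

    apiVersion: smm.cisco.com/v1alpha1
    kind: ControlPlane
    metadata:
      name: smm
    spec:
      clusterName: <cluster-name>
      clusterRegistry:
        enabled: true
        namespace: cluster-registry
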
  2. To create trust between the clusters, you must exchange the Secret CRs of the clusters. For an example, see GitOps - multi-cluster installation.

Networking requirements

The cluster registry controller instances running on the clusters must be able to reach the API server of every other cluster in the cluster group, so every cluster can read the relevant resources from the other clusters.

The cluster registry controller pod connects directly to the Kubernetes API server of the peer clusters. This works automatically if the API servers are publicly available. Otherwise, configure a reachable endpoint for them in the Cluster CR spec. (For security reasons, we recommend making the API server addresses available only from the IP ranges of the peer clusters.)

ResourceSyncRule example usage

Sync everywhere

  1. Create a sample secret on the third cluster; this is the resource that will be synchronized to the other clusters. Note that it is created in the namespace that the ResourceSyncRule below matches:

    apiVersion: v1
    kind: Secret
    metadata:
      name: test-secret
      namespace: cluster-registry
    data: {}
    
  2. Create a ResourceSyncRule on the first cluster to synchronize the secret to all clusters:

    apiVersion: clusterregistry.k8s.cisco.com/v1alpha1
    kind: ResourceSyncRule
    metadata:
      name: test-secret-sink
    spec:
      groupVersionKind:
        kind: Secret
        version: v1
      rules:
      - match:
        - objectKey:
            name: test-secret
            namespace: cluster-registry
    

    Both the ResourceSyncRule resource itself and the secret should appear shortly on all clusters of the cluster group.

    At this point, if the secret is deleted or modified on any of the clusters (except the one where it originates from), the cluster registry controller immediately syncs it back.
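
    You can verify the result on any of the other clusters; if the synchronization works, the secret appears in the namespace referenced by the rule:

    kubectl get secret test-secret -n cluster-registry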

Sync to a set of clusters

Cluster registry controller can be configured to sync only to specific clusters in the cluster group (instead of all of them). To do that, you must add an annotation that disables syncing of the ResourceSyncRule resource itself, and then remove the ResourceSyncRule from the clusters that should not receive the resource.

  1. Add the following annotation to the ResourceSyncRule on the first cluster:

    annotations:
      cluster-registry.k8s.cisco.com/resource-sync-disabled: "true"
    
  2. Delete the ResourceSyncRule from the second cluster.

    The ResourceSyncRule resource is not recreated, because of the annotation you just added.

    Without that annotation, the ResourceSyncRule would be recreated automatically.

  3. Delete the test-secret from the second cluster.

    The secret will not be recreated because the ResourceSyncRule resource does not exist on the second cluster.

Sync from a set of clusters

Cluster registry controller can be configured to sync only from specific clusters in the cluster group (instead of all of them). To do that, you must create a ClusterFeature resource on the clusters you want to sync from, and add a clusterFeatureMatch field to the ResourceSyncRule resources on the clusters you want to sync to.

  1. Add the following field to the ResourceSyncRule spec on the first cluster:

    clusterFeatureMatch:
    - featureName: test-secret-feature
    

    As a result, the secret is synced only from clusters where a matching ClusterFeature resource is defined.

    At this point, there is no ClusterFeature present on any cluster, so if the secret were deleted from the first cluster now, it would not be recreated.

  2. Apply the following ClusterFeature to the third cluster:

    apiVersion: clusterregistry.k8s.cisco.com/v1alpha1
    kind: ClusterFeature
    metadata:
      name: test-secret-source
    spec:
      featureName: test-secret-feature
    
  3. Delete the test-secret from the first cluster.

    It should be recreated now, because the cluster registry controller can sync the secret from the third cluster.

RBAC considerations

The cluster registry controller writes only to the local cluster and reads only from peer clusters. By default, it has access to read namespace, node, and secret resources. If you want to sync other resources, expand the RBAC rules of the operator as needed (it uses aggregated ClusterRoles); see the example sketch after the list below.

  • On the cluster where the resources are read from (usually where ClusterFeature resources are present), define a ClusterRole with the correct read rules and add the following label:

    labels:
      cluster-registry.k8s.cisco.com/reader-aggregated: "true"
    
  • On the cluster where the resources are written to (usually where ResourceSyncRule resources are present), define a ClusterRole with the correct write rules and add the following label:

    labels:
      cluster-registry.k8s.cisco.com/controller-aggregated: "true"
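
For example, to let the cluster registry controller read ConfigMap resources from a peer cluster, a ClusterRole similar to the following sketch could be created on that cluster; the resource list and verbs are illustrative:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-registry-configmap-reader
  labels:
    cluster-registry.k8s.cisco.com/reader-aggregated: "true"
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch"]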