Kubernetes is a popular choice for hosting Orleans applications. Orleans will run in Kubernetes without specific configuration; however, it can also take advantage of extra knowledge which the hosting platform can provide.
The Microsoft.Orleans.Hosting.Kubernetes package adds integration for hosting an Orleans application in a Kubernetes cluster. The package provides an extension method, ISiloBuilder.UseKubernetesHosting, which performs the following actions:
- SiloOptions.SiloName is set to the pod name.
- EndpointOptions.AdvertisedIPAddress is set to the pod IP.
- EndpointOptions.SiloListeningEndpoint and EndpointOptions.GatewayListeningEndpoint are configured to listen on any address, with the configured SiloPort and GatewayPort. (Default port values of 11111 and 30000 are used if no values are set explicitly.)
- ClusterOptions.ServiceId is set to the value of the pod label with the name orleans/serviceId.
- ClusterOptions.ClusterId is set to the value of the pod label with the name orleans/clusterId.
- Early in the startup process, the silo will probe Kubernetes to find which silos do not have corresponding pods and mark those silos as dead.
- The same process will occur at runtime for a subset of all silos, in order to reduce the load on Kubernetes' API server. By default, 2 silos in the cluster will watch Kubernetes.
Note that the Kubernetes hosting package does not use Kubernetes for clustering. For clustering, a separate clustering provider is still needed. For more information on configuring clustering, see the Server configuration documentation.
This functionality imposes some requirements on how the service is deployed:
- Silo names must match pod names.
- Pods must have orleans/serviceId and orleans/clusterId labels which correspond to the silo's ServiceId and ClusterId. The UseKubernetesHosting method will propagate those labels into the corresponding options in Orleans from environment variables.
- Pods must have the following environment variables set: POD_NAME, POD_NAMESPACE, POD_IP, ORLEANS_SERVICE_ID, and ORLEANS_CLUSTER_ID.
The following example shows how to configure these labels and environment variables correctly:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: dictionary-app
      labels:
        orleans/serviceId: dictionary-app
    spec:
      selector:
        matchLabels:
          orleans/serviceId: dictionary-app
      replicas: 3
      template:
        metadata:
          labels:
            # This label is used to identify the service to Orleans
            orleans/serviceId: dictionary-app

            # This label is used to identify an instance of a cluster to Orleans.
            # Typically, this will be the same value as the previous label, or any
            # fixed value.
            # In cases where you are not using rolling deployments (for example,
            # blue/green deployments), this value can allow for distinct clusters
            # which do not communicate directly with each other, but which still
            # share the same storage and other resources.
            orleans/clusterId: dictionary-app
        spec:
          containers:
            - name: main
              image: my-registry.azurecr.io/my-image
              imagePullPolicy: Always
              ports:
              # Define the ports which Orleans uses
              - containerPort: 11111
              - containerPort: 30000
              env:
              # The Azure Storage connection string for clustering is injected as an
              # environment variable.
              # It must be created separately using a command such as:
              # > kubectl create secret generic az-storage-acct `
              #     --from-file=key=./az-storage-acct.txt
              - name: STORAGE_CONNECTION_STRING
                valueFrom:
                  secretKeyRef:
                    name: az-storage-acct
                    key: key
              # Configure settings to let Orleans know which cluster it belongs to
              # and which pod it is running in
              - name: ORLEANS_SERVICE_ID
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['orleans/serviceId']
              - name: ORLEANS_CLUSTER_ID
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['orleans/clusterId']
              - name: POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
              - name: POD_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              - name: POD_IP
                valueFrom:
                  fieldRef:
                    fieldPath: status.podIP
              - name: DOTNET_SHUTDOWNTIMEOUTSECONDS
                value: "120"
              # Set resource requests for this container under resources.requests
          terminationGracePeriodSeconds: 180
          imagePullSecrets:
            - name: my-image-pull-secret
      minReadySeconds: 60
      strategy:
        rollingUpdate:
          maxUnavailable: 0
          maxSurge: 1
For RBAC-enabled clusters, the Kubernetes service account for the pods may also need to be granted the required access:
    kind: Role
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: pod-reader
    rules:
    - apiGroups: [ "" ]
      resources: ["pods"]
      verbs: ["get", "watch", "list"]
    ---
    kind: RoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: pod-reader-binding
    subjects:
    - kind: ServiceAccount
      name: default
      apiGroup: ''
    roleRef:
      kind: Role
      name: pod-reader
      apiGroup: ''
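The binding above grants pod access to the namespace's default service account. As an illustrative alternative, the sketch below binds the same pod-reader role to a dedicated service account instead. The names orleans-silo and orleans-silo-pod-reader are hypothetical, and the sketch assumes all objects are created in the same namespace as the silo pods.

```yaml
# Hypothetical dedicated service account for the silo pods.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: orleans-silo
---
# Binds the pod-reader Role defined above to that service account.
# Assumes this RoleBinding lives in the same namespace as the Role and the pods.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: orleans-silo-pod-reader
subjects:
- kind: ServiceAccount
  name: orleans-silo
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

With this variant, the pod template's spec would also need serviceAccountName: orleans-silo so that the silo pods run under the bound account.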
Liveness, Readiness, and Startup Probes
Kubernetes is able to probe pods to determine the health of a service. For more information, see Configure Liveness, Readiness and Startup Probes in Kubernetes' documentation.
Orleans uses a cluster membership protocol to promptly detect and recover from process or network failures. Each node monitors a subset of other nodes, sending periodic probes. If a node fails to respond to multiple successive probes from multiple other nodes, then it will be forcibly removed from the cluster. Once a failed node learns that it has been removed, it terminates immediately. Kubernetes will restart the terminated process and it will attempt to rejoin the cluster.
Kubernetes' probes can help to determine whether a process in a pod is executing and is not stuck in a zombie state. These probes do not verify inter-pod connectivity or responsiveness or perform any application-level functionality checks. If a pod fails to respond to a liveness probe, then Kubernetes may eventually terminate that pod and reschedule it. Kubernetes' probes and Orleans' probes are therefore complementary.
The recommended approach is to configure Liveness Probes in Kubernetes which perform a simple local-only check that the application is performing as intended. These probes serve to terminate the process in the event that there is a total freeze, for example due to a runtime fault or another unlikely event.
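As a minimal sketch of such a local-only liveness check, the container spec fragment below uses an HTTP probe. It assumes the application exposes a lightweight health endpoint; the path /healthz, the port 8080, and all timing values are hypothetical placeholders, not values prescribed by Orleans.

```yaml
# Fragment of a container spec; it would sit alongside "ports" and "env"
# in the container definition of the Deployment.
livenessProbe:
  httpGet:
    path: /healthz        # hypothetical endpoint served by the application itself
    port: 8080            # hypothetical HTTP port, separate from the Orleans ports
  initialDelaySeconds: 30 # give the silo time to start before the first probe
  periodSeconds: 10       # probe every 10 seconds
  timeoutSeconds: 5       # count a probe as failed if there is no reply in 5 seconds
  failureThreshold: 3     # restart the container after 3 consecutive failures
```

The handler behind such an endpoint would ideally avoid calls to other pods or external services, so that a probe failure indicates a frozen local process rather than a network problem, which the Orleans membership protocol already handles.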
Kubernetes works in conjunction with the operating system to implement resource quotas. This allows CPU and memory reservations and/or limits to be enforced. For a primary application which is serving interactive load, we recommend not implementing restrictive limits unless necessary. It is important to note that requests and limits are substantially different in their meaning and where they are implemented. Before setting requests or limits, take the time to gain a detailed understanding of how they are implemented and enforced. For example, memory may not be measured uniformly between Kubernetes, the Linux kernel, and your monitoring system. CPU quotas may not be enforced in the way that you expect.
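To illustrate the difference between the two settings, the container spec fragment below sets requests only and leaves limits commented out. The CPU and memory values are hypothetical placeholders and should be chosen from measurements of the actual workload.

```yaml
# Fragment of a container spec; values are illustrative placeholders.
resources:
  requests:
    cpu: "1"       # used by the scheduler when placing the pod on a node
    memory: 2Gi
  # Limits are enforced at runtime rather than at scheduling time.
  # Uncomment only after understanding how they are applied:
  # limits:
  #   memory: 4Gi
```

Requests inform scheduling decisions, while limits are enforced on the running container, which is why the two can behave very differently under load.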