Deploying OmniSci with Resiliency in Google Kubernetes Engine
Google Kubernetes Engine (GKE) is a managed, production-ready environment for deploying containerized applications in Google Cloud Platform (GCP). GKE provides Kubernetes-as-a-Service, allowing users to run Docker containers in a fully managed Kubernetes environment. GKE also includes support for hardware accelerators, allowing users to provision nodes with Nvidia GPUs and deploy the Nvidia drivers across all the nodes in the cluster with a single command.
The support for GPUs makes GKE a perfect fit for running OmniSci - an open-source GPU-accelerated SQL engine (OmniSciDB) and visual analytics platform (OmniSci Immerse). OmniSci offers Docker images for both Enterprise Edition and Open Source Edition. In this post, I will show how to deploy OmniSci's Open Source Edition Docker image in a GKE cluster. I will also demonstrate how GKE automatically ensures that the desired number of OmniSci instances are always running using the Kubernetes replication controller.
Setting Up the Environment for GKE Deployment
As the first step, go to your Google Cloud Platform (GCP) portal and enable the Google Kubernetes Engine API.
Protip: Refresh the page to confirm that API access is shown as enabled before proceeding.
The Google Cloud Platform provides a Cloud SDK which includes the command line utilities to deploy applications using the Kubernetes engine in addition to important links to download the platform specific packages. For the purposes of this exercise, I followed the instructions on Quickstart for macOS to install and setup the SDK. After the SDK is setup, you can confirm the properties in your SDK configuration using:
gcloud config list
The output shows the active user account and project, along with the region and zone where the compute resources will be deployed.
Next, set up an NFS server in Google Cloud using Single Node File Server. Once deployed, take note of the DNS name (singlefs-1-vm) for the server and the exported mount point (/data).
Deploying GPU-enabled Nodes in GKE
To take advantage of the performance benefits provided by OmniSci, you need to provision nodes with Nvidia GPUs. For the exact number of GPUs and node sizing guidance, refer to the OmniSci Hardware Configuration Reference Guide.
For this test, we will use the Nvidia Tesla K80 GPU that is available in the us-east1 region using the instructions below.
Create a cluster with a default node pool with a single CPU-only node:
gcloud container clusters create k80-cluster --num-nodes=1
Confirm that the cluster was created successfully:
gcloud container clusters list
To make cost-effective use of GPUs on GKE, and to take advantage of cluster autoscaling, it is recommended to create a separate node pool for the GPU nodes.
To accomplish this, create a pool called poolk80 within the cluster with 2 nodes, each node containing a single K80 GPU:
gcloud container node-pools create poolk80 --accelerator \
type=nvidia-tesla-k80,count=1 --num-nodes=2 \
--cluster k80-cluster
Confirm a node pool called poolk80 was successfully created in addition to the default pool:
gcloud container node-pools list --cluster k80-cluster
The nodes in the cluster are by default launched running the Container-Optimized OS (COS) image. COS is an operating system image for your Compute Engine VMs that is optimized for running Docker containers. Container-Optimized OS is maintained by Google and based on the open source Chromium OS project.
Confirm whether the GPU nodes were launched successfully using this command:
kubectl get nodes
You can then use the name of the node to get a detailed description:
kubectl describe node gke-k80-cluster-poolk80-a691ce86-1t0t
Here gke-k80-cluster-poolk80-a691ce86-1t0t is the name of the node.
From the node description, you will see that GKE automatically taints the GPU nodes with the following node taint:

nvidia.com/gpu=present:NoSchedule
Additionally, GKE automatically applies the corresponding tolerations to Pods requesting GPUs by running the ExtendedResourceToleration admission controller. This causes only Pods requesting GPUs to be scheduled on GPU nodes, which enables a more efficient use of resources.
We apply the following toleration when requesting the GPU node for our Pod:

tolerations:
- key: "nvidia.com/gpu"
  operator: "Exists"
  effect: "NoSchedule"
Installing NVIDIA GPU Device Drivers
GKE provides a DaemonSet to install Nvidia drivers on all the nodes that have a GPU by identifying them with the label key of cloud.google.com/gke-accelerator.
From the description of the GPU nodes that were launched, you should see this label set as:

cloud.google.com/gke-accelerator=nvidia-tesla-k80
The DaemonSet selects the nodes in the cluster that match this criterion and automatically installs the Nvidia drivers needed for Container-Optimized OS (COS).
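If the driver-installer DaemonSet is not already present in your cluster, you can apply Google's preloaded manifest for COS nodes. The URL below is the path published in the GoogleCloudPlatform/container-engine-accelerators repository at the time of writing; verify it against the current GKE documentation:

```shell
# Deploy the Nvidia driver installer DaemonSet for Container-Optimized OS nodes
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml
```
This requires an active cluster context, so it can only be verified against a live GKE cluster.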
You can confirm that the DaemonSet deployment has kicked in by opening an SSH session into the node and running the command dmesg. The kernel log shows that the Nvidia driver installation has started and, further along, that the driver has loaded.
Creating OmniSci Pod on GPU Node
The OmniSci Docker container can be launched on the GPU node using a Deployment object.
Use the command below to create the deployment object:
kubectl apply -f omnisci_gke.yml
You can find a copy of the YAML file on GitHub.
In this deployment, replicas is set to 1, so only a single instance of the OmniSci pod will be launched. You can set replicas to a larger number if you want to provision several simultaneously running OmniSci pods. The NFS share (/data) from the server singlefs-1-vm is mounted as /omnisci-storage on the Pod; by default, OmniSci stores all database files and logs under /omnisci-storage. The tolerations setting allows the Pod to be scheduled on a GPU-tainted node, so either of the two available GPU nodes can be selected. The OmniSci Pod exposes port 6274 for client API access using the Apache Thrift protocol.
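As a sketch of what omnisci_gke.yml might contain, the manifest below combines the settings described above. The image tag, volume name, and label values are illustrative assumptions; consult the copy of the YAML file on GitHub for the exact manifest:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: omnisci
spec:
  replicas: 1                         # a single OmniSci instance
  selector:
    matchLabels:
      app: omnisci
  template:
    metadata:
      labels:
        app: omnisci
    spec:
      tolerations:
      - key: "nvidia.com/gpu"         # allows scheduling on GPU-tainted nodes
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: omnisci
        image: omnisci/core-os-cuda   # Open Source Edition CUDA image; tag is an assumption
        ports:
        - containerPort: 6274         # Thrift client API port
        resources:
          limits:
            nvidia.com/gpu: 1         # request one GPU per Pod
        volumeMounts:
        - name: omnisci-storage
          mountPath: /omnisci-storage # default OmniSci data and log location
      volumes:
      - name: omnisci-storage
        nfs:
          server: singlefs-1-vm       # Single Node File Server DNS name
          path: /data                 # exported NFS mount point
```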
Display information about the Deployment:
kubectl get deployments
kubectl describe deployments omnisci
Get details about the Pod and on which node it was launched:
kubectl get pods
kubectl describe pod omnisci-b67b4485c-bz4zs
Note: The Pod is running on the node gke-k80-cluster-poolk80-a691ce86-1t0t.
The description also gives information about the NFS-mounted volume, and the sequence of events that led to a successful running state for the Pod.
Now, we can create a Kubernetes Service object that provides an external public IP address to access the port exposed by the Pod:
kubectl expose deployment omnisci --type=LoadBalancer --name=omnisci-elb
Find the status of the launched service:
kubectl get service
Note: The external IP address is initially shown as pending; it takes a few minutes to be assigned.
Accessing OmniSci Pod Using the Kubectl Command
Now that the OmniSci Pod is running, let us access it and examine the database. You can open a shell into the running Pod using the kubectl command:
kubectl exec -it omnisci-b67b4485c-bz4zs -- /bin/bash
Login to the database using the default password for the admin account:
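Inside the container, the omnisql command-line client handles this. With the Open Source Edition image, the default database is omnisci, the default user is admin, and the default password is HyperInteractive; the path below assumes the standard image layout:

```shell
# From inside the OmniSci container: connect with the default credentials
cd /omnisci
bin/omnisql -p HyperInteractive
```
This can only be run inside a live OmniSci container, so it is not independently verifiable here.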
List the tables, and you will notice that there is only one table preloaded in the database:
Add a new table by running the program insert_sample_data.
Select option #1 to create the Flights table with 7 million records.
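The loader is run from the container shell rather than the omnisql prompt; the path assumes the standard image layout:

```shell
# From inside the OmniSci container: load the bundled sample datasets
cd /omnisci
./insert_sample_data
# When prompted, choose option 1 to load the ~7M-row flights dataset
```
Like the steps above, this requires a running OmniSci container.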
Run \t (list tables), and you will see the flights dataset:
Run the following SQL query and ensure you get results similar to the output shown below:
SELECT origin_city AS "Origin", dest_city AS "Destination", AVG(airtime) AS "Average Airtime" FROM flights_2008_7M WHERE distance < 175 GROUP BY origin_city, dest_city;
Test OmniSci Pod Resiliency
The Pod description (kubectl describe pod) shows that it is deployed on the GPU node gke-k80-cluster-poolk80-a691ce86-1t0t. We can test GKE's built-in high availability feature by stopping that node and letting GKE re-provision the Pod on another GPU node. Because the deployment YAML specifies a replica count of 1, GKE will try to satisfy this requirement using the remaining nodes in the cluster.
Delete the node instance running the Pod using the GCP portal:
Querying the status of the Pods running in the cluster will show that the previously running OmniSci Pod is in the Unknown state and that a new OmniSci Pod is now running:
Get the full details on the running Pod using kubectl describe pod <pod id>.
You will notice that the Pod is now running on the other GPU node (gke-k80-cluster-poolk80-a691ce86-t322) in the cluster.
In the full Pod description you will also observe that the NFS share (/data) is now mounted on the new running Pod, thus providing data continuity. You can open a shell into the Pod using kubectl exec -it <pod id> -- /bin/bash, and confirm that the OmniSci database still has the flights table you created on the previously running Pod.
If you prefer Python, the pymapd client provides a Python DB API 2.0-compliant interface to OmniSci for typical database operations such as creating a table, appending data, and running SQL queries. Please visit Using pymapd to Load Data to OmniSci Cloud, or check out this Jupyter Notebook for a walkthrough of connecting to an OmniSci database and running SQL queries, along with other interactions.
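As a minimal sketch, a pymapd session against the service's external IP might look like the following. The credentials and database name are the Open Source Edition defaults used throughout this walkthrough; the host placeholder must be replaced with the LoadBalancer IP from kubectl get service:

```python
from pymapd import connect

# Connect to the OmniSci server exposed by the omnisci-elb service
# (replace <external-ip> with the address reported by `kubectl get service`)
con = connect(user="admin", password="HyperInteractive",
              host="<external-ip>", port=6274, dbname="omnisci")

# Run the same airtime query used earlier and print each result row
cur = con.execute("""
    SELECT origin_city, dest_city, AVG(airtime)
    FROM flights_2008_7M
    WHERE distance < 175
    GROUP BY origin_city, dest_city
""")
for row in cur:
    print(row)

con.close()
```
Since this needs a reachable OmniSci server, it cannot be verified standalone.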
Don’t forget to show us what you’ve built by visiting the OmniSci community forum to learn more and share your experience.