Leveraging Kubernetes Data-Oriented Projects with Portworx
Author: Cameron Laird
Other contributors: Adam Overa
Management of data at scale is crucial for deriving actionable insights, and an effective data platform can provide those kinds of insights. A data platform is the technology infrastructure used for the collection, storage, transaction processing, and analysis of varied data at scale. It simplifies engineering tasks such as expanding the storage available to an application or encrypting project secrets.
Portworx handles advanced storage and data management capabilities for cloud-native environments. This guide provides step-by-step instructions for installing Portworx on an existing Kubernetes cluster. It then walks through setting up a model project to demonstrate Portworx’s capabilities.
What Is Portworx?
Portworx enables deployment and management of storage and data services specifically in containerized environments. It also handles data replication, snapshots, backups, and data recovery, allowing application systems to focus on their own specific requirements. Since Portworx itself is cloud-native, it plays a crucial role in helping other systems maximize the capabilities of the cloud.
Current cloud computing practices face several challenges, particularly the difficulty of managing Kubernetes instances in the real world. Portworx mitigates some of these challenges.
A limited version of the Portworx Storage Platform is available for free. It allows for an implementation of object storage for a single distributed cluster. This guide focuses on the free, downloadable software that you can install and run for your own educational and small-scale uses.
How Portworx Relates to Kubernetes, Kafka, and Cassandra
Portworx integrates with widely known software systems such as Kubernetes, Kafka, and Cassandra:
- Kubernetes serves as the foundation of most Portworx implementations. However, Portworx is also compatible with other container orchestration systems. 
- Cassandra is an open source distributed database management system that emphasizes economical operation, high availability, and wide-column semantics. Portworx addresses several of the challenges involved in configuring and operating Cassandra. For example, when running Cassandra in containers managed by Kubernetes, Portworx can effectively control memory, resource quotas, and/or CPU cores per Kubernetes cluster. 
- Kafka is a widely used open source distributed event store and stream-processing platform. In much the same way a traditional database system manages records of data, Kafka manages events. For Kafka to perform optimally, it needs a high-performance underlying storage system, and Portworx is a good choice. Teams and individuals often initially adopt Portworx to meet requirements for hosting or upgrading Kafka. Portworx also offers white papers specifically on the operation of Kafka in a Kubernetes environment. 
Before You Begin
- Create a Kubernetes cluster that meets the Portworx installation prerequisites. A Shared CPU, Linode 8 GB plan is suitable. You must have kubectl configured on your local machine to interact with the cluster. See our Getting Started with Kubernetes guide for instructions. Also, take note of the Kubernetes version running on your cluster, as it is needed later.
- The Portworx installation prerequisites also include a backing drive (i.e. Volume) for each of three nodes, which must be at least 8 GB. Follow our Getting Started with Block Storage guide to create and attach a 10 GB Volume to each node. Creating volumes via the Storage tab of the individual Kubernetes instances is more efficient than via Volumes, as it creates and attaches in one step. 
- Sign up for a personal account on Portworx Central. 
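Before continuing, it can help to confirm that kubectl reaches the cluster and that three nodes are available for the backing drives. The following is a minimal sketch, not part of the Portworx tooling: the function names `count_ready_nodes` and `enough_ready_nodes` are hypothetical helpers, and the sketch assumes the usual `kubectl get nodes` column order, with STATUS as the second column.

```shell
# Count nodes whose STATUS column is exactly "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin.
# Assumption: STATUS is the second whitespace-separated column.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Succeed when at least MIN nodes (default 3) are Ready.
# Reads the same input as count_ready_nodes.
enough_ready_nodes() {
  local min="${1:-3}"
  local ready
  ready="$(count_ready_nodes)"
  [ "$ready" -ge "$min" ]
}
```

Against a live cluster, the check could be run as `kubectl get nodes --no-headers | enough_ready_nodes 3 && echo "node count OK"`. Nodes reported as `NotReady` or `Ready,SchedulingDisabled` are deliberately not counted.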
Note: Commands that require elevated privileges are prefixed with sudo. If you’re not familiar with the sudo command, see the Users and Groups guide.
Portworx Installation
To install Portworx, use the basic installation model on an existing Kubernetes cluster. This can be any existing Kubernetes cluster, whether using Linode Kubernetes Engine or a manually constructed setup. You can also use kind for an installation purely within your desktop development environment.
Portworx is not an open source system, though it supports many individual open source components, and some of its licenses involve no fee. However, installation is generally done through the Portworx website and not via standard command-line package managers such as apt or brew.
Follow the steps in the sections below to install Portworx on an existing Kubernetes cluster.
The Wizard
- Open a web browser and log in to Portworx Central. 
- Select Get Started from the Welcome to Portworx section of the Portworx Central home page:     
- Choose the Portworx Essentials/Portworx CSI fee-free license for demonstration or proof-of-concept workloads:     
- Choose DAS/SAN as Platform and None for Distribution Name. Retain portworx as the default Namespace, but change the K8s Version to match the Kubernetes version of your cluster (e.g. 1.30.2).

  Note: Use the following command to check your version of Kubernetes:

      kubectl version

- Select Save Spec to generate kubectl commands for Operator and StorageCluster, which reflect the specifications chosen for the Portworx installation. Copy the kubectl commands for use in the next section.
- To save this configuration, fill in Spec Name and Spec Tags then click Save Spec again. 
- Your generated spec manifest is now available in the Spec List section of Portworx Central. You can download it at any time by clicking the three vertical dots under Actions and choosing Download.     
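The K8s Version field in the wizard expects a bare version number such as 1.30.2, while `kubectl version` prints several labeled lines. As a rough sketch, the hypothetical helper below extracts the server version from that output. It assumes the short output format of recent kubectl releases, where one line reads `Server Version: v1.30.2`. Older releases print a longer structure that this pattern does not handle.

```shell
# Print the cluster's Kubernetes version without the leading "v".
# Reads `kubectl version` output on stdin.
# Assumption: the output contains a line like "Server Version: v1.30.2".
server_k8s_version() {
  sed -n 's/^Server Version: v\{0,1\}\([0-9][0-9.]*\).*/\1/p'
}
```

For example, `kubectl version | server_k8s_version` prints only the version number to enter in the wizard.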
Deployment
- Use the first kubectl command generated in the previous section to deploy the Operator specification. The command structure should follow that of the example command below, with PORTWORX_VERSION_NUMBER and KUBERNETES_VERSION_NUMBER matching your respective Portworx and Kubernetes versions:

      kubectl apply -f 'https://install.portworx.com/PORTWORX_VERSION_NUMBER?comp=pxoperator&kbver=KUBERNETES_VERSION_NUMBER&ns=portworx'

  Sample output:

      namespace/portworx created
      serviceaccount/portworx-operator created
      clusterrole.rbac.authorization.k8s.io/portworx-operator created
      clusterrolebinding.rbac.authorization.k8s.io/portworx-operator created
      deployment.apps/portworx-operator created

- Use the second kubectl command generated in the previous section to deploy the StorageCluster specification. The command structure should resemble the example command below, with PX_USER_ID and PX_CLUSTER_ID being unique to your Portworx Central account:

      kubectl apply -f 'https://install.portworx.com/PORTWORX_VERSION_NUMBER?operator=true&mc=false&kbver=KUBERNETES_VERSION_NUMBER&ns=portworx&oem=esse&user=PX_USER_ID&b=true&iop=6&c=px-cluster-PX_CLUSTER_ID&stork=true&csi=true&mon=true&tel=true&st=k8s&promop=true'

  Sample output:

      storagecluster.core.libopenstorage.org/px-cluster-PX_CLUSTER_ID created
      secret/px-essential created

  Note: Should you receive any errors, you can use the Generate Spec screen to view up-to-date commands.
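The Operator command above follows a fixed URL pattern with two substituted values. The sketch below shows one way to assemble that URL from variables, which can reduce copy-and-paste mistakes when scripting the deployment. The function name `px_operator_url` is a hypothetical helper, and the pattern is copied from the example command in this guide. The commands generated by the wizard remain the authoritative source.

```shell
# Build the Operator spec URL in the shape of the example command above.
# Arguments: Portworx version, Kubernetes version, namespace (default: portworx).
# Assumption: the URL pattern matches the one shown in this guide.
px_operator_url() {
  local px_version="$1"
  local k8s_version="$2"
  local namespace="${3:-portworx}"
  printf 'https://install.portworx.com/%s?comp=pxoperator&kbver=%s&ns=%s\n' \
    "$px_version" "$k8s_version" "$namespace"
}
```

A possible invocation is `kubectl apply -f "$(px_operator_url PORTWORX_VERSION_NUMBER KUBERNETES_VERSION_NUMBER)"`, with both placeholders replaced by your own values. Quoting the command substitution matters because the URL contains `&` and `?` characters.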
Verification
- Monitor the status of Portworx nodes with the following command:

      kubectl -n portworx get storagenodes -l name=portworx

  Once the deployments finish, each Portworx node appears as Online:

      NAME                            ID                                     STATUS   VERSION           AGE
      lke194968-280433-369bf4810000   f2522d07-0b59-482a-a8ae-bd2854fd7bc4   Online   3.1.2.0-fb52ced   4m46s
      lke194968-280433-438a8b610000   b1920afd-5326-48dc-9572-af8e638fd92b   Online   3.1.2.0-fb52ced   4m45s
      lke194968-280433-527b95040000   58649b41-b4c9-4c56-b983-e27a61c9f582   Online   3.1.2.0-fb52ced   4m46s

- Use the following command to monitor the status of an individual node, replacing NODE_NAME with the NAME of one of the nodes listed in the prior command’s output:

      kubectl -n portworx describe storagenode NODE_NAME
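Because the nodes can take several minutes to come online, a small polling loop can stand in for rerunning the status command by hand. The following is a sketch under stated assumptions: `offline_storage_nodes` and `wait_for_storage_nodes` are hypothetical helper names, the column order matches the sample output above (STATUS third), and the 10-second interval and default of 30 attempts are arbitrary choices.

```shell
# Print the NAME of every storage node whose STATUS is not "Online".
# Reads `kubectl -n portworx get storagenodes -l name=portworx --no-headers` on stdin.
# Assumption: STATUS is the third whitespace-separated column.
offline_storage_nodes() {
  awk 'NF > 0 && $3 != "Online" { print $1 }'
}

# Poll until at least one storage node is listed and all listed nodes are Online.
# Argument: maximum number of attempts (default 30), 10 seconds apart.
wait_for_storage_nodes() {
  local tries="${1:-30}"
  local rows pending
  while [ "$tries" -gt 0 ]; do
    rows="$(kubectl -n portworx get storagenodes -l name=portworx --no-headers 2>/dev/null)"
    pending="$(printf '%s\n' "$rows" | offline_storage_nodes)"
    if [ -n "$rows" ] && [ -z "$pending" ]; then
      return 0
    fi
    sleep 10
    tries=$((tries - 1))
  done
  return 1
}
```

Running `wait_for_storage_nodes 60 && echo "all storage nodes Online"` waits up to roughly ten minutes. If the loop times out, the `describe storagenode` command above is the next place to look.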
At this point, your working Kubernetes cluster includes a small Portworx deployment with a permanent fee-free license. You can use your cluster for educational practice, proofs-of-concept, or other demonstrations of Portworx’s capabilities.
Run a Model Portworx Project
Among the examples found in the Portworx documentation is Run Kafka on Kubernetes at Scale with Portworx. While thousands of organizations already deploy Kafka manually, Portworx can enhance the process. Replacing a manual deployment with one mediated by Portworx automates disaster recovery, application-specific high availability, backup services, and capacity management.
Using the Portworx installation from the preceding section, follow the steps below to get started:
- Create a specification file named sc-kafka-rf2.yaml:

      nano sc-kafka-rf2.yaml

  Paste in the following contents, and save your changes:

  File: sc-kafka-rf2.yaml

      kind: StorageClass
      apiVersion: storage.k8s.io/v1
      metadata:
        name: px-sc-kafka-repl2
      provisioner: kubernetes.io/portworx-volume
      allowVolumeExpansion: true
      parameters:
        repl: "2"
        priority_io: "high"
        io_profile: "db_remote"
        cow_ondemand: "true"
        disable_io_profile_protection: "1"
        nodiscard: "false"
        group: "kafka-broker-rep2"
        fg: "false"

  This storage specification provides several automations, including replication. The repl: "2" parameter maintains two full replicas of broker data (i.e. Kafka’s content) across the failure domains of the hosting Kubernetes cluster. This ensures that Kafka continues without downtime should a node fail.

- Use the following command to apply the StorageClass:

      kubectl apply -f sc-kafka-rf2.yaml

  Sample output:

      storageclass.storage.k8s.io/px-sc-kafka-repl2 created
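A StorageClass only takes effect once a workload requests storage from it. As an illustration, the sketch below writes a PersistentVolumeClaim manifest that references the px-sc-kafka-repl2 class created above. The helper name `write_kafka_pvc`, the claim name `kafka-data-demo`, and the 5Gi size are all hypothetical values chosen for this example. A real Kafka deployment would more likely reference the class from a StatefulSet's volume claim templates.

```shell
# Write a PersistentVolumeClaim manifest that requests storage from the
# px-sc-kafka-repl2 StorageClass.
# Arguments: output file, claim name (hypothetical default), size (hypothetical default).
write_kafka_pvc() {
  local file="$1"
  local name="${2:-kafka-data-demo}"
  local size="${3:-5Gi}"
  cat > "$file" <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ${name}
spec:
  storageClassName: px-sc-kafka-repl2
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: ${size}
EOF
}
```

After running `write_kafka_pvc pvc-kafka-demo.yaml`, the claim could be created with `kubectl apply -f pvc-kafka-demo.yaml` and inspected with `kubectl get pvc kafka-data-demo`. Keep the requested size below the capacity of the backing drives attached earlier.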
Storage-level replication makes it possible for Portworx to identify healthy storage and reassign it in the event of a failure. This keeps data available, with replication resuming almost immediately.
While Portworx generally requires several dozen lines of configuration, that is typically less than what administrators write to maintain a Kubernetes cluster. When terabytes of data are involved, Portworx’s efficient data utilization can result in large cost savings.




