Pure Cloud Block Store for AWS White Paper
Introduction
As the IT industry continues to transform, the public cloud is playing a larger role in most companies' IT infrastructure. The public cloud provides users quick access to compute, network, and storage services. However, due to various factors such as cost, resiliency, efficiency, compliance, and enterprise requirements, companies often find it difficult to fit every application in a single private or public cloud. Therefore, many enterprise companies are seeking a hybrid strategy where they can leverage both their private cloud and the public cloud to address the diverse demands of their IT infrastructure. To address these demands, Pure Storage has introduced Cloud Block Store to help customers with their transformation into a hybrid world. Pure Storage's Cloud Block Store is a software defined storage solution that leverages the native storage resources of the public cloud to provide an enhanced storage service with enterprise features. Cloud Block Store addresses the fundamental challenges companies face when looking to use the public cloud.
With a hybrid or multi-cloud strategy, developers gain the flexibility and freedom to navigate and circumvent the potential limitations of a single private or public cloud. Organizations would be able to run in the environment that suits each of their unique applications. Users may want to copy or migrate data between their private and public cloud environments as their environment changes over time. However, private and public cloud environments have unique services, APIs, cost models, performance characteristics, management tools, and architectures. Therefore it can be challenging to run identical applications between two different private or public clouds.
Cloud Block Store solves these problems by providing a common data services layer. The common data services layer provides consistency in storage services across heterogeneous private and public clouds. Developers can redeploy their applications across any environment without the need to refactor, redesign, or re-architect their applications. While based on the Purity Operating Environment of the FlashArray, Pure Storage refactored Cloud Block Store to run in Amazon Web Services (AWS). Cloud Block Store provides industrial-strength block storage with the same industry-leading features and benefits of the FlashArray. The following are the key features and benefits of Cloud Block Store:
- Simplicity
- With a simple AWS CloudFormation template provided by Pure Storage, customers can deploy Cloud Block Store into their Amazon Virtual Private Cloud (Amazon VPC) in a few minutes. Cloud Block Store offers the same simplicity and user experience that all FlashArray customers appreciate. Neither RAID pools nor manual data tiering is required. Once iSCSI connections between Amazon Elastic Compute Cloud (EC2) compute hosts and Cloud Block Store instances are established, customers can easily create and mount Cloud Block Store volumes to their Amazon EC2 hosts in seconds using the same GUI, CLI, or APIs as the FlashArray.
- Industry-leading data reduction services
- The Purity Operating Environment helps customers reduce the underlying cloud storage resources required to house their data. Data is deduplicated in-line and compressed before landing on the underlying storage resources, ultimately reducing storage costs. Thin Provisioning, which is a significant differentiator compared to the public cloud's block storage, allows developers to freely overprovision volumes without the associated costs. Because only unique data blocks written by the host application consume storage resources, thin provisioning truly provides the "set-it-and-forget-it" experience. Also, customers benefit from instantaneous snapshots and clones, which are pointer-based and do not consume additional storage. Imagine the ability to take thousands of snapshots and clones without incurring additional costs.
- Resiliency
- When customers deploy mission-critical applications, they must ensure that their applications are resilient against single points of failure (SPOF). To protect against SPOF within native Elastic Block Store (EBS), applications must send copies of their data to secondary EBS volumes, which effectively doubles their storage consumption. Cloud Block Store offers built-in protection against multiple concurrent backend storage failures using RAID-HA. If a backend storage resource (Virtual Drive) fails, Cloud Block Store will auto-heal and automatically deploy a new replacement. Furthermore, Cloud Block Store incorporates spread placement groups into its architecture, which reduces the physical fault domains. If a physical Amazon Web Services (AWS) failure occurs within an availability zone (AZ), the fault domains limit the failures from affecting multiple resources of a Cloud Block Store instance. For the ultimate level of data protection and business continuity, customers can replicate data between AZs or regions.
- APIs
- When moving data to and from the public cloud, developers must redesign their applications to use different APIs between the various cloud environments, which is a major obstacle. However, with a common underlying storage services layer, developers can continue to use the same APIs and operational workflows whether they use the FlashArray on-premises, Cloud Block Store on AWS, or Cloud Block Store on Azure. Documentation for API Tools can be found on GitHub.
- Mobility
- With the native replication features of Cloud Block Store, customers can easily copy or move data for purposes such as disaster recovery, migration, test/development, and backup. It is important to emphasize that data mobility is not just about moving data. It is also about minimizing the effort to change the applications or operational workflows between different private or public cloud environments. Customers can access and manage their data the same way whether it resides on their on-premises FlashArray or on Cloud Block Store.
- Furthermore, the CloudSnap feature, available on both FlashArray and Cloud Block Store, offers an additional low-cost alternative to back up, copy, or move data to the public cloud.
- Enterprise Capabilities
- Cloud Block Store provides enterprise features that customers expect in their evolving IT environment. These features include data reduction services, instantaneous snapshot creation/restores, always-on encryption, QoS, asynchronous replication, and Purity ActiveCluster synchronous replication.
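The interaction between thin provisioning and deduplication described in the data reduction bullet above can be illustrated with a toy model. This is a minimal sketch only, not Purity's implementation: the class name, the 4 KiB block size, and the use of SHA-256 fingerprints are assumptions made for illustration, and compression and garbage collection are left out.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size in bytes (an assumption)


class ThinDedupVolume:
    """Toy model of a thin-provisioned volume on a deduplicating store."""

    def __init__(self, provisioned_blocks):
        # Logical size presented to the host; nothing is allocated up front.
        self.provisioned_blocks = provisioned_blocks
        self.block_map = {}  # logical block address -> content fingerprint
        self.store = {}      # content fingerprint -> block data, stored once

    def write(self, lba, data):
        if not 0 <= lba < self.provisioned_blocks:
            raise IndexError("write beyond provisioned size")
        fingerprint = hashlib.sha256(data).hexdigest()
        # Identical content is stored only once (deduplication).
        self.store.setdefault(fingerprint, data)
        self.block_map[lba] = fingerprint

    def written_blocks(self):
        return len(self.block_map)

    def physical_blocks(self):
        # Only unique, currently referenced content counts as consumed.
        return len(set(self.block_map.values()))


# A hypothetical one-million-block volume with 101 blocks written,
# 100 of which hold identical content.
volume = ThinDedupVolume(provisioned_blocks=1_000_000)
for lba in range(100):
    volume.write(lba, b"A" * BLOCK_SIZE)
volume.write(100, b"B" * BLOCK_SIZE)

print(volume.provisioned_blocks, volume.written_blocks(), volume.physical_blocks())
```

In this model the provisioned size has no effect on consumption: the host sees one million blocks, has written 101, and only two unique blocks occupy storage.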
Cloud Block Store Architecture and Core Components for AWS
Cloud Block Store uses the existing Purity Operating Environment to deliver the enterprise features and efficiencies available on the FlashArray. Also, it is important to emphasize that the deliberate architecture of Cloud Block Store provides superior resiliency and consistent performance compared to other third-party storage vendors deployed in the public cloud.
Similar to the highly resilient FlashArray, the core components of Cloud Block Store for AWS include dual controllers, NVRAM, and NVMe flash storage. In AWS, Amazon EC2 compute instances are employed for each of the dual controllers where data is processed. For NVRAM, Cloud Block Store uses Amazon Elastic Block Store (Amazon EBS), more specifically io2 volumes. Customer data is stored on backend flash storage through components called Virtual Drives (VDrives). VDrives are composed of EC2 instances and instance store volumes, which provide high-speed NVMe performance for customer read operations. Lastly, all data is mirrored onto Amazon Simple Storage Service (Amazon S3) to ensure an additional layer of durability. Together, these components keep data highly available and durable, with consistent performance.
- Controllers
- The Purity Operating Environment (POE) runs on the Cloud Block Store controllers, which consist of two AWS EC2 instances. The dual controllers ensure high data accessibility and availability in the event of a single controller (EC2) failure. iSCSI connections can receive and transmit IO traffic through both controllers. The Cloud Block Store controllers process the data (deduplicate, compress, and encrypt) before writing to the underlying storage resources.
- NVRAM
- Cloud Block Store uses high-performance io2 volumes as the NVRAM modules. All host write IOs are initially cached to the io2 volumes. A write IO is immediately acknowledged back to the application host after it has been securely written and mirrored onto two io2 volumes.
- Virtual Drives
- The flash modules of a Cloud Block Store instance are called Virtual Drives (VDrives). Each VDrive is composed of an EC2 instance and a direct-attached high-speed NVMe instance store volume. Once a host IO is mirrored to two NVRAM modules, data is eventually flushed to the instance store volumes within the VDrives. Because instance store volumes provide high-speed NVMe data access, they are ideal for servicing all host read IOs.
- Virtual Shelf
- A Virtual Shelf is a grouping of seven individual Virtual Drives. Data is written across the Virtual Shelf using RAID-HA to protect against concurrent dual Virtual Drive failures within a Virtual Shelf.
- Amazon S3
- To ensure the highest durability, all data residing on the Virtual Drives is copied and persisted into Amazon S3. Any Virtual Drive failure will result in a data restore from Amazon S3.
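The write path that the component list above describes (mirror to two NVRAM modules, acknowledge, flush to the Virtual Drives, persist a copy in Amazon S3) can be sketched as a small in-memory model. This is an illustrative sketch, not the Purity data path: the class and method names are hypothetical, RAID-HA striping across the Virtual Shelf is omitted, and both the timing of the S3 copy and serving unflushed reads from NVRAM are assumptions made to keep the model self-consistent.

```python
class ToyCloudBlockStore:
    """In-memory stand-in for the NVRAM -> VDrive -> S3 write path."""

    def __init__(self):
        self.nvram = [{}, {}]  # two mirrored NVRAM modules (io2 volumes)
        self.vdrives = {}      # stands in for instance store on the VDrives
        self.s3 = {}           # stands in for the persisted Amazon S3 copy

    def write(self, lba, data):
        # The write lands on both NVRAM modules before it is acknowledged.
        for module in self.nvram:
            module[lba] = data
        return "ack"

    def flush(self):
        # Later, cached writes move to the VDrives and are copied to S3.
        for lba, data in self.nvram[0].items():
            self.vdrives[lba] = data
            self.s3[lba] = data
        for module in self.nvram:
            module.clear()

    def read(self, lba):
        # Assumption: data not yet flushed is served from NVRAM.
        if lba in self.nvram[0]:
            return self.nvram[0][lba]
        return self.vdrives[lba]

    def rebuild_vdrives_from_s3(self):
        # Models restoring Virtual Drive contents from the S3 copy.
        self.vdrives = dict(self.s3)


cbs = ToyCloudBlockStore()
status = cbs.write(7, b"payload")
cbs.flush()
cbs.vdrives.clear()            # simulate losing the Virtual Drive contents
cbs.rebuild_vdrives_from_s3()
print(status, cbs.read(7))
```

The point of the model is the ordering: the host receives its acknowledgement as soon as two NVRAM copies exist, while the slower flush and S3 copy happen afterwards.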
Use Cases
Cloud Block Store addresses multiple use cases that help customers enhance their existing storage capabilities.
Production
Application Migration
When migrating data to the public cloud, the requirement to redesign existing applications deployed on-premises is a common challenge. The enterprise data storage services of Cloud Block Store allow customers to easily migrate and run their existing mission-critical data in the public cloud without redesigning their applications. Cloud Block Store shares a common abstraction layer with the FlashArray for both management and data access, which allows customers to preserve the same operational workflows, scripts, and orchestration tools. Cloud Block Store's built-in resiliency allows the most mission-critical data to run without requiring the application layer to manage data availability.
For example, many traditional applications attain high availability by leveraging clustering services, which require multiple compute nodes to have access to the same storage volume(s). However, deploying clustering services in AWS is challenging because native EBS volumes have limited capabilities when attaching to multiple EC2 compute instances. A single failure of either the application's EBS volume or the EC2 compute host can lead to a disruptive application outage. To work around these high availability vulnerabilities, customers must redesign their applications, which might be time-consuming or require costly additional AWS resources. Cloud Block Store removes this limitation because applications can access their block storage using iSCSI, which supports attaching volumes to multiple compute instances. This enhancement makes the transition to AWS less challenging because customers can continue to use the clustering services that they are comfortable with, especially for production applications that demand the highest level of resiliency.
Existing developers who have standardized on the rich APIs provided by the FlashArray can continue to use them for Cloud Block Store. The identical scripts and automation tools built on the APIs can be directly applied to Cloud Block Store with little or no change. Once again, this workflow reduces the friction for customers who want to migrate their application data between their private and public cloud environments.
Reduce Storage Costs
Once customers successfully make the transition to AWS with Cloud Block Store, they can realize the benefits of Cloud Block Store enterprise features. Customers can consolidate data onto Cloud Block Store to take advantage of Purity's data reduction capabilities. Existing FlashArray customers can expect the same data reduction ratio already observed on their physical FlashArray. Thin Provisioning with Cloud Block Store further improves a customer's total efficiency. Developers can confidently provision volumes as large as needed without the cost concern or the need to constantly resize their disks.
Increase Availability
In production environments, data availability and protection are the number one priority. Customers can rely on the highly efficient snapshot capabilities of Cloud Block Store to provide periodic point-in-time volume snapshots to protect from unintentional data changes. Additionally, customers who require the highest form of data availability can replicate their data between availability zones or regions. ActiveCluster allows data to be synchronously replicated between Cloud Block Store instances in different availability zones. This feature allows for automatic and transparent application failovers in the event of a complete outage of an availability zone. For larger-scale regional outage protection, data can also be replicated asynchronously between Cloud Block Store instances residing in different regions. Lastly, customers can also leverage Purity's CloudSnap feature to send volume snapshots directly to Amazon S3 buckets for backup purposes. Customers can restore CloudSnap snapshots to any FlashArray or to other Cloud Block Store instances.
Dev/Test
Most dev/test environments rely on the usage of snapshots for their applications. For example, when developers want to test their new code or script against real data, rather than testing against live data, they typically perform their tests on copies of the original data. In the public cloud, developers can make copies using cloud-native snapshot tools. However, depending on the public cloud vendor, these snapshots may incur a charge for the saved data. Furthermore, accessing a snapshot requires a restore process first, which essentially creates a full clone of the desired snapshot. These clones are the equivalent of a full volume and customers are charged as such. Since the data will need to be physically copied onto a new volume, the restore process might take time.
Production and Dev/Test in Cloud
Using Cloud Block Store’s native snapshot capability in a dev/test environment provides a more streamlined and economical option than using the cloud-native snapshot and cloning tools. Customers can run production applications on Cloud Block Store while also generating copies of their production data for development, testing, and analytical purposes. Since Cloud Block Store’s snapshots are metadata pointers to the original data, developers can create virtually thousands of snapshots or clones instantaneously while consuming no additional storage. Physical storage is only consumed when new and unique data is written to the Cloud Block Store volumes. These capabilities improve both efficiency and operational workflow for a dev/test environment.
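The idea that snapshots and clones are metadata pointers to the original data can be shown with a small redirect-on-write model. This is a hedged sketch rather than a description of Purity internals: `ToyArray` and `ToyVolume` are hypothetical names, and copying a Python dictionary of pointers stands in for whatever metadata structure the product actually uses.

```python
class ToyArray:
    """Holds the physical blocks shared by all volumes and clones."""

    def __init__(self):
        self.blocks = {}   # block id -> data (physical storage)
        self.next_id = 0

    def store(self, data):
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        return block_id


class ToyVolume:
    """A volume is only a map of logical addresses to block ids."""

    def __init__(self, array, pointers=None):
        self.array = array
        self.pointers = dict(pointers or {})  # lba -> block id

    def write(self, lba, data):
        # Redirect-on-write: new data gets a new block, so any clone that
        # still points at the old block keeps seeing the old content.
        self.pointers[lba] = self.array.store(data)

    def read(self, lba):
        return self.array.blocks[self.pointers[lba]]

    def clone(self):
        # Copies pointers only; no block data is duplicated.
        return ToyVolume(self.array, self.pointers)


array = ToyArray()
production = ToyVolume(array)
for lba in range(3):
    production.write(lba, f"prod-{lba}".encode())

clones = [production.clone() for _ in range(1000)]
blocks_after_cloning = len(array.blocks)

clones[0].write(0, b"test-change")   # only this write consumes a new block
print(blocks_after_cloning, len(array.blocks), production.read(0))
```

In the model, a thousand clones add no physical blocks, and a write to one clone adds exactly one block while leaving production data untouched.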
Production and Dev/Test with Hybrid/Multi-Cloud
Many customers want to separate their production and dev/test environments. They may want to keep their production data in their private cloud while leveraging the public cloud’s elastic compute capabilities to spin up dev/test environments on demand. Cloud Block Store allows this type of workflow. Customers can run their mission-critical production workload on a physical FlashArray while replicating the same dataset onto a Cloud Block Store managed application. As the need arises, they can spin up their EC2 instances on demand and start testing, development, or analysis of their data on Cloud Block Store. Any scripts or APIs used on the FlashArray can be reused for Cloud Block Store. When they’re done, they can shut down the EC2 instances to reduce costs. If the dataset is no longer needed, they can even remove the Cloud Block Store managed application and redeploy it the next time it is needed.
Disaster Recovery and Migration using Replication
The Purity Operating Environment enables both the FlashArray and Cloud Block Store to use the same replication technology despite running on two different environments. This feature opens the door to new use cases including disaster recovery, data migration, and back-up to the public cloud.
For disaster recovery (DR) solutions, many customers search for ways to incorporate the public cloud. Leveraging the public cloud alleviates the need to manage remote secondary or tertiary physical data centers. In a disaster recovery solution, customers can use Cloud Block Store as a replication target. During a DR failover event, customers can use a replicated snapshot volume on Cloud Block Store to instantaneously clone and attach to the respective application hosts in the public cloud. For protection against a single AZ or regional failure, Cloud Block Store can replicate its own source volumes to other Cloud Block Store instances.
Customers looking to migrate data from their on-premises data center to the public cloud can rely on Cloud Block Store to not only provide the vehicle to move their data, but also provide ongoing enhanced data services. Once data volumes are replicated to Cloud Block Store, customers can easily attach the volumes to the application compute instances over iSCSI with the same simple steps as on a FlashArray.
CloudSnap
CloudSnap is a built-in feature that allows customers to quickly send snapshot copies of their FlashArray or Cloud Block Store volumes to cloud targets like Azure Blob storage or Amazon S3 buckets. These snapshots are self-contained with the metadata needed to restore volumes back onto any other FlashArray or Cloud Block Store deployment. CloudSnap is built for archival and backup/restore purposes, but can potentially be leveraged as a DR alternative for customers who have higher RTO/RPO tolerances. For example, customers can periodically send CloudSnap snapshots to Amazon S3. In a DR event where the primary site is inaccessible, customers can deploy a new Cloud Block Store instance on demand and restore their CloudSnap snapshots. Once the CloudSnap snapshots are fully restored onto the Cloud Block Store instance, customers can attach the restored volumes to the appropriate EC2 instances in their VPC to resume application services. This DR alternative provides a lower-cost option for customers who have a higher RTO/RPO tolerance. Because volumes must be restored from Amazon S3 or Azure Blob storage, the RTO will largely depend on the amount of data that has to be restored.
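Because the restore time scales with the amount of data, a rough planning estimate can be computed from the data size and an assumed sustained restore throughput. The sketch below is simple arithmetic only: the function name is hypothetical, and the 500 MiB/s figure is a made-up planning input, not a measured or published CloudSnap, Amazon S3, or Azure rate. It also ignores the time needed to deploy the new Cloud Block Store instance and reattach volumes.

```python
def estimate_restore_hours(data_tib, throughput_mib_per_s):
    """Return the hours needed to restore data_tib at a sustained rate."""
    if throughput_mib_per_s <= 0:
        raise ValueError("throughput must be positive")
    data_mib = data_tib * 1024 * 1024   # TiB -> MiB
    seconds = data_mib / throughput_mib_per_s
    return seconds / 3600


# Hypothetical inputs: 10 TiB and 20 TiB restored at an assumed 500 MiB/s.
for size_tib in (10, 20):
    hours = estimate_restore_hours(size_tib, 500)
    print(f"{size_tib} TiB -> about {hours:.1f} hours")
```

Under these assumed inputs the 10 TiB restore takes roughly 5.8 hours and doubling the data doubles the time, which is why this approach suits customers with higher RTO tolerances.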
Hybrid Cloud
Cloud Block Store provides an abstraction layer that allows applications to be agnostic to the private or public cloud that they run on. Pure Storage's vision is to enable customers to seamlessly move their data between all the major public cloud vendors using the native replication capabilities of Cloud Block Store. Furthermore, Cloud Block Store's data reduction is preserved when data is replicated, thus reducing the required bandwidth, time, and potential data egress costs. Customers are protected from being locked into any single environment and are afforded the flexibility to migrate or replicate data with minimal effort.
Containers and PSO
Containers are popular among developers because they are self-contained, lightweight, and portable. Pure Service Orchestrator (PSO) provides container storage-as-a-service to help developers deploy scale-out microservices with persistent storage. Developers can provision storage automatically using policies that integrate into their container orchestration framework. Pure Service Orchestrator supports both the FlashArray and Cloud Block Store, which allows customers to use the same scripts to orchestrate and automate storage provisioning to their Kubernetes clusters, both on-premises and on AWS.
Solutions
VMware environments
Using vVols
VMware is a dominant presence in the IT industry. As the IT industry continues to shift towards a hybrid cloud model, data mobility for guest VMs is crucial. However, AWS does not natively support VMFS file systems, so a VMFS volume that is replicated to AWS is not directly readable by the AWS-provided AMIs. Storage vMotion provides a simple solution for customers, allowing them to convert their VMFS datastore-based VMs into distinct VMware vSphere Virtual Volumes (vVols)-based VMs. Each vVol is a standalone volume that is formatted with the respective file system of the guest OS. Therefore, a vVol from a Windows guest VM is directly readable from an EC2 instance with a Windows AMI since it is simply reading an NTFS file system.
VMware Cloud
Customers who deploy VMware Cloud can also take advantage of Cloud Block Store. Cloud Block Store allows customers to expand their VMware Cloud storage capacity independently from the nodes, providing an economical storage option. Although VMware Cloud does not support external storage resources connected to vSphere, guest VMs can still use the same iSCSI protocol to connect to a Cloud Block Store volume. With in-guest iSCSI, customers can mount new Cloud Block Store volumes, or even vVol volumes replicated from their on-premises FlashArray.
Oracle, SAP HANA, SQL Server and Other Database Environments
AWS provides native snapshot capabilities that are useful when creating backups of EBS volumes. These snapshots are taken on EBS volumes and stored onto Amazon S3. To read or write to these snapshots, users must copy the snapshots back from Amazon S3 onto new EBS volumes. The length of time to complete the copy depends on the amount of data stored on Amazon S3. This practice is adequate for basic backup/restore use cases, but for many database environments it can be slow and expensive. More specifically, many developers rely on databases to test their scripts, applications, or analytics engines. Performing these tests on live production databases is not an option. Therefore, snapshots and clones are essential for many developers. Cloud Block Store allows developers to instantaneously create snapshot and clone copies of production databases. Also, customers can instantly restore their databases with ease. The data reduction engine of Cloud Block Store reduces the overall costs of creating the database snapshots since only new and unique blocks of data consume additional storage space. Customers can potentially create hundreds of snapshots and clones without consuming additional storage or incurring additional costs.
Replication
Array-based replication allows customers to efficiently copy or migrate data between Cloud Block Store instances and physical FlashArrays (or other Cloud Block Store instances). Cloud Block Store supports both bi-directional asynchronous replication and synchronous replication using ActiveCluster. Cloud Block Store also supports one-to-many, many-to-one, and Active/Active Async topologies similar to those of the FlashArray.
Leveraging the same robust Purity Operating Environment as the FlashArray, Cloud Block Store provides a proven, highly efficient, and rich replication solution. Not only can customers reduce the underlying storage footprint of their existing data, they can also reduce the amount of data sent across the wire. Data is always deduplicated and compressed prior to replication. Additionally, metadata is continuously shared between replicating parties, which prevents data from being replicated if the blocks already exist on the target. These built-in efficiencies reduce the overall time, bandwidth, and potential egress cost associated with replicating data.
Customers can replicate data between FlashArrays and Cloud Block Store instances as long as there is connectivity between the two appliances. Customers can connect a VPC to their datacenter by using an AWS Site-to-Site VPN connection or AWS Direct Connect between the two sites. For customers who want to connect two Cloud Block Store instances that reside in separate VPCs or regions, VPC peering connections can be used.
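The effect of sharing metadata so that blocks already present on the target are not resent can be sketched as follows. This is an illustration under stated assumptions, not the Purity replication protocol: the `replicate` function is hypothetical, SHA-256 fingerprints stand in for the shared metadata, a plain dictionary lookup stands in for the metadata exchange, and compression is omitted.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size in bytes (an assumption)


def fingerprint(block):
    return hashlib.sha256(block).hexdigest()


def replicate(source_blocks, target_store):
    """Send only blocks the target lacks; return the bytes sent."""
    bytes_sent = 0
    for block in source_blocks:
        fp = fingerprint(block)
        if fp not in target_store:   # metadata check before sending data
            target_store[fp] = block
            bytes_sent += len(block)
    return bytes_sent


block_a = b"A" * BLOCK_SIZE
block_b = b"B" * BLOCK_SIZE
block_c = b"C" * BLOCK_SIZE

# The target already holds block A from an earlier replication cycle.
target = {fingerprint(block_a): block_a}
source = [block_a, block_b, block_b, block_c]

logical_bytes = sum(len(block) for block in source)
sent_bytes = replicate(source, target)
print(logical_bytes, sent_bytes)
```

Of the 16384 logical bytes in this example, only 8192 cross the wire: block A already exists on the target and the duplicate copy of block B is sent once.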
The replication workflow is identical to existing steps on the FlashArray. Customers can use protection groups (pgroups) to asynchronously replicate groups of interdependent volumes consistently. Customers can also synchronously replicate volume(s) using stretched pods with ActiveCluster. There are no requirements to learn new procedures or change operational workflows when replicating with Cloud Block Store. And to reduce the overall failover RTO, customers can use the same existing Purity REST APIs to streamline and automate the failover workflow.
Non-Disruptive Upgrades (NDU)
As new Purity software code or next-generation hardware becomes available, Pure's FlashArray facilitates effortless non-disruptive in-place upgrading. Customers who consume services from the public cloud expect an experience with minimal intervention. Cloud Block Store follows these principles and provides non-disruptive upgrades (NDU) for capacity increases and new Purity code releases. When an NDU is requested, Pure Support will remotely execute internal scripts to automate the upgrade process. Since Cloud Block Store is fully redundant, customers do not need to perform manual failovers and failbacks. Cloud Block Store data services remain online, and the data is accessible for the duration of the upgrade.
Procurement and Deployment
Procurement
Pure Storage provides two flexible options to obtain a Pure-as-a-Service subscription, which will allow customers to deploy and consume Cloud Block Store capacity.
CBS Licensing Marketplace link
- Pure-as-a-Service via Pure
- Customers can work with Pure Storage partners to obtain a Pure-as-a-Service subscription contract. The subscription contract includes a Cloud Block Store license key that allows customers to deploy Cloud Block Store in their desired AWS Virtual Private Cloud (VPC). This option requires a minimum one-year contract but is the most economical option.
- Pure-as-a-Service via AWS Marketplace
- Customers who prefer to avoid long-term contracts or who have pre-commits with AWS can go directly to the AWS Marketplace and obtain a Pure-as-a-Service subscription and Cloud Block Store license key. This license key allows customers to deploy and use Cloud Block Store in their desired VPC. Customers who start with this short-term contract have the option to migrate to a longer-term Pure-as-a-Service contract if desired.
Deployment
CBS Deployment Marketplace Link
Customers can deploy Cloud Block Store in a few simple steps. Whether customers obtained the Pure-as-a-Service subscription through Pure (and Pure partners) or the AWS Marketplace, the steps to deploy a Cloud Block Store instance are identical. Customers start at the AWS Marketplace, where they can search for and select the Cloud Block Store Product Deployment listing. The product deployment listing leads customers to the AWS CloudFormation service, which guides them through launching new Cloud Block Store instances in their desired VPC and subnets. The deployment is completely automated and results in a fully initialized Cloud Block Store virtual appliance. See the Cloud Block Store Deployment and Configuration Guide for detailed deployment steps and prerequisites.