Pure Technical Services

Transferring VMware data from an on-premises FlashArray | Pure CBS on AWS


Introduction

This document describes how to transfer VMware data from an on-premises FlashArray to a Cloud Block Store (CBS) instance in Amazon Web Services (AWS). Because AWS does not recognize the Virtual Machine File System (VMFS), VMware data that is replicated from a FlashArray to a CBS instance in VMFS format is not usable in the cloud. Use the following process to make the data usable in AWS:

  • Data on the VMFS Datastore is first migrated to a Virtual Volume (vVol) Datastore
  • The data is then replicated from the vVol Datastore on the FlashArray to a CBS array in AWS
  • The replicated data on the CBS array can then be connected to Elastic Compute Cloud (EC2) instances with matching operating systems. The replicated volumes are connected as additional data volumes

Please note that CBS volumes cannot be connected to EC2 instances as boot volumes, so EC2 instances cannot boot from CBS.

The steps listed in this document go through the process of replicating a SQL database to AWS. Similar steps can be followed to transfer other data types.

The Existing SQL Server Setup

We start with an existing SQL Server database on-prem. The setup of the existing SQL Server database is as follows:

Following the best practices listed in the document below, the database volume and the logs volume are separated from the boot volume. Please note that this is just a sample setup, and following this setup is not a requirement for transferring data to CBS.

https://support.purestorage.com/Solutions/Microsoft_Platform_Guide/Microsoft_SQL_Server/001_Microsoft_SQL_Server_Quick_Reference

Three volumes are created on the FlashArray, namely SQL_Boot, SQL_DB, and SQL_Logs:

volumes.png

Three Datastores are created in vCenter, one on each of the three volumes. A VM called SQLSrv-2019 is created on the datastore that uses the SQL_Boot volume, and the datastores SQL_DB and SQL_Logs are added as additional Hard Disk devices to the SQLSrv-2019 VM (see below):

vm_datastores_1.png

Microsoft Windows and SQL Server are installed on the SQLSrv-2019 VM. During SQL Server installation, the SQL_DB volume (drive letter E:) and SQL_Logs volume (drive letter F:) are selected as locations to store the database files and log files respectively:

database-logs2.png

In the SQLSrv-2019 VM, a database called CBSDatabase is created with a table called dbo.Contacts, and a few rows of data are added to the table, as shown below:

fa-sql-1.png

This database is used as an example to show how VMware data stored in VMFS format is transferred from a FlashArray to a CBS instance in AWS.

 

Migrating Data from a VMFS Datastore to a vVol Datastore

Before data can be migrated from a VMFS Datastore to a vVol Datastore, it is recommended that the FlashArray plugin be installed in vCenter, in order to simplify vVol administration.

The following document goes through the steps to install the FlashArray plugin in vCenter, and to register the FlashArray VASA provider with vCenter:

https://support.purestorage.com/Solutions/VMware_Platform_Guide/003Virtual_Volumes_-_VVols/Web_Guide%3A_Virtual_Volumes_Quick_Start_Guide

When the steps in the document at the above link are followed to install the FlashArray plugin in vCenter, and to register the VASA provider with vCenter, a Protocol Endpoint called pure-protocol-endpoint is automatically created on the FlashArray. 

To verify the existence of the protocol endpoint, select the ESXi host in vCenter, and go to Protocol Endpoints under Configure. The Protocol Endpoint should be visible, as shown below:

vcenter-pe.png

If the Protocol Endpoint doesn't exist, contact Pure Storage Support for troubleshooting help.

After the steps above have been completed, a vVol Datastore should be created.

Creating a vVol Datastore

The next step is to create a vVol Datastore. This can be done using the following steps. Right-click on the host or the host cluster, and select New Datastore under Storage:

vvol_1.png

Select the VVol option for the datastore type:

vvol_2.png

Specify a name for the new vVol Datastore, select the vVol container for the FlashArray, and click Next. In this example, we are creating a vVol Datastore called Vvol-TMEFA07-DS:

vvol_4.png

Select the host or host cluster that will have access to the vVol Datastore:

vvol_5.png

Review the summary, and click Finish to create the datastore:

vvol_6.png

View the list of datastores for the ESXi host to verify that the newly created vVol Datastore called Vvol-TMEFA07-DS has been created:

vvol_7.png

 

Migrating the VM to the vVol Datastore

After the vVol Datastore has been created, the next step is to migrate the storage for the SQLSrv-2019 VM to the vVol Datastore called Vvol-TMEFA07-DS. Right-click on the VM SQLSrv-2019 and select the Migrate option:

VMFS_Vvol_1PP.png

Select the Change Storage option. The option to change both compute and storage can also be used, if desired:

VMFS_VVols_PP2.png

Select the vVol Datastore called Vvol-TMEFA07-DS as the storage destination, and click Next:

VMFS_VVols_PP3.png

Review the summary and click Finish to migrate the storage for the SQLSrv-2019 VM to the Vvol-TMEFA07-DS Datastore:

VMFS_VVols_PP4.png

When the migration is complete, verify that the datastore for the VM SQLSrv-2019 is now the vVol Datastore Vvol-TMEFA07-DS as shown below:

vm_vvol.png

After the datastore for the VM SQLSrv-2019 has been migrated from VMFS to the vVol Datastore, the VM can be replicated to a CBS instance in AWS.

 

Replicating Data from the FlashArray to CBS

Before replicating data from the FlashArray to a CBS instance in AWS, we need to deploy a CBS instance, and connect it to the on-prem FlashArray.

Important prerequisites:

  • A VPN connection or an AWS Direct Connect connection must be established between the customer's data center that hosts the FlashArray and the AWS VPC in which CBS is deployed
  • The CBS array and the on-prem FlashArray must be able to reach each other over that network connection
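
Routing between the data center and the VPC is generally only workable when the two sides use address ranges that do not overlap. The following is a minimal sketch of that sanity check using Python's standard ipaddress module. The CIDR ranges in the example are hypothetical placeholders, not values from this setup; substitute your own on-prem and VPC ranges.

```python
import ipaddress

def find_overlaps(on_prem_cidrs, vpc_cidrs):
    """Return (on_prem, vpc) pairs of CIDR strings whose address ranges overlap."""
    overlaps = []
    for on_prem in on_prem_cidrs:
        on_prem_net = ipaddress.ip_network(on_prem)
        for vpc in vpc_cidrs:
            if on_prem_net.overlaps(ipaddress.ip_network(vpc)):
                overlaps.append((on_prem, vpc))
    return overlaps

# Hypothetical ranges for illustration only.
on_prem_ranges = ["10.0.0.0/16", "192.168.10.0/24"]
vpc_ranges = ["10.0.128.0/20", "172.31.0.0/16"]
print(find_overlaps(on_prem_ranges, vpc_ranges))
```

An empty result means no conflicting ranges were found. It does not prove that the VPN or Direct Connect link is actually up; that still has to be verified separately.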

 

Deploying a Cloud Block Store (CBS) Instance in AWS

Important prerequisite:

  • A VPN connection or an AWS Direct Connect connection must be established between the customer's data center where the FlashArray resides and the AWS VPC in which CBS is deployed, so that the CBS array and the on-prem FlashArray can reach each other over the network connection

For instructions on how to deploy a Cloud Block Store instance in AWS, please see the CBS deployment guide at the link below:

https://support.purestorage.com/FlashArray/PurityFA/Cloud_Block_Store/Cloud_Block_Store_Deployment_and_Configuration_Guide_for_AWS

After the CBS array has been deployed, the IP address to access the CBS GUI is displayed under the Outputs tab in CloudFormation, as shown below:

cf.png

Since the CBS instance is launched in a private subnet in AWS, it is not accessible from the internet. If you don't already have a jump server in the VPC where the CBS array is launched, you may need to create one to access the CBS array. Alternatively, the CBS array can be accessed from a system in a data center that is connected via VPN or AWS Direct Connect to the VPC where the CBS array is deployed.

To make sure that the CBS array is running and that it is accessible from the jump server, open a browser on the jump server and enter the IP address of the CBS array. You should see the CBS GUI login screen shown below:

cbs-login.png

 

Connecting the FlashArray to the CBS Array

After the CBS array is up and running, set up Asynchronous Replication between the FlashArray and the CBS array by following the steps below. For additional details on setting up Asynchronous Replication, please check out the document at this link:

https://support.purestorage.com/FlashArray/PurityFA/Protect/FlashArray_Replication/FlashRecover_Replication_Configuration_and_Best_Practices_Guide_-_Purity%2F%2F%2F%2FFA_5.0_and_Below

Log into the GUI of the CBS array, and go to Settings > Network. Note the management and replication IP addresses of the CBS array:

cbs-networklist.png

Go to Storage > Array on the CBS array, and under Connected Arrays, click on the menu icon on the right-hand side, and select Get Connection Key:

connect_1.png

Copy the Connection Key that is displayed:

connect_2.png

Next, log into the on-prem FlashArray, and go to Storage > Array. Under Array Connections, click on the menu on the right, and select Connect Array:

connect_3.png

Enter the IP address and the Connection Key from the CBS array; select Async Replication for the Type, and Ethernet (IP) for the Replication Transport, and press Connect:

connect_4.png

If the connection is successful, the name of the CBS array should appear under Array Connections, as shown below:

connect_5.png

Log into the CBS array, and go to Storage > Array. The name of the FlashArray should be listed under Connected Arrays:

connect_6.png

After a CBS array has been launched in AWS, and Asynchronous Replication has been set up between the on-prem FlashArray and the CBS array, the next step is to transfer data from the FlashArray to CBS.

Transferring Data from the vVol Datastore to CBS

Follow the steps below to transfer data from the vVol Datastore on the FlashArray to the CBS array.

Creating a Protection Group

Log into the FlashArray GUI, and under Protection Groups, select the option to create a new protection group. Select a name for the new protection group, and click Create:

Vvols_CBS_1.png

In the newly created protection group, configure the following three items:

1. Add the volumes that need to be replicated

2. Add the name of the CBS array as the replication target

3. Configure the replication schedule

Adding Volumes to the Protection Group

To add the volumes that need to be replicated, select the newly created protection group. Click on the menu icon under Members, and select Add Volumes. In the Add Members window, select the volumes that need to be replicated to the CBS array.

In this example, though we only need the SQL_DB and SQL_Logs volumes to restore the sample SQL database in AWS, we replicate all the SQL Server volumes to CBS as a best practice. These volumes will be required in the event that we need to restore this database on a FlashArray.

Vvols_CBS_4.png

Adding the CBS array as the Replication Target

Next, under Targets for the newly created protection group, click on the menu, and select Add. In the Add Targets window, select the name of the CBS array to which the selected volumes need to be replicated, and click Add:

target.png

Configuring the Replication Schedule

Next, click on the icon under Replication Schedule, and in the Edit Replication Schedule window, enable the replication schedule; select the replication frequency, and the retention period; then click Save:

schedule.png
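
When choosing the replication frequency and retention period, it can help to estimate how many replicated snapshots will accumulate on the CBS array. The following is a rough sketch of that arithmetic, assuming every replicated snapshot is kept for the whole retention period. The function name and the example values are hypothetical, and the estimate ignores transfer time and any additional long-term retention settings.

```python
from datetime import timedelta

def snapshots_on_target(frequency: timedelta, retention: timedelta) -> int:
    """Estimate how many replicated snapshots are kept on the target at steady state."""
    if frequency <= timedelta(0):
        raise ValueError("frequency must be positive")
    # Dividing one timedelta by another with // yields a whole number of intervals.
    return retention // frequency

# Hypothetical schedule: replicate every 4 hours, keep snapshots for 1 day.
print(snapshots_on_target(timedelta(hours=4), timedelta(days=1)))  # 6
```

A shorter frequency reduces the amount of recent data that has not yet been replicated, at the cost of more snapshots and more replication traffic over the VPN or Direct Connect link.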

Under Protection Groups, select the newly created protection group, and review the information under Members, Targets, and Replication Schedule:

Vvols_BBS_6.png

When the protection group has been created and configured, the FlashArray starts replicating data to the CBS array.

After data has been replicated to the CBS array, the replicated snapshots will show up in the CBS array under Volume Snapshots, as shown below:

CBSSnaps.png

This completes the process of replicating data from the on-prem FlashArray to the CBS array in AWS.

 

Launching a SQL Server EC2 Instance in AWS

In order to use the data replicated to CBS, we launch an EC2 instance in AWS, and select an AMI with Microsoft Windows and SQL Server pre-installed.

Go to the EC2 Dashboard in the AWS Management Console, and launch a new EC2 instance. In the Choose an AMI menu, select an AMI with Microsoft Windows and SQL Server, as shown below:

sql_ami.png

Make sure that this EC2 instance has network connectivity with the CBS array, so that an iSCSI connection can be established between the EC2 instance and CBS. 
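
One quick way to check this connectivity from the EC2 instance is to attempt a TCP connection to the iSCSI port on the CBS array. The following is a minimal sketch using only the Python standard library. It assumes the array listens on the standard iSCSI port 3260, and the IP address in the usage comment is a hypothetical placeholder for one of the iSCSI IP addresses of your CBS array.

```python
import socket

ISCSI_PORT = 3260  # standard iSCSI target port (assumed to be used by the array)

def can_reach(host: str, port: int = ISCSI_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False

# Usage (hypothetical address):
#   can_reach("10.0.1.25")
```

A False result usually points to a security group, route table, or network ACL that blocks traffic between the EC2 instance and the CBS array, but this check alone cannot tell which one.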

 

Connecting the SQL Server EC2 Instance to CBS

Follow the steps below to connect the SQL Server EC2 instance to the replicated volumes on the CBS array:

 

Creating Volumes from the Replicated Snapshots on CBS

In order to use the replicated data on the CBS array, we first need to create volumes from the replicated snapshots. In this example, we only need to create volumes from two of the replicated snapshots: the database snapshot and the logs snapshot.

Log into the CBS array, and go to Storage > Volumes. Under Volume Snapshots, identify the replicated snapshot that contains the database, click on the menu to the right of the snapshot, and select Copy from the menu.

Hint: You can use the size of each snapshot to determine which snapshot contains the boot volume, which contains the database volume, and which contains the logs volume. This works only if the source volumes have different sizes.

copy_1.png
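
The size-based hint can be expressed as a small lookup. The following sketch is illustrative only: the snapshot names and sizes are hypothetical placeholders, and in practice you would read the real values from the Volume Snapshots list in the CBS GUI. It deliberately fails when two snapshots share a size, because the hint cannot tell them apart in that case.

```python
def match_snapshots_by_size(snapshot_sizes, expected_sizes):
    """Map each role to the single snapshot whose size equals the expected size.

    snapshot_sizes: dict of snapshot name -> provisioned size in GiB
    expected_sizes: dict of role -> provisioned size in GiB of the source volume
    Raises ValueError when a size matches zero or several snapshots.
    """
    result = {}
    for role, size in expected_sizes.items():
        candidates = [name for name, s in snapshot_sizes.items() if s == size]
        if len(candidates) != 1:
            raise ValueError(
                f"{role}: expected exactly one snapshot of {size} GiB, "
                f"found {len(candidates)}"
            )
        result[role] = candidates[0]
    return result

# Hypothetical names and sizes for illustration only.
snapshots = {"snap-a": 100, "snap-b": 500, "snap-c": 200}
roles = {"boot": 100, "database": 500, "logs": 200}
print(match_snapshots_by_size(snapshots, roles))
```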

In the Copy Snapshot window below, select a name for the volume that'll be created from the snapshot. In this example, we use the default container "/", and call the database volume CBSDatabase-SQL-DB:

copy4.png

Similarly, copy the replicated logs snapshot to a volume. In this example, we select "/" as the container, and name the logs volume CBSDatabase-SQL-Logs.

copy5.png

Verify that the volumes CBSDatabase-SQL-DB and CBSDatabase-SQL-Logs have been created.

vols.png

 

Creating & Configuring a Host on CBS

In the CBS array GUI, go to Storage > Hosts, and under Hosts, either click on "+" or select Create under the menu icon. In the Create Host window below, enter a name for the host, and click on Create:

create-host.png

Next, go to the newly created host, click on the menu under Connected Volumes, and select Connect. The Connect Volumes to Host window will appear. In this window, select the SQL volumes, and click on Connect:

add-vols.png

Next, log into the SQL Server EC2 instance, and open the iSCSI Initiator Properties box. Go to the Configuration tab, and copy the Initiator Name:

iqn-1.png

Go back to the CBS array, and under Hosts, select the newly created host. Under Host Ports, click on the menu, and select Configure IQNs:

config-iqn.png

In the Configure iSCSI IQNs window, paste the Initiator Name copied from the iSCSI Initiator Properties window in the SQL Server EC2 instance, and click Add:

host-iqn.png
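
A mistyped or truncated Initiator Name is an easy mistake to make at this step. The following is a minimal sketch that checks whether a string has the general shape of an iSCSI qualified name (iqn.yyyy-mm.reversed-domain, optionally followed by a colon and an identifier). The regular expression is a simplified approximation of the format, not a full validator, and the example names are hypothetical.

```python
import re

# Simplified IQN shape: "iqn." + year-month + "." + reversed domain + optional ":identifier"
_IQN_PATTERN = re.compile(
    r"iqn\.\d{4}-(0[1-9]|1[0-2])\.[a-z0-9]([a-z0-9.-]*[a-z0-9])?(:\S+)?",
    re.IGNORECASE,
)

def looks_like_iqn(name: str) -> bool:
    """Return True if name, ignoring surrounding whitespace, has the shape of an IQN."""
    return _IQN_PATTERN.fullmatch(name.strip()) is not None

# Hypothetical initiator names for illustration only.
print(looks_like_iqn("iqn.1991-05.com.microsoft:ec2-sql-host"))  # True
print(looks_like_iqn("iqn.1991-13.com.microsoft:ec2-sql-host"))  # False (month 13)
```

A name that passes this check can still be the wrong one, so compare it against the Configuration tab of the iSCSI Initiator Properties box before adding it to the host.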

Select the newly created host under Hosts to review the information:

host-done.png

 

Creating an iSCSI Connection using the iSCSI Initiator

In this example, we show a simple way to connect the SQL Server EC2 instance to the CBS array. For information on how to use PowerShell for configuration, how to set up MPIO, or how to connect a Linux host to the CBS array, please see the CBS deployment guide.

In the SQL Server EC2 instance, go to the iSCSI Initiator Properties box, and select the Targets tab. Under Target, enter one of the iSCSI IP addresses of the CBS array, and click on Quick Connect:

iscsi-connect.png

If the connection is successful, the CBS array will appear in the list of Discovered targets:

connect1.png

 

Managing the Connected Disks in Windows

In the SQL Server EC2 instance, go to Computer Management, and under Actions, click on Rescan Disks:

scan.png

Click on Disk Management under Storage. The volumes on the CBS array should appear as offline disks:

offline.png

Right-click on each disk, and select Online. The volumes should come online:

DM2.png

Go to This PC and confirm that the volumes are connected, and that the SQL data is present in the volumes:

PC2.png

 

Connecting SQL Server Management Studio to the Replicated Database

Launch SQL Server Management Studio in the SQL Server EC2 instance. Select the default Server name, and click on Connect:

cbs-sql-1.png

In Object Explorer, right-click on Databases, and select Attach:

cbs-sql-3.png

Click Add on the next screen:

cbs-sql-4.png

Expand the drive letter that contains the database files (drive F: in this case), and select the DATA folder. You should see the CBSDatabase.mdf file that was replicated from the on-prem SQL Server. Select the CBSDatabase.mdf file, and click OK:

cbs-sql-6.png

The following screen will appear, which shows both the database file and the log file as located on the F: drive. Since the logs for this database are located on a separate volume, which is attached as drive G:, the location of the log file needs to be changed. Under "CBSDatabase" database details, click on the menu icon to the right of Log in the File Type column:

next-logs.png

Expand the G: drive, and select the Data folder. You should see the CBSDatabase_log.ldf file in the box on the right. Select this file, and click OK:

cbs-sql-10.png

After selecting the database file and the log file, click OK to attach the CBSDatabase database to the SQL Server instance:

cbs-sql-11.png
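
The same attach operation can also be expressed as a T-SQL statement instead of the GUI steps above. The following sketch builds a CREATE DATABASE ... FOR ATTACH statement as a string, which could then be run in a query window. The file paths are hypothetical, since the full paths to the .mdf and .ldf files depend on how SQL Server was installed on-prem; replace them with the paths shown in the Attach Databases window.

```python
def build_attach_statement(database: str, data_file: str, log_file: str) -> str:
    """Build a T-SQL statement that attaches a database from existing files."""
    def quote_identifier(name: str) -> str:
        # Bracket-quote the name and escape any closing bracket inside it.
        return "[" + name.replace("]", "]]") + "]"

    def quote_string(value: str) -> str:
        # Unicode string literal with single quotes doubled.
        return "N'" + value.replace("'", "''") + "'"

    return (
        f"CREATE DATABASE {quote_identifier(database)}\n"
        f"ON (FILENAME = {quote_string(data_file)}),\n"
        f"   (FILENAME = {quote_string(log_file)})\n"
        "FOR ATTACH;"
    )

# Hypothetical paths for illustration only.
print(build_attach_statement(
    "CBSDatabase",
    r"F:\DATA\CBSDatabase.mdf",
    r"G:\Data\CBSDatabase_log.ldf",
))
```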

Next, expand the Databases folder, and expand CBSDatabase:

cbs-sql-12.png

Expand Tables, and right-click on dbo.Contacts:

cbs-sql-13.png

Select Edit Top 200 Rows, or Select Top 1000 Rows:

cbs-sql-14.png

You should see the rows that were created on the SQL Server in the on-prem data center:

cbs-sql-15.png