Transferring VMware data from an on-premises FlashArray to CBS
Introduction
This document describes how to transfer VMware data from an on-premises FlashArray to a Cloud Block Store (CBS) instance in Amazon Web Services (AWS). Because AWS does not recognize the Virtual Machine File System (VMFS), VMware data that is replicated from a FlashArray to a CBS instance in VMFS format is not usable in the cloud. To make the data usable in AWS, follow this process:
- Data on the VMFS Datastore is first migrated to a Virtual Volume (vVol) Datastore
- The data is then replicated from the vVol Datastore on the FlashArray to a CBS array in AWS
- The replicated data on the CBS array can then be connected to Elastic Compute Cloud (EC2) instances with matching operating systems. The replicated volumes are connected as additional data volumes
Please note that CBS volumes cannot be connected to EC2 instances as boot volumes, so EC2 instances cannot boot from CBS.
The steps listed in this document go through the process of replicating a SQL database to AWS. Similar steps can be followed to transfer other data types.
The Existing SQL Server Setup
We start with an existing SQL Server database on-prem. The setup of the existing SQL Server database is as follows:
Following the best practices listed in the document below, the database volume and the logs volume are separated from the boot volume. Please note that this is just a sample setup, and following this setup is not a requirement for transferring data to CBS.
Three volumes are created on the FlashArray, namely SQL_Boot, SQL_DB, and SQL_Logs:
Three Datastores are created in vCenter, one on each of the three volumes. A VM called SQLSrv-2019 is created on the datastore that uses the SQL_Boot volume, and hard disks on the SQL_DB and SQL_Logs datastores are added to the SQLSrv-2019 VM as additional Hard Disk devices (see below):
Microsoft Windows and SQL Server are installed on the SQLSrv-2019 VM. During SQL Server installation, the SQL_DB volume (drive letter E:) and SQL_Logs volume (drive letter F:) are selected as locations to store the database files and log files respectively:
In the SQLSrv-2019 VM, a database called CBSDatabase is created with a table called dbo.Contacts, and a few rows of data are added to the table, as shown below:
This database is used as an example to show how VMware data stored in VMFS format is transferred from a FlashArray to a CBS instance in AWS.
Migrating Data from a VMFS Datastore to a vVol Datastore
Before data can be migrated from a VMFS Datastore to a vVol Datastore, it is recommended that the FlashArray plugin be installed in vCenter, in order to simplify vVol administration.
The following document goes through the steps to install the FlashArray plugin in vCenter, and to register the FlashArray VASA provider with vCenter:
When the steps in the document at the above link are followed to install the FlashArray plugin in vCenter, and to register the VASA provider with vCenter, a Protocol Endpoint called pure-protocol-endpoint is automatically created on the FlashArray.
To verify the existence of the protocol endpoint, select the ESXi host in vCenter, and go to Protocol Endpoints under Configure. The Protocol Endpoint should be visible, as shown below:
If the Protocol Endpoint doesn't exist, contact Pure Storage Support for troubleshooting help.
After the steps above have been completed, a vVol Datastore should be created.
Creating a vVol Datastore
To create the vVol Datastore, right-click on the host or the host cluster, and select New Datastore under Storage:
Select the VVol option for the datastore type:
Specify a name for the new vVol Datastore, select the vVol container for the FlashArray, and click Next. In this example, we are creating a vVol Datastore called Vvol-TMEFA07-DS:
Select the host or host cluster that will have access to the vVol Datastore:
Review the summary, and click Finish to create the datastore:
View the list of datastores for the ESXi host to verify that the newly created vVol Datastore called Vvol-TMEFA07-DS has been created:
Migrating the VM to the vVol Datastore
After the vVol Datastore has been created, the next step is to migrate the storage for the SQLSrv-2019 VM to the vVol Datastore called Vvol-TMEFA07-DS. Right-click on the VM SQLSrv-2019 and select the Migrate option:
Select the Change Storage option. The option to change both compute and storage can also be used, if desired:
Select the vVol Datastore called Vvol-TMEFA07-DS as the storage destination, and click Next:
Review the summary and click Finish to migrate the storage for the SQLSrv-2019 VM to the Vvol-TMEFA07-DS Datastore:
When the migration is complete, verify that the datastore for the VM SQLSrv-2019 is now the vVol Datastore Vvol-TMEFA07-DS as shown below:
After the datastore for the VM SQLSrv-2019 has been migrated from VMFS to the vVol Datastore, the VM can be replicated to a CBS instance in AWS.
Replicating Data from the FlashArray to CBS
Before replicating data from the FlashArray to a CBS instance in AWS, we need to deploy a CBS instance, and connect it to the on-prem FlashArray.
Important prerequisites:
- A VPN connection or an AWS Direct Connect connection needs to be established between the customer's data center where the FlashArray resides, and the AWS VPC in which CBS is deployed
- The CBS array and the on-prem FlashArray must be able to reach each other over this network connection
Deploying a Cloud Block Store (CBS) Instance in AWS
For instructions on how to deploy a Cloud Block Store instance in AWS, please see the CBS deployment guide at the link below:
After the CBS array has been deployed, the IP address to access the CBS GUI is displayed under the Outputs tab in CloudFormation, as shown below:
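The values on the Outputs tab can also be read programmatically. The sketch below is one possible approach, not part of the documented procedure: it flattens the response of the AWS SDK for Python (boto3) `describe_stacks` call into a dictionary. The stack name is hypothetical, and the output keys depend on the CBS CloudFormation template, so no specific key name is assumed here.

```python
from typing import Any, Dict


def stack_outputs(describe_stacks_response: Dict[str, Any]) -> Dict[str, str]:
    """Flatten the Outputs of the first stack in a describe_stacks response."""
    stacks = describe_stacks_response.get("Stacks", [])
    if not stacks:
        raise ValueError("response contains no stacks")
    outputs = stacks[0].get("Outputs", [])
    return {item["OutputKey"]: item["OutputValue"] for item in outputs}


def fetch_stack_outputs(stack_name: str) -> Dict[str, str]:
    """Query CloudFormation for a stack's outputs.

    Requires the boto3 package and valid AWS credentials.
    """
    import boto3  # imported lazily so stack_outputs() works without boto3

    client = boto3.client("cloudformation")
    return stack_outputs(client.describe_stacks(StackName=stack_name))


# Example (hypothetical stack name; substitute the one used at deployment):
#   for key, value in fetch_stack_outputs("my-cbs-stack").items():
#       print(f"{key}: {value}")
```

Keeping the parsing separate from the AWS call makes it possible to check the helper against a saved response without network access.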
Since the CBS instance is launched in a private subnet in AWS, it is not accessible from the internet. If you don't already have a jump server in the VPC where the CBS array is launched, you may need to create one to access the CBS array. Alternatively, the CBS array can be accessed from a system in a data center that is connected via VPN or AWS Direct Connect to the VPC where the CBS array is deployed.
To make sure that the CBS array is running, and that it is accessible from the jump server, open a browser on the jump server, and enter the IP address of the CBS array. You should see the CBS GUI shown below:
Connecting the FlashArray to the CBS Array
After the CBS array is up and running, set up Asynchronous Replication between the FlashArray and the CBS array by following the steps below. For additional details on setting up Asynchronous Replication, please check out the document at this link:
Log into the CBS array GUI, and go to Settings > Network. Note the management and replication IP addresses of the CBS array:
Go to Storage > Array on the CBS array, and under Connected Arrays, click on the menu icon on the right-hand side, and select Get Connection Key:
Copy the Connection Key that is displayed:
Next, log into the on-prem FlashArray, and go to Storage > Array. Under Array Connections, click on the menu on the right, and select Connect Array:
Enter the IP address and the Connection Key from the CBS array; select Async Replication for the Type, and Ethernet (IP) for the Replication Transport, and press Connect:
If the connection is successful, the name of the CBS array should appear under Array Connections, as shown below:
Log into the CBS array, and go to Storage > Array. The name of the FlashArray should be listed under Connected Arrays:
After a CBS array has been launched in AWS, and Asynchronous Replication has been set up between the on-prem FlashArray and the CBS array, the next step is to transfer data from the FlashArray to CBS.
Transferring Data from the vVol Datastore to CBS
Follow the steps below to transfer data from the vVol Datastore on the FlashArray to the CBS array.
Creating a Protection Group
Log into the FlashArray GUI, and under Protection Groups, select the option to create a new protection group. Select a name for the new protection group, and click Create:
In the newly created protection group, configure the following 3 items:
1. Add the volumes that need to be replicated
2. Add the name of the CBS array as the replication target
3. Configure the replication schedule
Adding Volumes to the Protection Group
To add the volumes that need to be replicated, select the newly created protection group. Click on the menu icon under Members, and select Add Volumes. In the Add Members window, select the volumes that need to be replicated to the CBS array.
In this example, though we only need the SQL_DB and SQL_Logs volumes to restore the sample SQL database in AWS, we replicate all the SQL Server volumes to CBS as a best practice. These volumes will be required in the event that we need to restore this database on a FlashArray.
Adding the CBS array as the Replication Target
Next, under Targets for the newly created protection group, click on the menu, and select Add. In the Add Targets window, select the name of the CBS array to which the selected volumes need to be replicated, and click Add:
Configuring the Replication Schedule
Next, click on the icon under Replication Schedule, and in the Edit Replication Schedule window, enable the replication schedule; select the replication frequency, and the retention period; then click Save:
Under Protection Groups, select the newly created protection group, and review the information under Members, Targets, and Replication Schedule:
When the protection group has been created and configured, the FlashArray starts replicating data to the CBS array.
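As a rough way to reason about the frequency and retention values chosen in the replication schedule, the sketch below estimates how many replicated snapshots are held on the target at steady state. It is a simplification that assumes a single retention window and ignores any longer-term retention tier; the 4-hour frequency and 1-day retention are hypothetical example values, not recommendations.

```python
from datetime import timedelta


def snapshots_retained(frequency: timedelta, retention: timedelta) -> int:
    """Approximate count of replicated snapshots kept on the target.

    Assumes one snapshot is replicated every `frequency` and each one is
    kept for `retention` (a simplified, single-tier retention model).
    """
    if frequency <= timedelta(0) or retention <= timedelta(0):
        raise ValueError("frequency and retention must be positive")
    return retention // frequency


# Hypothetical schedule: replicate every 4 hours, retain for 1 day.
print(snapshots_retained(timedelta(hours=4), timedelta(days=1)))  # 6
```

A shorter frequency reduces the amount of data that could be lost between replicated snapshots, at the cost of more snapshots to transfer and retain.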
After data has been replicated to the CBS array, the replicated snapshots will show up in the CBS array under Volume Snapshots, as shown below:
This completes the process of replicating data from the on-prem FlashArray to the CBS array in AWS.
Launching a SQL Server EC2 Instance in AWS
In order to use the data replicated to CBS, we launch an EC2 instance in AWS, and select an AMI with Microsoft Windows and SQL Server pre-installed.
Go to the EC2 Dashboard in the AWS Management Console, and launch a new EC2 instance. In the Choose an AMI menu, select an AMI with Microsoft Windows and SQL Server, as shown below:
Make sure that this EC2 instance has network connectivity with the CBS array, so that an iSCSI connection can be established between the EC2 instance and CBS.
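One way to confirm this connectivity before configuring iSCSI is a simple TCP check from the EC2 instance. The sketch below assumes the CBS array listens on the standard iSCSI port (TCP 3260); the IP address in the usage comment is a placeholder for one of the iSCSI addresses of your CBS array.

```python
import socket


def tcp_port_open(host: str, port: int = 3260, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example (placeholder address):
#   if not tcp_port_open("10.0.1.50"):
#       print("CBS iSCSI port is not reachable; check routes and security groups")
```

A failed check usually points to security group rules, route tables, or network ACLs between the EC2 instance and the CBS array rather than to the array itself.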
Connecting the SQL Server EC2 Instance to CBS
Follow the steps below to connect the SQL Server EC2 instance to the replicated volumes on the CBS array:
Creating Volumes from the Replicated Snapshots on CBS
In order to use the replicated data on the CBS array, we first need to create volumes from the replicated snapshots. In this example, we only need to create volumes from two of the replicated snapshots: the database snapshot and the logs snapshot.
Log into the CBS array, and go to Storage > Volumes. Under Volume Snapshots, identify the replicated snapshot that contains the database, click on the menu to the right of the snapshot, and select Copy from the menu.
Hint: You can use the size of the data snapshot to determine which snapshot contains the boot volume, which snapshot contains the database volume, and which snapshot contains the logs volume.
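The size-based identification described in the hint can be sketched as a small lookup. This is only an illustration: the snapshot names and sizes below are hypothetical, and the approach works only when each volume has a distinct size, so the function raises an error instead of guessing when a size is ambiguous.

```python
from typing import Dict, Mapping


def match_snapshots_by_size(
    snapshot_sizes: Mapping[str, int],
    expected_sizes: Mapping[str, int],
) -> Dict[str, str]:
    """Map each role (e.g. "db") to the one snapshot whose size matches.

    Sizes are provisioned sizes in any consistent unit (GiB in the example).
    Snapshots that match no expected size are ignored.
    """
    result: Dict[str, str] = {}
    for role, size in expected_sizes.items():
        candidates = [name for name, s in snapshot_sizes.items() if s == size]
        if len(candidates) != 1:
            raise ValueError(
                f"{role}: expected exactly one snapshot of size {size}, "
                f"found {len(candidates)}"
            )
        result[role] = candidates[0]
    return result


# Hypothetical snapshot names and sizes, as read from Storage > Volumes.
snapshots = {"snap-a": 100, "snap-b": 200, "snap-c": 50}
roles = match_snapshots_by_size(snapshots, {"boot": 100, "db": 200, "logs": 50})
print(roles["db"], roles["logs"])  # snap-b snap-c
```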
In the Copy Snapshot window below, select a name for the volume that'll be created from the snapshot. In this example, we use the default container "/", and call the database volume CBSDatabase-SQL-DB:
Similarly, copy the replicated logs snapshot to a volume. In this example, we select "/" as the container, and name the logs volume CBSDatabase-SQL-Logs.
Verify that the volumes CBSDatabase-SQL-DB and CBSDatabase-SQL-Logs have been created.
Creating & Configuring a Host on CBS
In the CBS array GUI, go to Storage > Hosts, and under Hosts, either click on "+" or select Create under the menu icon. In the Create Host window below, enter a name for the host, and click on Create:
Next, go to the newly created host, click on the menu under Connected Volumes, and select Connect. The Connect Volumes to Host window will appear. In this window, select the SQL volumes, and click on Connect:
Next, log into the SQL Server EC2 instance, and open the iSCSI Initiator Properties box. Go to the Configuration tab, and copy the Initiator Name:
Go back to the CBS array, and under Hosts, select the newly created host. Under Host Ports, click on the menu, and select Configure IQNs:
In the Configure iSCSI IQNs window, paste the Initiator Name copied from the iSCSI Initiator Properties window in the SQL Server EC2 instance, and click Add:
Select the newly created host under Hosts to review the information:
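Because the Initiator Name is copied and pasted by hand, a quick format check can catch truncated or padded values before they are added to the host. The sketch below checks a string against the general iSCSI Qualified Name layout (`iqn.`, a year-month, a reversed domain name, and an optional `:identifier`). It is a syntax check only, and the host name in the usage comment is hypothetical.

```python
import re

_IQN_PATTERN = re.compile(
    r"^iqn\.\d{4}-(0[1-9]|1[0-2])\."      # "iqn." plus the year-month of the naming authority
    r"[a-z0-9](?:[a-z0-9.-]*[a-z0-9])?"   # reversed domain name, e.g. com.microsoft
    r"(?::\S+)?$"                         # optional ":" followed by a unique identifier
)


def is_valid_iqn(name: str) -> bool:
    """Return True if name looks like a well-formed IQN.

    Leading and trailing whitespace (common after copy and paste) is ignored.
    """
    return _IQN_PATTERN.match(name.strip()) is not None


# Example (hypothetical host name):
#   is_valid_iqn("iqn.1991-05.com.microsoft:ec2-sql-host")
```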
Creating an iSCSI Connection using the iSCSI Initiator
In this example, we show a simple way to connect the SQL Server EC2 instance to the CBS array. For information on how to use PowerShell for configuration, how to set up MPIO, or how to connect a Linux host to the CBS array, please see the CBS deployment guide.
In the SQL Server EC2 instance, go to the iSCSI Initiator Properties box, and select the Targets tab. Under Target, enter one of the iSCSI IP addresses of the CBS array, and click on Quick Connect:
If the connection is successful, the CBS array will appear in the list of Discovered targets:
Managing the Connected Disks in Windows
In the SQL Server EC2 instance, go to Computer Management, and under Actions, click on Rescan Disks:
Click on Disk Management under Storage. The volumes on the CBS array should appear as offline disks:
Right-click on each disk, and select Online. The volumes should come online:
Go to This PC and confirm that the volumes are connected, and that the SQL data is present in the volumes:
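For repeatable setups, the GUI steps for connecting the target and bringing the disks online could be scripted. The sketch below is an unofficial illustration, not the procedure from the CBS deployment guide: it builds a sequence of standard Windows PowerShell cmdlets for a single iSCSI address without MPIO, and runs them from Python. The IP address in the usage comment is a placeholder, and the commands must be run with administrator rights on the EC2 instance.

```python
import ipaddress
import subprocess
from typing import List


def build_iscsi_commands(target_ip: str) -> List[str]:
    """Return PowerShell commands that mirror the GUI steps for one iSCSI address."""
    # Validating the address also prevents arbitrary text from reaching the shell.
    ip = ipaddress.ip_address(target_ip)
    return [
        "Start-Service -Name MSiSCSI",
        f"New-IscsiTargetPortal -TargetPortalAddress {ip}",
        "Get-IscsiTarget | Connect-IscsiTarget -IsPersistent $true",
        "Update-HostStorageCache",  # equivalent of Rescan Disks
        "Get-Disk | Where-Object IsOffline -Eq $true | Set-Disk -IsOffline $false",
    ]


def run_powershell(commands: List[str]) -> None:
    """Run each command in Windows PowerShell, stopping at the first failure."""
    for command in commands:
        subprocess.run(
            ["powershell", "-NoProfile", "-Command", command], check=True
        )


# Example (placeholder address; run on the SQL Server EC2 instance):
#   run_powershell(build_iscsi_commands("10.0.1.50"))
```

The last command brings every offline disk online, so on a host with other offline disks it should be narrowed to the CBS volumes.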
Connecting SQL Server Management Studio to the Replicated Database
Launch SQL Server Management Studio in the SQL Server EC2 instance. Select the default Server name, and click on Connect:
In Object Explorer, right-click on Databases, and select Attach:
Click Add on the next screen:
Expand the drive letter that contains the database files (drive F: in this case), and select the DATA folder. You should see the CBSDatabase.mdf file that was replicated from the on-prem SQL Server. Select the CBSDatabase.mdf file, and click OK:
The following screen will appear, which shows both the database file and the log file as located on the F: drive. Since the logs for this database are located on a separate volume, which is attached as drive G:, the location of the log file needs to be changed. Under "CBSDatabase" database details, click on the menu icon to the right of Log in the File Type column:
Expand the G: drive, and select the Data folder. You should see the CBSDatabase_log.ldf file in the box on the right. Select this file, and click OK:
After selecting the database file and the log file, click OK to attach the CBSDatabase database to the SQL Server instance:
Next, expand the Databases folder, and expand CBSDatabase:
Expand Tables, and right-click on dbo.Contacts:
Select Edit Top 200 Rows, or Select Top 1000 Rows:
You should see the rows that were created on the SQL Server in the on-prem data center:
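The attach and verification steps above can also be expressed in T-SQL. The sketch below only builds the statements, which could then be run in a query window or through a database driver of your choice. The file paths are placeholders, since the full folder paths on the F: and G: drives depend on the original SQL Server installation.

```python
def quote_literal(value: str) -> str:
    """Quote a string as a T-SQL Unicode literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def quote_identifier(name: str) -> str:
    """Bracket-quote a T-SQL identifier, doubling closing brackets."""
    return "[" + name.replace("]", "]]") + "]"


def build_attach_statement(database: str, data_file: str, log_file: str) -> str:
    """Build CREATE DATABASE ... FOR ATTACH for one data file and one log file."""
    return (
        f"CREATE DATABASE {quote_identifier(database)}\n"
        f"ON (FILENAME = {quote_literal(data_file)}),\n"
        f"   (FILENAME = {quote_literal(log_file)})\n"
        "FOR ATTACH;"
    )


def build_verify_query(database: str, schema: str, table: str, limit: int = 1000) -> str:
    """Build a SELECT TOP query to confirm that the replicated rows are present."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    target = ".".join(quote_identifier(part) for part in (database, schema, table))
    return f"SELECT TOP ({limit}) * FROM {target};"


# Placeholder paths; adjust to the actual folders on the attached volumes.
print(build_attach_statement(
    "CBSDatabase",
    r"F:\DATA\CBSDatabase.mdf",
    r"G:\Data\CBSDatabase_log.ldf",
))
print(build_verify_query("CBSDatabase", "dbo", "Contacts"))
```

Quoting the identifiers and literals in helper functions keeps the generated statements valid even if a path or name contains a quote or bracket.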