Pure Technical Services

Replacing a Boot Drive


This KB is a supplement to the Boot Drive Replacement Guides. At certain points those guides direct you to call Support for an action to be taken; this KB explains how to perform those steps and lists any caveats to the procedure. Check this KB each time you plan to perform the procedure, because it is updated as new caveats or steps are identified.

IMPORTANT: Prior to the Swap

  1. Perform a Health Check:  If there are any issues or open alerts, resolve them with support before proceeding.

  2. Make a note of the tunables set on the array; they will need to be set again after the swap is complete:
    From the array:

    puretune --list

    From logs on fuse:

    pureadm list-tunable
  3. Check whether the current Purity installation files are still in the /var/cache/purity directory. If the files are not there, ask Pure Storage to stage the proper upgrade files before the boot drive swap. The files are needed if you have to upgrade Purity on the newly replaced boot drive.
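
As a rough aid for item 3, the check could be scripted along the following lines. This is a minimal sketch, not a Purity tool: check_staged_files is a hypothetical helper, and it assumes the package files follow the purity_<version>_<build>.ppkg naming seen later in this KB, each with a matching .sha1 file.

```shell
# Hypothetical helper (not a Purity command): succeed only if the directory
# holds at least one .ppkg for the given version with a matching .sha1 file.
# Assumes the naming pattern purity_<version>_<build>.ppkg.
check_staged_files() {
    local dir="$1" version="$2" pkg found=1
    for pkg in "$dir"/purity_"$version"_*.ppkg; do
        [ -f "$pkg" ] || continue              # the glob matched nothing
        if [ -f "$pkg.sha1" ]; then
            echo "staged: $pkg"
            found=0
        else
            echo "missing checksum file: $pkg.sha1" >&2
        fi
    done
    return "$found"
}
```

For example, check_staged_files /var/cache/purity 4.5.3 would return non-zero if no complete .ppkg/.sha1 pair is present for that version, which is the cue to ask Pure Storage to stage the files.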
     
Alternative Boot Drive only replacement process for FlashArray //M or //X models

Boot drive p/n 80-1802 will have Purity 6.1.19 loaded onto it.
Please follow the procedure for matching Purity levels on boot drives.

If stipulated in the case, the following process can be followed where controller replacement is not required and only boot drive parts have been dispatched. The boot drive on the FlashArray is a field-replaceable unit, and the following steps detail how to proceed.

BOOT DRIVE Replacement procedure

This section describes the replacement of a FlashArray Boot Drive. To install a Boot Drive in a controller, you must remove the controller from the FlashArray chassis and then remove the appropriate riser to access the Boot Drive.

FE/IE note:

When replacing any spare, if an obstruction is encountered, do not proceed until you have contacted a Pure Support Representative. Trying to clear the obstruction may have adverse effects, including on both Pure and other customer products. Under no circumstances should a primary component be disconnected to replace the secondary (defective) component.

Equipment required:

  • Ensure proper ESD protection (e.g., wrist strap) is used while working on Pure Storage equipment that is not powered on.
  • Incidental tools (e.g., Phillips screwdriver)

FLASHARRAY POWER DURING REPLACEMENT

You can replace a Boot Drive while an array is operating because all FlashArray data connections between controllers and shelves are redundant; however, Pure Storage recommends that you perform hot replacements during less-critical times when I/O activity to the array is low.

Before the Boot Drive can be replaced, the controller must be de-cabled and removed. The controller is reconnected after the boot drive has been replaced.

 

Step by Step
Step 1: Removing the Controller.

Follow the instructions for controller removal in the FlashArray//M Service Guide or the FlashArray//XR2 and //XR3 Combined Service Guide.

Step 2: Place the controller on a clean, flat surface.
Step 3: Press the controller cover pads, slide the cover toward the retaining bar about 1/2 inch, and then lift the cover off. See Figure 1 below:

Figure 1: Removing the Controller Cover

 

Step 4: Remove Riser 0
Step 5. Grip Riser 0 by the two finger holes as shown in Figure 2 and pull it straight up to remove it from the chassis.

Figure 2: Remove Riser 0

 

Step 6. Locate the Boot Drive. See Figure 3.

Figure 3: Locate Boot Drive

Step 7. Using a Phillips screwdriver, carefully remove the holding screw. See Figure 4.

Figure 4: Locate holding Screw

Step 8. Slightly lift the loose end of the boot drive and extract it from the socket.
Step 9. Install the replacement boot drive and secure with the holding screw.
Step 10. Re-install Riser 0. See Figure 5

Figure 5: Re-install Riser 0


Step 11. Re-install the cover. See Figure 6

Figure 6: Re-install Cover

 
Step 12: Reinstall the Controller

Follow the instructions for controller installation in the FlashArray//M Service Guide or the FlashArray//XR2 and //XR3 Combined Service Guide.

This completes the Boot Drive installation procedure.

Configure Purity on New Boot Drive

Purity will not start on boot for replacement boot drives. This is to prevent a version mismatch while Purity is running.

Step 1: Verify the Purity version on the good controller (Example: CT0)

Check the Purity version of the good controller and confirm that the secondary is still not present:

root@slc-420-ct0:/home/os76# purearray list --controller
Name  Mode         Model   Version  Status
CT0   primary      FA-4XX  4.5.3    ready
CT1   not present  -       -        unknown

Step 2: From the existing controller (Example: CT0), check whether a connection to the new peer (bond0 or haeth0) is available:

os76@slc-420-ct0:~$ ip neighbor |grep bond0
fe80::202:c903:a2:d261 dev bond0 lladdr 80:00:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:a2:d2:61 DELAY
ip neighbor |grep haeth0
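
To avoid copying the peer address by hand, the neighbor output can be filtered. The following is a minimal sketch under stated assumptions: peer_link_local is a hypothetical helper (not a Purity command), it reads `ip neighbor` output on standard input, and it assumes the peer is the first fe80:: neighbor on the named interface whose state is not FAILED.

```shell
# Hypothetical helper: print the first IPv6 link-local neighbor found on the
# given interface. Reads `ip neighbor` output on stdin. Entries whose state
# (the last field) is FAILED are skipped.
peer_link_local() {
    awk -v ifc="$1" '
        $1 ~ /^fe80:/ && $NF != "FAILED" {
            for (i = 2; i < NF; i++)
                if ($i == "dev" && $(i + 1) == ifc) { print $1; exit }
        }
    '
}
```

It could be used as ip neighbor | peer_link_local bond0, falling back to haeth0 if nothing is printed. An empty result means no usable peer entry was seen on that interface.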

Step 3: Connect to peer (Example CT1) via bond0:

os76@slc-420-ct0:~$ ssh os76@fe80::202:c903:a2:d261%bond0
The authenticity of host 'fe80::202:c903:a2:d261%bond0 (fe80::202:c903:a2:d261%bond0)' can't be established.
ECDSA key fingerprint is 79:34:86:07:fd:19:96:dc:1f:e6:ad:04:88:6c:0e:ed.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'fe80::202:c903:a2:d261%bond0' (ECDSA) to the list of known hosts.
os76@fe80::202:c903:a2:d261%bond0's password:

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

Fri Oct 02 09:26:50 2015

Welcome os76. This is Purity Version 4.5.2 on FlashArray pure
http://www.purestorage.com/

Step 4: Check the version of Purity on the replaced controller (Example: CT1)

os76@pure-FAKOEm9N:~$ pureversion
Product Version: 4.5.2

Depending on the version: 

  • If the Purity version is the same, no further action is required; proceed to starting Purity.
  • If the Purity version on the replaced boot drive (Example: CT1) is not the same, you will need to install Purity before proceeding.
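
The comparison in Step 4 can be sketched as two small parsers. Both function names are hypothetical, and the parsing assumes the output formats shown in this KB: a "Product Version:" line from pureversion, and a dotted numeric version somewhere on the controller's row of purearray list --controller.

```shell
# Hypothetical helpers that read command output on stdin.

# Print the version from `pureversion` output ("Product Version: 4.5.2").
product_version() {
    awk -F': *' '/^Product Version:/ { print $2; exit }'
}

# Print the version shown for one controller (e.g. CT0) in
# `purearray list --controller` output. The first dotted numeric field on
# the row is taken, because Mode and Status can each span two words.
controller_version() {
    awk -v ct="$1" '
        $1 == ct {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^[0-9]+(\.[0-9]+)+$/) { print $i; exit }
        }
    '
}
```

Run purearray list --controller | controller_version CT0 on the good controller and pureversion | product_version on the replaced one. If the two strings differ, continue with Step 5.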

Step 5: Move files from good controller (Example CT0) to Replaced Controller (Example CT1) 

The files you need may already be on the good controller. If they are not, you will need to SCP them to the good controller via a Remote Assist session. Contact Support for the necessary files.

For this process you do not want to use the upgrade script, so make sure you have the .ppkg AND .sha1 files. 

Once you have the upgrade files needed, you can scp the purity files matching the existing controller (in this example, 4.5.3) to the peer via bond0:

root@slc-420-ct0:/home/os76# scp purity_4.5.3_201508120114+f96731f.ppkg* os76@\[fe80::202:c903:a2:d261%bond0\]:/home/os76/
os76@fe80::202:c903:a2:d261%bond0's password:
purity_4.5.3_201508120114+f96731f.ppkg                                                                                                                              100% 1212MB 110.2MB/s   00:11
purity_4.5.3_201508120114+f96731f.ppkg.sha1   
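
After the copy, the package can be checked against its checksum file before installing. This is a minimal sketch with one loud assumption: that the first whitespace-separated field of the .sha1 file is the hex SHA-1 digest of the .ppkg file. The format of that file is not documented in this KB, so treat a mismatch as a reason to inspect the file rather than as proof of corruption. verify_ppkg_sha1 is a hypothetical helper name.

```shell
# Hypothetical helper: compare the SHA-1 of a .ppkg with the digest stored in
# <pkg>.sha1. ASSUMPTION: the digest is the first field on the first line of
# the .sha1 file. Returns 0 on match, 1 on mismatch, 2 if a file is unreadable.
verify_ppkg_sha1() {
    local pkg="$1" expected actual
    expected=$(awk 'NR == 1 { print tolower($1) }' "$pkg.sha1") || return 2
    actual=$(sha1sum "$pkg" | awk '{ print $1 }') || return 2
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

On the replaced controller this would be run as verify_ppkg_sha1 /home/os76/purity_4.5.3_201508120114+f96731f.ppkg, using the file name from the example above.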

Step 6: As root on the replacement controller (CT1), install Purity so that the versions match:

root@pure-FAKOEm9N:/home/os76# pureinstall purity_4.5.3_201508120114+f96731f.ppkg
Verifying package...
Installing Purity on alternate partition labeled second.
Erasing Purity software image from alternate partition second to prepare for installation.
WARNING: Do not interrupt this process!!
Unpacking new Purity software.
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Finalizing installation. This may take several minutes.
Purity installed.
Installation complete. The new Purity version will load at next reboot.

Important!
The first boot of a new Purity version may take longer if the new version includes controller firmware updates.
DO NOT REBOOT THE CONTROLLER DURING THE FIRMWARE UPDATE.

Refer to http://community.purestorage.com for more information about the Purity upgrade process and firmware updates.

NOTE: Check the timezone on both controllers using cat /etc/timezone. If both controllers have the same timezone, you do not need to set it; skip ahead to Step 8.

Step 7: Set timezone if needed:

root@pure-FAKOEm9N:~# puresetup timezone
##########################################
#   Welcome to the Purity Setup Wizard   #
##########################################
[Errno 111] Connection refused
Error: Unable to communicate due to exception. Please try again.
Changing the time zone will immediately stop Purity and require a reboot on this controller.
Current time zone: America/Los_Angeles
Change time zone [requires reboot] (y/n): y
lio-drv disabling
wait-for-state foed stop/waiting
wait-for-state gui stop/waiting
wait-for-state lio-drv stop/waiting
Pure Storage is offline.

Current default time zone: 'America/Denver'
Local time is now:      Fri Oct  2 10:52:53 MDT 2015.
Universal Time is now:  Fri Oct  2 16:52:53 UTC 2015.

Confirm time zone change from America/Los_Angeles to America/Denver
(y/N): y
Tunable parameter set: PURITY_START_ON_BOOT=1
Press ENTER to reboot

This controller will be online after reboot.

Broadcast message from pureeng@pure-FAKOEm9N
	(/dev/pts/3) at 10:53 ...

The system is going down for reboot NOW!

Broadcast message from pureeng@pure-FAKOEm9N
	(/dev/pts/3) at 10:53 ...

The system is going down for reboot NOW!

NOTE: Since puresetup timezone reboots the controller, skip ahead to step 9.

Step 8: If timezone did not need to be changed, reboot:

root@pure-FAKOEm9N:/home/os76# pureboot reboot --offline

Broadcast message from pureeng@pure-FAKOEm9N
	(/dev/pts/3) at 9:47 ...

The system is going down for reboot NOW!

Broadcast message from pureeng@pure-FAKOEm9N
	(/dev/pts/3) at 9:47 ...

The system is going down for reboot NOW!

Step 9: Watch for controllers to be online.

Here, CT1 is visible but not online yet:

root@slc-420-ct0:/home/os76# purearray list --controller
Name  Mode       Model   Version  Status
CT0   primary    FA-4XX  4.5.3    ready
CT1   secondary  FA-4XX  4.5.3    not ready

After waiting a couple more minutes, see both online:

root@slc-420-ct0:/home/os76# purearray list --controller
Name  Mode       Model   Version  Status
CT0   primary    FA-4XX  4.5.3    ready
CT1   secondary  FA-4XX  4.5.3    ready
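
Rather than re-running the command by hand, the wait can be scripted. The following is a minimal sketch: all_controllers_ready is a hypothetical helper that reads purearray list --controller output on standard input and assumes the row layout shown above, where the Status column is the last thing on each CTn row.

```shell
# Hypothetical helper: succeed only if every CTn row in
# `purearray list --controller` output (read on stdin) ends in "ready"
# and not "not ready". Fails if no controller rows are seen at all.
all_controllers_ready() {
    awk '
        $1 ~ /^CT[0-9]+$/ {
            seen++
            if ($NF != "ready" || $(NF - 1) == "not") bad++
        }
        END { exit (seen > 0 && bad == 0) ? 0 : 1 }
    '
}
```

A polling loop built on it might look like: until purearray list --controller | all_controllers_ready; do sleep 30; done. The 30-second interval is arbitrary.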

Step 10: Set the tunables on the new controller to match the previous configuration.
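
If the tunables were saved to a file before the swap (for example by redirecting puretune --list to a file), the entries still missing afterwards can be listed with a plain line comparison. This is a minimal sketch: tunables_missing is a hypothetical helper, and it assumes the listing prints each tunable on one line in the same form before and after the swap, which this KB does not confirm.

```shell
# Hypothetical helper: print lines that appear in the pre-swap capture but
# not, as an exact whole line, in the post-swap capture.
#   $1 = file saved before the swap, $2 = file saved after the swap
tunables_missing() {
    grep -Fvxf "$2" "$1"
}
```

No output means every line from the pre-swap capture is present again. Any line printed is a tunable that still needs to be set.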

Step 11: Test ssh peer

Make sure that you can "ssh peer" to the controller with the new boot drive. If you have any problems doing this, please see KB: Unable to SSH to Peer after Controller Replacement

Swapping Both Boot Drives

In some rare cases both boot drives may need to be swapped. If so, keep the following in mind:

  • Replace the boot drive on the secondary first. This will need to be coordinated with the field technician if they are performing the swap.
  • Ensure that the replaced boot drive is healthy and running the proper Purity version, and that the GUI has been synced, before proceeding to replace the boot drive on the primary.
  • Once this has been confirmed, force a failover on the primary and make sure that the failover completes without issue.
  • Identify the new secondary controller and repeat the boot drive replacement procedure above.

Troubleshooting

If after swapping the boot drive and starting Purity, foed gets stuck at:

root@PURESTORAGE:~# pureadm start
purity start/running
platform: .done
foed: done
gui: ..........done
rest: done
platform_env: 0.done
foed_env: 27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.27.^C
...and the rdmaoopsd logs contain something similar to:
Jan 20 19:40:36 rdmaoopsd[MSG]: RDMA CM event 3/RDMA_CM_EVENT_ROUTE_ERROR (id 0x1c7f920/context 0x1c7cb40)
Jan 20 19:40:36 rdmaoopsd[ERR]: Route resolution error for remote fe80::f652:1403:87:9121%8
Jan 20 19:40:37 rdmaoopsd[MSG]: RDMA CM event 1/RDMA_CM_EVENT_ADDR_ERROR (id 0x1c7f920/context 0x1c7cb40)

Check the IB links to make sure they are showing full speed:

root@ct1:/var/log/purity# purehw list | awk '$1 ~ /IB/'
CT0.IB0 ok - 4 0 56.00 Gb/s -
CT0.IB1 ok - 4 1 56.00 Gb/s -
CT1.IB0 ok - 4 0 56.00 Gb/s -
CT1.IB1 ok - 4 1 56.00 Gb/s -
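
The same check can be made pass/fail instead of read by eye. The following is a minimal sketch: ib_links_below is a hypothetical helper that reads purehw list output on standard input, assumes the row layout shown above (status in the second column, speed immediately before a "Gb/s" field), and takes the expected speed as an argument because full speed may differ between hardware models.

```shell
# Hypothetical helper: print the name of every IB port whose status is not
# "ok" or whose speed is below the expected value in Gb/s ($1).
# Reads `purehw list` output on stdin. Exits non-zero if any port is printed.
ib_links_below() {
    awk -v want="$1" '
        $1 ~ /\.IB[0-9]+$/ {
            speed = ""
            for (i = 2; i < NF; i++)
                if ($(i + 1) == "Gb/s") speed = $i
            if ($2 != "ok" || speed == "" || speed + 0 < want + 0) {
                print $1
                bad = 1
            }
        }
        END { exit bad ? 1 : 0 }
    '
}
```

With the output above, purehw list | ib_links_below 56 would print nothing and succeed. Any port name it prints is worth investigating before restarting rdmaoopsd.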

If those are fine, you should be able to resolve it by restarting rdmaoopsd on the controller with the error:

root@PURESTORAGE:/var/log/purity# service rdmaoopsd restart
rdmaoopsd stop/waiting
rdmaoopsd start/running, process 46126
Check the status of Purity and you should find it completing:
root@PURESTORAGE:/var/log/purity# pureadm wait
platform: done
foed: done
gui: done
rest: done
platform_env: 0.done
foed_env: 2.2.2.2.2.2.2.2.2.2.2.2.2.2.2.2.0.done
remote_patch: done
driver: done
san: ......done
health: done
Broadcast Message from ct0
(somewhere) at 11:56 ...
Purity Information System Status
================================
Purity has successfully started for
the first time after an install
or upgrade. Purity 3.4.3 (201405140754+d5af1e5-r6)
is now set to be the default.