Troubleshooting: Storage Providers are offline after a Purity upgrade to 5.1.6+
If the FlashArray is running Purity 5.3.0 or higher, do not manually reset certificates using the method this KB outlines for Purity 5.1 and 5.2. On Purity 5.3.0 and higher, all management of the VASA certificates must be done with purecert via the CLI.
Overview
Starting with Purity 5.1.6, the storage location of the VASA certificates, keystores, and truststores changed. Prior to 5.1.6, the VASA keystore and truststore were located at /cache/ui/keystore and /cache/ui/truststore. From Purity 5.1.6 onward, the truststore and keystore are both stored with the VASA certificate in /cache/ssl/. The reason for this change is documented in PURE-129687.
Customers upgrading to 5.1.6 or higher are not expected to see any impact on their existing VVol configuration or on its setup with the FlashArray's VASA Provider. That said, there have been instances where, after such an upgrade, the storage provider reports as offline in vCenter. This in turn causes the VVol Datastore to report as offline/disconnected. While the VVol VMs will not be disrupted, any operations (power on, vSphere vMotion, clone, managed snapshot, etc.) will fail.
While we have some causes figured out, there may be others that we have not root-caused yet, so we want to gather all of the information we can to have the best chance of root-causing every possible pathology. Have a JIRA opened and ready before you start gathering the data, document what you find in it, and then help the customer get up and running.
Customer Facing Verbiage
With Purity 5.1.6 and higher, Pure Storage updated the security infrastructure for the VASA Provider. This was done to ensure that the VASA Provider uses the same security infrastructure as the array. As part of this update, the previously stored keystore and truststore are used to generate the updated VASA Certificate.
While this update is not expected to impact vCenter Storage Providers that are registered with the FlashArray, Pure Storage has identified some instances where after the Purity upgrade is complete the Storage Providers report as offline in vCenter.
One possible issue arises when the vCenter CA Certificates utilize Intermediate Certificates. With Purity 5.1.6 and 5.1.7, by default the FlashArray will not accept a CA Certificate that is an Intermediate Certificate.
Pure Storage is monitoring and working with customers that use VVols and are upgrading to Purity 5.1.6+. After the FlashArray upgrade is complete, Support will ask the customer to check the Storage Provider status in vCenter. Should the Storage Providers be offline, in a sync error, or in any state other than active and online, Support will work with the customer to get the Storage Providers back online. An offline Storage Provider does not cause an outage or a failure of IO for any VMs using VVols. Rather, while the Storage Provider is offline, VVol-related operations such as taking managed snapshots, powering on new VVol VMs, or a vSphere vMotion will no longer complete.
Workflow and Action Plan
Alright, a customer has just upgraded their array to 5.1.6+ and has now reported that their VVol Datastore is offline/disconnected and/or their Storage Providers appear offline. Here is the workflow we'll want to follow, the information we'll want to gather, and the action plan to get the customer up and running again.
Information Gathering
This is the most important part for us to diagnose the issue and find out if there are new pathologies that need to be fixed. Here is the information needed right away.
- Find out when the upgrade happened and when both controllers were rebooted.
Run something like this on fuse to find out when the reboots happened:
zgrep -h "Running.*pureboot reboot --" /logs/egeli.ch/purerz1-ct[01]/2019_01_03/install.log*
Jan 03 2019 12:23:50.788 PureRZ1-ct0 0x47be7994: Running command: /opt/Purity/bin/pureboot reboot --secondary
Jan 03 2019 12:10:04.743 PureRZ1-ct1 0x47be7994: Running command: /opt/Purity/bin/pureboot reboot --secondary
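To copy the reboot times into the JIRA in a compact form, the matching install.log lines can be reduced to one "controller, date, time" row each. This is a minimal sketch that assumes the log lines keep the field layout shown in the example above (month, day, year, time, controller name); the function name reboot_times is made up for illustration.

```shell
# Read install.log lines on stdin and print "<controller> <month> <day> <year> <time>"
# for every line that records a pureboot reboot. Assumes the field order seen in
# the sample output above; other lines are ignored.
reboot_times() {
    awk '/Running command: .*pureboot reboot --/ {
        # $1=month $2=day $3=year $4=time $5=controller
        printf "%s %s %s %s %s\n", $5, $1, $2, $3, $4
    }'
}
```

It could be fed from the same zgrep shown above, for example by piping that command's output into reboot_times.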
- Have the customer open an RA and gather the following information from both Controllers
openssl x509 -in /cache/ssl/vasa.crt -text -noout
root@sn1-m20-c12-25-ct0:~# openssl x509 -in /cache/ssl/vasa.crt -text -noout Certificate: Data: Version: 3 (0x2) Serial Number: 18168171810716888782 (0xfc225000b40baece) Signature Algorithm: sha256WithRSAEncryption Issuer: CN=CA, DC=sso, DC=alex.purestorage.com, C=US, ST=California, O=prod-vcsa.alex.purestorage.com, OU=VMware Engineering Validity Not Before: Dec 26 23:41:46 2018 GMT Not After : Dec 27 23:41:46 2019 GMT Subject: C=US, O=Pure Storage, OU=Pure Storage, CN=10.21.88.113 Subject Public Key Info: Public Key Algorithm: rsaEncryption Public-Key: (2048 bit) Modulus: 00:81:67:d2:06:13:d1:df:c0:19:92:92:c5:b2:4c: c6:75:c4:f6:a6:32:3a:8e:9a:da:4e:0b:ee:c4:e6: 15:f2:26:04:6c:14:75:3f:1b:05:a7:99:15:d3:fb: ef:f0:57:4f:eb:74:5b:65:20:3d:66:08:68:22:97: 28:3b:51:7e:75:b7:f8:25:72:be:79:8c:62:ba:3e: ef:76:63:e4:10:1e:7b:8d:4d:6f:6a:29:f7:78:c3: 0c:a3:32:68:49:01:f7:4c:33:a2:7f:53:17:b3:33: 1a:9a:7c:fb:dd:56:4b:09:98:fe:46:3f:97:7a:43: 92:1d:d0:26:a6:ba:40:e9:cf:d4:50:fd:d2:ea:17: 71:1c:04:dd:68:f3:bd:9a:ab:cb:00:93:34:78:7b: 47:dd:31:d4:dc:57:c5:a7:fb:b3:04:ae:04:90:6b: c8:14:fb:5f:12:c0:85:11:bb:1a:a5:e1:5d:85:2f: 8a:45:7f:8e:8c:35:cc:22:3c:15:0a:ec:31:30:9d: ca:fb:7d:5a:ae:13:47:ee:ef:7a:a9:70:d0:70:34: b3:ef:7d:99:0b:02:25:b1:96:de:fe:b9:0d:53:0c: aa:63:ab:5d:dc:ee:11:53:71:85:75:7d:19:21:34: 5a:a0:7e:d8:6c:16:e4:b9:a3:c8:fe:44:1b:ae:83: 3f:21 Exponent: 65537 (0x10001) X509v3 extensions: X509v3 Subject Alternative Name: IP Address:10.21.88.113 X509v3 Authority Key Identifier: keyid:0E:44:D2:A3:D9:30:52:4C:A1:A9:79:37:7C:81:42:F3:B0:23:FD:B2 Authority Information Access: CA Issuers - URI:https://prod-vcsa.alex.purestorage.com/afd/vecs/ca Signature Algorithm: sha256WithRSAEncryption 7c:b4:d0:f0:91:3a:1a:2f:f2:da:52:62:1d:44:fb:d2:b2:f4: cb:7a:4d:f0:c4:9d:00:7c:c7:a8:0f:28:aa:88:ba:39:46:a2: 3b:f0:79:63:07:e0:be:bc:8f:ae:f9:11:75:9d:09:eb:13:64: e7:0a:01:a9:30:02:d6:c5:03:cd:6d:91:ec:32:c0:57:53:04: 67:84:46:ee:a9:bb:04:05:06:b7:cb:7d:ec:c2:5e:cc:f8:f8: 
03:6a:a0:ab:5a:c0:5f:e7:36:40:d7:6b:2f:7f:d1:09:3f:fe: 56:2e:35:f7:a2:44:c8:03:3d:4b:e6:82:ff:30:a6:42:03:a2: 3a:c7:14:02:f9:3a:79:d5:48:b9:71:4b:e7:0c:e1:4d:71:a0: 02:98:e1:3a:69:6c:1f:85:cd:74:2d:c8:8e:b4:3c:80:fb:97: 6d:82:45:42:59:d2:d7:16:89:5e:cc:65:50:28:17:72:52:76: b7:62:36:10:cc:cf:8b:21:e3:3d:77:f2:30:42:c0:e9:bb:0f: 7b:f4:f3:af:ee:0c:d4:49:75:cf:cd:c0:78:3c:ef:fe:49:4f: 72:26:eb:07:06:83:78:92:1a:fd:17:23:a5:78:6f:3e:04:22: 43:56:51:7e:88:7b:f7:bc:4f:fd:2b:da:5c:68:6e:c1:aa:9b: 57:eb:94:cd
keytool -list -v -keystore /cache/ui/keystore_backup
root@sn1-m20-c12-25-ct0:/cache/ui# keytool -list -v -keystore /cache/ui/keystore_backup Enter keystore password: ***************** WARNING WARNING WARNING ***************** * The integrity of the information stored in your keystore * * has NOT been verified! In order to verify its integrity, * * you must provide your keystore password. * ***************** WARNING WARNING WARNING ***************** Keystore type: jks Keystore provider: SUN Your keystore contains 1 entry Alias name: jetty Creation date: Nov 15, 2018 Entry type: PrivateKeyEntry Certificate chain length: 1 Certificate[1]: Owner: C=US, O=Pure Storage, OU=Pure Storage, CN=10.21.88.113 Issuer: C=US, O=Pure Storage, OU=Pure Storage, CN=10.21.88.113 Serial number: 167191e020e Valid from: Thu Nov 15 12:44:16 PST 2018 until: Fri Nov 15 12:44:16 PST 2019 Certificate fingerprints: MD5: C7:1A:D3:F4:C0:35:33:47:1C:2A:CB:11:C2:AB:1F:1A SHA1: 32:98:E7:C7:E8:F2:80:13:EF:F4:C5:B9:C1:A9:0F:EF:DF:35:4C:4C SHA256: C5:39:0F:5D:AF:E0:27:09:02:BD:E7:AE:AA:A0:46:D9:4B:CE:2E:AA:99:A5:DF:9F:7E:4D:D4:23:C6:90:30:2C Signature algorithm name: SHA256withRSA Subject Public Key Algorithm: 2048-bit RSA key Version: 3 Extensions: #1: ObjectId: 2.5.29.17 Criticality=false SubjectAlternativeName [ IPAddress: 10.21.88.113 ] ******************************************* ******************************************* Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /cache/ui/keystore_backup -destkeystore /cache/ui/keystore_backup -deststoretype pkcs12".
keytool -list -v -keystore /cache/ui/truststore_backup
root@slc-m50r2-ct0:~# keytool -list -v -keystore /cache/ui/truststore_backup Enter keystore password: ***************** WARNING WARNING WARNING ***************** * The integrity of the information stored in your keystore * * has NOT been verified! In order to verify its integrity, * * you must provide your keystore password. * ***************** WARNING WARNING WARNING ***************** Keystore type: JKS Keystore provider: SUN Your keystore contains 2 entries Alias name: vmca_1 Creation date: Jan 5, 2019 Entry type: trustedCertEntry Owner: OU=VMware Engineering, O=UCS-vCSA-PSC-02.slc.purestorage.com, ST=California, C=US, DC=slc.purestorage.com, DC=ucs-vcsa-sso, CN=CA Issuer: OU=VMware Engineering, O=UCS-vCSA-PSC-02.slc.purestorage.com, ST=California, C=US, DC=slc.purestorage.com, DC=ucs-vcsa-sso, CN=CA Serial number: e45815208f5d298c Valid from: Sun Apr 29 16:33:14 UTC 2018 until: Wed Apr 26 16:33:14 UTC 2028 Certificate fingerprints: MD5: B9:BC:82:BC:C2:92:1B:D5:BE:B7:8D:AA:DE:A0:85:C8 SHA1: B8:9B:B1:4B:B8:EC:A4:EF:35:FF:E1:A9:07:74:9D:C6:DA:47:79:82 SHA256: 68:7C:7E:EA:A2:1A:44:B5:EB:19:FA:58:FD:4D:5F:88:85:2B:64:D9:30:4B:2B:49:97:46:FF:D4:8B:4B:73:6B Signature algorithm name: SHA256withRSA Subject Public Key Algorithm: 2048-bit RSA key Version: 3 Extensions: #1: ObjectId: 2.5.29.19 Criticality=true BasicConstraints:[ CA:true PathLen:0 ] #2: ObjectId: 2.5.29.15 Criticality=true KeyUsage [ Key_CertSign Crl_Sign ] #3: ObjectId: 2.5.29.17 Criticality=false SubjectAlternativeName [ RFC822Name: email@acme.com IPAddress: 127.0.0.1 ] #4: ObjectId: 2.5.29.14 Criticality=false SubjectKeyIdentifier [ KeyIdentifier [ 0000: B6 E2 E1 2E DF 89 CC 22 2A 19 62 E8 6B 19 B2 EE ......."*.b.k... 0010: EC A8 FE 9C .... 
] ] ******************************************* ******************************************* Alias name: vmca_0 Creation date: Jan 5, 2019 Entry type: trustedCertEntry Owner: OU=VMware Engineering, O=ucs-vcsa-psc-01.slc.purestorage.com, ST=California, C=US, DC=slc.purestorage.com, DC=ucs-vcsa-sso, CN=CA Issuer: OU=VMware Engineering, O=ucs-vcsa-psc-01.slc.purestorage.com, ST=California, C=US, DC=slc.purestorage.com, DC=ucs-vcsa-sso, CN=CA Serial number: dbb2fb157ea40782 Valid from: Sun Apr 29 16:26:29 UTC 2018 until: Wed Apr 26 16:26:29 UTC 2028 Certificate fingerprints: MD5: 24:E9:4D:F0:E7:76:45:93:3D:19:FF:3F:B8:8A:AE:76 SHA1: FE:4D:56:07:39:20:12:86:7A:53:23:C3:93:40:27:39:B4:FD:F9:04 SHA256: 0E:37:B8:20:8A:BF:F7:D7:B7:3C:73:AE:1C:E2:22:BF:5F:DF:9C:DD:60:C9:55:AB:FE:28:6C:A5:CA:22:F1:3B Signature algorithm name: SHA256withRSA Subject Public Key Algorithm: 2048-bit RSA key Version: 3 Extensions: #1: ObjectId: 2.5.29.19 Criticality=true BasicConstraints:[ CA:true PathLen:0 ] #2: ObjectId: 2.5.29.15 Criticality=true KeyUsage [ Key_CertSign Crl_Sign ] #3: ObjectId: 2.5.29.17 Criticality=false SubjectAlternativeName [ RFC822Name: email@acme.com IPAddress: 127.0.0.1 ] #4: ObjectId: 2.5.29.14 Criticality=false SubjectKeyIdentifier [ KeyIdentifier [ 0000: 85 11 5C A0 CA AE DE 13 61 64 A2 02 C3 00 4E 11 ..\.....ad....N. 0010: 47 98 F8 8E G... ] ] ******************************************* *******************************************
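As a quick first-pass check of whether vasa.crt carries over the previous certificate information, the CN in the Subject line of the openssl output can be compared with the CN in the Owner line of the keystore_backup listing. The helper below is only a sketch: dn_cn is a hypothetical name, it assumes the outputs were saved to text files, and a matching CN is a hint rather than proof that the certificate was populated correctly.

```shell
# Print the CN from the first line labelled "$1:" on stdin, where $1 is
# "Subject" (openssl x509 -text output) or "Owner" (keytool -list -v output).
# Handles both "CN=value" and "CN = value" spellings.
dn_cn() {
    sed -n "s/^[[:space:]]*$1:.*CN[[:space:]]*=[[:space:]]*\([^,]*\).*/\1/p" | head -n 1
}

# Return 0 when the Subject CN in file $1 equals the Owner CN in file $2.
same_cn() {
    cert_cn=$(dn_cn Subject < "$1")
    store_cn=$(dn_cn Owner < "$2")
    [ -n "$cert_cn" ] && [ "$cert_cn" = "$store_cn" ]
}
```

For example, after saving the two outputs as vasa_crt.txt and keystore_backup.txt (file names chosen here for illustration), `same_cn vasa_crt.txt keystore_backup.txt && echo "CN matches"` reports whether the two agree.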
The reason we want this information is to confirm that /cache/ssl/vasa.crt is getting correctly populated with the previous certificate information.
- Are the Storage Providers reporting as offline in vCenter after a rescan or a certificate refresh?
Here the Status is Online. Check if it's Offline after a rescan and certificate refresh.
- From the ESXi hosts, gather the following information
openssl s_client -CAfile /etc/vmware/ssl/castore.pem -showcerts -connect FlashArray-CT0-IP:8084
[root@ESXi-2:~] openssl s_client -CAfile /etc/vmware/ssl/castore.pem -showcerts -connect 10.21.203.32:8084 CONNECTED(00000003) depth=1 CN = CA, DC = sso, DC = alex.purestorage.com, C = US, ST = California, O = dev-vcsa.alex.purestorage.com, OU = VMware Engineering verify return:1 depth=0 C = US, O = Pure Storage, OU = Pure Storage, CN = 10.21.203.32 verify return:1 --- Certificate chain 0 s:/C=US/O=Pure Storage/OU=Pure Storage/CN=10.21.203.32 i:/CN=CA/DC=sso/DC=alex.purestorage.com/C=US/ST=California/O=dev-vcsa.alex.purestorage.com/OU=VMware Engineering -----BEGIN CERTIFICATE----- MIIECjCCAvKgAwIBAgIJAPdseqeMC9//MA0GCSqGSIb3DQEBCwUAMIGvMQswCQYD VQQDDAJDQTETMBEGCgmSJomT8ixkARkWA3NzbzEkMCIGCgmSJomT8ixkARkWFGFs ZXgucHVyZXN0b3JhZ2UuY29tMQswCQYDVQQGEwJVUzETMBEGA1UECAwKQ2FsaWZv cm5pYTEmMCQGA1UECgwdZGV2LXZjc2EuYWxleC5wdXJlc3RvcmFnZS5jb20xGzAZ BgNVBAsMElZNd2FyZSBFbmdpbmVlcmluZzAeFw0xOTAxMDIwMDEyMDNaFw0yMDAx MDMwMDEyMDNaMFIxCzAJBgNVBAYTAlVTMRUwEwYDVQQKEwxQdXJlIFN0b3JhZ2Ux FTATBgNVBAsTDFB1cmUgU3RvcmFnZTEVMBMGA1UEAxMMMTAuMjEuMjAzLjMyMIIB IjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAmyrLznYC1uQZCkm1QIHkKnUb X0ZddQ0jZfcSEeiDExBKunyscZTUp7OoGHKkmmTD8BTDVfbmnBjqhdG8erlyeHMS xUD6sQRj217EcS7ByO2+uI55lw2I+uCx0a8tsHMxRgGM4D2k14D4TfchukGlFCpf mZi1RRZAlA2pV6pVmqbWgZMVL1bnQKZf7kIHilKRTGpHx7c6Ef2mafceWOcnTtlK QCSILxmSYrYsU9s+V2BJT5po+fZ+DXZblT4vMOcQ2ilTxJRP5QE9j5qsZrEUz+rX l433gZKDLT6lYlR+5AnDX+I/DdJ+4liBDdF/mXsmwm6EzD59BnEFf+zh3c4VpwID AQABo4GEMIGBMA8GA1UdEQQIMAaHBAoVyyAwHwYDVR0jBBgwFoAUhPjekr8DI8v9 L4f0blgFN62jI70wTQYIKwYBBQUHAQEEQTA/MD0GCCsGAQUFBzAChjFodHRwczov L2Rldi12Y3NhLmFsZXgucHVyZXN0b3JhZ2UuY29tL2FmZC92ZWNzL2NhMA0GCSqG SIb3DQEBCwUAA4IBAQAE9mMsUi+tZ01ReUwGvFCU4fLa+kLgYbpgqhvq1LImu20U JI0uu0k1o2YU4unZAdlsl88LQVmMdEJrG7eoVnBW3peEbLF/3tj6MB0k2k60zASt jmv5NLydMnq3DZ6PDrhXA31yruqmGXLjHdfjw2bL4X9+04csiDMpsalVTv8Ig8KA K5jHLf3IJTcmh2EtX+eGoMIomYglBdBaxa7VYAPj7gTT3LMgvvJld0ZaA0XHoENO bsNPcNWbppwwVKryOcpi7dKc3mS5EIi79aJp4VCa7zco1ERhp/gsK11Nx85Ngk46 HpkqOyuD4m/6rmbmJolsTj/2+mgntiqZ9/Ad5NAr 
-----END CERTIFICATE----- --- Server certificate subject=/C=US/O=Pure Storage/OU=Pure Storage/CN=10.21.203.32 issuer=/CN=CA/DC=sso/DC=alex.purestorage.com/C=US/ST=California/O=dev-vcsa.alex.purestorage.com/OU=VMware Engineering --- Acceptable client certificate CA names /CN=CA/DC=sso/DC=alex.purestorage.com/C=US/ST=California/O=dev-vcsa.alex.purestorage.com/OU=VMware Engineering /CN=CA/DC=sso/DC=alex.purestorage.com/C=US/ST=California/O=prod-vcsa.alex.purestorage.com/OU=VMware Engineering Client Certificate Types: RSA sign, DSA sign, ECDSA sign Requested Signature Algorithms: RSA+SHA512:DSA+SHA512:ECDSA+SHA512:RSA+SHA384:DSA+SHA384:ECDSA+SHA384:RSA+SHA256:DSA+SHA256:ECDSA+SHA256:RSA+SHA224:DSA+SHA224:ECDSA+SHA224:RSA+SHA1:DSA+SHA1:ECDSA+SHA1 Shared Requested Signature Algorithms: RSA+SHA512:DSA+SHA512:ECDSA+SHA512:RSA+SHA384:DSA+SHA384:ECDSA+SHA384:RSA+SHA256:DSA+SHA256:ECDSA+SHA256:RSA+SHA224:DSA+SHA224:ECDSA+SHA224:RSA+SHA1:DSA+SHA1:ECDSA+SHA1 Peer signing digest: SHA512 Server Temp Key: ECDH, P-256, 256 bits --- SSL handshake has read 2116 bytes and written 443 bytes --- New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-GCM-SHA384 Server public key is 2048 bit Secure Renegotiation IS supported Compression: NONE Expansion: NONE No ALPN negotiated SSL-Session: Protocol : TLSv1.2 Cipher : ECDHE-RSA-AES256-GCM-SHA384 Session-ID: 84534AF41A73A5FE12E43DE366458F4FC261423182E4AB56041D0FFA863CFE38 Session-ID-ctx: Master-Key: ECE352C5EA02DFE871C63FCA53E57040F16422A54B15E2671B157B64822EAE1F5725636A5DC9847C890684FC84240D96 Key-Arg : None PSK identity: None PSK identity hint: None SRP username: None TLS session ticket lifetime hint: 300 (seconds) TLS session ticket: 0000 - 26 85 31 7d ab 2f 67 fa-6f 45 25 9a c2 98 52 b6 &.1}./g.oE%...R. 0010 - 59 8f 18 2b 50 cb f2 48-d6 45 b0 8c df 0a 77 af Y..+P..H.E....w. 0020 - d1 c5 a9 fd 94 62 19 98-e9 78 1d aa 8a 43 91 f8 .....b...x...C.. 
0030 - 5f bd 86 f9 90 74 c8 9b-f7 4b 4f 9f 59 8a b4 46 _....t...KO.Y..F 0040 - fd 32 a9 57 1c 9e 72 4a-2c 43 00 ab b3 5a 34 df .2.W..rJ,C...Z4. 0050 - 94 fc 13 ed 76 9c 6b dd-c4 c9 7f 74 13 ec 03 66 ....v.k....t...f 0060 - 00 3d 69 d5 77 80 a8 e4-9a 2b dd 61 a5 55 a4 ab .=i.w....+.a.U.. 0070 - d4 a7 10 08 9f 43 bc 47-dc 40 2e 3e 49 05 f1 03 .....C.G.@.>I... 0080 - bd 22 df fa f4 c8 bc 72-fc 3a 29 6c b9 db 95 05 .".....r.:)l.... 0090 - 93 02 99 ac d9 f1 ee 23-ea f2 da a4 a2 40 63 17 .......#.....@c. 00a0 - 04 b9 de 96 a1 8d 65 03-3b 2e 65 aa 8a 48 36 13 ......e.;.e..H6. Start Time: 1546654101 Timeout : 300 (sec) Verify return code: 0 (ok) ---
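The line that matters most in the s_client output is the final "Verify return code". A small filter like the one below can pull out just the numeric code so it is easy to record per host; verify_code is a hypothetical helper name, and the sketch only parses text in the format shown above.

```shell
# Read `openssl s_client` output on stdin and print the numeric value of the
# last "Verify return code" line (0 means the chain verified against the CA file).
verify_code() {
    sed -n 's/^[[:space:]]*Verify return code:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | tail -n 1
}
```

One way to use it is to redirect stdin from /dev/null so s_client exits after the handshake, then pipe the output (including stderr) into verify_code. Any non-zero code is worth noting in the JIRA together with the text in parentheses that follows it.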
esxcli storage vvol vasaprovider list
[root@ESXi-2:~] esxcli storage vvol vasaprovider list
sn1-m20-c08-17-ct0
   VP Name: sn1-m20-c08-17-ct0
   URL: https://10.21.203.31:8084
   Status: syncError
   Arrays:
         Array Id: com.purestorage:b9b81f54-a054-4f56-86ad-cfec8c86dee8
         Is Active: true
         Priority: 200
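When several providers are registered, it can help to list only the ones that are not healthy. The sketch below reads the vasaprovider list output and prints each provider whose Status is anything other than "online". The function name is hypothetical, and treating the literal string "online" as the healthy state is an assumption; the sample above only shows the syncError state.

```shell
# Read `esxcli storage vvol vasaprovider list` output on stdin and print
# "<VP name>: <status>" for every provider whose Status is not "online".
# ASSUMPTION: a healthy provider reports the status string "online".
offline_providers() {
    awk '
        /^[ \t]*VP Name:/ { sub(/^[ \t]*VP Name:[ \t]*/, ""); name = $0 }
        /^[ \t]*Status:/  { sub(/^[ \t]*Status:[ \t]*/, "")
                            if (tolower($0) != "online") print name ": " $0 }
    '
}
```

For the sample output above this would print `sn1-m20-c08-17-ct0: syncError`.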
grep -B 2 "FINAL FAIL" /var/log/vvold.log |tail
[root@ESXi-2:~] grep -B 2 "FINAL FAIL" /var/log/vvold.log |tail 2019-01-05T02:11:23.013Z error vvold[7557B70] [Originator@6876 sub=Default] --> VasaOp::ThrowFromSessionError [#3248]: ===> FINAL FAILURE setPEContext, error (INVALID_SESSION / Bad session state (Uninitialized)) VP (sn1-m20-c08-17-ct0) Container (sn1-m20-c08-17-ct0) timeElapsed=0 msecs (#outstanding 0) -- --> VasaOp::SetPEContext [#3249]: ===> Issuing 'setPEContext' to VP sn1-m20-c08-17-ct0 (#outstanding 0/4) [session state: Uninitialized] 2019-01-05T02:16:23.030Z error vvold[7083B70] [Originator@6876 sub=Default] --> VasaOp::ThrowFromSessionError [#3249]: ===> FINAL FAILURE setPEContext, error (INVALID_SESSION / Bad session state (Uninitialized)) VP (sn1-m20-c08-17-ct0) Container (sn1-m20-c08-17-ct0) timeElapsed=0 msecs (#outstanding 0) -- --> VasaOp::SetPEContext [#3250]: ===> Issuing 'setPEContext' to VP sn1-m20-c08-17-ct0 (#outstanding 0/4) [session state: Uninitialized] 2019-01-05T02:21:23.048Z error vvold[7516B70] [Originator@6876 sub=Default] --> VasaOp::ThrowFromSessionError [#3250]: ===> FINAL FAILURE setPEContext, error (INVALID_SESSION / Bad session state (Uninitialized)) VP (sn1-m20-c08-17-ct0) Container (sn1-m20-c08-17-ct0) timeElapsed=0 msecs (#outstanding 0) --- [root@esx-cisco-rz1-s3:~] zgrep -B 2 "FINAL FAIL" /var/log/vvold.log | tail 2019-01-03T13:11:07.188Z error vvold[6994B70] [Originator@6876 sub=Default] --> VasaOp::ThrowFromSessionError [#13661495]: ===> FINAL FAILURE getEvents, error (INVALID_SESSION / Bad session state (TransportError)) VP (Pure_RZ1-ct0) Container (Pure_RZ1-ct0) timeElapsed=274 msecs (#outstanding 0) -- 2019-01-03T13:11:17.836Z error vvold[6CA0B70] [Originator@6876 sub=Default] VasaOp::DoRetry [#13661497]: No VP available for array (com.purestorage:61f3ee08-61c1-55ed-3eb2-a48b8ef8e598) 2019-01-03T13:11:17.836Z error vvold[6CA0B70] [Originator@6876 sub=Default] --> VasaOp::ThrowFromSessionError [#13661497]: ===> FINAL FAILURE queryVirtualVolume, error 
(INVALID_SESSION / No VP available) VP (Pure_RZ1-ct0) Container (cdc465ff-8ba4-3913-bc9b-a52f2b1bad80) timeElapsed=1243 msecs (#outstanding 0) -- 2019-01-03T13:11:18.984Z error vvold[6647B70] [Originator@6876 sub=Default] VasaOp::DoRetry [#13661499]: No VP available for array (com.purestorage:61f3ee08-61c1-55ed-3eb2-a48b8ef8e598) 2019-01-03T13:11:18.984Z error vvold[6647B70] [Originator@6876 sub=Default] --> VasaOp::ThrowFromSessionError [#13661499]: ===> FINAL FAILURE queryVirtualVolume, error (INVALID_SESSION / No VP available) VP (Pure_RZ1-ct0) Container (cdc465ff-8ba4-3913-bc9b-a52f2b1bad80) timeElapsed=1146 msecs (#outstanding 0)
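The vvold.log failures repeat every few minutes, so a count per provider, operation, and error is usually easier to paste into the JIRA than the raw lines. The following is a minimal sketch that assumes the "FINAL FAILURE <operation>, error (<error>) VP (<provider>)" layout seen in the samples above; failure_summary is a made-up name.

```shell
# Read vvold.log lines on stdin and print "<count> <VP> <operation>: <error>"
# for each distinct FINAL FAILURE combination.
failure_summary() {
    sed -n 's/.*FINAL FAILURE \([^,]*\), error (\(.*\)) VP (\([^)]*\)).*/\3 \1: \2/p' |
        LC_ALL=C sort | uniq -c | sed 's/^ *//'
}
```

It can be run on the host as `grep "FINAL FAIL" /var/log/vvold.log | failure_summary`. A shift in the error text (for example from "Bad session state (Uninitialized)" to "No VP available") is worth calling out explicitly.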
- Now that you have the information above, make sure you have it copied down in your notes or into the JIRA, then help the customer get the issue resolved.
Correcting the Problem
Okay, we've got all the information we need, and the customer will be getting us both the vCenter and Host logs after the issue is corrected. Let's get this issue resolved and the customer up and running. Here are the steps we'll want to take.
- First try to refresh the certificate for the storage provider for both CT0 and CT1.
- If the certificate refresh fails, remove the storage providers for CT0 and CT1 and then re-register the storage providers for CT0 and CT1.
- If the registration is failing, confirm that the correct IP address and port :8084 are being registered. If it continues to fail, see what is failing in the vCenter SPS log.
It's a good idea to tail the SPS log during these registrations in any case. The SPS log is located on the vCenter Server Appliance at "/var/log/vmware/vmware-sps/".
Look for errors like this:
2018-12-22T14:56:53.958-06:00 [pool-14-thread-9] ERROR opId= com.vmware.vim.sms.provider.vasa.VasaProviderImpl - Error provisioning a VMCA signed cert! com.vmware.vim.sms.fault.VasaServiceException: org.apache.axis2.AxisFault: First Element must contain the local name, Envelope , but found html
2018-12-22T14:56:54.019-06:00 [pool-14-thread-9] WARN opId= com.vmware.vim.sms.provider.vasa.VasaProviderInfoPersistenceManager - [cleanProvider] Failed to remove provider from KV store! com.vmware.vim.sms.fault.DBPersistenceException
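To watch for the two error signatures shown above while the customer retries the registration, the SPS log stream can be filtered with a simple pattern match. This is only a sketch: sps_errors is a hypothetical helper, and the two patterns come straight from the sample lines above, so other failure modes will not be caught.

```shell
# Read SPS log lines on stdin and print only those matching the two
# registration-failure signatures quoted in this KB. Not an exhaustive list.
sps_errors() {
    grep -E 'Error provisioning a VMCA signed cert|Failed to remove provider from KV store'
}
```

The filter can sit at the end of whatever tail command is being used against the SPS log during the registration attempt.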
- The next step we can follow is to clear the current VASA Certs on both controllers and restart the VASA Service on both controllers.
Here are the commands needed to clear the certificate information on a controller. This process is required if no storage provider had ever been registered against the array before it was running 5.1.6+.
DO NOT RUN THIS WORKFLOW ON ANY ARRAY RUNNING PURITY 5.3.0 OR HIGHER!
openssl genrsa -out /cache/ssl/vasa.key 2048
openssl req -new -key /cache/ssl/vasa.key -out /cache/ssl/vasa.csr -subj "/C=US/O=Pure Storage/OU=Pure Storage/CN=Pure Storage"
openssl x509 -req -days 7300 -in /cache/ssl/vasa.csr -signkey /cache/ssl/vasa.key -out /cache/ssl/vasa.crt
rm /cache/ssl/vasa.csr
service nginx reload
service vasa restart
This would need to be done for both controllers.
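Because the manual reset must never be run on Purity 5.3.0 or higher, a version gate is worth checking before touching either controller. The sketch below encodes the version boundaries stated in this KB (5.1.6 and 5.3.0); the function names are hypothetical, the version string is assumed to be purely numeric and dotted, and it is passed in by hand rather than read from the array.

```shell
# Return 0 when dotted numeric version $1 is >= version $2 (e.g. 5.2.3 vs 5.1.6).
ver_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}
        y=${b%%.*}
        [ "${x:-0}" -gt "${y:-0}" ] && return 0
        [ "${x:-0}" -lt "${y:-0}" ] && return 1
        # Drop the component just compared; stop when none are left.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# Print which certificate workflow this KB allows for Purity version $1.
cert_workflow() {
    if ver_ge "$1" 5.3.0; then
        echo "purecert"        # manual reset must NOT be used
    elif ver_ge "$1" 5.1.6; then
        echo "manual-reset"    # the openssl steps in this KB apply
    else
        echo "not-applicable"  # pre-5.1.6, outside this KB's workflow
    fi
}
```

For example, `cert_workflow 5.2.3` prints `manual-reset`, while `cert_workflow 5.3.0` prints `purecert`.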
These certificate reset steps are well outlined here for reference.
- Should the registration fail again, confirm with the customer that they are using an intermediate certificate and that the openssl s_client check you ran from the host failed. Should that be the case, follow these steps from ES-60209.
First, back up the VASA nginx config file.
cp /etc/nginx/sites-enabled/vasa /home/os76/vasa-nginx-bak
Add the line ssl_verify_depth 10; to the VASA nginx file.
vim /etc/nginx/sites-enabled/vasa
cat /etc/nginx/sites-enabled/vasa
server {
    listen 8084 ssl;
    ssl_certificate /cache/ssl/vasa.crt;
    ssl_certificate_key /cache/ssl/vasa.key;
    ssl_dhparam /cache/ssl/dhparam2048.pem;
    ssl_client_certificate /cache/ssl/vasaRoot.crt;
    ssl_verify_client optional;
    ssl_verify_depth 10;
    ssl_protocols TLSv1.1 TLSv1.2;
    ssl_ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AES:!aNULL:!DSS;
    ssl_prefer_server_ciphers on;
    location ~ {
        proxy_pass http://127.0.0.1:8085;
        proxy_read_timeout 999999;
        set $client_verified true;
        if ($ssl_client_verify != SUCCESS){
            set $client_verified false;
        }
        proxy_set_header client_verify $client_verified;
        proxy_set_header remote_address $remote_addr;
    }
}
Reload nginx and restart the vasa service on the Controller.
service nginx reload
service vasa restart
Repeat the same steps for the other Controller.
cp /etc/nginx/sites-enabled/vasa /home/os76/vasa-nginx-bak
vim /etc/nginx/sites-enabled/vasa
service nginx reload
service vasa restart
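Since the edit is made by hand on each controller, it is easy to miss one or to leave the directive commented out. A small check like the following can confirm that ssl_verify_depth is actually set in the file before the services are reloaded. The helper name verify_depth is made up for illustration, and the sketch assumes the directive sits on its own line, as in the config shown above.

```shell
# Print the value of an active (uncommented) ssl_verify_depth directive in the
# nginx config file $1. Prints nothing when the directive is absent.
verify_depth() {
    sed -n 's/^[[:space:]]*ssl_verify_depth[[:space:]][[:space:]]*\([0-9][0-9]*\)[[:space:]]*;.*/\1/p' "$1" | tail -n 1
}
```

Running `verify_depth /etc/nginx/sites-enabled/vasa` on each controller should print 10 after the change. Running `nginx -t` before the reload is a further, optional way to catch a syntax slip in the edited file.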
- Have the customer register the storage providers for both controllers. Once both storage providers are registered, confirm with the customer that they are able to run through some VVol related workflows.
- Should the customer still be unable to register the storage providers, or continue to have issues with VVols after a 5.1.6+ upgrade, please continue to work through the JIRA and document the failures seen in the SPS log on vCenter and the vasa log on the FlashArray.
One of the most important things we can do here is over-document what's going on and the information we are gathering. The issue with the Intermediate Certs is understood and a fix is forthcoming, but in the event that customers are having problems after a 5.1.6+ upgrade and are not using Intermediate Certs, the information gathering is crucial.
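When deciding whether a case falls under the known Intermediate Cert issue, the truststore_backup listing gathered earlier can be scanned for entries whose Owner and Issuer differ. A CA certificate signed by another CA (rather than by itself) is an intermediate. The sketch below is a heuristic only: non_root_entries is a hypothetical name, and it assumes the Alias name / Owner / Issuer line layout of the keytool output shown in this KB.

```shell
# Read `keytool -list -v` output on stdin and print the alias of every
# certificate whose Owner differs from its Issuer (i.e. not self-signed).
non_root_entries() {
    awk '
        /^[ \t]*Alias name:/ { sub(/^[ \t]*Alias name:[ \t]*/, ""); entry = $0 }
        /^[ \t]*Owner:/      { sub(/^[ \t]*Owner:[ \t]*/, "");      owner = $0 }
        /^[ \t]*Issuer:/     { sub(/^[ \t]*Issuer:[ \t]*/, "")
                               if ($0 != owner) print entry }
    '
}
```

For the sample truststore above, both vmca_0 and vmca_1 have identical Owner and Issuer lines, so nothing would be printed, which is consistent with a setup that does not use intermediate certificates.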
JIRA References
JIRA Number | Quick Summary |
---|---|
PURE-135909 | VASA doesn't allow intermediate CA certificates for client authentication |
PURE-136065 | VASA doesn't create new cert when finding double CN cert under /cache/ssl |
ES-58961 | Issue from Intermediate Cert used on vCenter/SSO registered with Array Storage Provider |
ES-60209 | Issue from Intermediate Cert used on vCenter/SSO registered with Array Storage Provider |
ES-60222 | Issue from Intermediate Cert used on vCenter/SSO registered with Array Storage Provider |
ES-60512 | Storage Providers had to be removed and re-registered post upgrade |