Jemimus on September 22nd, 2016

Re-encrypting my work laptop harddrive.
VeraCrypt is the successor to TrueCrypt, and its code has been community-vetted to ensure there are no ‘back doors’ in it (and its security can be independently verified).

The only downside is that, by default, it uses a rather high header key derivation iteration count (a lot higher than TrueCrypt), meaning it can take several minutes to boot your laptop. This is a frequent complaint from new VeraCrypt users.

The workaround is simple. As long as you use a password that is longer than 20 characters, you are allowed to reduce the number of iterations substantially by using a lower multiplier value (called a PIM), which you type in at boot time after your password. The multiplier may be as low as 1, which will mount your boot partition more or less instantly.
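
To put some rough numbers on that (assuming I am reading the VeraCrypt documentation right: for system encryption the iteration count is derived as PIM × 2048, and the default pre-boot count is on the order of 200,000):

————-
# back-of-the-envelope sketch, assuming iterations = PIM * 2048 for system encryption
echo $(( 1 * 2048 ))    # PIM = 1  -> 2048 iterations, near-instant pre-boot mount
echo $(( 98 * 2048 ))   # PIM = 98 -> 200704 iterations, roughly the slow default
————-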

For the purpose of reducing the risk posed by common thieves, this is probably more than enough protection. However, if you are seeking to thwart the NSA, which may spend 5 years brute-forcing your password on a server farm, it may not be 😉


On ESX 5.5U3, I recently ran into an annoying issue with HA. vSphere had recently been updated, but not all of the hosts had yet received the very latest version of the FDM (Fault Domain Manager, aka ‘HA’) agent.
During some routine maintenance work, a particular host was taken in and out of maintenance mode a few times. Eventually, it was observed to no longer properly complete its HA configuration. Checking the host status in the UI, it would seemingly get stuck in the install phase of the latest FDM agent.

Checking the FDM installer log (/var/run/log/fdm-installer.log), I found the following:

—————————————————————-
fdm-installer: [40283] 2016-08-25 11:16:13: Logging to /var/run/log/fdm-installer.log
fdm-installer: [40283] 2016-08-25 11:16:13: extracting vpx-upgrade-installer/VMware-fdm-eesx-2-linux-4180647.tar
[40283] 2016-08-25 11:16:13: exec rm -f /tmp/vmware-root/ha-agentmgr/upgrade
[40283] 2016-08-25 11:16:13: status = 0
[40283] 2016-08-25 11:16:13: exec cd /tmp/vmware-root/ha-agentmgr/vpx-upgrade-installer
[40283] 2016-08-25 11:16:13: status = 0
fdm-installer: [40283] 2016-08-25 11:16:13: Installing the VIB
fdm-installer: [40283] 2016-08-25 11:16:18: Result of esxcli software vib install -v=/tmp/vmware-root/ha-agentm
fdm-installer: Error in running rm /tardisks/vmware_f.v00:
fdm-installer: Return code: 1
fdm-installer: Output: rm: can't remove '/tardisks/vmware_f.v00': No such file or directory
fdm-installer:
fdm-installer: It is not safe to continue. Please reboot the host immediately to discard the unfinished update.
fdm-installer: Please refer to the log file for more details.
fdm-installer: [40283] 2016-08-25 11:16:18: There is a problem in installing fdm vib. Remove the vib…
[40283] 2016-08-25 11:16:18: exec esxcli software vib remove -n=vmware-fdm.vib
[NoMatchError]
No VIB matching VIB search specification 'vmware-fdm.vib'.
Please refer to the log file for more details.
[40283] 2016-08-25 11:16:19: status = 1
fdm-installer: [40283] 2016-08-25 11:16:19: Unable to install HA bundle because esxcli install return 1

—————————————————————-

This was decidedly odd. I checked the /tardisks mount and could indeed not find any vmware_f.v00 file. The installer was trying to ‘remove’ (unmount, as it turns out) a file that did not exist, and this was breaking the uninstall process.
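
For reference, the check itself is nothing exotic:

————-
# look for the tardisk the installer insists on removing
ls -l /tardisks/vmware_f.v00        # -> No such file or directory
ls /tardisks | grep -i vmware_f     # -> no output on the broken host
————-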

This page was useful in understanding the sequence of events: http://vm-facts.com/main/2016/01/23/vmware-ha-upgrade-agent-issue-troubleshooting/

I can only speculate as to what happened: at some point in the sequence of taking the host in and out of maintenance mode, the FDM uninstall somehow failed to complete properly and left the host image list in a strange, invalid state.

Querying the host in this state, it listed the old FDM agent as still installed:

————-
# esxcli software vib list | grep -i fdm
vmware-fdm                     5.5.0-3252642                       VMware  VMwareCertified   2016-02-03
————-

Yet a force uninstall of the VIB would fail with the same error.

————————
fdm-uninstaller: [] 2016-08-24 11:42:30: exec /sbin/esxcli software vib remove -n=vmware-fdm
Removal Result
Message: Operation finished successfully.
Reboot Required: false
VIBs Installed:
VIBs Removed: VMware_bootbank_vmware-fdm_5.5.0-3252642
VIBs Skipped:
fdm-uninstaller: [] 2016-08-24 11:43:58: status = 1
fdm-uninstaller: [] 2016-08-24 11:43:58: exec /sbin/chkconfig --del vmware-fdm
———————-

Together with VMware support, we tried various tricks, including copying a fresh imgdb.tgz from a different host to /bootbank, and running the latest installer and uninstaller of the FDM agent manually.
By the way, the source that vCenter uses for the FDM agent installer and uninstaller is (on Windows) “Program Files\VMware\Infrastructure\VirtualCenter Server\upgrade”.

If you wish to run these files directly on an ESX host, simply copy them to /tmp on the host and chmod them to 777. They are then executable.
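
For example (the file names below are placeholders; use whichever installer/uninstaller versions your vCenter build ships in that upgrade directory):

————-
# copy the FDM installer/uninstaller from the vCenter 'upgrade' directory to the host
# and make them executable (file names are placeholders)
scp VMware-fdm-installer root@esxhost:/tmp/
ssh root@esxhost chmod 777 /tmp/VMware-fdm-installer
————-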

But in all cases, the FDM installer will first try to uninstall the previous version, which always includes trying to unmount /tardisks/vmware_f.v00.

Now, /tardisks is a bit of a strange bird and deserves some explanation. This VMware research paper turned out to be an excellent document for understanding what /tardisks actually is and does: https://labs.vmware.com/vmtj/visorfs-a-special-purpose-file-system-for-efficient-handling-of-system-images

In short, it is a directory that hosts mounted TAR files, which are loaded at boot time from /bootbank (or /altbootbank). These TAR files are mounted as live filesystems using what VMware calls VisorFS, which makes them behave as part of the regular file system. This has various administrative and management advantages, as the paper linked above explains.

It is therefore not possible to simply copy a missing file to /tardisks in order to force the FDM uninstaller to properly complete.

You can list which TAR filesystems ESX has mounted by running the command esxcli system visorfs tardisk list

This list will be the same as the file list of /tardisks.
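
For reference, these are the commands we used to check (assuming the FDM tardisk carries the same name as in the installer log above):

————-
# list all tardisks mounted through VisorFS, then check specifically for the FDM module
esxcli system visorfs tardisk list
esxcli system visorfs tardisk list | grep -i vmware_f
————-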

Of note: when you re-install FDM, the ‘system’ flag will be set to false just after the install, until you reboot. After a reboot, it will be set to true, like all the other modules.

On a normal host, you will find the FDM VIB listed here.

In our case, this entry was missing, even though the vib list command showed the VIB as installed.

So it seemed to me that if ESX needed to mount these TAR files at boot time, there was probably a command it used to do so, or at least that such a command was likely to exist, if only for troubleshooting purposes. I wondered whether, if I could mount this TAR manually, the uninstaller might proceed normally.
A few minutes of google-fu later, I stumbled on this:
Creating and Mounting VIB files in ESXi

Now the VMware engineer noted that the vmkramdisk command has been deprecated since 4.1, but to both our surprise (and delight) it was still there in 5.5, and still did its job.

We manually mounted /bootbank/vmware_f.v00 using the command vmkramdisk /bootbank/vmware_f.v00

Immediately, you will find vmware_f.v00 listed under /tardisks and in the output of esxcli system visorfs tardisk list.
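
In other words, the whole workaround boils down to this (same file name as in the installer log; your bootbank path may differ):

————-
# manually mount the bootbank module as a tardisk, then verify it is visible again
vmkramdisk /bootbank/vmware_f.v00
ls /tardisks | grep -i vmware_f
esxcli system visorfs tardisk list | grep -i vmware_f
————-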

And as predicted, the installer passed through the uninstall this time without a hitch, and then installed the new version of the HA agent. We rebooted the host just to be sure it would properly load the new VIB each time. It did, and the host initiated HA in the cluster without any issues thereafter.


We all know about the VMware case numbers. Each SR you open gets a nice number.

Internally, VMware has a problem database. Newly found bugs end up in there. And if you spend a lot of time with VMware support, you will end up hearing a lot about these internal PRs (problem reports).

Here is a cool fact you may not know: hidden in the HTML source of the public release notes that VMware produces are the actual PR numbers associated with each issue that is described as having been fixed (or not fixed).

Take the NSX 6.2.0 release notes for example: https://www.vmware.com/support/nsx/doc/releasenotes_nsx_vsphere_620.html

View the source:

And if you scroll down to the fixes, you will find:

It’s those DOCNOTE numbers that are the actual PR numbers. Sometimes they also list the public KB number. But there are far more internal PR numbers than there are public KB equivalents.
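
If you don’t feel like digging through the page source by hand, something along these lines will fish them out (a rough sketch; the exact markup of the DOCNOTE references may differ from one release-notes page to the next):

————-
# pull the release notes and grep the hidden DOCNOTE / PR references out of the HTML
curl -s https://www.vmware.com/support/nsx/doc/releasenotes_nsx_vsphere_620.html \
  | grep -oi 'docnote[^"<]*' | sort -u
————-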

So how can this help you?

Well, for one thing, you can start asking intelligent questions of VMware support, like: ‘Has something like my issue been reported before in the PR database?’ (prompting the engineer to go look for it, which they don’t always do of their own accord 😉).
Or you can use it as a validation check. If your issue is scheduled to be fixed in an upcoming patch, ask the support engineer for the associated PR numbers! That way, you can verify for yourself in the release notes whether the fix was included!
The process of getting a new patch or update through QA is quite involved, and sometimes fixes fall by the wayside. This is not immediately known to everyone inside VMware. So it’s always worth checking yourself: trust, but verify.


Jemimus on January 31st, 2016

This year I need to refresh my VMware VCP cert, so I have started to look around for educational materials to help with this.


I decided several years ago that I would never again buy an IT book in physical form if that book ran the risk of becoming outdated quickly. This is especially true for product-specific books. My reasoning: buying a physical book with a limited shelf life is wasteful, and eBook versions are usually cheaper. Finally, when I study, I tend to do so whenever the opportunity and motivation arise, which could be anywhere at any time, often when I travel. So I benefit from a flexible digital format that follows me around on my various devices.

VMware refreshes its core vSphere product every few years, so this is a prime example of the kind of subject I would not buy a physical book for.

It’s slightly shocking to me to see how expensive a ‘Mastering vSphere’ book now is. My main issue is that the book’s usefulness is severely limited in time. I would have far less trouble shelling out this kind of money for books I could proudly keep on my shelf for the rest of my life. The price is so off-putting that I will forgo purchasing this book this time around and will seek other means of getting my coverage of the product. That is a real shame: I like these books, and I appreciate the effort put into them. But these prices are just not worth it.


Warning: This is kind of a rant.

Sometimes I really have to wonder if the engineers who build hardware ever even talk to people who use their products.

Though I love the EMC VPLEX, I get this feeling of a ‘disconnect’ between design and use more strongly with this product than with many others.

This post is a typical example.

I noticed that one of my VPLEX clusters apparently does not have the correct DNS settings set up.

Now, a disclaimer: I am not a Linux guy. But even if I were, my first instinct when dealing with hardware is not to treat it as an ordinary Linux distro. That kind of assumption can be fatal. When it is a complete, vendor-provided solution, I assume (and it is mostly the case) that the vendor supplies specific configuration commands or environments to configure the hardware. It is always best practice to follow vendor guidelines before you start messing around yourself. Messing around yourself is often not even supported.

So, let’s start working the problem.

My first go-to for most things is of course Google:

Now I really did try to find anything, any post by anyone, that could tell me how to set up DNS settings. I spent a whole 5 minutes at least on Google :p

But alas, no: lots of informative blog posts, but nothing about DNS.

Ok, to the manuals. I keep a folder of VPLEX documentation handy for exactly this kind of thing:

docu52651_VPLEX-Command-Reference-Guide MARCH2014.pdf

Uhh.. nope.

docu52646_VPLEX-Administration-Guide MARCH2014.pdf

AHA!

Uhh.. nope.

docu34005_VPLEX-Configuration-Guide MARCH2014.pdf

Nope

:(

Ok, something more drastic:

docu52707_VPLEX-5.3-Documentation-Portfolio.pdf

3 hits. THREE.. really?

Yes.. I know the management server uses DNS. *sigh*

Oh.. well at least I know that it uses standard Bind now, great!

oh, hi again!

Ok, let’s try the EMC Support site next:

Uhhmm.. the only interesting one here is:

( https://support.emc.com/docu34006_VPLEX-with-GeoSynchrony-5.0-and-Point-Releases-CLI-Guide.pdf?language=en_US )

director dns-settings create, eh??

Ok then!

Getting excited now!

‘Create a new DNS settings configuration’

Uhmm.. you mean like… where I can enter my DNS servers, right? Riiiiight?

Oh.. uh.. what? I guess they removed it in or prior to GeoSynchrony 5.3? :p

:(

Back to EMC support

Nope.

Nope.

So… there is NO DNS knowledge anywhere in the EMC documentation? At all??? Anywhere??

Wait! Luke, there is another!

SolVe (seriously, who comes up with these names) is the replacement for the good ole ‘procedure generator’ that used to be on SupportLink.

Hmm… I don’t see DNS listed?

Change IP addresses maybe??

Hmm…  not really.. however I see an interesting command: management-server

Oh… I guess you are too good to care for plain old DNS, eh?

And this is the point where I have run out of options to try within the EMC support sphere.

And as you can see, I really, really did try!

So… the management server is basically a SUSE Linux distro, right?

vi /etc/resolv.conf

Uhm… well fuck.

Now, I am logged into the management server with the ‘service’ account, the highest-level account mentioned in any of the documentation. Of course, it is not the root account.

sudo su - …  and voila:

There we go!
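
For what it’s worth, the whole ‘fix’ amounts to nothing more than this (nameserver addresses are placeholders, and keep in mind this edits the box outside of any vendor procedure):

————-
# become root from the 'service' account and edit resolv.conf directly
sudo su -
vi /etc/resolv.conf
#   nameserver 192.0.2.53    <- placeholder entries, use your own DNS servers
#   nameserver 192.0.2.54
————-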

 

Which brings me to another thing I might as well address right now.

The default root account password for the VPLEX management server is easily Googlable. That is why you should change it. There actually is a procedure for this: https://support.emc.com/kb/211258
Which I am sure no one ever anywhere ever has ever followed.. that at least is usually the case with this sort of thing.

Here is the text from that KB article:

The default password should be changed by following the below procedure. EMC recommends following the steps in this KB article and downloading the script mentioned in the article from EMC On-Line Support.

Automated script: 

The VPLEX cluster must be upgraded to code version 5.4.1 Patch 3 or to 5.5 Patch 1 prior to running the script.

Note: VS1 customers cannot upgrade to 5.5, since only VS2 hardware is capable of running 5.5. VS1 customers must upgrade to 5.4 SP1 P3, and VS2 customers can go to either 5.4 SP1 P3, or 5.5 Patch 1.

The script, “VPLEX-MS-patch-update-change-root_password-2015-11-21-install” automates the workaround procedure and can be found at EMC’s EMC Online Support.

Instructions to run the script: 

Log in to the VPLEX management-server using the service account credentials and perform the following from the management-server shell prompt:

  1. Pull down a copy of the “VPLEX-MS-patch-update-change-root_password-2015-11-21-install” script from the specified location above and then, using SecureCopy (scp), copy the script into the “/tmp/VPlexInstallPackages/” directory on the VPLEX management server.
  2. The permissions need to be changed to allow execution of the script using the command chmod +x.

service@ManagementServer:~> chmod +x /tmp/VPlexInstallPackages/VPlex-MS-patch-update-root_password-2015-11-21-install

  3. Run the script as shown below.

Sample Output:

This script will perform following operation:
- Search and insert the IPMI related commands in /etc/sudoers.d/vplex-mgmt.
- Prompt for the mgmt-server root password change.
Run the script with "--force" option to execute it

service@ManagementServer:~> sudo /tmp/VPlexInstallPackages/VPlex-MS-patch-update-root_password-2015-11-21-install --force

Running the script…

- Updating sudoers
- Change root password
Choose password of appropriate complexity.

Enter New Password:
Reenter New Password:

Testing password strength…

Changing password for root.

Patch Applied

NOTE: In the event that the password is not updated, run the script again with proper password complexity.

  4. Following running of the script, from the management server, verify that the password change is successful.

Sample output:

service@ManagementServer:~> sudo -k whoami
root's password:
root

***Contact EMC Customer Service with the new root password to verify that EMC can continue to support your VPLEX installation. Failure to update EMC Customer Service with the new password may prevent EMC from providing timely support in the event of an outage.

Notice how convoluted this is. Also notice how you need to have at least 5.4.1 Patch 3 in order to even run it.

While EMC KB articles have an attachment section, the script in question is of course not attached there.

Instead, you have to go look for it yourself. Helpfully, they link you to: https://support.emc.com/products/29264_VPLEX-VS2/Tools/

And it’s right there, for now at least.

What I find interesting here is that it appears both the article and the script were last edited.. today?
Coincidence? But also a little scary. Does this mean that prior to 5.4.1 Patch 3 there really was no supported way to change the default VPLEX management server root password? The one that every EMC and VPLEX support engineer knows and that is easily Googlable? Really?

I think the most troubling part of all this is that final phrase:

Failure to update EMC Customer Service with the new password may prevent EMC from providing timely support in the event of an outage.

Have you ever tried changing vendor default backdoor passwords to see if their support teams can deal with it? Newsflash: they cannot. We tried this once with EMC CLARiiON support. We changed the default passwords and dutifully informed EMC support that we had done so. They assured us this was noted down in their administration for our customer.

You can of course guess what happened. Every single time, EMC support would try to get in and complain that they could not, and every single time you had to tell them about the new passwords you had set up. I am sure that somewhere in the EMC administrative system there is a notes field that could contain our non-default passwords. But no EMC engineer I have ever spoken to would ever look there, or even know to look there.

If you build an entire hardware-support infrastructure around the assumption of a built-in default password that everyone and their mother knows, you make it fundamentally harder to properly support users who ‘do the right thing’ and change it. And you build in vulnerability by default.

Instead, design your hardware and appliances to generate new, unique, strong default passwords on first deployment, or have the user provide them (enforcing complexity); many VMware appliances now do this. But do NOT bake in backdoor default passwords that users and Google will eventually find out about.