Archive for the ‘vexpert’ Category

NLVMUG2018 – Speaking on NSX microsegmentation and a community panel discussion #vexpert

Thursday, March 15th, 2018

It’s always exciting to speak publicly, and this year I am setting my bar higher by participating in two sessions.

First up is a panel discussion that I was very happy to be invited to by Francisco Perez van der Oord, one of the directors of ITQ. We will have a 45-minute flow of topics around SDDC, NSX, Cloud, etc., and the general trends of technology as they impact vSphere admins. We titled the session “vSphere – .. and then what next?” I have never participated in a panel discussion on stage before, so that will be an interesting experience. The other participants are, imo, giants in the Dutch VMware community: Joep Piscaer of OGD/Jumbo and Viktor van den Berg of PQR, and I feel quite humbled being on stage with them.
https://nlvmugusercon2018.sched.com/event/E1wh/sddc-cmp-and-nsx-discussions-with-the-community

 

My second session is my own talk, 20 minutes, on NSX Microsegmentation in practice. This is a condensed version of the talk I gave at the Infosecurity conference last year.
In it I cover some practical tips for using NSX Microsegmentation, dos and don’ts, and common gotchas.
It’s actually quite tough to get all the essentials into 20 minutes or so, so it will be dense and fast-paced (as usual for me).

https://nlvmugusercon2018.sched.com/event/E1wy/micro-segmentatie-in-de-praktijk

Nervous, but really looking forward to the day. I love the VMUG concept and I love networking and seeing all the community in the flesh again (as opposed to only on Slack/Twitter).


ESX Update Failure because of lack of space in /core

Wednesday, February 28th, 2018

On ESX 6.5, I ran into an issue where updates from vSphere Update Manager (VUM) were refusing to install, with two different errors that both had the same root cause.

VUM will throw an error 15 in the UI, but if you look at /var/log/esxupdate.log on the ESX host itself, you will see in more detail what is going on.

It should be noted that “The host returns esxupdate error code:15” is a highly generic error message you might get at remediation. It can have a number of different causes, including a corrupt update manifest file, a corrupt bootbank, a corrupt VIB file or a corrupt local temporary patch database.

In the screenshot below of esxupdate.log, you can see that the temporary patch database could not be created in /locker/packages/var/db/locker.

A different way this problem may present is as a ‘broken pipe’ error (‘[Errno 32] Broken pipe’).

Notice that in both cases, it is failing on the large, 200 MB VMware_locker_tools-light bundle.

 

Both /core and /locker are symbolic links to one of the ESX partitions. In this case, the partitions are on a mirrored SD card and are of type vfat.

If you cd into /core you will end up in these partitions.

Using df -h you can check how much free space there is. As you can see, in our case just a little over 200 MB remains.

That is not much, especially if you consider that the VMware Tools tools-light bundle is itself about 200 MB.
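The same comparison can be scripted. Below is a minimal sketch, assuming a Python 3 interpreter is available on the host (or that you run it against a mounted copy of the partition). The 200 MB bundle size comes from the text above; the 1.5x headroom factor and the function names are my own assumptions for illustration, not anything VUM or esxupdate uses.

```python
import shutil

MB = 1024 * 1024


def fits(free_bytes, required_bytes, headroom=1.5):
    """Return True if free_bytes covers required_bytes times a safety factor.

    The headroom factor is an assumed margin, not a documented VMware value.
    """
    return free_bytes >= required_bytes * headroom


def check_partition(path, required_bytes, headroom=1.5):
    """Look up the free space of the filesystem holding `path` and test it."""
    free = shutil.disk_usage(path).free
    return fits(free, required_bytes, headroom)


if __name__ == "__main__":
    bundle_size = 200 * MB  # approximate size of the tools-light bundle
    for path in ("/core", "/locker"):
        try:
            ok = check_partition(path, bundle_size)
        except OSError as exc:
            # Path does not exist or cannot be queried on this system.
            print("{}: cannot check ({})".format(path, exc))
            continue
        print("{}: {}".format(path, "enough room" if ok else "too tight"))
```

With a little over 200 MB free and a 200 MB bundle, this check reports the partition as too tight, which matches what the update failure was telling us.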

Check the /var and /packages directory trees in this partition for files that can be cleaned up.

In the screenshot above, you can see that there appears to be a 73 MB hostd core dump file sitting in /store/var/core.

Unless you really need these, for example to send to GSS (Global Support Services), they can be deleted.
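If you would rather not hunt through the directory tree by hand, a small script can list the biggest files for you. This is a sketch under assumptions: it needs a Python 3 interpreter, the default root of /locker is just an example, and the function name is hypothetical. It only reports files and never deletes anything.

```python
import os
import sys


def largest_files(root, limit=10):
    """Return up to `limit` (size_in_bytes, path) tuples, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue  # skip symlinks so targets are not counted twice
                sizes.append((os.path.getsize(path), path))
            except OSError:
                continue  # file vanished or is unreadable; ignore it
    sizes.sort(reverse=True)
    return sizes[:limit]


if __name__ == "__main__":
    # Example root only; pass the partition you want to inspect as an argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "/locker"
    for size, path in largest_files(root):
        print("{:8.1f} MB  {}".format(size / (1024.0 * 1024.0), path))
```

A stray core dump like the one above would show up near the top of this list, after which you can decide for yourself whether it is safe to remove.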

Similarly, you can also delete the old VMware Tools bundles, unless you need them:

/locker/packages/6.5.0/vmtools/

These bundles are only used if you choose to auto-install VMware Tools directly to a VM, using the UI or API.

In practice, in most environments this feature is not used (or only very rarely), because most people either use the open-vm-tools packages included with Linux distributions, include VMware Tools in a template or golden image, or auto-install it with config management like Puppet, Ansible or vRO.
So to save some space in case you have large update packages that don’t fit in /core, you can consider deleting these files too. They are about 200 MB in all, after all.

Be aware, though, that updates to esx-base, or specifically named VMware Tools updates, will of course reinstall these files.

 

VMworld 2017 EU Day 1 (part 1) #vexpert

Tuesday, September 12th, 2017

Had a super productive first day at VMworld!
The Partner day is typically a bit quieter than the rest of the week, and more sales-oriented in the breakout sessions. But I only got one session in anyway, as the rest of the day was focused on, imo, more valuable private sessions with various VMware groups.

UX Design Session VMware on AWS

First up was a VMware User Experience design session based around VMware on AWS. We only had an hour and that barely touched on all the feedback we could give. We ended up going only through the initial setup wizard and discussing a lot about how and where it integrates with Amazon AWS structures. Extremely useful to also get a first impression of VMware on AWS, but I think I will go for the Hands-On Lab here at VMworld, to get a more general overview.

A User Experience design session can be a strange experience if you don’t know what to expect. It’s the session leader’s responsibility to mostly listen and observe how people experience the product, strongly from a user-interface perspective. They will ask you specific questions such as “What is the first thing on this screen your eyes are drawn to?”, “When I click on this button, what is your expectation of what will happen?”, “Does this popup meet your expectations?”. It was a surprising amount of fun.

 


Participation is rewarded with swag! You can expect some unique gifts for getting involved on the day. We don’t do it for the swag, but it’s of course appreciated 😉

GSS Leadership Session

At Redlogic, through our engagement with our main customer, we have enjoyed a very close working relationship with VMware GSS in Cork. We have weekly meetings to discuss open SRs, and have even been toured around personally by the Director of GSS in Cork. So every year at VMworld, it’s a pleasure to meet up with the GSS team in person and talk about the past year of support, the roadmap for our customer going forward, and any areas where things can improve. While previous years might have been spent talking about issues with NSX, we were pleased to talk about all the stuff that has now been fixed and how stable the VMware software stack is overall. Even if you have nothing to complain about, it’s good to give feedback and to emphasize and celebrate success together.

NSX Product UI Feedback and Preview session

There are not many companies that can claim to have worked with NSX for over 3 years. So our feedback is valued, and this is noticeable. I gave detailed and deep feedback on aspects of the distributed firewall UI and on the management of NSX Edge appliances, which we do a lot with.

Also got a preview of upcoming changes and ideas about the NSX UI, which was very cool.

The ability to give direct feedback, to talk one-on-one with product managers about the product and the roadmap, is in my opinion far more valuable than visiting breakout sessions (that you can watch later online anyway). I take giving feedback seriously and enjoy it, and VMware has an absolutely healthy attitude about feedback.

 

Lego

😉

 

VMware Technical Support Summit 2017, Global Support Services, Cork

Friday, May 26th, 2017

 

Last week Erwin and I had the opportunity to attend the VMware Technical Support Summit in Cork, Ireland.
This is a technically oriented two-day event hosted by GSS. There were many interesting sessions by some of the best GSS technical talent, and various breakout sessions to get near one-on-one time with engineers and product leaders.

 

As you can see from the schedule below, it covered a wide array of product fields, and the technical depth varied between good and amazing.

I was especially blown away by Valentin Bondzio’s talk about CPU accounting in the hypervisor. Technically extremely interesting!
He dove extremely deeply into CPU metrics and what ‘idle’ and ‘use’ really mean from an architecture point of view, and how hyperthreading changes the game.
This was especially gratifying as I have worked with him on a case for over a year that tackles exactly that aspect of hypervisor performance.

The team that supports AirWatch gave a very interesting talk that was relevant to a project I am working on. So afterwards I grabbed all 4 of them into a conference room at the hotel, to discuss our VDI and mobile management design ideas. Extremely valuable opportunity, as I bet I will be talking to these guys more in the future!

They also very graciously dropped Erwin and me off at our hotel afterwards, and then drove us to the city center, where VMware hosted drinks and dinner with live music, which was quite entertaining.

Another talk I was looking forward to was that of Cormac Hogan. Mark Fitzgerald, senior director of support in Cork, presented him with some gifts for just plain being around a long time 😉

 

I had gently badgered various VMware contacts about seeing if we could get a visit to the actual VMware campus, and eventually Mark Fitzgerald himself very graciously drove us over to the VMware office campus and gave us a personal guided tour of all the GSS offices, which was a great experience.

We got to meet every Cork GSS team and even visited the test lab, including the folks who run it. It was great to put faces to voices and names and to get a real sense of the environment that these engineers work in. Sometimes support engineers are faceless and nameless, but VMware seems to breathe a very human and supportive culture that was much in evidence throughout the campus.

 

 

I would very much like to thank the entire VMware Cork team and all the GSS engineers for putting on a great summit. And a special thank you to Danka for showing up in the middle of her time off, just to say hi. Much love to VMware’s best escalation manager!

 

 

Solaris 11 on ESX – Serialized Disk IO bug causes extreme performance degradation #vexpert

Wednesday, March 29th, 2017

In this post, I discuss a newly found performance bug in Solaris 11 that has, ever since Solaris 11 came out in 2011, severely hampered ESX VM disk I/O performance when using the LSI Logic SAS controller. I show how we identified the issue, what tools were used, and what the bug actually is.

In Short:

A bug in the ‘mpt_sas’ disk controller driver in Solaris 11, which is the driver used for the VMware virtual machine ‘LSI Logic SAS’ controller emulation, was causing disk I/O to be handled at most 3 I/Os at a time.

This causes severe disk I/O performance degradation on all versions of Solaris 11 up to the patched version. This was observed on Solaris 11 VMs on vSphere 5.5u2, but has not been tested on any other vSphere version.
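To get a feel for why such a low cap on outstanding I/Os hurts so much, Little’s Law gives a rough ceiling: throughput can be at most the number of I/Os in flight divided by the latency per I/O. The sketch below is a back-of-the-envelope illustration only. The 5 ms latency and the comparison queue depth of 32 are assumed values chosen for the example, not measurements from this case.

```python
def max_iops(outstanding_ios, latency_ms):
    """Upper bound on I/Os per second from Little's Law.

    throughput <= concurrency / latency, with latency given in milliseconds.
    """
    return outstanding_ios * 1000.0 / latency_ms


if __name__ == "__main__":
    assumed_latency_ms = 5.0   # assumption for illustration, not a measured value
    capped_depth = 3           # the limit described in this post
    comparison_depth = 32      # hypothetical queue depth, for comparison only

    for depth in (capped_depth, comparison_depth):
        print("{:2d} outstanding I/Os -> at most {:.0f} IOPS".format(
            depth, max_iops(depth, assumed_latency_ms)))
```

Under these assumed numbers the capped guest tops out at 600 IOPS against 6400 IOPS for the deeper queue, no matter how fast the storage behind it is.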

The issue was identified by Valentin Bondzio of VMware GSS and myself, together with our customer, and eventually Oracle. Tools used: iostat, esxtop, vscsiStats.

The issue was patched in patch# 25485763 for Solaris 11.3.17.5.0, and in Solaris 12.

Bug Report (Bug 24764515: Tagged command queuing disabled for SCSI-2 and SPC targets): https://pastebin.com/DhAgVp7s

Link to Oracle Internal

KB Article (Solaris 11 guest on VMware ESXI submit only one disk I/O at a time (Doc ID 2238101.1)): https://pastebin.com/hwhwiLRM

Link to Oracle Internal

————————

TLDR below:

(more…)