Well it’s been literally months since I last weighed my travel kit, but with my travel ramping up, up and away lately and the kit having undergone a pretty extensive transformation, I decided it was time for an update.  Last round the weight came in at an amazing 8.75lbs, still significantly up from the incredible 6.3lbs of “back in the day” (my 2007 kit).  One change is that this round I also opted to weigh the actual laptop case.  In the past I was using primarily ultralight messenger bags, but these days I’m using a backpack.  On to the gory details!

First the numbers:

  1. Galaxy Note 10 LTE 2014 with Samsung cover                       1lb 8oz  (replacing the iPad Air)
  2. Nokia 1520 LTE with Nokia cover                                            12oz (replacing the Nexus 7 which actually died)
  3. Razer Blade 14 2014                                                            4lb 5oz (replacing the Retina 15)
  4. Razer Blade power supply                                                    1lb 2oz (the Achilles heel of even high end PC laptops)
  5. iPhone 5s with Mophie Juice Pack                                            8oz (the only survivor!)
  6. 2TB portable Toshiba USB 3 drive                                           8oz
  7. Accessories stack:                                                                 4lb
    1. retractable ethernet, HDMI, USB (x5), audio
    2. HDMI to VGA
    3. USB 3 hub
    4. USB multi-cable/charger (VMware swag)
    5. mini power strip with USB charging
    6. high AMP USB charger
    7. multi port USB charger
    8. Verizon LTE hotspot
    9. AT&T LTE hotspot
    10. ZyXEL travel router
    11. Plantronics USB collapsible headset
    12. Razer Orochi Mouse
    13. Logitech Presentation remote
    14. Lenmar Powerwave 6600 battery/USB charger
    15. Audio Technica headset
    16. 256GB USB thumb drive
    17. 64GB USB thumb drive
  8. VMware solar panel equipped backpack                                 2lb 3oz

Total weight: 14lb 14oz

WOW!  Now that is some serious weight right?  Basically the current kit is as heavy as the past two kits combined!  Certainly not a good development with travel becoming more intense and more difficult, but there are a few clarifying points here.  First, in the past I hadn’t been weighing the actual bag, so that adds a pound or two onto the old totals (call them 10 and 7.5).  That’s still a sizable weight increase, but there are lots of extra accessories these days.  The more time you spend on the road, the more you want to really be ready for a wide range of contingencies.  So the multiple hotspots and audio devices, the portable drives and the battery backups are all new.

So netting this out, it’s a lot more weight, but a ton more capability.  The more shocking change, perhaps, is the turnover in gear composition.  Last entry I pointed out that with cloud services and cloud based data, mature file formats and more multi platform apps, and mobile devices driving commoditization, moving between platforms is pretty low friction so a move to OSX isn’t as jarring as it once was.  Well this cuts both ways and a move back is just as easy.  I documented recently the strange issues I had with the Mac which seemed to sort themselves out, but I do see an occasional odd system halt.  With the Mac coming up on two years old, and being something less than fully stable, I decided it was time to relegate it to home duty and get a new kit for the road.  The mid-life refresh (Haswell/GT750) isn’t super exciting, but the new Razer really is!  With a 3200×1800 screen and a GTX870, paired with the expected quad Haswell 2.2/3.2, 8GB RAM and fast 256GB SSD, it packs a solid punch for work or gaming.  The screen quality is phenomenal and it’s touch based, the build quality is excellent (near Mac level), and the entire thing is almost a slightly smaller, tiny bit lighter, black version of the MacBook.  Except of course for the Windows part.  Which admittedly, is a mixed bag.  I can probably say that I have something of a love/hate with Windows 8.1.  I’ll save that for another entry though.

On the tablet front my Nexus 7 just up and died (not uncommon) so I decided to replace it with Windows Phone in order to have exposure and access to all three platforms (useful for my job).  Because I really do like handwritten notes (one thing I missed about both my old Windows tablets and my recent Galaxy Note 2 phone) I decided to sell the iPad Air in favor of the Galaxy Note 10 2014.  The new Note 10 is also a screen triumph (2560×1600), and has plenty of horsepower, but as always I have a similar love/hate with Android (even KitKat) as I do with Windows 8.1.  Google has joined Microsoft as having a lot to learn from Apple about consistent and fluid UX/UI design.

So while Redmond had no representation last round, this time they have come roaring back with two significant devices in the mix!  Maybe next time we’ll have a Chromebook!  Honestly though, I have to say that we still have a long way to go in terms of ecosystem integration maturity.  I feel there isn’t nearly enough “payoff” for fully committing to a single vendor story (even Apple).  There is of course some settings and data sync tied to a universal ID from all three (Google/Apple/MSFT), and there is the consistent tablet/phone app story if you opt for mobile device vendor redundancy, but I’d like to see a lot more.  Maybe things will improve with the next generation of service/OS releases.

That’s it for now, but here are some parting shots of the big weigh-in!

2014-08-17 23.07.49 2014-08-17 23.12.09


I recently decided to upgrade the ReadyNAS Ultra to make room for some new storage requirements.  The ReadyNAS remains a surprisingly powerful and flexible device so things went well overall, but there was some weirdness that is worth documenting in case others run into it as well and are wondering if it is normal or will cause issues.  To review, my current ReadyNAS Ultra is configured as follows:

 

  • Ultra 6
  • RAIDiator-x86 4.2.26
  • All bays full 6 x 2TB (mix of WD and Seagate “green” series drives)
  • X-RAID2, single redundancy mode

For anyone not aware, X-RAID is a clever protection scheme which brings an added layer of flexibility while still providing standard RAID protection.  As a quick primer, the benefits are:

  • allows a mixture of disk sizes
  • allows for dynamic expansion of an array
  • provides single disk redundancy (RAID 5 analog) or dual disk redundancy (RAID 6 analog) while maintaining the above benefits

Some caveats to X-RAID are:

  • volume can only grow by 8TB from its original size (for all of these caveats it is usable space being measured, not raw… so measure post-protection volume size)
  • volume cannot be larger than 16TB without initiating a factory reset.  So for example, if you started with a 10TB volume, even though technically you could go to 18TB without violating the “no larger than +8TB from inception” rule, you would be stopped because the final volume would be larger than 16TB.  Factory reset is data destructive so, while not a show stopper, this restriction is definitely one to watch as it can turn a simple expansion project into a very complex one requiring full backup/restore and a companion device that can absorb a potentially massive amount of data
  • drives are grouped into layers by size.  What this means is that if you add 4TB drives in as replacements for 2TB drives, you create a 4TB disk layer and a 2TB disk layer that are, transparently to you, contributing to a single virtual volume.  The minimum required disk count per spindle size is 2 in order to retain protection.  Disks are replaced one at a time.  So in the case of six 2TB drives, one is replaced by a 4TB.  Once it is synced, a second has to be replaced before the array is protected again.  Once two 4TB drives are part of the volume, protection will be in sync again.  At this point the space sacrificed to volume protection will jump from a single 2TB spindle to one of the 4TB spindles (space inefficient, but at least it allows mixing).  In dual disk redundancy modes, the spindle counts are doubled.
  • drive sizes can only go up, not down.  So if you jump from 2TB disks to a mix of 2TB and 4TB, you can no longer add a 3TB disk
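
Since all of these limits are measured in usable (post-protection) terms, it’s worth sanity checking an expansion plan with some quick math before buying disks.  Here is a rough back-of-the-envelope shell sketch (my own, not anything Netgear provides) that sums a disk layout, subtracts the largest spindle for single redundancy, and flags the two limits above.  The disk lists are placeholders (usable GB per disk); swap in your own sizes:

#!/bin/sh
# Very rough X-RAID2 (single redundancy) expansion sanity check.
# Ignores the per-size layering and the ~100GB snapshot reservation.
ORIGINAL="1875 1875 1875 1875 1875 1875"   # per-disk usable GB when the volume was first built
PLANNED="1875 1875 1875 1875 3750 3750"    # per-disk usable GB after the planned swaps

usable() {
  # total of all disks minus the largest one (the RAID 5 analog)
  echo $* | tr ' ' '\n' | sort -n | awk '{ sum += $1; max = $1 } END { print sum - max }'
}

ORIG_GB=`usable $ORIGINAL`
NEW_GB=`usable $PLANNED`
echo "original usable: ${ORIG_GB}GB, planned usable: ${NEW_GB}GB"
[ `expr $NEW_GB - $ORIG_GB` -gt 8000 ] && echo "WARNING: grows more than 8TB from inception"
[ $NEW_GB -gt 16000 ] && echo "WARNING: final volume larger than 16TB (factory reset territory)"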

With all of the above in mind, I decided to move forward with the 2TB to 4TB scenario, replacing 2 of my Seagate disks with the new Hitachi HST efficiency series 5400 RPM 4TB disks.  Following the process prescribed by Netgear, things went well.  I pulled one of the 2TB disks, swapped the 4TB into the carrier (this took longer than the required 10 seconds you need to wait before swapping back in) and then installed the 4TB disk into the array.  The Netgear immediately flipped to “unprotected” and “disk fault” upon removal of the 2TB disk and switched over to “resyncing array” about 5 minutes following installation of the 4TB disk.  The first resync took 26 hours.  This is on a 9.25TB usable array which was about 33% full.  After resync, I did a reboot just for good measure.
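
As an aside, X-RAID2 is built on standard Linux md under the hood, so if you have root SSH access to the ReadyNAS you can watch the resync progress from the console instead of refreshing the GUI.  A minimal sketch, assuming an SSH session on the NAS:

# show md resync progress and estimated time remaining
cat /proc/mdstat
# or poll it every minute (if watch is available on your firmware)
watch -n 60 cat /proc/mdstat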

After reboot I repeated the procedure with the second disk.  This time the process took about 18 hours.  So things improved which is good!  Upon completion of this resync, the array status flipped back to “protected”, but each of the 4TB disks was only utilized at the 2TB level (1875GB usable).  This is because the 4TB disks were added into the 2TB layer as 2TB disks.   At this point, a reboot is required in order to get the array to actually resize.  Following this reboot, the status of the ReadyNAS switched to “array expansion” and the GUI started updating progress against the eventual target size.  This is where things got weird:

Screenshot 2014-08-06 16.27.40

As you can see, the GUI was reporting that the new size would be 10TB usable – a mere 750GB up from the old array size of 9.25TB.  Some quick math shows that either the GUI is reporting incorrectly, or something is genuinely wrong:

ORIGINAL ARRAY:

  • 6 x 2TB = 12TB RAW
  • 6 x 1875GB usable = 11,250GB usable
  • 1 disk sacrificed to protection = 9375GB usable
  • 100GB for snapshot storage = 9275GB usable as expected

Now let’s consider the new array:

  • 4 x 2TB = 8TB RAW
  • 2 x 4TB = 8TB RAW
  • 4 x 1875GB usable = 7,500GB usable
  • 2 x 3750GB usable = 7,500GB usable
  • Total RAW = 16TB, Total usable = 15TB
  • 1 4TB disk sacrificed to protection = 11,250GB
  • 100GB for snapshot storage = 11,150GB usable

This is pretty far off from the flat 10TB being reported.  Binary/decimal translation aside (10TB vs 10TiB), we’re looking at over 1TB “missing”.  So what gives?  Well before panicking, I decided to have a look at the console.  Check out what a quick df in Linux reported:

Screenshot 2014-08-06 16.30.08

 

Ah ha!  11,645,691,704 1K blocks so, in other words, 11.6TB!  Much better.  The good news is that as I copy about 5TB up to the array, df is, as expected, reporting spot-on accurate usage whereas the GUI is staying very fuzzy and very wrong.  The conclusion?  Something is up with the GUI post expansion (and post reboot, as I rebooted twice to attempt to remediate).
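
For anyone who wants to reproduce the check, here is the console side of it as a minimal sketch.  It assumes root SSH is enabled and that the data volume is mounted at /c (which it was on my RAIDiator build; adjust the path if yours differs):

# report the data volume size in 1K blocks and (roughly) decimal TB
df -k /c | tail -1 | awk '{ printf "%s 1K blocks ~ %.1f TB\n", $2, $2/1000000000 }'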

So some final notes:

  • be mindful when expanding of the 8TB and 16TB limits
  • note that the minimum spindle count at the new size to maintain protection is 2, and that you will sacrifice a full one of those to reach the new protection size requirement
  • reboot as many times as you want during resync.  It won’t cause any issue
  • do not reboot during expansion as it might cause an issue
  • expect that the GUI might not report size correctly

This is where things stand so far.  As the situation develops or changes, I will update!

Sync My Clouds!

Posted: July 29, 2014 in Computers and Internet

As cloud services mature, one of the trickiest problems is definitely data sprawl. Issues of rationalization and migration of data become a challenge as information spreads across multiple services. If you consider music as an example, it is definitely possible to end up with a collection that spans Amazon Music, Google Music and iTunes. One of the only real ways to keep those particular services synchronized is to source them from a common distribution point, preferably living on a pure storage service. Of course depending on the size of your collection, this can require a fairly significant investment in cloud storage. In recent months, though, there has been an incredible land grab for consumer business that has seen rates for storage drop dramatically. Currently, this is how my personal spend/GB looks:

 

Service        Base Storage (subscription tier)   Extra Storage (bonus, referral, etc)   Monthly Cost   Note
DropBox        100GB                              7GB                                    $10
OneDrive       1,020GB                            10GB                                   $11            Office365 Home Sub – lots more than just storage in here – plus 20GB base storage
Google Drive   100GB                              16GB                                   $2             Includes Gmail and Google+

Pretty impressive! Tallying things up, we’re looking at a total spend of $23 which provides:

  • 1253GB storage across 3 providers
  • Office 365 access (mail, SharePoint, Office Web Applications)
  • Office local install for Mac, PC, Android, iOS (multiple machines)
  • Live Mail, GMail, Google Plus
  • Desktop/device integration for all providers

To me this seemed like a fantastic deal: for less than $25 a month, 1.24TB in the cloud is a ton of storage.  As a result, over the past few months, I have been shifting to a cloud only model for data storage.  The way I decided to run things was to make DropBox my primary storage service.  Despite having by far the worst economics (ironically DropBox has become ridiculously expensive compared to the competition), it has the best client integration experience, a result (IMO) of the service’s maturity.
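
To put some numbers behind that “worst economics” comment, here is the rough per-GB math on the table above (my own figures pulled straight from that table, decimal GB, nothing official):

# effective monthly cost per GB (base + extra storage)
awk 'BEGIN {
  printf "DropBox:      $%.3f/GB\n", 10 / (100 + 7)
  printf "OneDrive:     $%.3f/GB\n", 11 / (1020 + 10)
  printf "Google Drive: $%.3f/GB\n",  2 / (100 + 16)
}'

That works out to roughly 9 cents per GB on DropBox versus 1 to 2 cents on the other two, which is exactly why DropBox keeps the primary slot on integration rather than economics.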

So with DropBox in the prime spot, the next challenge was figuring out a plan for the secondary services.  At first I tried a model where I would assign use cases to each service.  So music in Google only, pictures on OneDrive only, documents across all 3.  This quickly fell apart as you wind up in a model where you need to selectively sync the secondary services, and you lose redundancy for some key use cases.  In analyzing my total usage pattern though, I found that as a high watermark I consume 75GB of space in the cloud (including documents, photos and music).  With the current $/GB rates, this data volume can easily fit in all 3 providers.  Realizing this I quickly moved to a hub/spoke sync model where I utilize OneDrive and Google Drive for backup/redundancy and DropBox becomes the master.  Of course the logistics of this proved very challenging, having to utilize a middle-man client to funnel the data around.  There had to be a better way. Wasn’t this a great idea for a startup? Well… Enter CloudHQ!

CloudHQ aims to provide a solution to the monumental task of cloud data sync.  As a premise it sounds amazing!  Just register with these guys, add your services, create some pairings, and let their workflow (and pipes) do the rest.  I’ve been tracking these guys for a while and it appears they are delivering. Of course the challenge is that to do meaningful work (more than one pairing) you need to pony up to the commercial level.  I held off a while to see how their service would mature.  Recently, though, they had a price drop that I feel represents a fantastic deal.  I was able to get onboard with the Premium level subscription for $119 by committing to 1 year. $10 a month is just a terrific price for a service like this so hopefully this price will lock in moving forward.  Of course the service does have to work or it’s not such a great price right?  Well let’s see how things went!

First off… The sign-up and setup process was fantastic.  I actually went through the entire setup on an iPhone over lunch using my Google OpenID as a login.  Once signed up you can jump right in and get started.  Here is a shot of the basic mobile UI:

2014-07-29 17.05.00

 

I love how clean this is. Very clear how you can get started creating sync pairs using the supported named services.  Clicking one of those options will trigger a guided workflow.  In addition, you can setup your own sync pairs manually.  Either option brings you to service registration:

2014-07-29 17.06.22

CloudHQ currently supports a very nice set of services.  Supported services view from the desktop UI:

Screenshot 2014-07-29 21.38.15

 

Once services are registered and sync pairs created, the service will start to run in a lights-out fashion.  Updates are emailed daily and a final update message goes out once initial sync is completed.  The stages break down as follows:

  • Initial indexing and metadata population
  • Service sync (bidirectional)
  • Initial seeding complete
  • Incremental sync process runs indefinitely

In my case, there was about 75GB of data in play.  The biggest share was on DropBox and there was a stale copy of some of the DropBox data already sitting on both OneDrive and Google Drive.  In addition, there was a batch of data on both OneDrive and Google Drive that did not exist on DropBox.  The breakdown was roughly as follows:

  • DropBox – 56GB or so of pictures, documents and video
  • OneDrive – subset of DropBox content, roughly 5GB of picture data and 3GB of eBooks
  • Google Drive – subset of DropBox content, roughly 12GB of music and 5GB of picture data

The picture data was largely duplicated.  In approximate numbers, about 40GB had to flow in to OneDrive and Google Drive and about 15GB had to flow into DropBox.  Keeping an eye on sync status in the UI is terrific:

2014-07-29 17.06.35

 

In the desktop UI, there is great detail:

Screenshot 2014-07-29 22.20.02

 

The email updates are great.  Here is a sample of the initial email:

Screenshot 2014-07-29 22.25.24

These updates are very straightforward and will come daily.  The pair, and transfer activity for the pair, is represented.  In addition, there is a weekly report which provides a rollup summary:

Screenshot 2014-07-29 22.26.07

So how did the service do?  Quite well actually.  Here is my experience in terms of performance:

  • Account Created, services registered, pairs added:                                                           7/26 – 12:30PM
  • Indexing and initial metadata population complete, Evernote backup complete:     7/26 – 9:52PM
  • DropBox to GMail Complete, DropBox to OneDrive partial – 63GB copied:               7/29 – 10:30PM

No conflicts occurred and there have been no problems with any of the attached volumes.  I have to say I am extremely impressed with CloudHQ so far and pushing 63GB of bits around in a matter of 3 days is a fantastic “time to sync state”.

As my experience with the service increases I will continue to post updates, so stay tuned!

Upgrades!

Posted: July 12, 2014 in Computers and Internet

Well there is truly no rest for the weary. Or is it the wicked? Let’s compromise and say in this case it’s both! It’s no surprise that even a really sweet piece of kit like the Dell T620 isn’t going to stay stock for long at ComplaintsHQ where “live to mod” is a life motto. Luckily the recent generosity of family members wise enough to provide MicroCenter gift cards as presents provided just the excuse required to get some new parts.

It was hot on the heels of the initial install of the Dell that we added an SSD for VSAN testing and two ATI cards for vDGA View testing. Honestly though, vDGA isn’t cool. You know what’s cool? vSGA! For those saying “uh, what?”, both of these are technologies which allow a hardware GPU installed in the host to be surfaced in the guest OS (View desktops generally). With vDGA, a single GPU is dedicated to a single guest OS via Intel VT-d or AMD-Vi (IO MMU remap/directed IO technologies which allow a guest OS to directly access host hardware). This does work, but obviously isn’t very scalable nor is it a particularly elegant virtualization solution. vSGA, on the other hand, allows for a GPU installed in the host to be virtualized and shared. The downside is that there is a (very) short list of supported boards, none of which I had on the shelf. The last item on the “to do” list from the initial setup was to get some sort of automated UPS driven shutdown of the guests and host in the (likely around here) event of power failure.

The current status to date (prior to the new upgrades) was that I had an old Intel X25 80GB SSD successfully installed and shared to the nested ESXi hosts (and successfully recognized as SSD) and vSAN installed and running. I also had a View config set up with a small amount of SSD allocated for temporary storage. With aspirations of testing both vSAN and running View, 80GB of SSD really is tight, so beyond saying “OK, it works!” not much could actually be done with this setup. Since SSDs are cheap and getting cheaper, I decided to grab this guy on super sale at MicroCenter for $99:

2014-07-12 15.52.02

While there I also picked up a small carrier to mount both SSDs in. I decided to also utilize some rails and mount the SSDs properly in one of the available 5.25 bays:

2014-07-12 16.00.03

The vSGA situation is certainly trickier than simply adding a budget SSD, but perusing eBay the other day, I happened upon a great find so, since I was upgrading anyhow, I jumped on it. Not only one of the few supported cards, but an actual Dell OEM variant for $225:

quadro4000

 

Another refinement I’ve been wanting to do to the server is to add power supply redundancy (mainly because I can leave no bay unfilled!).  I’ve committed to definitely resolving my UPS driven auto-shutdown challenge this round, so while not necessary, the redundant supply fits the theme well.  Luckily eBay yielded some more good results.  Dell OEM at $145:

2014-07-12 14.32.23

On the UPS side, you may remember that during the initial install of the server I had added in a BackUPS 1500 to run the ReadyNAS and the T620.  Unfortunately,  APC is a pain in the ass and VMware doesn’t make it any better.  Getting the ReadyNAS on managed UPS backup is as easy as plugging the USB cable in and clicking a checkbox using any APC unit.  In VMware, this is pretty much impossible.  Unless you buy not only the highest end of the SmartUPS line, but also buy the optional UPS network card (hundreds more), there is really no native support to be found.  I had explored some options using USB passthrough from the host to a Windows guest, combined with some great open source tools like apcupsd and Network UPS Tools.  I never quite got things working the way I wanted though.  More on that later…

OK, so that is the part list!  Total damage for all of the above was $900.  Steep, but almost half of it was actually the UPS.  As always, there is no better way to start healing from the emotional trauma of spending money than to start installing!  Let’s begin with the super easy stuff: the PSU.  I can honestly say that installing a new hot-swap supply in a T620 actually couldn’t be any easier.  First step is to access the back of the case and pop off the PSU bay cover (it pops right out):

2014-07-12 16.02.19

With the bay open, you literally just slide the new supply in and push gently (you will feel the connector catch and seat):

2014-07-12 16.03.06

Once installed, head into iDRAC to complete the power supply reconfiguration.  The options are very basic.  You can either enable or disable PSU hot sparing once the new one is in (and set which one is primary) and you can enable input power redundancy:

Screenshot 2014-07-12 18.28.55

OK, back to the UPS quandary! The general idea of VM based UPS control is as follows:

  • plug in UPS, plug server into UPS
  • attach UPS USB cable to server
  • enable passthrough for the USB channel (requires AMD-Vi or Intel VT-d, under Advanced Options in the Server Configuration in the VIM client)
  • add the USB device to a Windows (or Linux) guest VM
  • install the open source APC driver
  • install NUT
  • develop a script that fires off scripts on the ESX host prior to executing a VM shutdown (the host scripts will ultimately pull the rug out from under the UPS host VM which is fine)
  • make sure that VMware tools is installed in all VMs so they can be gracefully shutdown by the host
  • utilize either WOL (or an awesome ILO board like the iDRAC) to ensure that the server can be remotely brought back
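
For the WOL route, a one-liner from any Linux box on the LAN will bring the server back once power is restored.  This is just a sketch: it assumes the wakeonlan utility is installed, that AA:BB:CC:DD:EE:FF stands in for the server NIC’s MAC address, and that wake-on-LAN is enabled in the BIOS/NIC settings:

# send a WOL magic packet to the server's NIC
wakeonlan AA:BB:CC:DD:EE:FF
# or, using etherwake instead (needs root and the right interface)
etherwake -i eth0 AA:BB:CC:DD:EE:FF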

Since I was in a spending mood, I decided to add a companion to my BackUPS 1500 just for the server.  Here she is:

2014-07-12 19.49.55

That is the SmartUPS 1000 2RU rack mount version.  So problem solved right?  Yeah no.  But before we get into that, let’s get this beast set up.  First the batteries have to be installed.  The front bezel pops off (it actually comes off and I popped it in for this photo) revealing a removable panel:

2014-07-12 19.49.36

A single thumb screw holds the panel in place.  Removing it allows the panel to be slid left and pulled forward revealing the battery compartment.  As always, the battery is pulled out by the plastic tabs, flipped over, and put back in where it will now snap into place (its own weight is enough to really seat it well if the unit is a bit angled).  The final product will look like this:

2014-07-12 19.49.02

In terms of connectivity, here is what you get (not joking):

2014-07-12 19.50.15

Yes, this is *one* USB cable and that’s *it* for $450!

Now, let’s take a look at what APC requires for VMware host support:

  • a SmartUPS unit – check, we have this one
  • the optional network card – bzzzt… nope
  • serial only connection to the host – bzzzt… nope! (THIS one really pissed me off)

So somehow APC can’t figure out how to get a USB connected UPS working on ESXi, and the latest SmartUPS somehow has no included serial cable.  Really fantastic!  I considered a few options including attempting to do a DB9 to USB conversion using the RJ45 to USB cable from my lesser BackUPS 750, but I shot all of the options down.  USB to serial requires driver support and there is zero chance of getting that working on the host.   Some of the other options I considered were publishing serial over network, but this seemed like a poor approach also.  At this point, I was stumped and seriously considering returning the seemingly useless SmartUPS to MicroCenter.  Before packing it in, I decided to try one more approach.

Returning to the basic architecture I had planned for the BackUPS, but this time using the native PowerChute Business app included with the SmartUPS (at least it comes with something useful!), I set up UPS support on my vCenter.  Passing through USB from the host worked, and the PowerChute server, console and agent installed without a hitch and successfully located the UPS.  So far so good!

The critical step was now to figure out a way to get the vCenter guest to shutdown all of the VMs and the server once PowerChute detected a power event.  Luckily, it wasn’t too difficult and I was able to find this awesome script to handle the ESX side.  Here is the logic:

  • add a custom command in PowerChute.  The custom command calls Putty from the command line with the option to run a script on the host upon connection.  The command is inserted into “batchfile_name.cmd” in the APC\agents\commandfiles directory and should be formatted like this:
@SMART "" "C:\Program Files (x86)\putty\putty.exe" -ssh -l login -pw password -m C:\script.sh
  • the contents of “script.sh” is that amazing script above.  The gist of it is:
    • use the ESX command line tools to enumerate all running VM’s to a temp file (basic string processing on the output of a -list)
    • pipe that file into a looped command to shut them down (a for or while loop construct)
    • shutdown the host

Here are the contents of the script:

#!/bin/sh
# Enumerate all registered VM IDs (strip the header row from the listing)
VMS=`vim-cmd vmsvc/getallvms | grep -v Vmid | awk '{print $1}'`

# Pass 1: kick off a suspend for every VM that is currently powered on
for VM in $VMS ; do
  PWR=`vim-cmd vmsvc/power.getstate $VM | grep -v "Retrieved runtime info"`
  if [ "$PWR" = "Powered on" ] ; then
    name=`vim-cmd vmsvc/get.config $VM | grep -i "name =" | awk '{print $3}' | head -1 | cut -d "\"" -f2`
    echo "Powered on: $name"
    echo "Suspending: $name"
    vim-cmd vmsvc/power.suspend $VM > /dev/null &
  fi
done

# Pass 2: poll until nothing is left in the "Powered on" state
while true ; do
  RUNNING=0
  for VM in $VMS ; do
    PWR=`vim-cmd vmsvc/power.getstate $VM | grep -v "Retrieved runtime info"`
    if [ "$PWR" = "Powered on" ] ; then
      echo "Waiting..."
      RUNNING=1
    fi
  done
  if [ $RUNNING -eq 0 ] ; then
    echo "Gone..."
    break
  fi
  sleep 1
done

# Finally, put the host itself into standby
echo "Now we suspend the Host..."
vim-cmd hostsvc/standby_mode_enter

I am happy to say that it worked like a charm and successfully shut down all VMs cleanly and brought down the host!  You can set some delays in PowerChute and I set them to 8 minutes for the OS shutdown and 8 minutes as the time required for the custom command to run, but it really won’t matter since the custom command will kill the VM (and PowerChute) anyhow.

A couple of things to be aware of with this approach:

  • the PCBE Agent Service needs “interact with desktop” checked on newer versions of Windows (2k8+).  Make sure to run the SSH client once outside of the script first to deal with any interaction it needs to do (saving fingerprint, etc)
  • the USB passthrough can be a bit flaky in that the USB device doesn’t seem to be available right at first OS boot (so the service may not see the UPS).  Eventually it does refresh and catch up on its own, however

Coming up soon will be the Quadro install and the SSD setup, followed by some (finally) notes on VSAN and accelerated View (both vDGA and vSGA), so stay tuned!


The VMware NGC client is definitely super convenient being entirely browser based, but the legacy client undoubtedly had its charms. Chief among those charms is the ability to manage an actual ESXi host rather than just a vCenter instance. Except on a Mac where it doesn’t work at all. Admittedly this isn’t a huge issue for production where vCenter will be highly available and the admin console is unlikely to be a Mac, but in a home lab, it becomes a huge issue. The solution? Enter WineBottler!

For those not familiar, WINE is a recursive acronym that stands for “Wine Is Not an Emulator”. It dates back to the early days of Linux (1993) and the idea is to provide a containerized Windows OS/API experience on *NIX systems. In a very real way WINE is one of the earliest runs at application virtualization. It’s an extremely nifty idea but, as with all cross-platform “unofficial” app virtualization technologies, it is not 100% effective. The VIM client falls into the edge cases that require some tweaking to get to work. The good news, though, is that it can be done:

Screenshot 2014-07-11 04.37.44

OK, with the proof of life out of the way, let’s walk through exactly what it takes to get this thing working step-by-step.  Note that it will not work straight out of the box.  It will fail and need to be remediated.

Step 1: Download and install WineBottler.  This article is based on the (at time of publication) current stable release 1.6.1.

Step 2: With WineBottler installed, download the MSXML Framework version 3.0 and copy it into the “Winetricks” folder (/Users/username/.cache/winetricks/msxml3).  “Winetricks” are component installs that Wine can inject into the container during packaging (middleware, support packages, etc).  VIM requires .NET 3.5 SP1 which WineBottler has standard, but also requires MSXML version 3.0 which it does not.  The first pass through packaging will generate an error if this step isn’t completed, but the errors are extremely helpful and will provide both a download link for the missing package and the path to copy it to (so no fear if you miss this step).
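
If you would rather stage that file from Terminal, the copy boils down to the following sketch.  The installer filename shown here is an assumption; if yours differs, the WineBottler error dialogue will tell you the exact name and path it expects:

# create the winetricks cache folder and drop the MSXML 3.0 installer into it
mkdir -p "$HOME/.cache/winetricks/msxml3"
cp ~/Downloads/msxml3.msi "$HOME/.cache/winetricks/msxml3/"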

Step 3: We’re now ready to bottle up some WINE!  Launch the WineBottler app and click the “Advanced” tab:

Screenshot 2014-07-11 10.42.58

Lots to explain here, so let’s take it one component at a time.

Prefix Template:  this option refers to the actual app container (the virtual environment that WineBottler creates during this sequencing step for the application).  This can be either a new container, or based on a previously created one.  For now we are creating a new template, but later we will be reusing it.

Program to Install: this is the application we are virtualizing.  In our case, at this stage, we want the actual VIM install package (VMware-viclient-all-5.5.0-1281650.exe) which can be downloaded directly from the host at https://esxi-hostname.  This is an installer, so we want to select that option.  Later on we will be repeating this with the actual app, but for now we are going to use the installer to lay the groundwork.

Winetricks: as discussed, these are optional component installs.  Here we want to check “.NET 3.5 SP1”.

Native DLL Overrides:  as the name implies, this powerful option gives us the ability to supplement any standard Windows DLL with an out-of-band version we would include here.  Huge potential with this one, but we do not need it for our purposes.

Bundle:  another powerful option, this gives us the ability to create a stand alone WINE container app.  With this option, the OSX app file created could be copied over to another machine and run without having to install WINE.

Runtime Options, Version, Identifier, Codesign Identity:  these are our important packaging options.  Runtime as implied allows us to tweak settings at time of packaging.  None required for our case here.  Version is an admin process option that allows you to version your containers.  Identifier is extremely important because the container path in the OSX filesystem will be named using the Identifier as a prefix, so use a name that makes sense and make a note of it.  I used “com.vmware.vim”.  Codesign Identity is also an admin process field allowing for providing validation of the package via unique identifier.

Silent Install:  allows you to run silent for most of the install (WINE will “auto-click” through the installers).  I left this unchecked.

Once you have checked off .NET 3.5 SP1 Winetrick and assigned an Identifier, click “Install”.  You will be asked to provide a name and location for the OSX app that will be created by the sequencing process:

Screenshot 2014-07-11 10.59.23

 

Step 4: walk through the install.  The install will now kick off in a partially unattended fashion, so watch for the dialogue prompts.  If the overall sequencer Install progress bar stalls, there is a good chance a minimized Windows installer is waiting for input:

Screenshot 2014-07-11 10.59.36

The Windows installer bits will look familiar and will be the base versions of .NET that WINE wants, the .NET 3.5 SP1 option that we selected, and the MSXML 3.0 package that is required.  The process will kick off with .NET 2.0:

Screenshot 2014-07-11 10.59.58 Screenshot 2014-07-11 11.00.16

You’ll have to click “Finish” as each step completes and at times (during .NET 3.0), the installer will go silent or will act strangely (flashing focus on and off as it rapidly cycles through dialogues unattended).  At times you may need to pull focus back to keep things moving.  Once the .NET 2.0 setup is done, you will get a Windows “restart” prompt.  Weird I know, but definitely perform this step:

Screenshot 2014-07-11 11.10.51

During the XPS Essentials pack installation (part of base WINE package) you will also be prompted about component registration.  Go ahead and register:

Screenshot 2014-07-11 11.12.42

The XML Parser component install (part of base WINE package) will require user registration.  Go ahead and complete it:

Screenshot 2014-07-11 11.14.25

 

.NET 2.0 SP2 will require another restart. Go ahead and do that:

Screenshot 2014-07-11 11.20.34

 

 

With all of the pre-requisites finally out of the way, the core VIM install will extract and kick off:

Screenshot 2014-07-11 11.21.47

You will see the VIM Installer warning about XP.  You can ignore this.  I was able to connect to vCenter without issue:

Screenshot 2014-07-11 11.22.40

The install will now look and feel normal for a bit:

Screenshot 2014-07-11 11.24.22

Until… dum dum duuuuuuuum.  This happens:

hcmon error picture

HCMON is the USB driver for the VMRC remote console (a super awesome VMware feature).  Long story short, for whatever reason, it doesn’t work in WINE.  Have no fear though, this entry is all about getting this working (minus the console capability, sorry!).  Do not OK this dialogue box.  Pause here.

Step 5:  once we acknowledge that dialogue, the installer will roll back and delete the installation which is currently being held in temp storage by WineBottler.  We want to grab that before this happens and put it somewhere safe.  So before clicking OK, go over to /tmp/winebottler_1405091227/nospace/wineprefix/drive_c/Program Files/VMware.  Copy the entire “Infrastructure” folder and paste it somewhere safe, then rename it:

Screenshot 2014-07-11 11.34.11

I dropped it into my Documents folder and renamed it “VMW”.  What we are looking for is to make sure that “Infrastructure/Virtual Infrastructure Client” is fully populated:

Screenshot 2014-07-11 11.36.24
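
If you prefer to do the grab from Terminal (and you will want to be quick about it), something along these lines works.  Treat it as a sketch: the winebottler_* temp folder name changes per run, so the wildcard is doing the work here:

# stash the half-finished install before the rollback deletes it
mkdir -p ~/Documents/VMW
cp -R /tmp/winebottler_*/nospace/wineprefix/drive_c/"Program Files"/VMware/Infrastructure ~/Documents/VMW/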

We can now click “OK” to the HCMON error and allow the installer to roll back and WineBottler to complete.  It will ask us to select a Startfile.  There is no good option here since our installer actually didn’t finish correctly (WineBottler doesn’t actually know this).  It doesn’t matter what we select as we just want to get a completed install, so go ahead and select “WineFile”:

Screenshot 2014-07-11 11.39.09

 

This dialogue will complete this step:

Screenshot 2014-07-11 11.40.31

 

Step 6:  At this stage, we do not have a working install.  What we do have is a usable template on which we can build a working install.   First go ahead and launch the app (the shortcut will be where the container was saved in step 4).  Nothing will happen since there is no app, but the environment will be prepared.  This is the important piece.  The next step is to go back into WineBottler, and run a new sequencing, but with the options slightly changed:

Note, we are now selecting the newly created environment as the template (/Applications/VIM Client.app/Contents/Resources in my case).  For our “Program to Install”, we are now selecting: /path to saved client files/Infrastructure/Virtual Infrastructure Client/Launcher/VpxClient.exe and we are letting WineBottler know that this is the actual program and that it should copy the entire folder contents to the container.  We can now go ahead and click Install (it will be quicker this time).  At the end of this install, be sure to select VpxClient.exe as the “startup program” before completing.

Step 7: unfortunately, we’re not done yet!  The last step is to do some manual copying since the container will still not be prepared quite right.  Once again, copy the “Infrastructure” hierarchy.  Head over to /Users/username/Library/Application Support/ and find your WineBottler container folder (com.vmware.vim_UUID in my case).  Navigate to drive_c/Program Files/VMware and paste Infrastructure over the existing file structure.
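
The same copy can be done from Terminal.  Again just a sketch, with the container folder globbed since the UUID suffix will differ on your machine (and assuming the stash from step 5 lives in ~/Documents/VMW):

# overlay the saved client files onto the new container
cp -R ~/Documents/VMW/Infrastructure ~/Library/"Application Support"/com.vmware.vim_*/drive_c/"Program Files"/VMware/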

With this step you should be complete!  The original environment can now be deleted and a new shortcut should exist that works.  Here is a final shot of VIM client managing vCenter via WineBottler on OSX:

Screenshot 2014-07-11 20.05.38

 

 


Depending on how things go, the title for this entry might more appropriately be “the self healing Mac”.  Only time will tell!  So what is this all about?  Well recently my trusty companion of 2 years, the “mid 2012 MacBook Pro Retina 15″, decided to have a near (as I can tell) death experience.

It all started with a single kernel panic while doing some boring daily tasks in Chrome.  Within a 24 hour period the problem accelerated to a continuous kernel panic loop.  My first thought was “recent update”, but searching high and low for clues didn’t yield much.  Basic diagnostics (read as the highest of high level) seemed to imply the hardware was OK, but it really felt like a hardware issue.  Or if not hardware, possibly drivers.  But of course neither of those made much sense.  This was OSX running on a nearly new Mac, after all!  It’s like suggesting that your brand new Toyota Corolla would up and completely die 3 miles off the lot (heavy sarcasm here).

Searching around I discovered that there were possibly some issues with Mavericks and the Retina that I had maybe been dodging.  It had also been 2 years of accumulating crap (dev tools, strange drivers, virtualization utilities, deep utilities, games) any of which could be suspect.  So I decided I would try a time machine rollback to before the first kernel panic, and if that failed, take a time machine back to 1995 and do the classic Windows “fix” – wipe and re-install (ugh).

The time machine restore took literally ages thanks to the bizarrely slow read rates of my backup NAS (detailed here), but eventually completed (400GB, 24 hours).  Unfortunately, the system wasn’t back for more than 10 minutes before the first kernel panic!  That meant that either the condition actually pre-existed the first known occurrence and had just been lurking, or the issue was in fact hardware.  I moved forward with the clean install.

First I deployed a new version of Mavericks.  Boot up holding command R, follow the linked guide, and you’re off to the races.  The reinstall was pretty smooth (erase disk, groan, quit back to the recovery menu, install new OS) and first boot just felt better.  Of course you know what they say about placebos!  After an hour of installing my usual suite of apps, upgrading to Mavericks and grabbing the latest updates, the dreaded kernel panic struck!  Things were looking grim.

With little to lose I decided to maybe try rolling back to Mountain Lion on the outside chance that the latest Mavericks update was causing issues.  One more reinstall, followed by an app install only and I was feeling good.  Until terror struck!  Yes, another kernel panic.  Incidentally these kernel panics were all over the place (really suggesting RAM).

At this point I became bitter.  Suddenly the “it looks amazing and is all sealed and covered in fairy magic!” Apple approach didn’t seem so great.  Changing out a DIMM on a PC laptop is a cheap and very easy fix.  Hell, in these pages I’ve covered complete tear downs of PC laptops (down to motherboard replacements).  Compounding the issue was that I never opted for Apple Care (yes yes, I know that failing to spend more money on top of a premium $2500 laptop means I deserve what I get if said premium hardware somehow completely dies within 3 years).  Apple’s decision to solder the memory to the motherboard meant I’d be looking at an extremely expensive motherboard swap out and a good sized chunk of downtime (the latter being a really big issue for me).  Starting to feel truly grumpy, I decided to run a few tests.

First, memtest in OS.  Lots of failures.  Instant failures too.  As a matter of fact I’ve never seen such a horrific memtest result!  It was honestly a bit of a wonder the thing could even boot!  Thinking that maybe the software result was anomalous (memtest for OSX is a bit old at this point and in theory doesn’t support anything newer than 10.5.x) I decided to do the old faithful Apple Hardware Test.  If nothing else that utility is always a cool walk down GUI memory lane!

Well depressingly enough, AHT wouldn’t run.  I didn’t think to snap a pic at that point (I wasn’t planning on this entry), but this gives you an idea (stolen from the Apple support forums):

Image

Disclaimer: Not my pic. Error numbers have been changed to protect the guilty!

The actual error code I was faced with was -6002D.  Yep.  That’s generally memory.  So it looked like a total bust.  My Apple honeymoon appeared to be officially over.  I decided to do one final wipe in preparation for the now seemingly inevitable hospital visit, and this time lay down only the bare minimum footprint needed to keep doing a bit of work in the meantime.  One positive outcome of all of this wiping was that the kernel panics had gone from a continuous loop to fairly rare.

After turning in for the night, and struggling through a restless sleep fraught with nightmares of Genius Bar lines stretching to the horizon, I crept downstairs to discover that I didn’t see this:

Again, not mine… But you get the idea. Seen one, seen em all!

The Mac had made it through the night!  Now this was interesting.  Could it possibly be that something in this lineup had become toxic?  With the cloud and “evergreen” software, it was possible.  After all, since our software library is now real time and online, it can be hard to avoid the newest version right?

  • Chrome
  • Lync
  • Skype
  • Office 2011
  • Camtasia
  • Omni Graffle
  • iMovie
  • Garage Band
  • Unarchiver
  • Evernote
  • Dropbox

That is literally the “slim” list that was in place every time the problem would happen post system wipe.  The new list, which seemed stable (against all odds), was solely Office, Lync and Skype.  It was time to do some testing!  Well the results were interesting to say the least! I decided to beat the Mac up a bit.  First, Unigine Heaven 4 in Extreme mode left running overnight (I was always curious how it would do anyhow):

Screen Shot 2014-06-19 at 7.28.35 PM

A great score it’s not, but banging through maxed out Unigine and left running overnight without a hitch kind of implies that the GPU (and drivers) are not an issue.  Well we did suspect that after all, right?  How about taking a closer look at memory?

Screen Shot 2014-06-19 at 7.29.10 PM

Hmmm… OK so far so good…

Screen Shot 2014-06-19 at 7.29.24 PM

Well it’s not ECC anyhow and who knows what this code is actually doing right?  For all we know this is just a register dump.  Time for the big guns.  How about some Prime 95 max memory torture testing?  This is another thing I’ve always wanted to subject the Mac to. No way it survives…

Screen Shot 2014-06-19 at 7.28.19 PM

 

Uh, ok.  It just got serious.  How the hell could this heap, which was unable to even start AHT one clean install ago, somehow now be banging through hours of Prime 95 torture?  There was only one thing left to do (well OK two).  First up, memtest.  Keeping in mind that it might be incompatible of course!

WTF?!

What… the…. heck!?  This time the test ran like a charm; exactly as expected.  So not only does it appear that memtest does in fact work fine on Mountain Lion, but the MacBook passed.  With a cautious glimmer of hope starting to form, and more than a bit of fear, it was time for…. AHT!

2014-06-19 20.27.58

You have got to be kidding me!  This time not only did AHT run, but it passed the damn test!  At this point I started checking for hidden cameras, aliens and paranormal activity.  It just didn’t make any sense!

So where does this leave us?  Well at this point I have added everything back in except CHROME and have successfully repeated all of these tests!  Is it somehow possible that CHROME caused this?  But how?  Chrome certainly can’t survive reboots.  Or can it?  With modern laptops and the way they manage power, it’s hard to know if the machine is ever really off. Is it possible that some software anomaly was leaving the Mac in a state that prevented it from being able to enter AHT and survived reboots?  It seems impossible, but then none of this makes sense.  How could the Mac have gone from being so unstable it couldn’t even enter AHT, to passing it over and over with flying colors and surviving brutal overnight torture tests with only a software change?  I’ve been doing this a long time (hint… Atari 400, Timex Sinclair 1000, etc) and have never seen something like this.  Is it a self healing Mac?  Is it software so insidious it can survive reboots?  I almost don’t want to know.  One thing is for sure though and that’s that I will be keeping a close eye on this and providing any updates on these pages.  And if I should suddenly vanish?  Tell them to burn the MacBook!


Last entry I touched on the idea that management and orchestration will be the future battleground for cloud providers.  The future of IT operations is likely to take multiple forms ranging from some evolutionary enhancement to what folks do today (console based administration, reactive support) all the way through cloud scale programmatic operation of IT via devops process and tooling (examine any advanced AWS shop to see this in action).  Somewhere in the middle is the vision that Microsoft and VMware are betting on the most: the “hybrid cloud” model.

What does “hybrid cloud” really mean though?  Well ideally, it requires a “cleaning of the IT house” when it comes to the management of on-premises resources: evolving ultimately into some semblance of an actual “private cloud” in terms of process and tooling, and then extending out to one or more public cloud providers in a seamless fashion.  If your IT shop presents a service catalog to empowered technologists in your business lines who are able to procure services based on budget and SLA requirements, and then have those services instantiate on the platform that best fits their needs (be it on prem or at a provider), then you have what Microsoft and VMware would define as a “hybrid cloud”.

Microsoft, more than any other technology vendor, has all of the component bits in the breadth of their portfolio, from the hypervisor up through the server and desktop OS to the application layer and tooling (both developer and management).  With the addition of Azure, Office 365 and Dynamics they have a comprehensive XaaS platform as well.  On the consumer side there is similar breadth of service and increasingly there are points of synergy between the two (OneDrive being a good example).

The challenge for Microsoft has been in actually rationalizing all of these assets and telling a compelling holistic story.  In addition, there are weak points in the portfolio where the offerings are not accepted as best of breed (VMware leads in virtualization, AWS leads in IaaS).  Probably most importantly, Microsoft tends to approach problems from a monolithic perspective and the experience is generally not a great one unless you completely buy in to the vision.

Since I test from a VMware perspective, the release of the Azure Pack seemed like the perfect opportunity to put the Microsoft vision through its paces and see how far they’ve come in addressing these challenges.  So what is the “Azure Pack”?  Azure Pack is, in some ways, Microsoft’s version of the vCloud Suite.  It is a set of software components that overlay the existing Microsoft stack with administrative and consumption web portals and provide multi-tenant service orchestration and management.  You can look at it as “cloud provider in a box”, designed to bolt on to a set of existing infrastructure bits.  Of course anytime something is “in a box” I approach it with some skepticism, so armed with my MSDN subscription (generously entitling you to both free Azure and the entire Microsoft catalog for testing and development) I set off to implement the Redmond version of “hybrid”, but with a heterogeneous architecture (the kind real customers tend to run!).

Before approaching an implementation challenge like this one, it’s important to understand what all of the components are.  It is also critical to know how the pieces fit together and what deployment restrictions are in play.  I think this image, courtesy of Microsoft, tells the story really well:

So what are all of these component parts?  Let’s walk through them…

  • Virtual Machine Manager:  VMM has had an interesting history within System Center.  It is the Microsoft (rough) equivalent of vCenter and these days is able to manage both native (Hyper-V) and competitive (ESXi, Xen) hypervisors.  It is a critical component of the hybrid architecture in that it is responsible for surfacing virtual machine resources (organized within VMM into “clouds”) to the Azure console.
  • System Center Operations Manager: no stranger to these pages, SCOM is Microsoft’s comprehensive, and extensible, monitoring platform.   SCOM tends to be the manager of choice for Microsoft workloads and that trend continues here with the Azure hybrid model.  This product maps most closely to vCenter Operations.
  • Service Provider Foundation: this is an interesting set of bits.  It is an OData web service extension to Virtual Machine Manager that provides a multi-tenancy layer for the resources that VMM manages.  In the overall solution, this piece is closest to vCloud Director and is a standalone component packaged with System Center Orchestrator 2012.
  • System Center Orchestrator (optional):  this is Microsoft’s orchestration engine, also known as “what’s left of Opalis”.  While a full install of Orchestrator is not an explicit requirement of the Azure Pack (again, Service Provider Foundation is required, but is a stand alone component), an orchestration engine is a vital component in any cloud strategy.  Automation stands alongside identity management, in my opinion, as the two critical pillars of IT as a Service.  VMware offers a similar set of capabilities in vCenter Orchestrator.
  • System Center Service Manager (optional): service manager is Microsoft’s entry into the IT governance space.  The purpose of this class of software is to assist IT in implementing, automating and enforcing IT operational process using technology.  Essentially a policy engine, auditing system and dashboard, the service manager provides tracking and oversight of problem resolution, change control and asset lifecycle management. VMware’s offering is called, oddly enough, VMware Service Manager.
  • SQL Server: really needs no introduction.  In this case, Service Manager requires either 08 or ’12.  The rest of the products are fine with ’14 and/or are able to utilize SQLExpress.  I have 08 and ’14 in my lab and utilized ’14 for everything except Service Manager.

Since this is a complex installation, I thought it would be useful to go over what I found to be the minimum footprint for deploying all services.  Keep in mind that this is a lab build. Obviously in production these functions would all be discrete and made highly available where applicable:

BOX 1:

  • System Center Configuration Manager
  • System Center Virtual Machine Manager
  • System Center Orchestrator
  • Service Provider Foundation
  • Service Manager management server

BOX 2:

  • System Center Operations Manager

BOX 3:

  • Active Directory Domain Controller/DNS
  • Azure Pack

BOX 4:

  • SQL Server 2014
  • Service Manager Data
  • Database server for product backend
  • Provider for Azure Pack

BOX 5 (optional):

  • SQL Server 2008 R2
  • Database server for Service Manager (requires 2k8R2 or 2k12)
  • Provider for Azure Pack

There have been hundreds of pages written on all of the setup tasks required, so I decided to instead document some “heads ups” from my experience walking through the process end-to-end:

General Heads Ups

  • As always be hyper aware of firewall rules.  Lots of custom port definitions in this process and lots of services that don’t automatically get firewall rules created (Analysis and Reporting Services on SQL for example).  When facing a ‘can’t connect’, check the firewall first
  • Pick one service account, make it a domain admin and use it everywhere.  Life will be a lot easier this way with this build especially.  Of course if you are specifically testing the implications of a granular access control strategy then this doesn’t apply.

Virtual Machine manager and Service Provider Foundation Integration Notes

  • Make note of the SPF service account during install – this is super important as the permissions get tricky
  • SPF will create a set of local security groups on the SPF server.  They are all prefixed by “SPF-” quite handily.  For a lab install, add the service account to all of them.  In production more granular RBAC would likely be a better idea
  • The VMM application pool, used by the Virtual Machine Manager web administration console and API, will install as NetworkService by default.  It should be switched to a named account which also  needs to be a member of the Service Provider Foundation groups
  • The service account used for SPF and the VMM App Pool should be added to the Administrator role in Virtual Machine Manager under “Clouds”.

Service Manager Notes

  • SCOM Agent must be uninstalled prior to installation but can be re-installed after installation is complete
  • Ignore the collation warning.  Fantastic detail on that warning can be found here.
  • Management server and data warehouse server must be separate (you cannot one-box this)
  • Pre-reqs will include warnings for RAM (wants 8GB) and CPU (wants 2.5GHz) if these resources fall short
  • Service Manager Server Install requires 1GB and wants to create a 2GB database.  It also wants to map internal service account privileges to a Windows security group (local or domain)
  • Service Manager Data Warehouse install requires 1GB and wants to create five 2GB databases.  It also wants to map internal service account privileges to a Windows security group (local or domain)
  • SQL Reporting Services requires some custom configuration for Service Manager.  Luckily the Deployment Guide covers it in detail.
  • Service Manager in general is honestly a pretty big pain in the ass.  Definitely keep the Deployment Guide handy

If everything goes well, the finished product is a working Azure style console for your on-prem private cloud:

Screenshot 2014-06-17 20.42.12

With a few clicks, SQL Server and MySQL capacity can now be rolled into a DBaaS foundation.  Adding the capacity in is as easy as selecting the category on the left hand resource family menu (SQL or MySQL) and selecting “Add” from the bottom actions.   The required options are straightforward: the server name (and optional port number if non-standard), credentials, a group assignment (Azure Pack provides the ability to associate SQL servers into a server group for easier control of consumption) and finally the amount of space that can be consumed via hosting (storage on the server allocated to Azure Pack consumption).

Screenshot 2014-06-17 14.23.20

Up next I’ll do a rundown of the experience using Azure Pack from both the service provider and consumer view.  Where possible I will compare/contrast to the VMware experience.  Stay tuned!