With public cloud infrastructure-as-a-service adoption accelerating even in the traditional enterprise space, comparative performance measurements between providers are becoming increasingly important to architects and developers.  Ultimately the smart money is on a multi-cloud strategy with smart orchestration and an SLA/service-centric view of your organization, but knowing what you’re buying is an important part of making economics-based decisions about which platform to leverage at a given time or in a given scenario.

Test Setup and Overview

So with this background in mind, I decided to take a deeper look at the current market leader in IaaS, AWS EC2, and compare it against the newest player in the mix, VMware’s vCloud Hybrid Service (vCHS).  I am fortunate to have access to vCHS for testing purposes, and Amazon makes a free tier available which, while limited, is perfectly useful for testing the low end of the spectrum.  While not perfect, testing at this level is valuable as long as care is taken to make the test as equivalent as possible.

With my concept in mind and credentials in hand, I set off down the roads of the two competing platforms to document not only the performance, but the overall experience.  For my testing, I settled on the following mix:

  • OS: Windows 2008 R2 – I wanted to use Windows for the testing since it is relevant to such a large number of enterprise customers
  • Benchmark Suite:
    • PCMark 8 v2 – PCMark ended up a bust, crashing on both VMs (details below)
    • CrystalDiskMark 3.0.3
    • SiSoft Sandra Lite 2014 R2 20.28
    • DaCapo 9.12 Java Suite
  • System Info: CPU-Z – to take a deeper look at what the platform is providing for compute, I settled on my old favorite CPU-Z
  • Virtual Hardware: this one is interesting because the two platforms take dramatically different approaches here.  Amazon provides the hardware configuration for you in a variety of “instance sizes”.  For free, the biggest instance you can get is a “t1.micro”.  With vCHS, VMware sells you blocks of capacity, either multi-tenant (“Virtual Private Cloud” – a base of 5GHz and 20GB RAM to allocate to VMs) or dedicated (“Dedicated Cloud” – 30GHz and 120GB RAM to allocate to “Virtual Datacenters”, from which capacity is then allocated to VMs).  Based on the limits of the AWS free tier, the t1.micro instance’s hardware mix set the baseline for the test:
    • CPU: 1 vCPU at 1.8GHz (Sandy Bridge era).  This is a tricky baseline to set, unfortunately, since the whole point of cloud is that hardware detail is abstracted away.  Compounding this is that with vCHS, as mentioned above, you carve vCPUs out of a GHz pool.  If you only provision a single vCPU from your pool, you have lots of potential for massive burst performance, which will obviously ramp down as additional vCPUs are provisioned into VMs.  Still, with access to a vCHS Dedicated Cloud for testing, I was able to carve up a very small single-VM Virtual Datacenter of 2GHz for this test.  Definite points for flexibility to vCHS here, although not a negative per se for AWS since the models are so dramatically different.
    • RAM: the t1.micro gives you 615MB of RAM, which makes Windows 2008 R2 quite an interesting science project (it does work, though).  For vCHS I gave the VM 640MB of RAM
    • Storage: for AWS, the t1.micro free tier instance running Windows 2008 R2 comes with a 30GB EBS standard disk.  This is network-attached block storage using Amazon’s proprietary scheme for EBS, backed by RAID 1.  For vCHS I gave the VM a 30GB VMDK on the SSD-accelerated storage tier, a tiered, RAID-protected storage model that should generally provide higher IOPS than EBS but does not allow for prescriptive IOPS assignment the way Provisioned IOPS (PIOPS) does.
    • Network: a single standard 1Gb/s (in theory) virtual NIC for the instance and for the VM

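The GHz-pool model above lends itself to quick capacity math.  Here is a minimal sketch, assuming an even split of the pool across provisioned vCPUs (real scheduler behavior is more nuanced than this):

```python
# Hypothetical model of a vCHS-style GHz pool: vCPUs carve clock capacity
# out of a shared pool, so per-vCPU burst headroom shrinks as more vCPUs
# are provisioned. The function name and numbers are illustrative.

def max_burst_ghz_per_vcpu(pool_ghz: float, provisioned_vcpus: int) -> float:
    """Best-case clock each vCPU could burst to if the pool were
    divided evenly across all provisioned vCPUs."""
    if provisioned_vcpus < 1:
        raise ValueError("need at least one vCPU")
    return pool_ghz / provisioned_vcpus

# The 2GHz single-VM Virtual Datacenter used in this test:
print(max_burst_ghz_per_vcpu(2.0, 1))    # a lone vCPU gets the whole 2GHz
# A fuller 30GHz Dedicated Cloud with 20 vCPUs provisioned:
print(max_burst_ghz_per_vcpu(30.0, 20))  # 1.5GHz each, best case
```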
Machine Creation

Creating the machines on either platform is a great “eureka” moment for any cloud skeptic.  It is incredible how effortless it is to grab capacity with nothing more than a browser on either platform.  I took some screenshots of the process, but first here are the results of the “VM creation time” test.  Keep in mind that with AWS the “time to cloud” is instantaneous: you click through a sign-up process no different from any standard web service registration and can then immediately start launching instances.  vCHS takes a more enterprise-centric approach for now that requires a purchase process; at this stage you cannot simply visit the website and be up and running in minutes.  That said, once provisioned, the “time-to-VM” is quite comparable, so this is what I measured:

Time to VM/Instance Results:
  • vCHS Time to VM: 2:30
  • EC2 Time to Instance: 3:30
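Timings like these are easy to capture programmatically.  Below is a sketch using a generic stopwatch wrapper; the EC2 half assumes the boto3 SDK (a newer SDK than existed when this was written), configured AWS credentials, and a placeholder AMI ID:

```python
import time

def timed(provision):
    """Run a provisioning callable and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = provision()
    return result, time.monotonic() - start

def launch_t1_micro(ami_id: str):
    """Launch a t1.micro and block until EC2 reports it running.
    Assumes boto3 is installed and AWS credentials are configured."""
    import boto3  # only needed for the actual launch, not the stopwatch
    ec2 = boto3.client("ec2", region_name="us-east-1")
    resp = ec2.run_instances(ImageId=ami_id, InstanceType="t1.micro",
                             MinCount=1, MaxCount=1)
    instance_id = resp["Instances"][0]["InstanceId"]
    # Polls describe_instances until the state is "running"
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    return instance_id

# Example: _, secs = timed(lambda: launch_t1_micro("ami-xxxxxxxx"))
```

The waiter only measures “time to running”, which matches the console-watching methodology above closely but not exactly (RDP availability lags the running state).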

vCHS scores a victory here!  The time to bring the VM online was noticeably quicker.  EC2 time to instance was variable over a few runs, with 3 minutes 30 seconds being the best time; two other creation attempts were actually a bit slower.  OK, those are the numbers, but how was the experience?  I’ll let the screenshots tell the story here.  First, AWS:

[Screenshots: EC2 console and the six-step Launch Instance wizard]

Very simple click-through console experience!  The first image shows the basic EC2 console view with two instances provisioned.  The second image is the first step of the “Launch Instance” process, presenting the standard EC2 catalog from which an AMI (essentially an OS gold master) can be selected.  Huge depth here.  Next up is Step 2, the Instance Selection dialog, where you choose the instance size.  No choice is given since we selected “free tier”, which only allows t1.micro.  Step 3 allows us to configure instance provisioning details.  Tons of powerful options here, all out of scope for this discussion.  In Step 4 we add storage, again prescribed by our service tier.  In Step 5 we can apply some metadata and “tag” our instance.  And finally, in Step 6, we assign a Security Group (or create one), which is a hypervisor-level firewall protecting the instance at the network level.  So what is the process like with vCHS?  Let’s take a look:

[Screenshots: vCHS Virtual Datacenter allocation and the “Add a VM” flow]

Fairly similar experience overall.  The first three screenshots are quite different from anything in AWS, as they cover allocating a block of capacity to a Virtual Datacenter.  In this case I am reducing my allocation from 5GHz down to 2GHz to allow for the constrained test.  Next up is the catalog view of vCHS, following the click-through from “Add a VM” to selecting Windows 2008 R2 Standard.  As we can see, this will be a cost item; worth noting that AWS provides Windows on the free tier.  Next we set our options for the virtual machine in one spot (compute, storage and RAM), connect it to a network, and then click “Deploy the Virtual Machine” to create it.

With vCHS, the networking and security configuration happens in a separate part of the UI and is a bit more aligned with what traditional vSphere administrators, or network administrators for that matter, might expect.  Within the Network Configuration sections of the vCHS UI you can set up firewall and NAT rules at the virtual gateway (vs the subnet ACL or hypervisor-level security group controls in EC2), as well as create up to 9 defined private subnets off of that gateway to which VMs can attach.  In EC2, private IP space is allocated at the CIDR block level within a VPC, and the Virtual Private Gateway, the virtual router internal to the VPC, and the NAT that can be added during VPC creation all operate fairly transparently.  Overall I would say that vCHS networking is more flexible and definitely a more direct match to legacy skill sets, whereas AWS networking is simpler for those who don’t really care much about the details of networking and just want to get their services communicating (read: developers).
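To make the EC2 side concrete: private IP space in a VPC is carved from a CIDR block, and Python’s standard ipaddress module can sketch that carving.  The 10.0.0.0/16 VPC block and /24 subnet size here are illustrative choices, not anything EC2 mandates:

```python
# Carving subnets out of a VPC-style CIDR block. All addresses are
# illustrative examples.
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = list(vpc.subnets(new_prefix=24))  # every possible /24 in the /16

print(len(subnets))              # 256 possible /24 subnets in a /16
print(subnets[0])                # 10.0.0.0/24
print(subnets[0].num_addresses)  # 256 addresses per /24
```

This is the same math the VPC wizard does for you when it proposes subnet ranges; vCHS exposes the equivalent choice as up-to-9 named subnets behind the gateway.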

So What’s Under the Hood?

At this stage our Windows servers are up, so what did CPU-Z find?  Very interesting results actually.  First up EC2 T1.micro:

[Screenshot: CPU-Z on the EC2 t1.micro]

Sandy Bridge-EP, Xeon E5-2650 @ 2GHz, running at 1.8GHz with a bus speed of 100MHz

Next up let’s have a look at the vCHS VM:

[Screenshot: CPU-Z on the vCHS VM]

Sandy Bridge-EP, Xeon E5-2660 @ 2.2GHz, running at 2.1GHz with a bus speed of 66MHz

Why is there a difference in the perceived bus speed of the vCPU?  I’m not sure, actually, but it may be a difference in how ESXi presents hardware to the OS vs Xen.  In any event, the benchmark results will ultimately tell the tale of the tape here.  Next up, let’s take a look at what the network performance was like downloading the (massive) 2.9GB PCMark 8 package.

Network Download Performance

Unfortunately I was not able to pull the package from the same mirror for both servers, but I did choose the highest performing mirror that each server was able to contact.  Here is how they stacked up.  First up, EC2 downloading from TechPowerUp.  We can see here a 2.85MB/s sustained rate.  Not bad for free, actually:

[Screenshot: EC2 download progress]

And vCHS downloading from Gamers Hell.  Huge bandwidth here – 9.5MB/s sustained!

[Screenshot: vCHS download progress]

The vCHS VM was able to take full advantage of the empty gateway (only one VM behind it) and consume in excess of the allocated 50Mb/s out to the internet.  Super impressive result, and a clear victory, but worth noting that this is compared to the AWS free tier and technically you can launch as many of these free instances as you want.  As additional vCHS VMs become active within the dedicated cloud, they will share that bandwidth.  Of course bandwidth can be added a la carte, so once again the offerings are not really directly comparable in terms of consumption models.
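For anyone wanting to reproduce the download measurement rather than eyeball a progress dialog, here is a sketch of a sustained-throughput check.  The URL would be whichever mirror you test against; rates are decimal MB/s:

```python
import time
import urllib.request

def download_mbps(url: str, chunk_size: int = 1 << 16) -> float:
    """Download url, discarding the data, and return sustained MB/s."""
    start = time.monotonic()
    total_bytes = 0
    with urllib.request.urlopen(url) as resp:
        while True:
            data = resp.read(chunk_size)
            if not data:
                break
            total_bytes += len(data)
    elapsed = max(time.monotonic() - start, 1e-9)  # guard tiny files
    return total_bytes / elapsed / 1e6  # decimal megabytes per second
```

Note this measures the whole transfer, so slow-start ramp is averaged in; for a 2.9GB file that effect is negligible.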

PCMark 8 Install and Setup

OK, PCMark has been downloaded, so let’s install it.  The installation goes as expected with no hiccups and is actually not noticeably slow on either machine, which is impressive considering they have sub-1GB RAM and are running 2008 R2.  Quick shots of the install just for reference:

[Screenshots: PCMark 8 installation]

For the actual tests we are going to run the “Work” test.  The other tests require hardware-accelerated video, which we do not have, and are less relevant anyhow since they focus on consumer workloads like gaming and multimedia.  In addition, the Work test offers an “Accelerated” option, which leverages OpenCL (again, no GPU, so not relevant), and a “Conventional” option.  I opted for Conventional, which aspires to profile baseline performance:

[Screenshot: PCMark 8 Work test options]

PCMark 8 Results

Unfortunately this turned out to be a bust on both platforms.  I’m not quite sure why, but it ended up failing at the same point, run 4, test 1, for both EC2 and vCHS.  Here are two shots of the action in progress:

[Screenshots: PCMark 8 runs in progress]

Not sure what’s going on here, but I will take a deeper look and report back.

WINNER: Draw – both failed to complete the test

CrystalDiskMark 3.0.3 Results

CrystalDiskMark ran like a charm and turned up a massive disparity between the base storage offerings of the two platforms.  This is not unexpected since, as described above, the vCHS standard storage offering is SSD accelerated.  Presumably a lower tier, lower cost offering is coming, and one can assume a configurable IOPS version (to compete with PIOPS) is coming as well.  For now, though, vCHS base storage is very good indeed!  Here is the test mix:

[Screenshot: CrystalDiskMark test configuration]

Standard test suite – five passes for each test, a 1GB data set, and a mix of sequential read, 512K random, 4K random and 4K QD32.  QD32 is a test of native command queuing on a disk (QD = queue depth) with the queue depth set to 32 operations.  If a disk does not support native command queuing (NCQ), performance on this test is typically dismal.  vCHS storage is iSCSI and EBS is proprietary AWS network-attached block storage, so this will be an interesting result to look at.  Here is how EBS performed:

[Screenshot: CrystalDiskMark results for EBS]

Not too bad, actually!  100MB/s sequential read is excellent, but what’s extremely impressive is the random I/O performance.  In particular, the 4K QD32 is quite good at 33MB/s.  The write speeds, of course, are significantly lower, but still quite good.  If we extrapolate IOPS, we get 716 IOPS.  That’s shockingly good, really.  Of course there is no guarantee this performance will be consistently delivered (hence the need for PIOPS), but it does show what even EBS standard is potentially capable of.  For comparison, here are the single disk results (as local DAS) for a Western Digital Red 3TB drive, courtesy of Legit Reviews:
[Chart: WD Red 3TB single-disk CrystalDiskMark results, via Legit Reviews]

As with most spinning-rust disks, the small random results are just abysmal.  EBS kills it here thanks to a really well implemented network-attach system, since we know EBS standard isn’t using fast disks on the backend.  So it looks like cloud is actually hanging in pretty well with physical DAS (exceeding it, really, since random I/O performance is almost always more important than sequential).  How does vCHS stack up?  Let’s take a look:

[Screenshot: CrystalDiskMark results for vCHS]

Wow!  vCHS destroys the respectable EBS standard results!  This is a really decisive victory for vCHS SSD-accelerated storage.  Just look at that 4K random write score – we’re talking about 2,200 IOPS in random writes!  A phenomenal showing for vCHS, and two victories now.  Will it sweep?  Let’s have a look at compute.
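The IOPS figures quoted above are straightforward arithmetic: throughput divided by I/O size.  A sketch (the 2.864MB/s input is back-derived from the quoted 716 IOPS, since the screenshot numbers aren’t reproduced here):

```python
def iops(mb_per_sec: float, io_size_kb: float = 4.0) -> float:
    """Convert a throughput figure into I/O operations per second."""
    return mb_per_sec * 1000 / io_size_kb

print(iops(2.864))  # 716.0  -> the EBS 4K figure quoted above
print(iops(8.8))    # 2200.0 -> the vCHS 4K random write figure
```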

SiSoft Sandra Lite Results

SiSoft Sandra was configured with the network server off, since both test runs were standalone.  First up was the Overall System Performance test.  Here is a quick run-through of the test setup process:

[Screenshots: SiSoft Sandra setup]

First up is the full test catalogue available on the Benchmarks tab.  Next, since this is the first run, we “refresh the results”.  Then comes the option to participate in the public ranking system, which I disabled for these tests, followed by the option to participate in the device pricing engine run by the benchmark service, which I also disabled.  The Sandra suite is very comprehensive, so I have included the full results below, but here is a snapshot of the vCHS summary screen:

[Screenshot: Sandra summary for vCHS]

And the full results:

SiSoftware Sandra
 Connection : Local Computer
Processor Arithmetic
 Aggregated Score : 8.78GOPS
 Result ID : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz (2.2GHz, 256kB L2, 20MB L3)
 Speed : 2200MHz
 Capacity : 1Unit(s)
 Finished Successfully : Yes
Processor Multi-Media
 Aggregated Score : 18.68MPix/s
 Result ID : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz (2.2GHz, 256kB L2, 20MB L3)
 Speed : 2200MHz
 Capacity : 1Unit(s)
 Finished Successfully : Yes
 Aggregated Score : 0.602GB/s
 Result ID : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz (2.2GHz, 256kB L2, 20MB L3)
 Speed : 2200MHz
 Capacity : 1Unit(s)
 Finished Successfully : Yes
.NET Arithmetic
 Aggregated Score : 3.57GOPS
 Result ID : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz (2.2GHz, 256kB L2, 20MB L3)
 Speed : 2200MHz
 Capacity : 1Unit(s)
 Finished Successfully : Yes
.NET Multi-Media
 Aggregated Score : 2.22MPix/s
 Result ID : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz (2.2GHz, 256kB L2, 20MB L3)
 Speed : 2200MHz
 Capacity : 1Unit(s)
 Finished Successfully : Yes
Memory Bandwidth
 Aggregated Score : 6.423GB/s
 Result ID : VMWare VMXNET3 Ethernet Adapter; 640MB EDO DIMM SDRAM
 Capacity : 640MB
 Finished Successfully : Yes
Cache & Memory Latency
 Aggregated Score : 810.3ns
 Result ID : VMWare VMXNET3 Ethernet Adapter; 640MB EDO DIMM SDRAM
 Capacity : 640MB
 Finished Successfully : Yes
File System Bandwidth
 Aggregated Score : 392.589MB/s
 Result ID : VMware Virtual disk (43GB, SASCSI, SCSI-2, 7200rpm)
 Speed : 7200rpm
 Capacity : 42.95GB
 Finished Successfully : Yes
File System I/O
 Aggregated Score : 1178.7IOPS
 Result ID : VMware Virtual disk (43GB, SASCSI, SCSI-2, 7200rpm)
 Speed : 3000Mbps
 Capacity : 42950MB
 Finished Successfully : Yes
GP (GPU/CPU/APU) Processing
 Error (339) : No devices found. : GP(GPU) call failed. Try another interface (e.g. OpenCL/ComputeShader/CUDA/etc.) or update video drivers.
 Finished Successfully : No
Video Shader Compute
 Error (335) : DirectX 11 Device(s) : VMware SVGA 3D (8MB) : Display call failed. Try another interface or update video drivers.
 Error (335) : DirectX 10.1 Device(s) : VMware SVGA 3D (8MB) : Display call failed. Try another interface or update video drivers.
 Error (335) : DirectX 10 Device(s) : VMware SVGA 3D (8MB) : Display call failed. Try another interface or update video drivers.
 Error (335) : DirectX 9.3 Device(s) : VMware SVGA 3D (8MB) : Display call failed. Try another interface or update video drivers.
 Error (335) : OpenGL Device(s) : VMware SVGA 3D (8MB) : Display call failed. Try another interface or update video drivers.
 Finished Successfully : No
Processor Multi-Media
 Aggregated Score : 19.12MPix/s
 Result ID : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz (2.2GHz, 256kB L2, 20MB L3)
 Speed : 2200MHz
 Capacity : 1Unit(s)
 Finished Successfully : Yes
GP (GPU/CPU/APU) Financial Analysis
 Error (339) : No devices found. : Floating-Point (Normal/Single Precision) : GP(GPU) call failed. Try another interface (e.g. OpenCL/ComputeShader/CUDA/etc.) or update video drivers.
 Finished Successfully : No
Processor Financial Analysis
 Aggregated Score : 0.94kOPT/s
 Result ID : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz (2.2GHz, 256kB L2, 20MB L3)
 Speed : 2200MHz
 Capacity : 1Unit(s)
 Finished Successfully : Yes
GP (GPU/CPU/APU) Bandwidth
 Error (339) : No devices found. : GP(GPU) call failed. Try another interface (e.g. OpenCL/ComputeShader/CUDA/etc.) or update video drivers.
 Finished Successfully : No
Video Memory Bandwidth
 Error (334) : DirectX 11 Device(s) : VMware SVGA 3D (8MB) : Shader call failed. Try another interface (e.g. OpenGL) or update video drivers.
 Error (334) : DirectX 10.1 Device(s) : VMware SVGA 3D (8MB) : Shader call failed. Try another interface (e.g. OpenGL) or update video drivers.
 Error (334) : DirectX 10 Device(s) : VMware SVGA 3D (8MB) : Shader call failed. Try another interface (e.g. OpenGL) or update video drivers.
 Finished Successfully : No
Memory Bandwidth
 Aggregated Score : 6.613GB/s
 Result ID : VMWare VMXNET3 Ethernet Adapter; 640MB EDO DIMM SDRAM
 Capacity : 640MB
 Finished Successfully : Yes

Overall Score
Aggregated Score : 0.83kPT
Results Interpretation : Higher Scores mean Better Performance.
Decimal Numeral System (base 10) : 1GPT = 1000MPT, 1MPT = 1000kPT, 1kPT = 1000PT, etc.
Result ID : VMware Virtual Platform (Intel 440BX Desktop Reference Platfor (Intel Xeon CPU E5-2660 0 @ 2.20GHz; VMWare VMXNET3 Ethernet Adapter; 640MB EDO DIMM SDRAM; VMware Virtual disk; Intel Xeon CPU E5-2660 0 @ 2.20GHz)
Finished Successfully : Yes

Next up is the EC2 result set.  Once again, screenshot first:

[Screenshot: Sandra summary for EC2]

And the results:

SiSoftware Sandra
Connection : Local Computer
Processor Arithmetic
Aggregated Score : 1.72GOPS
Result ID : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz (1.8GHz/2GHz, 1.8GHz IMC, 256kB L2, 20MB L3)
Speed : 1796MHz
Capacity : 1Unit(s)
Power : 95.00W
Finished Successfully : Yes
Processor Multi-Media
Aggregated Score : 3.68MPix/s
Result ID : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz (1.8GHz/2GHz, 1.8GHz IMC, 256kB L2, 20MB L3)
Speed : 1796MHz
Capacity : 1Unit(s)
Power : 95.00W
Finished Successfully : Yes
Aggregated Score : 0.027GB/s
Result ID : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz (1.8GHz/2GHz, 1.8GHz IMC, 256kB L2, 20MB L3)
Speed : 1796MHz
Capacity : 1Unit(s)
Power : 95.00W
Finished Successfully : Yes
.NET Arithmetic
Aggregated Score : 1.76GOPS
Result ID : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz (1.8GHz/2GHz, 1.8GHz IMC, 256kB L2, 20MB L3)
Speed : 1796MHz
Capacity : 1Unit(s)
Power : 95.00W
Finished Successfully : Yes
.NET Multi-Media
Aggregated Score : 0.42MPix/s
Result ID : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz (1.8GHz/2GHz, 1.8GHz IMC, 256kB L2, 20MB L3)
Speed : 1796MHz
Capacity : 1Unit(s)
Power : 95.00W
Finished Successfully : Yes
Memory Bandwidth
Aggregated Score : 1.030GB/s
Result ID : XenSource Xen Platform Device; 615MB DIMM
Capacity : 615MB
Finished Successfully : Yes
Cache & Memory Latency
Aggregated Score : 979.1ns
Result ID : XenSource Xen Platform Device; 615MB DIMM
Capacity : 615MB
Finished Successfully : Yes
File System Bandwidth
Aggregated Score : 77.998MB/s
Speed : 10000rpm
Capacity : 32.21GB
Finished Successfully : Yes
File System I/O
Aggregated Score : 790.9IOPS
Speed : 2560Mbps
Capacity : 32212MB
Finished Successfully : Yes
GP (GPU/CPU/APU) Processing
Error (339) : No devices found. : GP(GPU) call failed. Try another interface (e.g. OpenCL/ComputeShader/CUDA/etc.) or update video drivers.
Finished Successfully : No
Video Shader Compute
Error (335) : DirectX 11 Device(s) : RDPDD Chained DD : Display call failed. Try another interface or update video drivers.
Error (335) : DirectX 10.1 Device(s) : RDPDD Chained DD : Display call failed. Try another interface or update video drivers.
Error (335) : DirectX 10 Device(s) : RDPDD Chained DD : Display call failed. Try another interface or update video drivers.
Error (335) : DirectX 9.3 Device(s) : RDPDD Chained DD : Display call failed. Try another interface or update video drivers.
Error (335) : OpenGL Device(s) : RDPDD Chained DD : Display call failed. Try another interface or update video drivers.
Finished Successfully : No
Processor Multi-Media
Aggregated Score : 3.16MPix/s
Result ID : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz (1.8GHz/2GHz, 1.8GHz IMC, 256kB L2, 20MB L3)
Speed : 1796MHz
Capacity : 1Unit(s)
Power : 95.00W
Finished Successfully : Yes
GP (GPU/CPU/APU) Financial Analysis
Error (339) : No devices found. : Floating-Point (Normal/Single Precision) : GP(GPU) call failed. Try another interface (e.g. OpenCL/ComputeShader/CUDA/etc.) or update video drivers.
Finished Successfully : No
Processor Financial Analysis
Aggregated Score : 0.29kOPT/s
Result ID : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz (1.8GHz/2GHz, 1.8GHz IMC, 256kB L2, 20MB L3)
Speed : 1796MHz
Capacity : 1Unit(s)
Power : 95.00W
Finished Successfully : Yes
GP (GPU/CPU/APU) Bandwidth
Error (339) : No devices found. : GP(GPU) call failed. Try another interface (e.g. OpenCL/ComputeShader/CUDA/etc.) or update video drivers.
Finished Successfully : No
Video Memory Bandwidth
Error (334) : DirectX 11 Device(s) : RDPDD Chained DD : Shader call failed. Try another interface (e.g. OpenGL) or update video drivers.
Error (334) : DirectX 10.1 Device(s) : RDPDD Chained DD : Shader call failed. Try another interface (e.g. OpenGL) or update video drivers.
Error (334) : DirectX 10 Device(s) : RDPDD Chained DD : Shader call failed. Try another interface (e.g. OpenGL) or update video drivers.
Finished Successfully : No
Memory Bandwidth
Aggregated Score : 4.622GB/s
Result ID : XenSource Xen Platform Device; 615MB DIMM
Capacity : 615MB
Finished Successfully : Yes

Overall Score
Aggregated Score : 0.22kPT
Results Interpretation : Higher Scores mean Better Performance.
Decimal Numeral System (base 10) : 1GPT = 1000MPT, 1MPT = 1000kPT, 1kPT = 1000PT, etc.
Result ID : Xen HVM domU (Intel Xeon CPU E5-2650 0 @ 2.00GHz; XenSource Xen Platform Device; 615MB DIMM; XENSRC PVDISK; Intel Xeon CPU E5-2650 0 @ 2.00GHz)
Finished Successfully : Yes

Holy smokes!  Another huge slam dunk for vCHS!  Look at that aggregated score difference – 0.83kPT for vCHS vs 0.22kPT for the EC2 t1.micro.  That’s nearly a 4x increase in performance for vCHS.  It seems like the 1.8GHz virtual Sandy Bridge core in the t1.micro is underperforming and the 2.1GHz virtual Sandy Bridge core in vCHS is overperforming.  This overall result is clearly reflected in each of the discrete CPU tests, where we see consistent advantages for vCHS.  This result was really interesting and will be even more interesting to see in the context of other CPU benchmarks.

Looking at memory, we see a similar trend: 4.6GB/s for EC2 vs 6.6GB/s for vCHS – a roughly 45% advantage to vCHS.

Storage matches what we saw in CrystalDiskMark, with 790 IOPS for EBS being bested by nearly 1,180 IOPS for vCHS – another roughly 50% advantage to vCHS.

DaCapo 9.12 Java Benchmark Suite Results

The DaCapo Benchmarking Project is an open source suite of Java-based benchmarks, shipped in a 167MB monolithic JAR, that aims to provide a level performance measurement across platforms.  It is particularly useful for benchmarking cloud instances since pretty much anything can run Java.  The suite consists of 14 tests:

  • avrora – simulates a number of programs run on a grid of AVR microcontrollers
  • batik – produces a number of Scalable Vector Graphics (SVG) images based on the unit tests in Apache Batik
  • eclipse – executes some of the (non-GUI) JDT performance tests for the Eclipse IDE
  • fop – takes an XSL-FO file, parses and formats it, generating a PDF file
  • h2 – executes a JDBCbench-like in-memory benchmark, running a number of transactions against a model of a banking application (replacing the hsqldb benchmark)
  • jython – interprets the pybench Python benchmark
  • luindex – uses Lucene to index a set of documents: the works of Shakespeare and the King James Bible
  • lusearch – uses Lucene to do a text search of keywords over a corpus comprising the works of Shakespeare and the King James Bible
  • pmd – analyzes a set of Java classes for a range of source code problems
  • sunflow – renders a set of images using ray tracing
  • tomcat – runs a set of queries against a Tomcat server, retrieving and verifying the resulting web pages
  • tradebeans – runs the daytrader benchmark via Java Beans to a Geronimo backend with an in-memory h2 as the underlying database
  • tradesoap – runs the daytrader benchmark via SOAP to a Geronimo backend with an in-memory h2 as the underlying database
  • xalan – transforms XML documents into HTML

For my testing I ended up with a subset; the tests I excluded threw Java exceptions that I wasn’t in the mood to troubleshoot.  I included a screenshot of one such exception below for reference:

[Screenshot: DaCapo Java exception]

Here is the list of tests that ran correctly:

  • avrora – simulates a number of programs run on a grid of AVR microcontrollers
  • batik – produces a number of Scalable Vector Graphics (SVG) images based on the unit tests in Apache Batik
  • eclipse – executes some of the (non-GUI) JDT performance tests for the Eclipse IDE
  • luindex – uses Lucene to index a set of documents: the works of Shakespeare and the King James Bible
  • lusearch – uses Lucene to do a text search of keywords over a corpus comprising the works of Shakespeare and the King James Bible
  • pmd – analyzes a set of Java classes for a range of source code problems
  • sunflow – renders a set of images using ray tracing
  • xalan – transforms XML documents into HTML
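DaCapo ships as a single runnable JAR, so the subset above can be driven by a thin harness.  A sketch; it assumes java on the PATH and the suite’s dacapo-9.12-bach.jar, and parses the “PASSED in N msec” line DaCapo prints on stderr:

```python
import re
import subprocess

# Subset of the DaCapo 9.12 suite that completed in this test.
TESTS = ["avrora", "batik", "eclipse", "luindex",
         "lusearch", "pmd", "sunflow", "xalan"]

PASSED = re.compile(r"PASSED in (\d+) msec")

def run_suite(jar: str = "dacapo-9.12-bach.jar") -> int:
    """Run each benchmark once and sum the reported wall-clock msec."""
    total = 0
    for test in TESTS:
        proc = subprocess.run(["java", "-jar", jar, test],
                              capture_output=True, text=True)
        match = PASSED.search(proc.stderr)
        if match:
            total += int(match.group(1))
    return total
```

The totals reported below were produced by hand, but this is the same measurement: summed per-test wall-clock times in milliseconds.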

First up, the results for vCHS:

[Screenshot: DaCapo results for vCHS]

The total completion time for all tests came in at 142,320 msec.  I also decided to add a cost dimension here.  Keep in mind that this is a bit tricky, since vCHS is based on a subscription cost for capacity, and in order to simulate the t1.micro we needed the capabilities of the Dedicated Cloud offering, which provides dedicated host hardware.  To try to level the cost playing field, I modeled the 2GHz slice of the Dedicated Cloud against a dedicated t1.micro 1-year heavy utilization Reserved Instance.  What’s that you say?  There is no such thing as a dedicated t1.micro?  Yes, I know.  To account for this I added the generalized 10% upcharge for dedicated instances to the t1.micro.  Highly synthetic, yes, but it does seem reasonable for our purposes here.

To normalize the vCHS subscription aspect, I took the total monthly cost for a Dedicated Cloud (from the vCHS public site) and extrapolated the cost for a single second of compute time.  Included in this calculation are compute, storage and support.  I then multiplied the cost per second by the 142 seconds it took the vCHS VM to complete the suite.  Here are the results:


  • Storage: 0.000213657
  • Compute: 0.001183333
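The normalization itself is simple arithmetic: divide the monthly subscription by the seconds in a month and scale by the run time.  A sketch with a placeholder monthly price (not vCHS’s actual rate card):

```python
# Normalize a monthly subscription price to a cost per benchmark run.
# The $500/month figure is purely illustrative.
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000, using a 30-day month

def cost_of_run(monthly_subscription: float, run_seconds: float) -> float:
    return monthly_subscription / SECONDS_PER_MONTH * run_seconds

# e.g. a hypothetical $500/month slice running the 142-second suite:
print(round(cost_of_run(500.0, 142), 4))  # 0.0274
```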


So the vCHS cost was $0.026 to run the suite.  Let’s take a look at EC2:

[Screenshot: DaCapo results for EC2]

Holy smokes!  629,369 msec for the t1.micro to complete the same test matrix.  Wow!  The EC2 cost advantage would have to be massive to offset that.  Leaving aside that technically we are running on the free tier (which is only good for a year anyhow), let’s take a look at what this test would have cost had we been paying for that t1.micro:

  • Support: 0
  • Storage: 0.000401042
  • Compute: 0.001178819
  • RI fee: 0.001255787


So the EC2 cost advantage is impressive in this scenario.  Despite a massive performance deficit of nearly 4.5x, the cost advantage of a t1.micro actually does make up for it in terms of total cost, with the suite run coming in at $0.0028 – roughly 10x less.  I did include the one-time RI fee, divided over the one-year period, in this calculation, and this was a 1-year heavy utilization RI, which provides an excellent cost per hour; I also added the artificial “dedicated instance” fee and set bandwidth out (a cost item for AWS) at 100GB/month.  I think the above modeling is a reasonable representation of what an enterprise would be likely to pay for this capacity over the given period of time, and so is fair for this test.  The same is true on the vCHS side.
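The EC2 side of the model amortizes the one-time RI fee over the term, adds the hourly RI rate plus the 10% dedicated-instance upcharge, and scales to run time.  A sketch with placeholder rates (not actual 2014 price-list values):

```python
# Amortized reserved-instance cost model for a short benchmark run.
# All rates passed in are illustrative placeholders.
HOURS_PER_YEAR = 365 * 24  # 8,760

def ec2_run_cost(ri_upfront: float, hourly_rate: float,
                 run_seconds: float,
                 dedicated_upcharge: float = 0.10) -> float:
    """Cost of a run under a 1-year RI with a dedicated-instance upcharge."""
    amortized_hourly = ri_upfront / HOURS_PER_YEAR
    effective_hourly = (hourly_rate + amortized_hourly) * (1 + dedicated_upcharge)
    return effective_hourly * run_seconds / 3600
```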

WINNER: vCHS on performance (by a wide margin) AWS on cost (by a wide margin)

At this stage the conclusion is a bit tricky.  vCHS clearly wins by a wide margin on performance.  That said, the offerings aren’t really directly comparable from a commercial model standpoint at this stage, and EC2 did come in significantly cheaper.  If the cheapest performance is what you’re after, EC2 would have been the winner here (not to mention that all of this Windows work would be free for a year).  Of course in the real world cost is not the sole metric.  vCHS provided such a significant performance benefit that the cost might just wash, even at scale.  In addition, the vCHS offering includes a much higher level of support.  Adding business support on the AWS side would have increased the cost to $0.030 and left vCHS a winner across the board.  This is a critically important dimension to keep in mind if you are evaluating enterprise adoption.

As promised this series will continue with more tests, more platforms and more metrics.  Suggestions are most welcome!

Well, I decided that a hosted desktop would be cool for the new lab (especially since I’m going to provide VPN access to friends and colleagues as a favor).  Of course the easy way to do this would be to just install a desktop VM, right?  Of course.  But we don’t do things the easy way here at Complaints HQ!  If we did, we’d have nothing to complain about!  So what was my solution to the hosted desktop requirement?  Install VMware View, of course!  I haven’t had a chance to document the full experience yet, but before I even get started I want to share one little tidbit that drove me crazy this weekend.  VMware View has multiple components and concepts worth getting to know and understand:


  • VDI – this one is a no-brainer, right?  Desktop OS images that run on a server and are accessed remotely by users running some sort of client software (the VMware View client), using a connection protocol (PCoIP for VMware, RDP for Microsoft, ICA for Citrix, etc.) to send screen images back and forth to the Windows, Mac, Android (etc.) computing device.
  • Linked Clones - storing tons of desktops for users takes a lot of space.  And not cheap Dell commodity SATA2 drive space either; expensive, possibly SSD-backed, server blade space.  Since Windows is pretty much 90% the same from one user to the next (you don’t need 900 copies of GDI32.DLL, in other words), there is an opportunity to be more storage efficient.  The concept of “Linked Clones” takes the “gold master” approach and makes it real-time and dynamic.  You load up one master Windows image, spin off a number of replicas of it, and as users order up desks, any deltas from the standard are stored temporarily in small, dynamic clones linked to a replica.  This saves tons of space and also allows for smart placement of the active bits of an OS volume (so you don’t need to spend zillions on SSD to make the infrastructure performant).
  • User Profiles - a chunk of the data that is unique in a set of images is the user personalization data.  There are a number of ways this can be managed, but most VDI systems provide some way of managing it.  The sledgehammer way is to persist entire images forever (the essentially static 1:1 relationship); the elegant way is to use the Linked Clone approach for the common OS bits, persist the user profile in a separate storage location, and draw any needed associations via configuration metadata.
  • Desktop Pools – since the idea of VDI is to manage lots of virtual desktops at scale, and to provide different tiers of service and possibly different architectures (maybe some power users do get the 1:1 permanently persisted desktop while call center workers, for example, make do with the nearly fully commodity “built on demand” desks), an extensible management structure is needed.  The concept of pools provides this flexibility.  A “pool” is essentially a set of configuration options that together form a service catalog entry.
  • Entitlements – lastly, since we want this all to be easy for users, the idea of entitlements is to take a user identity (typically from AD when we’re talking about Windows users and Windows desks) and associate it with some level of configuration and a designated pool.  This way, when the call center worker logs in they just get presented a desk without having to think, and the same goes for the power user.  The fact that the two have taken very different paths to that login screen should be completely invisible to them.
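The storage math behind Linked Clones is worth sketching out. Assuming purely hypothetical sizes (a 30GB master image, a handful of replicas, and roughly 2GB of delta per user — placeholders, not View defaults), the savings over full clones scale linearly with user count:

```python
# Back-of-envelope linked-clone storage savings (all sizes hypothetical).
def full_clone_gb(users, master_gb):
    """Every user gets a complete copy of the master image."""
    return users * master_gb

def linked_clone_gb(users, master_gb, replicas, delta_gb):
    """One master, a few read-only replicas, and a small delta per user."""
    return master_gb + replicas * master_gb + users * delta_gb

users, master = 900, 30
full = full_clone_gb(users, master)                              # 27000 GB
linked = linked_clone_gb(users, master, replicas=4, delta_gb=2)  # 1950 GB
print(f"full clones: {full} GB, linked clones: {linked} GB")
```

At these made-up numbers the linked-clone pool needs less than a tenth of the raw capacity, which is exactly why the replicas and deltas can be placed on small, fast storage.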



  • View Connection Server – this is the connection broker.  Basically, think of the connection broker as the workflow orchestrator and service catalog for your Virtual Desktop Infrastructure.  Without a connection broker, you’d have to create and deploy individual desktop VMs for every user and then provide them direct login info.  The connection broker components allow you to fully automate this process as well as provide other awesome and advanced features like dynamic setup and teardown of desktop images, full customization, and linked clones for storage efficiency (and that’s just naming a few!).  This is clearly the heart and soul of View.  The Connection Server pretty much requires a standalone install, uses a web front end for configuration over standard HTTPS, and is the point from which you manage all of the other (pretty much headless) components.  The install is part of the “connectionserver” package.
  • View Security Server - this box is basically a front-end proxy for the Connection Server designed to be dropped in a DMZ.  These can scale out in a non-fixed ratio of SS:CS.  It is important for real deployments at scale that plan to advertise desks directly to users over the internet.  This is also a standalone box pretty much by design, since it should sit in the DMZ.  The install is also part of the “connectionserver” package (it is a selectable role).
  • View Composer – this is the component that provides the automated provisioning management and the aforementioned storage efficiency.  It’s important, it runs headless, and it is configured through the View Connection Server UI (the “VMware View Administrator” portal).  It doesn’t necessarily need to run standalone.  By default it runs over SSL on port 18443.
  • View Transfer Server – transfer servers manage the dynamic aspects of end-user desktop subscription and, as a result, are designed to naturally scale out.  The Transfer Server is basically a web server with code that handles the check-in and check-out process of a user requesting, being assigned, and releasing desktop instances from the pools of desktops that you create (the dynamic setup and teardown aspect whereby desktops are “built to order” and then destroyed when no longer needed).  For some unknown reason it runs on Apache for Windows (?!) on port 80 and should be OK as a shared component.

And that brings me to my heads up.  I attempted to “one box” the View Composer and the Transfer Server, mainly because having 50 VMs to handle the launching of 3 desktops seems pretty ridiculous, and I hit a wall whereby the Transfer Server reported “bad repository” no matter what I tried.  Here is what the issue turned out to be:

Screen Shot 2014-04-21 at 1.31.01 AM Screen Shot 2014-04-21 at 1.31.15 AM


What do these two screenshots mean?  They mean that IIS was stepping on Apache.  I believe IIS got installed (and activated) as part of the .NET 3.5.1 feature addition (required by Composer), at which point it took over port 80 and killed Apache.  Disabling IIS and restarting Apache did the trick.  Interestingly, Composer still seems to work, though I am not sure why.  I know that Composer does not use Apache, but it may not use IIS either; it may be a standalone app listening on port 18443 that just utilizes IIS as a certificate generator.  We shall see!  On a side note, it would be really nice if VMware rationalized their supporting technology infrastructure.  So not Oracle and SQL Server and Postgres and Java and .NET and IIS and Apache, but rather just one from each category.  Hopefully this will happen eventually.
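A quick way to see this kind of collision coming is to check whether anything is already listening on a port before dropping a second web server onto the box. A minimal Python sketch (the port numbers are just the View defaults discussed above):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success, an errno (e.g. refused) otherwise
        return s.connect_ex((host, port)) == 0

# e.g. before installing the Transfer Server alongside Composer:
for port in (80, 18443):
    status = "BUSY" if port_in_use(port) else "free"
    print(f"port {port}: {status}")
```

On the actual Windows box, `netstat -ano` goes one better and tells you which PID owns the port, which is how you catch IIS red-handed.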

UPDATE: it looks like the one-box is working and that IIS and Apache can be configured to co-exist on the one box.  More on this below.

Handy Checklist to Avoid the Gotchas:

Lots and lots has been written on View and there are plenty of end-to-end walkthroughs.  So instead of pushing another of those onto the stack, I thought it might be more useful to put together a handy list of things to keep in mind when doing View testing.  I also plan to keep this a living list as I discover new and potentially interesting things.

  • One Boxing – As mentioned above in the components section and the specific example of Transfer Server and Composer co-existing, it is possible to one-box certain bits, but there are a few things to keep in mind.  The first is that certain components do naturally separate and, even in a lab setting, it is best to keep them separate (Connection Server and Security Server is the example here).  The second is that many View components are web services and, unless you get into complex port customization, can’t necessarily co-exist.  In particular the Transfer Server and the Connection Server both want port 80.  What I’ve found to be the minimum footprint is as follows:
    • Connection Server – this guy should stand alone just to keep your sanity
    • Security Server – this guy is optional, but if you need one for your testing it should be kept alone and kept on a separate VLAN to simulate a DMZ
    • Transfer Server – this guy should also be kept separate because it wants port 80 and therefore can’t co-exist with the Connection or Security servers without customizing ports.  Alongside the Transfer Server, however, the Composer can be collocated.
    • Database Server – View likes to consume databases.  It is probably best to stand up a SQL database (or use an existing one) rather than attempting to use SQL Express.  Note that you cannot use anything newer than SQL 2008 R2.  The databases that are relevant here are the main View Connection Server database, the View Composer database and the Event database.
  • Master Image Prep - A default Windows installation is not ready for use as a View master image.  There is a checklist of items that must be done in order to make a Windows install a valid template:
    • Make sure the vNIC is set to VMXNET3
    • Make sure that PCoIP is configured in the firewall and allowed for the appropriate networks (this part is significant as the firewall rules in Windows are based on “domain”, “public” and “private”.  Consider your network path for clients carefully)
    • Set the Windows configuration to DHCP
    • Activate
    • Join the domain
    • Disable display power management (set “turn off the display” to “never”) for PCoIP
    • Disable hibernate
    • Install VMware tools
    • Install the VMware View Agent (make sure to install 5.3.1 for Windows 8! Earlier versions will not work)
    • Take a snapshot
  • Storage Placement - we touched on this above in discussing Linked Clones as a concept, but it is worth expanding on this topic as we consider on which datastore each type of file should be placed.  Consider that View desktop storage is hierarchical.  At the most static, we have a master image (also known as the “Parent VM” in View admin parlance) and a snapshot.  These provide the source for a given desktop version.  Changes made to this master file set will propagate to any children provisioned from them.  These are full size (meaning a full desktop image plus a full snapshot) and are accessed only during initial provisioning of replicas, so they can be kept on a datastore backed by high-capacity, lower-IO storage.  One level out from the Master Image we have the Replicas.  The Replicas are really the “working bits” and are the image from which Linked Clones are dynamically provisioned.  Replicas are full sized and accessed frequently, but are read only.  From the Replicas, Linked Clones are built.  The Linked Clones are disposable (used for the life of a desktop instance, then discarded), but are accessed heavily in the read/write mix typical of a desktop OS.  They contain the dynamic parts of a desktop image that must be changed, or bits that have changed for a given session.  Personally I think that, if budget allows, all VDI should really run off SSD in production.  That said, there are some minimums worth considering:
    • Master Images can be stored on NL-SAS/7.2k SATA arrays
    • Replicas should be placed on either SSD or SAS/15k SATA arrays
    • Linked Clones should be placed on the highest IO storage you can afford (SSD strongly preferable)
    • Profile data can be placed on NL-SAS/7.2k SATA arrays although 15k of course can’t hurt

One final point of note is that all of the above applies very specifically to Linked Clones.  With Dedicated Instances, each desktop is a full copy of the Master Image, so you need both capacity and high I/O for storing dedicated disks.  I prepared a quick diagram to help visualize the Linked Clone storage relationship.  The file folders represent user profile data being persisted on a separate class of storage so user profile state is maintained across generations of ephemeral desktops:
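Putting the tiering guidance above into rough numbers, here is a per-tier capacity sketch for a linked-clone pool. The sizes are illustrative assumptions (30GB master, 2GB clone delta, 1GB profile per user), not sizing guidance from VMware:

```python
# Rough per-tier capacity plan for a linked-clone pool (all sizes hypothetical).
def tier_plan(users, master_gb=30, replicas=4, delta_gb=2, profile_gb=1):
    return {
        "NL-SAS (masters + snapshots)": 2 * master_gb,        # master + full snapshot
        "SSD/15k (read-only replicas)": replicas * master_gb,
        "SSD (linked clone deltas)":    users * delta_gb,
        "NL-SAS (user profiles)":       users * profile_gb,
    }

for tier, gb in tier_plan(300).items():
    print(f"{tier:34s} {gb:6d} GB")
```

The shape of the result is the interesting part: the heavy-I/O tiers (replicas and deltas) stay small, while the bulky-but-quiet data (masters, profiles) lands on cheap spindles.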


  • Certs – you will see all kinds of warnings about using private certs and not having an enterprise PKI with configured trusted roots and CRLs.  You can ignore them.  While a proper PKI configured correctly down to the client level certainly makes for a better user experience in production, private untrusted certs work just fine for testing.  You’ll just see the usual warnings at the access layers and errors in the View event logs.
  • DNS – DNS is a bit trickier.  VMware products are often picky about DNS vs IP access in places where one was used during initial configuration.  With View I observed that the OSX client really wants to access the View Connection Server by name.  The Android client, on the exact same infrastructure, was fine connecting via IP.  In my opinion the safe bet here is to populate your local DNS with the FQDNs for each View component (this is very easy if you are using your AD DNS as primary since they’re all domain joined).  The exception would be the Security Server, which should be populated into your public DNS in cases where production access via the Internet is being allowed.
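For lab scripting against those private certs and lab DNS entries, your own tooling needs the same leniency the View clients give you. A minimal Python sketch of an HTTPS probe that resolves a component by FQDN but deliberately skips certificate verification (lab use only; the FQDN in the comment is a hypothetical placeholder):

```python
import socket
import ssl

def lab_tls_context():
    """TLS context for lab use ONLY: accepts self-signed, untrusted certs."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # lab only: don't match the hostname
    ctx.verify_mode = ssl.CERT_NONE  # lab only: don't validate the chain
    return ctx

def probe_view_component(fqdn, port=443):
    """Resolve the FQDN (fails fast if DNS is missing), then try a TLS handshake."""
    addr = socket.gethostbyname(fqdn)
    with socket.create_connection((addr, port), timeout=5) as sock:
        with lab_tls_context().wrap_socket(sock, server_hostname=fqdn) as tls:
            return tls.version()  # e.g. "TLSv1.2" if the handshake worked

# Hypothetical lab FQDN -- substitute your own Connection Server name:
# print(probe_view_component("view-cs.lab.local", 443))
```

If the `gethostbyname` call throws, you have found your DNS gap before the OSX client does.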

In any event, this entry is just getting started so stay tuned!

With nested ESX humming along happily I decided it was time to get serious about simulating a typical enterprise virtualization architecture.  My goal is to be able to test pretty much anything in the VMware product line, and that means SRM, which means multiple datacenters and, more importantly, multiple vCenters.  With nested ESX, this becomes super easy.  Here is how I decided to parcel out my resources:


A few things to keep in mind when setting up the hosts (this applies to all VMware designs):

  • It’s a best practice to isolate different types of management traffic.  Even though this isn’t really relevant when all of these NICs are virtual and riding on one (or a few) physical host NICs, it’s very easy to do with nested ESX since it’s all software, and it’s a good habit to get into.  Be sure to set up at least two vmkernel NICs on the vSwitch where the management port groups live.  Make sure that these vNICs are configured for fault tolerance and that at least two have “Management Traffic” checked.  This will remove any warnings about Management Network Fault Tolerance.  Done right, the final visual overview of the vSwitch should look like this:


  • When configuring an HA/DRS cluster, vCenter will look for two shared datastores that can be used for heartbeat exchange.  This is trickier in a nested setup where you aren’t connecting hosts to a SAN or NAS.  The best solution here is really to introduce a NAS into the mix.  In my case I created both an iSCSI target and an NFS share and configured each host for access to both.  Within a few minutes vCenter will pick up on the fact that the cluster member hosts have two datastores each in common and will clear the warning.  Below you can see both the NFS and iSCSI datastore overviews.  Note that the iSCSI target also has the physical host configured since I use a single iSCSI target globally:


Ultimately my multi-vCenter, multi-virtual datacenter nested implementation will follow the architecture diagram presented below.  Note that the plan is to deploy vCD instances into both vCenter implementations:


This is another topic that has been covered thousands of times (and has actually been done in these pages as well!), but I thought with new waves of both vCenter and Windows it might be worth documenting one more time. So with that in mind I give you a visual walk through of Windows 2k12, W2k12 AD, W2k8 and vCenter 5.5 setup!  First let’s create a new virtual machine for AD.  I am creating AD and vCenter on the physical ESX host.  These are likely to be some of the only services I will run outside of the nested ESX hosts.  As per usual, from the vSphere client (web or legacy) we select Create a New Virtual Machine from the host focus and, in this case, we can stick with “Typical”:

Screenshot 2014-04-14 18.02.26

Give our new VM a name:

Screenshot 2014-04-14 18.02.39

Select a datastore:

Screenshot 2014-04-14 18.02.45

Choose the OS (latest version of vSphere provides 2k12 64bit as an option):

Screenshot 2014-04-14 18.02.53

Assign the vNIC to a VSS port group:

Screenshot 2014-04-14 18.03.02

Provide a virtual disk (40GB is fine):

Screenshot 2014-04-14 18.03.06

Go ahead and Finish, but check “edit Settings” so we can attach a virtual CD/DVD for first boot:

Screenshot 2014-04-14 18.03.12

Browse to an ISO on a datastore (in this case my NFS install share):

Screenshot 2014-04-14 18.03.23

Select the Windows 2012 ISO:

Screenshot 2014-04-14 18.03.35


We can now power on the VM and launch the VM remote console.  The Windows installation boot should start:

Screenshot 2014-04-14 18.04.01

Enter the old product key if you have it:

Screenshot 2014-04-14 18.04.25

Pick an OS (I went with datacenter to entitle the entire host to unlimited guests):

Screenshot 2014-04-14 18.07.27

Agree to stuff no one reads and hopefully will never be called accountable on:

Screenshot 2014-04-14 18.07.57

Go for “Custom Install” since this is a new build (I feel “Custom Install”, complete with an ominous “advanced” warning is misleading here, but in any event…):

Screenshot 2014-04-14 18.08.05


Select a destination volume:

Screenshot 2014-04-14 18.11.37

And go ahead and Install Now:

Screenshot 2014-04-14 18.08.45


Files will copy as always:
Screenshot 2014-04-14 18.11.47

And when complete, and after a reboot, we will be greeted by the “weird to see on a server and not in a good way” MetroUI login:

Screenshot 2014-04-14 18.19.17

First up let’s install the old VMware tools:

Screenshot 2014-04-14 18.20.25

Yes yes, very scary:

Screenshot 2014-04-14 18.20.07

Install prep starts:

Screenshot 2014-04-14 18.20.37


Screenshot 2014-04-14 18.20.51

I always go with “Complete” here since it can’t hurt:

Screenshot 2014-04-14 18.21.02

Fire off the Install:

Screenshot 2014-04-14 18.21.07

Files will copy:

Screenshot 2014-04-14 18.21.24

And we’re done:

Screenshot 2014-04-14 18.21.29

We now need to restart which sucks (although it doesn’t suck as much as actually trying to find how to shutdown in the MetroUI!):

Screenshot 2014-04-14 18.21.35

Once we’re back it’s time to setup the network:

Screenshot 2014-04-14 18.22.54

UI elements here pretty much unchanged since 2k8:

Screenshot 2014-04-14 18.23.04

UI elements here pretty much unchanged since Windows NT 4!:

Screenshot 2014-04-14 18.23.15

Next we give this beast a name:

Screenshot 2014-04-14 18.28.03

After a reboot to make the name stick we head right into Server Manager (this is very new compared to 2k8) in order to manage our roles:

Screenshot 2014-04-14 18.24.00

Acknowledge that, yes, this is all very amazing:

Screenshot 2014-04-14 18.24.09

We are planning to do a role based install:


Screenshot 2014-04-14 18.24.18

Select our server:

Screenshot 2014-04-14 18.24.41

Choose our roles.  In my case I am doing AD so I select Active Directory Domain Services and DNS.  I leave File Services checked since that can be useful as well:

Screenshot 2014-04-14 18.24.54

Accept the pre-determined minimum required feature set (I don’t add any additional):

Screenshot 2014-04-14 18.29.18

Read some interesting fun facts about AD:

Screenshot 2014-04-14 18.30.32

And DNS…

Screenshot 2014-04-14 18.30.38

Confirm our task list:

Screenshot 2014-04-14 18.30.43

Install begins:

Screenshot 2014-04-14 18.30.49

Pretty good verbosity on progress updates in the new server manager:

Screenshot 2014-04-14 18.32.56

Configure AD.  I am creating a new forest so select “Add a new forest”:

Screenshot 2014-04-14 18.33.16

And give it a name:

Screenshot 2014-04-14 18.35.11

Provide functional level for the forest and domain. This is a net new install and I don’t plan on introducing any legacy domain controllers, so 2k12R2 native is fine (although it is interesting that R2 is called out as a functional level).  I make every AD DC a DNS server and a GC also so these are checked.  Last step is to provide a DS recovery password:

Screenshot 2014-04-14 18.35.32

Next we set DNS options (of which there are none):

Screenshot 2014-04-14 18.35.55

Provide the NetBIOS name (amazing… NetBIOS may never fully die.  Viva la NetBIOS!):

Screenshot 2014-04-14 18.36.13

Accept the default paths (or don’t, your choice):

Screenshot 2014-04-14 18.36.25

Sign off on the actions to be performed:

Screenshot 2014-04-14 18.36.30

Notice that “View Script” button?  This is absolutely awesome if you ask me.  Like it or not, “operations” is evolving into “devops”, and this “push button, get script” option is gold for any traditional infrastructure administrator interested in self-preservation.  It shows what everything that is about to happen would look like done programmatically in PowerShell.  I cannot say enough how much I love this feature.  And look how simple the script is!  It might actually be easier to write it than to click through the GUI:

Screenshot 2014-04-14 18.36.40

With all of the pre-work done we can go ahead and fire off the Install:

Screenshot 2014-04-14 18.37.25


With that our AD domain is finished and online!  Of course we don’t really need it thanks to vCenter SSO, but it certainly can’t hurt!  Next up let’s install the actual vCenter.  As far as I know, there are some compatibility issues with vCenter 5.5 and Windows Server 2012 R2.  I’d rather not take any risks or run into any weirdness, and I’d also prefer not to fragment my Windows 2012 footprint, so instead of deploying 2012 R1 I go ahead and deploy what I am sure works: Windows Server 2008 R2.  This is a good example of what enterprises deal with, as even in my small home lab I am now dealing with 3 discrete Windows images (including my Windows 8 Pro admin console).  First, create another VM just as per the instructions above, but this time set the guest OS to Windows Server 2008 R2 64-bit and point the virtual CD/DVD at the W2k8 ISO.  On first power up, Windows 2008 installation should boot:

Screenshot 2014-04-14 18.53.09

Setup starts… Nothing new here:

Screenshot 2014-04-14 18.53.27

More license terms:

Screenshot 2014-04-14 18.53.39

Once again, as with 2k12, the Custom (advanced) option is for new installs:

Screenshot 2014-04-14 18.53.47

After the files copy Windows will do final configuration:

Screenshot 2014-04-14 19.01.03

And we’re done:

Screenshot 2014-04-14 19.01.29

The decidedly less slick but nearly equally functional 2k8 Server Manager greets us:

Screenshot 2014-04-14 19.02.06

First step is to setup our network:

Screenshot 2014-04-14 19.02.36

And give her a name:

Screenshot 2014-04-14 19.03.51

Next we join our shiny new Windows AD domain:

Screenshot 2014-04-14 19.13.43

Provide the creds with sufficient privilege to join a PC:

Screenshot 2014-04-14 19.14.03

And we’re in!

Screenshot 2014-04-14 19.14.24

After returning from the reboot it’s time to activate:

Screenshot 2014-04-14 19.16.08

This should work with no issues, but if the key was used previously activation is just a (fully automated) phone call away:

Screenshot 2014-04-14 19.18.31

Hurray we’re genuine!

Screenshot 2014-04-14 19.19.06

Next up is the tools install once again:

Screenshot 2014-04-14 19.19.55

Restart to complete:

Screenshot 2014-04-14 19.28.28

When we return it is time to setup vSphere 5.5.  Pop in the VIMSetup-ALL volume and the Autorun will bring up the main setup:

Screenshot 2014-04-14 19.46.39

Pre-reqs check is very easy and should pass with no issues if DNS has been correctly configured and the PC can resolve its own name:

Screenshot 2014-04-14 19.47.44
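That self-resolution prerequisite is easy to verify before launching the installer. Here is a small Python sketch of the same basic check (an assumption on my part about what the pre-req validator is doing, but at minimum the machine must be able to resolve its own name):

```python
import socket

def resolves(name):
    """True if the name resolves to an IP address, False otherwise."""
    try:
        socket.gethostbyname(name)
        return True
    except socket.gaierror:
        return False

# The vCenter host should be able to resolve its own name:
me = socket.gethostname()
print(f"{me}: {'resolves OK' if resolves(me) else 'NOT resolving - fix DNS first'}")
```

Running this before the installer saves a round trip through the pre-req screen when DNS is misconfigured.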

Next we provide a password for the vCenter Single Sign On facility administrator account.  This is super important as it will be the account you have to use for initial logon to the vCenter:

Screenshot 2014-04-14 19.47.58

Here we can provide a site name.  I just stick with “default-first-site” in the lab, but in a real scenario a properly descriptive site name should be used and follow some reasonable naming convention:

Screenshot 2014-04-14 19.48.38

Here we set the TCP port for the SSO service (I leave the default):

Screenshot 2014-04-14 19.50.41

You can change the destination folder for vCenter if you want or need to:

Screenshot 2014-04-14 19.50.46

With all of the upfront work done we can go ahead and Install:

Screenshot 2014-04-14 19.50.51

Files will copy…

Screenshot 2014-04-14 19.50.55

The installation process is scripted.  At points it will appear to stop and return to the main Install screen.  It is not in fact stopping, but rather the script is still working in the background and launching the next component install.  Be patient until the final notification that all setup is completed:

Screenshot 2014-04-14 19.54.39

Here we can see the next module install (in this case vSphere Web Client) has triggered:

Screenshot 2014-04-14 19.55.25

Now the Inventory Service:

Screenshot 2014-04-14 19.57.52

And the main server itself:

Screenshot 2014-04-14 20.01.02

At this stage we are prompted for our license key:

Screenshot 2014-04-14 20.01.44

And now we must select our vCenter database.  There are two options here.  We can either utilize the included SQL 2008 Express package, which is theoretically limited to 5 hosts and 50 VMs, or we can configure an external data source (meaning a SQL server that we have already installed and have online).  If you take the latter approach, just be sure that you have your SQL authentication properly configured (either Windows or SQL auth) and that you know which user vCenter will use to log in (it should be able to create and own a database).  In my case I opt for SQL Express:

Screenshot 2014-04-14 20.07.39

We can now choose to have the SSO Service sign on under a service account rather than Local System if we want or need to:

Screenshot 2014-04-14 20.07.45

Great dialog box here giving us full control over TCP port assignment for the various vCenter network services.  I stick with defaults; your mileage will almost certainly vary:

Screenshot 2014-04-14 20.07.52

Next we size the inventory according to our projected deployment scale.   Small is the right match for almost any lab:

Screenshot 2014-04-14 20.08.01

With the options all set we can go ahead and Install:

Screenshot 2014-04-14 20.08.11

Files will be copied…

Screenshot 2014-04-14 20.08.38

SQL will be installed and configured via unattended script:

Screenshot 2014-04-14 20.09.09

Have patience while it runs…

Screenshot 2014-04-14 20.09.33

You will watch the entire SQL Express install process run lights out:

Screenshot 2014-04-14 20.11.08

When it is complete, vCenter install will continue:

Screenshot 2014-04-14 20.12.48

Still more files will be copied…

Screenshot 2014-04-14 20.13.26

Various configuration tasks will be run:

Screenshot 2014-04-14 20.14.16

Once completed, the services will start:

Screenshot 2014-04-14 20.17.13

Additional components will be installed (in this case Orchestrator):

Screenshot 2014-04-14 20.17.34

Profile driven storage…

Screenshot 2014-04-14 20.19.57

And we’re done!

Screenshot 2014-04-14 20.20.14


At this stage the main Installer finally gives us the “all clear”:

Screenshot 2014-04-14 20.20.29

Next I choose to install the optional Update Manager.  Update Manager should be installed on the administrative console that will be used for managing the server farm via GUI.  In my case I tend to run the GUI right off of the vCenter server quite often, so I install here:

Screenshot 2014-04-14 20.20.43


Install starts:

Screenshot 2014-04-14 20.20.49

Warning that Update Manager will upgrade hosts and also a chance to setup the first download immediately following install:

Screenshot 2014-04-14 20.21.15

Provide the vCenter creds for Update Manager (note: the SSO Admin creds are wanted here, or another admin user if you have created one):

Screenshot 2014-04-14 22.03.56

Once again we select a data store:

Screenshot 2014-04-14 22.04.19

And once again an opportunity to specify network port and address assignments, this time for Update Manager:

Screenshot 2014-04-14 22.04.43

A chance to change the path:

Screenshot 2014-04-14 22.05.01

A warning about disk space appears if the installation volume is south of 120GB.  I disregard it, as you can always grow this volume if you need to and my lab won’t exceed 40GB anyhow:

Screenshot 2014-04-14 22.05.24
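If you want to check your headroom before dismissing that warning, Python's `shutil.disk_usage` makes it a one-liner. The 120GB threshold mirrors the installer warning above, and the drive path in the comment is a placeholder:

```python
import shutil

def has_headroom(path, needed_gb):
    """True if the volume backing `path` has at least `needed_gb` free."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= needed_gb

# e.g. the installer warns below 120 GB; my lab won't exceed 40 GB anyhow:
# print(has_headroom("C:\\", 120))
```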

Files copying, a recurring theme!

Screenshot 2014-04-14 22.06.02

And we are done with Update Manager:

Screenshot 2014-04-14 22.06.08

Next I decide to check out the vSphere Web Client since this has become the official client (the legacy client is being deprecated).  Of course Microsoft chose to annoy admins the world over nearly a decade ago and lock down Internet Explorer to a ridiculous degree by default.  As a result surfing anywhere is a nightmare initially.  First step (for me) is to kill IE Enhanced Security Configuration which is done through Server Manager:

Screenshot 2014-04-14 20.38.15

With that done we can check out the web client.  Note that it is on port 9443 (as per our install configuration) and you will need Flash (boo! hiss! seriously though, this requirement needs to go).  To login you will again need the SSO admin credentials if and until an alternate user is created.  The web client looks really sharp:

Screenshot 2014-04-14 20.40.38

First stop I decide to explore the SSO config and add Active Directory as an authentication provider.  Head over to Roles:

Screenshot 2014-04-14 21.47.31

We can “Add an Identity Source”.  I choose AD as an LDAP server.  You will need to provide domain, DN and context info and syntax is super important.  You can refer to the screenshot to see the syntax requirements and substitute your own domain info for mine when configuring your own lab.  For the login I created a service account, but any domain account that can do a lookup against the global catalog (basically any account) should work:

Screenshot 2014-04-14 22.01.25

And our AD has been configured as an identity source!

Screenshot 2014-04-14 22.09.24

The only thing left to do is configure our base vCenter objects and add our main host to the new vCenter.  Let’s go ahead and walk through this quick and painless process.  For this I revert to the legacy client just because I’m finding it hard to cut that cord and I am less efficient in the new client.  It’s probably good that VMware is taking away the crutch though or I’d likely never learn my way around the new one!  For now we’ll stick with legacy though.  After connecting, we see a pretty blank slate.  The first step is to go ahead and “Create a Datacenter”.  This pretty much just requires choosing a name at this stage:

Screenshot 2014-04-14 22.09.59

With our new datacenter object in place, we can go ahead and “Add a Host”:

Screenshot 2014-04-14 22.10.18

We need our host IP and root login credentials to get started:

Screenshot 2014-04-14 22.10.25

Acknowledge the certificate alert (incidentally running an enterprise PKI and configuring all of the elements to use it and reference an enterprise root would remediate the endless alerts):

Screenshot 2014-04-14 22.10.42

Confirmation that the host was discovered and a chance to verify before continuing:

Screenshot 2014-04-14 22.10.50

Enter a license key (redacted to protect the innocent!):

Screenshot 2014-04-14 22.10.57

Configure lockdown mode if that’s your thing.  I endlessly SSH into hosts so this definitely stays off for me:

Screenshot 2014-04-14 22.11.01

Choose a datacenter to add the host to (we only have one):

Screenshot 2014-04-14 22.11.07

Review all of the info provided so far and finish:

Screenshot 2014-04-14 22.11.12

And that’s it!  Our host and its associated resources and VMs have been added to the vCenter and should now be managed through the vCenter interface:

Screenshot 2014-04-14 22.12.23

OK, that’s it for the sidebar.  You’ve seen how vCenter was setup and configured in between the first host installation and the nested ESX configuration where we left off.  Back to the main action!

I know that there have been (literally) thousands of articles written on nested ESX, but I decided to do one anyhow as, over time, I plan to build on this foundation entry with some content that actually will be new and interesting as it relates to hybrid cloud (stay tuned for that).  So with that out of the way, let’s review some basics about “nested ESX”.

What is Nested?

Nested ESX is exactly what it sounds like.  The idea is that you install ESX into a guest VM on a physical ESX host.  What you end up with is hypervisor on hypervisor thereby making the CPU time slicing and overall resource allocation and consumption even more complex.  So why would one do this?  Well as it turns out, this is a fantastic setup for lab testing.  You can basically build multiple virtual datacenters on a single machine and do nifty things like SRM testing.  So certainly not something one would recommend for production, but literally miraculous for labs.

What’s the Catch?

There’s always a catch, right?  Well nested is no exception, although the good news is that “out of the box” support has gotten better and better with each iteration for what started essentially as a skunk works science project.  So today there is no ESX command line hacking required, believe it or not, and ESX is actually recognized as a valid (if unsupported) guest OS.  All of that said, there are some caveats to be aware of.  The first one concerns networking.  To understand what the catch is here we first must consider what is happening in a standard ESX installation:


With virtualization, we have a physical host, running a hypervisor OS, which abstracts physical resources into virtual resource pools and brokers their consumption by guest operating systems.  So the physical uplinks which connect the host to a physical switch are connected through software to a “virtual switch”.  As virtual machines are created and deployed onto the host, they are configured with a set of virtualized hardware.  This hardware is either passed through (hardware virtualization), brokered by special software support in the guest (paravirtualization), or, in some cases, emulated.  With x86 virtualization, the CPU is time sliced and instructions are passed through.  So “virtual CPUs” are essentially timeshare units on the actual physical CPU.  I have a more extensive article on the various flavors of x86 virtualization that provides more background on these concepts.  Under ESX, networking is interesting in that there are two options.  With the VMware VMXNET virtual network interface, the guest uses a paravirtualized driver which requires installation inside the guest OS and as a result delivers optimized performance.  Alternatively, the host can emulate the function of the Intel E1000 NIC and trick the guest OS into thinking one of those actually physically exists at a given PCI I/O address range.  Whichever approach you choose, ultimately the virtual NIC will be connecting to the virtual switch.  The diagram above captures the flow.  The key point here is that the relationships are all 1:1.  A guest OS has one (or more) NICs that connect to the virtual switch, but it basically replicates how the physical world would work.  Now consider what happens when it is a hypervisor in the guest OS.

As expected, what happens is a bit of a mess.  Now you have a guest OS virtual NIC being used as the uplink for yet another virtual switch which in turn provides a connection point for additional virtual NICs that connect guests.  Where we run into trouble is that the foundation host (the physical one) managing the base virtual switch has no idea about any virtual NICs that are provisioned by a guest OS hypervisor.  As a result, this traffic gets dropped.  In turn, any traffic destined anywhere other than the virtual NIC belonging to the guest OS that the host does know about (our “primary guest”) will be dropped.  So what is the answer here?  Well it turns out we really need two things.  First, we need MAC addresses that are unknown to the physical host to be allowed to pass (these are the MAC addresses created by the guest OS hypervisor for its guests).  In addition, we then need a way for all of those guests sitting unknown up in the second hypervisor to participate in the main virtual switch.  Luckily ESX does provide enabling configuration options that solve both of these problems.  Let’s take a look:

Screenshot 2014-04-15 14.23.10


Doesn’t this look promising?  Let’s go through them one by one:

  • Promiscuous Mode – this one is exactly what it sounds like.  When enabled on a virtual switch port group, that port group essentially becomes “full broadcast”.  Any VM attached will be able to see all traffic in the port group.  Why is this?  Simply put, it ensures that the primary VSS’s ignorance of the MAC addresses upstream from it doesn’t matter.  Since every frame will be broadcast, these frames will hit the virtual port whether the switch intelligence thinks that port is a valid destination or not.  In other words this is a sledgehammer fix to the problem.  It would be much cooler if a VSS had the intelligence to actually recognize nesting and learn upstream MAC addresses, but maybe that is something for the future (or maybe it won’t matter because we will all be on NSX!)
  • MAC Address Changes – this setting deals with the problem going the other way, basically allowing the guest OS to do locally administered station address control of its virtual NIC MAC address.  This is us telling the virtual switch intelligence not to worry if the MAC address allocated to the guest VM virtual NIC happens to change.
  • Forged Transmits – a companion setting, forged transmits basically says that the virtual switch shouldn’t be concerned if MAC address 00:00:00:00:00:0B suddenly shows up at the virtual port where 00:00:00:00:00:0A had originally attached.
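All three settings can also be applied from the ESXi shell rather than the client.  A sketch, assuming a standard vSwitch named vSwitch1 (a placeholder — substitute your own) and the ESXi 5.x esxcli namespace:

```shell
# Set the security policy on the standard vSwitch backing the nested hosts.
# "vSwitch1" is a hypothetical name for this sketch.
esxcli network vswitch standard policy security set \
  --vswitch-name=vSwitch1 \
  --allow-promiscuous=true \
  --allow-mac-change=true \
  --allow-forged-transmits=true

# Read the policy back to confirm the change took.
esxcli network vswitch standard policy security get --vswitch-name=vSwitch1
```

This sets the policy at the switch level; there is an equivalent `esxcli network vswitch standard portgroup policy security` namespace if you want it scoped to a single port group instead.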

Taken as a group, these settings allow traffic to flow out from nested guests (MAC changes and forged transmits) and the return traffic to flow in to them (promiscuous mode).  So with that networking configuration done, we must be good to go, right?  Well not so fast!  There is more that needs to be done for nested to work.  The next complication comes from the configuration of the virtual machine itself.  After all, we are going to be installing a hypervisor into this guest.  These days, the virtual machine monitor is no longer a pure software thing.  Even VMware (the granddaddy of x86 VMMs and the last to move away from pure software) now utilizes CPU and chipset support for virtualization – namely Intel VT and AMD-V.  As a result, this support (normally obscured from the guest) needs to be exposed to it.  For these options we actually need the vSphere Web Client (an interesting requirement that effectively makes vCenter mandatory for nested implementations).  Luckily I do have a vCenter that I put up immediately after the initial ESXi 5.5 install on the physical host.  I documented the setup as a sidebar in case anyone would like to see the latest changes in both Windows and vCenter.


If we bring up the settings in the web client for a new virtual machine we are looking for the extended options under CPU:

Screenshot 2014-04-16 23.03.26

What we want here is two things:

  • Expose Hardware Virtualization to the Guest OS:  this means that the guest will be able to identify and access hardware based virtualization support in Intel VT and AMD-V
  • CPU/MMU Virtualization: setting this to “Hardware” for both ensures that hardware accelerated virtualization will be provided to this guest for both the CPU instruction set as well as I/O MMU operations.  The alternative is “Automatic”, but since we are installing hypervisor on hypervisor we know we will need it

In addition to these settings, once the VM has been created (and it can be in the immediate “configure settings” step that follows initial creation), we can set the OS type to correctly reflect our guest.  As we can see here, “VMware ESX 5.X” is now selectable as an OS under “Other”.  This step, incidentally, should alleviate the old need to set vhv.enable=true manually, as was required in the 5.0 and 5.1 days in order to get a 64 bit guest running on a virtual hypervisor:

Screenshot 2014-04-15 14.26.52
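Under the covers these choices land as entries in the guest’s .vmx file.  Roughly (the web client writes these for you; shown purely for reference, and worth verifying against your own VM’s file):

```
# Expose hardware virtualization (Intel VT-x / AMD-V) to the guest
vhv.enable = "TRUE"
# The "VMware ESX 5.x" guest OS type
guestOS = "vmkernel5"
```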


With the above we now have everything we need to get started deploying virtual ESX guests.  Installing these, with the pre-reqs done and the caveats in mind, is as easy as deploying any other OS.  If you follow the Windows guest deployment steps, make sure to address the VM config caveats above, and point to the VMvisor installation ISO, you will have no issues.  Similarly, adding these virtual ESX hosts to vCenter is exactly as described in the vCenter configuration entry for the physical host.  The behavior is exactly as expected.  One thing I did choose to do was create a dedicated VSS for each virtual ESX host and assign a dedicated NIC to it.  This is a very straightforward operation from either client by selecting Networking with the host as the focus and choosing “Add Networking”:

Screenshot 2014-04-16 23.20.30

Because the virtual ESX is really just a guest VM from the primary host view, we select new Virtual Machine Port Group:

Screenshot 2014-04-16 23.21.00

We want to go ahead and Create a New Virtual Switch here since we are dedicating a VSS to each virtual ESX guest:

Screenshot 2014-04-16 23.21.10

Here we will have an adapter list where we can checkbox assign an available adapter to the new VSS.  In this case my configuration is already complete so no adapters are showing, but this would be a straightforward selection followed by a straight click through on the remaining options (including naming the new VSS):

Screenshot 2014-04-16 23.21.21


When building the VM for the nested ESX guest, I attach the network adapter to the associated VSS.  So in my case 4 nested ESX instances map to 4 VSSes and 4 physical NIC ports on the host.  Speaking of architecture, this is ultimately what I am targeting as my design:



Some points worth calling out here:

  • I am planning 4 virtual ESX hosts rather than 3.  They will have 32GB of RAM, 1TB of disk, 4 vCPUs (2 virtual dual cores) and a 15GB SSD (for vSAN).
  • Each virtual ESX will be connected to a dedicated VSS on the main host which will have a dedicated physical NIC
  • All of the virtual ESX hosts will be joined to vCenter
  • A VDS will be configured across the virtual hosts only
  • I plan to ultimately install vCloud Director on top of all of this and configure tenant organizations (the VLAN and network config info up top)
  • vCenter and AD will run on the physical host along with some other core bits (vCenter Mobile Access, maybe one or two other things).  The main host is left with 5TB of disk, 10GB of SSD, 64GB of RAM, and 4 full time CPUs.
  • I may separate the 4 hosts into 2 vCenters in order to be able to simulate two sites and do SRM (still debating this)

OK, with the above architecture in mind, let’s go ahead and create the virtual ESX guests.  And *poof*, we’re done!  Cooking show style seems appropriate here, so I will show the final product:

Screenshot 2014-04-16 15.40.53


Above we can see the physical host with its guests (vESX1-4).  Below it we can see each of these guests represented as hosts in vCenter.  The last step for this entry is to create a distributed virtual switch for these hosts so we can get started with some more advanced configuration like VXLAN and vCD.  As a refresher, a virtual distributed switch requires a dedicated NIC to assign as the vDS uplink.  Well in this case, since our hosts are virtual, this is as easy as it gets!  We just need to go into the VM settings for each virtual host and “add” a new “network adapter”.  The only catch is this will require a reboot for the new (virtual) hardware to be recognized by the (virtual) server.  Once complete, we can go ahead and create a new vDS by selecting “Create a distributed switch” with the virtual datacenter as the focus in the web client:

Screenshot 2014-04-16 15.40.53

First we choose a version for our switch (I choose current for maximum compatibility with my scenario which is “testing new stuff”):

Screenshot 2014-04-16 16.32.48

Next we can name the switch and set the number of uplink ports which governs the number of “physical” connections per host (in our case just vNICs, but also largely irrelevant):

Screenshot 2014-04-16 16.32.55

We now decide if we want to add hosts now or later (I choose now – Carpe Diem!):

Screenshot 2014-04-16 16.33.06

In the next dialogue we are given an opportunity to select the hosts that will participate and the available NICs that will be connected to the vDS.  We can see here the NICs I added to each host VM (vNIC1) for this purpose:

Screenshot 2014-04-16 17.23.33

Next step is just to commit the config and create the switch.  As we can see below everything went as smooth as can be and the 4 virtual hosts are now linked by a vDS!  We are now about 70% of the way towards our diagram and all of the foundation has been laid, so this is a good stopping point.   Hope you enjoyed the (million and first) entry on Nested and stay tuned for the next entry!

Screenshot 2014-04-16 17.24.15

Two things I’ve always wanted to try in the ESX lab were DirectPath I/O and vSAN.  For the former, I always liked the idea of having a GPU accelerated virtual desktop to use as a jump server and also to test just how good GPU acceleration can be in a virtual environment.  vSAN is extremely compelling because I find the idea of highly scalable and efficient distributed file systems based on DAS to be a perfect fit for many cloud scenarios if you can architect an efficient enough capacity planning and resource allocation/consumption model to match.  In the past I never had a test server platform with VT-d (or AMD-Vi), prerequisites for VMware DirectPath I/O, so the GPU scenario was out.  With vSAN it was more about finding the time and catalyst to implement.  As it turns out, the T620 and the new lab effort solve both of these problems!  Before laying down the ESXi install, I decided to do a few hardware tweaks in service of my two stretch goals.  I had two passively cooled AMD PCI-E GPUs on hand (both R800 era… nothing fancy) and I also had a spare 80GB Intel SSD laying around (yes, I have SSDs “laying around” and should probably seek help).  In the case of the SSD, this particular requirement of vSAN (an SSD is required to act as a buffer for the virtual SAN) can be bypassed with some ESXCLI wizardry as explained by Duncan Epping, but since I had a spare I figured I might as well use it as long as the server had a spare SATA port.  First step was to open her up (something I knew I had to do at some point anyhow just to see how clean the cable routing is!).  First up is to find the ingress point.  Absolutely fantastic design element here as there are no screws (thumb or otherwise) and the entry point is fully intuitive.  A nicely molded and solid feeling lockable handle sits right on the side panel.  I unlocked it, pushed down on the release trigger by gripping the handle, and pulled forward.
The door opens smoothly and settles straight down to the horizontal on its tabs.  It can be removed as well if need be:

WP_20140415_23_13_20_Pro

Inside, things are looking clean and really great:

WP_20140415_23_14_08_Pro

Another cool thing is that the second the case was open, the intrusion detection tripped and the front panel LCD went amber and displayed the alert.  Very neat:

WP_20140415_23_16_01_Pro

Of course the alert can be acknowledged in the iDRAC (which is accessible even with the server powered off – excellent stuff):

WP_20140415_23_16_18_Pro

Scoping out the interior, I noticed right away that the tool-less design approach applies to all components and that there appear to be two free x16 PCI-E slots (one top and one bottom) as well as plenty of disk shelf space above the array cage, a spare power connector on the SATA power cable going to the DVD drive, and a single free SATA connector available on the motherboard.  So far so good!  First step was to get access to the PCI-E slots by removing the card braces:

WP_20140415_23_27_18_Pro

The brackets are easily removed by following the directions provided by the arrow and pressing down on the tab while pulling forward.  Once out, there is free access to the PCI-E slots.  The slot clips, also tool-less, can be removed with a similar squeeze and pull motion:

WP_20140415_23_28_43_Pro

With the slots cleared, it was easy work installing the two GPUs in the roomy case (top and bottom shown with clips back in place):

WP_20140416_00_08_50_Pro

WP_20140416_00_08_45_Pro

Next up was the SSD.  I decided not to do anything fancy (especially since I wasn’t 100% sure this would work).  The server is very secure and isn’t going anywhere and the disk shelves are free and clear and very conveniently placed.  The SSD is small and light so I opted to just cable it up and sit it on the shelf.  Here is a quick pic of the SSD in question before we get into the “installation”.  80GB Intel, a decent performing and very reliable (in terms of write degradation) drive back in the day:

WP_20140416_00_13_21_Pro

First up, a shot of the one free onboard SATA port (Shuttle SATA cable used for comedic effect):

WP_20140416_00_13_11_Pro

Next up, a shot of the drive bay area and free SATA power plug with the SSD “mounted”:

WP_20140416_00_15_46_Pro

And finally, a close up of the SSD nestled in the free bay:

WP_20140416_00_16_46_Pro

That’s it for the hardware tweaks.  Time to close it up and get started on the ESXi 5.5 install!  As always, this is a straightforward process.  Download and burn VMware-VMvisor-Installer-5.5.0-1331820.x86_64 to a DVD, boot her up, and let ‘er whirl.  Installer will autoload:

WP_20140414_00_25_56_Pro

Initial load:

WP_20140414_00_26_17_Pro

Installer file load:

WP_20140414_00_28_49_Pro

Installer welcome screen:

WP_20140414_01_05_36_Pro

EULA acceptance:

WP_20140414_01_05_50_Pro

Select install disk (this is the physical host, so the PERC H710 is the target):

WP_20140414_01_06_20_Pro

Select a keyboard layout:

WP_20140414_01_06_56_Pro

Set a root password:

WP_20140414_01_07_16_Pro

Final system scan:

WP_20140414_01_07_31_Pro

“Last exit before toll”:

WP_20140414_01_09_45_Pro

Off to the races!

WP_20140414_01_09_59_Pro

Like magic, (many) seconds later the installation is complete:

WP_20140414_01_32_40_Pro

WP_20140414_01_32_45_Pro

First boot of the shiny new ESXi 5.5 host:

WP_20140414_01_36_38_Pro

Splash screen and initializations are a good sign:

WP_20140414_01_37_06_Pro

As always, the first step is to configure the management network (shown here post config):

WP_20140414_02_09_41_Pro

Interesting to have a look at all of the network adapters available in this loaded system.  Select one to use for the initial management network:

WP_20140414_02_08_33_Pro

Provide some IP and DNS info or rely on DHCP:

WP_20140414_02_08_46_Pro

Commit the changes, restart the network and give it a test!

WP_20140414_02_09_53_Pro

Did everything work?  Indeed it did, thanks for asking!

Screenshot 2014-04-16 13.56.55

How about the SSD and the DirectPath GPUs?  Let’s take a look.  First DirectPath, because the anticipation is killing me.  From the vSphere client, DirectPath settings are found under the Advanced subsection of the Configuration tab when the focus is a host.  The view will initially display an error if the server is incapable of DirectPath (no VT-d or AMD-Vi), or a blank box with no errors or warnings if it is capable.  From here, we click “Edit” in the upper right hand corner to mark devices for passthrough usage.  The following (very interesting) dialogue box pops up:

Screenshot 2014-04-16 00.34.14

Here we can see all of the PCI devices installed in the system and recognized by ESXi.  In the list we can see the AMD GPUs and their sub-devices.  We are also able to select them.  So far so good!  Click the checkboxes and you will get a notice that the sub-devices will also be selected.  Acknowledge and click OK.  We can see that the AMD GPUs have been added and will, in theory, be available for assignment pending a host reboot (yikes):

Screenshot 2014-04-16 00.34.39


Following the (long) reboot cycle, return here and find that the GPUs are in fact available for assignment.  Hallelujah!  I am not going to assign them to a guest yet, but we will revisit this when I create the Windows 8 jump VM:

Screenshot 2014-04-16 13.01.54
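The same inventory can be cross-checked from the shell.  A sketch — `esxcli hardware pci list` is a real ESXi command, but the grep filter is just a guess at the class description text, so adjust it to what your output actually shows:

```shell
# Dump the PCI devices ESXi knows about and filter around the display
# controllers; handy for confirming the GPUs are visible before (and after)
# marking them for passthrough.
esxcli hardware pci list | grep -i -B 2 -A 4 display
```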

So far so good.  The DirectPath seemed like the more complicated mission, so I am feeling a bit cocky as I move forward with the SSD configuration.  Of course, as always in technology, that is exactly when Murphy’s Law chooses to strike.  As it turns out, I had forgotten that the last time I used this SSD it was part of a GPT RAID 0 array.  As a result, it has an extremely invalid partition table.  ESXi can see it, but errors out attempting to use it.  I decided to see how things were looking from the command line.  Of course, as always, that means first enabling SSH.  The first step is to set the focus to the host and head over to the Security Profile section of the Configuration tab:

Screenshot 2014-04-16 00.40.45

Select the SSH service under Services and click Properties in the upper right corner.  This will invoke the Services Properties dialogue where we can highlight the SSH service and select Options:

Screenshot 2014-04-16 00.41.08

In the following dialogue box we can start the service as well as configure its future startup behavior:

Screenshot 2014-04-16 00.41.12
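As a quick aside, if you already have local console access (DCUI > Troubleshooting Options), the same service can be flipped on without touching the client.  A sketch using the stock vim-cmd wrapper:

```shell
# Enable the SSH (TSM-SSH) service and start it immediately.
vim-cmd hostsvc/enable_ssh
vim-cmd hostsvc/start_ssh
```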

Next up it’s time to hit up PuTTY.  Of course on first connect we will get the SSH host key warning that we can just acknowledge and ignore:

Screenshot 2014-04-16 00.41.43

At that point after a quick root login we are in.  The first step is to find out how the system views the SSD device.  The best way to do this is with the ESXCLI storage enumeration command esxcli storage core device list:

Screenshot 2014-04-16 00.46.07

Wow, the SSD has a pretty odd device header!  That’s OK though; this is why copy/paste was invented!  Device name in hand, and knowing that this disk had a GPT partition table, I gave partedUtil a try.  Unfortunately partedUtil wasn’t interested either, reporting “ERROR: Partition cannot be outside of disk”.  As my luck would have it, this was the first disk in the span set and so is the one with the “too big” partition table (the partition table for the entire span set).  After rebooting into various “live CD” and “boot repair” tools I had on hand, and failing miserably for various reasons (inability to recognize the Dell onboard SATA, confusion over the system having 3 GPUs, inability to recognize the onboard GPU, etc.), I finally had a brainstorm – the trusty ESX installation DVD!  Sure enough, the ESXi 5.5 installer was able to see the SSD and was perfectly happy nuking and repartitioning it.  At that point I had an ESX 5.5 installation partition structure (9 partitions!) hogging up about 6GB of space.  On an 80GB SSD that’s a lot of space, so I went back to the ESX command line to try partedUtil.  This time things went much better!

partedUtil get /dev/disks/t10.ATA_____INTEL_SSDSA2M080G2GC____________________CVPO012504S9080JGN__

This command returned the partition info: a list of 9 partitions with starting/ending locations enumerated.

partedUtil delete /dev/disks/t10.ATA_____INTEL_SSDSA2M080G2GC____________________CVPO012504S9080JGN__ N
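For the record, the nine manual deletions can be wrapped in a quick loop from the ESXi shell.  A sketch using the same device path; deleting from the highest partition number down avoids any renumbering surprises:

```shell
DISK=/dev/disks/t10.ATA_____INTEL_SSDSA2M080G2GC____________________CVPO012504S9080JGN__

# Delete partitions 9 down to 1 (the list is spelled out rather than
# relying on seq being present in the busybox build).
for i in 9 8 7 6 5 4 3 2 1; do
    partedUtil delete "$DISK" $i
done
```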

With this command I was able to go ahead and delete the partitions iterating through 1-9.  Once deleted, the vSphere client was able to easily add the new datastore.  With the host as the focus again, under the Storage subsection of the Configuration tab, we can now Add a new datastore.  We’re adding a local disk, so we select Disk/LUN:

Screenshot 2014-04-16 00.36.11

Next up we select it and we can see the SSD here with its wacky device name:

Screenshot 2014-04-16 00.36.17

Next we select a file system (I’m going with VMFS 5 to keep all datastores consistent and because I plan to do vSAN):

Screenshot 2014-04-16 00.36.23

Current Disk Layout is where things errored out the first time through when the partition table was wonky.  This time we sail right past both Disk Layout and Properties (naming it “SSD”) with no errors.  For formatting, I choose to allocate the maximum space available:

Screenshot 2014-04-16 13.01.27

With everything looking good, we can click Finish to create the new datastore:
Screenshot 2014-04-16 13.01.33
And voila!  One shiny new datastore online and ready for later experimentation!

Screenshot 2014-04-16 12.51.04
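Incidentally, the whole add-datastore flow has a command line equivalent too.  A rough sketch only — partedUtil and vmkfstools are real ESXi utilities, but I haven’t run this exact sequence, so treat the partition math in particular as something to verify before use:

```shell
DISK=/dev/disks/t10.ATA_____INTEL_SSDSA2M080G2GC____________________CVPO012504S9080JGN__

# Lay down a fresh GPT label, then a single VMFS partition spanning the disk.
partedUtil mklabel "$DISK" gpt
# getUsableSectors prints "first last"; grab the last usable sector.
END=$(partedUtil getUsableSectors "$DISK" | awk '{print $2}')
# AA31E02A400F11DB9590000C2911D1B8 is the VMFS partition type GUID.
partedUtil setptbl "$DISK" gpt "1 2048 $END AA31E02A400F11DB9590000C2911D1B8 0"

# Format partition 1 as VMFS5 with the volume label "SSD".
vmkfstools -C vmfs5 -S SSD "$DISK:1"
```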

Well that’s a wrap for yet another entry in the series!  We now have a fully functional ESXi 5.5 base host with 9.2TB of primary RAID based DAS for VMs and a secondary 80GB SSD datastore that will be used to support vSAN.  DirectPath is ready to go for the VDI guest.  Next up is the nested ESX installation followed by the VDI install.  Stay tuned and thanks for reading!



Last entry we got to the point where we had swapped out one of the 1TB Toshiba 7.2k SATA 2 drives for a Western Digital Red 2TB SATA3 drive and were ready to power on the server to see what would happen.  My assumption based on past experience was that the PERC H710 would complain (a lot) at best, or outright block the disk at worst.  Here is how things turned out:

Initial boot actually seemed to go well with no complaints evident at all!  Front lights were all green and nothing was showing in the PERC logs, nor were there any warnings being thrown.  Inspired, I went ahead and replaced the other 5 drives, then prepared to do a real configuration of the RAID card and see if the behavior stayed consistent.  This was my first time configuring a PERC and I was pleasantly surprised by how intuitive the configuration was.  I captured it on video rather than taking screen shots:

Really a great outcome!  9.2TB RAID 5 volume online in no time at all with no warnings and solid green lights.  The addition of the T620 into the home office necessitated a full redesign of equipment placement due to space constraints.  I initially considered keeping my 4 slimline white box hosts, but after seeing how cool and quiet the T620 was at first boot, I made the decision to stick with a nested ESX/one-box configuration.  This decision went a long way towards reducing the clutter and made for a pretty decent final layout.  Here is a shot of the full set of lab kit tidily stacked together awaiting installation:


2014-04-13 00.51.46


And here is a shot of the server nestled under the wife’s desk along with the fully cabled switch, firewall, NAS and UPS:

2014-04-14 20.50.38

At this point, with all of the equipment fully installed and in place, I was satisfied with the room layout, so I decided to go ahead and explore the T620 BIOS as well as set up the iDRAC 7 in preparation for the OS install.  Rather than do a video, I took screenshots for these so I could more easily insert some written commentary.  First up let’s have a look at the initial power on screens.  On power-up we are greeted with an update letting us know that the server is “Configuring Memory”.  I assume that this is just the usual memory count, quick parity check and initialization as, even with 192GB of ECC RAM, it passes fairly quickly (10 seconds or so):



With the memory subsystem initialized, next up is the iDRAC board.  Extremely cool that the iDRAC is initialized as early as possible during POST (very useful for an out-of-band management lights out board!):



Next up a nicely comprehensive report on CPU and memory configuration.  I really like how verbose this is, including a voltage report for the RAM.  In addition here you can see your pre-OS options – F2 for System Setup, F10 to invoke the enormously cool LifeCycle Controller capabilities which are a part of the iDRAC 7 and allow you to do full bare metal setup (including BIOS settings) remotely via the iDRAC, F11 to bring up the BIOS boot menu and prompt for boot device selection, and finally F12 to invoke PXE boot via the NIC ROM.  Speaking of which, the last bit of info we can see here is the PXE boot agent headers as well as the SATA AHCI BIOS header:



For this boot I selected “System Setup” which, after about 15 seconds thinking or so, launches the GUI based (with mouse support) main setup menu from which BIOS, iDRAC and Device settings can be reviewed and configured:


Starting off with the System BIOS we can see a fairly intuitive list of settings groups – System Information, Memory and Processor, SATA, Boot and Integrated Devices, Serial, System Profile, Security and last but not least Miscellaneous:



Let’s take a deeper look at a few of these starting with the Integrated Devices Settings… Most of these are self-explanatory (USB, NIC, etc.), but a few are worth some additional discussion.  I/OAT DMA Engine refers to Intel I/O Acceleration Technology, part of the Intel virtualization enablement technologies, which provides increased network interface throughput and efficiency by allowing direct chipset control of the NIC.  The I/OAT DMA setting enables this capability, allowing direct memory access from the NIC via the chipset.  Of course in order for this to work, the host OS has to be aware of it.  The default setting is disabled and, considering the current VMware support statement for this capability, I opted to leave it that way.  SR-IOV, or “Single Root I/O Virtualization”, is another peripheral virtualization technology, this time controlled by the PCI SIG, which allows a single PCI device to appear as multiple devices.  A common example of where this technology has been implemented is in advanced 10Gb/s adapters, allowing them to virtualize themselves and present multiple discrete interfaces to the host OS from a single physical device.  Once again the default here is disabled and I opted to leave it that way.  Last but not least, Memory Mapped I/O Above 4GB is pretty standard stuff for 64 bit PCI systems and allows, as it implies, 64 bit PCI devices to map I/O to 64 bit memory ranges (above 4GB):



The next interesting grouping is the System Profile Settings.  Lots of great goodies in here including some of the usual suspects like memory voltage, frequency and turbo boost tuning.  The most interesting aspect, however, is that the top line allows for quick profile setting via template using some pre-defined defaults.  In my case I am most concerned with power and thermal efficiency so I set my configuration to “Performance Per Watt”.  It’s great how much granularity is provided for control of CPU and memory power efficiency:


Next up are the Security Settings.  Here we can set the basic system access and configuration passwords as well as control the Trusted Platform Module (TPM) state and settings:


Leaving most settings at either their default or “power efficient” values I left the System BIOS Settings behind and moved on to the iDRAC configuration.  Up top you get a summary of both the settings and firmware versions as well as the option to dive deeper on the Summary, Event Log, Network settings, Alert status, Front Panel Security settings, VirtualMedia settings, vFlash Mode, LifeCycle Controller, System Location, User Accounts, Power and Thermal settings:


The System Summary section gives you an opportunity to set some basic asset information including Data Center location Name, rack position, etc.  Super handy if you are centrally managing hundreds (or thousands) of iDRACs across a global footprint (cue salivating Dell reps!).   For a single server home lab setup it isn’t super relevant, but I set some info down just for fun and testing:


We also have an opportunity to create iDRAC users and assign roles.  This is critical as these will be the credentials you will use to access the iDRAC remotely via its web portal or other remote interfaces.  Privilege levels are Administrator (full control), Operator (limited task management) and User (restricted access):


The System Event Log menu, as expected, provides visibility into the iDRACs log:


As we can see here on a new install the log is pretty sparsely populated!


The Network Settings menu is the diametrical opposite of the System Event Log; lots and lots of fun levers to pull here.  Up top you can enable network access for the iDRAC and, if you have an Enterprise class board, configure it to use the dedicated NIC for access.  In addition one of the onboard Ethernet ports can be used as well.  The usual Ethernet settings are here along with the option to register the iDRAC board in DNS:


Of course it is helpful to be able to set a name for something configured to register in DNS and we are able to do that here as well.  In addition we can provide a static domain name or allow auto configuration as part of the DNS registration process.  We can also do all of the expected IPV4 and IPV6 configuration:



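Before committing a static address here, it is worth sanity-checking the plan, since a fat-fingered gateway means a truck roll (or at least a walk) to the console. A minimal sketch using Python’s stdlib `ipaddress` module; the function name and addresses are mine:

```python
import ipaddress

def validate_static_config(address: str, prefix: int, gateway: str) -> bool:
    """Check a proposed static management address before applying it:
    the gateway must share the subnet, and the address itself must not
    be the network or broadcast address."""
    network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    addr = ipaddress.ip_address(address)
    gw = ipaddress.ip_address(gateway)
    usable = addr not in (network.network_address, network.broadcast_address)
    return usable and gw in network

print(validate_static_config("192.168.10.120", 24, "192.168.10.1"))  # True
print(validate_static_config("192.168.10.120", 24, "192.168.20.1"))  # False: wrong subnet
```

The same function works unchanged for IPv6 addresses, since `ipaddress` handles both families.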
Last up we have two really cool options: IPMI over LAN and VLAN configuration.  VLAN configuration allows us to configure 802.1Q tagging, which is super important for any management device (in production it will almost certainly sit on a management VLAN).   IPMI over LAN allows the iDRAC to participate in an Intelligent Platform Management Interface based console implementation:



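For the curious, the 802.1Q tag the switch and iDRAC agree on is just four extra bytes in the Ethernet frame. A small sketch of packing one (purely illustrative; the iDRAC only asks you for the VLAN ID and priority):

```python
import struct

TPID = 0x8100  # 802.1Q Tag Protocol Identifier

def build_vlan_tag(vlan_id: int, priority: int = 0, dei: int = 0) -> bytes:
    """Pack the 4-byte 802.1Q tag: TPID followed by the TCI
    (3-bit priority, 1-bit drop-eligible indicator, 12-bit VLAN ID)."""
    if not 0 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be 0-4094")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID, tci)

# A hypothetical management VLAN 100 at default priority:
print(build_vlan_tag(100).hex())  # 81000064
```

So "putting the iDRAC on VLAN 100" really just means every frame it sends carries that `8100 0064` tag, and the switch port must be configured to expect it.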
Under Alerts we are able to set SNMP trap destinations for platform events generated by the iDRAC:

Thermal settings allow us to set a thermal profile for the board or control the fans independently if we prefer:


Power Configuration is a really powerful option which provides the ability to set hardware-level power capping for the system, as well as configure the redundant PSUs (up to 2).   Used in conjunction with Dell OpenManage and vCenter DPM, the iDRAC power capping capability can keep server energy consumption at predictable levels and shift resources around as needed when those levels are exceeded.  Architected correctly, this can get as aggressive as shutting down VMs in order to stay within a “power budget”:



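The "power budget" idea above boils down to a simple control loop: compare a reading against the cap and escalate the response. A hedged sketch of that logic; the thresholds and action names are mine, not Dell OpenManage or DPM behavior:

```python
def power_action(reading_watts: float, cap_watts: float,
                 hard_limit_ratio: float = 1.2) -> str:
    """Decide a response to a power reading against a configured cap.

    Illustrative escalation policy:
      - at or under the cap: nothing to do
      - moderately over: migrate load away (e.g. let DRS/DPM rebalance)
      - far over the cap: shed load by shutting down low-priority VMs
    """
    if reading_watts <= cap_watts:
        return "ok"
    if reading_watts <= cap_watts * hard_limit_ratio:
        return "migrate"
    return "shutdown-vms"

print(power_action(450, 500))  # ok
print(power_action(550, 500))  # migrate
print(power_action(650, 500))  # shutdown-vms
```

In practice the iDRAC enforces the hardware cap itself (by throttling), while the orchestration layer decides whether to move or shed workload before throttling hurts performance.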
Front Panel Security provides a really rich set of physical security controls for cases where the server is sitting in a real production environment.  One neat item is that you can control the message displayed on the front panel LCD:



With the iDRAC all configured and the network interface online and cabled up, we can now access the excellent web-based interface.  The login page is clean and quite slick:


System Summary provides huge detail at a glance, as well as hyperlinks to deeper-dive info for individual subsystems.  You can also see the thumbnail snapshot of the virtual console here.  The virtual console is absolutely a killer feature of the iDRAC: it provides hardware-backed remote console access independent of any OS, allowing for bare-metal configuration and management even on a headless server:


A fantastic view of power consumption, both in real time and historically:


In keeping with the common theme, here is another fantastically detailed view, this time focused on the disk subsystem, both Physical:


And Virtual:


Last but not least, the bells and whistles: configuration options for the front panel LCD, here set to display real-time power consumption:



With the iDRAC rockin’ and rollin’, there is only one top-level System Settings sub-menu left: Device Settings.  Here we can see enumerated the installed and recognized devices in the system, and can access additional configuration detail.  In this build we have the iDRAC, the Intel NICs (both add-in and embedded), the Broadcom 10Gb/s NIC and the PERC H710:
Speaking of which, let’s have a look at the PERC H710 setup.  On first boot we determined that the T620 had no problem with the Western Digital Red 2TB OEM disks and, as you can see in the iDRAC physical disk ‘spoiler’ screen above, it even recognizes them at 6Gb/s!  With a full boat of 6 drives on board, it’s time to make a virtual disk.   Ctrl+R during POST invokes the RAID BIOS setup utility.  I decided to do a quick video of the initial setup:

And here are some static shots of the successful configuration.  All six physical disks displayed along with the 9.2TB virtual disk:


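The reported capacity is easy to sanity-check. A quick sketch (function name is mine) of usable capacity for six 2 TB disks under a few common RAID levels; a single-parity layout yields 10 decimal TB, roughly 9.1 TiB, which is in the ballpark of the ~9.2TB the controller reports:

```python
def usable_tb(disks: int, disk_tb: float, level: str) -> float:
    """Usable capacity (decimal TB) for a few common RAID levels."""
    if level == "raid0":
        return disks * disk_tb
    if level == "raid5":
        return (disks - 1) * disk_tb   # one disk's worth of parity
    if level == "raid6":
        return (disks - 2) * disk_tb   # two disks' worth of parity
    if level == "raid10":
        return disks * disk_tb / 2     # mirrored pairs
    raise ValueError(level)

# Six 2 TB WD Reds; controllers often display binary units,
# so convert decimal TB to TiB for comparison.
for level in ("raid0", "raid5", "raid6", "raid10"):
    tb = usable_tb(6, 2.0, level)
    tib = tb * 1e12 / 2**40
    print(f"{level}: {tb:.0f} TB ({tib:.2f} TiB)")
```

The gap between "2 TB" on the label and ~1.82 TiB per disk in the utility is just the decimal-versus-binary unit difference, not missing capacity.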
A detailed view of the Physical Disk Management tab (product ID correct, no warnings to be found):



The Controller Management tab allows you to enable the BIOS and control its behavior, as well as set the default boot device for the RAID controller:



Controller properties, including temperature.  Running at 49°C.  Seems a bit warm, but certainly acceptable:


And that pretty much wraps up this entry!  We now have a fully installed and configured server with out-of-band management ready to go and a nice juicy virtual disk awaiting OS install.  Next up: installing ESXi 5.5, so stay tuned!