VMware VCP 2020 vSphere 7 Edition Exam Prep Guide (pt.2)

Picking up where we left off, here is Section 2. Once again, this version has been shaken up quite a bit from previous VCP objectives; this section is a bit lighter than Section 1. Let’s dig in.

Section 2 – VMware Products and Solutions

Objective 2.1 – Describe the role of vSphere in the software-defined data center (SDDC)

While I think most are acquainted with what VMware is referring to when they say SDDC or Software-Defined Data Center, let us do a quick refresh for anyone who may not be aware.

VMware’s vision is a data center that is fully virtualized and completely automated. The end goal is for all of these different pieces to be delivered as a service. vSphere is one of the main cornerstones and is what makes the rest of this vision possible. What does this look like? Here is a picture (credit to VMware):

The bottom layer is hardware. From there, the next layer is vSphere, which provides software-defined compute and memory. Next is vSAN, which provides software-defined storage, and finally NSX, which provides software-defined networking. Cloud Management is the next layer up.

Becoming cloud-like is the goal. Why? Cloud services are mobile, easy to move around as needed, and easy to start up and scale, both up and down. With a self-service portal and cloud-like services, requests that previously took weeks or months to fulfill now take hours or even minutes. Using automation to deliver these services ensures it’s done the same way, every time. Using automation also makes it easy to track requestors and do appropriate charge-backs. vRealize Operations ensures that you quickly see and are notified if you are low on resources and when to plan for more. Site Recovery Manager and vSphere Replication enable you to continue offering those services even in the case of a disaster. But it all begins with vSphere.

Objective 2.2 – Identify use cases for vCloud Foundation

vCloud Foundation is a large portion of that SDDC picture above, but instead of needing to install each piece manually, it gives you an easy install button. This easy button comes in two forms. The first is an appliance called VMware Cloud Builder. This appliance was initially a way to help VMware Professional Services implement VMware Validated Designs. It was released to the general public in January of 2019. The appliance itself can deploy the full SDDC stack, including:

        • VMware ESXi
        • VMware vCenter Server
        • VMware NSX for vSphere
        • VMware vRealize Suite Lifecycle Manager
        • VMware vRealize Operations Manager
        • VMware vRealize Log Insight
        • Content Packs for Log Insight
        • VMware vRealize Automation
        • VMware vRealize Business for Cloud
        • VMware Site Recovery Manager
        • vSphere Replication

The second easy button is an appliance installed in vCloud Foundation called SDDC Manager. This tool automates the entire lifecycle, from bring-up through configuration and provisioning to updates and patching, not only for the initial management cluster but for infrastructure and workload clusters as well. It also makes deploying Kubernetes much easier. For VMware vCloud Foundation, the Cloud Builder appliance installs only the following:

        • SDDC Manager
        • VMware vSphere
        • VMware vSAN
        • NSX for vSphere
        • vRealize Suite

Now that we have a better understanding of what vCloud Foundation is, let us talk use cases. VMware has highlighted the main ones here. Those use cases are:

        • Private and Hybrid Cloud
        • Modern Apps (Development)
        • VDI (Virtual Desktop Infrastructure)

It’s an exciting product, and VMware says that it simplifies management and deployment and reduces operational time. If you want to take a look at it, there are free Hands-On Labs VMware has made available here.

Objective 2.3 – Identify migration options

One of the coolest features of vSphere, in my opinion, is the ability to migrate VMs. The first iteration of this was in VMware VirtualCenter 1.0 in 2003. Specifically, this was a live migration: a running virtual machine, application and all, could move to another host with no interruption. This was amazing for the time, and it’s still a fantastic feature today. There are several different types of migrations. They are:

        • Cold Migration – This migration moves a powered-off or suspended VM to another host.
        • Hot Migration – This migration moves a powered-on VM to another host.

Additionally, different sub-types exist depending on what resource you want to migrate. Those are:

        • Compute only – This migrates a VM (compute and memory), but not its storage, to another host.
        • Storage only – This migrates a VM’s storage, but not its compute and memory, to another datastore.
        • Both compute and storage – This is just how it sounds: it moves both compute/memory and storage to a different location.

Previously these migrations were known as a vMotion (compute only), svMotion (storage only), and xvMotion or Enhanced vMotion (both compute and storage). To enable hosts to use this feature, hosts on both sides of the migration must have a VMkernel network adapter enabled for vMotion. Other requirements include:

        • If a compute migration, both hosts must be able to access the datastore where the VM’s data resides.
        • At least a 1 Gb Ethernet connection
        • Compatible CPUs (or Enhanced vMotion Compatibility mode enabled on the cluster.)

Another type of migration is a cross vCenter migration. This migrates a VM between vCenter Server systems that are connected via Enhanced Linked Mode. The vCenter Servers’ times must be synchronized with each other, and both must be at vSphere version 6.0 or later. Using cross vCenter Server migration, you can also perform a Long-Distance vSphere vMotion Migration. This type of migration is a vMotion to another geographical area within 150 milliseconds of latency, and there must be a connection speed of at least 250 Mbps per migration.
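Those two limits lend themselves to a tiny preflight check. Here is a minimal sketch; the function name and inputs are made up for illustration and are not part of any VMware API:

```python
# Hypothetical sanity check for a Long-Distance vMotion, based on the
# limits quoted above: latency within 150 ms and at least 250 Mbps of
# bandwidth per concurrent migration.

def long_distance_vmotion_ok(latency_ms: float, bandwidth_mbps: float,
                             concurrent_migrations: int = 1) -> bool:
    """Return True if the link meets the documented vMotion limits."""
    if latency_ms > 150:        # latency must stay within 150 ms
        return False
    # Each in-flight migration needs 250 Mbps of its own.
    return bandwidth_mbps >= 250 * concurrent_migrations

print(long_distance_vmotion_ok(90, 500))     # True: within both limits
print(long_distance_vmotion_ok(90, 500, 3))  # False: 3 migrations need 750 Mbps
```

Note that the bandwidth requirement is per migration, so several simultaneous long-distance vMotions multiply the needed bandwidth.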

Now that we have identified the types of migrations, what exactly is vSphere doing to work this magic? When the administrator initiates a compute migration:

        • A VM is created on the destination host called a “shadow VM.”
        • The source VM’s memory is copied over the vMotion network to the destination host’s VM. The source VM is still running and being accessed by users during this, potentially updating memory pages.
        • Another copy pass starts to capture those updated memory pages.
        • When almost all the memory has been copied, the source VM is stunned or paused for the final copy and transfer of the device state.
        • A Gratuitous ARP or GARP is sent on the subnet updating the VM’s location, and users begin using the new VM.
        • The source VM’s memory pages are cleaned up.
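The pre-copy loop above can be modeled in a few lines. This is a toy simulation with invented numbers (page counts and dirty rates), not VMware’s implementation:

```python
# A toy model of the iterative memory pre-copy described above. Pages the
# still-running VM dirties during a copy pass must be re-sent, so passes
# repeat until the remaining dirty set is small enough to stun the VM for
# the final transfer.

def precopy(total_pages=10_000, dirty_per_sent=0.05, stun_threshold=100):
    dirty = total_pages                  # the first pass sends every page
    passes = 0
    while dirty > stun_threshold:
        passes += 1
        sent = dirty
        # Pages re-dirtied while this pass ran, modeled as proportional
        # to how long the pass took (i.e., how many pages it sent).
        dirty = int(sent * dirty_per_sent)
        print(f"pass {passes}: sent {sent} pages, {dirty} re-dirtied")
    print(f"stun VM, final copy of {dirty} pages plus device state")
    return passes, dirty

precopy()
```

With these invented numbers the copy converges in two passes; a VM that dirties memory faster than the network can send it would never converge on its own, which is why the final stun exists at all.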

What about a storage vMotion?

        • Initiate the svMotion in the UI.
        • vSphere copies the data using the VMkernel data mover, or offloads the copy to the storage array if it supports vSphere Storage APIs Array Integration (VAAI).
        • A new VM process is started.
        • While the copy is ongoing, I/O is split using a “mirror driver” and sent to both the old and new virtual disks.
        • vSphere cuts over to the new VM files.
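The “mirror driver” step is the interesting part: while the bulk copy runs, guest writes land on both disks, so the destination is consistent at cutover. A minimal sketch, with an invented MirroredDisk class standing in for the real driver:

```python
# A minimal sketch of the "mirror driver" idea: while the background copy
# of the source disk is in flight, every guest write is sent to BOTH the
# old and new virtual disks, so the destination is consistent at cutover.
# The MirroredDisk class is invented for illustration.

class MirroredDisk:
    def __init__(self, source_blocks):
        self.src = list(source_blocks)       # old virtual disk
        self.dst = [None] * len(self.src)    # new virtual disk being built
        self.copied = 0                      # background copy cursor

    def copy_step(self, n=1):
        """The data mover copies the next n blocks in the background."""
        for _ in range(n):
            if self.copied < len(self.src):
                self.dst[self.copied] = self.src[self.copied]
                self.copied += 1

    def guest_write(self, block, data):
        """The mirror driver sends the write to both disks."""
        self.src[block] = data
        self.dst[block] = data

d = MirroredDisk(["a", "b", "c", "d"])
d.copy_step(2)           # blocks 0-1 copied in the background
d.guest_write(3, "X")    # an in-flight guest write lands on both disks
d.copy_step(2)           # copy finishes; block 3 already holds "X"
print(d.dst)             # ['a', 'b', 'c', 'X']
```

Without the mirroring, a write to an already-copied block would leave the destination disk stale at cutover.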

Migrations are useful for many reasons. Being able to relocate a VM off one host or datastore to another enables sysadmins to perform hardware maintenance, upgrade or update software, and redistribute load for better performance. You can also enable encryption for migrations to be more secure. It’s a massive tool in your toolbox.

Objective 2.4 – Identify DR use cases

Many types of disasters can happen in the datacenter, from something as small as a power outage on a single host to large-scale natural disasters. VMware tries to cover you with several types of DR protection.

High Availability (HA):

HA works by pooling hosts and VMs into a single resource group. Hosts are monitored, and in the event of a failure, VMs are restarted on another host. When you create an HA cluster, an election is held, and one of the hosts is elected master; all others are subordinates. The master host has the job of keeping track of all the VMs that are protected and communicating with vCenter Server. It also needs to determine when a host fails and distinguish that from when a host no longer has network access. Hosts communicate with each other over the management network. There are a few requirements for HA to work.

        • All hosts must have a static IP or persistent DHCP reservation
        • All hosts must be able to communicate with each other, sharing a management network

HA has several essential jobs. One is determining the priority and order in which VMs are restarted when an event occurs. HA also has VM and Application Monitoring. The VM Monitoring feature directs HA to restart a VM if it doesn’t detect a heartbeat from VMware Tools. Application Monitoring does the same task with heartbeats from an application. VM Component Protection, or VMCP, allows vSphere to detect datastore accessibility problems and restart the VM if a datastore is unavailable. For exam takers: in the past, VMware has tried to trick people on exams by using the old name for HA, which is FDM, or Fault Domain Manager.

There are several options in HA you can configure. Most defaults will work fine and don’t need to be changed unless you have a specific use case. They are:

  • Proactive HA – This feature receives messages from a provider like Dell’s OpenManage Integration plugin. Based on those messages, HA migrates VMs to a different host due to the possible impending doom of the original host. It makes recommendations in Manual mode or moves them automatically in Automated mode. After the VMs are off the host, you can choose how to remediate the sick host. You can place it in Maintenance mode, which prevents it from running any future workloads. Or you could put it in Quarantine mode, which allows it to run some workloads if performance is affected. Or a mix of those with…. Mixed Mode.
  • Failure Conditions and responses – This is a list of possible host failure scenarios and how you want vSphere to respond to them. This is better and gives you way more control than in past versions (5.x).
  • Admission Control – What good is a feature to restart VMs if you don’t have enough resources to do so? Admission Control is the gatekeeper that makes sure you have enough resources to restart your VMs in case of a host failure. You can ensure resource availability in several ways: dedicated failover hosts, cluster resource percentage, slot policy, or you can disable it (not useful unless you have a specific reason). Dedicated failover hosts are dedicated hot spares; they do no work and run no VMs unless there is a host failure, which makes this the most expensive option (other than failure itself). Slot Policy takes the largest VM CPU reservation and the largest VM memory reservation (which can come from two different VMs) and combines them into a “slot,” then determines how many slots your cluster can satisfy. From that, it works out how many hosts can fail while still keeping all VMs powered on. Cluster Resource Percentage looks at the total resources needed and the total available, and tries to keep enough resources free to permit you to lose the number of hosts you specify (subtracting those hosts’ resources). You can also override this and set aside a specific percentage. For any of these policies, if a failure would leave the cluster unable to satisfy the resources of the existing VMs, it prevents new VMs from powering on.
  • Heartbeat Datastores – Used to monitor hosts and VMs when the HA network has failed. It determines if the host is still running or if a VM is still running by looking for lock files. This automatically uses at least 2 datastores that all the hosts are connected to. You can specify more or specific datastores to use.
  • Advanced Options – You can use this to set advanced options for the HA Cluster. One might be setting a second gateway to determine host isolation. To use this, you need to set two options.
    1) das.usedefaultisolationaddress and
    2) das.isolationaddress[…]

    The first specifies not to use the default gateway, and the second sets additional addresses.
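To make the Slot Policy arithmetic described above concrete, here is a back-of-the-envelope model. It is an illustration under simplified assumptions (reservations only, worst-case choice of failed hosts), not VMware’s actual algorithm:

```python
# A back-of-the-envelope model of HA Slot Policy admission control.
# vms: list of (cpu_mhz, mem_mb) reservations per VM
# hosts: list of (cpu_mhz, mem_mb) capacity per host

def slot_policy(vms, hosts, host_failures_to_tolerate=1):
    # The slot is sized by the largest CPU and largest memory reservation,
    # which may come from two different VMs.
    slot_cpu = max(cpu for cpu, _ in vms)
    slot_mem = max(mem for _, mem in vms)
    # Slots per host: limited by whichever resource runs out first.
    per_host = [min(cpu // slot_cpu, mem // slot_mem) for cpu, mem in hosts]
    # Worst case: assume the hosts with the most slots are the ones lost.
    surviving = sorted(per_host)[:len(hosts) - host_failures_to_tolerate]
    return sum(surviving) >= len(vms)   # can the survivors restart every VM?

hosts = [(8000, 32768)] * 3                      # three identical hosts
print(slot_policy([(500, 2048)] * 10, hosts))    # True: 32 surviving slots
print(slot_policy([(500, 2048)] * 40, hosts))    # False: only 32 slots left
```

Notice how one VM with a huge reservation inflates the slot size for everyone, which is the classic gotcha with Slot Policy.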

There are a few other solutions that touch more on Disaster Recovery.

Fault Tolerance

While HA keeps downtime to a minimum, the VM still needs to be powered back on from a different host. If you have a higher-priority VM that can’t withstand any outage, Fault Tolerance is the feature you need to enable.

Fault Tolerance or FT creates a second running “shadow” copy of a VM. In the event the primary VM fails, the secondary VM takes over, and vSphere creates a new shadow VM. This feature makes sure there is always a backup VM running on a second, separate host in case of failure. Fault Tolerance has a higher resource cost due to higher resilience; you are running two exact copies of the same VM, after all. There are a few requirements for FT.

        • Supports up to 4 FT VMs per host, with no more than 8 vCPUs between them
        • VMs can have a maximum of 8 vCPUs and 128 GB of RAM
        • HA is required
        • There needs to be a VMkernel with the Fault Tolerance Logging role enabled
        • If using DRS, EVC mode must be enabled.

Fault Tolerance works essentially like a vMotion that never ends. It uses a technology called Fast Checkpointing to take checkpoints of the source VM every 10 milliseconds or so and send that data to the shadow VM. This data is sent using a VMkernel port with Fault Tolerance logging enabled. Behind the scenes, there are two important files: shared.vmft and .ft-generation. The first makes sure the UUID, or identifier, for the VM’s disk stays the same. The second is used in case you lose connectivity between the two VMs; that file determines which VM has the latest data, and that VM is designated the primary when both are back online.

vSphere Replication

Remote site Disaster Recovery options include vSphere Replication and Site Recovery Manager. You can use vSphere Replication or both in conjunction to replicate a site or individual VMs in case of failure or disaster. While I’m not going to delve deep into vSphere Replication or SRM, you should know their capabilities and, at a high level, how they work.

vSphere Replication is configured on a per-VM basis. Replication can happen from a primary to a secondary site or from multiple sites to a single target site. It uses a server-client model with appliances on both sides. A VMkernel adapter with the vSphere Replication and vSphere Replication NFC (Network File Copy) roles can be created to provide an isolated network for replication.

Once you have your appliances set up and have chosen which VMs you want replicated, you need to figure out what RPO to use. RPO is short for Recovery Point Objective: how often you want the VM replicated. It can be as short as every 5 minutes or as long as every 24 hours.
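That RPO range can be sketched as a small validation helper. The function is hypothetical; the 5-minute and 24-hour bounds come from the text above:

```python
from datetime import timedelta

# vSphere Replication's RPO can be set from 5 minutes up to 24 hours.
# A hypothetical helper that validates a requested RPO against that range.

RPO_MIN = timedelta(minutes=5)
RPO_MAX = timedelta(hours=24)

def validate_rpo(rpo: timedelta) -> timedelta:
    """Raise if the requested RPO falls outside the supported range."""
    if not (RPO_MIN <= rpo <= RPO_MAX):
        raise ValueError(f"RPO {rpo} is outside the supported range "
                         f"({RPO_MIN} to {RPO_MAX})")
    return rpo

print(validate_rpo(timedelta(minutes=15)))   # 0:15:00
# validate_rpo(timedelta(minutes=1))         # would raise ValueError
```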

Site Recovery Manager uses vSphere Replication but is much more complex and detailed. You can specify runbooks (recovery plans), how to bring the other side up, test your failovers, and more.

The above tools are in addition to VMware’s ability to integrate with many companies to do backups.

Objective 2.5 – Describe vSphere integration with VMware Skyline

VMware Skyline is a product available to VMware-supported customers with a current Production or Premier Support contract. What is it? A proactive support service integrated with vSphere, allowing VMware support to view your environment’s configurations and the logs needed to speed up the resolution of a problem.

Skyline does this in a couple of ways. Skyline has a Collector appliance, and Log Assist can upload log files directly to VMware (with the customer’s permission). Products supported by Skyline include vSphere, NSX for vSphere, vRealize Operations, and VMware Horizon. If you want to learn even more, visit the datasheet here.

That covers the second section. The next post is coming soon.

VMware VCP 2020 vSphere 7 Edition Exam Prep Guide

Introduction

Hello again. My 2019 VCP Study Guide was well received, so, to help the community further, I decided to embark on another exam study guide with vSphere 7. This guide is exciting for me to write due to the many new things I’ll get to learn myself, and I look forward to learning with everyone.

I am writing this guide pretty much how I talk and teach in real life with a bit of Grammarly on the back end, to make sure I don’t go completely off the rails. You may also find the formatting a little weird. This is because I plan on taking this guide and binding it in a single guide at the end of this blog series. I will try to finish a full section per blog post unless it gets too large. I don’t have a large attention span to read huge technical blogs in one sitting and find most people learn better with smaller chunks of information at a time. (I wrote this before I saw the first section.)

In these endeavors, I personally always start with the Exam Prep guide. That can be found on VMware’s website here. The official code for this exam is 2V0-21.20, and the cost of the exam is $250.00. There are a total of 70 questions with a duration of 130 minutes. The passing score, as always, is 300 on a scale of 100-500. The exam questions are presented in single-choice and multiple-choice formats. You can now take these exams online, in the comfort of your own home. A webcam is required; you need to pan your webcam at the beginning of the session, and it needs to stay on the whole time.

The exam itself focuses on the following topics:

  • Section 1 – Architecture and Technologies
  • Section 2 – Products and Solutions
  • Section 3 – Planning and Designing
  • Section 4 – Installing, Configuring, and Setup
  • Section 5 – Performance-tuning, Optimization, and Upgrades
  • Section 6 – Troubleshooting and Repairing
  • Section 7 – Administrative and Operational Tasks

Each of these topics can be found in the class materials for Install, Configure, and Manage, or Optimize and Scale classes, or supplemental papers by VMware on the web. Let’s begin with the first topic.

Section 1 – Architectures and Technologies

Objective 1.1 – Identify the pre-requisites and components for a vSphere Implementation

A vSphere implementation or deployment has two main parts: the ESXi server and vCenter Server.

ESXi Server

The first is the virtual server itself, or ESXi server. The ESXi host server is the piece of the solution that allows you to run virtual machines and other components of the solution (such as NSX kernel modules). It provides the compute, memory, and in some cases, storage resources for a company to run. There are requirements the server needs to meet to run ESXi. They are:

  • A supported hardware platform. VMware has a compatibility guide they make available here. If running a production environment, your server should be checked against that.
  • ESXi requires a minimum of two CPU cores.
  • ESXi requires the NX/XD or No Execute bit enabled for the CPU. The NX/XD setting is in the BIOS of a server.
  • ESXi requires a minimum of 4 GB of RAM. It would be best if you had more to run a lot of the workloads a business requires, however.
  • The Intel VT-x or AMD RVI setting in the BIOS must be enabled. Most of the time, this is already enabled on servers, and you won’t need to worry about it.
  • 1+ Gigabit network controller is a requirement. Using the compatibility guide above, make sure your controller is supported.
  • SCSI disk or RAID LUN. Because of their higher reliability, ESXi labels these “local” drives, and you can use them as a “scratch” volume. A scratch partition is a disk partition used by VMware to host logs, updates, or other temporary files.
  • SATA drives. You can use these, but they are labeled “remote” drives, and because of that label, you can’t use them for a scratch partition.
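The hardware minimums above lend themselves to a simple preflight checklist. This is a hedged sketch; the inventory dict format is invented for illustration and is not data from any VMware tool or API:

```python
# A hypothetical preflight check against the ESXi minimums listed above:
# 2+ CPU cores, NX/XD bit enabled, VT-x/AMD RVI enabled, 4+ GB of RAM,
# and a 1+ Gigabit network controller.

MINIMUM_CORES = 2
MINIMUM_RAM_GB = 4
MINIMUM_NIC_GBPS = 1

def preflight(server: dict) -> list:
    """Return a list of problems; an empty list means the box looks OK."""
    problems = []
    if server.get("cpu_cores", 0) < MINIMUM_CORES:
        problems.append("need at least 2 CPU cores")
    if not server.get("nx_xd_enabled", False):
        problems.append("enable NX/XD bit in the BIOS")
    if not server.get("vt_x_or_amd_rvi", False):
        problems.append("enable Intel VT-x / AMD RVI in the BIOS")
    if server.get("ram_gb", 0) < MINIMUM_RAM_GB:
        problems.append("need at least 4 GB of RAM")
    if server.get("nic_gbps", 0) < MINIMUM_NIC_GBPS:
        problems.append("need a 1+ Gigabit network controller")
    return problems

print(preflight({"cpu_cores": 16, "ram_gb": 256, "nic_gbps": 10,
                 "nx_xd_enabled": True, "vt_x_or_amd_rvi": True}))   # []
```

Of course, none of this replaces checking the server model against the VMware compatibility guide.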

vSphere 7.0 can be installed using UEFI BIOS mode or regular old BIOS mode. If using UEFI, you have a wider variety of drives to boot from. Once you use one of those modes (UEFI or Legacy) to boot, it is not advisable to change it after installation. If you do, you may be required to reinstall. The error you might receive is “Not a VMware boot bank.”

One significant change in vSphere 7.0 is system storage requirements. ESXi 7.0 system storage volumes can now occupy up to 138 GB of space. A VMFS datastore is only created if there is an additional 4 GB of space. If one of the “local” disks isn’t found, ESXi operates in a degraded state where the scratch disk is placed on a RAMDISK, that is, all in RAM. This is not persistent through reboots of the physical machine, and ESXi displays an unhappy message until you specify a location for the scratch disk.

That being said, you CAN install vSphere 7 on a USB device as small as 8 GB. You should, if at all possible, use a larger flash device. Why? ESXi uses the additional space for an expanded core dump file, and it uses the extra memory cells to prolong the life of the media. So try to use a 32 GB or larger flash device.

With the increased usage of flash media, VMware saw fit to talk about it in the install guide. In this case, it specifically called out using M.2 and other non-USB low-end flash media. There are many types of flash media available on the market with different purposes: mixed-use, high performance, and more. The use case for the drive should determine the type you buy. VMware recommends that you don’t use low-end flash media for datastores, because VMs cause a high level of wear quickly, possibly making the drives fail prematurely.

While the exam guide doesn’t call this out, I thought it would be good to show a picture of how the OS disk layout differs from the previous version of ESXi. You should know that once you upgrade the drive layout from the previous version, you can’t roll back.

vCenter Server

The ESXi host has the resources and runs the virtual machines. In anything larger than a few hosts, though, management becomes an issue. vCenter Server allows you to manage and aggregate all your server hardware and resources. But vCenter Server allows you to do so much more. Using vCenter Server, you can also keep tabs on performance and licensing, and update software. You can also do advanced tasks such as moving virtual machines around your environment. Now that you realize you MUST have one, let’s talk about what it is and what you need.

vCenter is deployed on an ESXi host, so you have to have one of those running first. It is deployed to the ESXi host using its included installer, not as you would an OVA. The appliance itself has been upgraded from previous versions. It now contains the following:

  • Photon OS 3.0 – This is the Linux variant used by VMware
  • vSphere authentication services
  • PostgreSQL (v11.0) – Database software used
  • VMware vSphere Lifecycle Manager Extension
  • VMware vSphere Lifecycle Manager

But wait… didn’t there used to be the vCenter Server and Platform Services Controller? You are correct. Going forward, for simplicity and other design reasons, VMware has combined all services into a single VM. So what services are actually on this machine now? I’m glad you asked.

  • Authentication Services – which includes
    • vCenter Single Sign-On
    • vSphere License Service
    • VMware Certificate Authority
  • PostgreSQL
  • vSphere Client – HTML5 client that replaces the previous FLEX version (thank God)
  • vSphere ESXi Dump Collector – Support tool that saves active memory of a host to a network server if the host crashes
  • vSphere Auto Deploy – Support tool that can provision ESXi hosts automagically once setup for it is completed
  • VMware vSphere Lifecycle Manager Extension – tool for patch and version management
  • VMware vCenter Lifecycle Manager – a tool to automate the process of provisioning virtual machines and removing them

Now that we have covered the components, let’s talk deployment. You can install vCenter Server using either the GUI or the CLI. If using the GUI installer, there are two stages. The first stage installs the files on the ESXi host. The second stage configures the appliance with the parameters you feed into it. The hardware requirements have changed from the previous version as well. Here is a table showing the changes in green.

Objective 1.2 – Describe vCenter Topology

Topology is a lot simpler to talk about going forward because there is now a flat topology. There are no separate vCenter Server and Platform Services Controller machines anymore; everything is consolidated into one machine. If you are running a previous version and have broken vCenter Server out into those roles, don’t despair! VMware has created tools that allow you to consolidate them back. There are a few things to add to that.

First, Enhanced Linked Mode. This is where you can log into one vCenter and manage up to 15 total vCenter instances in a single Single Sign-On domain. This is where the flat topology comes in. Enhanced Linked Mode is set up during the installation of vCenter. Once you exceed the limits of a vCenter, you install a new one and link it. There is also vCenter Server High Availability. Later on in this guide, we cover how it’s configured. For now, here is a quick overview of what it is.

vCenter High Availability is a mechanism that protects your vCenter Server against host and hardware failures. It also helps reduce downtime associated with patching your vCenter Server. It does this by using 3 VMs: two full VCSA nodes and a witness node. One VCSA node is active and one is passive. They are connected by a vCenter HA network, which is created when you set this up. This network is used to replicate data and to provide connectivity to the witness node.

For a quick look at vCenter limits compared to the previous version:

Objective 1.3 – Identify and differentiate storage access protocols for vSphere (NFS, iSCSI, SAN, etc.)

The section I wrote in the previous guide still covers this well, so I am using that.

Local Storage
Local storage is storage connected directly to the server. This includes a Direct Attached Storage (DAS) enclosure that connects to an external SAS card, or storage in the server itself. ESXi supports SCSI, IDE, SATA, USB, SAS, flash, and NVMe devices. You cannot use IDE/ATA or USB devices to store virtual machines; any of the other types can host VMs. The problem with local storage is that the server is a single point of failure, or SPOF. If the server fails, no other server can access the VMs. There is a unique configuration that allows sharing local storage, however, and that is vSAN. vSAN requires flash drives for cache and either flash or regular spinning disks for capacity. These are aggregated across servers and collected into a single datastore or drive. VMs are duplicated across servers, so if one goes down, access is still retained, and the VM can still be started and accessed.
Network Storage
Network Storage consists of dedicated enclosures that have controllers running a specialized OS. There are several types, but they share some things in common. They use a high-speed network to share the storage, and they allow multiple hosts to read and write to the storage concurrently. You connect to a single LUN through only one protocol, though you can use multiple protocols on a host for different LUNs.

Fibre Channel, or FC, is a specialized type of network storage. FC uses specific adapters that allow your server to access it, known as Fibre Channel Host Bus Adapters, or HBAs. Fibre Channel typically uses fiber-optic (glass) cables to transport its signal but occasionally uses copper. Another type of Fibre Channel can connect over a regular LAN; it is known as Fibre Channel over Ethernet, or FCoE.

iSCSI is another storage type supported by vSphere. This uses regular Ethernet to transport data. Several types of adapters are available to communicate with the storage device. You can use a hardware iSCSI adapter or a software one. If you use a hardware adapter, the server offloads the SCSI and possibly the network processing. There are dependent hardware and independent hardware adapters. The first still needs to use the ESXi host’s networking. Independent hardware adapters can offload both the iSCSI and the network processing. A software iSCSI adapter uses a standard Ethernet adapter, and all the processing takes place in the host’s CPU.

VMware also supports a newer type of adapter known as iSER, or iSCSI Extensions for RDMA. This allows ESXi to use the RDMA protocol instead of TCP/IP to transport iSCSI commands and is much faster.

Finally, vSphere also supports the NFS 3 and 4.1 protocols for file-based storage. This type of storage is presented as a share to the host instead of block-level raw disks. Here is a small table on networked storage for more leisurely perusal.

Technology                            Protocol      Transfer       Interface
Fibre Channel                         FC / SCSI     Block access   FC HBA
Fibre Channel over Ethernet (FCoE)    FCoE / SCSI   Block access   Converged Network Adapter, or NIC with FCoE support
iSCSI                                 iSCSI         Block access   iSCSI adapter (dependent or independent), or NIC (software adapter)
NAS                                   IP / NFS      File level     Network adapter

Objective 1.3.1 – Describe datastore types for vSphere

vSphere supports several different types of datastores. Some of them have features tied to particular versions, which you should know. Here are the types:

  • VMFS – VMFS can be either version 5 or 6. VMFS is the file system installed on a block storage device such as an iSCSI LUN or local storage. You cannot upgrade a datastore from VMFS 5 to 6; you have to create a new one and migrate VMs to it. On VMFS, vSphere handles all the locking of files and controls access to them. It is a clustered file system that allows more than one host to access files at a time.
  • NFS – Version 3 and 4.1 are supported. NFS is a NAS file system accessed over a TCP/IP network. You can’t access the same volume using both versions at the same time. Unlike VMFS, the NAS device controls access to the files.
  • vSAN – vSAN aggregates local storage drives on a server into a single datastore accessible by the nodes in the vSAN cluster.
  • vVol – A vVol datastore is a storage container on a block device.

Objective 1.3.2 – Explain the importance of advanced storage configuration (VASA, VAAI, etc.)

This is the first time I’ve seen this covered in an objective. I like that some of the objectives are covering more in-depth material. It’s hard to convey the importance of these technologies without describing them and what they do a bit, so I will explain what they are and then why they are essential.

  • VASA – VASA stands for vSphere APIs for Storage Awareness. VASA is extremely important because hardware storage vendors use it to inform vCenter Server about their capabilities, health, and configurations. VASA is essential for vVols, vSAN, and Storage Policies. Using Storage Policies and VASA, you can specify that VMs need a specific performance profile or configuration, such as RAID type.
  • VAAI – VAAI stands for vSphere APIs for Array Integration. There are two APIs or Application Programming Interfaces, which are:
    • Hardware Acceleration APIs – These allow arrays to offload some storage operations directly to the array. In turn, this reduces the CPU cycles needed for those tasks.
    • Array Thin Provisioning APIs – These help monitor space usage on thin-provisioned storage arrays to prevent out-of-space conditions, and perform space reclamation when data is deleted.
  • PSA – PSA stands for Pluggable Storage Architecture. These APIs allow storage vendors to create and deliver specific multipathing and load-balancing plug-ins that are best optimized for specific storage arrays.

Especially with some of the technology VMware offers (vSAN), these APIs are undoubtedly helpful for sysadmins and your infrastructure. Being able to determine health and properly apply a customer’s requirements to a VM is essential for business.

Objective 1.3.3 – Describe Storage Policies

Storage Policies are a mechanism by which you can assign storage characteristics to a specific VM. Let me explain. Say you have a critical VM, and you want to make sure it sits on a datastore that is backed up every 4 hours. Using Storage Policies, you can assign that requirement to the VM and ensure that the only datastores it can use are ones that satisfy it. Or say you need to limit a VM to a specific performance level. You can do that via Storage Policies as well. You can create policies based on the capabilities of your storage array, or you can even create them using tags. To learn even more, you can read about it in VMware’s documentation here.
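
The matching behavior described above can be sketched in a few lines. This is purely illustrative (not a VMware API); the datastore names and capability tags are made up, but it shows how Storage Policy Based Management surfaces only the datastores whose capabilities satisfy the policy:

```python
# Illustrative sketch, NOT a real VMware API: filter datastores by the
# capabilities a storage policy requires, the way SPBM shows only
# "compatible" datastores when you deploy a VM.
def compatible_datastores(datastores, required_capabilities):
    """Return names of datastores advertising every required capability."""
    return [name for name, caps in datastores.items()
            if required_capabilities.issubset(caps)]

# Hypothetical datastores and the capabilities their arrays advertise (via VASA):
datastores = {
    "gold-ds":   {"raid-1", "backup-4h", "flash"},
    "silver-ds": {"raid-5", "backup-24h", "flash"},
    "bronze-ds": {"raid-5", "backup-24h"},
}

policy = {"backup-4h", "flash"}   # hypothetical policy rules for a critical VM
print(compatible_datastores(datastores, policy))  # -> ['gold-ds']
```

Only the datastore that satisfies every rule is offered, which is exactly the guarantee a Storage Policy gives the VM.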

Objective 1.3.4 – Describe basic storage concepts in K8s, vSAN, and vSphere Virtual Volumes (vVols)

K8s
I couldn’t find this in the materials listed, so I went hunting. For anyone wanting to read more about it, I found the info HERE.

vSphere with Kubernetes supports three types of storage.

  • Ephemeral virtual disks – As the name signifies, this storage is temporary. This type of virtual disk stores objects such as logs or other transient data. Once the pod ceases to exist, so does the disk, although the disk does persist across restarts of the pod. Each pod has only one such disk.
  • Container Image virtual disks – This disk contains the software that is to be run. When the pod is deleted, the virtual disks are detached.
  • Persistent volume virtual disks – Certain K8s workloads require persistent storage to save data independent of the pod. Persistent volume objects are backed by First Class Disks (also called Improved Virtual Disks). A First Class Disk is identified by a UUID, which remains valid even if the disk is relocated or snapshotted.

vSAN
vSAN is hyperconverged, software-defined storage that takes the local storage drives on all nodes and aggregates them into a single datastore. This datastore is usable by all machines in the vSAN cluster.

A minimum of 3 hosts is required to be part of a vSphere cluster with vSAN enabled. Each ESXi host has a minimum of 1 flash cache disk and 1 spinning or flash capacity disk. A maximum of 7 capacity disks can be in a single disk group, and up to 5 disk groups can exist per host.

vSAN is object-based, uses a proprietary VMware protocol to communicate over the network, and uses policies to enable features needed by VMs. You can use policies to require multiple copies of data, performance throttling, or striping.

vVols
vVols shakes storage up a bit. How so? Typically you would carve storage out into LUNs, and then you would create datastores on them. The storage administrator would be drawn into architectural meetings with the virtualization administrators to decide on storage schemas and layouts. This had to be done in advance, and it was difficult to change later if something different was needed.

Another problem was that characteristics such as speed or functionality were controlled at the datastore level. Multiple VMs are stored on the same datastore, and if they required different things, it was challenging to meet their needs. vVols helps change that. It improves granular control, allowing you to cater storage functionality to the needs of individual VMs.

vVols map virtual disks and different pieces, such as clones, snapshots, and replicas, directly to objects (virtual volumes) on a storage array. Doing this allows vSphere to offload tasks such as cloning, and snapshots to the storage array, freeing up resources on the host. Because you are creating individual volumes for each virtual disk, you can apply policies at a much more granular level—controlling aspects such as performance better.

vVols creates a minimum of three virtual volumes, the data-vVol (virtual disk), config-vVol (config, log, and descriptor files), and swap-vVol (swap file created for VM memory pages). It may create more if there are other features used, such as snapshots or read-cache.

To use vVols, you start by creating a Storage Container on the storage array. The storage container is a pool of raw storage the array makes available to vSphere. Then you register the storage provider with vSphere. Next, you create datastores in vCenter and create storage policies for them. Finally, you deploy VMs to the vVols, and they send data by way of Protocol Endpoints. The best picture I’ve seen I’m going to lift and use here from the Fast Track v7 course by VMware.

Objective 1.4 – Differentiate between vSphere Network I/O Control (NIOC) and vSphere Storage I/O Control (SIOC)

NIOC = Network I/O Control
SIOC = Storage I/O Control

Network I/O Control allows you to determine and shape bandwidth for your vSphere networks. It works in conjunction with Network Resource Pools to let you allocate bandwidth to specific types of traffic. You enable NIOC on a vSphere Distributed Switch and then set shares according to needs in the configuration of the VDS. This is a feature requiring Enterprise Plus licensing. Here is what it looks like in the UI.

Storage I/O Control allows cluster-wide storage I/O prioritization. You can control the amount of storage I/O allocated to virtual machines so that critical VMs get preference over less critical ones. This is accomplished by enabling SIOC on the datastore and setting shares and an upper IOPS limit per VM. SIOC is enabled by default on SDRS clusters. Here is what the screen looks like to enable it.

Objective 1.5 – Describe instant clone architecture and use cases

Instant Clone technology is not new. It was around in the vSphere 6.0 days but was originally called VMFork. But what is it? It allows you to create powered-on virtual machines from the running state of another. How? The source VM is stunned for a short period. During this time, a new delta disk is created for each virtual disk, and a checkpoint is created and transferred to the destination virtual machine. Everything is identical to the original VM. So identical, in fact, that you need to customize the virtual hardware to prevent MAC address conflicts, and you must manually customize the guest OS. Instant clones are created using API calls.

Going a little further in-depth, using William Lam’s and Duncan Epping’s blog posts here and here, we learn that as of vSphere 6.7, we can use vMotion, DRS, and other features on these instant clones. Transparent Page Sharing is used between the source and destination VMs. There are two ways instant clones are created. One is the Running Source VM Workflow, where a delta disk is created on the source VM for each destination VM. This workflow can cause issues as more clones are created, due to the number of delta disks accumulating on the source VM. The second is the Frozen Source VM Workflow. This workflow uses a single delta disk on the source VM and a single delta disk on each destination VM, which is much more efficient. If you visit their blogs linked above, you can see diagrams depicting the two workflows.
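
The difference between the two workflows comes down to how many delta disks pile up on the source VM. A toy sketch (the numbers are illustrative, not from VMware):

```python
# Toy comparison of the two instant-clone workflows described above:
# how many delta disks accumulate on the SOURCE VM as clones are created.
def source_deltas(workflow, num_clones):
    if workflow == "running":   # Running Source VM Workflow: one delta per clone
        return num_clones
    if workflow == "frozen":    # Frozen Source VM Workflow: a single shared delta
        return 1
    raise ValueError(workflow)

print(source_deltas("running", 50))  # -> 50 deltas dragging down the source VM
print(source_deltas("frozen", 50))   # -> 1, which is why this workflow scales better
```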

Use cases (per Duncan) are VDI, Container hosts, Hadoop workers, Dev/Test, and DevOps.

Objective 1.6 – Describe Cluster Concepts

A vSphere cluster is a group of ESXi host machines. When grouped, vSphere aggregates the resources of each host and treats them as a single pool. There are several features and capabilities that are only available with clusters.

Objective 1.6.1 – Describe Distributed Resource Scheduler

vSphere’s Distributed Resource Scheduler is a tool used to keep VMs running smoothly. It does this, at a high level, by monitoring the VMs and migrating them to the hosts that allow them to run best. In vSphere 6.x, DRS ran every 5 minutes and concentrated on making sure the hosts were happy and had plenty of free resources. In vSphere 7, DRS runs every 60 seconds and is much more concentrated on VMs and their “happiness.” DRS scores each VM and, based on that, migrates it or makes recommendations, depending on what DRS is set to do. We go a bit more in-depth in Objective 1.6.3.

Objective 1.6.2 – Describe vSphere Enhanced vMotion Compatibility (EVC)

EVC, or Enhanced vMotion Compatibility, allows you to combine hosts with different processor generations, and their resources, in a single cluster. Different generations of processors have different feature sets and options. EVC masks the newer features so there is a level feature set across the cluster. Setting EVC means you might not receive all the benefits of newer processors. Why? Many newer processors are more efficient and therefore run at lower clock speeds; if you mask off their newer feature sets (in some cases, the reason they are faster), you are left with just the lower clock speeds. Starting with vSphere 6.7, you can enable EVC on a per-VM basis, allowing migration to different clusters or across clouds. EVC becomes part of the VM itself. To enable per-VM EVC, the VM must be powered off. If cloned, the VM retains the EVC attributes.
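
Conceptually, the EVC baseline is just the intersection of the hosts’ CPU feature sets; every VM sees only the features all hosts share. A minimal sketch (host names and feature flags are made up for illustration):

```python
# EVC in miniature: the cluster baseline is the intersection of host CPU
# feature sets, so VMs can vMotion to any host without losing a feature.
hosts = {
    "esx01": {"sse4_2", "aes", "avx"},                     # oldest generation
    "esx02": {"sse4_2", "aes", "avx", "avx2"},
    "esx03": {"sse4_2", "aes", "avx", "avx2", "avx512"},   # newest generation
}

baseline = set.intersection(*hosts.values())
print(sorted(baseline))  # -> ['aes', 'avx', 'sse4_2']; avx2/avx512 are masked
```

This also shows the trade-off mentioned above: the newest host’s extra instructions (avx2, avx512 here) are hidden from every VM in the cluster.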

Objective 1.6.3 – Describe how Distributed Resource Scheduler (DRS) scores virtual machines

VM “happiness” is the concept that a VM has an ideal, or best-case, throughput for each resource, and an actual throughput. If there is no contention or competition on a host for a resource, the two should match, which makes the VM’s “happiness” 100%. DRS looks at the hosts in the cluster to determine if another host can provide a better score for the VM; if so, it migrates the VM or recommends a migration. Several costs are weighed to see if the move makes sense: CPU costs, memory costs, networking costs, and even migration costs. A lower score does not necessarily mean that the VM is running poorly. Why? Some of the costs taken into account include whether the host can accommodate a burst in that resource. The actual equations (thanks, Niels Hagoort):

  • Goodness (actual throughput) = Demand (ideal throughput) – Cost (loss of throughput)
  • Efficiency = Goodness (actual throughput) / Demand (ideal throughput)
  • Total efficiency = EfficiencyCPU * EfficiencyMemory * EfficiencyNetwork
  • Total efficiency on host = VM DRS score

Keep in mind that the score is not indicative of a health score but an indicator of resource contention. A higher number indicates less resource contention, and the VM is receiving the resources it needs to perform.
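The equations above can be turned into a quick sketch. This is a simplified illustration of the published formula only; real DRS derives demand and cost from internal metrics that aren’t exposed, and the numbers here are made up:

```python
# Simplified sketch of the DRS scoring math above:
#   goodness   = demand - cost          (actual throughput)
#   efficiency = goodness / demand
#   VM score   = product of per-resource efficiencies
def efficiency(demand, cost):
    return (demand - cost) / demand

def vm_drs_score(cpu, mem, net):
    """Each argument is an illustrative (demand, cost) pair for that resource."""
    total = efficiency(*cpu) * efficiency(*mem) * efficiency(*net)
    return round(total * 100)  # expressed as a percentage

# A VM losing 20% of CPU throughput to contention, 10% memory, no network loss:
print(vm_drs_score(cpu=(100, 20), mem=(100, 10), net=(100, 0)))  # -> 72
```

Note how the multiplicative form means contention on any single resource drags the whole score down, which matches the “higher number = less contention” reading in the text.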

Objective 1.6.4 – Describe vSphere High Availability

vSphere HA, or High Availability, is a feature designed for VM resilience. Hosts and VMs are monitored, and in the event of a failure, VMs restart on another host.

There are several configuration options to configure. Most defaults work well unless you have a specific use case. Let’s go through them:

  • Proactive HA – This feature receives messages from a provider like Dell’s OpenManage Integration plug-in and, based on those messages, migrates VMs to a different host ahead of the impending doom of the original host. It can make recommendations in Manual mode or act automatically. After all VMs are off the host, you choose how to remediate the sick host. You can place it in Maintenance mode, which prevents it from running any workloads, or in Quarantine mode, which allows it to run some workloads if performance is affected. Or a mix of the two with… Mixed mode.
  • Failure Conditions and responses – This is a list of possible host failure scenarios and how you want vSphere to respond to them. This is expanded and gives you far more control than in the past.
  • Admission Control – What good is a feature that restarts VMs if you don’t have enough resources to do so? Not very. Admission Control is the gatekeeper that makes sure you have enough resources to restart your VMs in the case of a host failure. You can ensure this in a few ways: dedicated failover hosts, cluster resource percentage, slot policy, or you can disable it. Dedicated failover hosts are like a dedicated hot spare in a RAID: they do no work and run no VMs until there is a host failure. This is the most expensive option (other than failure itself). Slot Policy takes the largest VM’s CPU and the largest VM’s memory (which can come from two different VMs) and makes that into a “slot.” It then determines how many slots your cluster can satisfy and how many hosts can fail while keeping all VMs powered on. Cluster Resource Percentage looks at the total resources needed and total available and tries to keep enough free to tolerate the number of host failures you specify. You can also override this and reserve a specific percentage. For any of these policies, if the cluster can’t satisfy the needed VMs, it prevents new VMs from powering on.
  • Datastore for Heartbeating – This monitors hosts and VMs when the HA network has failed. Using datastore heartbeats, HA can determine whether the host, or a VM, is still running by looking at the lock files. vSphere automatically tries to select at least 2 datastores connected to all the hosts. You can specify more, or specific datastores to use.
  • Advanced Options – This is where you set advanced options for the HA cluster. One such setting might be a second gateway to determine host isolation. To enable this, you need to set two options: 1) das.usedefaultisolationaddress and 2) das.isolationaddress[…] The first specifies whether to use the default gateway, and the second sets additional addresses.
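
The Slot Policy math described under Admission Control can be sketched as follows. This is a simplified model with made-up reservation numbers; real HA also accounts for overhead and uses defaults when VMs have no reservations:

```python
# Simplified Slot Policy sketch: slot size = largest CPU reservation x
# largest memory reservation across powered-on VMs (possibly from two
# different VMs). The cluster passes admission control if the surviving
# hosts still provide enough slots for every VM.
def slot_policy(vms, hosts, tolerated_failures=1):
    slot_cpu = max(cpu for cpu, _ in vms)    # MHz
    slot_mem = max(mem for _, mem in vms)    # MB
    # Slots each host can hold, limited by whichever resource runs out first:
    slots_per_host = [min(h_cpu // slot_cpu, h_mem // slot_mem)
                      for h_cpu, h_mem in hosts]
    # Worst case: assume the hosts with the MOST slots are the ones that fail.
    surviving = sorted(slots_per_host)[:len(hosts) - tolerated_failures]
    return sum(surviving) >= len(vms)   # can all VMs still restart?

vms = [(500, 1024), (1000, 2048), (250, 4096)]         # (CPU MHz, mem MB) reservations
hosts = [(8000, 32768), (8000, 32768), (8000, 32768)]  # illustrative host capacities
print(slot_policy(vms, hosts))  # True: slot = 1000 MHz x 4096 MB, plenty of slots
```

This also makes the policy’s known quirk visible: one VM with a huge reservation inflates the slot size for everyone and shrinks the cluster’s apparent capacity.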

Objective 1.7 – Identify vSphere distributed switch and vSphere standard switch capabilities

VDS and VSS are networking objects in vSphere. VDS stands for vSphere Distributed Switch, and VSS for vSphere Standard Switch.

The Virtual Standard Switch is the default switch. It is what the installer creates when you deploy ESXi. It has only a few features and requires you to configure a switch on every host manually. As you can imagine, this is tedious, and it is difficult to configure each one identically, which is what you need for VMs to move across hosts seamlessly. (You could create a host profile template to make sure they are the same.)

Standard Switches create a link between physical NICs and virtual NICs. You can name them essentially whatever you want, and you can assign VLAN IDs. You can shape traffic but only outbound. Here is a picture I lifted from the official documentation for a pictorial representation of a VSS.

VDSs, on the other hand, add a management plane to your networking. Why is this important? It allows you to control all host networking through one UI. Distributed switches require vCenter and Enterprise Plus licensing, unless you buy vSAN licensing, which also entitles you to them. Essentially, you are still adding a switch to every host, just a fancier one that can do more things and that you only have to change once to change all hosts.

There are different versions of VDS you can create, based on the vSphere version in which they were introduced. Each newer version adds features; a higher version retains all the features of the lower one and adds to them. Some features include Network I/O Control (NIOC), which allows you to shape your bandwidth both incoming and outgoing. VDS also includes a rollback ability, so if a change causes a loss of connectivity, the change is reverted automatically.

Here is a screenshot of me making a new VDS and some of the features that each version adds:

Here is a small table showing the differences between the switches.

| Feature | vSphere Standard Switch | vSphere Distributed Switch |
| --- | --- | --- |
| VLAN Segmentation | Yes | Yes |
| 802.1q tagging | Yes | Yes |
| NIC Teaming | Yes | Yes |
| Outbound traffic shaping | Yes | Yes |
| Inbound traffic shaping | No | Yes |
| VM port blocking | No | Yes |
| Private VLANs | No | Yes (3 types – Promiscuous, Community, Isolated) |
| Load Based Teaming | No | Yes |
| Network vMotion | No | Yes |
| NetFlow | No | Yes |
| Port Mirroring | No | Yes |
| LACP support | No | Yes |
| Backup and restore network configuration | No | Yes |
| Link Layer Discovery Protocol | No | Yes |
| NIOC | No | Yes |

Objective 1.7.1 – Describe VMkernel Networking

VMkernel adapters are set up on the host, for the host itself to interact with the network. Your management and other functions of the host are taken care of by VMkernel adapters. The roles specifically are:

  • Management traffic – Selecting this checkbox means the adapter carries configuration and management communication for the host, vCenter Server, and HA traffic. When ESXi is first installed, a VMkernel adapter is created with Management selected. You should have more than one VMkernel adapter carrying management traffic for redundancy.
  • vMotion traffic – Selecting this enables you to migrate VMs from one host to another. Both hosts must have vMotion enabled. You can use multiple physical NICs for faster migrations. Be aware that vMotion traffic is not encrypted – separate this network for greater security.
  • Provisioning traffic – This is used for you to separate VM cold migrations, cloning, and snapshot migration. A use case could be VDI for this, or just using a slower network to keep live vMotions separated and not slowed by migrations that don’t need the performance.
  • IP Storage and discovery – This is not a selection box when you create a VMkernel adapter, but it is still an important role. It allows you to connect to iSCSI and NFS storage. You can use multiple physical NICs and “bind” each to a single VMkernel adapter. This enables multipathing for additional throughput and redundancy.
  • Fault Tolerance traffic – One of the features you can enable, Fault Tolerance, allows you to create a second mirror copy of a VM. To keep both machines precisely the same requires a lot of network traffic. This role must be enabled and is used for that traffic.
  • vSphere Replication traffic – As it sounds like, this role handles the replication traffic sent to a vSphere Replication server.
  • vSAN traffic – If you have a vSAN cluster, every host that participates must have a vSAN VMkernel adapter to carry and separate the large amount of traffic vSAN needs. Movement and retrieval of objects requires a large amount of network bandwidth, so it is best to have this on as fast a connection as you can. vSAN does support multiple VMkernel adapters, but not on the same subnet.

Objective 1.7.2 – Manage networking on multiple hosts with vSphere distributed switch

You should have a decent idea now of what a vSphere distributed switch is and what it can do. The next part is to show you what the pieces are and describe how to use them.

First, you need to create the vSphere distributed switch. Go to the networking tab by clicking on the globe in the HTML5 client. Then right-click on the datacenter and select Distributed Switch > New Distributed Switch

You must now give the switch a name – you should make it descriptive, so it’s easy to know what it does

Choose the version corresponding to the features you want to use.

You need to tell VMware how many uplinks per host you want to use. This is the number of physical NICs that are used by this switch. Also, select if you want to enable Network I/O Control and if you want vSphere to create a default port group for you – if so, give it a name.

Finish the wizard.

You can now look at a quick topology of the switch by clicking on the switch, then Configure and Topology.

After creating the vSphere distributed switch, hosts must be associated with it to use it. To do that, you can right-click on the vSphere distributed switch and click on Add and Manage Hosts.

You now have a screen that has the following options: Add Hosts, Manage host networking, and Remove hosts.

Since your switch is new, you need to Add hosts. Select that and on the next screen, click on New Hosts.

Select the hosts that you want to be attached to this switch and click OK and then Next again.

Now assign the physical NICs to an uplink and click Next

You can now move any VMkernel adapters over to this vSphere distributed switch if desired.

Same with VM networking

You then complete the wizard. And of course, you noticed you can make changes to all the hosts during the same process. This is one part of what makes vSphere distributed switches great.

Objective 1.7.3 – Describe Networking Policies

Networking policies are rules for how you want your virtual switches, both standard and distributed, to work. Several policies can be configured on your switches. They apply at the switch level; if needed, however, you CAN override them at the port group level. Here is a bit of information on them:

Virtual Standard Switch Policies:

vSphere Distributed Switch Policies:

  • Traffic Shaping – This is different depending on which switch you are using. Standard switches can only do Egress (outgoing), and vSphere distributed switches can do ingress as well. You can establish an average bandwidth over time, peak bandwidth in bursts, and burst size.
  • Teaming and Failover – This setting enables you to use more than one physical NIC to create a team. You then select load balancing algorithms and what should happen in the case of a NIC failure
  • Security – Most home labbers know this setting from needing to enable Promiscuous Mode to allow nested VMs to talk externally. Promiscuous Mode controls whether a VM can see all frames passing through the virtual switch, not just those addressed to it. MAC Address Changes either rejects or allows MAC addresses different from the one assigned to the VM. Forged Transmits drops outbound frames from a VM with a MAC address different from the one specified for the VM in the .vmx configuration file.
  • VLAN – enables you to specify a VLAN type (VLAN, VLAN trunking, or Private VLAN) and assign a value.
  • Monitoring – Using this, you can turn on NetFlow monitoring.
  • Traffic Filtering and marking – This policy lets you protect the network from unwanted traffic and apply tags to delineate types of traffic.
  • Port Blocking – This allows you to block ports from sending or receiving data selectively.
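
The traffic-shaping parameters above (average bandwidth, peak bandwidth, burst size) behave like a classic token bucket: credit accrues at the average rate up to the burst size, so short bursts above the average are allowed until the credit drains. A minimal sketch with illustrative numbers (not vSphere’s actual implementation):

```python
# Minimal token-bucket sketch of the traffic-shaping policy described above.
# Tokens accrue at the average rate up to the burst size; a send succeeds
# only while enough tokens remain. Units are simplified to KB for clarity.
class Shaper:
    def __init__(self, avg_rate_kb, burst_kb):
        self.rate = avg_rate_kb       # tokens (KB) replenished per second
        self.burst = burst_kb         # bucket capacity = burst size
        self.tokens = burst_kb        # start with a full bucket

    def tick(self, seconds=1):
        """Replenish tokens at the average rate, capped at burst size."""
        self.tokens = min(self.burst, self.tokens + self.rate * seconds)

    def send(self, kb):
        """Transmit if tokens allow; otherwise the shaper drops/queues it."""
        if kb <= self.tokens:
            self.tokens -= kb
            return True
        return False

s = Shaper(avg_rate_kb=100, burst_kb=300)
print(s.send(300))  # True  - a burst up to the burst size is allowed
print(s.send(50))   # False - the bucket is drained
s.tick(1)
print(s.send(50))   # True  - tokens replenished at the average rate
```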

Objective 1.7.4 – Manage Network I/O Control on a vSphere distributed switch

One of the features that you can take advantage of on a vSphere distributed switch is NIOC or Network I/O Control. Why is this important? Using NIOC, you control your network traffic. You set shares or priorities to specific types of traffic, and you can also set reservations and hard limits. To get to it, select the vSphere distributed switch and then in the center pane, Configure, then Resource Allocation. Here is a picture of NIOC:

If you edit one of the data types, this is the box for that.

There are several settings to go through here. Let’s discuss them.

  • Shares – This is the weight you associate with the type of network traffic when there is congestion. You can assign Low, Normal, High, or Custom. Low = 25, Normal = 50, High = 100 shares. Custom can be any number you want it to be from 1-100. Shares do not equal percentage; in other words, the total doesn’t add up to 100%. If you have one with Normal shares of 50 and another with 100, the one with 100 will receive twice as much bandwidth as the one with 50. Again this only comes into play when there is network congestion.
  • Reservation – This is a guaranteed amount of bandwidth vSphere makes available to this type of traffic. If not needed, this bandwidth becomes available to other types of system traffic (not VM traffic). A maximum of 75% of the total bandwidth can be reserved.
  • Limit – The maximum bandwidth allowed for that type of traffic. If the system has plenty of extra, it still won’t allow a limit to be exceeded.
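
The share math above is simple proportional division: during congestion, each traffic type gets its shares divided by the total active shares. A quick sketch with illustrative numbers:

```python
# Sketch of how NIOC shares divide a link during congestion: each traffic
# type gets (its shares / total active shares) of the available bandwidth.
# Link speed and traffic types here are illustrative.
def nioc_split(link_mbps, shares):
    total = sum(shares.values())
    return {traffic: link_mbps * s / total for traffic, s in shares.items()}

# Normal (50) vs High (100) on a 10 Gb link: High gets twice Normal's share.
split = nioc_split(10000, {"vMotion": 50, "vSAN": 100, "VM": 50})
print(split)  # {'vMotion': 2500.0, 'vSAN': 5000.0, 'VM': 2500.0}
```

Note the totals do not need to add up to 100; only the ratios matter, exactly as the Shares bullet describes.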

You can also set up a custom type of traffic with the Network Resource Pool.

Objective 1.8 – Describe vSphere Lifecycle Manager concepts (baselines, cluster images, etc.)

Managing a large number of servers gets difficult and cumbersome quickly. In previous versions of vSphere, there was a tool called VUM, or vSphere Update Manager. VUM could do a limited number of things for us: upgrade and patch hosts, install and update third-party software on hosts, and upgrade virtual machine hardware and VMware Tools. This was useful but left out a few important things, such as hardware firmware updates and maintaining a baseline image for cluster hosts. Well, fret no more! Starting with vSphere 7, a new tool called Lifecycle Manager was introduced. Here are some of the things you can do:

  • Check hardware of hosts against the compatibility guide, and vSAN Hardware Compatibility List
  • Install a single ESXi image on all hosts in a cluster
  • Update the firmware of all ESXi in a cluster
  • Update and Upgrade all ESXi hosts in a cluster together

Just as with VUM, you can download updates and patches from the internet, or you can manually download them for dark sites. Keep in mind to use some of these features, you need to be using vSphere 7 on your hosts. Here is a primer just for those that are new to this or those needing a refresh.

Baseline – this is a group of patches, extensions, or an upgrade. There are 3 default baselines in Lifecycle Manager: Host Security Patches, Critical Host Patches, and Non-Critical Host Patches. You cannot edit or delete these. You can create your own.

Baseline Group – is a collection of non-conflicting baselines. For example, you can combine Host Security Patches, Critical Host Patches, and Non-Critical Host Patches into a single Baseline Group. You then attach this to an inventory object, such as a cluster or a host. You can then check the object for compliance. If it isn’t in compliance, remediation installs the updates. If the host can’t be rebooted, staging the software to it first loads the software and waits to install until a time of your choosing.
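The compliance check described above boils down to simple set logic: a host is compliant when it has everything in every attached baseline. An illustrative sketch (the patch IDs and baseline names are made up, and this is not the Lifecycle Manager API):

```python
# Illustrative compliance check in the spirit of baselines/baseline groups:
# a host is compliant when it carries every patch in every attached baseline.
def is_compliant(host_patches, baseline_group):
    required = set().union(*baseline_group.values())  # merge all baselines
    return required.issubset(host_patches)

# Hypothetical baseline group made of two of the default baselines:
group = {
    "Critical Host Patches": {"ESXi70U1-001", "ESXi70U1-002"},
    "Host Security Patches": {"ESXi70-SEC-003"},
}

# This host is missing the security patch, so remediation would install it:
print(is_compliant({"ESXi70U1-001", "ESXi70U1-002"}, group))  # -> False
```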

In vSphere 7, there are now Cluster baseline images. You set up an image and use that as the baseline for all ESXi 7.0 hosts in a cluster. Here is what that looks like:

In the image, you can see you load an image of ESXi (the .zip file, not ISO), and you can add a vendor add-on and firmware and drivers. Components allow you to load individual VIBs (VMware Installation Bundles) for hardware or features.

From the above, you can deduce that the new Lifecycle Manager will be a great help in managing the host’s software and hardware.

Objective 1.9 – Describe the basics of vSAN as primary storage

vSAN is VMware’s in-kernel, software-defined storage solution that takes local storage and aggregates it into a single distributed datastore to be used by cluster nodes. vSAN requires a cluster and hardware that has been approved and is on the vSAN hardware compatibility guide. vSAN is object-based, and when you provision a VM, its pieces are broken down into specific objects. They are:

  • VM Home namespace – stores configuration files such as the .vmx file.
  • VMDK – virtual disk
  • VM Swap – this is the swap file created when the VM is powered on
  • VM memory – this is the VM’s memory state if the VM is suspended or has snapshots taken with preserve memory option
  • Snapshot Delta – Created if a snapshot is taken

VMs are assigned storage policies that are rules applied to the VM. Policies can be availability, performance, or other storage characteristics that need to be assigned to the VM.

A vSAN cluster can be a “Hybrid” or “All-Flash” cluster. A hybrid cluster is made up of flash drives and rotational disks, whereas an all-flash cluster consists of just flash drives. Each host, or node, contributes at least one disk group to storage. Each disk group consists of 1 flash cache drive, and 1-7 capacity drives, rotational or flash. A total of 5 disk groups can reside on a node for a total of 40 disks. The cache disk on a hybrid cluster is used for read caching and write buffering (70% read, 30% write.) On an all-flash cluster, the cache disk is just for write buffering (up to 600GB.)
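
The per-host disk math above is worth double-checking: 5 disk groups, each with 1 cache disk and up to 7 capacity disks.

```python
# Quick check of the disk-group limits described above: up to 5 disk groups
# per host, each with 1 cache disk and up to 7 capacity disks.
MAX_DISK_GROUPS = 5
CACHE_PER_GROUP = 1
MAX_CAPACITY_PER_GROUP = 7

max_capacity_per_host = MAX_DISK_GROUPS * MAX_CAPACITY_PER_GROUP          # 35
max_disks_per_host = MAX_DISK_GROUPS * (CACHE_PER_GROUP + MAX_CAPACITY_PER_GROUP)
print(max_disks_per_host)  # -> 40, matching the "total of 40 disks" figure
```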

vSAN clusters are limited by the vSphere maximum of 64 nodes per cluster but typically use a max of 32. You can scale up, out, or back, and vSAN supports RAID 1, 5, and 6. Different VMs can have different policies and different storage characteristics while using the same datastore.

Objective 1.9.1 – Identify basic vSAN requirements (networking, disk count, type)

We went over a few of them above but let’s list vSAN’s requirements entirely.

  • 1 flash drive for cache per disk group – can be SAS, SATA, or PCIe
  • 1-7 drives per disk group – can be SAS, SATA, or PCIe flash
  • 1 GbE NIC for hybrid or 10 GbE+ for all-flash clusters, with a VMkernel port tagged for vSAN
  • SAS / SATA / NVMe Controller – must be able to work in pass-thru or Raid 0 mode (per disk) to allow vSAN to control it
  • IPv4 or IPv6 and supports Unicast
  • Minimum of 32 GB RAM per host to accommodate a maximum of 5 disk groups and 7 disks per disk group.

Although you typically need a minimum of 3 nodes for a vSAN cluster, 4 is better for N+1 and to take maintenance into account. 2-node clusters also exist for smaller Remote Office/Branch Office (ROBO) installations.

Objective 1.10 – Describe vSphere Trust Authority Architecture

Starting with vSphere 6.7, VMware introduced support for Trusted Platform Module (TPM) 2.0 and the host attestation model. A TPM is a little device installed in a server that serves as a cryptographic processor and can generate keys. It can also store material such as keys, certificates, and signatures. TPMs are tied to specific hardware (hence the security part), so you can’t buy a used one off eBay to install in your server. The final feature of TPMs is the one we use here: determining whether a system’s integrity is intact. It does this through an act called attestation. Using UEFI and the TPM, it can determine whether a server booted with authentic software.

Well, that’s all great, but vSphere 6.7 was view-only; there were no penalties or repercussions if the software wasn’t authentic. What’s changed?

Now, introduced in vSphere 7, we have vSphere Trust Authority. This reminds me of Microsoft’s version of this concept, Hyper-V shielded VMs. There, you create a hyper-secure cluster running the Host Guardian Service, and then you have one or more guarded hosts and shielded VMs. vSphere Trust Authority is essentially the same concept.

You create a vSphere Trust Authority cluster. The better way is to use a completely separate cluster apart from your regular hosts, but to get started, it can be an existing management cluster. These hosts won’t run any normal workload VMs, so they can be small machines. Once established, the Trust Authority has two tasks to perform:

  • Distribution of encryption keys from the KMS (taking over this task for the vCenter server)
  • Attestation of other hosts

If a host fails attestation now, the vTA withholds keys from it, preventing secure VMs from running on that host until it passes attestation. Thanks to Bob Plankers’ blog here for explaining it.

Objective 1.11 – Explain Software Guard Extensions (SGX)

Intel’s Software Guard Extensions, or SGX, were created to meet the needs of the trusted-computing industry. How so? SGX is a security extension on some modern Intel CPUs. It allows software to create private memory regions called enclaves. The data in an enclave can be accessed only by the intended program and is isolated from everything else. Typically this is used for blockchain and secure remote computing.

vSphere 7 now has a feature called vSGX or virtual SGX. This feature allows the VMs to access Intel’s technology if it’s available. You can enable it for a VM through the HTML5 web client. For obvious reasons (can’t access the memory), you can’t use this feature with some of vSphere’s other features such as vMotion, suspend and resume, or snapshots (unless you don’t snapshot memory).

That ends the first section. Next up, we will go over VMware Products and Solutions, which is a lot lighter than this one was. Seriously my fingers hurt.

Can you upgrade and Upsize your VCSA?

While brainstorming about one of our labs, the question was raised whether you can upsize your VCSA while upgrading to a newer version. Specifically, from 6.5 U2 to 6.7 U1 (build 8815520 to 11726888). We wanted to upgrade to the latest version, but we also believed we had outgrown the original VCSA size that we deployed. VMware has made this really simple. I did a quick test in my home lab, and that is what this post is based on.

To start, obtain the VCSA .iso you are upgrading to. After downloading it, mount it and, just as with a normal install or upgrade, run the appropriate installer. For me that is the Windows one, located in the \vcsa-ui-installer\win32 directory. Running installer.exe launches the following window:

We chose the Upgrade icon here. The next screen explains that this is a two-stage process:

  1. A new appliance is deployed; this will become your new vCenter.
  2. All of your current data and configuration are copied from the old VCSA to the new one.

After the copy process is complete, the installer powers off the old VCSA but does not delete it. Move to the next screen and accept the License Agreement. The third screen looks like this:

Here you enter the details of the source VCSA you are migrating from. Once you click Connect To Source, it asks for more information, specifically what your source VCSA is hosted on. This could be a single host, or it could be another vCenter.

You will be asked to accept the SSL certificates. The next screen asks where you are going to place the new appliance; this can be either a host or a vCenter instance.

Step 5 is setting up the target appliance VM, the new VCSA you will be deploying: specifically, what you want to name it and what the root password is.

Step 6 is where we can change the size of the deployment. My previous deployment was a Tiny, and I decided that was too small, so this time I went one step up to the Small size. The deployment requirements are listed in a table below.
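For reference, the sizing table in the installer maps each size to resource requirements and inventory limits. The figures below are my recollection of the VCSA 6.7 sizing table; treat them as approximate and verify against the installer, which shows the authoritative numbers for your build. The helper function just picks the smallest size that covers an inventory.

```python
# Approximate VCSA 6.7 deployment sizes: vCPU, memory (GB), and the
# maximum managed hosts/VMs. Verify against the installer's own table.
VCSA_SIZES = {
    "tiny":   {"vcpu": 2,  "mem_gb": 10, "hosts": 10,   "vms": 100},
    "small":  {"vcpu": 4,  "mem_gb": 16, "hosts": 100,  "vms": 1000},
    "medium": {"vcpu": 8,  "mem_gb": 24, "hosts": 400,  "vms": 4000},
    "large":  {"vcpu": 16, "mem_gb": 32, "hosts": 1000, "vms": 10000},
    "xlarge": {"vcpu": 24, "mem_gb": 48, "hosts": 2000, "vms": 35000},
}


def pick_size(hosts, vms):
    """Return the smallest deployment size whose limits cover the inventory."""
    for name, spec in VCSA_SIZES.items():  # dict preserves smallest-first order
        if hosts <= spec["hosts"] and vms <= spec["vms"]:
            return name
    raise ValueError("inventory exceeds the largest VCSA size")
```

In my case a handful of hosts would still fit in Tiny; the move to Small was about headroom rather than hard limits.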

The next step is configuring your network settings.

The last screen of this stage just confirms all your settings. The installer then deploys the appliance (during which you grab a nice glass of scotch and wait… preferably something nice like my Macallan 12yr).

Once that has finished, you are off to Part 2 of the process: moving your data over. The first screen you are presented with (after checks run) is Select Upgrade Data. It lists the data you can move over and the approximate number of scotches you will need for the wait. (Maybe that last part is made up, but hey, you can find out anyway, amirite?)

Since the environment I am moving is relatively pristine, I don’t have much data to move. The installer estimated 39 minutes, but it actually took less. You make your choice (it’s pretty straightforward what kind of data you would be interested in) and move to the next screen, which asks whether you want to join VMware’s CEIP, the Customer Experience Improvement Program. The last screen before the operation kicks off is a quick summary, with a checkbox at the bottom asking you to confirm you were a decent sysadmin and backed up the source vCenter before starting. I personally did not, but as I said, there was no data on it anyway. So we kick off the operation.

Clicking Finish gives you a notification box warning that the source vCenter will be shut down once this is complete. Acknowledge that and away we go!

Once it completes successfully, you are prompted to log in to your new vCenter, which I have done here, and here is the brand-new shiny.

I will also link the video of the process here. It is about 15 minutes long (cut down from roughly 45 minutes total). Disclaimers: there are many more things to think about before doing this in a production environment. Among them: will all the versions of the VMware products you run work together? You can find that out by referencing the Interoperability Matrix here:

https://www.vmware.com/resources/compatibility/sim/interop_matrix.php#interop
Interoperability Matrix for VMware products

You also need to make sure you can upgrade from your current version to the selected version by visiting the same page above, but under the Upgrade Path tab.

Another really important thing to consider is what order you need to upgrade your products. You can find that for 6.7 here.

https://kb.vmware.com/s/article/53710
Update sequence for vSphere 6.7 and its compatible VMware products (53710)