VMware VCP 2020 vSphere 7 Edition Exam Prep Guide (pt.2)

Picking up where we left off, here is Section 2. Once again, this version has been shaken up quite a bit from previous VCP objectives; this section is a bit lighter than Section 1. Let’s dig in.

Section 2 – VMware Products and Solutions

Objective 2.1 – Describe the role of vSphere in the software-defined data center (SDDC)

While I think most are acquainted with what VMware is referring to when they say SDDC, or Software-Defined Data Center, let’s do a quick refresher for anyone who may not be aware.

VMware’s vision is a data center that is fully virtualized and completely automated, with the end goal being that all of these different pieces are delivered as a service. vSphere is one of the main cornerstones and what makes the rest of this vision possible. What does this look like? VMware draws it as a layered stack (credit to VMware for the original diagram):

The bottom layer is hardware. The next layer up is vSphere, which provides software-defined compute and memory. Next is vSAN, which provides software-defined storage, and then NSX, which provides software-defined networking. Cloud management is the layer above all of those.

Becoming cloud-like is the goal. Why? Cloud services are mobile, easy to move around as needed, and easy to start up and scale both up and down. With a self-service portal and cloud-like services, requests that previously took weeks or months to fulfill now take hours or even minutes. Using automation to deliver these services ensures it’s done the same way, every time. Automation also makes it easy to track requestors and do appropriate charge-backs. vRealize Operations ensures you quickly see, and are notified, when you are low on resources and when to plan for more. Site Recovery Manager and vSphere Replication enable you to continue offering those services even in the case of disaster. But it all begins with vSphere.

Objective 2.2 – Identify use cases for vCloud Foundation

vCloud Foundation is a large portion of that SDDC stack above, but instead of requiring you to install each piece manually, it gives you an easy install button. This easy button comes in two forms. The first is an appliance called VMware Cloud Builder. This appliance was initially a way to help VMware Professional Services implement VMware Validated Designs; it was released to the general public in January 2019. The appliance itself can deploy the full SDDC stack, including:

        • VMware ESXi
        • VMware vCenter Server
        • VMware NSX for vSphere
        • VMware vRealize Suite Lifecycle Manager
        • VMware vRealize Operations Manager
        • VMware vRealize Log Insight
        • Content Packs for Log Insight
        • VMware vRealize Automation
        • VMware vRealize Business for Cloud
        • VMware Site Recovery Manager
        • vSphere Replication

The second easy button is an appliance installed in vCloud Foundation called SDDC Manager. This tool automates the entire lifecycle, from bring-up through configuration, provisioning, updating, and patching, not only for the initial management cluster but for infrastructure and workload clusters as well. It also makes deploying Kubernetes (vSphere with Kubernetes) much easier. For vCloud Foundation, the Cloud Builder appliance installs only the following:

        • SDDC Manager
        • VMware vSphere
        • VMware vSAN
        • NSX for vSphere
        • vRealize Suite

Now that we have a better understanding of what vCloud Foundation is, let’s talk use cases. VMware has highlighted the main ones here. Those use cases are:

        • Private and Hybrid Cloud
        • Modern Apps (Development)
        • VDI (Virtual Desktop Infrastructure)

It’s an exciting product, and VMware says that it simplifies management and deployment and reduces operational time. If you want to take a look at it, there are free Hands-On Labs VMware has made available here.

Objective 2.3 – Identify migration options

One of the coolest features of vSphere, in my opinion, is the ability to migrate VMs. The first iteration of this arrived in VMware VirtualCenter 1.0 in 2003. Specifically, this was a live migration: a running virtual machine, application and all, could move to another host with no interruption. This was amazing for the time, and it’s still a fantastic feature today. There are several different types of migrations. They are:

        • Cold Migration – This migration moves a powered-off or suspended VM to another host.
        • Hot Migration – This migration involves moving a powered-on VM to another host.

Additionally, different sub-types exist depending on what resource you want to migrate. Those are:

        • Compute only – This migrates a VM (its compute and memory), but not its storage, to another host.
        • Storage only – This migrates a VM’s storage, but not its compute and memory, to another datastore.
        • Both compute and storage – This is how it sounds: it moves both the compute/memory and the storage to a different location.

Previously, these migrations were known as vMotion (compute only), svMotion (storage only), and xvMotion or Enhanced vMotion (both compute and storage). To use this feature, the hosts on both sides of the migration must have a VMkernel network adapter enabled for vMotion. Other requirements include:

        • For a compute-only migration, both hosts must be able to access the datastore where the VM’s data resides.
        • At least a 1 Gb Ethernet connection
        • Compatible CPUs (or Enhanced vMotion Compatibility mode enabled on the cluster.)
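
If you script against vCenter, you can verify that first prerequisite before attempting a migration. Below is a minimal sketch using pyVmomi (the Python SDK for the vSphere API) that lists which VMkernel adapters on each host are enabled for the vMotion service; the vCenter address and credentials are hypothetical:

```python
# Sketch: list vMotion-enabled VMkernel adapters on every host.
# Hostname and credentials below are placeholders for your environment.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.HostSystem], True)
for host in view.view:
    # QueryNetConfig("vmotion") reports which vmknics carry the vMotion service
    cfg = host.configManager.virtualNicManager.QueryNetConfig("vmotion")
    selected = cfg.selectedVnic if cfg and cfg.selectedVnic else []
    print(f"{host.name}: {len(selected)} vMotion-enabled VMkernel adapter(s)")

view.Destroy()
Disconnect(si)
```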

Another type of migration is a cross vCenter migration. This migrates a VM between vCenter Server systems that are connected via Enhanced Linked Mode. The vCenter Servers’ times must be synchronized with each other, and both must be at vSphere 6.0 or later. Using cross vCenter Server migration, you can also perform a Long-Distance vSphere vMotion migration. This type of migration is a vMotion to another geographical area within 150 milliseconds of round-trip latency, and it requires a connection speed of at least 250 Mbps per migration.

Now that we have identified the types of migrations, what exactly is vSphere doing to work this magic? When the administrator initiates a compute migration:

        • A VM, called a “shadow VM,” is created on the destination host.
        • The source VM’s memory is copied over the vMotion network to the destination’s host VM. The source VM is still running and being accessed by users during this, potentially updating memory pages.
        • Another copy pass starts to capture those updated memory pages.
        • When almost all the memory has been copied, the source VM is stunned or paused for the final copy and transfer of the device state.
        • A Gratuitous ARP or GARP is sent on the subnet updating the VM’s location, and users begin using the new VM.
        • The source VM’s memory pages are cleaned up.
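
From an automation standpoint, all of those steps hide behind a single API call. Here is a minimal pyVmomi sketch, with hypothetical VM, host, and vCenter names, that starts a compute-only migration; the shadow VM, the memory copy passes, the stun, and the GARP all happen inside that one task:

```python
# Sketch: initiate a compute-only vMotion. Names are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

def find(vimtype, name):
    """Return the first inventory object of the given type with the given name."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    try:
        return next(o for o in view.view if o.name == name)
    finally:
        view.Destroy()

vm = find(vim.VirtualMachine, "app01")
dest = find(vim.HostSystem, "esxi02.lab.local")

# Only a destination host is given, so the VM's storage stays where it is.
task = vm.MigrateVM_Task(host=dest,
                         priority=vim.VirtualMachine.MovePriority.defaultPriority)
WaitForTask(task)
Disconnect(si)
```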

What about a storage vMotion?

        • Initiate the svMotion in the UI.
        • vSphere copies the data using the VMkernel data mover or, if you have a storage array that supports vSphere Storage APIs Array Integration (VAAI), offloads the copy to the array.
        • A new VM process is started.
        • Ongoing I/O is split using a “mirror driver” so that writes go to both the old and new virtual disks while the copy is in progress.
        • vSphere cuts over to the new VM files.
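
The storage side is also a single call. This sketch (same hypothetical names as above) uses RelocateVM_Task with only a datastore in the spec, so the compute stays in place while the mirror driver keeps both disk copies in sync:

```python
# Sketch: storage-only migration (svMotion). Names are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

def find(vimtype, name):
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    try:
        return next(o for o in view.view if o.name == name)
    finally:
        view.Destroy()

vm = find(vim.VirtualMachine, "app01")
ds = find(vim.Datastore, "datastore02")

# Only a datastore is set in the spec, so compute and memory stay on the
# current host while the VM's files move to the new datastore.
spec = vim.vm.RelocateSpec(datastore=ds)
WaitForTask(vm.RelocateVM_Task(spec))
Disconnect(si)
```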

Migrations are useful for many reasons. Being able to relocate a VM from one host or datastore to another lets sysadmins perform hardware maintenance, upgrade or update software, and redistribute load for better performance. You can also enable encryption for migrations to make them more secure. All told, it’s a massive tool in your toolbox.

Objective 2.4 – Identify DR use cases

Many types of disasters can happen in the data center, from something small, such as the power outage of a single host, to large-scale natural disasters. VMware tries to cover you with several types of DR protection.

High Availability (HA):

HA works by pooling hosts and VMs into a single resource group. Hosts are monitored, and in the event of a failure, VMs are restarted on another host. When you create an HA cluster, an election is held, and one of the hosts is elected master; all the others are subordinates. The master host has the job of keeping track of all the VMs that are protected and communicating with the vCenter Server. It also needs to determine when a host fails and distinguish that from a host that has merely lost network access. Hosts communicate with each other over the management network. There are a few requirements for HA to work:

        • All hosts must have a static IP or persistent DHCP reservation
        • All hosts must be able to communicate with each other, sharing a management network

HA has several essential jobs. One is determining the priority and order in which VMs are restarted when an event occurs. HA also provides VM and Application Monitoring. The VM Monitoring feature directs HA to restart a VM if it stops receiving heartbeats from VMware Tools; Application Monitoring does the same with heartbeats from an application. VM Component Protection, or VMCP, allows vSphere to detect datastore accessibility problems and restart the VM if a datastore becomes unavailable. For exam takers: in the past, VMware tried to trick people on exams by using the old name for HA, which was FDM or Fault Domain Manager.
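
To make that concrete, here is a minimal pyVmomi sketch, with a hypothetical cluster name and credentials, that enables HA on a cluster with host monitoring and heartbeat-based VM monitoring turned on:

```python
# Sketch: enable vSphere HA with VM monitoring on a cluster. Names are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster01")
view.Destroy()

spec = vim.cluster.ConfigSpecEx(
    dasConfig=vim.cluster.DasConfigInfo(
        enabled=True,                      # turn on vSphere HA
        hostMonitoring="enabled",          # restart VMs when a host fails
        vmMonitoring="vmMonitoringOnly"))  # restart VMs whose Tools heartbeats stop
WaitForTask(cluster.ReconfigureComputeResource_Task(spec, modify=True))
Disconnect(si)
```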

There are several options in HA you can configure. Most defaults will work fine and don’t need to be changed unless you have a specific use case. They are:

  • Proactive HA – This feature receives messages from a provider such as Dell’s OpenManage Integration plugin. Based on those messages, HA migrates VMs to a different host due to the possible impending doom of the original host. It makes recommendations in Manual mode or moves the VMs automatically in Automatic mode. After the VMs are off the host, you can choose how to remediate the sick host: place it in Maintenance mode, which prevents it from running any future workloads; put it in Quarantine mode, which allows it to run some workloads if performance is affected; or use a mix of those with Mixed mode.
  • Failure Conditions and responses – This is a list of possible host failure scenarios and how you want vSphere to respond to them. This gives you far more control than past versions (5.x) did.
  • Admission Control – What good is a feature that restarts VMs if you don’t have enough resources to do so? Admission Control is the gatekeeper that makes sure you have enough resources to restart your VMs in the case of a host failure. You can ensure resource availability in several ways: dedicated failover hosts, cluster resource percentage, slot policy, or you can disable it (not useful unless you have a specific reason). Dedicated failover hosts are dedicated hot spares; they do no work and run no VMs unless there is a host failure, which makes this the most expensive option (other than failure itself). Slot policy takes the largest VM CPU reservation and the largest VM memory reservation (they can come from two different VMs) and combines them into a “slot,” then determines how many slots your cluster can satisfy and how many hosts can fail while still keeping all VMs powered on. For example, if the largest CPU reservation is 2 GHz and the largest memory reservation is 4 GB, every slot is 2 GHz and 4 GB, and HA counts how many such slots fit on each host. Cluster resource percentage looks at the total resources needed and the total available and tries to keep enough resources free to permit you to lose the number of hosts you specify (subtracting the resources of those hosts); you can also override this and set aside a specific percentage. For any of these policies, if the cluster can’t satisfy the resources needed for the existing VMs in the case of a failure, it prevents new VMs from powering on.
  • Heartbeat Datastores – Used to monitor hosts and VMs when the HA network has failed. HA determines whether a host or a VM is still running by looking for lock files on the datastores. This feature automatically uses at least two datastores that all the hosts are connected to; you can specify more, or specific, datastores to use.
  • Advanced Options – You can use this to set advanced options for the HA Cluster. One might be setting a second gateway to determine host isolation. To use this, you need to set two options.
    1) das.usedefaultisolationaddress and
    2) das.isolationaddress[0-9]

    The first tells HA not to use the default gateway for isolation detection, and the second sets one or more additional isolation addresses.
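
As an illustration, this sketch sets those two advanced options through the same dasConfig shown earlier; the cluster name and the 192.168.1.254 isolation address are hypothetical stand-ins:

```python
# Sketch: set HA advanced options for isolation detection. Values are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster01")
view.Destroy()

spec = vim.cluster.ConfigSpecEx(
    dasConfig=vim.cluster.DasConfigInfo(option=[
        # do not ping the default gateway to detect isolation
        vim.option.OptionValue(key="das.usedefaultisolationaddress", value="false"),
        # ping this additional address instead
        vim.option.OptionValue(key="das.isolationaddress0", value="192.168.1.254")]))
WaitForTask(cluster.ReconfigureComputeResource_Task(spec, modify=True))
Disconnect(si)
```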

There are a few other solutions that touch more on Disaster Recovery.

Fault Tolerance

While HA keeps downtime to a minimum, the VM still needs to power back on, on a different host. If you have a higher-priority VM that can’t withstand even that brief outage, Fault Tolerance is the feature you need to enable.

Fault Tolerance, or FT, creates a second running “shadow” copy of a VM. In the event the primary VM fails, the secondary VM takes over, and vSphere creates a new shadow VM. This feature makes sure there is always a backup VM running on a second, separate host in case of failure. Fault Tolerance has a higher resource cost for that higher resilience; you are running two exact copies of the same VM, after all. There are a few requirements for FT:

        • Supports up to 4 FT VMs per host, with no more than 8 vCPUs between them
        • VMs can have a maximum of 8 vCPUs and 128 GB of RAM
        • HA is required
        • There needs to be a VMkernel with the Fault Tolerance Logging role enabled
        • If using DRS, EVC mode must be enabled.

Fault Tolerance works, essentially, as a vMotion that never ends. It uses a technology called Fast Checkpointing to take checkpoints of the source VM every 10 milliseconds or so and send that data to the shadow VM. This data is sent over a VMkernel port with Fault Tolerance logging enabled. Behind the scenes, two files are important: shared.vmft and .ft-generation. The first makes sure the UUID, the identifier for the VM’s disk, stays the same. The second covers the case where the two VMs lose connectivity with each other: that file determines which VM has the latest data, and that VM is designated the primary when both are back online.
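
Once the prerequisites above are in place, turning FT on for a VM is a single call. Here is a minimal sketch with a hypothetical VM name; CreateSecondaryVM_Task builds the shadow copy, and vCenter picks the secondary host if you don't pass one:

```python
# Sketch: turn on Fault Tolerance for a VM. Names are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "app01")
view.Destroy()

# Requires HA on the cluster and an FT-logging VMkernel port, per the list above.
WaitForTask(vm.CreateSecondaryVM_Task())
Disconnect(si)
```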

vSphere Replication

Remote-site Disaster Recovery options include vSphere Replication and Site Recovery Manager (SRM). You can use vSphere Replication on its own, or both products in conjunction, to replicate individual VMs or an entire site in case of failure or disaster. While I’m not going to delve deep into vSphere Replication or SRM, you should know their capabilities and, at a high level, how they work.

vSphere Replication is configured on a per-VM basis. Replication can happen from a primary to a secondary site or from multiple sites to a single target site. It uses a server-client model with appliances on both sides. VMkernel adapters with the vSphere Replication and vSphere Replication NFC (Network File Copy) roles can be used to create an isolated network for replication traffic.

Once you have your appliances set up and have chosen which VMs you want replicated, you need to decide on an RPO. RPO is short for Recovery Point Objective: how often you want the VM replicated. It can be as short as 5 minutes or as long as 24 hours.

Site Recovery Manager uses vSphere Replication but is much more complex and detailed. You can specify runbooks (recovery plans), orchestrate how to bring the other site up, test your failovers, and more.

These tools come in addition to VMware’s ability to integrate with many third-party companies for backups.

Objective 2.5 – Describe vSphere integration with VMware Skyline

VMware Skyline is a product available to VMware customers with a current Production or Premier Support contract. What is it? It’s a proactive support service integrated with vSphere, allowing VMware Support to view your environment’s configuration and the logs needed to speed up the resolution of a problem.

Skyline does this in a couple of ways. Skyline has a Collector appliance and a Log Assist feature that can upload log files directly to VMware (with the customer’s permission). Products supported by Skyline include vSphere, NSX for vSphere, vRealize Operations, and VMware Horizon. If you want to learn even more, visit the datasheet here.

That covers the second section. The next post is coming soon.