Blog

Oracle RAC with Eager Zeroed Thick VMDK expansion breaks the cluster!

Hello,

When you extend an eager zeroed thick VMDK that belongs to a clustered VM, the extended part is only lazy zeroed.

We found this out recently while expanding one of the VMDKs in an Oracle RAC cluster, and it broke the cluster.

If you have clustered VMs with VMDKs provisioned as eager zeroed thick, do not increase the VMDK size from the GUI. Doing so changes the format to lazy zeroed thick and might affect the cluster.

You may get the following error if you try it:

Failed to start the virtual machine.

Module DiskEarly power on failed.

Cannot open the disk ‘/vmfs/volumes/584ae890-e1a5f7e8-0a80-0090fac15f4e/VM.vmdk’ or one of the snapshot disks it depends on.

Thin/TBZ/Sparse disks cannot be opened in multiwriter mode.

VMware ESX cannot open the virtual disk “/vmfs/volumes/584ae890-e1a5f7e8-0a80-0090fac15f4e/VM.vmdk” for clustering. Verify that the virtual disk was created using the thick option.

The best option is to add a new VMDK (eager zeroed thick provisioned) or to increase the space from the command line. The blog post at https://blogs.vmware.com/vsphere/2012/06/extending-an-eagerzeroedthick-disk.html explains how to do this.
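As a rough sketch of the command-line route, the helper below only builds the vmkfstools command string; it does not run anything. It assumes the -X (extend to a new total size) and -d eagerzeroedthick options described in the linked post. The function name and the 100 GB size are my own illustration, so check the post and the KB for your ESXi version before running the resulting command on a host.

```python
import shlex


def build_extend_command(vmdk_path: str, new_size_gb: int) -> str:
    """Build a vmkfstools command line that grows a VMDK to new_size_gb.

    -X takes the new total size of the disk (not the increment), and
    -d eagerzeroedthick asks for the added region to be eager zeroed.
    This helper is hypothetical and only formats the command.
    """
    if new_size_gb <= 0:
        raise ValueError("new_size_gb must be positive")
    argv = ["vmkfstools", "-X", f"{new_size_gb}G",
            "-d", "eagerzeroedthick", vmdk_path]
    # Quote each argument so paths with spaces survive a copy and paste.
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # The 100 GB target size is an example value, not taken from the incident.
    print(build_extend_command(
        "/vmfs/volumes/584ae890-e1a5f7e8-0a80-0090fac15f4e/VM.vmdk", 100))
```

Printing the command first and pasting it into an SSH session on the host keeps a human in the loop, which seems sensible for a disk that is shared in multi-writer mode.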

Note: A code change introduced in ESXi 6.0 changed the process so that extending an eager zeroed virtual disk fully eager-zeroes the extension. This involves a stun operation on the virtual machine, which is expected behaviour.

VMware KB https://kb.vmware.com/kb/2135380

Before this code change, eager zeroed virtual disks were extended with a lazy-extend operation, which does not require the virtual machine to be stunned because it does not eager-zero the new extension.

NB: I also tried this on vCenter 6.0, and the format still changed to lazy zeroed.

Hope this helps!

vRealize Orchestrator cluster in inconsistent state !

Hello,

Recently we were migrating vCAC 6.2.0 to vRA 7.3 and ran into an issue while setting up the vRealize Orchestrator cluster.

The environment is a distributed installation with redundant components.

During troubleshooting we found that the vRealize Orchestrator cluster was not in sync.

We found the errors below in the logs:

2017-08-30 05:57:34.682+0000 [membershipScheduler-1] WARN  {} [ClusterTimeSynchVerifier] Current node: 456bf2ec-40ab-4b6f-9065-d539a8088ede time NOT SYNCHED with 89967fd2-3cca-4c83-a199-e22ec6915558 time

2017-08-30 05:57:39.683+0000 [membershipScheduler-1] WARN  {} [ClusterTimeSynchVerifier] Current node: 456bf2ec-40ab-4b6f-9065-d539a8088ede time NOT SYNCHED with 89967fd2-3cca-4c83-a199-e22ec6915558 time

This clearly shows that the time on the two vRO nodes was not in sync.
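If you want to scan a larger log for the same symptom, a small script can count these warnings per node pair. This is only a sketch: the regular expression is based on the two log lines shown above, and the function name is my own.

```python
import re
from collections import Counter

# Matches the ClusterTimeSynchVerifier warning format shown in the log excerpt.
_NOT_SYNCHED = re.compile(
    r"\[ClusterTimeSynchVerifier\] Current node: (?P<node>[0-9a-f-]+) "
    r"time NOT SYNCHED with (?P<peer>[0-9a-f-]+) time"
)


def count_time_sync_warnings(lines):
    """Return a Counter keyed by (current_node_id, peer_node_id)."""
    pairs = Counter()
    for line in lines:
        match = _NOT_SYNCHED.search(line)
        if match:
            pairs[(match.group("node"), match.group("peer"))] += 1
    return pairs


if __name__ == "__main__":
    sample = [
        "2017-08-30 05:57:34.682+0000 [membershipScheduler-1] WARN  {} "
        "[ClusterTimeSynchVerifier] Current node: "
        "456bf2ec-40ab-4b6f-9065-d539a8088ede time NOT SYNCHED with "
        "89967fd2-3cca-4c83-a199-e22ec6915558 time",
    ]
    for (node, peer), hits in count_time_sync_warnings(sample).items():
        print(f"{node} reported drift against {peer}: {hits} warning(s)")
```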

GSS carried out the steps below to resolve the issue:

  • Checked the time settings on the vRA VAMI page. The NTP servers had different IPs, and the time difference was approximately 6-7 minutes.
  • Removed the existing NTP entries and added the correct IPs. (Node 1 had 2 NTP servers and node 2 had only 1.)
  • After rebooting both appliances, the time was in sync and we did not see any more errors.
  • Manually removed both nodes from the cluster.
  • Manually joined the nodes to the cluster one by one.
  • Manually restarted the vco-server services.

This resolved the inconsistent cluster state.

 

Hope this helps!

 

 

 

“Unable to reverse replication for the virtual machine ‘VM NAME’, VRM Server generic error”

Hello,

One of my customers was performing a DR for one of their critical applications. The DR was successful and we were able to recover the VMs.

However, the application team needed to run the application on the DR site for a week, so it was important to re-protect the VMs and perform a reverse replication.

While performing the re-protect, we encountered the error “Unable to reverse replication for the virtual machine ‘VM NAME’, VRM Server generic error”

Please check the documentation for any troubleshooting information. The detailed exception is: ‘Unexpected status code: 503’.

Below is a screenshot of the error:

SRM_VR.PNG

 

We first checked the services on the SRM server and the vSphere Replication Management Server (VRMS) and found them running.

We restarted the VRMS service on each of the replication appliances anyway and tried the re-protect again.

This time we saw the error below in the SRM logs:

(dr.hbrProvider.fault.UnableToReverseReplication) {
-->    dynamicType = <unset>,
-->    faultCause = (hms.remote.fault.PbmConnectionFault) {
-->       dynamicType = <unset>,
-->       faultCause = (vmodl.MethodFault) null,
-->       faultMessage = (vmodl.LocalizableMessage) [
-->          (vmodl.LocalizableMessage) {
-->             dynamicType = <unset>,
-->             key = "com.vmware.vim.binding.hms.remote.fault.PbmConnectionFault.en",
-->             message = "Cannot connect to Profile-driven storage service.",
-->          }
-->       ],
-->       originalMessage = "Cannot connect to Profile-driven storage service.",
-->       msg = "Cannot connect to Profile-driven storage service."
-->    },
-->    productionVm = 'vim.VirtualMachine:vm-669',
-->    recoveredVm = 'vim.VirtualMachine:vm-1022',
-->    vmName = "XXXXXXXX",
-->    protectedVm = 'dr.replication.ProtectedVm:protected-vm-2317',
-->    msg = "",
--> }
So the error said that it was unable to connect to the Profile-driven storage service.

When we looked at both the production and recovery vCenter servers, we saw that the Profile-driven storage service was stopped.

We started the service, waited a few minutes, and started the re-protect again. This time it worked and reversed the replication of the VMs.

If you see similar issues in your infrastructure, check the state of the Profile-driven storage service on your vCenter servers. It should be running.

Hope this helps!

 

Get Ready for VMware Horizon Cloud on Microsoft Azure

Businesses today are increasingly mobile, globally distributed and fast paced. While this increasingly diverse workforce expects a consumer-simple interface for accessing work data and applications, IT faces the same challenge as before: securing the enterprise. The number one question facing organizations is: How can I ensure my users are productive, while securing data and driving […]


VMware Social Media Advocacy

vCenter displays incorrect IP address order in VM summary page for Windows & Linux OS !

One of my customers was facing an issue with the IP address order on the summary page of a virtual machine in vCenter Server.

The virtual machine had two network adapters inside the guest OS, the first configured with a Service IP and the second configured with a Backup IP.

The customer has integrated vCenter with an external monitoring solution, which picks the first IP of the VM from the summary page and runs scripts remotely on the server to gather VM statistics.

In one of the cases, the first IP address on the VM summary page was the Backup IP instead of the Service IP.

Because of this, the scripts were failing on the affected VMs.

We quickly looked at Edit Settings for the VM to check whether the order of the network adapters was correct, and found that it was.

41.png

Later we came across the VMware KB https://kb.vmware.com/kb/1002627, which says this is a known issue that VMware is aware of.

The KB also lists a workaround: change the network connection order from inside the Windows guest.

To change the order of network adapters in Windows 2000/Windows 2003/Windows XP:

  1. Click Start > Control Panel > Network Connections.
  2. Click the Advanced Tab.
  3. Click Advanced Settings.
  4. Click Adapters and Bindings.
  5. Change the order of the network cards and move the production network to the top using the Up and Down buttons.

To change the order of network adapters in Windows 2008/Windows 7:

  1. Click Start > Control Panel > Network and Sharing Center > Change adapter settings.
  2. Press and release the Alt key.
  3. Click Advanced > Advanced Settings….
  4. Change the order of the network cards and move the production network to the top using the Up and Down buttons.

Note: You may need to restart VMware Tools for the settings to take effect immediately.

The twist is that the KB does not mention a workaround for Linux operating systems.

We later confirmed that no workaround is available for Linux, but we were keen to know where vCenter, with the help of VMware Tools, gets this IP address order for a Linux guest.

We logged in to one of the Linux guests and ran ifconfig -a.

Here the first network adapter was the loopback adapter, the second was for the Service IP and the third for the Backup IP.

So this is again the same order that we saw in Edit Settings for the VM.

42

 

Digging deeper, we found that the order of the network adapters comes from the entries in /proc/net/dev.

In the output below, eno33449296 comes first; this is the interface that has the Backup IP assigned to it in the ifconfig -a output.

eno16780032, which has the Service IP assigned to it, comes second.

43

So this is where the IP order comes from before it is displayed on the VM summary page.

The order of interfaces in /proc/net/dev is determined by the Linux kernel.
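To check the order on your own guests without reading the file by eye, a short script can list the interface names in the order they appear. This is a minimal sketch: the function name is mine, and the counters in the sample text are made up for illustration (only the two interface names come from our case).

```python
def interface_order(proc_net_dev_text, skip_loopback=True):
    """Return interface names in the order they appear in /proc/net/dev."""
    names = []
    # The first two lines of /proc/net/dev are column headers.
    for line in proc_net_dev_text.splitlines()[2:]:
        name, sep, _counters = line.partition(":")
        if not sep:
            continue  # not an interface line
        name = name.strip()
        if skip_loopback and name == "lo":
            continue
        names.append(name)
    return names


# Illustrative content only; the counter values are invented.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
eno33449296:    1000      10    0    0    0     0          0         0     2000      20    0    0    0     0       0          0
eno16780032:    3000      30    0    0    0     0          0         0     4000      40    0    0    0     0       0          0
    lo:     500       5    0    0    0     0          0         0      500       5    0    0    0     0       0          0
"""

if __name__ == "__main__":
    print(interface_order(SAMPLE))
    # On a real Linux guest you could read the live file instead:
    # with open("/proc/net/dev") as handle:
    #     print(interface_order(handle.read()))
```

The first name in the returned list is the interface whose IP we saw at the top of the VM summary page in our case.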

We also tried hot adding the network interfaces one after another but we could not change the order in /proc/net/dev.

It seems there is no way of changing the interface order in /proc/net/dev 🙂

Hope this helps!

 

401 – Unauthorized: Access is denied due to invalid credentials on VMware vRealize Automation 6.2.0

Hello,

One of my customers was facing this issue too often on their cloud portal.

This issue was quite intermittent and was affecting random users.

The portal page would throw the server error “401 – Unauthorized: Access is denied due to invalid credentials”

unauthorised error

Before going further, I want to clarify that the customer was running vRealize Automation 6.2.0.

Initially we thought it was an issue with the certificates on the IaaS web server, as we had recently renewed the certificates on all the cloud components.

Some articles describe an SSL communication issue that occurs when the certificate does not match what is stored in the IaaS database.

Solution:

Perform the steps below on the IaaS server:

  1. Using a command prompt, navigate to the directory ‘C:\Program Files (x86)\VMware\vCAC\Server\Model Manager Data\Cafe’.
  2. Run vcac-config.exe UpdateServerCertificates -d vCAC -s servername

Here vCAC is the database name you created during the IaaS installation, and servername is the name of the server where SQL is installed.

  3. Type the command iisreset
  4. Wait for IIS to restart, relaunch the browser and try to log in. This should solve the issue.

But this did not help us resolve the issue.

Later we found that the cause was the source IP persistence entry timeout, which has a default value of 180 seconds on the load balancer.

The customer had a distributed installation of vRA where there were two IaaS nodes behind the load balancer.

“The default source IP address persistence option persists traffic based on the source IP address of the client for the life of that session and until the persistence entry timeout expires.

The next time a persistent session from that same client is initiated, it might be persisted to a different member of the pool and this decision is made by the load-balancing algorithm and is non-deterministic.”

To resolve the issue it was recommended to change the persistence entry timeout to 1800 seconds (30 minutes) and also match the vRealize Automation GUI timeout in the application profile of vRA GUI.
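To make the effect of the timeout easier to picture, here is a toy model of source IP persistence. It is not how any particular load balancer is implemented: the class name, the member names and the round-robin choice (standing in for the non-deterministic algorithm mentioned in the quote) are all assumptions for illustration, and the model refreshes the entry on every request.

```python
import itertools


class SourceIpPersistence:
    """Toy model of a source-IP persistence table with an entry timeout."""

    def __init__(self, members, timeout_seconds):
        self._next_member = itertools.cycle(members)
        self._timeout = timeout_seconds
        self._entries = {}  # client_ip -> (member, last_seen_seconds)

    def route(self, client_ip, now):
        """Return the pool member that serves client_ip at time `now`."""
        entry = self._entries.get(client_ip)
        if entry is not None and now - entry[1] <= self._timeout:
            member = entry[0]  # entry still valid: stick to the same member
        else:
            member = next(self._next_member)  # expired or new: pick again
        self._entries[client_ip] = (member, now)
        return member


if __name__ == "__main__":
    for timeout in (180, 1800):
        lb = SourceIpPersistence(["iaas-1", "iaas-2"], timeout)
        first = lb.route("10.0.0.5", now=0)
        # The user comes back after 5 minutes of inactivity.
        second = lb.route("10.0.0.5", now=300)
        print(f"timeout={timeout}s: {first} then {second}")
```

With a 180-second timeout the returning user in this model lands on the other IaaS node, where the existing session is not known; with 1800 seconds the user stays on the same node.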

This change on the load balancer resolved the issue.

The changes were made on the application profile of IaaS Web, vRealize Automation VA web and vRO configured on the load balancer by changing the timeout value to 1800.

LB application profile.png

That’s it.

This resolved the issue for the customer, and they no longer saw the annoying “401 – Unauthorized: Access is denied due to invalid credentials” error.

 

Hope this helps!

Learn basic to advance vSAN in 80 minutes

Purpose: If you are working in the Virtualization and Cloud space, then you must have heard about Software Defined Storage (SDS). It is part of the Software-Defined Data Center (SDDC), a term coined by VMware. Since VMware started the SDDC journey and is a pioneer in this technology, today I am going to cover VMware solution in…


VMware Social Media Advocacy