Wednesday, 23 August 2017

NUMA Optimization on HP ProLiant Gen9 Servers

G'day,

I've just come across Ingo Gegenwarth's post about NUMA optimization on HP ProLiant Gen9 servers that may boost a server's performance significantly.

Read his post at https://ingogegenwarth.wordpress.com/2017/07/27/numa-settings/

Cheers,
Zoltan

O365 IDM Bug: DirSync Errors but No Objects in Error

Hi All,

First things first, a Big Red WARNING: The following content applies to the specific context I encountered in one of my clients' O365 tenants. If you think you should follow it, please be extra careful that you understand the whats and whys of this post. It worked in my environment for my situation. It may not work in yours, or you may have to take additional steps to avoid data loss. If you are not careful, you MAY LOSE DATA.

I do NOT accept responsibility for any data loss. Proceed at your own risk.

I recently worked on an issue where the O365 admin portal reported DirSync errors...


... but when I looked at the detailed list of objects in error, the list was blank:


While I don't know what exactly happened, here is what I think the customer has done, as it makes the most sense:
  • Some users left the company. Their external contacts kept sending e-mail, and these e-mails had to be delivered somewhere so that business wasn't affected.
  • The client's admin decided to delete the departed users' mailboxes and AD objects. For each one, the admin created a distribution list, assigned it the departed user's e-mail address, and populated it with the recipients nominated to handle that user's incoming e-mail. I have no idea what happened to the departed users' mailbox content, and it is irrelevant for the purpose of this post.
  • AADConnect deleted the corresponding O365 objects by means of AD sync. Consequently, O365 moved those objects into the dumpster.
  • AADConnect tried to sync the new distribution groups, but an e-mail address conflict was detected between the new groups and the deleted objects in the dumpster. The sync succeeded, but with errors. For more details see the Duplicate Attribute Resiliency feature described in the first bullet point in the References section at the end of this post.
From this moment on, no matter what was done, the DirSync error persisted.

To fix it, I had to:
  • Purge the old accounts from the O365 dumpster.
  • Delete the distribution groups that were in error from O365 by moving them out of the AADConnect sync scope and having it do an AD sync.
  • Recreate the distribution groups by moving them back into the AADConnect sync scope and having it do an AD sync.

Here comes the long story. You'll need the Windows Azure Active Directory Module for Windows PowerShell.

Step 1: Connect to the O365 tenant with a tenant admin account:

Connect-MsolService

Step 2: List the objects in error:

Get-MsolDirSyncProvisioningError


The command gives us a couple of useful tips:
- a DisplayName
- the conflicting properties in the ProvisioningErrors field
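
To make the conflicting property easier to spot, the output can be narrowed and shown as a list. This is only a sketch: it assumes a session already opened with Connect-MsolService, and that your version of the module exposes the PropertyConflict error category and the property names below, so check them against the cmdlet help first.

```powershell
# Assumes an existing session created with Connect-MsolService.
# Show only property conflicts, one object per block, so the
# ProvisioningErrors field is not truncated by the table view.
Get-MsolDirSyncProvisioningError -ErrorCategory PropertyConflict |
    Format-List DisplayName, ObjectType, ProxyAddresses, ProvisioningErrors
```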

I also searched recipients for any conflicting addresses - no joy.

I checked the O365 dumpster for any deleted accounts. Bingo! Some entries had property values also found in the objects in error.

Step 3: List the objects in the dumpster, along with their proxy addresses. I used the -Wrap parameter of ft (Format-Table) so that a long list of proxy addresses isn't truncated (type it all on one line).

Get-MsolUser -ReturnDeletedUsers | ft DisplayName,ProxyAddresses -Wrap



In my case the client had deleted the user accounts of staff who had moved on and replaced them with distribution groups that carried the departed users' e-mail addresses and were populated with the people now handling their mail. As a result, the same e-mail address existed both on the new distribution group and on the deleted object in the dumpster, which hadn't been purged yet.

Hint: Purging the dumpster and re-synching AD doesn't solve the problem. It looks like it's a bug in the O365 identity management system. We still need to purge the dumpster but some additional steps are needed, so read on.

Step 4: To remove objects from the dumpster, use the command (type it all on one line):

Get-MsolUser -ReturnDeletedUsers | Remove-MsolUser -RemoveFromRecycleBin -Force
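
The command above purges every deleted user in the tenant. If that is more than you want, a narrower variant is to purge only the deleted users that carry a conflicting address. Treat the following as a sketch under assumptions: the address is a made-up placeholder, and you should review the list it prints before letting the last command run.

```powershell
# Hypothetical address: replace with a value reported in ProvisioningErrors.
$conflict = 'smtp:departed.user@contoso.com'

# Find only the deleted users that carry the conflicting address.
# -contains is case-insensitive, so SMTP: and smtp: prefixes both match.
$targets = Get-MsolUser -ReturnDeletedUsers -All |
    Where-Object { $_.ProxyAddresses -contains $conflict }

# Review the candidates before deleting anything.
$targets | Format-Table DisplayName, UserPrincipalName, ObjectId -AutoSize

# Permanently remove the reviewed objects from the dumpster.
$targets | ForEach-Object {
    Remove-MsolUser -ObjectId $_.ObjectId -RemoveFromRecycleBin -Force
}
```

Like the one-liner, this deletion cannot be undone, so the warning at the top of this post still applies.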

The command has no output:


Once the dumpster is purged and a new AD sync has run (initiate one or wait for the next cycle), you would think it's all sorted. Unfortunately it isn't. Run Get-MsolDirSyncProvisioningError after the re-sync and you'll see that the conflicting objects are still in error.

No, the sync process isn't (yet) that evolved. A few more steps to go to get it sorted.

Step 5: Delete the object that is reported by Get-MsolDirSyncProvisioningError as being in conflict:
  • In the on-premises AD, move the objects in conflict outside the AADConnect sync scope (you didn't set the scope to the entire domain, right?)
  • Initiate an AD sync or wait for it to happen on schedule.
  • Confirm that the objects have been deleted in the O365 tenant.
  • Check the dumpster again. If the objects were moved there, purge them as detailed in Step 4 above.
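
If you'd rather not wait for the scheduled cycle, a sync can be started by hand on the AADConnect server. This is a sketch that assumes a build of AADConnect recent enough to include the built-in scheduler and its ADSync PowerShell module; older DirSync or Azure AD Sync installations use different tooling.

```powershell
# Run on the AADConnect server in an elevated PowerShell session.
Import-Module ADSync

# A delta cycle is normally enough to pick up objects moved between OUs.
# Use -PolicyType Initial instead if you want to force a full sync.
Start-ADSyncSyncCycle -PolicyType Delta
```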

Step 6: Run the Get-MsolDirSyncProvisioningError command again. It should return nothing:


Step 7: Move the objects back into the AADConnect sync scope in the local AD and initiate a sync.

Step 8: Confirm that the objects have been re-created in O365.

Step 9: Confirm that the Get-MsolDirSyncProvisioningError command returns no more entries, and that's also reflected in the portal:


That's it.

Please note that in my case I only had to deal with Distribution Groups. Deleting and recreating them in O365 did not involve any user data, contrary to mailbox objects. That would have complicated things. Please see my warning at the top of this post.

References:

Happy error hunting!

Friday, 11 August 2017

I Moved to a New ISP and Mail Stopped Flowing - What Now?

Hi There,

A customer of mine has a Hybrid Exchange environment. They recently moved from one ISP to another and mail stopped flowing.

The DNS records, firewall rules and NAT rules had been updated. The customer waited long enough for the DNS changes to propagate. Still no joy.

What was missing? Searching the Internet returned no meaningful results.

For those of you going through the same experience: once you have moved to the new ISP, updated your DNS records and changed your firewall and NAT rules, simply re-run the Hybrid Configuration Wizard. It repopulates the hybrid configuration with the new details and restores mail flow.

The same applies when the public IP address changes, regardless of the reason.

Have a nice day :-)

Password Sync - No Recent Synchronization

Hi There,

This is another one of many posts about AAD Connect failing to synchronise passwords, this time with some additional clarifications.

The error:


The context:

  • The admin configured his own account on the AD DS connector in the management agent.
  • The admin changed his password over time. AD sync broke.
  • To correct this, a new service account dedicated to AD access was created and the connector was configured to use it. AD sync started working again.
What was missed: the new account was never granted permission to synchronise passwords. User properties were synchronised, but password hashes were not.

Directory Synchronization logged informational events with ID 611 in the Application event log:


The relevant bit: RPC Error 8453 : Replication access was denied. There was an error calling _IDL_DRSGetNCChanges

This is due to the fact that the connector account did not have the following permissions - see https://msdn.microsoft.com/en-us/library/azure/dn757602.aspx:
  • Replicating Directory Changes
  • Replicating Directory Changes All
These permissions are granted on the domain root.

Open Active Directory Users and Computers and in the View menu enable Advanced Features.


Right-click on the domain name, Properties, Security. Add the account and grant the permissions:


Wait for the next synchronization cycle or kick one off manually. Passwords should now sync successfully.
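
For reference, the same two permissions can also be granted from a command line with dsacls instead of the GUI. A sketch only: the domain DN and the account name below are made-up placeholders, and the command has to be run by someone with rights to change permissions on the domain root.

```powershell
# Hypothetical values: replace with your own domain DN and the account
# configured on the AD connector.
$domainDn = 'DC=contoso,DC=com'
$account  = 'CONTOSO\svc-aadconnect'

# CA = control access (extended right), granted on the domain root.
dsacls.exe $domainDn /G "${account}:CA;Replicating Directory Changes"
dsacls.exe $domainDn /G "${account}:CA;Replicating Directory Changes All"
```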

One last thing: the account you have to give permissions to is NOT what's configured in the Microsoft Azure AD Sync service:



Instead, the permissions have to be granted to the account configured on the AD connector:


References:
Happy syncing!

Friday, 3 March 2017

Where Has My Licence Gone?!

"Who removed my licence?"
"Probably Microsoft. Your 30-day grace period has likely expired..."

Hi All,

Recently one of my clients reported that some users had lost their O365 licence. They were working yesterday and could no longer log on today - the licence was wiped. Completely. No prior notice.

What was going on?

I searched the audit log. I could see the admin re-assigning the licence so that the user can work, but no trace of its removal.

To cut it short, it ended up in Microsoft's lap. After a couple of weeks of log analysis and investigation, it turned out that following a hybrid mailbox on-boarding, the O365 licence had failed to be applied to a small group of users. Even though unlicensed, they were able to connect and use their e-mail because they were within the 30-day grace period. Once the 30-day grace period ended, the mailboxes were disconnected.

Why didn't I see an entry in the audit log? Simple: A licence removal event didn't occur because there was no licence to remove in the first place - remember, the licence assignment failed. Going back as far as when the mailbox was migrated, we could see an audit log entry for the failed attempt to assign a licence. This can happen when licence assignment is scripted, and it is easily missed, especially when you have lots of accounts.

While this resolved the mystery, it also revealed a couple of shortcomings of O365:

  • We couldn't tell from the O365 audit log entries which licences were assigned to or removed from the user. Microsoft pointed out during the case that O365 and Azure maintain separate audit logs, and that the Azure log is the more detailed of the two. The O365 log does show the activity and who actioned it, though. NOTE: It may take up to 12 hours for the action to appear in the log.
  • O365 does NOT alert about the imminent end of the 30-day grace period.
  • Microsoft has very little documentation about the 30-day grace period.
On the topic of documentation, the engineer who worked on the case passed me this link, which states (clutter removed by me):
Assume that you have a hybrid deployment of Microsoft Exchange Online in Microsoft Office 365 and on-premises Microsoft Exchange Server...
If a license is not assigned to the user, the mailbox may be disconnected...
This issue occurs if the mailbox was migrated to Exchange Online as a regular user mailbox ... If the user isn't licensed, and if the 30-day grace period has ended, the mailbox is disconnected...

Now that I know what to look for, I've come across this link which states:
After you create a new mailbox using the Exchange Management Shell, you have to assign it an Exchange Online license or it will be disabled when the 30-day grace period ends.

Takeaway #1: Always check licences after a mailbox on-boarding in a hybrid migration.

Takeaway #2: Monitor your users regularly for their licensed status. Automatic alerting may be flaky, so if you are a developer you may want to whip up an application that uses the audit APIs to extract data and send alerts.
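
A simple starting point for such a check is to list the unlicensed users in the tenant. This is a minimal sketch that assumes the Azure AD module used earlier on this blog and a session opened with Connect-MsolService; the list will also contain accounts that are legitimately unlicensed, so it needs a human eye or further filtering.

```powershell
# Assumes an existing session created with Connect-MsolService.
# List every unlicensed user, oldest first, so that accounts approaching
# the end of a grace period stand out.
Get-MsolUser -All -UnlicensedUsersOnly |
    Sort-Object WhenCreated |
    Format-Table UserPrincipalName, DisplayName, WhenCreated -AutoSize
```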

Takeaway #3: You need to search the correct audit log. There are a couple of Security and Compliance centers in different places on the portal. The one you're after is under Admin centers | Security & Compliance, then in the new window navigate to Search & Investigation | Audit Log Search.


Once there, select to search for the Update user and Changed user license activities in the User administration activities section:


Happy auditing!

Friday, 10 February 2017

Email and UPN do not belong to the same namespace

Hi There,

Just recently I helped out in a case where hybrid Exchange users with the mailbox in the cloud were failing to retrieve autodiscover configuration data.

It was an environment with multiple federated domains, and mailboxes split between on-premises and O365. On-premises mailboxes worked well, only O365 mailboxes were failing, and, more bizarrely, one of the domains worked while the others didn't.

As we know, mail clients for onboarded hybrid mailboxes go through a double autodiscover process:

  1. The first iteration discovers the on-premises mail user, which then redirects the client to the cloud mailbox.
  2. The client then goes through a second autodiscover process, this time against the cloud mailbox.

Looking at the Microsoft Remote Connectivity Analyzer output, I noticed that the first iteration succeeded. The issue was with the second pass. The error:

X-AutoDiscovery-Error: LiveIdBasicAuth:LiveServerUnreachable:<X-forwarded-for:40.85.91.8><ADFS-Business-105ms><RST2-Business-654ms-871ms-0ms-ppserver=PPV: 30 H: CO1IDOALGN268 V: 0-puid=>LiveIdSTS logon failure '<S:Fault xmlns:S="http://www.w3.org/2003/05/soap-envelope"><S:Code><S:Value>S:Sender</S:Value><S:Subcode><S:Value>wst:InvalidRequest</S:Value></S:Subcode></S:Code><S:Reason><S:Text xml:lang="en-US">Invalid Request</S:Text></S:Reason><S:Detail><psf:error xmlns:psf="http://schemas.microsoft.com/Passport/SoapServices/SOAPFault"><psf:value>0x800488fc</psf:value><psf:internalerror><psf:code>0x8004786c</psf:code><psf:text>Email and UPN do not belong to the same namespace.%0d%0a</psf:text></psf:internalerror></psf:error></S:Detail></S:Fault>'<FEDERATED><UserType:Federated>Logon failed "User@domain.tld".;

Needless to say, I checked on-premises and cloud user and mailbox properties, UPNs, addresses, connector address spaces, proxy addresses, the lot, to no avail. It was all configured correctly.

Then I checked ADFS. There I found a couple of errors, so I turned on debug trace. Nothing obvious there either, so I turned it off.

I tested ADFS logon via https://sts.domain.tld/adfs/ls/IdpInitiatedSignon.aspx: Successful.

ADFS was working well per se. Somehow, it was getting incorrect details from O365.

I also suspected that the ADFS proxy was somehow breaking the SSL stream - I had dealt with a similar situation before. However, this idea was dropped when the original (3rd party) proxy was replaced with Microsoft's native Web Application Proxy and the issue remained.

To recap:

  • Users had matching e-mail and UPN
  • ADFS itself was working
  • Multiple federated domains
  • ADFS Proxy (WAP) ruled out


Then I decided to re-federate the domains. Since I had very little (a.k.a. none whatsoever) information on how it was set up initially, and no details on any deployment history, tabula rasa seemed in order.

So I converted the domains to Managed, then re-federated them. Since there are multiple federated domains, I used the Convert-MsolDomainToFederated cmdlet with the SupportMultipleDomain switch.
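
For readers who haven't done this before, a round trip of this kind can look roughly like the following for one domain. It is a sketch, not a record of the exact commands used here: the domain name, ADFS server name and file path are placeholders, and the choice of -SkipUserConversion $true is an assumption that suits an immediate re-federation. Read the help for both cmdlets before running anything, because converting a domain affects how every user in it signs in.

```powershell
# Hypothetical values: replace with your own.
$domain   = 'domain.tld'
$adfsHost = 'adfs01.domain.tld'            # internal ADFS server, not the proxy
$pwFile   = 'C:\Temp\converted-users.txt'  # required by the cmdlet

Connect-MsolService
# Needed when the commands are not run on the ADFS server itself.
Set-MsolADFSContext -Computer $adfsHost

# Federated -> Managed. Skipping user conversion is an assumption here:
# the domain is re-federated straight away.
Convert-MsolDomainToStandard -DomainName $domain -SkipUserConversion $true -PasswordFile $pwFile

# Managed -> Federated, with the switch for multiple federated domains.
Convert-MsolDomainToFederated -DomainName $domain -SupportMultipleDomain

# Check the resulting authentication type of each domain.
Get-MsolDomain | Format-Table Name, Authentication -AutoSize
```

Repeat the two conversion commands for each federated domain.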

Coffee time, to give the changes time to take effect (it's a distributed environment and things don't happen instantly).

<suspense>Then came the test</suspense> ...  Lo and behold: everything started to work! Every federated domain, every service.

In summary: The hybrid environment and user accounts were configured correctly, yet the wrong details were passed by O365 to ADFS. Either the trusts were incorrectly configured, or the federation metadata got corrupted. Whatever it was, re-federating the domains fixed it.