
SharePoint: Common NTLM Authentication Issues, aka: Consider Ditching NTLM


NTLM authentication is not great.

It’s not the fastest. In most cases, that honor would go to Kerberos.
It’s not the most secure. Again, Kerberos.
It’s not all that flexible. For example, it doesn’t work well for extranets or anything cross-firewall. In those scenarios, Trusted Provider auth (SAML / WS-Fed) works well.  See: AD FS.

 

So why do so many still use it?

It’s the old stand-by. It works well enough, and there’s typically nothing extra you need to configure to get it to work. You just turn it on and it works. Unless it doesn’t, which is what this post is about.

 

Problems with NTLM usually manifest themselves in one of two ways:

1. Users cannot log in at all. They receive authentication prompts and then a 401 – Access Denied.
2. Users receive (seemingly) random authentication prompts when browsing SharePoint sites.

 

One thing to keep in mind when troubleshooting NTLM issues with SharePoint is that the problem is almost always external to SharePoint. Aside from turning it on or off, there’s not really anything you can configure inside of SharePoint to make NTLM work better or worse.

To enable NTLM, this is all you do within Central Administration | Manage Web Applications | <Your web app> | Authentication Providers:

 

And this is the resulting configuration in IIS Manager | <Your Site> | Authentication | Windows Authentication | Providers:

 

Here are some known issues with NTLM in no particular order:

Issue #1:

The network load balancer (NLB) is bouncing the client between web-front-ends (WFEs) in the middle of the "NTLM Handshake".

Note: See "other troubleshooting tips" section below for details on the "NTLM Handshake".

I know there’s some documentation out there that suggests that session persistence / affinity / "sticky sessions" is no longer required with the advent of Distributed Cache in SharePoint 2013 and above. However, that is not the case, at least not as long as you’re using NTLM.

Staying on the same WFE is vital to any challenge / response authentication process (like NTLM).
Clearly, if the NTLM challenge comes from one WFE, but we send the response to another, that’s not going to work.

See this: https://en.wikipedia.org/wiki/Challenge–response_authentication

“A more interesting challenge–response technique works as follows. Say, Bob is controlling access to some resource. Alice comes along seeking entry. Bob issues a challenge, perhaps "52w72y". Alice must respond with the one string of characters which "fits" the challenge Bob issued. The "fit" is determined by an algorithm "known" to Bob and Alice. (The correct response might be as simple as "63x83z" (each character of response one more than that of challenge), but in the real world, the "rules" would be much more complex.) Bob issues a different challenge each time, and thus knowing a previous correct response (even if it isn't "hidden" by the means of communication used between Alice and Bob) is of no use. A part of Alice's response might convey that it is Alice who is seeking authentication.”

Now consider the above "Bob and Alice" scenario without session persistence (sticky sessions).
Bob issues the challenge. Alice sends the response to Fred, who has no idea what she’s talking about. Authentication fails.
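To make the Bob / Alice / Fred scenario concrete, here is a minimal Python sketch. It is a toy model, not NTLM: the "add one to each character" rule comes from the quoted example, and the `WebFrontEnd` class, the client id, and the server names are all made up for illustration. The only point is that the server verifying the response has to be the one that issued the challenge.

```python
import secrets


def expected_response(challenge: str) -> str:
    # Toy rule from the quoted example: each character shifted up by one.
    return "".join(chr(ord(c) + 1) for c in challenge)


class WebFrontEnd:
    """Hypothetical stand-in for a WFE that remembers its own challenges."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.pending = {}  # client id -> challenge issued by THIS server

    def issue_challenge(self, client_id: str) -> str:
        challenge = secrets.token_hex(3)
        self.pending[client_id] = challenge
        return challenge

    def verify(self, client_id: str, response: str) -> bool:
        challenge = self.pending.pop(client_id, None)
        if challenge is None:
            return False  # this server never challenged this client
        return response == expected_response(challenge)


bob, fred = WebFrontEnd("WFE1"), WebFrontEnd("WFE2")

# Sticky sessions: challenge and response hit the same server.
c = bob.issue_challenge("alice")
sticky_ok = bob.verify("alice", expected_response(c))

# No sticky sessions: the response lands on a different server.
c = bob.issue_challenge("alice")
bounced_ok = fred.verify("alice", expected_response(c))

print(sticky_ok, bounced_ok)  # True False
```

Fred rejects Alice even though her response is correct, simply because he has no record of the challenge.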

To verify whether or not this is happening, I would suggest using HTTP Response Headers with Fiddler as I detailed in a previous post.

Solution #1:

Configure your NLB for "sticky sessions" so that a given client stays on a given WFE, at least throughout the authentication process.

 

Issue #2:

Users are denied access due to settings in the local security policy on the WFEs.

Reproduce the problem and take a look at the Security Event Log on the WFE. You may see a logon failure event like this:

Log Name: Security
Source: Microsoft-Windows-Security-Auditing
Event ID: 4625
Task Category: Logon
Level: Information
Keywords: Audit Failure
Computer: WFE1.contoso.com
Description:
An account failed to log on.

Subject:
 Security ID: S-1-0-0
 Account Name: -
 Account Domain: -
 Logon ID: 0x0

Logon Type: 3

Account For Which Logon Failed:
 Security ID: S-1-0-0
 Account Name: user1
 Account Domain: contoso

Failure Information:
 Failure Reason: The user has not been granted the requested logon type at this machine.
 Status: 0xc000015b
 Sub Status: 0x0

Detailed Authentication Information:
 Logon Process: NtLmSsp
 Authentication Package: NTLM

 

A logon type of “3” is a network logon. The failure reason tells us that there is something in the local security policy (possibly set by Group Policy) that is not allowing the user to log on.
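If you are looking at a lot of these events, it can help to translate the Status and Logon Type values into names. The sketch below is a small lookup I put together for illustration. The tables are deliberately incomplete and `describe_status` / `describe_logon_type` are hypothetical helper names, not part of any Windows tooling.

```python
# A few common NTSTATUS values seen in event 4625 (not a complete list).
NTSTATUS_NAMES = {
    0xC000015B: "STATUS_LOGON_TYPE_NOT_GRANTED",
    0xC000006D: "STATUS_LOGON_FAILURE",
    0xC000006A: "STATUS_WRONG_PASSWORD",
    0xC0000064: "STATUS_NO_SUCH_USER",
    0xC0000234: "STATUS_ACCOUNT_LOCKED_OUT",
}

# A few common logon types (not a complete list).
LOGON_TYPES = {2: "Interactive", 3: "Network", 10: "RemoteInteractive"}


def describe_status(status_text: str) -> str:
    """Map a Status string such as '0xc000015b' to its symbolic name."""
    code = int(status_text, 16)
    return NTSTATUS_NAMES.get(code, f"unknown (0x{code:08X})")


def describe_logon_type(logon_type: int) -> str:
    return LOGON_TYPES.get(logon_type, f"unknown ({logon_type})")


print(describe_status("0xc000015b"), "/", describe_logon_type(3))
```

For the event above, this prints `STATUS_LOGON_TYPE_NOT_GRANTED / Network`, which matches the failure reason text in the event description.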

 

Solution #2:

Run SecPol.msc from the Run prompt or command line.
Check Local Policies | User Rights Assignment.
These two policies should be your focus:

  • Access this computer from the network
  • Deny access to this computer from the network

Check all group memberships for your problem user(s) to make sure they are allowed access from the network and not explicitly denied via those two policies.

 

Issue #3:

No one agrees on which version of NTLM to use.

There are different versions of NTLM and additional security options within them. If the client, WFE, and Domain Controller (DC) can’t find common ground, the authentication will fail.

Reference: https://technet.microsoft.com/en-us/library/2006.08.securitywatch.aspx

 

Solution #3:

Check the LmCompatibilityLevel Registry key for client, WFE, and DCs.
Make sure the value is compatible between the three:
Reference: http://technet.microsoft.com/en-us/library/cc960646.aspx
LmCompatibilityLevel is located here:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa
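As a rough sketch of that check, the Python below reads the value with the standard `winreg` module (Windows only) and applies one simplified compatibility rule. The level descriptions follow the documented 0 to 5 settings. The `likely_compatible` rule is my own simplification and only catches the most common mismatch: a client that never sends NTLMv2 talking to a DC that accepts nothing else. Treat it as a starting point, not a full model of the negotiation.

```python
import sys

LM_LEVELS = {
    0: "Send LM and NTLM responses",
    1: "Send LM and NTLM; use NTLMv2 session security if negotiated",
    2: "Send NTLM response only",
    3: "Send NTLMv2 response only",
    4: "Send NTLMv2 response only; refuse LM",
    5: "Send NTLMv2 response only; refuse LM and NTLM",
}


def read_lm_compatibility_level():
    """Return the configured level, or None if the value is not set."""
    import winreg  # only available on Windows

    path = r"SYSTEM\CurrentControlSet\Control\Lsa"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        try:
            value, _value_type = winreg.QueryValueEx(key, "LmCompatibilityLevel")
        except FileNotFoundError:
            return None  # value absent: the OS default applies
    return value


def likely_compatible(client_level: int, dc_level: int) -> bool:
    # Simplified assumption: a client below level 3 never sends NTLMv2,
    # and a DC at level 5 refuses everything except NTLMv2.
    return not (client_level < 3 and dc_level == 5)


if sys.platform == "win32":
    level = read_lm_compatibility_level()
    print(level, LM_LEVELS.get(level, "not set (OS default applies)"))
```

Run it on the client, the WFE, and a DC, then compare the three values.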

 

Issue #4:

DNS problems.

This is most likely to occur for users that are in a remote domain or trusted forest. If DNS is not configured properly, the SharePoint WFE will not be able to get the proper IP address for a remote domain controller.

This one is a little harder to nail down. It can take a network trace with Netmon or Wireshark to fully diagnose. However, a good indication of the problem may lie in your IIS logs.

Check the IIS log for the problem SharePoint site. You may see that the final request that includes the whole NTLM token receives a 401.1 with a particular sc-win32-status of 2148074257.

For example:

10.87.68.93 GET /sites/Pages/allitems.aspx 443 – 192.168.56.21 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.1;+WOW64;+Trident/7.0;+SLCC2;+.NET+CLR+2.0.50727;+.NET+CLR+3.5.30729;+.NET+CLR+3.0.30729;+Media+Center+PC+6.0;+.NET4.0C;+.NET4.0E;+InfoPath.3) https://teams.contoso.com/sites/team1/pages/default.aspx 401 1 2148074257 470 2787 31 

An “sc-win32-status” of “2148074257” (0x80090311) means "SEC_E_NO_AUTHENTICATING_AUTHORITY", i.e., we can't find a domain controller that is authoritative for that domain.
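IIS logs the win32 status in decimal, while the SSPI error codes are documented in hex, so a quick conversion helps. The sketch below pulls the status fields off the end of a log line and names the code. It assumes the line ends with sc-status, sc-substatus, sc-win32-status, sc-bytes, cs-bytes, and time-taken, as in the example above. Your field order depends on the `#Fields` header in your own log, and the user agent in the sample line is shortened for readability.

```python
import re

# Two SSPI codes that commonly show up with NTLM failures (not a complete list).
SSPI_NAMES = {
    0x80090311: "SEC_E_NO_AUTHENTICATING_AUTHORITY",
    0x8009030C: "SEC_E_LOGON_DENIED",
}

# Assumed tail: sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken
TAIL = re.compile(r"(\d{3}) (\d+) (\d+) \d+ \d+ \d+\s*$")


def parse_status(log_line: str):
    """Return (status, substatus, win32_status, name) or None if no match."""
    match = TAIL.search(log_line)
    if match is None:
        return None
    status, substatus, win32 = (int(group) for group in match.groups())
    return status, substatus, win32, SSPI_NAMES.get(win32, hex(win32))


line = (
    "10.87.68.93 GET /sites/Pages/allitems.aspx 443 - 192.168.56.21 "
    "Mozilla/4.0 https://teams.contoso.com/sites/team1/pages/default.aspx "
    "401 1 2148074257 470 2787 31 "
)
print(parse_status(line))
# (401, 1, 2148074257, 'SEC_E_NO_AUTHENTICATING_AUTHORITY')
```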

Reference: https://msdn.microsoft.com/en-us/library/windows/desktop/aa375512(v=vs.85).aspx

 

Solution #4:

Fix your DNS so that the SharePoint servers get the proper IPs for remote domain controllers.
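A quick way to see what a WFE actually resolves is to ask the resolver directly from that server. This is a minimal sketch using only the Python standard library. The `resolve` and `report` helper names are hypothetical, and it only checks host (A / AAAA) lookups, not the SRV records that DC location also relies on.

```python
import socket


def resolve(hostname: str) -> list:
    """Return the sorted, de-duplicated IPs the local resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []  # name did not resolve from this machine
    return sorted({info[4][0] for info in infos})


def report(hostnames) -> None:
    for name in hostnames:
        addresses = resolve(name)
        print(name, "->", ", ".join(addresses) if addresses else "NOT RESOLVED")
```

Run on each WFE, a call such as `report(["dc1.remote.contoso.com"])` (a made-up DC name) shows whether that server gets the IP you expect for the remote domain controller.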

You should also verify your domain and forest trusts.

 

Issue #5:

MaxConcurrentApi

This is a bit of a complicated topic, but you can sum it up like this:
There is a finite number of Netlogon process threads available for NTLM authentication on both the SharePoint WFEs and the domain controllers. When that number is exceeded, authentication requests can fail.

This typically happens in large environments with heavy NTLM traffic, and especially when that authentication occurs across domain trusts.
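To get a feel for why a small thread limit hurts under load, here is a toy queue simulation in Python. It is not how Netlogon is implemented. The slot counts, hold time, and timeout are made-up numbers, and `simulate` is a hypothetical helper. It only models "a fixed number of slots, each request holds one for a while, and waiters give up after a timeout".

```python
import heapq


def simulate(arrivals, slots, hold_time, timeout):
    """Return how many requests give up waiting for a free slot.

    arrivals: request arrival times in seconds, served first come, first served.
    """
    free_at = [0.0] * slots  # min-heap: time at which each slot becomes free
    heapq.heapify(free_at)
    timed_out = 0
    for arrival in sorted(arrivals):
        start = max(arrival, free_at[0])  # wait for the earliest free slot
        if start - arrival > timeout:
            timed_out += 1  # gave up; no slot is consumed
            continue
        heapq.heapreplace(free_at, start + hold_time)
    return timed_out


burst = [0.0] * 20  # 20 authentication requests arriving at once
print(simulate(burst, slots=2, hold_time=1.0, timeout=3.0))   # 12
print(simulate(burst, slots=10, hold_time=1.0, timeout=3.0))  # 0
```

With two slots, only the first eight requests get served within the timeout and the other twelve fail. With ten slots, the same burst clears with no failures. Slow pass-through to a remote DC has the same effect as a longer hold time.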

Reference:
https://support.microsoft.com/en-us/help/975363/you-are-intermittently-prompted-for-credentials-or-experience-time-out

 

Solution #5:

Switch SharePoint (and other applications) to use Kerberos authentication.
This cuts down significantly on Netlogon service traffic, in most cases relieving the bottleneck.
However, keep in mind that Kerberos authentication can still be impacted by MaxConcurrentApi if there is a significant amount of it requiring PAC verification, or if NTLM authentication for other applications is saturating available threads.

Reference: https://support.microsoft.com/en-us/help/2688798/how-to-do-performance-tuning-for-ntlm-authentication-by-using-the-maxc

Another option is cutting down authentication traffic by making more resources available anonymously.
For example, within an out-of-box SharePoint site, all supporting files (CSS, JS, images, etc.) are stored on the file system and are available anonymously (most are in the _layouts folder). However, some customizations and branding may store supporting files within a document library where an authentication request must occur for each file request. The result can be a dozen or more NTLM authentication requests for each page load. Moving those supporting files to their own folder in _layouts, or otherwise making them anonymously accessible, will drastically reduce total authentication traffic when browsing the site.

 

Other troubleshooting tips:

As we saw in the above sections, IIS logs, the Security Event Log, and Network traces can assist in diagnosing these problems. In this section, I’d like to walk you through using Fiddler to view the authentication traffic.  The purpose is to show what a successful NTLM authentication looks like.

NTLM authentication is done in a three-step process known as the "NTLM Handshake".

 

The first request is always made anonymously. This is true of Kerberos as well.
The site requires authentication, so the WFE responds with a 401 – Unauthorized and a “WWW-Authenticate: NTLM” header.  That header is how the server tells the client which authentication methods to try.

 

The client makes a second request for the same page. This time it includes half of the NTLM token. The server issues a challenge.

 

The client makes a third request with the whole NTLM token, is successfully authenticated, and receives a 200 OK for home.aspx.

 

Note:
The NTLM Handshake is not really a half-token / full-token situation, but for the purposes of simplifying the NTLM Handshake process, I find that explanation works well enough.
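If you want to confirm which step of the handshake a given request or response belongs to, you can decode the token from the Authorization or WWW-Authenticate header that Fiddler shows. An NTLM token is base64 text that decodes to the signature `NTLMSSP` followed by a null byte and a 4-byte little-endian message type (1, 2, or 3). The sketch below is a small helper of my own, not a Fiddler feature, and it does not handle NTLM wrapped inside a `Negotiate` header.

```python
import base64
import struct

NTLM_TYPES = {1: "NEGOTIATE", 2: "CHALLENGE", 3: "AUTHENTICATE"}


def ntlm_message_type(header_value: str):
    """Return the NTLM message name for a header value, or None.

    Accepts values such as 'NTLM TlRMTVNTUAAB...'. A bare 'NTLM' (the
    first 401 response) carries no token and returns None.
    """
    scheme, _, token = header_value.strip().partition(" ")
    if scheme.upper() != "NTLM" or not token:
        return None
    raw = base64.b64decode(token)
    if len(raw) < 12 or raw[:8] != b"NTLMSSP\x00":
        return None
    (message_type,) = struct.unpack("<I", raw[8:12])
    return NTLM_TYPES.get(message_type)


print(ntlm_message_type("NTLM TlRMTVNTUAABAAAA"))  # NEGOTIATE
```

In terms of the walkthrough above, the second request carries a NEGOTIATE message, the 401 that answers it carries the CHALLENGE, and the third request carries the AUTHENTICATE message. If the CHALLENGE and the AUTHENTICATE exchange involve different WFEs, you are looking at Issue #1.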

