SharePoint 2013 Slow Performance due to Distributed Cache
We were having a performance issue for our SharePoint environments. We were also seeing a lot of errors related to the distributed cache in the ULS log
Unexpected Exception in SPDistributedCachePointerWrapper::InitializeDataCacheFactory for usage 'DistributedLogonTokenCache' - Exception 'Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode:SubStatus:There is a temporary failure. Please retry later. (One or more specified cache servers are unavailable, which could be caused by busy network or servers. For on-premises cache clusters, also verify the following conditions. Ensure that security permission has been granted for this client account, and check that the AppFabric Caching Service is allowed through the firewall on all cache hosts. Also the MaxBufferSize on the server must be greater than or equal to the serialized object size sent from the client.) ---> System.ServiceModel.CommunicationException: The socket connection was aborted. This could be caused by an error processing your message or a receive timeout being exceeded by the remote host, or an underlying network resource issue. Local socket timeout was '10675199.02:48:05.4775807'. ---> System.IO.IOException: The read operation failed, see inner exception. ---> System.ServiceModel.CommunicationException: The socket connection was aborted. This could be caused by an error processing your message or a receive timeout being exceeded by the remote host, or an underlying network resource issue. Local socket timeout was '10675199.02:48:05.4775807'. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags) at System.ServiceModel.Channels.SocketConnection.ReadCore(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout, Boolean closing) --- End of inner exception stack trace --- at System.ServiceModel.Channels.SocketConnection.ReadCore(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout, Boolean closing) at System.ServiceModel.Channels.SocketConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout) at System.ServiceModel.Channels.ConnectionStream.Read(Byte[] buffer, Int32 offset, Int32 count) at System.Net.FixedSizeReader.ReadPacket(Byte[] buffer, Int32 offset, Int32 count) at System.Net.Security.NegotiateStream.StartFrameHeader(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.NegotiateStream.StartReading(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.NegotiateStream.ProcessRead(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest) --- End of inner exception stack trace --- at System.Net.Security.NegotiateStream.ProcessRead(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.NegotiateStream.Read(Byte[] buffer, Int32 offset, Int32 count) at System.ServiceModel.Channels.StreamConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout) --- End of inner exception stack trace --- Server stack trace: at System.ServiceModel.Channels.StreamConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout) at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.SendPreamble(IConnection connection, ArraySegment`1 preamble, TimeoutHelper& timeoutHelper) at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.DuplexConnectionPoolHelper.AcceptPooledConnection(IConnection connection, TimeoutHelper& timeoutHelper) at System.ServiceModel.Channels.ConnectionPoolHelper.EstablishConnection(TimeSpan timeout) at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.OnOpen(TimeSpan timeout) at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout) at Microsoft.ApplicationServer.Caching.CacheResolverChannel.Open(TimeSpan timeout) at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs) at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink) Exception rethrown at [0]: at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase) at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData) at Microsoft.ApplicationServer.Caching.CacheResolverChannel.OpenDelegate.EndInvoke(IAsyncResult result) at Microsoft.ApplicationServer.Caching.ChannelContainer.Opened(IAsyncResult ar) --- End of inner exception stack trace --- at Microsoft.ApplicationServer.Caching.DataCache.ThrowException(ResponseBody respBody, RequestBody reqBody) at Microsoft.ApplicationServer.Caching.DataCacheFactory.GetCacheProperties(RequestBody request, IClientChannel channel) at Microsoft.ApplicationServer.Caching.DataCacheFactory.GetCache(String cacheName) at Microsoft.SharePoint.DistributedCaching.SPDistributedCachePointerWrapper.InitializeDataCacheFactory()'
There was a patch for the AppFabric server (what distributed cache is built on) that we hasn't installed yet. And based on this blog (http://blogs.technet.com/b/speschka/archive/2014/02/06/update-to-guidance-for-patching-appfabric-on-sharepoint-2013-distributed-cache-servers.aspx), patching AppFabric separately is the recommended way from Microsoft.
After the AppFabric update is installed, it does help our DEV and Integration testing environments. But our UAT is still pretty slow ...
After working with Microsoft troubleshooting it, it turns out that our application pool account does not have permission to the separate distributed cache servers. After adding the permissions for those account, we're getting a lot better performance for our UAT environment.
Comments
"it turns out that our application pool account does not have permission to the separate distributed cache servers."
How is this account set up? Is it a basic domain account or does it have special permissions on the server?
How did you add these permissions to the distributed cache servers?
Thanks!