DevConnections 2013 Recap

DevConnections 2013 was an amazing conference, as it has been in all years past. The attendees are always hyper-engaged and come with tons of questions. As a presenter, nothing makes me happier than helping people understand the concepts I am covering better than they did when they walked into my sessions.  Thanks to Scot Hillier for being an amazing track chair and moderating the panel on Tuesday afternoon.

During DevConnections I did 3 sessions (2 of which were recorded) and a panel session (also recorded).  SharePoint Pro Magazine is selling access to all 60 hours of recorded content from the show.  You can find details about this at http://windowsitpro.com/itdev-connections

I have been hearing from folks that my slides are not accessible on the site, so if you are looking for my content please look no further:

PowerShell for the Anxious ITPro

SharePoint Performance – Best Practices from the Field

Business Intelligence in SharePoint 2013

Time for the second half of my crazy SharePoint road trip… See you at SharePoint Fest Chicago!

Debugging an Explorer View issue

During a recent SharePoint 2010 upgrade project, we applied SP1 and the December 2012 CU and encountered a frustrating issue. Everything in the farm was working as expected after we did the installation and ran PSCONFIG, except that some users could not use Explorer View to open some Document Libraries.

More detailed user testing revealed that the issue only occurred for users whose computers were running Windows Vista or later. Windows XP users could open libraries in Explorer View without a problem.

WebDAV vs FrontPage RPC

The OS disparity comes from the different protocols SharePoint can use to open a library in Windows Explorer. As a Microsoft white paper from 2006 on SharePoint Explorer View explains:

“The Explorer View prefers WebDAV over FPRPC. Because of the underlying design of the Explorer View and the default network provider order, it always tries to use SMB first, then WebDAV. Only when SMB and WebDAV have failed does it actually attempt to use FPRPC. This means that forcing the Explorer View to use WebDAV is more a case of creating an environment that makes sure WebDAV is successful instead of actually forcing the Explorer View to choose it.”

While the WebDAV protocol relies on the WebClient service, the FPRPC protocol does not; it relies on WebFolders instead. WebFolders are disabled by default beginning in Windows Vista but are available in Windows XP, which explains why Explorer View worked in XP and not in later versions of Windows.

Compounding the confusion

This client was leveraging nested managed paths. This caused a secondary issue because the managed path contained a “/”, which the WebDAV protocol was not able to parse as part of the address. The fix simply involved creating a wildcard Managed Path for the first part of the compound address. So if you have a managed path of “state/county”, you would just add a wildcard managed path for “state” and Explorer View via WebDAV will work. Here is what this would look like:

[Screenshot: managed path configuration]

Take Away

The lesson here is to avoid using compound Managed Paths in your SharePoint web applications. If you absolutely must use compound managed paths be sure to add a wildcard path for the first part of the compound path.
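As a rough sketch of the workaround, the wildcard path for the first part of a compound path can also be added from the SharePoint 2010 Management Shell with New-SPManagedPath. The web application URL is a made-up placeholder and “state” is just the example name used above, not the client's real values.

```powershell
# Assumption: run from an elevated session on a SharePoint 2010 server.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical web application URL; substitute your own.
$webApp = 'http://intranet.contoso.com'

# Without -Explicit, New-SPManagedPath creates a wildcard inclusion,
# which is what the first part of the compound path needs.
New-SPManagedPath -RelativeURL 'state' -WebApplication $webApp

# List the managed paths to confirm the new entry is there.
Get-SPManagedPath -WebApplication $webApp
```

The existing compound path stays in place; the wildcard entry is only added alongside it.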

Thank you’s

A big thank you to Todd Klindt, of Rackspace, for giving us some good leads & Evan Riser from my team for chasing this all the way down the rabbit hole to a resolution.

When in doubt, check ALL the permissions…

Having just completed my last speaking engagement of 2012 it was time to get back into the swing of things and start playing with troubleshooting a bit. 

The dilemma

In a continuing effort to evolve my PowerShell build script for SharePoint I spent a few hours with my team playing with different settings.  One of my team members was driving to get better hands-on experience with using PowerShell to configure SharePoint.

We started with the very standard PSConfig script that I have used hundreds of times in the past:
(I left out the variables to save some space)

[Screenshot: the farm configuration script]

The following error popped its ugly head up in PowerShell’s angriest color when attempting to run this initial farm configuration:

New-SPConfigurationDatabase : Requested registry access is not allowed.

The troubleshooting

Check permissions

Hackles went up immediately when the error was read out loud.  Prior to running the script we had just walked through several Security Best Practice checks, following Microsoft’s guidance in TechNet, partly to see if anything had changed recently (it hadn’t) and partly as a good refresher:

Account permissions and security settings (SharePoint Server 2010)

Plan administrative tasks in a least-privilege environment (SharePoint Foundation 2010)

Plan for administrative and service accounts (SharePoint Foundation 2010)

We went back and double-checked all of our settings and found that things were configured as prescribed.  The SharePoint install account had local administrator permissions on the SharePoint server and SecurityAdmin and DBCreator rights on the SQL server.

Examine the logs

We visited our Server Event Log and 14 Hive Logs folder but found no evidence that anything was in error.  In fact, no log entries were created at all…

Check the firewall rules

We validated that for this configuration, a sandbox without external connections to the world, the Windows Firewalls were turned off.

Check the connection between servers

Using the trusty Data Sources (ODBC) validation method we were able to make a connection from the SharePoint server to the SQL server and browse the available databases.

Get thyself to Google!

Completely perplexed at this point by an error that didn’t make any sense, given that the SharePoint install account was a local admin, we went to our good friend Google and found, well, to be honest, a bunch of crap that didn’t help us in any way.  Lots of stuff for people who had lost access to Central Admin due to GPO changes, or had a driver go corrupt, or were trying to write to the registry using C# in ASP.NET, and even a forum about people having problems registering their cars in Nebraska.

Review of Local Security Policies

One last-ditch effort, checking the local security policy to see if a new GPO had pushed down changes, turned out fruitless. However, one of the AD admins mentioned they had seen an issue similar to this once, before they changed the User Account Control (UAC) settings.

The Root Cause

Not even thinking about it, my response to the UAC question was “There is no need to do that, you just right-click and launch as Administrator or use my PowerShell script to run as a different user.”

Upon examination of my team member’s screen it was revealed that:

[Screenshot: the team member’s PowerShell ISE window]

PowerShell ISE had in fact been opened without being run as Administrator.  A costly lesson from a time perspective, but a good learning experience for a newbie at PowerShell for SharePoint.

Most troubling of all, however, was that upon reexamination of the PowerShell output we only needed to look 2 lines above the big red error message we were troubleshooting to find the TRUE error in plain black text (highlighted here in yellow):

[Screenshot: PowerShell output with the true error highlighted in yellow]

Unassuming and unnoticed as we troubleshot the obvious error, the line was thrown by the PSConfig.exe and not a bad PowerShell parameter which explains why PowerShell did not recognize it as an error.

The moral of the story…

Even after following every documented Best Practice out there, we still were able to find a way to cause an error.  While the UI was bad for the error that would have been useful to us, it was at least thrown in our faces.

The easy answer is to always make sure that you open PowerShell or PowerShell ISE as Administrator.  My personal preference is always going to be to log in to servers using a non-SharePoint privileged account and then elevate permissions to run in the context of a SharePoint Farm admin or service account, as demonstrated in my previous post, which sets the run as Administrator for you.

Be sure when you are ready to do any SharePoint Admin work that you see the “Administrator:” in front of your PowerShell ISE path, like this:

[Screenshot: PowerShell ISE title bar beginning with “Administrator:”]
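If you would rather have the script check for you than rely on spotting the title bar, a guard like the following can go at the top of a build script. This is a minimal sketch that is not part of the original script; the function name Test-IsElevated is made up for illustration, and it only works on Windows.

```powershell
# Hypothetical helper: returns $true when the current session holds an
# elevated (Run as administrator) token, $false otherwise.
function Test-IsElevated {
    $identity  = [Security.Principal.WindowsIdentity]::GetCurrent()
    $principal = New-Object Security.Principal.WindowsPrincipal($identity)
    return $principal.IsInRole([Security.Principal.WindowsBuiltInRole]::Administrator)
}

if (-not (Test-IsElevated)) {
    # Warn loudly before any SharePoint cmdlet gets a chance to fail.
    Write-Warning 'This session is not elevated. Reopen PowerShell or PowerShell ISE with "Run as administrator".'
}
```

In a real build script you would probably replace the warning with a throw, so that nothing else runs in a non-elevated session.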

At the end of a fun troubleshooting session we walked away with a new notch in our troubleshooter tool belt, a fun article to write, and a team member who will never forget to fire the RunAs flag ever again.

SharePoint 2010 & Site Directory revisited: bug fix request rejected by Microsoft

We received the official, and well thought out, answer back from Microsoft regarding the Site Directory bug that I reported on with my post “SharePoint 2010 and the Site Directory” back in December 2010.
 
Here is the official answer from Microsoft:

Issue Summary
Site collection creation fails with access denied error when the master site directory site collection is located on a web application which is using the new claims aware authentication method.

Cause for Rejection and Technical Explanation
The Microsoft Office team has reevaluated this bug and unfortunately our initial decision still holds.  We realize that this causes a lot of inconvenience but the code change required is extremely large and introducing a change can leave behind a huge and unexpected bug trail.

The site directory feature has been deprecated in SharePoint 2010.

Site Directory provided site collection admins a central location where they can pin bunch of URLs with categories. Users could then browse through categories, view and access all URLs/sites associated with the site collection. In SharePoint 2010, social tagging provided a much richer way to categorize URLs, and we provided tag cloud web part for navigation. To avoid having two similar solutions, site directory was deprecated.

Please know we carefully review all Hotfix request because each code change that we implement must maintain or improve the quality and stability of the product.  We strive for this to ensure the continuing integrity of the code base and to maintain a supportable product. While we recognize the impact that this issue is having on you, we cannot compromise the stability of the product’s code base using the Hotfix process.

Alternative solution

1. Ensure that the master site directory site collection is located on a web application which is using classic windows authentication.

2. Disable master site directory setting and explore the capabilities of the new social tagging feature to categorize sites. Learn more about this new feature at ”Social tagging overview (SharePoint Server 2010)” http://technet.microsoft.com/en-us/library/ff608137.aspx

SharePoint 2010’s Visio Graphics Services: EventIDs 8061 & 8046 unmasked

Getting EventIDs 8061 & 8046 in your ULS Logs and Event Logs on your application server?  Having trouble figuring out exactly what they are trying to tell you?  Finding inconsistent results between site collections?  Let’s dive in…

Here are the offending errors:
[Screenshots: the Event ID 8061 and Event ID 8046 error entries]

TechNet tells us (quoting directly from the article):

Symptoms:   One or more of the following symptoms might appear:

  • A file or files might not load.
  • This event appears in the event log: Event ID: 8061 Description: File not found at this location: <file location>.
  • This event appears in the event log: Event ID: 8051 Description: Unable to parse file at location: <file location>.

Cause:   One or more of the following might be the cause:

  • A user might try to load a page that contains a Web Part that references a file that no longer exists or is invalid.
  • A user might try to view a Visio diagram that is corrupted.
  • A user might try to view an invalid Visio diagram.

Unfortunately this really doesn’t help you identify what the problem is or how to resolve it.

Here is what I discovered:

The common thread is the app pool.  Validate that the app pool that is being reported in the error is the same app pool that is hosting the Visio Graphics Services service. 

[Screenshot: the app pool reported in the error]

Next, using the PowerShell script provided in my previous blog article, How to: Get your Managed Account passwords when they are changed automatically by SharePoint 2010, get the password for the account running your Visio Graphics Services. Then head over to Manage Service Applications and manage the Secure Store Service.  Find your Visio SSS entry and reset the credentials.

[Screenshots: resetting the Visio credentials in the Secure Store Service]

Once you have set the credentials you need to recycle the Application Pool on the application server.  In a standard OOB install look in SharePoint Web Services for the site that contains VisioGraphicsService.svc and view the basic settings to determine which app pool is being used by the site and recycle it.

After this is complete you should see all of your Visio Graphics Services rendering correctly.
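The app pool lookup and recycle can also be done from PowerShell on the application server. The following is only a sketch under assumptions: it expects the IIS WebAdministration module to be available, the IIS site to be named “SharePoint Web Services” as in a standard install, and VisioGraphicsService.svc to sit directly in the application's physical folder. Check what it finds before trusting it.

```powershell
# Assumption: elevated session on the application server with IIS 7+.
Import-Module WebAdministration

# Find the application(s) under "SharePoint Web Services" whose folder
# contains VisioGraphicsService.svc.
$visioApps = Get-WebApplication -Site 'SharePoint Web Services' | Where-Object {
    $folder = [Environment]::ExpandEnvironmentVariables($_.PhysicalPath)
    Test-Path (Join-Path $folder 'VisioGraphicsService.svc')
}

# Recycle the app pool behind each match.
foreach ($app in $visioApps) {
    Write-Host "Recycling app pool $($app.applicationPool)"
    Restart-WebAppPool -Name $app.applicationPool
}
```

If nothing is found, fall back to the manual route through the site's basic settings in IIS Manager.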

Why the inconsistent behavior mentioned at the beginning of this article?  If you create a site after the SSS credential is invalid and you have a site that is still holding a valid SSS token then you will see one site (the new) be broken and one site (the old) working perfectly.  Just one of the fun anomalies we get to experience in the field.  

Once again, another issue finds its root cause in credentials management.  While SharePoint 2010 has brought us leaps and bounds further than any product previously released, we still must remain vigilant in our credentials management as Admins and Architects.  IMHO, planning for proper credentials management is almost as critical as DR planning, and often far more complex.

How to: Fix the "Unable to access SharePoint sites from the localhost" problem

Ever try to access a page on your SharePoint site from your web front end only to get prompted for a login that never lets you through?

The issue happens when dealing with sites that use Integrated Authentication and have names that are mapped to the loopback address.  Translation:  if you are using Windows with Claims or Classic Mode web applications and you are trying to connect from the server, this is you.

The LoopbackCheck security feature is enabled by default on Windows Server since 2003 SP1 and since most SharePoint Farms are going to have an FQDN AAM or two, this is going to be something that many admins are going to run into.

There are two options, and as in most scenarios one is easy and the other is the right way.

Option 1. – create a Multi-String Value that has all of your AAMs for the server and restart the IISADMIN service.

Option 2. – disable the LoopbackCheck on the server

The Microsoft recommended option is #1 (I happen to agree). The catch is that you have to do this on every server, and you need to have the list of all of your AAMs handy with which to do it. If you have access to create and modify GPOs, this should be something that you can just have centrally managed for all SharePoint WFEs.  Not a ton of work, so bite the bullet and update the registry entry.

Serious caveat:  Option 2 is great if you are working on a developer vm or something to play with for a short burst, but if you are going to put something in production, please protect yourself and allow Microsoft to do the same.  This is one of those security scenarios where they are putting a validation check in place to protect you from malicious attacks.

For the more detailed steps on making the changes visit the KB article and get it from the horse’s mouth.
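For reference, here is roughly what Option 1 looks like in PowerShell. The registry value name (BackConnectionHostNames) and key are the ones documented in the Microsoft KB article; the host names are made-up placeholders, so substitute your own AAM host names. Treat this as a sketch to adapt, not a drop-in script.

```powershell
# Assumption: elevated session on the web front end.
# Hypothetical host names; list every AAM host name served by this server.
$hostNames = @('portal.contoso.com', 'mysites.contoso.com')

$msvKey = 'HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0'

# Option 1: BackConnectionHostNames is a Multi-String (REG_MULTI_SZ) value.
# -Force overwrites the value if it already exists.
New-ItemProperty -Path $msvKey -Name 'BackConnectionHostNames' `
    -PropertyType MultiString -Value $hostNames -Force | Out-Null

# Restart IISADMIN (and its dependent services) so the change is picked up.
Restart-Service -Name IISADMIN -Force

# Option 2 (developer VMs only): disable the check entirely.
# New-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control\Lsa' `
#     -Name 'DisableLoopbackCheck' -PropertyType DWord -Value 1 -Force
```

Because -Force replaces the whole value, include any host names that were already listed when you rerun it.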

SharePoint 2010 & SQL 2008 R2 build numbers and helpful patching links

In an effort to make life simpler I have compiled a short list of useful links to build numbers and patching sites for SharePoint 2010 and SQL 2008 R2.

The Microsoft SharePoint Updates Pages
SharePoint 2010 – http://technet.microsoft.com/en-us/sharepoint/ff800847.aspx
SharePoint 2007 – http://technet.microsoft.com/en-us/office/sharepointserver/bb735839.aspx
These are your official sites for downloading the SharePoint updates from Microsoft.  Far more useful than waiting on the blogger community to send you a link, the TechNet bulletin or your carrier pigeon to arrive, or any of the other 120+ ways to get your SharePoint update information!

Cornelius J. van Dyk’s Blog on Versions for SharePoint
http://www.cjvandyk.com/blog/Lists/Versions/
Highly useful link if you find yourself a CU or two behind and you want to know what version of the CU is currently applied to your farm.  Simply visit your Central Admin Manage Patch Status page http://localhost:####/_admin/PatchStatus.aspx to check what version number your farm is on.

SQL 2008 R2 build numbers
http://support.microsoft.com/kb/981356/
This has proven to be the most reliable site I have found for listing the latest build numbers for SQL Server since 2008 R2 dropped.  To find out what build you are running do the following:

  1. Open SQL Management Studio
  2. New Query
  3. Type and then execute the following command:
    select @@version


We are working on getting a SQL Server versions page up on CvD’s blog as well, but for now here is a table with the info:

Release              Build number
April 2010 RTM       10.50.1600.1
May 2010 CU1         10.50.1702.0
June 2010 CU2        10.50.1720.0
August 2010 CU3      10.50.1734.0
October 2010 CU4     10.50.1746.0
December 2010 CU5    10.50.1753.0
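If you check builds often, the list above can be turned into a small lookup. This is an illustrative sketch only: the function name Get-Sql2008R2Release is made up, the table holds just the builds listed here, and the sample version string in the usage line is an approximation of what select @@version returns.

```powershell
# Build numbers copied from the list above (SQL Server 2008 R2 only).
$sql2008R2Builds = @{
    '10.50.1600.1' = 'April 2010 RTM'
    '10.50.1702.0' = 'May 2010 CU1'
    '10.50.1720.0' = 'June 2010 CU2'
    '10.50.1734.0' = 'August 2010 CU3'
    '10.50.1746.0' = 'October 2010 CU4'
    '10.50.1753.0' = 'December 2010 CU5'
}

# Hypothetical helper: pulls the 10.50.x.y build out of a version string
# and maps it to the release name, if it is one of the known builds.
function Get-Sql2008R2Release {
    param([Parameter(Mandatory = $true)][string]$VersionText)

    if ($VersionText -match '10\.50\.\d+\.\d+') {
        $build = $Matches[0]
        if ($sql2008R2Builds.ContainsKey($build)) {
            return $sql2008R2Builds[$build]
        }
        return "Unknown build $build"
    }
    return 'Not a SQL Server 2008 R2 build'
}
```

Usage: paste the output of select @@version into the call, for example Get-Sql2008R2Release 'Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64)'.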

SharePoint 2010 Farm Service Account passwords expired?!?!?!?

Scenario:

Managed service account passwords expired.  Parts of Central Administration are no longer accessible.  Sites are starting to go down because the app pool accounts are managed accounts and their passwords have expired.

Soapbox moment:

Firstly, there is NO real excuse for this in SharePoint 2010 because the ability to have this done automagically for you is BUILT-IN, so your farm admin is either overtaxed (usually the case) or incompetent (the two aren’t mutually exclusive). 
I take the liberty to say all of this as someone who has had this happen to them, otherwise I wouldn’t be able to write about it, right?

Resolution:

To start, you aren’t going to be able to do anything with SharePoint until you can get the Timer Job Service running again because everything is driven by timer jobs. 
Using a credential that has full admin rights to the box and is a Farm admin, change the account the Timer Job Service runs as and start the service.  This must be done on all servers in the farm. 
Don’t fret, the next thing you are going to do is fix this back to the way it should be, but you can’t do the next steps without the timer job service running, so just play along.

Go to http://www.yourserver.com/_admin/ManagedAccounts.aspx and edit your farm service account and tell it to change the password now.

Go to http://www.yourserver.com/_admin/FarmCredentialManagement.aspx and select Farm Account.  You will see your registered service account (the one that you just changed the password for); click OK.  This will reset your Timer Job Service account to the registered account, which is now active and working.
Next, create a text file on your server that lists, one account per line, the service accounts that you are going to have auto-updated.

Then, using the SharePoint 2010 Management Shell, change the passwords and turn on automatic password changes by running this script (replace the placeholder values with your own):

# Read one managed account name per line from the text file
foreach ($account in Get-Content driveletter:\filename.txt)
{
    # Generate a new password now and schedule automatic changes
    Set-SPManagedAccount -Identity $account -AutoGeneratePassword -PreExpireDays n#days -Schedule "monthly between n#dayofthemonthvalue hh:mm:ss and n#dayofthemonthvalue hh:mm:ss" -Confirm:$false
}

The script that I actually used, without the variables looks like this:

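# Read one managed account name per line from the text file
foreach ($account in Get-Content c:\managedaccounts.txt)
{
    # New password now, then monthly on the 7th between 2:00 and 3:00 AM
    Set-SPManagedAccount -Identity $account -AutoGeneratePassword -PreExpireDays 30 -Schedule "monthly between 7 02:00:00 and 7 03:00:00" -Confirm:$false
}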

Once this command has completed successfully you will see that your last password change just happened and that your next password change is scheduled.  Make sure that your next scheduled password change doesn’t conflict with a minimum password age group policy, which won’t allow passwords to be changed before a minimum number of days have passed, or you will end up with some errors in your ULS Logs and some misfired password change attempts.
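To confirm the result without clicking through Central Administration, something like the following lists each managed account with its change settings. It is a sketch that assumes the SharePoint snap-in is available on the server; the property names are those exposed on the managed account objects as I know them, so verify the output against the Managed Accounts page.

```powershell
# Assumption: elevated session on a SharePoint 2010 server.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Show when each managed account last changed, when it expires,
# and whether an automatic change schedule is in place.
Get-SPManagedAccount |
    Select-Object Username, AutomaticChange, PasswordLastChanged, PasswordExpiration, ChangeSchedule |
    Format-Table -AutoSize
```

Any account that still shows AutomaticChange as False was missed by the text file and can be rerun on its own.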

Lastly, go to http://www.yourserver.com/_admin/FarmCredentialManagement.aspx and walk through all of the farm credentials and let the accounts get synced up.  This should re-spin up the app pools and get your users back into the site, but if not, do an IISRESET and things should be back online.

Shout outs

Huge thanks to my partners in crime on this one, Derek Martin and Trent Foley of Slalom Consulting, for helping with the out of the gate perfect PowerShell scripts that are referenced above.

While I was busy figuring out how to break back in to Central Admin, they figured out the proper script to reset the passwords and set the auto change programmatically.  This script can be used in advance of this type of shenaniganal activity to ensure that while you are building your farm you get this set right the first time and not have to do it manually (which is often the excuse when you are using 30+ managed accounts in a farm).

Random Server hang issues result in a required hard reset

Symptom:

Windows 2008 R2 64bit systems hang at random and require a hard reboot of the system to recover. You can remote to the system via KVM (RDP is not accessible) and even do a CTRL+ALT+DEL, but after the lock screen goes away and tries to give you a login screen… YOU GET NOTHING. Only silence…

Root Cause:

We ended up with a three-headed root cause on this set of issues.

1.) Our blades had a bad BIOS version that caused the system to get into an inconsistent state and required a power cycle to get them clear.

2.) The hardware vendor had Data Execution Prevention (DEP) turned on at the hardware layer by default.

3.) By default Microsoft has its own version of DEP turned on for all programs and services unless you add in exceptions.

How did we diagnose this beast? Many team members (Dan, Don, Christian, Jim, and Cornè) all weighed in and found part of this along with support from our hardware vendor and Microsoft.

The issues plagued us for several weeks because it was not a predictable failure and there was NOTHING in the logs to correlate the issues together other than a single model of blade server.

A call with Microsoft and the hardware vendor suggested that the Microsoft DEP might be part of the issue as well. Luckily our support level was good enough to get both vendors on the same line and have them work together. Support calls like this are not cheap if you don’t have the agreements in place already.

Resolution

1.) Flash the BIOS with an updated and vendor-verified version.
2.) Turn off the hardware DEP.
3.) Set the Windows DEP to on “for essential Windows programs and services only”.
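Step 3 can be checked and set from an elevated PowerShell prompt as well as through the System Properties dialog. This is a sketch only: the hardware DEP setting in step 2 lives in the BIOS and is vendor-specific, so it is not covered here, and the Windows change needs a reboot to take effect.

```powershell
# Assumption: elevated session on the affected Windows Server 2008 R2 box.

# Current policy: 0 = AlwaysOff, 1 = AlwaysOn, 2 = OptIn, 3 = OptOut.
# OptIn is "essential Windows programs and services only".
(Get-WmiObject -Class Win32_OperatingSystem).DataExecutionPrevention_SupportPolicy

# Switch the current boot entry to OptIn. The quotes keep PowerShell
# from treating {current} as a script block.
bcdedit.exe /set '{current}' nx OptIn

# Confirm the nx setting on the boot entry, then reboot.
bcdedit.exe /enum '{current}'
```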

Since making these changes we have not seen a recurrence of the random system hang issues. I will update this post if things change… but so far, so good!

Windows with Claims User gets access denied to a site they had access to earlier in the day

Scenario:

Small Farm 3 tiered topology using Windows with Claims implementation aggregating AD with a custom LDAP database to create the claims roles.

Symptom:

Users of a SharePoint 2010 site get access denied to a site they could access earlier in the day.  As the day goes on, the number of users affected increases.  Eventually only users with full control policies can access the farm.

ULS Log error:

An exception occurred in Custom Roles claim provider when calling SPClaimProvider.FillResolve(): The underlying provider failed on Open..

Root Cause:

The 10 hour default session timeout for the user’s claim has been exceeded and the database housing the Role Data is no longer accessible.   In this case it was due to an expired SQL account password. Changing the password and updating the connection string or just unchecking the password expiry flag in the SQL account will resolve the issue.
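A quick way to spot SQL logins that are at risk is to ask SQL Server which ones have password expiration enforced. The sketch below assumes Invoke-Sqlcmd is available (it ships with the SQL Server PowerShell tools); the server name and the login name in the commented-out line are made-up placeholders.

```powershell
# Hypothetical instance name; substitute the server hosting the role database.
$server = 'SQLSERVER01'

# List enabled SQL logins with their expiration settings.
$query = @"
SELECT name,
       is_expiration_checked,
       LOGINPROPERTY(name, 'IsExpired')           AS IsExpired,
       LOGINPROPERTY(name, 'DaysUntilExpiration') AS DaysUntilExpiration
FROM sys.sql_logins
WHERE is_disabled = 0;
"@

Invoke-Sqlcmd -ServerInstance $server -Query $query | Format-Table -AutoSize

# To stop a specific login from expiring (hypothetical login name):
# Invoke-Sqlcmd -ServerInstance $server -Query "ALTER LOGIN [svc_claimsroles] WITH CHECK_EXPIRATION = OFF;"
```

Any login used in a connection string that shows is_expiration_checked as 1 is a candidate for the outage described here.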

Notes from the field:

There is one easy way to prevent this type of user-facing outage.  Don’t allow SQL accounts to expire.  EVER.  Expired SQL accounts are horrible to diagnose: the SQL Server itself is still operational, and the fact that AD authentication and the main farm access still work is going to throw you off the scent.

Read Scot Hillier’s blog on “Authorization Failures with Claims-Based Authentication in SharePoint 2010”.  Really useful stuff in there about how claims works and extending the timeouts.