Today is one of those days where I try to figure out a bunch of odd situations in various places. The oddest was one involving a Data View Web Part (DVWP) I wrote about a year ago. It’s been running just fine ever since – until about a month or so ago. I didn’t hear about the issues until last week, and they didn’t make much sense to me.
Most of the time, the DVWP was loading just fine. Occasionally, rather than the splendiferous expected output (it really is a cool DVWP), users would see the wonderful error:
Unable to display this Web Part. To troubleshoot the problem, open this Web page in a Microsoft SharePoint Foundation-compatible HTML editor such as Microsoft SharePoint Designer. If the problem persists, contact your Web server administrator.
along with a correlation ID. Of course, that error can mean just about anything, as we DVWP lovers all know. The error was sporadic and there was nothing obvious going on. No data or code had changed, so my theory was that there must be something going on with the server at the time of the error.
In looking at the logs, we saw that a stack overflow had occurred, but there wasn’t much else to go on. So why would the same code with the same underlying data sometimes cause a stack overflow and sometimes not? I looked through the code and didn’t see anything that I’d done which was dumb.
After going back and forth on this a bit, I remembered reading that there was a change in the August CU that set a limit on the amount of time that an XSL transform is allowed to take. As I understand it, the limit went from 5s in the June or July CU to 1s in the August CU. Ostensibly this is to protect against denial of service (DoS) attacks, but 1s is often not enough time when the server is under load. And even better, if the 1s is exceeded, a stack overflow is forced even though there is no actual error. This means that when the server is busy, DVWPs can be forced into a stack overflow needlessly. I’m piecing all of this together based on multiple blog posts and such, of course, so I’m not certain that it’s all correct.
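To make that theory concrete, here is a minimal Python sketch of a time budget check. It is not SharePoint code and makes no claim about how the limit is actually enforced. The names `run_with_budget`, `make_fake_clock`, `outcome`, and `TransformTimeout` are all hypothetical, and the fake clock simply stands in for a server that is fast or slow. The only thing it illustrates is how identical work on identical data can pass or fail depending on how long each step happens to take.

```python
import time


class TransformTimeout(Exception):
    """Raised when the simulated transform runs past its time budget."""


def run_with_budget(work_items, budget_s, process, clock=time.monotonic):
    """Process every item, aborting if the elapsed time exceeds budget_s.

    Hypothetical helper: it only illustrates a time budget check, not how
    SharePoint actually enforces its limit.
    """
    start = clock()
    results = []
    for item in work_items:
        results.append(process(item))
        if clock() - start > budget_s:
            # The work itself was fine; only the clock ran out.
            raise TransformTimeout(f"exceeded {budget_s}s budget")
    return results


def make_fake_clock(seconds_per_call):
    """Return a deterministic clock that advances a fixed amount per call."""
    state = {"now": 0.0}

    def clock():
        state["now"] += seconds_per_call
        return state["now"]

    return clock


def outcome(seconds_per_call, budget_s=1.0):
    """Run the same 10-item job and report 'ok' or 'timeout'."""
    try:
        run_with_budget(range(10), budget_s, lambda x: x * 2,
                        clock=make_fake_clock(seconds_per_call))
        return "ok"
    except TransformTimeout:
        return "timeout"
```

With the fake clock ticking slowly (a lightly loaded server), the job finishes inside a 1s budget. With the clock ticking faster per step (a busy server), the very same job times out, and it passes again if the budget is raised to 5s.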
With help from Glyn Clough (@GlynClough) on November 7, 2011, I was able to track down KB Article 2639184, which states the problem and gives some possible workarounds. I don’t find the workarounds to be all that palatable, though, especially the first one:
Simplify the custom XSL that was added to the DataForm web part.
If I could simplify the XSL, then wouldn’t I have done that when I wrote it to be efficient?
I’m hoping that there will be a longer-term fix for this, ideally a way to set the 1s parameter to something more reasonable. I’ve emailed the MVP mailing list to see if there’s any positive news from the Product Team, and I’ll update this post if I hear anything useful.
In the meantime, I’m not sure what we’ll do other than tell our users that they need to refresh any pages that have DVWPs on them until the error disappears. I’m sure that will make me popular.
In this thread, Dan Davis shows what he found when he used Reflector to look for what was causing the issue. Scroll down for Dan Davis’ post on Monday, September 19, 2011 at 8:47 PM.
There is reportedly a fix for this in the February 2012 CU. See Fix in the February 2012 CU for Stack Overflows with DVWPs Running Over One Second Limit Imposed by the August 2011 CU.