There can be many different reasons why a queue in Project Server might not be processing as fast as you would like. After listing a few of them,I will go into detail on one condition we have seen with a few different customers now, under some different scenarios. All of these are equally as applicable to Project Server 2007 as they are to Project Server 2010 queue processing.
Possible reasons for a slow queue:
- Poor database maintenance- see this post for Project Server 2010 or this post for the 2007 version.
- Large Shadow Tables- learn more.
- Incorrectly configured queue settings- too many threads for the hardware you have. The 2007 reference is still one of the best descriptions of the Project Server queue.
- Clock differences between servers can also be an issue.If the queue job can be processed ‘now’ but ‘now’ appears to be in the future to other servers then the job may go to sleep.
- Just busy- the queue is there to spread the work out, andsometimes this might take a while (long duration admin type plans publishing).
- Poor execution plans for the stored procedures that control the queue processing- the main topic of this article!
Now, getting in to more detail on point 6: we have seen a few scenarios where the queue is processing reallyslowly even though theSQL Server is notvery busy and there appears to be no good reason that the jobs are notgoing through faster.
One customer observed that the jobs alsoappeared to complete in groups, which related to the number of threads. For example, if you had 8 threads running your Project queue (and of course plenty of hardware to support this quite high setting) then you might see some jobs that you expected to run quickly all seem to be waiting for the longest running job and then they all finish at once.
What appeared to be happening is that certain stored procedures that govern the queue jobs (and ordinarily take just a few milliseconds) might slow down and take half a second, 10 seconds, or (in the extreme) 30 seconds to run. And it was these stored procedures that decided which jobs get processed next. In summary, the Project Server appeared to be taking more time deciding which queue job to run and not spending any time actually running anything! Initially a reboot was the step that got things moving but we wanted to be a bit smarter than the turn it off and turn it on again approach.
Let me talk a little more about the scenarios where we have seen this, some of the symptoms, the remedies, and then some of the quick fixes that cannot onlyget things moving again, butalso can confirm the problem. Ialso would like to introduce the topic of execution plans and how SQL Server decides how to get answers from the database.
An execution plan will detail how SQL Server goes about finding the data it needs, if specific indexes are used, or if a table read is going to be quicker. The plan will depend on the indexes available, the statistics available to understand the make-up of the indexes, and the accuracy of those statistics. Some of this comes back to point 1 in the list above – database maintenance – and some may depend on certain server settings or even patches and versions of SQL Server in use.
Most of the slow downs we have seen relate to activities that will create a large number of queue jobs in a very short time. Active Directory Synchronization, reporting database rebuild, mass deletion of projects, very heavy timesheet usage, or even automation are the first causes that come to mind. My theory is that the queue tables that were running along quite nicely while relatively empty suddenly find that they now have thousands of rows, and the execution plan that is cached is no longer optimal. SQL Server should quickly adjust for this and we have certainly seen that Service Pack 1 of SQL Server 2008 R2 appears less susceptible to this problem.
There is also a recommended setting for Project Server – SET AUTO_UPDATE_STATISTICS_ASYNC ON (documented here) that could avoid this problem, as it may potentially be triggered if the database is waiting on the statistics to update rather than just getting on with the work!Considering the problem is quite rare, and we usually want to get things working rather than examine the slow system, we have not had a good chance to confirm some of these theories.
One typical symptom, apart from things just running slow, is a particular event in the ULS log indicating a slow running stored procedure. These are verbose level messages from SharePoint Foundation Monitoring, so there may not be traces that are normally written to your ULS logs. However, they many occur for some of the queue stored procedures such as MSP_ProjQ_Lock_Next_Available_Group or MSP_TimesheetQ_Lock_Next_Available_Group. The event ID to look for is b4ly and it might look something like this:
08/09/2012 16:10:25.10 Microsoft.Office.Project.Server (0x36D0) 0x2E58 SharePoint Foundation Monitoring b4ly Verbose Leaving Monitored Scope (FillTypedDataSet — MSP_ProjQ_Lock_Next_Available_Group). Execution Time=21102.179988219601 <correlation ID>
The Event ID and the stored procedure names are in boldface. The execution time is in milliseconds so this is 21 seconds very slow. You may also be able to track an increase in this time across your logs.
As mentioned above, we usually found that a reboot got things moving again, but once we understood the likely causes of the problem, we could be a little more focused and fix things with less inconvenience to the users.
How to Fix
- Always worth getting your database maintenance working well- for this and many other reasons.
- Set the recommended value for SET AUTO_UPDATE_STATISTICS_ASYNC
- If Project Server is the only application running on SQL Server then the sledgehammer approach would be to run the following command within SQL Server management studio: DBCC FREEPROCCACHE. This will empty the stored execution plans from the procedure cache, and then ALL stored procedures will need to recompile. The new execution plan should get things working faster.
- If other applications are running you probably dont want to run that DBCC command so instead, you could just recompile a couple of the likely stored procedures:
EXEC Sp_recompile MSP_ProjQ_Lock_Next_Available_Group
EXEC Sp_recompile MSP_ProjQ_Get_Status_Of_Jobs_List
EXEC Sp_recompile MSP_TimesheetQ_Get_Status_Of_Jobs_List
EXEC Sp_recompile MSP_TimesheetQ_Lock_Next_Available_Group
I hope this helps if you do run into this kind of behavior, and hopefully by keeping SQL Server patched and using our recommended settings this isn’t something that you should see happen often.