Tuesday, September 20, 2016

I've Got The SP2 Replication Blues - SP2 For SQL 2014 Changes A System SPROC

I ran into this replication error the other day and I was completely stumped.
Procedure or function sp_MSreplraiserror has too many arguments specified.
 
We started getting that error message shortly after we had applied SP2 for SQL Server 2014 to a server that is a replication Publisher (source of replicated data).

We dug into the commands that were being replicated and found that there were missing rows in the table on the replication Subscriber (destination for replicated data).  Once the rows were populated the errors stopped.  However, after digging in a bit more, we found that this error has an explainable source.

In SP2 for SQL Server 2014, the system SPROC sp_MSreplraiserror has been updated to support an additional parameter for additional error logging detail.  This is good!

When we updated the Publisher with SP2 for SQL Server 2014, the replication SPROCs that actually perform the work (sp_MSdel_*, sp_MSins_*, and sp_MSupd_*) were updated with the additional parameter value to specify when running the sp_MSreplraiserror SPROC.  Unfortunately, the Subscriber (which is where the SPROCs are executed) has not yet had SP2 applied and thus the error is thrown because the system SPROC sp_MSreplraiserror has not yet been updated to support an additional parameter.

To mitigate the issue, I edited the sp_MSdel_* and sp_MSupd_* SPROCs (the sp_MSins_* SPROCs do not execute sp_MSreplraiserror when an error is encountered) and commented out the very last parameter.  Once we apply SP2 on the Subscriber, I will go back in and uncomment the last parameter to gain the additional error logging detail.

Friday, May 6, 2016

Trace, Trace, Trace - Too Many And SQL Can't Keep Up The Pace!

I recently ran into an issue on a client development (thankfully not production) server where the server was performing very poorly.  We didn't see anything obvious in any of the monitoring detail we had, aside from the fact that our monitoring tools also had issues with how SQL was performing (lots of missing data).

Before the client took the last ditch effort of just restarting SQL Server, I checked traces.  There were 9 user traces collecting a ton of trace events.

I manually killed them all and suddenly performance returned to normal.  Phew!  Crisis averted.

As a follow up from that issue, I created a script that stops and then deletes all user traces.  We are discussing setting up a job to run this script periodically to keep traces at bay and I am educating the group on proper trace discipline (and Extended Events).

-- Turn off (## rows returned) messages
SET NOCOUNT ON;

-- Set the stage by declaring working variables
DECLARE @SQLcmd VARCHAR(1024),
@TraceID SMALLINT

-- Create a temp table to hold the results of the trace detail
CREATE TABLE #TraceDetail
(TraceID SMALLINT,
Property SMALLINT,
Value SQL_VARIANT
)

-- Populate the temp table with only user trace data (TraceID 1 is the default system trace)
-- In our environment, TraceID #2 is our monitoring software, so we actually have WHERE TraceID > 2
INSERT INTO #TraceDetail (TraceID, Property, Value)
SELECT * FROM ::fn_trace_getinfo(NULL) WHERE TraceID > 1

-- Check to see if there is anything in the table, and if so, start stopping and killing the traces
WHILE (SELECT COUNT(*) FROM #TraceDetail) > 1
BEGIN
-- Just grab the first ID in the table
SELECT TOP 1 @TraceID = TraceID FROM #TraceDetail
-- Build the command to stop (status = 0) and then delete (status = 2) the trace
SELECT @SQLcmd = 'EXEC sp_trace_setstatus @traceid = ' + CONVERT(VARCHAR(3), @TraceID) + ', @status = 0;
EXEC sp_trace_setstatus @traceid = 'CONVERT(VARCHAR(3), @TraceID) + ', @status = 2;'
-- Execute the command
EXEC (@SQLcmd)
-- Report what we killed
PRINT 'Killed TraceID ' + CONVERT(VARCHAR(3), @TraceID)
-- Delete all entries for that TraceID (typically 5 rows each)
DELETE FROM #TraceDetail WHERE TraceID = @TraceID
END

-- Drop the temp table
DROP TABLE #TraceDetail