I immediately thought, "Why on earth is the Distribution database bigger than this truck?"
I started my investigation by running this query to see how many Transactions were in the Distribution database (there were several million):
SELECT PD.Publisher_DB, COUNT(RT.xact_id) AS #TransactionsInDistributionDB
FROM distribution.dbo.MSrepl_transactions RT
JOIN distribution.dbo.MSpublisher_databases PD ON PD.id = RT.publisher_database_id
GROUP BY PD.Publisher_DB;
However, Replication Monitor reports how many Commands are in queue (not Transactions), so I ran the query below and found almost 500 million commands in queue in total. NOTE: Use this query with caution, as it takes a long time to run when many commands are queued (almost 45 minutes in my case):
SELECT PD.Publisher_DB, COUNT(RC.command_id) AS #CommandsInDistributionDB
FROM distribution.dbo.MSrepl_commands RC
JOIN distribution.dbo.MSpublisher_databases PD ON PD.id = RC.publisher_database_id
GROUP BY PD.Publisher_DB;
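If a 45-minute COUNT is not practical, a rough total can usually be read from partition metadata instead. The following is only a sketch, and it is not the query I ran at the time: it returns approximate row counts for each table as a whole (not broken down by Publisher database), and it assumes the Distribution database uses the default name "distribution".

```sql
-- Approximate row counts from metadata; avoids scanning the tables.
-- Assumes the Distribution database is named "distribution".
USE distribution;

SELECT OBJECT_NAME(ps.object_id) AS TableName,
       SUM(ps.row_count)         AS ApproxRowCount
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id IN (OBJECT_ID(N'dbo.MSrepl_commands'),
                       OBJECT_ID(N'dbo.MSrepl_transactions'))
  AND ps.index_id IN (0, 1)   -- heap or clustered index only, so rows are not double counted
GROUP BY ps.object_id;
```

The counts are approximate, but they are normally close enough to tell a few million rows from several hundred million.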
We also observed a lot of disk I/O on the Distributor, and the "Distribution clean up: distribution" job was taking far longer than normal (1-4 hours instead of 2-5 minutes).
After some Googling I found a great article by Paul Ibison (http://www.replicationanswers.com/TransactionalOptimisation.asp) which explains that the immediate_sync option causes the Distributor to keep all Transactions for a Publication until the retention period is reached, regardless of whether they have already been delivered.
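Since everything hinges on the retention period, it helps to know what yours is set to. As a sketch (not something from Paul's article), the retention settings can be read at the Distributor from msdb; the values are in hours.

```sql
-- Run at the Distributor. Retention values are expressed in hours.
SELECT name              AS DistributionDB,
       min_distretention AS MinRetentionHours,
       max_distretention AS MaxRetentionHours
FROM msdb.dbo.MSdistributiondbs;
```

With immediate_sync enabled, expect commands to stay in the Distribution database for up to the maximum retention value, even after every Subscriber has received them.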
I ran this query at the Distributor to see what Publications had the immediate_sync option enabled:
SELECT SS.Name AS PublisherName, Pubs.Publisher_DB, Pubs.Publication, Pubs.Immediate_Sync
FROM distribution.dbo.MSpublications AS Pubs
JOIN master.sys.servers AS SS ON SS.server_id = Pubs.publisher_ID
WHERE Pubs.Immediate_Sync = 1
ORDER BY PublisherName, Publisher_DB, Publication;
This revealed 2 Snapshot Publications that had the option enabled. Yes, you read that correctly: the immediate_sync option also causes the Distributor to queue all Transactions for Snapshot Publications, even though they are not needed for the Publication.
How Do I Fix It?
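To correct the issue, run the following two scripts in the Published database (update PublicationName as appropriate). Run them in this order, because immediate_sync cannot be set to FALSE while allow_anonymous is still TRUE: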
EXEC sp_changepublication
@publication = 'PublicationName',
@property = 'allow_anonymous',
@value = 'FALSE';
EXEC sp_changepublication
@publication = 'PublicationName',
@property = 'immediate_sync',
@value = 'FALSE';
The next execution of the "Distribution clean up: distribution" job will take much longer than usual, because it will clean up all the unnecessary transactions. In my case, there is now less than 10 GB of data in the Distribution database.
These scripts can be executed on the fly without impact as they do not interrupt or affect the Publication (aside from telling the Distributor that it no longer needs to queue commands that it doesn't really need).
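To confirm the change took effect, you can re-run the immediate_sync query at the Distributor, or check the Publication's properties directly. A minimal sketch, using the same placeholder PublicationName as above:

```sql
-- Run in the Published database; 'PublicationName' is a placeholder.
EXEC sp_helppublication @publication = N'PublicationName';
-- In the result set, the immediate_sync and allow_anonymous columns
-- should both show 0 once the two changes above have been applied.
```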
How Did It Happen?
This option is a result of checking a very innocent looking box in the New Publication Wizard:
[Screenshot of the checkbox in the New Publication Wizard. Caption: This setting is very BAD]
If you have scripted out the Publication creation, edit your scripts and look for this snippet:
EXEC sp_addpublication ... @immediate_sync = N'true'
This will need to be changed to:
EXEC sp_addpublication ... @immediate_sync = N'false'
This option can greatly impact the performance and storage of your Distributor. If you suspect this is an issue on a Distributor you manage, I recommend you run the scripts above to check for and correct it.
Thanks again Paul Ibison for your invaluable assistance.
Hey Matt, did you notice any performance issues on the source database when changing those two parameters?
No, there was no impact at the published database by changing these parameters. The Log Reader will still read through the entire transaction log looking for transactions and commands that need to be replicated.
The exception to this is if your Distribution database is on the same server as the published database. This will reduce the overall performance impact to the server where the Distribution database resides.
nice article