Create Alert Messages When SQL Server Transactional Replication Is Blocked
Recently we had an issue in our Production environment, on the Reporting Database Server the Transactional Replication procedure was being blocked due to execution of some heavy T-SQL queries and as a result business users were unable to view the latest data.
During our investigation, I had a looked at the Synchronisation status of the subscriber and I received the below error message:
The process is running and is waiting for a response from the server.
On further investigation I noticed that due to the execution of a resource-intensive T-SQL query, the database was blocking.
Service Centre Operation Manager (the monitoring utility we use to monitor the servers) was not capturing an alert message. Based upon this, I decided to write a T-SQL script which will execute every 30 minutes and check if the Transactional Replication is functioning properly and alert a particular user or a team.
During the investigation of the case, I have noticed that whenever the system shows the error message as :
The process is running and is waiting for a response from the server
it keeps on executing and at a later stage it throws a message as Query Timeout Expired which is stored in sysreplicationalerts table in the msdb database on the publisher server.
I queried this table to find the list of all the records ehere the Replication Process was halted using the below T-SQL:
select row_number() over(order by alert_id),[time],subscriber, alert_error_text from msdb..sysreplicationalerts where datepart(yyyy,[time]) = datepart(yyyy,getdate()) and datepart(mm,[time]) = datepart(mm,getdate()) and datepart(dd,[time]) = datepart(dd,getdate()) and alert_error_text like '%query timeout%'
I first created 2 temporary tables named #replicationalerts and #subscriber_alert_message which will be used for conditional processing.
The schema for #replicationalerts temporary table is as follows:
create table #replicationalerts ( replication_alert_id int, [time] datetime, subscriber varchar(100), alert_error_message varchar(1000) )
The schema for #subscriber_alert_message temporary table is as follows:
create table #subscriber_alert_message ( subscriber_name varchar(1000), [time] datetime, alert_message varchar(1000) )
The data in a replication procedure which has been halted is transferred into the #replicationalerts temporary table using the below T-SQL.
insert #replicationalerts select row_number() over(order by alert_id),[time],subscriber, alert_error_text from msdb..sysreplicationalerts where datepart(yyyy,[time]) = datepart(yyyy,getdate()) and datepart(mm,[time]) = datepart(mm,getdate()) and datepart(dd,[time]) = datepart(dd,getdate()) and alert_error_text like '%query timeout%'
Once this is done then I declared two local variables which will later be used in the conditional processing:
Declare @i int Declare @max_count int set @i = 1 select @max_count = max(replication_alert_id) from #replicationalerts
Whilst the value of the variable @i is less than or equal to the value of the variable @max_count, the fetching of records continues:
while(@i<=@max_count) begin select @subscribername=subscriber from #replicationalerts where replication_alert_id = @i select @patindexname=patindex('%\%',@subscribername) select @count = @patindexname-1 select @alerterrormessage = alert_error_message from #replicationalerts where replication_alert_id = @i select @alert_message_patindexname = patindex('%.%',@alerterrormessage) select @count_alert_message = @alert_message_patindexname -1 select @length_alert_message = len(alert_error_message) from #replicationalerts where replication_alert_id = @i select @final_count = @length_alert_message-@count_alert_message select @left_alert_error_message = right(@alerterrormessage,@final_count-1) from #replicationalerts select @time = [time] from #replicationalerts where replication_alert_id = @i insert #subscriber_alert_message select 'Subscriber Server Name', ,@time,@left_alert_error_message set @i = @i+1 end
I then created a temporary table named #alert_message to hold the text of the alert message to be displayed to the users.
create table #alert_message ( row_id int, subscriber_name varchar(1000), alert_message varchar(1000), [time] datetime )
The data in the temporary table was inserted using the #subscriber_alert_message, keeping in mind that only last 30 minutes data needs to be retrieved for which we have used the datediff function as shown below:
insert #alert_message select rank() over(order by subscriber_name,alert_message), subscriber_name, alert_message, [time] from #subscriber_alert_message where datediff(mi,[time],getdate())<=30
I declared two local variables named @j and @max_count as shown below:
Declare @j int Declare @max_count int set @j =1 select @max_count = max(row_id) from #alert_message while(@j<=@max_count) begin select @mail_subject = ‘Subscriber Server Name’+space(1)+'Replication Has Failed' from #alert_message where row_id = @j select @message = alert_message from #alert_message where row_id = @j select @mail_body = ' Hi Team,<br><br> The Transactional Replication between the OLTP and Reporting Server is not happening properly.<br><br> The reason for failure is:'+convert(varchar(100),@message)+'.'+'<br><br> The time when the failure occured is:'+convert(varchar(100),@time)+'.'+'<br><br> Regards,<br><br> DBA Support Team ' exec msdb..sp_send_dbmail @profile_name = 'DB Mail' ,@recipients ='semi comma seperated email address of the recipients' ,@subject = @mail_subject ,@body = @mail_body ,@body_format = 'HTML' ,@importance = 'High' set @j = @j+1 end drop table #alert_message drop table #replicationalerts drop table #subscriber_alert_message end
This is just one such approach which I used on my production environment to alert the users in case if the transactional replication gets halted. If there are any suggestions or approaches then please let me know.