SQL Server Replication
If your current SQL Server is overwhelmed with work and performance is suffering, one option is to move part of its workload onto one or more additional SQL Servers using replication.
Replication allows you to “scale out” your SQL Server by distributing the processing load over multiple servers. For example, you might want to consider moving your reporting function from the main (transaction server) to another server (reporting server). This task is relatively easily accomplished by implementing transactional replication.
Replication offers many opportunities to scale out your current SQL Server, helping to boost the overall performance of your SQL Server-based applications. [6.5, 7.0, 2000, 2005] Updated 8-5-2005
If you plan is to use replication as part of your SQL Server-based application, and your goal is high scalability and performance, then this factor should be considered when the database is first designed. The fact that you will be using replication, and the type of replication you plan to use, can directly affect how your databases should be designed.
In addition, using replication requires a different hardware strategy, which means you also have to plan for this. [6.5, 7.0, 2000, 2005] Updated 8-5-2005
For optimum replication performance, consider these hardware suggestions for servers involved in replication:
- Use multiple CPUs, including Dual- and Quad-Core CPUs.
- Add additional RAM (besides what you need for using SQL Server without using replication).
- If you need maximum performance for your transactional or merge replication, you should place the log of every database you want replicated on it own separate disk array. The transaction log is used heavily during transactional and merge replication, and the more you can isolate disk I/O for these files, the higher the performance.
- Connect all SQL Servers involved in replication using high-speed network connections and switches.
- Ideally, the distribution component of replication should be on its own dedicated server, not located on the server running the publishing or subscribing components of replication.
[6.5, 7.0, 2000, 2005] Updated 8-5-2005
Don’t publish more data than you need, whether you are talking about tables, rows, or columns in a table. You can use vertical and/or horizontal filtering to prevent specific rows or columns of a table from being published.
If your tables include columns with IMAGE, NTEXT, or TEXT columns, you should seriously consider filtering out these columns (vertical filtering) because these data types can greatly slow down replication [6.5, 7.0, 2000, 2005] Updated 8-3-2003
If close to real-time replication is not required, then don’t use continuous replication. Instead, schedule replication to occur at regular intervals, such as once an hour, or every four hours. This reduces unnecessary server overhead. [6.5, 7.0, 2000, 2005] Updated 9-17-2004
When creating the distribution database and its log, don’t take the defaults for the location of the database. Instead, manually specify that the log and database be placed on appropriate disk arrays.
For best performance, consider putting the distribution database on a RAID 1 or RAID 10 disk array (RAID 5 is not ideal because of the many writes produced by replication), and putting the database log file on its own dedicated RAID 1 disk array. [6.5, 7.0, 2000, 2005] Updated 9-17-2004
If the distribution and publishing components of replication have to be on the same server (because of resource considerations), and this server is reaching its performance capacity, then use Pull Subscriptions instead of Push Subscriptions. This will offload some of the work to the subscribing servers. Of course, this assumes the subscribing servers have the resources to do this. [7.0, 2000, 2005] Updated 9-17-2004
When creating a publication, SQL Server gives you the ability to specify if one or more of the subscribers will be Microsoft Jet databases. If you need this feature, then turn it on. But if you will not have any Microsoft Jet subscribers, then don’t select this option. What happens when you select this is option is that the Snapshot Agent will use BCP character format, not the faster BCP native format, when running. This also means that any SQL Server subscribers will have to go through extra work translating the BCP character format when it receives the publication, which causes a slight performance hit. [7.0, 2000] Updated 9-17-2004
For best performance, avoid replicating columns in your publications that include TEXT, NTEXT or IMAGE data types. These data types require more overhead than standard data types. [7.0, 2000, 2005] Updated 9-17-2004
If you are filtering the data on the publisher before it gets to any of the subscribers, be sure that the column or columns you are filtering on have an appropriate index. If not, then SQL Server will perform table scan to filter the data, putting an increased load on the publisher and reducing performance. [6.5, 7.0, 2000, 2005] Updated 9-17-2004
If the replication load is very heavy between the publisher and the distributor, or between the distributor and the subscriber(s), and each of these is on their own dedicated SQL Server, and each are connected by a fast LAN connection, and the latency among the servers is longer that you prefer, consider this option. Instead of using public network connections between each of the servers, use a dedicated private network instead. This can be accomplished by adding an additional NIC to each server, and connecting them using a private hub, switch, or cross-over cable.
This technique will boost performance because data replication no longer has to compete with all of the data already traveling on your public network. To ensure the highest performance, use 100Mbs or 1Gbs NICs and matching connection hardware and cables. [6.5, 7.0, 2000, 2005] Updated 9-17-2004
When you first click on the Replication Monitor group in Enterprise Manager, after having installed Replication on your server, you see a message asking you if you want to enable polling for Replication Monitor.
What this means is that the replication agent status information, found in the Replication Monitor group in Enterprise Manager, will automatically be refreshed periodically if you answer Yes, or it won’t be automatically refreshed if you answer No.
If you answer Yes, then you will see another screen that allows you to specify how long the refresh interval will be.
As you might imagine, if you answer Yes, SQL Server will have to incur some additional overhead to automatically refresh the status information. For the most part, the default 10 seconds refresh interval won’t produce a significant amount of overhead, and it is a good place to start. If you enter a refresh interval of less than 10 seconds, such as 1 or 2 seconds, you will probably notice the overhead, and you will probably be unhappy with this selection. If you are currently having performance problems, you may want to increase the default 10 seconds to a larger number, such as 30 seconds or so.
Once you set the refresh interval, you can always change it by right-clicking on Replication Monitor in Enterprise Manager and selecting “Refresh Rate and Settings.” [7.0, 2000] Updated 9-17-2004
If the subscriber data is being indexed heavily for the decision support needs of the organization, keep in mind that those same indexes can slow down how long it takes data to be moved from the distributor to the subscriber. The more indexes there are, the slower the replication process. This is because any new data moved to the subscriber must be indexed. In some cases, it might be beneficial to create an indexed view on the publisher, and then publish it as a table to subscribers. [2000, 2005] Added 10-9-2001
The distribution agent, the log reader agent, the merge agent, and the snapshot agent all log their activity as they work. The amount of data they log depends on their verbosity setting. The more verbose the setting, the more overhead there is. The less verbose the setting, the less overhead there is. According to Microsoft, setting all of these settings to their lowest setting can boost replication performance from 10 to 15 percent. In most cases, there are two verbosity settings that need to be made:
HistoryVerboseLevel and OutputVerboseLevel.
These settings are made using the various agent utility programs, which are located in the Binn folder. Here are the commands used to turn the verbosity levels of each of these agents their least verbose setting:
distrib -HistoryVerboseLevel 1
distrib -OutputVerboseLevel 0
Log Reader Agent
logread -HistoryVerboseLevel 1
logread -OutputVerboseLevel 0
replmerg -HistoryVerboseLevel 1
replmerg -OutputVerboseLevel 0
snapshot -HistoryVerboseLevel 1
snapshot -OutputVerboseLevel 0
The default value for the HistoryVerboseLevel is 2 (not 1), and the default value for the OutputVerboseLevel is 2 (not 0).
Of course, if you are troubleshooting replication, you will not want to reduce these values. But if you are in production, and all is going well, you should consider changing the relevant default values in order to boost performance. [7.0, 2000] Updated 3-20-2006
If you need replication agents to run frequently, say every minute or so, it is more efficient to set them to run continuously than it is from them to run frequently. This is because these agents require overhead to start and stop as they are used, which increases overhead. So if you have set the replication agents to run often, consider setting them to run continuously instead. On the other hand, if you only need the agents to run occasionally, once an hour or so, or longer, then this configuration is still more efficient than running them continuously. You may have to perform tests to see which option best meets your specific replication needs. [7.0, 2000, 2005] Updated 3-20-2006
To optimize the performance of the distribution database, manually size it to what you believe will be the biggest size it will ever get. This can reduce overhead as SQL Server will not have to increase its size automatically as needed. In addition, be sure not to turn on the “Auto Shrink” and the “Auto Close” database options, as they will incur unnecessary overhead to the database. [7.0, 2000, 2005] Updated 3-20-2006
If your SQL Server 2000 replication configuration will cause a large number of rows to be affected, consider using stored procedure replication. Significant overhead can be reduced if the execution of stored procedures is replicated instead of the data changes caused by the execution of one or more stored procedures. This is generally accomplished in snapshot or transactional replication when one or multiple stored procedures are specified as articles. [2000, 2005] Added 10-9-2001
To help reduce overhead and the size of the distribution database, consider reducing the amount of time that log history and transaction retention is held. To do this, right-click on the “Replication” folder, then select “Configure Publishing, Subscribers, and Distributors.” Next, click on the “Properties” button for the distribution database to make your change.  Added 10-9-2001
Avoid creating triggers on tables that contain subscribed data. They can significantly add to overhead and slow replication performance, including snapshot replication, transactional replication, and merge replication configurations. [7.0, 2000, 2005] Updated 3-20-2006
To help you identify any potential replication performance issues, consider tracking these Performance Monitor counters:
- Dist: Delivered Cmds/sec: Tracks the number of commands per second sent to subscribers from the distributor.
- Dist: Delivered Trans/sec: Tracks the number of transactions per second sent to subscribers from the distributor.
- Dist: Delivery Latency: The average amount of time it takes to for a transaction to be moved from the distributor to a subscriber.
The first two counters help you to get a feel for how busy your replication setup is. There are no absolute numbers to watch out for. Instead, you need to watch these counters over time to see if there is any significant change, and if so, you need to investigate why, as this could indicate a potential performance issue. The Delivery Latency gives you a good feel for how much time it takes subscribers to get updated. Again, there is no ideal number for this counter. Instead, evaluate whether or not the latency you are seeing is livable. If it is not, then you will have to take steps to reduce latency to an acceptable figure. [6.5, 7.0, 2000, 2005] Updated 3-20-2006