Site sponsored by: Idera The gold standard of SQL Server performance monitoring & diagnostics.
SQL Server Performance

  • Home
  • Articles
  • Forums
  • Tips
  • Quiz
  • FAQ's
  • Blogs
  • Software
  • Books
  • About Us
RSS Feeds
Sign in | Join


Article Topics

All Articles
Performance Tuning
Audit
Business Intelligence
Clustering
Reporting Services
Developer
General DBA
ASP.NET / ADO.NET

Write for Us

Share you SQL Server knowledge with others and raise your profile in the community More...
Latest Articles

Claytons Data Mining (Part 2)
Backup System Databases Using Maintenance Plans
Overview of Maintenance Plans in SQL Server 2008
Monitoring Index Fragmentation

More     
 
Latest FAQ's

How to alter a User Defined Data Type?
How to unzip a File in SSIS?
How to view previous query plans?
ALTER TABLE SWITCH statement failed because the object '%.*ls' is not ...

More     
   
Latest Software Reviews

Spotlight on ApexSQL Doc 2008
ApexSQL Enforce
Embarcadero Change Manager
SQL Server DBA Dashboard

More     

articles >> performance tuning >> A Practical Path Towards the Ultimate Microsoft ...

A Practical Path Towards the Ultimate Microsoft SQL Server Scalability and Availability

By : Justin Y. Shi
Jun 15, 2005

Page 2 / 3

Microsoft Cluster Server (MSCS)

Microsoft SQL Server Enterprise Edition supports MSCS. MSCS was designed to provide high availability by running multiple SQL instances at the same time. The idea was to have automatic failover capability when the primary crashes. This idea would have worked fine it MSCS supports multiple replicated datasets, one for each instance. Instead, MSCS supports only a single update database image. Since transactions are mostly I/O bound, this limitation makes MSCS unable to help in scalability at all. It is ironic that MSCS still needs scheduled backups for dataset availability.

 

Distributed Partitioned Views (DPV)

Microsoft SQL Server Enterprise Edition also supports DPV. DPV enables the “federated database design”. The idea is to partition your tables across multiple physical SQL servers. Queries updating different tables will be automatically directed to multiple SQL Servers to achieve scalability. For very large databases, this scheme must be used to obtain reasonable response time. It requires significant application recoding, if the application did not start with partitioned views in mind. This technology makes the availability problem dramatically worse than using a single SQL Server since any server crash can potentially halt the entire cluster. The delivered scalability is also limited. For example, if some tables attract more queries than others, these servers will be more heavily loaded than others. Using MSCS on top of DPV helps neither scalability nor availability. It will result in a costly cluster with very high license fees and maintenance complexity.

 

Automated Transaction Log Shipping

The simplest way to produce a replicated SQL Server dataset is perhaps using log shipping. You will need two running SQL Servers. You can automate the process that the backup server is being updated every few seconds. Since the backup SQL will never be in sync with the primary, it cannot be used for real time updates.

Theoretically the backup SQL’s value is only to be realized during disasters. Clever uses of the backup SQL would include report and data mining applications. The trouble is that when the log is applied, the database cannot be used at all. This makes the reporting and data mining application schedule a challenge. Another thorny point in practical production is that the speed of log shipping typically lags behind when updates are frequent. The delay can be in minutes to hours. When the primary SQL crashes, all transactions not yet shipped will be lost. In addition, log shipping must be defined for each database.

If your SQL Server is hosting multiple databases; such as a typical ASP (Application Service Provider), you will have to maintain multiple scripts and make sure each is running smoothly on time.

 

Immediate Transaction Replication

Microsoft SQL Server 2000 supports transaction replication based on publisher/subscriber relations. The system was designed to incrementally propagate transaction data in distributed environments. Transactions are first applied to the primary server and then applied to the subscriber server (or a replication server who relays the contents to the target SQL Server). There is also an “immediate update” option designed to minimize the possible delay. Since the target SQL is never in sync with the primary, in either case it cannot be used for real time load sharing. But they do help with reporting and business intelligence applications. Practice indicates that managing the publication queues is tricky due to network/processor speed differences and the large overhead by the replication engine.

 

Disk or Database File Mirroring

Hard disk vendors and backup software vendors also offer some help for SQL Servers. Their offers consist of synchronous and asynchronous replication of database files. Asynchronous database file replication has exactly the same problem as for log shipping, provided that these solutions often does not need a running SQL instance. All synchronous replication products use the well-known two-phase-commit protocol (2PC) that either the primary or the backup side does not commit, the entire transaction must be rolled back. This effectively cuts each transaction’s survivability by half – making it practically unusable.

 

Hardware Mirrors

One ancient method for providing high availability is to use lock step duplicated hardware. This concept has produced the most cost prohibitive design in the industry. The real trouble is that the money was thrown at the wrong problem – hardware is no longer the weakest link. This has been true at least for the last ten years. These systems still operate on a single update dataset offering no help for scalability. It is ironic that these systems’ availability requires log shipping or regular backups.

 

Homegrown Transaction Replication

Since the hardware costs have gone down so much, it is cheap enough to build a Microsoft SQL Server cluster with identical datasets. It appears that all one needs to do is to make sure that updates are sent to all SQL Servers and the read-only can be distributed to any.

A closer look will find that DCP will happen, if the application support concurrent updates for multiple clients. This is not recommended for anything other than single client update applications.

 

Replication Middleware

One emerging technology involves injecting a middleware between the data clients and SQL Server(s). This represents a unique approach in solving scalability and availability problems. Since the middleware is capable of “seeing” all concurrent transactions when they pass through the middleware, it provides a unique opportunity for data synchronization, automatic fail over and load balancing. While the middleware will introduce some overhead, if the benefits include both scalability and availability, the prizes may well be worth the troubles.

To date, there have been two proprietary methodologies:

  • Modified database access API to capture transactions for special synchronization controls (SQLup by Incepto.com (now defunct)); and
     

  • Proxy-like packet router by embedding the data synchronization controls in the packet transmission mechanisms (ICX-UDS by Parallel Computers Technology Inc.).

Regardless the data synchronization algorithms, changing API is a bad idea for practical reasons. Captured transactions must be sent to a middleware server before reaching SQL Servers. The middleware server becomes the single-point-of-failure. A proxy-like packet router can appear seamlessly in any existing SQL infrastructure. It can be monitored and switched by network appliances to remove the single-point-of- failure. Therefore, it is the most promising in solving both the scalability and availability problems.


<< Prev Page     Next Page>>    








Home | Peformance Articles | Audit Articles | Business Intelligence Articles | Clustering Articles | Developer Articles | Reporting Services Articles | DBA Articles | ASP.NET / ADO.NET Articles | DBA FAQ's | Developer Peformance FAQ's | DBA Peformance FAQ's | Developer FAQ's | Clustering FAQ's | Error Messages | Audit Tool Reviews | Backup Tool Reviews | Coding Tool Reviews | Compare Tool Reviews | Documentation Tool Reviews | Design Tool Reviews | Monitoring Tool Reviews | Log Tool Reviews | Reporting Tool Reviews | Clustering Tool Reviews | Security Tool Reviews | Change Management Tool Reviews | Remote Access Tool Reviews | Book Reviews | Security Tool Reviews | QDPMA Performance Tuning | ADO.NET / ASP.NET | Administration | Analysis/OLAP Services | Application Development | Configuration | Components | ETL | Hardware | High Availability | Hints | Index | Misc | Operating Systems | Performance Tuning | Replication | T-SQL | Views


              © 1999-2008 by T10 Media. All rights reserved