Database Load Balancing
An emerging way to achieve massive database scalability is true database clustering: literal load balancing of database servers. Load balancing web servers and application servers is straightforward because they have no data to persist. Any web server in a cluster can handle an incoming request because each server runs identical executables. When clustered web servers need access to data, they funnel requests to a shared data source, usually a database. You cannot use traditional load balancing for database servers, because the databases would fall out of sync as soon as a write is made to one of them.
The key to load balancing persistent data sources lies in synchronization. Any update must be applied transactionally across all databases in the cluster, so that if a read request arrives a millisecond later for the rows affected by the write, every database server in the cluster will return exactly the same result. This kind of guaranteed synchronization can only be provided when load balancing is built into the database itself, as with Oracle9i Real Application Clusters, or provided through a middleware transaction server, such as Database Scattering, a new product offered by my company, Metaverse.
To demonstrate how powerful load balancing can be as a way to manufacture scalability, allow me to use Microsoft Access as an example. Access is generally considered “not scalable”, mostly because it cannot handle high volumes, yet most would consider it suitable for many small to medium-sized applications. What people are really saying is that Access can’t scale up, which is true. Access wasn’t written to take advantage of multiple processors or very large amounts of memory, and it doesn’t have any automatic failover capability. But if you were to load balance access to it, those things become much less important. Scalability comes from each Access server in the cluster processing only a much smaller share of the operations, which it is capable of handling, and backup and recovery come from having multiple separate live copies of the data, which is better than failover. While load balancing Access won’t clip the other thorns of a low-end database (transactional ability, etc.), I think it demonstrates the power of load balancing and distributed processing to achieve massive scalability with software that was not designed to be very scalable at all, much less massively scalable.
The basic idea behind load balancing databases is to read from any one database server in the cluster and write to all databases in the cluster. You can write a simple database load balancer yourself by doing just that: given a set of database connection strings, randomly select one for each read operation and use them all for each write operation.
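The read-one, write-all idea can be sketched in a few lines. The example below is only an illustration under stated assumptions, not how Database Scattering or any commercial product works: it uses Python's built-in sqlite3 module with three independent in-memory databases standing in for the cluster, and the class name SimpleDbLoadBalancer and the accounts table are hypothetical. Note that rolling back on failure is a best-effort gesture; it does not provide the guaranteed cross-database transaction described earlier.

```python
import random
import sqlite3


class SimpleDbLoadBalancer:
    """Hypothetical sketch: read from one random replica, write to all."""

    def __init__(self, connection_strings):
        # Assumption: each "connection string" is a SQLite database path.
        self.connections = [sqlite3.connect(cs) for cs in connection_strings]

    def read(self, sql, params=()):
        # Any replica can serve a read, so pick one at random.
        conn = random.choice(self.connections)
        return conn.execute(sql, params).fetchall()

    def write(self, sql, params=()):
        # Apply the statement to every replica before committing anywhere.
        try:
            for conn in self.connections:
                conn.execute(sql, params)
        except sqlite3.Error:
            # Best effort only: undo uncommitted work on all replicas.
            for conn in self.connections:
                conn.rollback()
            raise
        for conn in self.connections:
            conn.commit()


# Three separate in-memory databases stand in for three database servers.
balancer = SimpleDbLoadBalancer([":memory:", ":memory:", ":memory:"])
balancer.write("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
balancer.write("INSERT INTO accounts (id, balance) VALUES (?, ?)", (1, 100))

# The read returns the same rows no matter which replica is chosen.
rows = balancer.read("SELECT balance FROM accounts WHERE id = ?", (1,))
print(rows)
```

A production balancer would also need to handle a replica that fails after the others have committed, which is exactly the synchronization problem that a transaction server or a clustered database is meant to solve.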
Load balancing databases enables you to rely more heavily on the database for processing, because you don’t have to worry about one or two inefficient queries stealing all of the database’s CPU cycles. Front-end applications and processor-intensive back-end services alike can have the same access to live data. Obviously, load balancing databases favors read-heavy applications; however, most applications are just that, with at least 80% of all operations being reads. Database load balancing presents a new and powerful way to achieve massive database scalability.
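To see why read-heavy workloads benefit most, a rough back-of-the-envelope calculation helps. The sketch below assumes, purely for illustration, that every operation costs the same, that reads are spread evenly across servers, and that synchronization overhead is ignored; the function name per_server_load is hypothetical. Under those assumptions each server handles its share of the reads plus every write.

```python
def per_server_load(read_fraction, servers):
    """Share of the original workload each server handles under
    read-one/write-all, assuming equal-cost operations (an assumption)."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    if servers < 1:
        raise ValueError("servers must be at least 1")
    write_fraction = 1.0 - read_fraction
    # Reads are divided among the servers; every server performs every write.
    return read_fraction / servers + write_fraction


for n in (1, 2, 4, 8):
    load = per_server_load(0.8, n)
    print(f"{n} servers: each handles about {load:.0%} of the original load")
```

With 80% reads and four servers, each server carries roughly 40% of the original load. Because writes are never divided, the per-server load in this simplified model can never drop below the write fraction, however many servers are added.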
About the Author
Douglas Kerwin is the founder and chief executive of Metaverse Corporation, an early adopter of Microsoft .NET technology. Doug has designed and patented Metaverse’s Database Scattering middleware, a Microsoft-based transaction server capable of load balancing database access to SQL Server. Doug can be reached at email@example.com.