SQL Server Performance Forum – Threads Archive
need help : very large database design
Hi, which solution is better for designing a very large database, to which more than 1 million records will be added daily?
A. 1 Database, 2 or 3 Tables.
B. 1 Database, Many Tables ( Maximum 50,000 ).
C. Many Databases, 2 or 3 Tables Each.
The no. of tables will depend on the normalization of the data and the kind of system it is: if DSS, there will be less normalization; if OLTP, more. The no. of databases can be decided based on the business functionality of the application. If there are independent modules whose data can be kept separately and does not depend on each other, then you can go for multiple databases. This will make maintenance activities like backups easier. Otherwise, if the data is more or less related, go for one database and decide on the system resource requirements like processor, physical memory, disk etc. Gaurav
As Gaurav stated, your design should depend mostly on the business requirements (rules) and partly on your own administrative needs. The number of objects you eventually create is as important as their location within your available storage. With between 10 and 30 INSERTs per second, the OLTP architecture will need to be optimized for fast logging. On the other hand, the inevitability of queries against your database(s) will require you to strike a delicate balance between the number of indexes on your tables and the shortest possible time for logging the many INSERTs (and extent allocations). The administrative effort (in terms of frequency and type) will increase with the number of objects you create. It is always easier to manage (backup, restore, replicate, maintain etc.) fewer databases. There can be no straight answer about the number of objects (databases and the tables therein) without a serious look at the business requirements. Only a thorough analysis will tell you what to do and how.
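One way to lean toward fast logging is a clustered index on an ever-increasing key, so new rows append at the end of the table instead of splitting pages, with secondary indexes kept to a minimum. A rough sketch (table and column names are made up for illustration):

```sql
-- Hypothetical transaction table tuned for 10-30 INSERTs per second.
-- The clustered index on an IDENTITY key appends rows at the end of
-- the table, avoiding page splits during heavy logging.
CREATE TABLE dbo.DailyTransactions (
    TransactionID   INT IDENTITY(1,1) NOT NULL,
    TransactionDate DATETIME NOT NULL DEFAULT GETDATE(),
    CustomerID      INT NOT NULL,
    Amount          MONEY NOT NULL,
    CONSTRAINT PK_DailyTransactions PRIMARY KEY CLUSTERED (TransactionID)
)

-- Each extra nonclustered index is maintained on every INSERT, so add
-- only what the queries really need.
CREATE NONCLUSTERED INDEX IX_DailyTransactions_Date
    ON dbo.DailyTransactions (TransactionDate)
```

Whether even that one nonclustered index is worth its insert cost depends on how often the date-range queries actually run.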
Consider your query performance against database load. Are you, for example, wanting to query current data, or are you aggregating prior to reporting? One scenario is a daily load into a single non-indexed table, which you aggregate with previous data overnight for reporting, taking an archive of the 1m transactions. If all transactions are needed for reporting (aggregation and drill-through) then, as long as you do not need to look at current data, you could still use the above approach, but rather than archiving the data, add it overnight to the reporting database, the reporting database having the indexes etc.
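The overnight step described above might look something like this (database, table, and column names are purely illustrative):

```sql
-- Hypothetical nightly job: roll the day's load from the non-indexed
-- staging table into a summary table in the reporting database, then
-- clear the staging table for the next day's inserts.
INSERT INTO Reporting.dbo.DailySummary (SummaryDate, CustomerID, TotalAmount, TxnCount)
SELECT CONVERT(CHAR(8), TransactionDate, 112),   -- yyyymmdd day bucket
       CustomerID,
       SUM(Amount),
       COUNT(*)
FROM   Staging.dbo.DailyTransactions
GROUP BY CONVERT(CHAR(8), TransactionDate, 112), CustomerID

-- Archive first (e.g. BCP out or INSERT into an archive table) if the
-- raw rows must be kept, then empty the staging table cheaply.
TRUNCATE TABLE Staging.dbo.DailyTransactions
```

Because the staging table carries no indexes, the daytime INSERTs stay fast, and the aggregation cost is paid once, off-peak.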
An almost worst-case scenario would be where reporting is required in real time, in which case definitely have a daily database with minimal indexes and update the reporting database overnight.
The worst-case scenario would be where aggregated data is required for reporting in real time, in which case you will need to consider the balance of user requests against logging performance, or consider the best way to join the daily data to an aggregate table.
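One hedged way to join today's data to the aggregate table is a view that UNIONs the pre-aggregated history with an on-the-fly aggregate of the current day's lightly indexed table (all names here are assumptions, matching nothing in particular):

```sql
-- Near-real-time summary: historical rows come pre-aggregated from the
-- reporting database; today's rows are aggregated at query time from
-- the staging table. Reports query the view and see both.
CREATE VIEW dbo.vCurrentSummary
AS
SELECT SummaryDate, CustomerID, TotalAmount
FROM   Reporting.dbo.DailySummary                -- aggregated overnight
UNION ALL
SELECT CONVERT(CHAR(8), TransactionDate, 112),   -- yyyymmdd day bucket
       CustomerID,
       SUM(Amount)
FROM   Staging.dbo.DailyTransactions             -- today's raw inserts
GROUP BY CONVERT(CHAR(8), TransactionDate, 112), CustomerID
```

The trade-off is exactly the one described above: every report now scans and aggregates the live table, so the query cost competes with logging performance on busy days.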