SQL Server Performance

Replication advice

Discussion in 'SQL Server 2005 Replication' started by acrombie, Aug 28, 2007.

  1. acrombie New Member

    Hi,
    Here is my scenario, any advice would be much appreciated;
    I have a main database, part of which will be updated manually through a web interface by approximatly 50 concurrent users, the other part will only be updated by a bulk import occuring every 6 months or so. The portion of the database that will be updated by a bulk import will be large, maybe billions of records. The rest of the database should a more managable size and the number of changes per day should be in the thousands and no more.
    I also have, six (or more) calculation databases that use the data in the main database to run calculations. These calculation databases will be read-only but will need to use the very latest data from the main database in order to do the calculations. Once a calculation is started on any one of the calculation databases i need to halt replication for that database and start it again once the calculation is complete. A calculation can take any where between 2 min and 72 hours. I need the calculation databases to be ready and up to date with the latest data at any point in time in order for users to run calculations on (accepting a small delay perhaps, but this should be more more than a few seconds).
    The main database and each of the calculation database will sit on seperate servers. The calculation servers will be virtualized on two machines (3 on each)
    Can anyone suggest a strategy that would suit my needs?, is it possible to divide the replication process into two perhaps; the portion of the database that will be updated on a day by day basis and the other portion of the database that is only updated by bulk import?. All tables have primary key identity columns.
    Hope someone can help me.
    Angus
  2. thomas New Member

    Yes, you can have 2 different kinds of replication for different tables (e.g. transactional for one, snapshot for the other).

    You say you want to be able to stop replication to do calculations, for up to 72 hours, but also the database must be fully-up-to-date at all times for calculations? You can't do both. However if you have Enterprise Edition of 2005 on your subscribing (calculation) database, you could let replication run continuously and instead of stopping it, use database snapshots to run the calculations. These can then be dropped when no longer needed, and will provide a static read-only, point-in-time database for each calculation.

    Also if all your replication is transactional, the database will have one log reader and one distribution agent. You can't stop and start these for different parts of the database, it's all or nothing.
  3. acrombie New Member

    Thanks,
    When i say 'stop' replication during calculation, i really mean 'pause'. During the calculation, the data in the calculation database will and must remain the same, however, upon completion, the calculation database must fetch all the data changes that have taken place since the calculation began.
    I suppose i could use a snapshot to run the calculations. Won't this take a long time to create though?, given the database could exceed 300gb.
    I need the delay just prior to a calculation to be kept to about 4 or 5 seconds. I do accept, however, that once the calculation has finished, it may take some time for the database to fetch all the changes that have taken place, especially, if it has been running for 72 hours.
    Angus
  4. thomas New Member

    No - creation of snapshots is very quick. It doesn't actually have to do anything until data starts to change, in which case it copies the original (unchanged) pages to the snapshot, to keep them correct as of the creation time of the snapshot. So snpashots tend to remain small (unless your entire database changes very rapidly).
  5. acrombie New Member

    OK, thanks
    So, have i got this right? I can use transactional replication that will replicate changes to each of the calculation databases continuously. Then, just prior to a calculation, i create a snapshot of the replicated database to use in the calculation and allow the replication process to continue for the duration of the calculation. At the end of the calculation, i discard the snapshot.
    Thanks again for your help
    Angus
  6. thomas New Member

    yes, exactly right. good luck.

Share This Page