SQL Server Performance Forum – Threads Archive

How to Do Data Archive

If I have a huge database, I use the data archiving to archieve to table. eg. I have table A, a huge table, I archive some of the old data to table B. Let said my data is some transaction data. When I would like to retrieve it to generate a report, a historical report for my yearly transactions. Let said I’ve split the March 2003 data from table A into table B. which mean only April to July transactions remain in table A. But now I want to retrieve the data from Jannuary till now on, to generate a half year report. How should I do it?? Should I just using a SQL query to select data from table A where date from Jan to July, then the system will automatically look back for the data that have been archived into table B and retrieve it for me?? Or I should use 2 seperate queries to select it from table A (Apr-July) and table B (Jan-Mar)???

Its better to differentiate the data on month basis, in your case you can maintain seperate tables for each quarter and run query whereever required. Always test the queries in QA using QUERY EXECUTION PLAN and take help of PROFILER to define recommendations from Index tuning wizard. BTW< what is the size of data in each month? For information refer to this topic also http://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=1008] HTH
_________
Satya SKJ
Moderator
SQL-Server-Performance.Com

I would recommend using a indexed view on the data if possible. If it is not possible, a single query using UNION ALL clause should do the job for you. There is no need for writing two queries and it will be time consuming for coding as well as performance. Gaurav

If I use UNION ALL to get data from both table, if the data that i would like to retrieve is located in the table A, how’s the query performing??? Will it search for Table B although ther are no related data in B?? Example, I want to retrieve data by date from June to July (all the transactions are in Table A, not yet archive into Table B). How would the performing for a single query that just retrieve data from Table A compare to a query using UNION ALL Table A and B?? Normally my database size is around 2-3GB.
I’m actually facing the performing problem. My system running very slow when I’m searching or retriving data from database.

If you create a constraint on the tables with the date, I don’t think the query optimizer will touch the other table. Actually database size is not so big. Can you post the code for the tables involved including the indexes and also the query that you are executing? Gaurav

Actually I’m using Sybase and PowerBuilder to develop my system. But now I’m planning to migrate to .NET and using SQL Server 2000. I’m just doing research before I start my system. I’m reviewing the problems that I’ve faced in Sybase and PB, and see whether SQL Server 2000 and .NET will have the same problems. Anyway, thanks a lot for your helps everyone!

Another question from me.
What is the size of the database table after achiving?
Will it become smaller or same size?
Let said my database before archive is 2GB, after archive, willl it become smaller??? less than 2GB???

Any ideas on back-end processing?
Lets say I’m doing retrieving data history for reporting. In the mean time, I would like to continue with my front-end processing, such as daily transactions?
In my current system, while I’m doing my data retrieve, I cannot do anything before the retrieving process finished. One of the issue I think is because of the size of my database and the technic that I’m using to retrieve the data is bad. Hope to hear more advices.

quote:Originally posted by esupport …What is the size of the database table after achiving?
Will it become smaller or same size?
Let said my database before archive is 2GB, after archive, willl it become smaller??? less than 2GB???

Partitioning your tables (what you are currently calling archiving) will not result in a smaller database. The tables – and the data therein- are still contained in the same data files. The size may reduce if you are archiving/partitioning to another database and then using distributed views to do your queries. Even with federated databases you might still have to shrink your data files (and log(s)) to see an overall size reduction.
Actual database size is not the primary reason most DBAs would implement partitioning(‘archiving’). Performance when working with specific partitions would be one among many ‘excuses’. Nathan H.O.

quote:Originally posted by esupport …Any ideas on back-end processing?…..

No major problems in SQL server with SELECTing while doing front-end stuff –
You may have to worry about locking and blocking in a multi-user environment.

No problems if you partition your application into logical layers (the reknowned user, business and data services)

SQL Server has a transaction Isolation level that can be ‘infrequently’ manipulated

Your fears will definitely materialize if you ignore commonplace advice on database and application architecture design when you upgrade to .NET and SQL Server. Check out the various articles on this site about database design, federated databases…. In fact read almost everything if you have the time. Nathan H.O.

I would recommend creating a view that UNION ALL the two tables. This way you don’t have to duplicate the UNION ALL statement in every stored procedure that requires the full data; they can just use the view. Whether to use an indexed view or not dependeds on how busy table A is. If it’s updated frequently, then using an indexed view is a bad idea because most of the time you’re only updating the recently data, and each update will have to update the indexed view in a hot spot. Keep in mind that an indexed view is implemented internally as a table, so if your data is big, it will consume a lot of space. Therefore I suggest using a plain-vanilla view to UNION ALL the two tables. As far as performance goes with the UNION ALL statement or the view, it depends on the column used in the WHERE clause and the indexes on table B. For example, if you have a clustered index on DateEntered for table B and the query has a WHERE clause that restrict on that column, then there won’t be performance penalty because the query optimizer can figure out very quickly that table B is not needed. But if you columns in the where clause that are not covered by an index (or some of them can’t be used for short-circuiting becausing you’re using OR), then badnews: it will scan table B and incurs a huge performance penalty.

I have my own forum as well using snitz, as I have just set it up I do not have any post older then the minimum 1 month period to look at the archiving function. My DB/Forum uses an access back end instead of sql. My question is, when you use the archive option, what does it actually do? ie. the old topics, where are they archived to? once the db is archived how do you then recall the data that has been archived? regards, [<img src=’/community/emoticons/emotion-1.gif’ alt=’

‘ />]

quote:Originally posted by aziz I have my own forum as well using snitz, as I have just set it up I do not have any post older then the minimum 1 month period to look at the archiving function.
My DB/Forum uses an access back end instead of sql. My question is, when you use the archive option, what does it actually do?
ie. the old topics, where are they archived to?
once the db is archived how do you then recall the data that has been archived?

I guess this should be your place to gohttp://forum.snitz.com/ Frank
http://www.insidesql.de
http://www.familienzirkus.de

Aziz Kindly post the threads in related forums. _________
Satya SKJ
Moderator
SQL-Server-Performance.Com

]]>