I sent this to Brad McGehee as an email and he rightfully asked me to post it here so his answer would benefit the masses. Anyone else who wants to weigh in on this is welcome to do so, too... I have a question regarding Brad's article about about improving performance in SQL Server 2005 at the following page here: http://www.sql-server-performance.com/tips/asp_sql_server_p1.aspx Brad explicity recommends against storing images in the database, but is this always true? MS published a whitepaper in 2006 that showed for file sizes smaller than 256 KB, the DB outperforms NTFS in delivery and is on par with it fragmentation-wise: http://research.microsoft.com/research/pubs/view.aspx?msr_tr_id=MSR-TR-2006-45 We're using a webservice to deliver dynamic images (charts) that are built using SQL Server Reporting Services. The images are regenerated when a user enter new information and then visits the page hosting the chart (generally once a day). Note that to improve performance, if a user enters new information but doesn't bother to view the chart, the image will not be regenerated. The images themselves average about 60 KB in size and are in PNG format. We are currently planning to save these images in a table reserved specifically for them with only an identity column as the PK and the image column itself. A separate table will include foreign keys for the image id and user id, as well as a modified date and a few other columns needed for sorting/selecting. When a change is made, we simply append the new image to the table instead of updating the existing one, then update the foreign key in the related table with the new image id. We then delete any orphaned images during times of low usage. Does your advice ring true given the whitepaper results and our situation? Will our approach of simply including a sequential identity column and the image column in a separate table help with performance? Many thanks in advance for any comments you can provide! Cheers, Gene
Hi, Welcome to the forum!! yes according to the research paper it stands true to blob upto 1 MB of files, but then the hardware they have used to test stress would be different then yours, the scope of the application and its growth would be different. So, you have to test it with your available hardware and application to come to conclusion. Some years back we have had similar situation, though the files are not bigger then 1 MB we have stored only path of the files and did not blob them.
I think you should store it in the database. http://groups.google.se/group/microsoft.public.sqlserver.programming/browse_frm/thread/c5bfd673c1ee4524/455d6ec4c91a75d4?hl=sv&lnk=st&q=kurt+sune+image&rnum=18#455d6ec4c91a75d4 http://groups.google.com/group/microsoft.public.sqlserver.programming/browse_thread/thread/28cf7a9cf600daf0Steven Abbott: Not long ago, someone called with a request for advice about digitizing an important and massive 19th century manuscript archive. Google-aided research quickly convinced me that, contrary to popular wisdom, the problems associated with digitizing had nothing to do with scanners and storage. Information, not technology, is the problem. Just having a giant collection of a million images stored as Image1, Image2, and so on isn’t the answer. A historical archive needs to be an active database that scholars can access in all sorts of ways. Because the archive in question couldn’t feasibly be scanned with OCR to convert the image data to text data, the database would have to rely on the integrity of links between images and some sort of textual record of what each image was. For example, the textual database might say that Image99 is a letter from A to B about subject C, written on date D. Given a million (or so) images, it should be clear why the integrity of that textual database is crucial. Ideally, you’d want to make sure that each image stored sufficient data within itself that a workable database could be assembled from the images themselves... PLUS: in SQL 2008 there will be a stream dattype. /m
Gene, I am facing similar decision. <Again> In the past I've followed the convention wisdom of file storage. Out of curiosity, which option did you end up chosing for production? What has been your experience with performance?
DBConner Welcome to the forums. It is good that you are able to pickup the relevant post to refer here that can help a lot for others too, in anycase I would like to ask you that what kind of version & edition of SQL you are planning to deploy. As you may know SQL 2008 has got great feature named FILESTREAM where you could take advantage of storing unstructured data in SQL without compromising the performance.
Thanks for the info Satya. We haven't started using SQL 2008 in production. Although we started development on a SQL 2008 project, it doesnt involve BLOB or unstructred type data. I did run across the FILESTREAM type -appears that it could be natural integration in C# apps, given FileStream type concept is there. - looking forward to pref test! Meanwhile, back in the past... still interested on any info on perf testing 2005 image storage especially in production app... -DC
Ok, so what will be volume of those unstructured data (images/docs) to store in the database. What is your criteria, is it performance or usability of this unstructured data?