Storing web statistics in the DB | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Storing web statistics in the DB

Hi, I have an Offers table where I store data such as Offer_id, User_num, theDate, Product_num, Offer_title, City, … (Is a table to store published offers by users in my website). Now I want to store statistics related to visits to offers_details web page, so, visits to every published offer. The statistics I want to store are related to actions the user can do in that offers_details web page, such as number of visits, number of times users has clicked a link, number of times users has enlarged an image, etc.. My question is: Which the best place to store this kind of information? Perhaps in a separate table called Offers_Statistics, with a primary key column as identity called Offer_statistic_id and a foreign key column called Offer_number? And then, every time an offer is published is added automatically a row in that statistics table with its correspondent offer number, in order to store statistics produced by users later, if any? Suggestions and/or ideas? Thanks
I know several CMS systems track this in the main table by adding one or more columns for that data. Are the statistics automatically being displayed when someone views an offer or will the user have to click somewhere to see statistics. Just asking, because when they are displayed every time a page is opened, I’d probably add some columns to the offers table eliminating the need for a join
However, I would go for a separated table when the user views an offer and has to take another action to see the statistics. —
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Heute schon gebloggt?http://www.insidesql.de/blogs

Hi Frank Kalis,<br /><br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote">Are the statistics automatically being displayed when someone views an offer or will the user have to click somewhere to see statistics.<hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br /><br />Now the user has to go to its private menu to see the statistics. Anyway, this may change in the future in some data, and display some of them every time ‘offer_detail.aspx#%92 web page is loaded, so, many many times per day (I hope [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]).<br /><br /><br />I was thinking beyond to the scope of that little part of the statistics, and I would like to implement briefly a system to track the user activity I want to analyse. I mean, store in the DB the sessions of all my users and the actions they made during the visit. For example:<br /><br />A user arrives to my index web page, after 5 seconds visits web_page2, after 2 seconds clicks button_2 in this page, after 6 seconds clicks button_4 in the same page, after 4 seconds clicks link2 and goes to web_page7, and here the session expires because none action is made during 15 minutes. <br /><br />The information I would want to store in this example is: <br /><br />I know the user? : If user has the last cookie, I know who is, and he/she will be in my ‘visitors_users#%92 table. If doesn#%92 t has the last cookie, with retrieved information in my web application such as IP address, internet provider and domain name, I can check in my ‘visitors_users#%92 table if there is some user with these identical data, if so, I also know who is, if not I add a new user with all these associated data and I give him/her a cookie for later sessions. <br /><br />Whatever previous situation is, I add a row to ‘Sessions#%92 table. In this table I would store the Session_id key field (PK) and User_id (FK). <br /><br />Then, in ‘web_statistics#%92 table I would store the web_stat_id key field (PK), Session_id (FK), action_id field to store the action the user has made (this means have a separate table with actions defined, and the web page the action belongs to) and the action_time field to store the date/time the action occurred. <br />So, with my example, I would store in ‘web_statistics#%92 table: <br /><br /><b>Web_stat_id</b> – <b>Session_id</b> – <b>action_id</b> – <b>action_time</b><br />584 – 52 – 12 – 9/01/2006 10:06:21 (Visits index page) <br />…<br />596 – 52 – 5 – 9/01/2006 10:06:26 (Visits web_page2)<br />…<br />632 – 52 – 25 – 9/01/2006 10:06:28 (Clicks button_2 in web_page2)<br />…<br />640 – 52 – 9 – 9/01/2006 10:06:34 (Clicks button_4 in web_page2)<br />…<br />655 – 52 – 58 – 9/01/2006 10:06:38 (Clicks link2 and goes to web_page7)<br />…<br />657 – 52 – 15 – 9/01/2006 10:06:38 (Visits web_page7)<br />…<br />867 – 52 – 32 – 9/01/2006 10:06:53 (Expires session)<br /><br /><br /><br />What do you think about this system to store all the statistics along my website? <br /><br />One of my doubts is: With that only statistics table for all web pages (‘web_statistics#%92), retrieve or query the statistics will not be very slow or will have bad performance because having to deal with thousands or millions of records? For example, trying to query how many visits a page had in a period of time, or how many times a link was clicked in a page during the last month by a specified group of users, etc,.. ?<br /><br />Thank you <br />
While it is fairly easy to collect all kind of data, you need to think about what you really need. There are already sophisticated tracking packages out there. I believe that there are at least some php scripts freely available. <br /><br />Also, do you know Google Analystics? Very, very sophisticated and free. [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br /><br />–<br />Frank Kalis<br />Microsoft SQL Server MVP<br /<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br />Heute schon gebloggt?<a target="_blank" href=http://www.insidesql.de/blogs>http://www.insidesql.de/blogs</a><br />
Hi,<br />refer follow link if it helps(vb and asp scripts to maintain site statistics) :<br /><br /<a target="_blank" href=http://www.planet-source-code.com/vb/scripts/BrowseCategoryOrSearchResults.asp?txtCriteria=site+statistic&blnWorldDropDownUsed=TRUE&txtMaxNumberOfEntriesPerPage=10&blnResetAllVariables=TRUE&lngWId=4&optSort=Alphabetical>http://www.planet-source-code.com/vb/scripts/BrowseCategoryOrSearchResults.asp?txtCriteria=site+statistic&blnWorldDropDownUsed=TRUE&txtMaxNumberOfEntriesPerPage=10&blnResetAllVariables=TRUE&lngWId=4&optSort=Alphabetical</a><br /><br /><img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ /><br />Regards<br /><br />Hemantgiri S. Goswami<br />[email protected]<br />"Humans don’t have Caliber to PASS TIME , Time it self Pass or Fail Humans" – by Hemantgiri S. Goswami<br />
Yes, I know what I need now, and I will change many things in the future. I didn#%92 t know Google Analytics, seems quite good, I also know other free web traffic analysis. Anyway, I think that those tools are good to see the results of other generic web traffic analysis, and can compare them with your own results. I think that it#%92 s important to have your critical data (such as data you think it#%92 s very important for your business) in your own application and database. Always will be some kinds of data and some kinds of analysis that are very important for you and that a free or paying tool can#%92 t get or delivery. So, I was thinking in the way I can design the DB so that I can get the kind of analysis explained in my previous post. And I would like to know some experienced opinions about how the performance would be in the design I posted, I mean queries, insertions, updates, etc,.. in a production server.

Your proposed design sounds reasonable to me. I guess you won’t have to do UPDATEs that much. Mostly INSERTs which should be fairly straightforward and simple, so likely not to hurt performance significantly. Clustered index on an IDENTITY column should also help avoiding performance issues. As for displaying something like page count, still I’d probably add another column to the main table, as you certainly won’t query the whole historical data table to get this information. So this might be a break in normalisation rules as you would keep information redundantly, but in terms of performance the better solution. So, every time a page is called there would be an UPDATE to the main table and an INSERT to the history table. —
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Heute schon gebloggt?http://www.insidesql.de/blogs

Brilliant, simply brilliant! This is the explanation I was looking for, you are the best!! I can#%92 t say anything else because the explanation is perfect for me. Thank you very much,
Cèsar

]]>