Performance Required | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Performance Required

I have a visit table. I want to know what was the path of the visit i.e; If the user is visiting Yahoo page. Then his visit suppose is 1 and he first open the home page then news, shopping, sign me up etc. etc. This sequence of visiting the pages is his path. Please see the code and I hope you#%92ll understand. The code works fine. But if there are 200, 000 rows then it gets dead and keeps on running. Please check if we can optimize the code? set nocount on
declare @visit_id int, @path_id varchar(8000), @path_dsc varchar(8000), @sr_nbr int
declare cur_path cursor
scroll
dynamic
for
select distinct visit_id
from visit
order by visit_id open cur_path
fetch first from cur_path
into @visit_id while @@fetch_status = 0
begin
select @path_id = coalesce(@path_id, ”) + cast(pageid as varchar)+’-‘,
@path_dsc = coalesce(@path_dsc+’->’, ”) + page
from PageTracking
where visit_id <> 0 and visit_id = @visit_id and page is not null and page <> ”
order by visit_id, page_hit_time asc set @path_id = left(@path_id, len(@path_id)-1) if @path_id is not null
begin
if not exists(select 1 from path where path_id = @path_id)
begin
select @sr_nbr = isnull(max(sr_nbr), 0) + 1 from path
insert into path values (@sr_nbr, @path_id, @path_dsc)
end print @visit_id; print @path_id
if not exists(select 1 from visit where visit_id = @visit_id and path_id = @path_id)
begin
update visit
set path_id = @path_id
where visit_id = @visit_id
end
end set @path_id = null
set @path_dsc = null fetch next from cur_path
into @visit_id
end
close cur_path
deallocate cur_path
Am I understanding this right, that you store the visited pages as a list in one column? If so, you should normalize your schema to get rid of first normalform violation.
Further, a cursor is some kind of T-SQL of the last resort. You should avoid it. I see here no reason to use a cursor. Next, within the cursor you do a ORDER BY. Why?
It would be helpful if you could post sample data and desired output. —
Frank Kalis
SQL Server MVP
http://www.insidesql.de

I agree with Frank’s recommendations. However, there are a few things you can try to improve the code posted. Instead of:
quote:
select @path_id = coalesce(@path_id, ”) + cast(pageid as varchar)+’-‘,
@path_dsc = coalesce(@path_dsc+’->’, ”) + page
from PageTracking
where visit_id <> 0 and visit_id = @visit_id and page is not null and page <> ”
order by visit_id, page_hit_time asc
try:
if @visit_id = 0 set @visit_id = – 1 –so PageTracking.visit_id can’t be equal to @visit_id
select
@path_id = coalesce(@path_id, ”) + cast(pageid as varchar)+’-‘,
@path_dsc = coalesce(@path_dsc+’->’, ”) + page
from PageTracking
where
visit_id = @visit_id and
–page is not null and
page > ”– page > ” will evaluate to null, not to true if page is not null
order by visit_id, page_hit_time asc
That way indexes may be used more effectivly. Instead of:
quote:
select @sr_nbr = isnull(max(sr_nbr), 0) + 1 from path
insert into path values (@sr_nbr, @path_id, @path_dsc)
define sr_nbr as identity and omit sr_nbr from insert statement.
]]>