insert query eats up entire disk | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

insert query eats up entire disk

I am trying to normalize a database by building a table called ActedIn. When I run this query, it runs for a while, slowly filling up my disk until all 20 gigs are used up. I cancel the query and the disk space returns to normal. I have clustered indexes built on Movies (Title), Person (fname, mname, lname), and Actors(fname, mname, lname). When I limit the transacation log size, the query raises an error after so long. I’ve run this exact same query on Actresses file with no problem, and the database size isn’t even 1 GB. Can anyone help?
insert into ActedIn (PersonID, MovieID)
select P.PersonID, M.MovieID
from Actors A
inner join Movies M on A.Title = M.Title
inner join Person P on ((A.FirstName = P.FirstName) or (A.FirstName is null)) and
((A.MiddleName = P.MiddleName) or (A.MiddleName is null)) and
((A.LastName = P.LastName) or (A.LastName is null))
Is there a possible primary key join between actors and person? firtname, middlename, lastname may not be unique? do a select count(*) from actors, etc. to get an idea of how many rows SQL finds, and make sure that it matches with your expectation…? Cheers
I think most of this comes because you dont have a direct link between person and actor. If you could do that, then it would go much faster, much, much faster. Which ‘begs the question’ what is different between actors and person? Do you really need 2 tables or could you just put a flag (or series of flags) on the person table to indicate if its an actor, actress, producor etc? Back to your query….this is why I think it takes so long….you are joining with ands and ors, and if you read through it and pull it apart you will join person to every actor where a.firstname is null and a.lastname is null and a.middlename is null – if you have even one of those records on your actor table, then every single record in your person table will match to it. Same thing goes if you have someone in your actor table entered as null, null, smith. Every Smith in your person table will join. Run this select on your actor table
select * from actors where firstname is null and middlename is null and lastname is null
and then find any where first and middle is null (and surname is not)
and then where middle and surname are null and first is not etc…. Its likely that the data in your actress table is a bit cleaner. Chris

That would make sense, because I did end up with a few more records in ActedIn than I actually had in Actresses table. How do I make a query that will still match when it has null names? For instance, I did this query insert into ActedIn (PersonID, MovieID)
select P.PersonID, M.MovieID
from Actors A
inner join Movies M on A.Title = M.Title
inner join Person P on A.FirstName = P.FirstName and
A.MiddleName = P.MiddleName and
A.LastName = P.LastName A.LastName But anyone with a null in their name didn’t end up in the table. (As for the question about the Actors and Actresses table, these are not normalized tables and will be deleted after I’m done building the appropriate tables) Thanks!
Actually, I just tested that query on 1000 rows of the actors table and it works fine. People with null middle names or null first and middle etc… match up fine. Exactly 1000 rows is inserted into ActedIn I don’t think its the query. Is there a way I can insert 1000 at a time or somthing? Any other ideas? Thanks Scott
Problem Solved! I found that Insert Into is fully logged, this was making log file huge. Used select into instead, which is minimally logged, worked fine. thanks Scott