Query Optimizer Shortcoming? | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Query Optimizer Shortcoming?

Why can’t the query optimizer figure out that a subquery is a constant value and create a good plan? For example:
1) SELECT * FROM table1 WHERE type = 7; versus
2) SELECT * FROM table1 WHERE type = (SELECT typeId FROM TypesTable WHERE TypeName = ‘Ursus horribilis’); Note: TypesTable has unique index on TypeName. More complex versions of the second query consistently produce very slow plans, where subsitituting in the constant for the subquery produces a fast plan. Why can’t the optimizer tell that "(SELECT typeId FROM TypesTable WHERE TypeName = ‘Ursus horribilis’)" evaluates to a constant, and just substitute it in? Is there a "dirty read" issue, where the subquery might change during the execution because someone updates the TypesTable table, and so the query plan allows for that? We have queries that really bog down, but if we substitute the constants for the subqueries, they fly. Oracle seems to do much better with these kinds of queries. Is this a known weakness with the SQL Server optimizer, or is there a reason the optimizer can’t evaluate the constants and make a fast plan?
Sorry, but I think you maybe have a kind of naive approach here. Of course, substituting a subquery for a constant will speed things up and produce a leaner plan. A query is *NEVER* a constant. Your knowledge about that it evaluates to a "constant", e.g. always returns the same value is based on informations that the optimizer doesn’t have when it creates the execution plan. That is, you know that this value in your lookup table won’t change, the optimizer only "knows" that after evaluating the subquery. Later on the query will surely be optimized, but you can’t tell from the query what this optimization will look like.

Frank Kalis
Moderator
Microsoft SQL Server MVP
Webmaster:http://www.insidesql.de
quote:Originally posted by FrankKalis Sorry, but I think you maybe have a kind of naive approach here. Of course, substituting a subquery for a constant will speed things up and produce a leaner plan. A query is *NEVER* a constant. Your knowledge about that it evaluates to a "constant", e.g. always returns the same value is based on informations that the optimizer doesn’t have when it creates the execution plan. That is, you know that this value in your lookup table won’t change, the optimizer only "knows" that after evaluating the subquery. Later on the query will surely be optimized, but you can’t tell from the query what this optimization will look like.

Frank Kalis
Moderator
Microsoft SQL Server MVP
Webmaster:http://www.insidesql.de

I think you misunderstand what I mean by "constant". At the time the query engine runs the query, the database is essentially a "snapshot", that is, constant for the duration of the query (consistent reads). Therefore, the subquery should evaluate to the same value for every row and/or every operation the query engine performs. The query engine should be able to evaluate the subquery once, then use that value in all the operations. The optimizer knows that the value is unique based on a unique constraint. Thus the optimizer should know that the subquery will be constant for the duration of the query. The optimizer cannot know what the constant will be, just that it is constant for the duration of the query. Therefore, the optimizer should at least create a plan where the subquery is only evaluated once. I would think the plan would be similar to a prepared query where the constant is unknown at time of preparation, but at least it is known that it is a constant. Maybe it’s not practical, but SQL Server could create a consistent snapshot of the database (using locks or logs like Oracle), determine that the subquery must return a constant (because it is based on a unique constraint), rewrite the query with the constant, create an optimized plan using statistics and knowing the constant, then run the query. From my experience, Oracle does a much better job at optimizing queries with such "constant subqueries". Just wondering if anyone has raised the issue with SQL Server developers, or if SQL Server 2005 does a better job.
Aah, I see. I indeed misinterpreted your question. Adapted to my environment you get query plans like these, right?<br /><pre id="code"><font face="courier" size="2" id="code"><br />StmtText <br />—————————— <br />SET STATISTICS PROFILE ON<br /><br />(1 row(s) affected)<br /><br />StmtText <br />———————– <br />SET SHOWPLAN_TEXT ON<br /><br />(1 row(s) affected)<br /><br />StmtText <br />————————————————– <br />SELECT * FROM Portfolio WHERE unternehmenID = 1;<br /><br />(1 row(s) affected)<br /><br />StmtText <br />———————————————————————————————————————————————– <br /> |–Bookmark Lookup(BOOKMARK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Bmk1000]), OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[FAI].[dbo].[Portfolio]))<br /> |–Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[FAI].[dbo].[Portfolio].[UC_Portfolio_hostID]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Portfolio].[unternehmenID]=Convert([@1])) ORDERED FORWARD)<br /><br />(2 row(s) affected)<br /><br />StmtText <br />—————————————————————————————————————————————— <br /><br />SELECT * FROM Portfolio WHERE unternehmenID = (SELECT unternehmenID FROM Unternehmen WHERE bezeichnung = …’);<br /><br />(1 row(s) affected)<br /><br />StmtText <br />————————————————————————————————————————————————————————— <br /> |–Bookmark Lookup(BOOKMARK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Bmk1000]), OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[FAI].[dbo].[Portfolio]))<br /> |–Nested Loops(Inner Join, OUTER REFERENCES<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Unternehmen].[unternehmenID]))<br /> |–Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[FAI].[dbo].[Unternehmen].[UC_unternehmen_bezeichnung]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Unternehmen].[bezeichnung]=’…’) ORDERED FORWARD)<br /> |–Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[FAI].[dbo].[Portfolio].[UC_Portfolio_hostID]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Portfolio].[unternehmenID]=[Unternehmen].[unternehmenID]) ORDERED FORWARD)<br /><br />(4 row(s) affected)<br /><br />StmtText <br />—————————– <br />SET STATISTICS PROFILE OFF<br /><br />(1 row(s) affected)<br /></font id="code"></pre id="code"><br />I’ve mailed some people and bookmarked this thread. I don’t know the answer off-hand.<br /><br />–<br />Frank Kalis<br />Moderator<br />Microsoft SQL Server MVP<br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
Since the subquery is not a correlated one, why don’t you use this syntax: DECLARE @type INT
SELECT @type = typeId FROM TypesTable WHERE TypeName = ‘Ursus horribilis’
SELECT * FROM table1 WHERE type = @type When you put the two in a batch, the version with the subquery in the WHERE statement takes a little over 50% in query cost for the batch if the type column is not indexed. The percentage rises to more than 60% if the column is indexed. (In my case, the main table has 20K rows, the lookup table has 200 rows). Agreed … this appears to be a considerable bug. Not sure they’ll issue a service pack for it, now that we have SQL 2005.
Seems like I was not really awake this morning. One of these days I’ll remember that graphical execution plans are read TOP to BOTTOM and RIGHT to LEFT. So, looking again at the plan for the second query:<br /><pre id="code"><font face="courier" size="2" id="code"><br />StmtText <br />————————————————————————————————————————————————————————— <br /> |–Bookmark Lookup(BOOKMARK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Bmk1000]), OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[FAI].[dbo].[Portfolio]))<br /> |–Nested Loops(Inner Join, OUTER REFERENCES<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Unternehmen].[unternehmenID]))<br /> |–Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[FAI].[dbo].[Unternehmen].[UC_unternehmen_bezeichnung]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Unternehmen].[bezeichnung]=’…’) ORDERED FORWARD)<br /> |–Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[FAI].[dbo].[Portfolio].[UC_Portfolio_hostID]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Portfolio].[unternehmenID]=[Unternehmen].[unternehmenID]) ORDERED FORWARD)<br /><br />(4 row(s) affected)<br /></font id="code"></pre id="code"><br />you can see that firstly the seek on the "Unternehmen" table is done as this is the input parameter for the query against the "Portfolio" table. Now, a Nested Loop executes for each row in the outer table some operations against the inner table. Since the outer table in my case is "Unternehmen" the query against "Unternehmen" is executed exactly once. <br /><br />The plan you get from your query<br /><pre id="code"><font face="courier" size="2" id="code"><br />SELECT * FROM table1 WHERE type = (SELECT typeId FROM TypesTable WHERE TypeName = ‘Ursus horribilis’);<br /></font id="code"></pre id="code"><br />should look very similar to mine. If not, please post the plan here. <br /><br />Also, you might want to check this _very_ interesting blog: <br /<a target="_blank" href=http://blogs.msdn.com/craigfr>http://blogs.msdn.com/craigfr</a> a<br />nd there especially:<br /<a target="_blank" href=http://blogs.msdn.com/craigfr/archive/2006/07/26/679319.aspx>http://blogs.msdn.com/craigfr/archive/2006/07/26/679319.aspx</a><br />–<br />Frank Kalis<br />Moderator<br />Microsoft SQL Server MVP<br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
Thanks for looking at this. I also confess that my knowledge of SQL Server is spotty at best since I divide my time between SQL Server, Oracle, Java, JSP, JavaScript, HTML, CSS, XML — typical spread-thin developer. So I appreciate your examples of how to display text plans, etc.<br /><br />Below are the two cases from our database. <br /><br />Note: we have many indexes on the CodingData table on many combinations of fields. That may be part of the effect.<br /><br />I wish these were simpler tables to discuss. I included the sp_help info for each table if anyone cares to dig deeper.<br /><br />The case with the hard-coded constant (no subquery) runs much faster.<br /><br />=============================== Case 1 ==========================<br /><br />SELECT cd.ObjectId, cd.ObjectType, cd.DataType, cd.CodingMasterId, cd.NumberVal, cd.String<br />FROM CodingData cd<br />WHERE cd.ObjectType = ‘Box'<br />AND cd.CodingMasterId = 20047<br /><br />(1 row(s) affected)<br /><br />StmtText <br />—————————————————————————————————————————————————————————————– <br /> |–Clustered Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Image2000].[dbo].[CodingData].[PK_CodingData] AS [cd]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[cd].[ObjectType]=’Box’), WHERE<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[cd].[CodingMasterId]=Convert([@2])) ORDERED FORWARD)<br /><br />(1 row(s) affected)<br /><br />=============================== Case 2 ==========================<br /><br />SELECT cd.ObjectId, cd.ObjectType, cd.DataType, cd.CodingMasterId, cd.NumberVal, cd.String<br />FROM CodingData cd<br />WHERE cd.ObjectType = ‘Box'<br />AND cd.CodingMasterId = (SELECT Id FROM CodingMaster WHERE DisplayName = ‘Department Number’)<br /><br />(1 row(s) affected)<br /><br />StmtText <br />——————————————————————————————————————————————————————— <br /> |–Nested Loops(Inner Join, WHERE<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[CodingMaster].[Id]=[cd].[CodingMasterId]))<br /> |–Clustered Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Image2000].[dbo].[CodingMaster].[PK_CODINGMASTER]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[CodingMaster].[DisplayName]=’Department Number’) ORDERED FORWARD)<br /> |–Clustered Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Image2000].[dbo].[CodingData].[PK_CodingData] AS [cd]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[cd].[ObjectType]=’Box’) ORDERED FORWARD)<br /><br />(3 row(s) affected)<br /><br />==============================================================<br /><br />========================= CodingData Table ===================<br />Name Owner Type Created_datetime <br />——————————————————————————————————————————– ——————————————————————————————————————————– ——————————- ————————— <br />CodingData dbo user table 2005-01-20 21:45:31.580<br /><br /> <br /> <br />Column_name Type Computed Length Prec Scale Nullable TrimTrailingBlanks FixedLenNullInSource Collation <br />——————————————————————————————————————————– ——————————————————————————————————————————– ———————————– ———– —– —– ———————————– ———————————– ———————————– ——————————————————————————————————————————– <br />Timestamp timestamp no 8 no (n/a) (n/a) NULL<br />ObjectType varchar no 20 no no no SQL_Latin1_General_CP1_CS_AS<br />ObjectId int no 4 10 0 no (n/a) (n/a) NULL<br />CodingMasterId int no 4 10 0 no (n/a) (n/a) NULL<br />DataType tinyint no 1 3 0 no (n/a) (n/a) NULL<br />DateVal datetime no 8 yes (n/a) (n/a) NULL<br />NumberVal decimal no 9 19 4 yes (n/a) (n/a) NULL<br />String nvarchar no 894 yes (n/a) (n/a) SQL_Latin1_General_CP1_CS_AS<br />Signature text no 16 yes (n/a) (n/a) SQL_Latin1_General_CP1_CS_AS<br />LinkId int no 4 10 0 yes (n/a) (n/a) NULL<br />LinkType varchar no 20 yes no no SQL_Latin1_General_CP1_CS_AS<br />LinkAuxId int no 4 10 0 yes (n/a) (n/a) NULL<br />StringSearch int yes 4 10 0 yes (n/a) (n/a) NULL<br /><br /> <br />Identity Seed Increment Not For Replication <br />——————————————————————————————————————————– —————————————- —————————————- ——————- <br />No identity column defined. NULL NULL NULL<br /><br /> <br />RowGuidCol <br />——————————————————————————————————————————– <br />No rowguidcol column defined.<br /><br /> <br />Data_located_on_filegroup <br />——————————————————————————————————————————– <br />PRIMARY<br /><br /> <br />index_name index_description index_keys <br />——————————————————————————————————————————– —————————————————————————————————————————————————————————————————————— —————————————————————————————————————————————————————————————————————————————————————- <br />IX_CodingData_nval_cmid_dtyp nonclustered located on PRIMARY NumberVal, CodingMasterId, DataType<br />IX_CodingData_oid_otype_cmid nonclustered, unique located on PRIMARY ObjectId, ObjectType, CodingMasterId<br />IX_CodingData_str_cmid_dtyp nonclustered located on PRIMARY String, CodingMasterId, DataType<br />PK_CodingData clustered, unique, primary k ey located on PRIMARY ObjectType, ObjectId, CodingMasterId<br /><br /> <br />constraint_type constraint_name delete_action update_action status_enabled status_for_replication constraint_keys <br />————————————————————————————————————————————————– ——————————————————————————————————————————– ————- ————- ————– ———————- —————————————————————————————————————————————————————————————————————————————————————- <br />PRIMARY KEY (clustered) PK_CodingData (n/a) (n/a) (n/a) (n/a) ObjectType, ObjectId, CodingMasterId<br /><br /> <br />No foreign keys reference this table.<br />Table is referenced by views <br />——————————————————————————————————————————– <br /><br />============================ CodingMaster Table ===================<br />Name Owner Type Created_datetime <br />——————————————————————————————————————————– ——————————————————————————————————————————– ——————————- ————————— <br />CodingMaster dbo user table 2006-09-28 10:01:39.230<br /><br /> <br /> <br />Column_name Type Computed Length Prec Scale Nullable TrimTrailingBlanks FixedLenNullInSource Collation <br />——————————————————————————————————————————– ——————————————————————————————————————————– ———————————– ———– —– —– ———————————– ———————————– ———————————– ——————————————————————————————————————————– <br />Timestamp timestamp no 8 no (n/a) (n/a) NULL<br />Id int no 4 10 0 no (n/a) (n/a) NULL<br />DisplayName nvarchar no 100 no (n/a) (n/a) SQL_Latin1_General_CP1_CS_AS<br />PropertyName nvarchar no 100 no (n/a) (n/a) SQL_Latin1_General_CP1_CS_AS<br />HelpText nvarchar no 200 yes (n/a) (n/a) SQL_Latin1_General_CP1_CS_AS<br />DataType smallint no 2 5 0 no (n/a) (n/a) NULL<br />DisplayType smallint no 2 5 0 no (n/a) (n/a) NULL<br />PermanentVal bit no 1 no (n/a) (n/a) NULL<br />Mandatory bit no 1 no (n/a) (n/a) NULL<br />DisplayLines tinyint no 1 3 0 no (n/a) (n/a) NULL<br />Mask nvarchar no 100 yes (n/a) (n/a) SQL_Latin1_General_CP1_CS_AS<br />Indexable bit no 1 no (n/a) (n/a) NULL<br />Visible bit no 1 no (n/a) (n/a) NULL<br />Required bit no 1 no (n/a) (n/a) NULL<br />UniqueVal bit no 1 no (n/a) (n/a) NULL<br />LowValue nvarchar no 100 yes (n/a) (n/a) SQL_Latin1_General_CP1_CS_AS<br />HighValue nvarchar no 100 yes (n/a) (n/a) SQL_Latin1_General_CP1_CS_AS<br />ReadOnly bit no 1 yes (n/a) (n/a) NULL<br />RenditionCommon bit no 1 yes (n/a) (n/a) NULL<br /><br /> <br />Identity Seed Increment Not For Replication <br />——————————————————————————————————————————– —————————————- —————————————- ——————- <br />No identity column defined. NULL NULL NULL<br /><br /> <br />RowGuidCol <br />——————————————————————————————————————————– <br />No rowguidcol column defined.<br /><br /> <br />Data_located_on_filegroup <br />——————————————————————————————————————————– <br />PRIMARY<br /><br /> <br />index_name index_description index_keys <br />——————————————————————————————————————————– —————————————————————————————————————————————————————————————————————— —————————————————————————————————————————————————————————————————————————————————————- <br />IX_CodingMaster_1 nonclustered, unique, unique key located on PRIMARY Id<br />IX_CodingMaster_2 nonclustered, unique, unique key located on PRIMARY PropertyName<br />PK_CODINGMASTER clustered, unique, primary key located on PRIMARY DisplayName<br /><br /> <br />constraint_type constraint_name delete_action update_action status_enabled status_for_replication constraint_keys <br />————————————————————————————————————————————————– ——————————————————————————————————————————– ————- ————- ————– ———————- —————————————————————————————————————————————————————————————————————————————————————- <br />DEFAULT on column DisplayLines DF_CodingMaster_DisplayLines (n/a) (n/a) (n/a) (n/a) (1)<br />DEFAULT on column HelpText DF_CodingMaster_HelpText (n/a) (n/a) (n/a) (n/a) (”)<br />DEFAULT on column Indexable DF_CodingMaster_Indexable (n/a) (n/a) (n/a) (n/a) (0)<br />DEFAULT on column Mandatory DF_CodingMaster_Mandatory (n/a) (n/a) (n/a) (n/a) (0)<br />DEFAULT on column Mask DF_CodingMaster_Mask (n/a) (n/a) (n/a) (n/a) (”)<br />DEFAULT on column PermanentVal DF_CodingMaster_PermanentVal (n/a) (n/a) (n/a) (n/a) (0)<br />DEFAULT on column UniqueVal DF_CodingMaster_Unique (n/a) (n/a) (n/a) (n/a) (0)<br />UNIQUE (non-clustered) IX_CodingMaster_1 (n/a) (n/a) (n/a) (n/a) Id<br />UNIQUE (non-clustered) IX_CodingMaster_2 (n/a) (n/a) (n/a) (n/a) PropertyName<br />PRIMARY KEY (clustered) PK_CODINGMASTER (n/a) (n/a) (n/a) (n/a) DisplayName<br /><br /> <br />No foreign keys reference this table.<br />No views with schema binding reference this table.<br /><br />
quote:Originally posted by Adriaan Since the subquery is not a correlated one, why don’t you use this syntax: DECLARE @type INT
SELECT @type = typeId FROM TypesTable WHERE TypeName = ‘Ursus horribilis’
SELECT * FROM table1 WHERE type = @type When you put the two in a batch, the version with the subquery in the WHERE statement takes a little over 50% in query cost for the batch if the type column is not indexed. The percentage rises to more than 60% if the column is indexed. (In my case, the main table has 20K rows, the lookup table has 200 rows). Agreed … this appears to be a considerable bug. Not sure they’ll issue a service pack for it, now that we have SQL 2005.

That’s a very useful suggestion, thanks for the example, I’m going to use it. Our real-life queries are more complicated and usually have several of these "subquery constants", so there would be several pre-queries to get those, but worth it performance-wise. In fact, we essentially do that now with our prepared queries, query all the constants first, and supply them as parameters. My use for the subqueries is usually when we have to create ad hoc queries to do things using SQL Analyzer or Oracle’s SQL*Plus. It is always preferable to have queries that don’t use procedural code so we have a better chance of running them on both platforms, and less for me to remember and maintain. It would be nice if the subquery versions ran in comparable time to the hard-coded constant versions. On SQL Server (with more complicated queries), the subquery versions can take 10 minutes vs. 10 seconds. It just seems like something the optimizer is missing, but maybe there is a deep reason it has to be that way (although my impression is that the Oracle optimizer does better – I’ll need to verify that since we don’t have identical databases).

The plan in your example is as good as it can get. <br /><br />|–Clustered Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Image2000].[dbo].[CodingMaster].[PK_CODINGMASTER]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[CodingMaster].[DisplayName]=’Department Number’) ORDERED FORWARD)<br />|–Clustered Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Image2000].[dbo].[CodingData].[PK_CodingData] AS [cd]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[cd].[ObjectType]=’Box’) ORDERED FORWARD)<br /><br />SQL Server is doing the CI seek to get the scalar value from CodingMaster, which is the outer table and uses that value to seek into the inner table CodingData.<br /><br />I’ve noticed your use of NVARCHAR. You really need UNICODE? Also, you have verified that your indexes aren’t that fragmented? If you don’t know that already, here’s an interesting white paper on that subject:<br /<a target="_blank" href=http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ss2kidbp.mspx>http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ss2kidbp.mspx</a><br /><br />–<br />Frank Kalis<br />Moderator<br />Microsoft SQL Server MVP<br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
Like Frank suggested, if your parameters are NVARCHAR, and your using them to compare against VARCHAR columns, then there is a performance loss because SQL needs to convert. The conversion also means that if the column is covered by an index, that the index probably cannot be used.
quote:Originally posted by Adriaan Like Frank suggested, if your parameters are NVARCHAR, and your using them to compare against VARCHAR columns, then there is a performance loss because SQL needs to convert. The conversion also means that if the column is covered by an index, that the index probably cannot be used.

Virtually all our table columns that hold user viewable data are NVARCHAR. NVARCHAR decision was our best guess, but a difficult one because it was very hard to understand the low-level impact for both Oracle and SQL Server. We decided that all user input or user-viewable fields would be unicode so we wouldn’t have to tell customers or prospective customers "we can’t do that". Our product is a commercial Records Management System, so it has to be very flexible. There are also many standards, such as DOD, that we strive to fulfill. Most of our queries are run via JDBC, so we really don’t have control over what type of string the parameters are. Java strings are basically unicode. I’m not sure where the conversion would take place if the column datatype is not unicode. Hopefully the JDBC driver would convert the string to match the table column, so that SQL Server would not have to do it. On the other hand, for the case we are looking at here, the query engine / optimizer should not have to do much conversion. Hopefully it would be smart enough to evaluate the subquery once, convert the value to the same type as the other column it is joining with, and then use whatever indexes are available as necessary. I would like to learn more about how NVARCHAR would affect performance, exactly what cases are a problem, so we can take whatever steps we can to minimize the problems.
<blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by FrankKalis</i><br /><br />The plan in your example is as good as it can get. <br /><br />|–Clustered Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Image2000].[dbo].[CodingMaster].[PK_CODINGMASTER]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[CodingMaster].[DisplayName]=’Department Number’) ORDERED FORWARD)<br />|–Clustered Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Image2000].[dbo].[CodingData].[PK_CodingData] AS [cd]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[cd].[ObjectType]=’Box’) ORDERED FORWARD)<br /><br />SQL Server is doing the CI seek to get the scalar value from CodingMaster, which is the outer table and uses that value to seek into the inner table CodingData.<br /><br />I’ve noticed your use of NVARCHAR. You really need UNICODE? Also, you have verified that your indexes aren’t that fragmented? If you don’t know that already, here’s an interesting white paper on that subject:<br /<a target="_blank" href=http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ss2kidbp.mspx>http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ss2kidbp.mspx</a><br /><br />–<br />Frank Kalis<br />Moderator<br />Microsoft SQL Server MVP<br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br /><br />I suppose it is, although I see one plan is a "join" and the other plan is a "seek with a where clause". Could not SQL Server convert the "join" to a "seek with where clause"?<br /><br />I changed the queries to select a "COUNT(*)" so I could test performance.<br /><br />The query with the hard-coded constant ran in 4 seconds. The query with the subquery ran in 16 seconds. I ran them repeatedly to get stable times.<br /><br />Here are the query plans:<br /><br />================================ <br />SELECT COUNT(*)<br />FROM CodingData cd<br />WHERE cd.ObjectType = ‘Box'<br />AND cd.CodingMasterId = 20047<br /><br />—————————————————————————————— |–Compute Scalar(DEFINE<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Expr1001]=Convert([Expr1004])))<br /> |–Stream Aggregate(DEFINE<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Expr1004]=Count(*)))<br /> |–Index Scan(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Image2000].[dbo].[CodingData].[IX_CodingData_oid_otype_cmid] AS [cd]), WHERE<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[cd].[CodingMasterId]=20047 AND [cd].[ObjectType]=’Box’))<br /><br />(runs in 4 seconds)<br /><br />=======================================================================<br />SELECT COUNT(*)<br />FROM CodingData cd<br />WHERE cd.ObjectType = ‘Box'<br />AND cd.CodingMasterId = (SELECT Id FROM CodingMaster WHERE DisplayName = ‘Department Number’)<br /><br /> <br />————————————————————————————- <br /> |–Compute Scalar(DEFINE<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Expr1004]=Convert([Expr1007])))<br /> |–Stream Aggregate(DEFINE<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Expr1007]=Count(*)))<br /> |–Nested Loops(Inner Join, WHERE<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[CodingMaster].[Id]=[cd].[CodingMasterId]))<br /> |–Clustered Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Image2000].[dbo].[CodingMaster].[PK_CODINGMASTER]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[CodingMaster].[DisplayName]=’Department Number’) ORDERED FORWARD)<br /> |–Index Scan(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Image2000].[dbo].[CodingData].[IX_CodingData_oid_otype_cmid] AS [cd]), WHERE<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[cd].[ObjectType]=’Box’))<br /><br />(runs in 16 seconds)
]]>