Despite NTEXT and TEXT being deprecated in SQL Server for some time they are still both used in production systems. In this article, I will briefly demonstrate the difference between a VARCHAR (MAX),
NVARCHAR (MAX) and the NTEXT data types, and the impact on performance from using NTEXT/TEXT.

Let’s create a table named Student using the below T-SQL.

create table student

student_id int identity(1,1),
student_name varchar(max)

Let’s insert a few records into the table using the below T-SQL.

insert student(student_name)
select 'abc'

select 'pqr'

select 'xyz'

Now let’s have a look at the query execution plan:

It is obvious from this that you do not have a Clustered Index on the
table. Let’s create it using the below T-SQL:

create clustered
index clu_student_stuid on student(student_id)

Now look again at the query execution plan :

Now you can see that the
Table Scan gets converted to a Clustered Index Scan which means that the base
table is completely scanned by the Clustered Index. Now let’s try to improve
it’s performance by creating a Non Clustered Index on the column named
Student_name using the below T-SQL.

As you can see, a Non Clustered index isn’t allowed to be created on a column
using VARCHAR(MAX) as a data type. This will hurt the performance when
the there is a large volume of data inside the table and hence this is one of the major
disadvantages of using the VARCHAR(MAX) data type.

A SQL Data row has a max limited
size of 8000 bytes. Therefore a VARCHAR(MAX) which can be up to 2 GB in size cannot
be stored in the data row and so SQL Server stores it "Out of row". VARCHAR(MAX) is overkill for many types of data such as person’s names, locations, colleges etc. Therefore instead of
using VARCHAR(MAX) such such types of data, one should limit the Varchar size
to a reasonable number of characters which will greatly improve performance.

If you have variable
length Unicode string data then you can go either for an NVARCHAR or NTEXT but note the difference between these:

Let’s create a table named
Student using the below T-SQL.

create table student

student_id int,
student_name ntext

Now we can insert a few records :

insert student(student_name)

select 'abc'

select 'pqr'

select 'xyz'

Now let’s have a look at the
Statistic count using the below T-SQL:

set statistics io on
select * from student
set statistics io off

Note that
since the data type of the student_name column is NTEXT the
query optimizer is treating the data in that column as a Large Object.

Now let’s change the data type

alter table student

alter column student_name nvarchar(max)

Now, look at the
Statistics count again :

set statistics io on

select * from student

set statistics io off

Still not good as despite changing the
data type of the column, the LOB Logical Read count is still the same (ie 3).
This is because SQL Server has maintained the data in the Large Object
structure and now uses a pointer as a reference to extract the text from this
column. In order to fix this problem, execute the below T-SQL.

update student

set student_name = student_name

Now let’s have a look at
the Statistics count again:

Now the query optimizer
does not treat is as a LOB Data. Therefore, from a performance standpoint NVARCHAR data type is
always superior to NTEXT.

Please let us know if you have any comments or suggestions. Alternatively you
can email me at ..


15 Responses to “VARCHAR (MAX), NVARCHAR (MAX) and NTEXT Data Types”

  1. Good and Knowledgeable Article.. Thanks!!!

  2. Hey let loot at this from other side. Blob data always stores out side of the main data. it means it will be better solution when you hardly ever want to get blob data without filter by this data. So using of blob data are not bad solution sometime.

  3. But sometimes is good for VARCHAR(MAX) or NVARCHAR(MAX) to be treated as LOB data.
    When the list queries don’t require the field, just used on the form, having the table split into two data pages is good since it has less reads to get the data without the LOB fields.. this should also be taken in concern…

  4. But what about fragmentation? Text is a consistent size (a pointer to another location), and thus fragmentation is lower than when using Varchar, assuming you will never want to index the Text. Is this correct?

  5. I have to tell you that if I ever caught a developer making a Student_Name column using any form of blob, I’d have to chase them until their heels smoked.

    Perhaps a more appropriate article would be on the extreme advantages of “right sizing”. Just having a BLOB in a table prevents ONLINE indexing of the clustered index in the Enterprise Edition even if the column IS NOT one of the key columns.

  6. Nice one !!!

  7. Great article, nice and easy to understand for a any novice!!!

  8. Nice article.

  9. Very intuitive article, thank you!

  10. Perfect article!!! Thank you!

  11. sometimes I create nvarchar(max) columns but I index on a calculated column of the first N positions: fits many query needs while matching the size of a string in the application – if you say nvarchar(200), you can be sure you will need 205 one of these days etc

  12. Hi,

    Can you clear one point. What is the benefit of executing the below statement ?

    update student
    set student_name = student_name


  13. bPU3mJ This is one awesome blog post. Really Cool.

Software Reviews | Book Reviews | FAQs | Tips | Articles | Performance Tuning | Audit | BI | Clustering | Developer | Reporting | DBA | ASP.NET Ado | Views tips | | Developer FAQs | Replication Tips | OS Tips | Misc Tips | Index Tuning Tips | Hints Tips | High Availability Tips | Hardware Tips | ETL Tips | Components Tips | Configuration Tips | App Dev Tips | OLAP Tips | Admin Tips | Software Reviews | Error | Clustering FAQs | Performance Tuning FAQs | DBA FAQs |