SQL Server Performance

Index on a large text field

Discussion in 'SQL Server 2008 General DBA Questions' started by neerajmalik, Aug 30, 2011.

  1. neerajmalik New Member

    Hi there,

    We want to have a index on a nvarchar field. This contains a big text.

    Can someone please guide me to right direct to index this column?

    Thanks in advance.
  2. FrankKalis Moderator

    Without knowing more about it, Full-Text indexing might be a way to go. With the "traditional" relational indexes you are restricted to 900 bytes.
  3. neerajmalik New Member

    Thanks for the response.

    How does the storage looks like with Full Text? Do you know any article which talks about the stroage with Full Text index?
  4. Luis Martin Moderator

  5. preethi Member

    Not sure whether you were looking for a different problem like this:
    We received data from different system, where the data of varchar(1024) is used as the Primary key. The other system expected us to return data from our system when they give that large string.
    This is how we tackled it.
    We created a Hash Key of the varchar(1024) column and created an index on hash key. we search the hash key first and return the data. Hash key was not unique (There could be duplicates) but when we narrow down the search to few rows it is easier to filter the exact row.
  6. satya Moderator

    How frequently the queries are accessing on that NVARCHAR field?
  7. neerajmalik New Member

    @Satya - it will be used very frequently. This is a headline column and hourly process before loading the data in the table, will check if the headline already exist or not.
  8. neerajmalik New Member

    Hi Prethi, Can you please give me some more information about Hash Keys. This sound like a solution to me.
  9. preethi Member

    There are many ways of generating Hash keys,
    SQL Server supports CHECKSUM function where the outcome is int.

    This is how we went with it.
    Assume you have a large text to save. add another column is generated with CHECKSUM of the large string column. I made the checksum Column as the clusered idnex to make the search faster.
    As you may aware, checksum will generate the same value for always for the same input, but the reverse is not true. Two string values can generate the same checksum
    In the seach I perform the search on the checksum and them filter the second condition.
    This enables a clustered key seek (Even if you don't have a clustered index, you can have a non clustered idnex check will turn into index seek and key lookup.)
    PLease let em know if this is not clear and you need some sample code.
    Hope this helps.
  10. satya Moderator

Share This Page