Monday, March 12, 2012

Middle Search Support

Some of the databases that I inherited contain search that are based on finding a string anywhere within a last name such as:

WHERE lastName like '%smith%'

It is desired that each of these names be returned:

Smith
Smithson
Nesmith

What is presently done is that updates to the last name fields trigger that substrings of the last name be sent off into a substring table wtih retention of no 2-char substrings. For these three last names the following would be kept:

(1) Smith, (2) mith, (3) ith and (4) th
(1) Smithson, (2) mithson, (3) ithson, ..., (n) on
(1) Nesmith, (2) esmith, (3) smith, ... (n) th

The where now becomes

WHERE lastNameSub like 'smith%'

This seems to make the search routine by last name faster. I would like to improve on this if I can. Suggestions?

Dave

Hello Dave,

I'm afraid you can't within the core engine alone. Obviously a good indexing strategy will help you, but if we're talking about improving the substring/like search on a character column, apart from indexing, you don't have many options.

As you've discovered, using 'smith%' is much faster then '%smith%' as the optimiser has the opportunity to perform an index scan in the first instance - it does not in the second.

If you are performing a lot of these types of searches, I would look at full-text search:

http://www.databasejournal.com/features/mssql/article.php/3441981

http://www.sql-server-performance.com/tb_search_optimization.asp

Cheers,

Rob

No comments:

Post a Comment