Has it really been over a week since last I blogged? Oh well ... I've been busy both at work and at home.
Something I've been working on of late is a better routine to split a string into individual words in T-SQL. You might remember my previous effort, which works really well. This one works even better (although it may be a little slower), because it allows for any number of "delimiter" characters (in the @delims variable).
I know a few of my readers are absolute SQL guns, so if there's something in here that doesn't look right, or could work better, feel free to call me on it in the comments section.
Update! It turns out that I need to move the hyphen character in @delims to the end of the string - otherwise it's treating it as a range operator, and matching everything between the characters on either side of it!
CREATE FUNCTION SplitWords(@text varchar(8000))RETURNS @words TABLE (pos smallint,value varchar(8000))ASBEGINSET @text = RTRIM(@text)DECLARE @delims varchar(10)SET @delims = ' ,:/-'DECLARE @pos smallint, @i smallint, @s varchar(8000)SET @pos = 1WHILE @pos < LEN(@text)AND CHARINDEX(SUBSTRING(@text, @pos, 1), @delims) > 0SET @pos = @pos + 1WHILE @pos <= LEN(@text)BEGINSET @i = PATINDEX('%[' + @delims + ']%',SUBSTRING(@text, @pos, len(@text) - @pos + 1))IF @i > 0BEGINSET @i = @i + @pos - 1IF @i > @posBEGIN-- @i now holds the earliest delimiter in the stringSET @s = SUBSTRING(@text, @pos, @i - @pos)INSERT INTO @wordsVALUES (@pos, @s)ENDSET @pos = @i + 1WHILE @pos < LEN(@text)AND CHARINDEX(SUBSTRING(@text, @pos, 1), @delims) > 0SET @pos = @pos + 1ENDELSEBEGINSET @s = SUBSTRING(@text, @pos, LEN(@text) - @pos + 1)INSERT INTO @wordsVALUES (@pos, @s) SET @pos = LEN(@text) + 1ENDEND
-- remove common words that we don't want to search forDELETE FROM @wordsWHERE value IN ('an', 'the', 'of', 'and', '&')RETURNEND