T-SQL to Split a varchar into Words
I had to knock this up for Comicster's web services, and I thought it might come in handy for someone else out there.
It's a user-defined-function to split a given string into 'words'. In this case I'm delimiting words using commas and spaces. So 'Smith, Fred' (with a comma AND a space) will be separated into 'Smith' and 'Fred'.
For convenience, it returns a table with a 'pos' column, representing the position in the original text that the word was found.
Hope this helps someone out there.
CREATE FUNCTION SplitWords(@text varchar(8000)) RETURNS @words TABLE ( pos smallint primary key, value varchar(8000) ) AS BEGIN DECLARE @pos smallint, @i smallint, @j smallint, @s varchar(8000) SET @pos = 1 WHILE @pos <= LEN(@text) BEGIN SET @i = CHARINDEX(' ', @text, @pos) SET @j = CHARINDEX(',', @text, @pos) IF @i > 0 OR @j > 0 BEGIN IF @i = 0 OR (@j > 0 AND @j < @i) SET @i = @j IF @i > @pos BEGIN -- @i now holds the earliest delimiter in the string SET @s = SUBSTRING(@text, @pos, @i - @pos) INSERT INTO @words VALUES (@pos, @s) END SET @pos = @i + 1 WHILE @pos < LEN(@text) AND SUBSTRING(@text, @pos, 1) IN (' ', ',') SET @pos = @pos + 1 END ELSE BEGIN INSERT INTO @words VALUES (@pos, SUBSTRING(@text, @pos, LEN(@text) - @pos + 1)) SET @pos = LEN(@text) + 1 END END RETURN END
# Trackback from Splitting Text Into Words in SQL Revisited on 4/09/2008 5:03 PM
Comments
# Keisuke
6/07/2006 4:39 PM
Thanks, Matt!
The code works perfectly in my Japanese character environment :-)
# mabster
6/07/2006 5:28 PM
Great to hear! Thanks for the comment, Keisuke.
# ckam
25/07/2006 6:45 AM
Dude, seriously, you have no idea how much trouble you saved me. I thank you.
THANK YOU
# Srikanth
10/10/2006 9:36 PM
Dude, this is a real help for me. I have to say 1000 thanks for such a nice one.
# nubcake
6/12/2006 6:15 AM
Thanks mate, now I don't need to write it :]
# Jennifer
6/12/2006 6:56 AM
Thanks very much for the post! This really saved me some work and trouble.
# purplekitten
1/02/2007 10:29 PM
Thank you. Saved me writing it. I always like to check before I go reinventing the wheel :)
And a very good wheel it is too.
# Charlie
26/07/2007 1:03 PM
Thanks very much. For my purpose, I replaced 'pos' with 'uid' and made it an identity(1,1) column. Thus, I could do a JOIN with another table, table function, etc. Thanks again!
# GUNES OZIS
26/11/2007 6:00 AM
ITS WORKING PERFECT THANX
# Don
7/12/2007 3:26 AM
This is a great function. Thanks.
But can we split the words inside a column?
# mabster
7/12/2007 7:36 AM
How do you mean, Don? I can pass any column as the parameter to this function (provided it's of a text-based column type like char or varchar).
# Don
11/12/2007 5:39 AM
Matt,
Yes. I like to pass a column in as a parameter and split whatever inside the column into single words separated by space. What I am trying to do is to compare each single word from @input parameter the user has input in, compare to each single word from a certain column. Please provide me sample syntax if you have the solution. Thanks Matt!
# mabster
11/12/2007 7:33 AM
I haven't needed this UDF for quite some time, but from memory it's something like:
SELECT *
FROM MyTable t
CROSS APPLY dbo.SplitWords(t.MyColumn) w ON w.value = @word
WHERE dbo
(Where @word is the word you're searching for.)
# Don
12/12/2007 1:46 AM
Matt,
I am using sql server 2000, it does not have function 'Apply'. And from the articles from net, it says we can NOT pass a row into UDF in sql server 2000, I don't know there is the other way to do it. My query is something like this:
select *
from table1 t1
where dbo.splitwords(t1.column1)
like dbo.splitwords(@userinput)
(where @userinput is a input parameter)
I think dbo.splitwords(t1.column1) does not work in sql server 2000.
I hope you have solution.
Thanks
# mabster
12/12/2007 7:23 AM
SELECT *
FROM dbo.SplitWords(@userinput) w
JOIN table1 t ON dbo.SplitWords(t.column1) = w.value
Does that work?
# Don
12/12/2007 8:49 AM
No Matt
it show the error where dbo.splitwords(t.column1)
I don't think this UDF take a column as input parameter in sql server 2000.
Anyway, thanks for great function you post.
# Lee
27/12/2007 6:35 AM
Don,
Would assigning the results of the column to a variable and passing the variable in to the UDF work?
# Klas
17/01/2008 1:46 AM
Great man! Thanks!
# dbatten
18/01/2008 5:53 AM
Genius!! Thanks so much.
# Anindya
2/04/2008 4:27 AM
Thanks Matt! Thanks a lot.Its much better than the CHARINDEX stuff I had been using :)
# mabster
2/04/2008 11:42 AM
No problem! Don't forget to check the follow-up post:
www.madprops.org/.../Splitting-Text-Into-Words-in-SQL-Revisited.aspx
# scott in a pinch
23/04/2008 2:29 AM
i hope you dont mind that I changed your function a little. changed the signature to spruced up the CHARINDEX search.
GREAT FUNCTION!!!Saved me a lot of time.
CREATE FUNCTION SplitValues(@text varchar(8000),@Delimteter VARCHAR(2))
RETURNS @words TABLE (pos smallint primary key, value varchar(8000))
AS
BEGIN
DECLARE @pos smallint,
@i smallint,
@j smallint,
@s varchar(8000)
SET @pos = 1
WHILE @pos <= LEN(@text)
BEGIN
SET @i = CHARINDEX(@Delimteter, @text, @pos)
SET @j = CHARINDEX(@Delimteter, @text, @pos)
# Shinobi
1/05/2008 5:48 PM
Yesss, EXACTLY what I was looking for. Dude, you rock! This just saved me alot of time. Thank God for Google...and cool guys like you = ).
# zitun
7/05/2008 1:19 AM
Here is something without the function :
--This is my string
select 'my_string_stuff_b'
-- I want "string_stuff"
select SUBSTRING('my_string_stuff_b', CHARINDEX('_', 'my_string_stuff_b') + 1,(CHARINDEX('_','my_string_stuff_b',LEN('my_string_stuff_b')-3) - (CHARINDEX('_', 'my_string_stuff_b') + 1)) )
# Matze
21/06/2008 8:32 AM
hey dude - thank you very very much. works perfect easy and fast.
a+
# Deepthi
28/08/2008 9:37 PM
Awesome.. Countless thanks to you
# pavan
14/10/2008 11:37 PM
Thanks very much for the post! This saved me from trouble.
# Nick
19/11/2008 9:16 PM
Thanks, Matt. Works for 100%. God Bless You and Google :)
# SS
3/03/2009 4:26 AM
Thanks Matt. The function works fine but in my case, the 'words' have spaces in them and the function does not work properly. So if the string is like this:
John Smith, John Doe, Jane Doe
The result is:
John
Smith
John
Doe
Jane
Doe
I want:
John Smith
John Doe
Jane Doe
# mabster
3/03/2009 7:17 AM
SS,
See my follow-up post for a more-flexible version which can ignore spaces.
www.madprops.org/.../splitting-text-
# Mathieu L.
6/03/2009 1:54 AM
This is awesome !!!!
I can now pass my list of orders to be added to a list using a stored procedure... you function works perfectly !
THANKS :)
# Fernando Faria
21/07/2009 11:03 PM
Its working perfect. Very good.
# Ricardo
29/07/2009 9:43 PM
Hi. Im trying this:
SELECT [SplitWords](REPLACE(CAST(Observacoes as varchar(500)) ,char(13)+char(10),',')) From DocFinanceirosVendaCabecalho WHERE ID=1
but it gives me an error:
Msg 102, Level 15, State 1, Line 15
Incorrect syntax near 'REPLACE'.
Can you help me? thanks
# mabster
29/07/2009 10:10 PM
Ricardo,
It's a table-valued function. It returns a table. So select FROM it, don't just select it:
SELECT * FROM dbo.SplitWords(...)
# Fabian Fernandez
3/10/2009 6:23 AM
Thanks a lot Matt u reaaaaaaally help me today!
# Ricardo Passians
29/10/2009 9:47 AM
Another version:
CREATE FUNCTION xSplitWords(@text varchar(8000))
RETURNS @words TABLE (
pos smallint primary key,
value varchar(8000)
)
AS
BEGIN
DECLARE @s varchar(8000), @PosSep INT, @PrevPosSep INT, @word varchar(60)
SELECT @PosSep=1, @PrevPosSep=0, @text=REPLACE(rtrim(@text),' ',',')+','
while (1=1)
begin
set @PosSep=CHARINDEX(',', @text, @PrevPosSep+1)
IF @PosSep=0 BREAK
set @word=SUBSTRING(@text, @PrevPosSep+1, @PosSep-@PrevPosSep-1)
IF @word<>''
INSERT INTO @words
VALUES (@PrevPosSep+1, @word)
set @PrevPosSep=@PosSep
end
RETURN
END
# Paul Taylor
30/01/2010 8:24 PM
Great function, exactly the jumping off point I needed to work from.
Here is my variation, which returns a comparable list of Soundex values for the words found (tossing out invalid words (SOUNDEX='0000')). Adding it to an AFTER INSERT/UPDATE pair of triggers, I can then do the following:
Select *
from Parts
where (DescriptionSdx like '%' + SOUNDEX('HYDRAULICK') + '%'
or DescriptionSdx like '%' + SOUNDEX('HYD') + '%')
AND DescriptionSdx like '%' + SOUNDEX('PUMP') + '%'
Which returns hits on 'HYRAULIC PUMP', 'HYD PUMP', 'HYDROLIC', 'PUMP, HYD KIT', etc.
CREATE FUNCTION [dbo].[SoundexWords](@text varchar(8000))
RETURNS varchar(8000)
AS
BEGIN
DECLARE
@pos smallint,
@i smallint,
@j smallint,
@s varchar(8000),
@SoundexWords varchar(8000),
@WordSoundex char(4)
SET @pos = 1
SET @SoundexWords = ''
WHILE @pos <= LEN(@text)
BEGIN
SET @i = CHARINDEX(' ', @text, @pos)
SET @j = CHARINDEX(',', @text, @pos)
IF @i > 0 OR @j > 0
BEGIN
IF @i = 0 OR (@j > 0 AND @j < @i)
SET @i = @j
IF @i > @pos
BEGIN
-- @i now holds the earliest delimiter in the string
SET @s = SUBSTRING(@text, @pos, @i - @pos)
SET @WordSoundex = SOUNDEX(@s)
IF @WordSoundex <> '0000'
BEGIN
IF @SoundexWords <> ''
BEGIN
SET @SoundexWords = @SoundexWords + ' '
END
SET @SoundexWords = @SoundexWords + @WordSoundex
END
END
SET @pos = @i + 1
WHILE @pos < LEN(@text)
AND SUBSTRING(@text, @pos, 1) IN (' ', ',')
SET @pos = @pos + 1
END
ELSE
BEGIN
SET @WordSoundex = SOUNDEX(SUBSTRING(@text, @pos, LEN(@text) - @pos + 1))
IF @WordSoundex <> '0000'
BEGIN
IF @SoundexWords <> ''
BEGIN
SET @SoundexWords = @SoundexWords + ' '
END
SET @SoundexWords = @SoundexWords + @WordSoundex
END
END
END
RETURN @SoundexWords
END
GO
# gpwin59
4/06/2010 2:53 AM
hi Matt,
just wanted to say thanks for the function it's very useful
All the best
# Sean
2/03/2011 9:58 PM
Neat. One little tweak - you should be able to use PATINDEX('%[, ]%', @text) to pick out multiple delimiters without being explicit. Should simplify the logic a bit...
# Matt Beck
19/04/2011 2:52 AM
Awesome. It's 2011 and I'm still using your function. LOL
# Tove
30/09/2012 3:54 PM
Thank you. Just made use of this today!
# Amit
12/04/2013 5:34 PM
Hi Matt,
This function works perfectly. Thank you.
I had an additional requirement where there could be a space between two delimiters or a blank value, something like this (, , or ,,) and I need ti retain these as field values. Can you please suggest something?
Best regards.
# Fred J.
4/01/2014 10:55 PM
Matt,
This was the most immediately used and perfect T-SQL code chunks from the web. Hey, if I have the time, I love to code my own, but your fn rocks!!
Thanks Much!
Fred J.
Director of Technical Services
Madison, WI
# Spartagen xt
20/11/2015 9:00 PM
A month isn't desire enough for elevated testosterone levels to bonk an essence on yobbo ontogenesis and development.D-AA has been plant to cater raised natality and testosterone when supplemented by infertile men, but it has no validness on athletes and people with rule testosterone levels.http://www.healthcaresups.com/spartagen-xt-exposed/
# show cake
30/11/2015 10:57 PM
Help anti-UV
Yes, we know it's summer or on vacation at the beach, but the harmful rays of the skin at any time, not just during the summer but also in winter. Until the first rays of the spring force can be fatal for your skin literally. It is to protect the skin from UV rays. When you go to the mountains, be sure to apply a thin layer daily under makeup.
fun-flicks.com/.../pure-phytoceram
# SQLBlog
26/04/2016 4:59 PM
Pretty old post but still works! Thanks for sharing!
# AD
17/10/2017 1:55 AM
Thanks a lot, that worked an helped!