Thursday, May 19, 2011

Sql user defined function to replace HTML text without disturbing HTML elements

You may find many articles on how to strip HTML in SQL. But we rarely find articles on doing replace functionality without disturbing the HTML tags in SQL. This is a very useful requirement where we store content in HTML format in the database.

When I did Google search on this I found some articles on how to strip HTML from SQL and one of those articles inspired me to write a user defined function which does replace within the text and preserves the HTML as it is.

The below is the user defined function written by me as inspired from the article SQL SERVER – 2005 – UDF – User Defined Function to Strip HTML – Parse HTML – No Regular Expression

CREATE FUNCTION [dbo].[udf_ReplaceHTMLContent]
(
@HTMLText VARCHAR(MAX),
@Find VARCHAR(MAX),
@Replace VARCHAR(MAX)
)

RETURNS VARCHAR(MAX)
AS
BEGIN

DECLARE @Start INT
DECLARE @End INT
DECLARE @Length INT
DECLARE @UpdatedHTMLText VARCHAR(MAX)

SET @UpdatedHTMLText = ''
SET @Start = CHARINDEX('<',@HTMLText)
SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1


WHILE @Start > 0
AND @End > 0
AND @Length > 0

BEGIN

SET @UpdatedHTMLText = @UpdatedHTMLText + Replace(SUBSTRING(@HTMLText, 0, @Start), @Find, @Replace)

SET @UpdatedHTMLText = @UpdatedHTMLText + SUBSTRING(@HTMLText, @Start,@Length)

SET @HTMLText = STUFF(@HTMLText,1, @Start + @Length - 1,'')

SET @Start = CHARINDEX('<',@HTMLText)

SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))

SET @Length = (@End - @Start) + 1

END


if @UpdatedHTMLText = ''

SET @UpdatedHTMLText = @UpdatedHTMLText + Replace(@HTMLText, @Find, @Replace)


RETURN LTRIM(RTRIM(@UpdatedHTMLText))

END

kick it on DotNetKicks.com