Programming Tricks & Hacks

Strip All HTML Tags from a String by Regular Expression

Posted by kinjanshah on September 24, 2008

In one of my features, I needed to strip all HTML tags from a string before showing it to the user, so I used a regular expression. An example follows.

Procedure:

  1. Match every HTML tag with the pattern <(.|\n)*?>
  2. Replace each match with an empty string and return the result.

Example 1:

' Goes at the top of the file.
Imports System.Text.RegularExpressions

Private Function StripHTML(ByVal htmlString As String) As String
    ' "<(.|\n)*?>" matches a whole tag: the "<", everything inside it, and the closing ">".
    ' "(.|\n)" matches any character, including a newline.
    ' "*?" means zero or more occurrences, matched non-greedily, so the match
    ' stops at the first ">" it finds and not at the last one.
    Dim Pattern As String = "<(.|\n)*?>"

    ' Replace every match with an empty string and return the result.
    Return Regex.Replace(htmlString, Pattern, String.Empty)
End Function

Example 2:

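The same replacement can also be written in one line (C#). Regex.Replace returns the stripped string, so assign the result to a variable:

string stripped = Regex.Replace(textBox1.Text, @"<(.|\n)*?>", "");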
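To try the pattern outside of a form, here is a minimal sketch of a complete C# console program. The class name HtmlStripper, the method name StripHtml, and the sample input are assumptions made up for illustration; only the pattern and the Regex.Replace call come from the examples above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlStripper
{
    // Same pattern as in the examples above: "<", any characters
    // (including newlines), then the first ">" that follows.
    private const string TagPattern = @"<(.|\n)*?>";

    // Hypothetical helper: returns the input with every match removed.
    public static string StripHtml(string html)
    {
        return Regex.Replace(html, TagPattern, string.Empty);
    }

    public static void Main()
    {
        // Made-up sample input for illustration.
        string sample = "<p>Hello, <b>world</b>!</p>";
        Console.WriteLine(StripHtml(sample)); // prints "Hello, world!"
    }
}
```

The pattern removes only the tags themselves. Character entities such as &amp; are left in the text as they are.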

2 Responses to “Strip All HTML Tags from a String by Regular Expression”

  1. Rajat said

    Hi,
    Thanks for your appreciation. This one is also a nice example of a regular expression.
    Regards
    Rajat (http://wwww.indiandotnet.wordpress.com)

  2. J. Benitez said

    Thanks, it was so useful.
