Thursday, 09 September 2010

How to Automatically Linkify Text with PHP Regular Expressions

Good software lets us take a lot of niceties for granted. Intelligent interfaces handle the simple tasks so that we don’t need to worry about them. For example, when I type “www.desktopped.com” into an email or an instant message, I expect it to be clickable on the other end without my having to add HTML tags manually. Another example is parsing text from a Twitter feed: in “@desktopped is a blog about #computers”, we expect both @desktopped and #computers to be links.

The ability to “linkify” text is a great tool to have when developing a blog or website. Possible uses include:
  • Making URLs clickable in content, comments, and anywhere else
  • Making valid email addresses clickable
  • Making Twitter text clickable so that @desktopped, #computers, and www.desktopped.com all become links
Searching a string for patterns, such as substrings that begin with “http://” or “@”, is an ability that can be applied in many ways to improve how we process and display data.
How can we do this? In PHP, the usual approach is to combine regular expressions, a widely supported pattern-matching syntax, with a few of PHP’s built-in functions.

Regular Expressions Basics

A regular expression is a pattern string that represents a set of strings by using a variety of special characters.

The Basic Special Characters

  • | connects two possible values and produces a match if the string matches either. For example, hi|hello matches the strings “hi” and “hello”.
  • () are used to group values and set the order of operations. For example, br(i|y)an will match both “brian” and “bryan”.
  • [] are used to match a single character that appears inside the brackets. [abc] will match “a”, “b”, or “c”, but not “d”.
  • * produces a match if there are zero or more of the preceding element. The pattern go*gle will match “ggle”, “gogle”, “google”, “gooogle”, etc.
  • + produces a match if there are one or more of the preceding element. The pattern go+gle will match “gogle”, “google”, “gooogle”, etc., but not “ggle”.
  • ? produces a match if there is zero or one of the preceding element. The pattern desktopp?ed will match both “desktopped” and “desktoped”.
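The bullets above can be checked directly with preg_match. The following is a minimal sketch; the is_match helper is a hypothetical name introduced only for this demo, and the ^ and $ anchors are added so that each pattern has to match the whole string rather than a part of it.

```php
<?php
// Hypothetical helper for this demo: true when $pattern matches $subject.
function is_match(string $pattern, string $subject): bool
{
    // preg_match returns 1 on a match and 0 when nothing matches.
    return preg_match($pattern, $subject) === 1;
}

// Patterns are wrapped in / delimiters; ^ and $ anchor them to the whole string.
var_dump(is_match('/^(hi|hello)$/', 'hello'));        // bool(true)
var_dump(is_match('/^br(i|y)an$/', 'bryan'));         // bool(true)
var_dump(is_match('/^[abc]$/', 'd'));                 // bool(false)
var_dump(is_match('/^go*gle$/', 'ggle'));             // bool(true): zero "o" is allowed
var_dump(is_match('/^go+gle$/', 'ggle'));             // bool(false): at least one "o" is required
var_dump(is_match('/^desktopp?ed$/', 'desktoped'));   // bool(true)
```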

Other Common Special Characters

  • \w will match a “word” character, meaning any letter, digit, or underscore (‘_’).
  • \n \r and \t will match a new line, carriage return and tab respectively.
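As a small illustration of these escapes, the sketch below pulls the “word” runs out of a sample string and then flattens its whitespace control characters. The sample text and variable names are made up for the demo.

```php
<?php
// Sample input containing a tab, a carriage return, and a new line.
$text = "user_42 logged-in\tat noon\r\n";

// \w+ grabs runs of letters, digits, and underscores; the hyphen,
// spaces, tab, and line break all act as separators.
preg_match_all('/\w+/', $text, $matches);
$words = $matches[0]; // user_42, logged, in, at, noon

// [\r\n\t]+ matches runs of line breaks and tabs, replaced here by one space.
$flat = preg_replace('/[\r\n\t]+/', ' ', $text);

print_r($words);
echo $flat, "\n";
```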
A full reference for special characters can be found here:

PHP Function: preg_replace

The preg_replace function in PHP takes a regular expression pattern, a replacement string, and the text to be examined as arguments. It checks the input text against the pattern and, wherever there is a match, substitutes the replacement string, into which pieces of the matched text can be placed. The pattern must be wrapped in delimiters, most commonly forward slashes.
The pieces placed into the replacement string are determined by what is in parentheses in the pattern. They are referenced in the replacement string with $1, $2, etc., where $n stands for the text matched by the nth parenthesized group; $0 stands for the entire match.

A Simple Example

    $text = 'My name is Brian';
    // The pattern needs delimiters; forward slashes are used here
    $pattern = '/My name is (Brian|Sam|Zach)/';
    $replacement = '$1 is a pretty cool guy.';
    echo preg_replace($pattern, $replacement, $text);
The code will output “Brian is a pretty cool guy.” If $text were “My name is Zach”, the output would be “Zach is a pretty cool guy.” If $text were “My name is Nick”, there would be no match and the original text, “My name is Nick”, would be returned unchanged.
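To make the difference between $0 and the numbered groups concrete, here is a short sketch with two groups. The sample sentences and variable names are invented for illustration.

```php
<?php
$pattern = '/(\w+) is from (\w+)/';
// $0 is the whole match, $1 the first group, $2 the second.
$replacement = '[$0] name=$1, place=$2';

$matched = preg_replace($pattern, $replacement, 'Brian is from Ohio');
// [Brian is from Ohio] name=Brian, place=Ohio

// With no match, preg_replace returns the input unchanged.
$unmatched = preg_replace($pattern, $replacement, 'No match here');

echo $matched, "\n", $unmatched, "\n";
```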

Useful Regex Functions

This function turns URLs and email addresses in a body of text into clickable links:
    function link_it($text)
    {
        // Full URLs that start with http://, https://, ftp://, or ftps://
        $text = preg_replace("/(^|[\n ])([\w]*?)((ht|f)tp(s)?:\/\/[\w]+[^ \,\"\n\r\t<]*)/is", "$1$2<a href=\"$3\" >$3</a>", $text);
        // Bare hosts that start with www. or ftp. get an http:// prefix
        $text = preg_replace("/(^|[\n ])([\w]*?)((www|ftp)\.[^ \,\"\t\n\r<]*)/is", "$1$2<a href=\"http://$3\" >$3</a>", $text);
        // Email addresses become mailto: links
        $text = preg_replace("/(^|[\n ])([a-z0-9&\-_\.]+?)@([\w\-]+\.([\w\-\.]+)+)/i", "$1<a href=\"mailto:$2@$3\">$2@$3</a>", $text);
        return $text;
    }
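When the text comes from users (comments, for instance), it may be worth escaping HTML before adding links. The sketch below is not part of the original function: link_it_escaped is a hypothetical name, it only handles http:// and https:// URLs, and it assumes trailing punctuation such as a comma should stay outside the link. It uses preg_replace_callback so the replacement can be built in PHP code.

```php
<?php
// Hypothetical helper, not part of the original post: escapes HTML first,
// then wraps http:// and https:// URLs in anchor tags.
function link_it_escaped(string $text)
{
    // Neutralize any HTML in the input before inserting our own tags.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    return preg_replace_callback(
        '~\bhttps?://[^\s<]+~i',
        function (array $m) {
            // Keep trailing sentence punctuation outside the link.
            $url  = rtrim($m[0], '.,!?');
            $rest = substr($m[0], strlen($url));
            return '<a href="' . $url . '">' . $url . '</a>' . $rest;
        },
        $safe
    );
}

echo link_it_escaped('Docs at http://www.example.com/a?b=1&c=2, <b>enjoy</b>.');
```

In this sample the comma after the URL stays outside the anchor, the ampersand in the query string is emitted as &amp;amp;, and the <b> tags are shown as text instead of being rendered.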
This function turns every hashtag (a word prefixed with #) and every @username in a Twitter feed into hashtag-search and @reply links:
    function twitter_it($text)
    {
        // @username links to the user's profile
        $text = preg_replace("/@(\w+)/", '<a href="http://www.twitter.com/$1" target="_blank">@$1</a>', $text);
        // #tag links to a search for that tag
        $text = preg_replace("/\#(\w+)/", '<a href="http://search.twitter.com/search?q=$1" target="_blank">#$1</a>', $text);
        return $text;
    }
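One thing to watch for: the pattern @(\w+) also matches the part after the @ in an email address. The sketch below compares it with a stricter variant that uses a negative lookbehind. The square-bracket replacement is only a stand-in for the real anchor tag so the difference is easy to see, and the sample text is made up.

```php
<?php
$text = 'Mail me@example.com or ping @desktopped';

// Plain pattern: also fires on the "@example" inside the email address.
$loose = preg_replace('/@(\w+)/', '[@$1]', $text);
// Mail me[@example].com or ping [@desktopped]

// (?<!\w) is a negative lookbehind: the @ must not follow a word character.
$strict = preg_replace('/(?<!\w)@(\w+)/', '[@$1]', $text);
// Mail me@example.com or ping [@desktopped]

echo $loose, "\n", $strict, "\n";
```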
This function finds strings in your post body that you’ve marked with the pattern :tagname: and turns them into tag searches on your blog. For example, “This post is about :PHP:.” will result in “This post is about PHP.”, with “PHP” linked to its tag page.
    function tag_it($text)
    {
        // :tagname: becomes a link to /tag/tagname/
        $text = preg_replace("/:(\w+):/", '<a href="http://www.buildinternet.com/tag/$1/" target="_blank">$1</a>', $text);
        return $text;
    }
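To reuse the same idea on another site, the base URL can be passed in instead of hard-coded. The following is a sketch only: tag_it_with_base and the example.com address are hypothetical, and the base URL is assumed to contain no $ or backslash characters, since those have special meaning in a replacement string.

```php
<?php
// Hypothetical variant of tag_it(): the site's base URL is a parameter.
function tag_it_with_base(string $text, string $baseUrl)
{
    // :tagname: becomes a link to <baseUrl>/tag/tagname/
    $replacement = '<a href="' . $baseUrl . '/tag/$1/">$1</a>';
    return preg_replace('/:(\w+):/', $replacement, $text);
}

echo tag_it_with_base('This post is about :PHP:.', 'http://www.example.com');
// This post is about <a href="http://www.example.com/tag/PHP/">PHP</a>.
```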
This function highlights search terms in search result titles on your WordPress blog. Pass an array of keywords and it will do the rest. (It must be used inside the loop.)
    function highlight_terms($keys_array)
    {
        $title = get_the_title();
        // Escape each keyword so regex metacharacters in it are matched literally
        $keys = array_map(function ($key) { return preg_quote($key, '/'); }, $keys_array);
        return preg_replace('/(' . implode('|', $keys) . ')/iu', '<span class="highlight">$0</span>', $title);
    }
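Outside WordPress, the same technique works on any string. The sketch below is a standalone variant in which the title is passed in as an argument. highlight_in is a hypothetical name, and the sample title is made up; keywords are escaped with preg_quote so that a term like “c++” is treated as literal text.

```php
<?php
// Hypothetical standalone variant: the title is passed in, so no WordPress loop is needed.
function highlight_in(string $title, array $keywords)
{
    // With no keywords the pattern would be empty, so return the title as-is.
    if ($keywords === []) {
        return $title;
    }
    // Escape each keyword so characters like + or . are matched literally.
    $escaped = array_map(fn($k) => preg_quote($k, '/'), $keywords);
    $pattern = '/(' . implode('|', $escaped) . ')/iu';
    // $0 is the whole match, so the original capitalization is preserved.
    return preg_replace($pattern, '<span class="highlight">$0</span>', $title);
}

echo highlight_in('Learning C++ and PHP', ['c++', 'php']);
// Learning <span class="highlight">C++</span> and <span class="highlight">PHP</span>
```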
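This function takes any string (usually a page title) and generates a URL slug:
    function create_slug($string)
    {
        // Lowercase the title and strip surrounding whitespace
        $string = strtolower(trim($string));
        // Replace every character that is not a-z, 0-9, or a hyphen with a hyphen
        $string = preg_replace('/[^a-z0-9-]/', '-', $string);
        // Collapse runs of hyphens into a single hyphen
        $string = preg_replace('/-+/', "-", $string);
        return $string;
    }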
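Here are the same slug steps traced on a sample title. The title is made up, and the final trim of leading and trailing hyphens is an extra step that the function above does not perform; it is included as one possible refinement.

```php
<?php
$title = '  How to Linkify Text: PHP & Regex!  ';

$slug = strtolower(trim($title));                  // "how to linkify text: php & regex!"
$slug = preg_replace('/[^a-z0-9-]/', '-', $slug);  // "how-to-linkify-text--php---regex-"
$slug = preg_replace('/-+/', '-', $slug);          // "how-to-linkify-text-php-regex-"
$slug = trim($slug, '-');                          // extra step (assumption): drop edge hyphens

echo $slug; // how-to-linkify-text-php-regex
```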
