Getting Started with PHP Regular Expressions
1. What are Regular Expressions
The main purpose of regular expressions, also called regex or regexp, is to efficiently search for patterns in a given text. These search patterns are written in a special format that a regular expression parser understands.
Regular expressions originated on Unix systems, where a program called grep was designed to help users work with strings and manipulate text. By following a few basic rules, one can create very complex search patterns.
As an example, let’s say you’re given the task of checking whether an e-mail address or a telephone number has the correct form. With a few simple commands, these problems can easily be solved thanks to regular expressions. The syntax doesn’t always seem straightforward at first, but once you learn it, you’ll realize that you can do pretty complex searches just by typing in a few characters, and you’ll approach problems from a different perspective.
2. Perl Compatible Regular Expressions
PHP has implemented quite a few regex functions, which use different parsing engines. There are two major engines in PHP: POSIX and PCRE (Perl Compatible Regular Expressions).
The PHP function prefix for POSIX is ereg_. This engine has been deprecated since PHP 5.3 and was removed in PHP 7.0, so let’s have a look at the more capable and faster PCRE engine.
In PHP every PCRE function starts with preg_ such as preg_match or preg_replace. You can read the full function list in PHP’s documentation.
3. Basic Syntax
To use regular expressions, you first need to learn the syntax. This syntax consists of a series of letters, numbers, dots, hyphens and special signs, which we can group together using different kinds of brackets.
In PHP every regular expression pattern is defined as a string using the Perl format. In Perl, a regular expression pattern is written between forward slashes, such as /hello/. In PHP this becomes a string, '/hello/'.
Now, let’s have a look at some operators, the basic building blocks of regular expressions:
^ | The circumflex (caret) anchors the match to the beginning of the string |
$ | The dollar sign anchors the match to the end of the string |
. | The period matches any single character except a newline |
? | Matches the preceding element zero or one time |
+ | Matches the preceding element one or more times |
* | Matches the preceding element zero or more times |
| | Boolean OR |
- | Defines a range of characters inside square brackets, e.g. [a-z] |
() | Groups pattern elements together |
[] | Matches any single character listed between the square brackets |
{min,max} | Matches the preceding element at least min and at most max times |
\d | Matches any single digit |
\D | Matches any single non-digit character |
\w | Matches any alphanumeric character or the underscore (_) |
\W | Matches any character that is neither alphanumeric nor the underscore |
\s | Matches any whitespace character |
In addition, because the forward slash delimits the pattern in PHP, a literal forward slash inside the pattern must be escaped with a backslash (\). Example: '/he\/llo/'
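The escaping rule above can be tried out with a short sketch. The sample strings are made up for illustration. The sketch also shows two things the text does not mention: PCRE in PHP accepts other delimiters such as #, and preg_quote() can escape arbitrary text for use inside a pattern.

```php
<?php
// Hypothetical sample input, chosen only for illustration.
$path = 'usr/local/bin';

// With "/" as the delimiter, a literal slash must be written as "\/".
$with_slashes = preg_match('/local\/bin/', $path);

// With "#" as the delimiter, the slash needs no escaping.
$with_hashes = preg_match('#local/bin#', $path);

// preg_quote() escapes regex metacharacters in arbitrary text; the second
// argument names the delimiter so that it is escaped as well.
$needle = preg_quote('1+1=2 (a/b)', '/');
$quoted = preg_match('/' . $needle . '/', 'note: 1+1=2 (a/b)');

var_dump($with_slashes, $with_hashes, $quoted);
```

Choosing a delimiter that does not occur in the pattern usually keeps URL and path patterns easier to read.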
To get a feel for how these operators are used, let’s have a look at a few examples:
'/hello/' | Matches the word hello anywhere in the string |
'/^hello/' | Matches hello at the start of a string. Possible matches are hello or helloworld, but not worldhello |
'/hello$/' | Matches hello at the end of a string |
'/he.o/' | Matches he, then any single character, then o. Possible matches are helo or heyo, but not hello |
'/he?llo/' | The e is optional: matches either hllo or hello |
'/hello+/' | Matches hell followed by one or more o characters, e.g. hello or hellooo. To repeat the whole word, group it: '/(hello)+/' |
'/he*llo/' | Matches h, then zero or more e characters, then llo, e.g. hllo, hello or heello |
'/hello|world/' | Matches either the word hello or the word world |
'/[A-Z]/' | Using the hyphen inside square brackets, this pattern matches any single uppercase character from A to Z. E.g. A, B, C… |
'/[abc]/' | Matches any single character a, b or c |
'/abc{1}/' | Matches ab followed by exactly one c, i.e. abc. Without a $ anchor, abcc still matches because it contains abc |
'/abc{1,}/' | Matches ab followed by one or more c characters, e.g. abc or abcc |
'/abc{2,4}/' | Matches ab followed by two to four c characters, e.g. abcc, abccc or abcccc, but not abc |
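A quick way to check how far a quantifier reaches is to run a handful of patterns against sample strings. The following is a minimal sketch with made-up subjects. The patterns are anchored with ^ and $ so that a partial match inside a longer string does not hide the difference.

```php
<?php
// Each row: [pattern, subject, expected preg_match() result].
$checks = [
    ['/^hello/',      'helloworld', 1],
    ['/^hello/',      'worldhello', 0],
    ['/hello$/',      'worldhello', 1],
    ['/he.o/',        'helo',       1],
    ['/he.o/',        'hello',      0],
    ['/^he?llo$/',    'hllo',       1], // ? applies to the e only
    ['/^hello+$/',    'hellooo',    1], // + applies to the final o only
    ['/^hello+$/',    'hellohello', 0],
    ['/^(hello)+$/',  'hellohello', 1], // the group repeats the whole word
    ['/^abc{2,4}$/',  'abccc',      1],
    ['/^abc{2,4}$/',  'abc',        0],
];

$all_as_expected = true;
foreach ($checks as [$pattern, $subject, $expected]) {
    $actual = preg_match($pattern, $subject);
    printf("%-14s on %-12s => %d\n", $pattern, $subject, $actual);
    if ($actual !== $expected) {
        $all_as_expected = false;
    }
}
```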
Besides operators, there are regular expression modifiers, which can globally alter the behavior of search patterns.
The regex modifiers are placed after the closing delimiter, like this: '/hello/i'. They consist of single letters such as i, which makes a pattern case insensitive, or x, which ignores white-space characters inside the pattern. For a full list of modifiers please visit PHP’s online documentation.
The real power of regular expressions lies in combining these operators and modifiers to create rather complex search patterns.
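As a small sketch of the two modifiers mentioned above (the subject string is an assumption made for this example), the following compares a plain match, a case-insensitive match, and a pattern that is spread over several lines with the x modifier:

```php
<?php
$subject = 'Hello World';

// Without modifiers the match is case sensitive, so this finds nothing.
$plain = preg_match('/hello/', $subject);

// "i" makes the match case insensitive.
$insensitive = preg_match('/hello/i', $subject);

// "x" ignores literal whitespace in the pattern and allows # comments,
// which helps when documenting longer patterns.
$extended = preg_match('/
    ^ hello   # the word hello at the start
    \s+       # one or more whitespace characters
    world $   # the word world at the end
/ix', $subject);

var_dump($plain, $insensitive, $extended);
```

With x enabled, a space that should be matched literally has to be written as \s or escaped, since unescaped spaces in the pattern are skipped.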
4. Using Regex in PHP
In PHP we have nine core PCRE functions at our disposal (newer PHP versions add a few more, such as preg_replace_callback_array). Here’s the list:
preg_filter | Performs a regular expression search and replace |
preg_grep | Returns array entries that match a pattern |
preg_last_error | Returns the error code of the last PCRE regex execution |
preg_match | Performs a regular expression match |
preg_match_all | Performs a global regular expression match |
preg_quote | Quotes regular expression characters |
preg_replace | Performs a regular expression search and replace |
preg_replace_callback | Performs a regular expression search and replace using a callback |
preg_split | Splits a string by a regular expression |
The two most commonly used functions are preg_match and preg_replace.
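Before focusing on preg_match, here is a brief sketch of a few of the other functions from the list. The input string is invented for illustration and the variable names are arbitrary.

```php
<?php
// Hypothetical input used only for this illustration.
$text = 'apple, banana;cherry';

// preg_match_all() collects every match, not only the first one.
preg_match_all('/[a-z]+/', $text, $all);

// preg_split() cuts the string wherever the pattern matches.
$parts = preg_split('/[,;]\s*/', $text);

// preg_grep() keeps the array entries that match (original keys are preserved).
$with_an = preg_grep('/an/', $parts);

// preg_replace_callback() lets a function compute each replacement.
$capitalized = preg_replace_callback(
    '/\b[a-z]/',
    function ($m) {
        return strtoupper($m[0]);
    },
    $text
);

// preg_last_error() reports whether the last PCRE call ran into a problem.
$error = preg_last_error();

print_r($all[0]);
print_r($parts);
print_r($with_an);
echo $capitalized, "\n";
```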
Let’s begin by creating a test string on which we will perform our regular expression searches. The classical hello world should do it.
$test_string = 'hello world';
If we simply want to search for the word hello or world, then the search pattern would look something like this:
preg_match('/hello/', $test_string);
preg_match('/world/', $test_string);
If we wish to see if the string begins with the word hello, we would simply put the ^ character at the beginning of the search pattern like this:
preg_match('/^hello/', $test_string);
Please note that regular expressions are case sensitive, so the above pattern won’t match the word hElLo. If we want our pattern to be case insensitive, we should apply the following modifier:
preg_match('/^hello/i', $test_string);
Notice the character i at the end of the pattern, after the closing forward slash.
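The calls above discard the return value, so it may help to see what preg_match actually returns. It gives 1 for a match, 0 for no match and false when the pattern itself is invalid. The sketch below is one way to handle this. The helper name starts_with_hello is hypothetical and not part of PHP.

```php
<?php
$test_string = 'hello world';

$found   = preg_match('/^hello/', $test_string); // match
$missing = preg_match('/^world/', $test_string); // no match

// An unclosed bracket makes the pattern invalid. PHP raises a warning
// (suppressed here with @) and preg_match() returns false.
$broken = @preg_match('/[a-z/', $test_string);

// Hypothetical helper: comparing against 1 turns the result into a strict boolean.
function starts_with_hello($text)
{
    return preg_match('/^hello/i', $text) === 1;
}

var_dump($found, $missing, $broken, starts_with_hello('hElLo there'));
```

Comparing with === 1 avoids treating an invalid pattern (false) the same way as a simple non-match (0).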
Now let’s examine a more complex search pattern. What if we want to check that the first five characters in the string are alphanumeric?
preg_match('/^[A-Za-z0-9]{5}/', $test_string);
Let’s dissect this search pattern. First, by using the caret character (^) we specify that the match must start at the beginning of the string. What has to appear there, an alphanumeric character, is specified by [A-Za-z0-9].
A-Z means all the characters from A to Z, and a-z means the same for lowercase characters. Both are needed because regular expressions are case sensitive. I think you’ll figure out by yourself what 0-9 means.
{5} simply tells the regex parser to count exactly five characters. If we put six instead of five, the parser wouldn’t match anything, because in our test string the word hello is five characters long and is followed by a white-space character, which is not part of the character class.
Also, this regular expression could be shortened to the following form:
preg_match('/^\w{5}/', $test_string);
\w matches any alphanumeric character plus the underscore character (_), so this form is slightly more permissive than [A-Za-z0-9].
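The points from this walkthrough can be verified with a short sketch. It also uses the optional third argument of preg_match, which receives the matched text and any captured groups. The second subject string, he_lo world, is made up to show where \w and [A-Za-z0-9] differ.

```php
<?php
$test_string = 'hello world';

// The third argument is filled with the whole match and each group.
preg_match('/^(\w{5})\s(\w+)$/', $test_string, $matches);

// Six alphanumeric characters cannot match: the sixth character is a space.
$six = preg_match('/^[A-Za-z0-9]{6}/', $test_string);

// \w also accepts the underscore, the explicit character class does not.
$explicit  = preg_match('/^[A-Za-z0-9]{5}/', 'he_lo world');
$shorthand = preg_match('/^\w{5}/', 'he_lo world');

print_r($matches);
var_dump($six, $explicit, $shorthand);
```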
6. Useful Regex Functions
Here are a few PHP functions using regular expressions which you could use on a daily basis.
Validate e-mail. This function will validate a given e-mail address string to see if it has the correct form.
function validate_email($email_address) {
    // Loose format check only: a local part, an @ sign, then a domain-like part.
    if (!preg_match('/^[a-zA-Z0-9]+[a-zA-Z0-9._-]*@[a-zA-Z0-9_-]+[a-zA-Z0-9._-]+$/', $email_address)) {
        return false;
    }
    return true;
}
Validate a URL
function validate_url($url) {
    // The dot between host name labels is escaped so that it matches a literal dot.
    return preg_match('|^http(s)?://[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $url);
}
Remove repeated words. I often find repeated words in a text, such as this this. This handy function will remove such duplicate words.
function remove_duplicate_word($text) {
    // \1 refers back to the word captured by (\w+); the repeated copies are dropped.
    return preg_replace('/\b(\w+)(?:\s+\1\b)+/i', '$1', $text);
}
Validate alpha numeric, dashes, underscores and spaces
function validate_alpha($text) {
    // The hyphen goes last in the class so that it is not read as a range.
    return preg_match('/^[A-Za-z0-9_ -]+$/', $text);
}
Validate US ZIP codes
function validate_zip($zip_code) {
    // Five digits, optionally followed by a hyphen and four more digits.
    return preg_match('/^[0-9]{5}(-[0-9]{4})?$/', $zip_code);
}
7. Regex Cheat Sheet
Because cheat sheets are cool nowadays, below you can find a PCRE cheat sheet that you can run through quickly anytime you forget something.
Meta Characters
^ | Marks the start of a string |
$ | Marks the end of a string |
. | Matches any single character |
| | Boolean OR |
() | Group elements |
[abc] | Any one of the listed characters (a, b or c) |
[^abc] | Any one character that is not listed (every character except a, b or c) |
\s | White-space character |
Quantifiers
a? | Zero or one a character. Equal to a{0,1} |
a* | Zero or more of a |
a+ | One or more of a |
a{2} | Exactly two of a |
a{0,5} | Up to five of a |
a{5,10} | Between five and ten of a |
Character Classes
\w | Any alphanumeric character plus underscore. Equal to [A-Za-z0-9_] |
\W | Any character that is not alphanumeric or underscore. Equal to [^A-Za-z0-9_] |
\s | Any white-space character |
\S | Any non white-space character |
\d | Any digit. Equal to [0-9] |
\D | Any non-digit. Equal to [^0-9] |
Pattern Modifiers
i | Ignore case |
m | Multiline mode |
S | Extra analysis of pattern |
u | Pattern is treated as UTF-8 |
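The m and u modifiers are easiest to understand side by side with the default behavior. The following is a minimal sketch using made-up input. The \u{e9} escape (the letter é) assumes PHP 7 or later.

```php
<?php
$lines = "first line\nsecond line";

// Without "m", ^ matches only at the very start of the subject.
$single = preg_match_all('/^\w+/', $lines, $without_m);

// With "m", ^ also matches right after each newline.
$multi = preg_match_all('/^\w+/m', $lines, $with_m);

// "u" treats pattern and subject as UTF-8, so "." consumes a whole
// character instead of a single byte.
$e_acute = "\u{e9}"; // two bytes in UTF-8
$as_bytes = preg_match('/^.$/', $e_acute);
$as_chars = preg_match('/^.$/u', $e_acute);

var_dump($single, $multi, $with_m[0], $as_bytes, $as_chars);
```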
Author: Joel Reyes
Joel Reyes has been designing and coding web sites for several years. This has led him to be the creative mind behind Looney Designer, a design resource and portfolio site that revolves around web and graphic design.
From: http://www.noupe.com/php/php-regular-expressions.html
