Getting Started with PHP Regular Expressions

Last Update Date: June 14, 2023

1. What are Regular Expressions

The main purpose of regular expressions, also called regex or regexp, is to efficiently search for patterns in a given text. These search patterns are written using a special format which a regular expression parser understands.

Image: Photo by Pixabay on Pexels

Regular expressions came into wide use on Unix systems, where a program called grep was designed to help users work with strings and manipulate text. By following a few basic rules, one can create very complex search patterns.

As an example, let’s say you’re given the task of checking whether an e-mail address or a telephone number has the correct form. Regular expressions solve such problems with a few simple commands. The syntax doesn’t always seem straightforward at first, but once you learn it, you’ll be able to run fairly complex searches by typing just a few characters, and you’ll approach problems from a different perspective.

2. Perl Compatible Regular Expressions

PHP has implemented quite a few regex functions, which use different parsing engines. There are two major parsers in PHP: one called POSIX and the other PCRE, or Perl Compatible Regular Expressions.

The PHP function prefix for POSIX is ereg_. This engine has been deprecated since PHP 5.3 and was removed in PHP 7.0, so let’s have a look at the faster PCRE engine instead.

In PHP every PCRE function starts with preg_ such as preg_match or preg_replace. You can read the full function list in PHP’s documentation.

3. Basic Syntax

To use regular expressions, you first need to learn the syntax. It consists of a series of letters, numbers, dots, hyphens and special signs, which we can group together using different kinds of brackets.

In PHP every regular expression pattern is defined as a string using the Perl format. In Perl, a regular expression pattern is written between forward slashes, such as /hello/. In PHP this becomes a string, '/hello/'.
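As a minimal sketch of the delimiter rule described above, the following passes a delimited pattern string to preg_match. The variable names are illustrative only. The use of # as an alternative delimiter is an addition to what the text describes: PCRE in PHP accepts other non-alphanumeric delimiter characters as well.

```php
<?php
// A pattern is an ordinary PHP string that includes its delimiters.
$subject = 'hello world';

// Forward slashes as delimiters, as in Perl.
$with_slashes = preg_match('/hello/', $subject);

// A different delimiter character works the same way; # is a common choice.
$with_hashes = preg_match('#hello#', $subject);

var_dump($with_slashes, $with_hashes);
```

Both calls search for the same text; only the delimiter differs.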

Now, let’s have a look at some operators, the basic building blocks of regular expressions:

Operator	Description
^	The caret (circumflex) anchors the match to the start of the string
$	The dollar sign anchors the match to the end of the string
.	The period matches any single character except a newline
?	Matches the preceding element zero or one time
+	Matches the preceding element one or more times
*	Matches the preceding element zero or more times
|	Boolean OR (alternation)
-	Inside square brackets, defines a range of characters, such as a-z
()	Groups pattern elements together
[]	Matches any single character listed between the square brackets
{min,max}	Matches the preceding element at least min and at most max times
\d	Matches any single digit
\D	Matches any single non-digit character
\w	Matches any alphanumeric character or the underscore (_)
\W	Matches any character that is neither alphanumeric nor an underscore
\s	Matches any single whitespace character

In addition, a forward slash that appears inside the pattern must be escaped with a backslash (\). Example: '/he\/llo/'
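A short sketch of escaping in practice. The sample strings are made up for illustration. preg_quote appears in the function list of this article; passing the delimiter as its second argument makes it escape that character too.

```php
<?php
$subject = 'he/llo';

// The delimiter must be escaped with a backslash when it appears in the pattern.
$escaped = preg_match('/he\/llo/', $subject);

// preg_quote() escapes regex metacharacters in a literal string; the second
// argument names the delimiter so that it is escaped as well.
$literal = 'a/b.c';
$pattern = '/' . preg_quote($literal, '/') . '/';

$matches_literal = preg_match($pattern, 'path a/b.c here');
$matches_other   = preg_match($pattern, 'path a/bxc here');

var_dump($escaped, $pattern, $matches_literal, $matches_other);
```

Because the period is escaped by preg_quote, the second subject does not match: the dot is treated as a literal dot, not as "any character".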

To get a feel for how these operators are used, let’s have a look at a few examples:

Example	Description
'/hello/'	Matches the word hello anywhere in the string
'/^hello/'	Matches hello at the start of a string. Possible matches are hello or helloworld, but not worldhello
'/hello$/'	Matches hello at the end of a string. Possible matches are hello or worldhello, but not helloworld
'/he.o/'	Matches he, then any one character, then o. Possible matches are helo or heyo, but not hello
'/he?llo/'	The ? applies only to the e, so it matches hllo or hello
'/hello+/'	The + applies only to the final o, so it matches hello, helloo, hellooo and so on. To repeat the whole word, group it: '/(hello)+/' matches hello or hellohello
'/he*llo/'	The * applies only to the e, so it matches hllo, hello or heeello. To repeat he as a unit, use '/(he)*llo/', which matches llo, hello or hehello
'/hello|world/'	Matches either the word hello or the word world
'/[A-Z]/'	Using the hyphen inside square brackets, this pattern matches any single uppercase character from A to Z. E.g. A, B, C…
'/[abc]/'	Matches any single character a, b or c
'/abc{1}/'	Matches ab followed by exactly one c, that is abc. Without a $ anchor it also finds abc inside abcc
'/abc{1,}/'	Matches ab followed by one or more c characters. E.g. matches abc or abcc
'/abc{2,4}/'	Matches ab followed by two to four c characters. E.g. matches abcc, abccc or abcccc, but not abc
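A quantifier such as ?, + or * applies only to the element immediately before it unless parentheses group a longer sequence. The following sketch checks a handful of patterns against sample strings; the test cases are illustrative and were chosen for this example.

```php
<?php
// Each entry: [pattern, subject, expected result of preg_match()].
$cases = [
    ['/he?llo/',     'hllo',       1], // ? applies to the e only
    ['/he?llo/',     'hello',      1],
    ['/hello+/',     'hellooo',    1], // + applies to the final o only
    ['/(hello)+/',   'hellohello', 1], // parentheses group the whole word
    ['/he*llo/',     'heeello',    1], // * applies to the e only
    ['/[A-Z]/',      'abc',        0], // no uppercase letter present
    ['/^abc{2,4}$/', 'abc',        0], // needs two to four c characters
    ['/^abc{2,4}$/', 'abccc',      1],
];

$results = [];
foreach ($cases as [$pattern, $subject, $expected]) {
    // Record whether the actual result agrees with the expectation.
    $results[] = preg_match($pattern, $subject) === $expected;
}

var_dump($results);
```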

Besides operators, there are regular expression modifiers, which can globally alter the behavior of search patterns.

The regex modifiers are placed after the closing delimiter, like this: '/hello/i'. They consist of single letters, such as i, which makes a pattern case insensitive, or x, which makes the parser ignore white-space characters inside the pattern. For a full list of modifiers please visit PHP’s online documentation.

The real power of regular expressions lies in combining these operators and modifiers to create rather complex search patterns.
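A small sketch of the i and x modifiers mentioned above, using a made-up subject string. With x, white space in the pattern is ignored and text from # to the end of the line is treated as a comment, which helps when documenting longer patterns.

```php
<?php
$subject = 'Hello World';

// Without a modifier the match is case sensitive.
$sensitive = preg_match('/hello/', $subject);

// The i modifier makes the match case insensitive.
$insensitive = preg_match('/hello/i', $subject);

// The x modifier ignores white space in the pattern and allows # comments.
$extended = preg_match('/
    ^ hello   # the word hello at the start
    \s+       # one or more white-space characters
    world $   # the word world at the end
/xi', $subject);

var_dump($sensitive, $insensitive, $extended);
```

Note that a comment inside an x-mode pattern must not contain the delimiter character, since the delimiter would end the pattern.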

4. Using Regex in PHP

PHP provides nine core PCRE functions (newer PHP versions add a few more, such as preg_replace_callback_array and preg_last_error_msg). Here’s the list:

preg_filter – performs a regular expression search and replace
preg_grep – returns array entries that match a pattern
preg_last_error – returns the error code of the last PCRE regex execution
preg_match – performs a regular expression match
preg_match_all – performs a global regular expression match
preg_quote – quotes regular expression characters
preg_replace – performs a regular expression search and replace
preg_replace_callback – performs a regular expression search and replace using a callback
preg_split – splits a string by a regular expression

The two most commonly used functions are preg_match and preg_replace.
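The less common functions in the list are easiest to understand side by side. The following is a sketch on a made-up sample string, not an exhaustive tour; the variable names are illustrative.

```php
<?php
$text = 'cat, coat and  cot';

// preg_match_all() collects every match, not just the first one.
// Here: whole words that start with c and end with t.
$count = preg_match_all('/\bc\w*t\b/', $text, $matches);

// preg_split() splits on a pattern; here on runs of commas and white space.
$words = preg_split('/[\s,]+/', $text);

// preg_grep() filters an array, keeping the entries that match.
$with_oa = array_values(preg_grep('/oa/', $words));

// preg_replace_callback() computes each replacement with a function.
$upper = preg_replace_callback('/\bc\w*t\b/', function (array $m) {
    return strtoupper($m[0]);
}, $text);

var_dump($count, $matches[0], $words, $with_oa, $upper);
```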

Let’s begin by creating a test string on which we will perform our regular expression searches. The classical hello world should do it.

$test_string = 'hello world';

If we simply want to search for the word hello or world then the search pattern would look something like this:

preg_match('/hello/', $test_string);
preg_match('/world/', $test_string);
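The calls above discard their results. As a sketch of how the result is typically used: preg_match returns 1 when the pattern matches, 0 when it does not, and false on error, and an optional third argument receives the matched text. The word planet and the two-group pattern are made up for this illustration.

```php
<?php
$test_string = 'hello world';

// 1 on a match, 0 on no match, false on error.
$found   = preg_match('/world/', $test_string);
$missing = preg_match('/planet/', $test_string);

// The third argument is filled with the full match and any captured groups.
preg_match('/(\w+) (\w+)/', $test_string, $matches);
// $matches[0] is the full match, $matches[1] and $matches[2] are the groups.

var_dump($found, $missing, $matches);
```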

If we wish to see if the string begins with the word hello, we would simply put the ^ character in the beginning of the search pattern like this:

preg_match('/^hello/', $test_string);

Please note that regular expressions are case sensitive: the above pattern won’t match the word hElLo. If we want our pattern to be case insensitive, we should apply the following modifier:

preg_match('/^hello/i', $test_string);

Notice the character i at the end of the pattern after the forward slash.

Now let’s examine a more complex search pattern. What if we want to check that the first five characters in the string are alphanumeric?

preg_match('/^[A-Za-z0-9]{5}/', $test_string);

Let’s dissect this search pattern. First, the caret character (^) anchors the match to the start of the string. Next, [A-Za-z0-9] specifies a single alphanumeric character.

A-Z means all the characters from A to Z, and a-z means the same for lowercase characters. Both are needed because regular expressions are case sensitive. 0-9 covers the digits from 0 to 9.

{5} tells the regex parser to match exactly five such characters. If we put six instead of five, the pattern wouldn’t match, because in our test string the word hello is five characters long and is followed by a white-space character, which is not alphanumeric.

Also, this regular expression could be shortened to the following form:

preg_match('/^\w{5}/', $test_string);

\w matches any alphanumeric character plus the underscore character (_), so it is slightly broader than [A-Za-z0-9].
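The two forms differ only in how they treat the underscore, and the count in braces has to fit the text. A brief sketch of both points; the string ab_cd world is a made-up input chosen to show the difference.

```php
<?php
$test_string = 'hello world';

// Five alphanumeric characters at the start: matches "hello".
$five = preg_match('/^[A-Za-z0-9]{5}/', $test_string);

// Six are required, but the sixth character is a space: no match.
$six = preg_match('/^[A-Za-z0-9]{6}/', $test_string);

// The underscore is matched by \w but not by the explicit class.
$with_underscore = 'ab_cd world';
$explicit  = preg_match('/^[A-Za-z0-9]{5}/', $with_underscore);
$shorthand = preg_match('/^\w{5}/', $with_underscore);

var_dump($five, $six, $explicit, $shorthand);
```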

5. Useful Regex Functions

Here are a few PHP functions using regular expressions which you could use on a daily basis.

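Validate e-mail. This function checks whether a given e-mail address string has a plausible form. It is deliberately simple and rejects some valid addresses, such as john+doe@example.com.

function validate_email($email_address)
{
    // Local part: starts with a letter or digit, then letters, digits, dots, underscores or hyphens.
    // Domain: two or more labels separated by dots, e.g. example.com.
    $pattern = '/^[a-zA-Z0-9][a-zA-Z0-9._-]*@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+$/';
    return preg_match($pattern, $email_address) === 1;
}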
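Several commenters on this article suggest PHP's built-in filter_var function as an alternative to a hand-written pattern. A minimal sketch of that approach follows; the function name validate_email_builtin is hypothetical and the sample addresses are made up.

```php
<?php
// Hypothetical helper: validates an e-mail address without a custom regex.
function validate_email_builtin(string $email_address): bool
{
    // filter_var() returns the filtered value on success and false on failure.
    return filter_var($email_address, FILTER_VALIDATE_EMAIL) !== false;
}

$plain = validate_email_builtin('john@example.com');
$plus  = validate_email_builtin('john+doe@example.com');
$bad   = validate_email_builtin('not an email');

var_dump($plain, $plus, $bad);
```

The built-in filter accepts addresses with a plus sign in the local part, which simple hand-written patterns often reject.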

Validate a URL

function validate_url($url)
{
    // Using | as the delimiter avoids escaping the slashes; the dot between host labels is escaped.
    return preg_match('|^http(s)?://[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $url);
}
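In a pattern, an unescaped period matches any character, so a dot that is meant literally (as between the labels of a host name) needs a backslash. The sketch below compares the two on a made-up, invalid URL; the simplified patterns are for illustration only.

```php
<?php
// Unescaped dot: "any character followed by a label".
$loose  = '|^https?://[a-z0-9-]+(.[a-z0-9-]+)*$|i';

// Escaped dot: "a literal dot followed by a label".
$strict = '|^https?://[a-z0-9-]+(\.[a-z0-9-]+)*$|i';

// The underscore is not allowed in a label, so this should be rejected.
$url = 'http://exa_mple';

$loose_result  = preg_match($loose, $url);   // the . swallows the underscore
$strict_result = preg_match($strict, $url);

var_dump($loose_result, $strict_result);
```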

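Remove repeated words. Text often contains accidentally repeated words, such as this this. This handy function removes such duplicates.

function remove_duplicate_word($text)
{
    // \1 refers back to the word captured by the first group.
    // Single quotes keep the backslashes intact for the regex engine.
    return preg_replace('/\b(\w+)(\s+\1\b)+/i', '$1', $text);
}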
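Matching a repeated word relies on a backreference: \1 matches the same text that the first parenthesized group captured. The following sketch shows one way to write such a pattern so that it also collapses a word repeated three or more times; the pattern and sample sentences are illustrative assumptions, not necessarily the author's original.

```php
<?php
// \b(\w+)       a whole word, captured as group 1
// (\s+\1\b)+    one or more repetitions of white space plus the same word
$pattern = '/\b(\w+)(\s+\1\b)+/i';

$once  = preg_replace($pattern, '$1', 'Oh this this is good');
$twice = preg_replace($pattern, '$1', 'Oh this this this is good');

var_dump($once, $twice);
```

The pattern is written in single quotes on purpose: inside a double-quoted PHP string, \1 would be interpreted as an octal character escape before the regex engine ever sees it.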

Validate alphanumeric characters, dashes, underscores and spaces

function validate_alpha($text)
{
    // The hyphen is placed last in the class so it is read literally, not as a range.
    return preg_match('/^[A-Za-z0-9_ -]+$/', $text);
}
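Where the hyphen sits inside a character class matters: between two characters it defines a range, while at the very end it is literal. A sketch of the difference with a made-up sample string. The @ operator is used here only to silence the warning PHP raises for the malformed pattern.

```php
<?php
$sample = 'my-file name_1';

// Hyphen last in the class: a literal hyphen.
$ok = preg_match('/^[A-Za-z0-9_ -]+$/', $sample);

// Hyphen between _ and a space: read as a range from _ to space, which is
// out of order, so the pattern fails to compile and preg_match() returns false.
$broken = @preg_match('/^[A-Za-z0-9_- ]+$/', $sample);

var_dump($ok, $broken);
```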

Validate US ZIP codes

function validate_zip($zip_code)
{
    // Five digits, optionally followed by a hyphen and four more digits.
    return preg_match('/^([0-9]{5})(-[0-9]{4})?$/', $zip_code);
}
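The parentheses in the ZIP pattern also capture the parts of the code. As a sketch, a hypothetical helper split_zip (not part of the original article) can return them separately; the (?: ... ) form groups without capturing, so only the digits are stored.

```php
<?php
// Hypothetical helper: returns the parts of a ZIP code, or null if invalid.
function split_zip(string $zip_code): ?array
{
    if (preg_match('/^([0-9]{5})(?:-([0-9]{4}))?$/', $zip_code, $m) !== 1) {
        return null;
    }
    // The second group is absent from $m when the +4 part is missing.
    return ['zip' => $m[1], 'plus4' => $m[2] ?? ''];
}

$full    = split_zip('12345-6789');
$short   = split_zip('12345');
$invalid = split_zip('1234');

var_dump($full, $short, $invalid);
```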

6. Regex Cheat Sheet

Because cheat sheets are cool nowadays, below you can find a PCRE cheat sheet that you can run through quickly anytime you forget something.

Meta Characters

Character	Description
^	Marks the start of a string
$	Marks the end of a string
.	Matches any single character
|	Boolean OR
()	Group elements
[abc]	Item in range (a, b or c)
[^abc]	NOT in range (every character except a, b or c)
a?	Zero or one a character. Equals a{0,1}
a*	Zero or more of a
a+	One or more of a
a{2}	Exactly two of a
a{0,5}	Up to five of a
a{5,10}	Between five and ten of a
\w	Any alphanumeric character plus underscore. Equals [A-Za-z0-9_]
\W	Any character that is neither alphanumeric nor an underscore
\s	Any white-space character
\S	Any non white-space character
\d	Any digit. Equals [0-9]
\D	Any non-digit. Equals [^0-9]
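A few of the character classes from the cheat sheet in action. This is a sketch on made-up input, such as a telephone number that needs its punctuation stripped.

```php
<?php
// \D matches any non-digit, so removing \D+ leaves only the digits.
$digits = preg_replace('/\D+/', '', '(555) 123-4567');

// [aeiou] matches any one of the listed characters.
$no_vowels = preg_replace('/[aeiou]/', '', 'hello world');

// \s+ matches a run of white space, including tabs.
$tokens = preg_split('/\s+/', "one  two\tthree");

// [^abc] matches any character except a, b or c; none exists here.
$has_other = preg_match('/[^abc]/', 'abcabc');

var_dump($digits, $no_vowels, $tokens, $has_other);
```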

Pattern Modifiers

Modifier	Description
i	Ignore case
m	Multiline mode
S	Extra analysis of pattern
u	Pattern is treated as UTF-8
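A short sketch of the m and u modifiers from the table, using made-up strings. In multiline mode, ^ and $ also match at line breaks inside the subject; with u, the pattern and subject are treated as UTF-8, so a period matches a whole character and not a single byte.

```php
<?php
$lines = "first line\nhello again";

// Without m, ^ matches only at the very start of the subject.
$single = preg_match('/^hello/', $lines);

// With m, ^ also matches right after each newline.
$multi = preg_match('/^hello/m', $lines);

// The letter e with an acute accent occupies two bytes in UTF-8.
$word = "\u{00E9}";

// Without u the subject is seen as two separate bytes.
$bytes = preg_match('/^.$/', $word);

// With u it is seen as one character.
$chars = preg_match('/^.$/u', $word);

var_dump($single, $multi, $bytes, $chars);
```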


Author: Joel Reyes

Joel Reyes has been designing and coding web sites for several years. This has led him to be the creative mind behind Looney Designer, a design resource and portfolio site that revolves around web and graphic design.


Comments:

dq5e1b

More than a year ago

thank you! This tutorial is very useful !
Hinesh Ghelani

More than a year ago

Hi Joel, thank you so much for your very informative web page. I am new to PHP and its pages such as yours, which help people like me understand PHP and bit more.

Joel, is there any chance you can help me with this, I have been struggling with this for two weeks now.

. OID 1 3 6 1 4 1 6486 800 1 2 1 8 1 1 1 1 1 1002 18 244 48 185 115 114 2 Type OctetString Value F4 30 B9 73 72 02 .

I have this string and I just need two bits of information from it, that being the 1002 and the F4 30 B9 73 72 02

Joel, is there any chance you could show me how to do this please?
hvardhan

More than a year ago

nice tutorial :)
Try php based online regex editor which helps you test your regular expressions with real-time highlighting of regex match on data input
alen

More than a year ago

Nice tutorial. Thank you.
Nalaka Prasad

More than a year ago

Thanks for the greate tut about regular expressions, I learned a lot abut regexp from this article.. :)
laeeq

More than a year ago

Thanks a lot for this. Really helped me out.
Web Genius

More than a year ago

Woww, i love this tutorial, it really helped:-)
Md Aijaz

More than a year ago

awesome. thanks
nothingchen

More than a year ago

very useful. thank you so much. :)

Follow me on twitter:
nothingchen
Akash

More than a year ago

Quit helpful for the beginners...
Gerard

More than a year ago

Thanks a lot for this. Really helped me out. I hate Regular Expressions!!!!!
meeni

More than a year ago

great article..thanks for the help..
Valmir

More than a year ago

I need help on this --> ^N[[:alpha:]]*{-8}[0-9] what is the rezult?
Neeraj

More than a year ago

Hey Lemon, I think the following may work for you

<?php
// Match whole words that start with c and end with t, ignoring case.
$pattern = '/\bc\w*t\b/i';
$string = "A cat tear a coat";

if (preg_match($pattern, $string)) {
    echo "Valid String";
} else {
    echo "Invalid String";
}
?>
Lemon

More than a year ago

I want to make a program which will read a big string and search all the words that started with "C" and Ending with "T" by PHP.

Please Guide me , how to?
Jatin

More than a year ago

I was thinking of learning regular expression, thanks for showing up the article at a perfect time.
Brett Widmann

More than a year ago

This was a really helpful guide. I will be showing this article to some other beginners to give them a basis behind PHP.
Claude Lomack

More than a year ago

Are there any other alternatives?
Ahmed

More than a year ago

Before reading this I knew nothing about regex (except the name), now I am leaving this page with lot of info plus good example to be kept in mind whenever dealing Regex.
Thanks a lot.
Donald

More than a year ago

But I'm wandering where to register.
Denker

More than a year ago

Great article, thanks for share it.
Prakash

More than a year ago

I think this is an effective & useful post for the professional designers to boost up their creativity. Simple to read & understand & easy to work out. We need this actually. Thanks for giving your advice at a right time. Keep on doing this great job & thx again.
Who Knows

More than a year ago

Nice article, also you should consider using the filter_var php function in some cases:

It can validate email, urls etc. without using regular expressions.

Cheers!
Kojeje

More than a year ago

Hey Master..
thanks for teaching me this code...
zakki

More than a year ago

basic syntax help me very much
shad

More than a year ago

hi all

i am planing to do my final year projcet in UNIV can u HElP me with PHP please if you can email me on shad24×7@hotmail.co.uk
bem km ipb

More than a year ago

"REGEX" is Oke.. :D
mehdy

More than a year ago

Thank you for this clean and great tutorial,
Best Regards :)
chandan dutta

More than a year ago

web designer
Thanks for great article.Its very useful for me.
Hans Lee

More than a year ago

You could test your regexp in the PHP regexp tester.

The regexp should be
/(\w+)\s(\1)/
Kian Ann

More than a year ago

Further to my confusion, I went to test the replace duplicate words regex out and it doesn't work... at least not for me.

return preg_replace("/\s(\w+\s)\1/i", "$1", $text);

I went around doing some searches and ended up asking a friend. This works to replace duplicate words:

return preg_replace('/(\w+)\s(\1)/', '$1', "Oh this this is good");
// returns "Oh this is good"

But this replaces only 1 duplicate word. If the text was "Oh this this this is good", then it will not replace 3 copies of the words at once - you'll have to do an iteration (or recursion)... whatever u call it.
Kian Ann

More than a year ago

This is a great tutorial! But I don't get the remove duplicate words thing?

return preg_replace("/\s(\w+\s)\1/i", "$1", $text);

I'd assume the text is "this this", and "/\s(\w+\s)\1/i" is supposed to match "this this" and replace it with "this"?

Does the second part "$1" refer to the (\w+\s) group? How do we match two identical words?
krisha

More than a year ago

good tutorial i improved my knowledge
Mehedi Hasan

More than a year ago

great great great tutorial. today i learn many things... thanks a lot for sharing...
Ahmad Alfy

More than a year ago

On Pattern Modifiers, you forgot g (global)
It's pretty useful
Hamranhansen

More than a year ago

> As an addition in PHP the forward slash character
> is escaped using the simple slash \.

This is a "slash": /

This is a "backslash": \
Bijay Rungta (@rungss), Sharing Knowledge

More than a year ago

Very nicely written...

Hope this helps People who have difficulty understanding the concept of Regular Expressions and using them in PHP..

@rungss on Twitter
Joel Reyes

More than a year ago

Hello!

I tested the regular expressions again.

You're right mentioned conditions are correct.

The last condition should be like this: '/(hello)|(world)/'. Then it will match either hello or world.

Regards :)
Alex Edwards

More than a year ago

Nice post.

Might be worth mentioning some off the online testing tools? I found these invaluable when learning regexp. Here's a couple:
fx15 lida y?lan ya?? kar?nca yumurtas? xacc

More than a year ago

thank you. very good post.
Zac

More than a year ago

+1... what editor/theme is that is looks really easy on the eyes.
Keith

More than a year ago

Not a bad introduction, except for the mistakes pointed out by Gman above (see, they're tricky things, regexes!)

Also, don't use the example shown for email validation -- it's too simple, and will fail valid addresses such as john+doe@example.com. To validate addresses against all the forms permitted by the relevant RFCs requires a MUCH more complex regular expression.
Arkh

More than a year ago

Woot, another regex tutorial for newbies.
Nothing about comments in the regex (yes, you can do that and it's freakin' usefull for big ones).
Awesome, slash delimiters only, guess '`/*`' doesn't work. Snap.

Why lose time writing about simple regexp when you can just link to o/ and write about php specific topics like preg_quote which helps avoid using things like \ in your regexp.
Andy Walpole

More than a year ago

When you first look at a regex pattern you think, "Oh my god that looks ridiculously complicated"

But it's not actually as complex as it initially looks.

They are well worth learning about as they really come in handy in PHP and mod_rewrite
Pierre

More than a year ago

Great article

For the 2 first example of useful regex ( Email and Url ), the best way to validate thata is to use the filter_var() function from the filter extension activated in php5.2.0
Sky

More than a year ago

PHP has some integrated functions to validate things like url's or email's :
I know that this is a tutorial, but still, why code something that already exists. :) Nice day to all.
Alex

More than a year ago

o/reference.html
has helped me a lot :)
RCKY

More than a year ago

Thank you...
dont think the if statement in the email validator function is needed... return preg_match(...); would be enough... wouldn't it?
derschreckliche

More than a year ago

@Gman: I had the same thoughts reading it - so here are the right versions:
(You just have to add brackets to apply a quantifier ( ? , * , + , {min,max} ) or | (union) to more than one character.)

'/(he)?llo/' It will match either llo or hello
'/(hello)+/' It will match hello one or more times. E.g. hello or hellohello
'/(he)*llo/' Matches llo, hello or hehello, but not hellooo
'/(hello)|(world)/' It will either match the word hello or world

here is a very useful website to test regexp online:

Even it's a german site it's the best i know - for all of you who need practice, like me sometimes ;)
Checking "Color-Pattern anzeigen" highlights your regular expression with colors and "Treffer anzeigen" will highlight the matches in the text you entered.
The resulting $matches array is also shown.
(Enter the regexp only, without the '/ and /' )
fly2279

More than a year ago

Thank you for this great article. What theme or style are you using for your code editor?
joyoge designers' bookmark

More than a year ago

useful tutorial, thanks for tips..
S.Krol

More than a year ago

Try the RegexCoach
ed

More than a year ago

great! ive been looking for such a guide for a while now!
thanks so much!
Ahmed

More than a year ago

Very good tut ill try some tips of it
Ivan Mišić

More than a year ago

Good explained, but after this tutorial, I'll need lot of practice.
Gman

More than a year ago

I think you have some mistakes

'/he?llo/' It will match either hllo or hello
'/hello+/' It will match hell and o one or more times. E.g. hello or helloooooooooo
'/he*llo/' Matches hllo, hello or heeello, but not hellooo
'/hello|world/' It will either match the word helloorld or hellworld
....
there might be some more, but I haven't read all
Ben

More than a year ago

Great article there :)

I think regular expressions are something you will learn over time, not something you can be done with after 1 tutorial.