Regular Expressions in a Nutshell

So, you want to validate an email address that a user entered into a form to make sure it is correctly formatted. No problem: just use a Regular Expression like the following:

/^(?("")("".+?""@)|(([0-9a-zA-Z]((\.(?!\.))|[-!#\$%&'\*\+/=\?\^`\{\}\|~\w])*)(?<=[0-9a-zA-Z])@))(?(\[)(\[(\d{1,3}\.){3}\d{1,3}\])|(([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,6}))$/
What in the Sam Hill…???

If that was your reaction (as it was mine when I first discovered Regular Expressions), this article is an attempt to demystify that mess, explain it in a nutshell, and serve as an introduction to the topic.

Regular Expressions In a Nutshell:

  • In a Quora thread about the most useful and underrated skill in computer programming, Jaime Potter responded that it’s knowing how to use Regular Expressions well. He posted a diagram breaking down the components of a Regular Expression, which I think best represents them in a nutshell (all credit goes to him as the source of the diagram).

  • Typically, you take a Regular Expression (like the one above) and use a matching function (like preg_match() in PHP) to see whether a given string matches the search pattern that the Regular Expression defines. A common use case is validating user-submitted form data (for example, an email or a street address).
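
As a minimal sketch of that matching step, the snippet below checks a made-up five-digit ZIP code. The sample values and variable names are assumptions for illustration only. It also shows why the anchors matter for validation: without ^ and $, the pattern only has to match somewhere inside the string.

```php
<?php
// Hypothetical sample value standing in for user-submitted form data.
$zip = '90210';

// Anchored with ^ and $: the entire string must be exactly five digits.
$isValid = preg_match('/^\d{5}$/', $zip) === 1;

// Unanchored: succeeds if five digits appear anywhere in the string.
$containsFive = preg_match('/\d{5}/', 'ZIP: 90210 (CA)') === 1;

var_dump($isValid);                          // bool(true)
var_dump($containsFive);                     // bool(true)
var_dump(preg_match('/^\d{5}$/', '9021a'));  // int(0)
```

preg_match() returns 1 on a match, 0 on no match, and false if an error occurred, which is why the sketch compares against 1 explicitly.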

Definitions:

  • Regular Expression: (paraphrased from Wikipedia) A sequence of characters that defines a search pattern, typically used to find, find and replace, or validate part or all of a string. (Also referred to as regex or regexp.)
  • Delimiter: A character or symbol that marks where a piece of data or a string of text begins and ends, separating it from whatever surrounds it. In Regular Expressions, the delimiter is the ‘/’ at the beginning and end of the pattern. Another example is the ‘;’ at the end of a statement (e.g. let x = 5;).
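
In PHP, the delimiter does not have to be ‘/’. Any character that is not alphanumeric, a backslash, or whitespace can be used. The sketch below matches the same made-up URL twice: once with ‘/’ as the delimiter, which forces every literal slash to be escaped, and once with ‘#’. The URL and variable names are assumptions for illustration.

```php
<?php
// Hypothetical URL used only for illustration.
$url = 'https://example.com/docs/regex';

// With "/" as the delimiter, every literal slash in the pattern must be escaped.
$withSlashes = preg_match('/^https:\/\/[^\/]+\/docs\//', $url);

// With "#" as the delimiter, the slashes can be written as-is.
$withHashes = preg_match('#^https://[^/]+/docs/#', $url);

var_dump($withSlashes, $withHashes); // prints int(1) twice
```

Picking a delimiter that does not appear in the pattern keeps patterns for paths and URLs much easier to read.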

REGULAR EXPRESSIONS CHEAT SHEET:

To get started, see this quick reference sheet that I discovered in the User Contributed Notes section of the preg_match() documentation on PHP.net. This very helpful comment was made by a user named ‘force at md-t dot org’ and lists a cheat sheet of Regular Expression match patterns:

[abc]     A single character: a, b or c
[^abc]     Any single character but a, b, or c
[a-z]     Any single character in the range a-z
[a-zA-Z]     Any single character in the range a-z or A-Z
^     Start of line
$     End of line
\A     Start of string
\z     End of string
.     Any single character
\s     Any whitespace character
\S     Any non-whitespace character
\d     Any digit
\D     Any non-digit
\w     Any word character (letter, number, underscore)
\W     Any non-word character
\b     Any word boundary (a position between characters, not a character itself)
(...)     Capture everything enclosed
(a|b)     a or b
a?     Zero or one of a
a*     Zero or more of a
a+     One or more of a
a{3}     Exactly 3 of a
a{3,}     3 or more of a
a{3,6}     Between 3 and 6 of a

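options (as they behave in PHP):

i     Case insensitive
m     Multiline: ^ and $ match at the start and end of each line
s     Make dot match newlines
x     Ignore whitespace in regex

(The original comment gives these options with their Ruby meanings, where m makes dot match newlines and o performs #{...} substitutions only once. PHP uses s for the dot behavior and has no o option.)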
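
To see a few of these in action, here is a small sketch that runs several cheat-sheet patterns through preg_match(). All sample strings are made up for illustration. Note that it uses the option letters as PHP's PCRE flavor defines them: m is multiline mode and s makes the dot match newlines.

```php
<?php
// Each entry pairs a cheat-sheet pattern with a made-up sample subject.
$checks = [
    ['/^[a-c]+$/',      'abcab'],      // character range, one or more, whole string
    ['/^[^abc]$/',      'z'],          // any single character except a, b or c
    ['/^\d{3}-\d{4}$/', '555-0199'],   // exact repetition counts
    ['/\bcat\b/',       'a cat sat'],  // word boundaries around "cat"
    ['/^(yes|no)$/',    'no'],         // alternation inside a group
    ['/^colou?r$/',     'color'],      // optional character
];

foreach ($checks as [$pattern, $subject]) {
    $result = preg_match($pattern, $subject) === 1 ? 'match' : 'no match';
    printf("%-18s %-10s %s\n", $pattern, $subject, $result);
}

// Options (modifiers) follow the closing delimiter.
var_dump(preg_match('/hello/i', 'HELLO'));        // i: case insensitive
var_dump(preg_match('/^b$/m', "a\nb\nc"));        // m: ^ and $ match at each line
var_dump(preg_match('/a.b/s', "a\nb"));           // s: dot also matches a newline
var_dump(preg_match('/ \d+ \s* kg /x', '75kg'));  // x: whitespace in the pattern is ignored
```

Every pattern in this sketch matches its sample, so changing one sample at a time is a quick way to explore what each piece of syntax does.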

Typical Use Example of Regular Expressions:

  • Validating user-submitted form data, such as usernames, emails, and addresses.

For example, to make sure that an email entered is in the correct format (e.g. ‘user@example.com’), you can use a Regular Expression to define a search pattern that matches the valid email format, and then check whether the user-submitted email matches the pattern. In PHP, you can do this with the preg_match() function, which returns 1 if the string matches the pattern defined in the regex.

Example:

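// Read the submitted value; fall back to an empty string if the field is missing.
$email = $_POST['user_submitted_email'] ?? '';

// Local part: starts with a letter or underscore, then dot-separated runs of [_a-z0-9-].
// Domain: dot-separated labels ending in a top-level domain of two or more letters.
// The trailing "i" makes the match case insensitive.
$regexp = '/^[_a-z][_a-z0-9-]*(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$/i';

// preg_match() returns 1 on a match, 0 on no match, and false on error.
if (preg_match($regexp, $email) === 1) {
    // process user submitted email;
} else {
    // Error: email is not in a valid format;
}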
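
Beyond a yes-or-no check, preg_match() can also hand back the pieces of the string that matched each group. The sketch below uses a deliberately simplified email pattern (not the one above) with two named groups. The address, the pattern, and the group names local and domain are all assumptions for illustration.

```php
<?php
// Hypothetical address; a real handler would read it from $_POST.
$email = 'jane.doe@example.com';

// Simplified pattern with two named capture groups, "local" and "domain".
$pattern = '/^(?<local>[^@\s]+)@(?<domain>[^@\s]+\.[a-z]{2,})$/i';

if (preg_match($pattern, $email, $matches) === 1) {
    echo $matches['local'], "\n";   // jane.doe
    echo $matches['domain'], "\n";  // example.com
} else {
    echo "Not a valid format\n";
}
```

The third argument to preg_match() is filled with the captured text, keyed both by group number and, for named groups, by name.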

Further Reading:

See this one-page tutorial on Regular Expressions. It comes from https://www.regular-expressions.info, a great site that could be considered a one-stop shop for all things related to Regular Expressions. The site explains how to use Regular Expressions in a way that is thorough yet digestible.
