Why Use Regular Expressions for Searching

Regular expressions (regexes) are a powerful search tool. They find patterns in content, which extends your "find" tools beyond searching for exact characters.

For example, let's say you have to find every (American) phone number in a document or on a web page. You don't know which phone numbers to look for, so the best you can do with a plain CTRL/CMD+F search is guess at something like "Phone" or "801," and that won't find all the phone numbers.

Instead, use a regular expression to define a pattern rather than guess. Open a regex-powered search bar (I usually use my markup editor) and enter the pattern you're searching for, written in the regex "language."

\(?\d{3}.?\W?\d{3}.\d{4}

The regex above will spot any of the following phone numbers found in the live text:

801-787-7976
250-210-2031
801 898-7610
(801)787-6404
(801) 787-6404
801.801.8015

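This works because the regular expression looks for a pattern rather than specific digits. The match may start with an opening paren, but doesn't have to. Next come three digits in a row, optionally followed by any single character (a closing paren, a hyphen, a period, or a space) and then, optionally, one "non-word character" such as a space. (\W matches anything that isn't a letter, digit, or underscore; a whitespace character specifically would be \s.) Then come three more digits, any single character, and a final run of four digits.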
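To see the pattern at work outside a search bar, here is a minimal sketch in Python using the standard re module. The sample text is made up for illustration; only the pattern and the six phone numbers come from the example above.

```python
import re

# The pattern from above, written as a raw string so the backslashes
# reach the regex engine unchanged.
PHONE_PATTERN = re.compile(r"\(?\d{3}.?\W?\d{3}.\d{4}")

# Hypothetical sample text containing the six numbers listed earlier.
sample_text = """
Call 801-787-7976 or 250-210-2031 before noon.
After hours: 801 898-7610, (801)787-6404, or (801) 787-6404.
Fax: 801.801.8015
"""

# findall returns every non-overlapping match, in order of appearance.
matches = PHONE_PATTERN.findall(sample_text)
for match in matches:
    print(match)
```

Because "." accepts any character, the pattern is deliberately loose: it would also match something like 801a787b7976, and it won't match ten digits written with no separators at all. Tighten or loosen it to suit the content you're searching.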

Regexes can be tricky to learn at first, but once you've memorized a couple of patterns, they get easier, and you'll start finding places where they make your work more efficient. I learned a lot from testing my guesses and reading the hints on regex101.com.

Learn by Doing

Here are a few challenges if you want to practice:

  • Find a double space after a period.
  • Find links without a trailing slash (links that don't have "/" at the very end).
  • Find all mailto: links.
  • Find repeated words that are next to each other (e.g., to to, and and, this this).
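If you'd rather check your attempts in code than in a search bar, the sketch below is one way to do it in Python. The check_pattern helper and the sample sentences are hypothetical, written for this illustration. It also includes one possible answer to the repeated-words challenge, so skip that line if you want to work it out yourself.

```python
import re

def check_pattern(pattern, should_match, should_not_match):
    """Return a list of failure messages; an empty list means every sample behaved as expected."""
    regex = re.compile(pattern)
    failures = []
    for text in should_match:
        if regex.search(text) is None:
            failures.append(f"expected a match in: {text!r}")
    for text in should_not_match:
        if regex.search(text) is not None:
            failures.append(f"unexpected match in: {text!r}")
    return failures

# Spoiler: one possible answer to the repeated-words challenge.
# \b(\w+) captures a whole word, \s+ is the gap between the words,
# and \1\b requires the same word again, ending at a word boundary.
repeated_word = r"\b(\w+)\s+\1\b"

failures = check_pattern(
    repeated_word,
    should_match=["I went to to the store", "this this is odd", "and and so on"],
    should_not_match=["I went to the store", "this thistle is odd"],
)
print(failures or "all samples behaved as expected")
```

Listing a few strings that should not match is as useful as listing the ones that should, since an overly loose pattern passes the first list easily.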