Regular Expression
Regular expressions are patterns used to match character combinations in strings. They provide a way to perform search and match operations on text.
One useful application of regular expressions is input validation. For example, to check whether a value entered by a user is a number, we can use the regex /^[0-9]+$/. On form submission, we match the text the user entered against the regex to validate the input.
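As a minimal sketch of the validation described above, the snippet below wraps the regex in a small helper. The function name isWholeNumber is hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper: returns true when the input consists only of digits.
function isWholeNumber(input) {
  return /^[0-9]+$/.test(input);
}

console.log(isWholeNumber("12345")); // true
console.log(isWholeNumber("12a45")); // false
console.log(isWholeNumber(""));      // false, because + needs at least one digit
```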
In JavaScript, regular expressions are objects. The built-in RegExp object provides the test and exec methods for matching a string against a regex.
Quick Guide to Write Regular Expressions
There are 4 main categories of characters used in regex
Simple Patterns
Simple patterns perform a plain substring match. For example, to match all strings containing product, we can use the regex /product/, which will match the following strings:
http://xyz.com/product/radio
http://xyz.com/product/tv
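A quick way to try this substring match is to filter a list of URLs. The third URL below is a hypothetical one added only to show a non-match.

```javascript
const urls = [
  "http://xyz.com/product/radio",
  "http://xyz.com/product/tv",
  "http://xyz.com/about", // hypothetical URL, added for contrast
];

// Keep only the URLs that contain the substring "product".
const productUrls = urls.filter((url) => /product/.test(url));

console.log(productUrls.length); // 2
```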
Special Characters
- Wildcard (. dot)
The dot is a wildcard that matches any single character (letter, digit, whitespace, and so on); in JavaScript it does not match a line break by default.
Pro Tip: The regex .* matches any character any number of times, including zero.
Note: Because the dot has this special meaning, matching a literal period requires escaping it with a backslash: \. will match the (.) dot in plain text.
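The difference between the wildcard dot and an escaped dot can be checked with a short sketch like this one. The sample strings are made up for illustration.

```javascript
const anyChar = /a.c/;     // dot matches any single character
const literalDot = /a\.c/; // escaped dot matches only a period

console.log(anyChar.test("abc"));    // true
console.log(anyChar.test("a-c"));    // true
console.log(anyChar.test("ac"));     // false, the dot needs one character
console.log(literalDot.test("a.c")); // true
console.log(literalDot.test("abc")); // false
```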
- Quantifiers: * + and {}
Plus (+) repeats the preceding character one or more times.
Example: The regular expression ab+c will match abc, abbc, abbbc, etc.
Asterisk (*) repeats the preceding character zero or more times.
Example: The regular expression ab*c will match ac, abc, abbc, abbbc, etc.
Curly braces {} repeat the preceding character the number of times given inside the braces.
Example: {2} means that the preceding character is repeated exactly twice. {min,} means that the preceding character is repeated min or more times. {min,max} means that the preceding character is repeated at least min and at most max times.
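The quantifiers above can be compared side by side by running each against the same sample strings. This is only a sketch: the helper matchesOf is hypothetical, and each pattern is wrapped in ^ and $ so that it must match the whole string.

```javascript
const samples = ["ac", "abc", "abbc", "abbbc"];

// Hypothetical helper: list the samples that a regex matches in full.
const matchesOf = (regex) => samples.filter((s) => regex.test(s));

console.log(matchesOf(/^ab+c$/));     // [ 'abc', 'abbc', 'abbbc' ]
console.log(matchesOf(/^ab*c$/));     // [ 'ac', 'abc', 'abbc', 'abbbc' ]
console.log(matchesOf(/^ab{2}c$/));   // [ 'abbc' ]
console.log(matchesOf(/^ab{2,}c$/));  // [ 'abbc', 'abbbc' ]
console.log(matchesOf(/^ab{1,2}c$/)); // [ 'abc', 'abbc' ]
```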
- Optional (?)
A question mark makes the preceding character optional.
Example: We can write the pattern for a document file extension as docx?. The ? tells the computer that the x may or may not be present in the file extension.
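A small sketch of the optional x, assuming we want to check file names that end in .doc or .docx. The file names are hypothetical, and the \. and $ parts are additions that pin the extension to the end of the name.

```javascript
const wordFile = /\.docx?$/;

console.log(wordFile.test("report.doc"));  // true
console.log(wordFile.test("report.docx")); // true
console.log(wordFile.test("report.pdf"));  // false
```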
- ^ and $ - Start and ending position for match
^ means the start of the string
Example: ^\d{3} will match with patterns like “901” in “901-333-“.
$ means the end of string
Example: \d{3}$ will match patterns like “333” in “-901-333”.
Pro Tip: To match exactly one particular word, place it between ^ and $, e.g. ^success$
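The anchors can be tried out with the same inputs used in the examples above. The string "successful" is an extra, made-up input that shows why both anchors are needed for an exact match.

```javascript
// ^ anchors the match at the start of the string.
const startMatch = "901-333-".match(/^\d{3}/);
console.log(startMatch[0]); // "901"

// $ anchors the match at the end of the string.
const endMatch = "-901-333".match(/\d{3}$/);
console.log(endMatch[0]); // "333"

// Both anchors together require the whole string to match.
console.log(/^success$/.test("success"));    // true
console.log(/^success$/.test("successful")); // false
```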
- Pipe (|)
Matches any one element separated by vertical bar(|) character. It’s a logical OR to denote different possible sets of characters.
Example: th(e|is|at) will match words - the, this and that.
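To see the alternation at work, the sketch below checks a few words against th(e|is|at). The anchors ^ and $ and the extra word "then" are additions for illustration, so that only whole-word matches count.

```javascript
const words = ["the", "this", "that", "then"];

// Keep the words that are exactly "th" followed by e, is, or at.
const matched = words.filter((word) => /^th(e|is|at)$/.test(word));

console.log(matched); // [ 'the', 'this', 'that' ]
```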
- Character Classes
A character class matches a basic category of character, such as a letter, a digit, a space, or a symbol.
\s: matches any whitespace character, such as a space or tab
\S: matches any non-whitespace character
\d: matches any digit character. The preceding backslash distinguishes it from the plain d character and indicates that it is a metacharacter.
\D: matches any non-digit character
\w: matches any word character (letters, digits, and underscore)
\W: matches any non-word character
\b: matches a word boundary, i.e. the position between a word character and a non-word character (such as a space, dash, comma, or semicolon), or the start or end of the string
[set_of_characters]: matches any single character in set_of_characters. By default, the match is case-sensitive.
Example: [abc] will match characters a,b and c in any string.
[^set_of_characters]: Matches any single character that is not in set_of_characters. By default, the match is case sensitive.
Example: [^abc] will match any character except a,b,c.
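The character classes listed above can be explored with String.prototype.match and the g flag, which returns every match. The sample sentences are made up for illustration.

```javascript
const text = "Order 66: ship 3 boxes!";

const digitRuns = text.match(/\d+/g); // runs of digits
const wordRuns = text.match(/\w+/g);  // runs of word characters
const pieces = text.split(/\s+/);     // split on whitespace

console.log(digitRuns); // [ '66', '3' ]
console.log(wordRuns);  // [ 'Order', '66', 'ship', '3', 'boxes' ]
console.log(pieces);    // [ 'Order', '66:', 'ship', '3', 'boxes!' ]

// \b keeps "cat" from matching inside "concat".
const wholeWord = "cat concat".match(/\bcat\b/g);
console.log(wholeWord); // [ 'cat' ]

// Sets and negated sets match one character at a time.
const inSet = "abcdef".match(/[abc]/g);
const notInSet = "abcdef".match(/[^abc]/g);
console.log(inSet);    // [ 'a', 'b', 'c' ]
console.log(notInSet); // [ 'd', 'e', 'f' ]
```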
Escaping
To match actual characters such as ‘+’ or ‘.’, add a backslash ( \ ) before the character. This ensures that the character is treated as a simple character and not as a special character.
Example: \d+[\+-x\*]\d+ will match patterns like “2+2” and “3*9” in “(2+2) * 3*9”.
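As a simpler illustration of escaping, the sketch below matches a literal plus sign and a literal period. The inputs are hypothetical.

```javascript
const addition = /\d+\+\d+/; // digits, a literal "+", digits
const decimal = /\d+\.\d+/;  // digits, a literal ".", digits

console.log(addition.test("2+2"));  // true
console.log(addition.test("22"));   // false, there is no "+" to match
console.log(decimal.test("3.14"));  // true
console.log(decimal.test("3x14"));  // false, "x" is not a period
```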
Parentheses
A set of symbols in a regular expression can be grouped to act as a single unit by wrapping them in parentheses ( ).
Example: ([A-Z]\w+) groups two elements of the regular expression together. This expression will match any pattern containing an uppercase letter followed by one or more word characters.
Example: The regex below matches each of these paths:
/products/men/cycles/
/products/women/cycles/
/products/kids/cycles/
Regex: ^\/products\/(men|women|kids)\/cycles\/
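Besides grouping, parentheses also capture the text they match, which exec exposes at index 1 and up of its result. A minimal sketch using the paths above follows; the /products/pets/cycles/ path is a hypothetical non-match.

```javascript
const cyclesPath = /^\/products\/(men|women|kids)\/cycles\//;

const result = cyclesPath.exec("/products/women/cycles/");
console.log(result[0]); // "/products/women/cycles/" (the full match)
console.log(result[1]); // "women" (the captured group)

// A category outside the group does not match.
console.log(cyclesPath.test("/products/pets/cycles/")); // false
```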
Regular expression cheat sheet
Useful Resources
- Test your Regex on https://regex101.com/
- Understand complex regex using https://regexper.com/ The website creates Railroad Diagrams out of regex to explain them in a straight-forward way
Regex in JavaScript
- Creating Regex
let patt = /abc/i; // {syntax = /pattern/modifiers}
abc is a pattern (to be used in a search). i is a modifier (modifies the search to be case-insensitive).
A regex can also be created by calling the RegExp constructor.
let re = new RegExp('abc');
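The constructor form is mostly useful when the pattern is only known at runtime. The sketch below assumes a hypothetical variable term holding the text to search for. Modifiers go in the second argument, and backslashes must be doubled because the pattern is written as a string.

```javascript
const term = "abc"; // hypothetical value, e.g. taken from user input

// Equivalent to /abc/i, but built from a variable.
const fromVariable = new RegExp(term, "i");
console.log(fromVariable.test("xxABCxx")); // true

// Inside a string literal, \d must be written as \\d.
const digits = new RegExp("\\d+");
console.log(digits.exec("room 42")[0]); // "42"
```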
- Regular Expression operations using RegExp object Methods
- test: searches a string for a pattern and returns true or false.
var patt = /e/;
patt.test("The best things in life are free!");
output: true
- exec: searches a string for a specified pattern and returns the match as a result array that also carries index and input properties. If no match is found, it returns null. The following example searches a string for the character “e”:
var obj = /e/.exec("The best things in life are free!");
document.getElementById("demo").innerHTML =
"Found " + obj[0] + " in position " + obj.index + " in the text: " + obj.input;
Output = Found e in position 2 in the text: The best things in life are free!
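When the regex has the g modifier, exec remembers where the last match ended, so calling it in a loop walks through every match. A minimal sketch that collects the position of each “e” in the same sentence:

```javascript
const pattern = /e/g;
const sentence = "The best things in life are free!";
const positions = [];

let match;
// exec returns null once there are no more matches.
while ((match = pattern.exec(sentence)) !== null) {
  positions.push(match.index);
}

console.log(positions); // [ 2, 5, 22, 26, 30, 31 ]
```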
- Regex Operations using String object Methods
search:
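Uses a regular expression to search for a match and returns the position of the first match, or -1 if there is none.
function myfun() {
  var str = "Check the data";
  // case-insensitive search finds "data" at index 10
  var p = str.search(/data/i);
  console.log(p);
  // case-sensitive search for "Data" finds nothing and returns -1
  var q = str.search(/Data/);
  console.log(q);
}
myfun();
Output: 10, then -1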
replace:
Returns a new string in which the matched pattern is replaced; the original string is left unchanged.
function myFun() {
// input string
var str = "Check your email-address!";
// replacing with modifier i
var txt = str.replace(/email-address/i, "Phone no.");
console.log(txt);
}
myFun();
Output: Check your Phone no.!
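Two details of replace are worth sketching: without the g modifier only the first match is replaced, and the replacement string can refer to captured groups as $1, $2, and so on. The sample strings are made up for illustration.

```javascript
const phrase = "red car, red bike";

const firstOnly = phrase.replace(/red/, "blue");  // replaces the first match
const everyOne = phrase.replace(/red/g, "blue");  // replaces all matches

console.log(firstOnly); // "blue car, red bike"
console.log(everyOne);  // "blue car, blue bike"

// Reorder a date using the captured year, month, and day.
const reordered = "2021-03-15".replace(/(\d{4})-(\d{2})-(\d{2})/, "$3/$2/$1");
console.log(reordered); // "15/03/2021"
```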
match:
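Returns an array of matches, or null if no match is found. Without the g modifier, the array holds the first match and its capturing groups; with the g modifier, it holds every full match but no capturing groups.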
matchAll:
Returns an iterator over all of the matches, including capturing groups. The regex must have the g modifier.
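The sketch below contrasts match with and without the g modifier and then uses matchAll to get every match together with its groups. The date strings are hypothetical sample data.

```javascript
const log = "2021-03-15 and 2022-11-30";

// Without g: the first match plus its capturing groups.
const first = log.match(/(\d{4})-(\d{2})-(\d{2})/);
console.log(first[0]); // "2021-03-15"
console.log(first[1]); // "2021"

// With g: every full match, but no capturing groups.
const allDates = log.match(/\d{4}-\d{2}-\d{2}/g);
console.log(allDates); // [ '2021-03-15', '2022-11-30' ]

// matchAll: every match with its groups; spread the iterator into an array.
const years = [...log.matchAll(/(\d{4})-(\d{2})-(\d{2})/g)].map((m) => m[1]);
console.log(years); // [ '2021', '2022' ]

// No match at all gives null.
console.log("no dates here".match(/\d{4}/)); // null
```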
Go through this guide on regular expressions in JavaScript.