Java Regex Tutorial | Regular Expressions

This Java regular expressions tutorial shows how to parse text in Java using regular expressions. Java provides the java.util.regex package for pattern matching with regular expressions.

Table of Contents

  1. Regular Expressions
  2. java.util.regex package
  3. Character classes
  4. Predefined character classes
  5. Java Simple Regular Expression
  6. Java Alphanumeric Regex Example
  7. Java Regex Anchors
  8. Java Regex alternations
  9. Regular Expression Phone Number validation
  10. Java Regex for Matching any Currency Symbol Example
  11. Java Regex capturing groups
  12. Java case-insensitive regular expression
  13. Java Regex email example
  14. Java Regex to check Min/Max Length of Input Text

1. Regular Expressions

Regular expressions are used for text searching and more advanced text manipulation.
Java has a built-in API for working with regular expressions; it is located in the java.util.regex package.
The following table lists some common regular expression constructs.
Regex    Meaning
.        Matches any single character.
?        Matches the preceding element once or not at all.
+        Matches the preceding element one or more times.
*        Matches the preceding element zero or more times.
^        Matches the starting position within the string.
$        Matches the ending position within the string.
|        Alternation operator.
[abc]    Matches a, b, or c.
[a-c]    Range; matches a, b, or c.
[^abc]   Negation; matches any character except a, b, or c.
\s       Matches a whitespace character.
\w       Matches a word character; equivalent to [a-zA-Z_0-9].
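As a quick illustration of the ?, + and * quantifiers from the table, the following minimal sketch checks a few strings against small patterns. The class name QuantifierSketch, the helper fullMatch, and the sample strings are made up for this illustration.

```java
import java.util.regex.Pattern;

class QuantifierSketch {

    // Returns true when the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // ? : the preceding element is optional
        System.out.println(fullMatch("colou?r", "color"));   // true
        System.out.println(fullMatch("colou?r", "colour"));  // true
        // + : the preceding element must occur at least once
        System.out.println(fullMatch("a+", ""));             // false
        System.out.println(fullMatch("a+", "aaa"));          // true
        // * : the preceding element may be missing entirely
        System.out.println(fullMatch("ab*c", "ac"));         // true
        System.out.println(fullMatch("ab*c", "abbbc"));      // true
    }
}
```

Pattern.matches() compiles the regex on every call, which is fine for a one-off check like this one.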

2. java.util.regex package

The Pattern and Matcher classes are the core of Java's regular expression support. The java.util.regex package provides the following classes and interfaces:
  • MatchResult interface
  • Matcher class
  • Pattern class
  • PatternSyntaxException class
Pattern is a compiled representation of a regular expression.

Matcher is an engine that interprets the pattern and performs match operations against an input string. Matcher has methods such as find(), matches(), end() to perform matching operations. When there is an exception parsing a regular expression, Java throws a PatternSyntaxException.
Read more about these classes and interfaces at https://docs.oracle.com/javase/8/docs/api/index.html?java/util/regex/package-summary.html
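The following minimal sketch shows how these classes typically work together: a Pattern is compiled, a Matcher is created for an input, find() walks through the matches, and a PatternSyntaxException is caught for an invalid expression. The class name RegexPackageSketch and the helpers findAll and isValidRegex are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RegexPackageSketch {

    // Collects every match together with its start and end index.
    static List<String> findAll(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);   // compile the expression
        Matcher matcher = pattern.matcher(input);   // bind it to an input
        List<String> hits = new ArrayList<>();
        while (matcher.find()) {                    // advance to the next match
            hits.add(matcher.group() + "@" + matcher.start() + "-" + matcher.end());
        }
        return hits;
    }

    // Reports whether the regex compiles without a syntax error.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "a12b345"));  // [12@1-3, 345@4-7]
        System.out.println(isValidRegex("[a-z"));        // false: unclosed character class
    }
}
```

Note that end() returns the index just past the last matched character, so the match "12" spans indexes 1 to 3.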

3. Character Classes

The following table summarizes the regex for character classes:

Construct        Description
[abc]            a, b, or c (simple class)
[^abc]           Any character except a, b, or c (negation)
[a-zA-Z]         a through z, or A through Z, inclusive (range)
[a-d[m-p]]       a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]     d, e, or f (intersection)
[a-z&&[^bc]]     a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]    a through z, and not m through p: [a-lq-z] (subtraction)
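To see union, intersection, and subtraction in action, the following small sketch tests single characters against three of the constructs from the table. The class name CharacterClassSketch and the helper fullMatch are made up for this illustration.

```java
import java.util.regex.Pattern;

class CharacterClassSketch {

    // Returns true when the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Union: a through d, or m through p
        System.out.println(fullMatch("[a-d[m-p]]", "n"));     // true
        System.out.println(fullMatch("[a-d[m-p]]", "f"));     // false
        // Intersection: only d, e, or f
        System.out.println(fullMatch("[a-z&&[def]]", "e"));   // true
        System.out.println(fullMatch("[a-z&&[def]]", "g"));   // false
        // Subtraction: a through z, except for b and c
        System.out.println(fullMatch("[a-z&&[^bc]]", "a"));   // true
        System.out.println(fullMatch("[a-z&&[^bc]]", "b"));   // false
    }
}
```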

4. Predefined Character Classes

The Pattern API contains a number of useful predefined character classes, which offer convenient shorthands for commonly used regular expressions:
Construct    Description
.            Any character (may or may not match line terminators)
\d           A digit: [0-9]
\D           A non-digit: [^0-9]
\s           A whitespace character: [ \t\n\x0B\f\r]
\S           A non-whitespace character: [^\s]
\w           A word character: [a-zA-Z_0-9]
\W           A non-word character: [^\w]
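The sketch below shows a few of these shorthands used with the String methods replaceAll(), split(), and matches(). Inside a Java string literal the backslash itself must be escaped, so \d is written as "\\d". The class name PredefinedClassSketch, its helper methods, and the sample inputs are hypothetical and only serve this illustration.

```java
import java.util.Arrays;
import java.util.List;

class PredefinedClassSketch {

    // Keeps only the digits by deleting every non-digit (\D).
    static String digitsOnly(String input) {
        return input.replaceAll("\\D", "");
    }

    // Splits the input on runs of whitespace (\s+).
    static List<String> words(String input) {
        return Arrays.asList(input.trim().split("\\s+"));
    }

    // Checks that the input consists of word characters only (\w+).
    static boolean isWord(String input) {
        return input.matches("\\w+");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("(123) 456-7890"));  // 1234567890
        System.out.println(words("one  two\tthree"));      // [one, two, three]
        System.out.println(isWord("user_01"));             // true
        System.out.println(isWord("user-01"));             // false
    }
}
```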

5. Java Simple Regular Expression

In this example, we have ten words in a list and check which of them match the .even regular expression.
package net.javaguides.corejava.regex;

import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavaRegexExample {

    public static void main(String[] args) {

        List<String> words = Arrays.asList("One", "Two",
            "Three", "Four", "Five", "Six", "Seven", "Maven", "Amen", "eleven");

        Pattern p = Pattern.compile(".even");

        for (String word: words) {

            Matcher m = p.matcher(word);

            if (m.matches()) {
                System.out.printf("%s -> matches%n", word);
            } else {
                System.out.printf("%s -> does not match%n", word);
            }
        }
    }
}
Output:
One -> does not match
Two -> does not match
Three -> does not match
Four -> does not match
Five -> does not match
Six -> does not match
Seven -> matches
Maven -> does not match
Amen -> does not match
eleven -> does not match
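We compile the pattern once and reuse it for every word. The dot (.) metacharacter stands for any single character in the text. Because matches() requires the entire input to match the pattern, only "Seven" matches; "eleven" contains "leven" but has an extra leading character.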
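The difference between matches() and find() is easy to overlook, so here is a minimal sketch that runs both against the same .even pattern. The class name MatchesVersusFindSketch and its two helper methods are made up for this illustration.

```java
import java.util.regex.Pattern;

class MatchesVersusFindSketch {

    private static final Pattern EVEN = Pattern.compile(".even");

    // True only when the entire word matches the pattern.
    static boolean wholeWordMatches(String word) {
        return EVEN.matcher(word).matches();
    }

    // True when the pattern matches anywhere inside the word.
    static boolean containsMatch(String word) {
        return EVEN.matcher(word).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeWordMatches("Seven"));   // true
        System.out.println(containsMatch("Seven"));      // true
        System.out.println(wholeWordMatches("eleven"));  // false
        System.out.println(containsMatch("eleven"));     // true, "leven" is found
        System.out.println(containsMatch("Maven"));      // false
    }
}
```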

6. Java Alphanumeric Regex Example

This Java example demonstrates how to write a regular expression that validates user input so that only alphanumeric characters are allowed. Alphanumeric characters are letters and digits, i.e. the letters A–Z and a–z and the digits 0–9.
package net.javaguides.corejava.regex;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavaAlphanumericRegex {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>();

        names.add("JavaGuides");
        names.add("JavaGuides123");
        names.add("JavaGuides-----////"); //Incorrect

        String regex = "^[a-zA-Z0-9]+$";

        Pattern pattern = Pattern.compile(regex);

        for (String name: names) {
            Matcher matcher = pattern.matcher(name);
            System.out.println(matcher.matches());
        }
    }
}
Output:
true
true
false

7. Java Regex Anchors

Anchors match positions in the text rather than characters. In the next example, we check whether a string is located at the beginning of a sentence.
In the example below, we have three sentences. The search pattern is ^Prabhas. The pattern checks whether the "Prabhas" string is located at the beginning of the text. Prabhas$ would look for "Prabhas" at the end of the sentence.
package net.javaguides.corejava.regex;

import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavaRegexAnchorExample {

    public static void main(String[] args) {

        List<String> sentences = Arrays.asList("I am looking for Prabhas",
            "Prabhas is an Actor",
            "Mahesh and Prabhas are close friends");

        Pattern p = Pattern.compile("^Prabhas");

        for (String word: sentences) {

            Matcher m = p.matcher(word);

            if (m.find()) {
                System.out.printf("%s -> matches%n", word);
            } else {
                System.out.printf("%s -> does not match%n", word);
            }
        }
    }
}
Output:
I am looking for Prabhas -> does not match
Prabhas is an Actor -> matches
Mahesh and Prabhas are close friends -> does not match
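To look for the name at the end of a sentence instead, the $ anchor can be used. The following minimal sketch reuses the three sentences from the example; the class name EndAnchorSketch and the helper endsWithName are hypothetical names chosen for this illustration.

```java
import java.util.regex.Pattern;

class EndAnchorSketch {

    // $ anchors the match at the end of the input.
    private static final Pattern ENDS_WITH_NAME = Pattern.compile("Prabhas$");

    static boolean endsWithName(String sentence) {
        return ENDS_WITH_NAME.matcher(sentence).find();
    }

    public static void main(String[] args) {
        System.out.println(endsWithName("I am looking for Prabhas"));             // true
        System.out.println(endsWithName("Prabhas is an Actor"));                  // false
        System.out.println(endsWithName("Mahesh and Prabhas are close friends")); // false
    }
}
```

We use find() here as well, because matches() would require the whole sentence to consist of the name.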

8. Java Regex alternations

The alternation operator | lets us create a regular expression with several choices. In the example below, a user name matches if it is equal to any one of the listed alternatives.
package net.javaguides.corejava.regex;

import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavaRegexAlternation {

    public static void main(String[] args) {

        List<String> users = Arrays.asList("Ramesh", "Tom", "Tony",
            "Rocky", "John", "Prabhas");

        Pattern p = Pattern.compile("Ramesh|Tom|Prabhas|Rocky");

        for (String user: users) {

            Matcher m = p.matcher(user);

            if (m.matches()) {
                System.out.printf("%s -> matches%n", user);
            } else {
                System.out.printf("%s -> does not match%n", user);
            }
        }
    }
}
Output:
Ramesh -> matches
Tom -> matches
Tony -> does not match
Rocky -> matches
John -> does not match
Prabhas -> matches

9. Regular Expression Phone Number validation

Validating a phone number with a regular expression is tricky because phone numbers can be written in many formats and can also have extensions.
For example, here are some of the common ways of writing phone numbers:
1234567890 
123-456-7890 
123-456-7890 x1234 
123-456-7890 ext1234 
(123)-456-7890 
123.456.7890 
123 456 7890
Here is a Java program that demonstrates how to use regular expressions to validate phone numbers:
package net.javaguides.corejava.regex;

public class CheckPhoneExample {

    public static void main(String[] args) {
        System.out.println("Phone number 1234567890 validation result: " + validatePhoneNumber("1234567890"));
        System.out.println("Phone number 123-456-7890 validation result: " + validatePhoneNumber("123-456-7890"));
        System.out.println(
            "Phone number 123-456-7890 x1234 validation result: " + validatePhoneNumber("123-456-7890 x1234"));
        System.out.println(
            "Phone number 123-456-7890 ext1234 validation result: " + validatePhoneNumber("123-456-7890 ext1234"));
        System.out.println("Phone number (123)-456-7890 validation result: " + validatePhoneNumber("(123)-456-7890"));
        System.out.println("Phone number 123.456.7890 validation result: " + validatePhoneNumber("123.456.7890"));
        System.out.println("Phone number 123 456 7890 validation result: " + validatePhoneNumber("123 456 7890"));
    }

    private static boolean validatePhoneNumber(String phoneNo) {
        // validate phone numbers of format "1234567890"
        if (phoneNo.matches("\\d{10}"))
            return true;
        // validating phone number with -, . or spaces
        else if (phoneNo.matches("\\d{3}[-\\.\\s]\\d{3}[-\\.\\s]\\d{4}"))
            return true;
        // validating phone number with extension length from 3 to 5
        else if (phoneNo.matches("\\d{3}-\\d{3}-\\d{4}\\s(x|(ext))\\d{3,5}"))
            return true;
        // validating phone number where area code is in parentheses ()
        else if (phoneNo.matches("\\(\\d{3}\\)-\\d{3}-\\d{4}"))
            return true;
        // return false if nothing matches the input
        else
            return false;

    }
}
Output:
Phone number 1234567890 validation result: true
Phone number 123-456-7890 validation result: true
Phone number 123-456-7890 x1234 validation result: true
Phone number 123-456-7890 ext1234 validation result: true
Phone number (123)-456-7890 validation result: true
Phone number 123.456.7890 validation result: true
Phone number 123 456 7890 validation result: true
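String.matches() compiles its regular expression on every call. When many numbers have to be checked, one possible variation is to compile the patterns once and loop over them. The sketch below assumes the same four formats as the example above; the class name PrecompiledPhoneSketch, the FORMATS array, and the isValid method are made up for this illustration.

```java
import java.util.regex.Pattern;

class PrecompiledPhoneSketch {

    // The same four formats as above, compiled once and reused.
    private static final Pattern[] FORMATS = {
        Pattern.compile("\\d{10}"),
        Pattern.compile("\\d{3}[-.\\s]\\d{3}[-.\\s]\\d{4}"),
        Pattern.compile("\\d{3}-\\d{3}-\\d{4}\\s(x|ext)\\d{3,5}"),
        Pattern.compile("\\(\\d{3}\\)-\\d{3}-\\d{4}")
    };

    // Returns true when the number matches at least one known format.
    static boolean isValid(String phoneNo) {
        for (Pattern format : FORMATS) {
            if (format.matcher(phoneNo).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isValid("123-456-7890 ext1234"));  // true
        System.out.println(isValid("(123)-456-7890"));        // true
        System.out.println(isValid("123-456-789"));           // false, last block too short
    }
}
```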

10. Java Regex for Matching any Currency Symbol Example

This Java Regex example demonstrates how to match all available currency symbols, e.g. $ Dollar, € Euro, ¥ Yen, in some text content.
package net.javaguides.corejava.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavaRegexCurrencySymbol {
    public static void main(String[] args) {

        String content = "Let's find the symbols or currencies: $ Dollar, € Euro, ¥ Yen";

        String regex = "\\p{Sc}";

        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            System.out.print("Start index: " + matcher.start());
            System.out.print(" End index: " + matcher.end() + " ");
            System.out.println(" : " + matcher.group());
        }
    }
}
Output:
Start index: 38 End index: 39  : $
Start index: 48 End index: 49  : €
Start index: 56 End index: 57  : ¥

11. Java Regex capturing groups

Capturing groups let us extract the parts of a string that match a parenthesized part of the pattern. The matcher's group() method returns the input subsequence captured by the given group during the previous match operation.
This example prints all HTML tags from the supplied string by capturing each tag in a group.
package net.javaguides.corejava.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavaRegexGroups {

    public static void main(String[] args) {

        String content = "<p>The <code>Pattern</code> is a compiled " +
            "representation of a regular expression.</p>";

        Pattern p = Pattern.compile("(</?[a-z]*>)");

        Matcher matcher = p.matcher(content);

        while (matcher.find()) {

            System.out.println(matcher.group(1));
        }
    }
}
Output:
<p>
<code>
</code>
</p>
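A pattern can contain more than one group. Groups are numbered from left to right by their opening parenthesis, and group 0 always stands for the whole match. The following minimal sketch extracts key=value pairs with two groups; the class name NumberedGroupsSketch, the parsePairs method, and the sample input are hypothetical and only serve this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberedGroupsSketch {

    // Group 1 captures the key, group 2 captures the value.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=(\\w+)");

    // Returns the pairs in the order in which they appear in the input.
    static Map<String, String> parsePairs(String input) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(input);
        while (matcher.find()) {
            pairs.put(matcher.group(1), matcher.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parsePairs("lang=java; level=basic"));
        // {lang=java, level=basic}
    }
}
```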

12. Java case-insensitive regular expression

Setting the Pattern.CASE_INSENSITIVE flag enables case-insensitive matching.
The example below matches "dog" regardless of letter case; "Doggy" still fails because matches() requires the whole input to match.
package net.javaguides.corejava.regex;

import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavaRegexCaseInsensitive {

    public static void main(String[] args) {

        List<String> users = Arrays.asList("dog", "Dog", "DOG", "Doggy");

        Pattern p = Pattern.compile("dog", Pattern.CASE_INSENSITIVE);

        users.forEach((user) -> {

            Matcher m = p.matcher(user);

            if (m.matches()) {
                System.out.printf("%s matches%n", user);
            } else {
                System.out.printf("%s does not match%n", user);
            }
        });
    }
}
Output:
dog matches
Dog matches
DOG matches
Doggy does not match
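The same behavior can also be requested inside the expression itself with the embedded flag (?i), which is handy when only a regex string can be passed, for example to String.matches(). The following sketch compares both variants; the class name InlineFlagSketch and its helper methods are made up for this illustration.

```java
import java.util.regex.Pattern;

class InlineFlagSketch {

    // Flag passed as a compile() argument.
    private static final Pattern WITH_FLAG =
        Pattern.compile("dog", Pattern.CASE_INSENSITIVE);

    // Equivalent embedded flag at the start of the expression.
    private static final Pattern WITH_INLINE = Pattern.compile("(?i)dog");

    static boolean matchesWithFlag(String input) {
        return WITH_FLAG.matcher(input).matches();
    }

    static boolean matchesInline(String input) {
        return WITH_INLINE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesWithFlag("DOG"));    // true
        System.out.println(matchesInline("DOG"));      // true
        System.out.println(matchesInline("Doggy"));    // false
        System.out.println("Dog".matches("(?i)dog"));  // true
    }
}
```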

13. Java Regex email example

In the following example, we create a regex pattern for checking email addresses. This example provides only one possible solution.
package net.javaguides.corejava.regex;

import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavaRegexEmail {

    public static void main(String[] args) {

        List<String> emails = Arrays.asList("[email protected]",
            "tom@yahoocom", "34234sdfa#2345", "[email protected]");

        String regex = "^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\\.[a-zA-Z.]{2,18}$";

        Pattern p = Pattern.compile(regex);

        for (String email: emails) {

            Matcher m = p.matcher(email);

            if (m.matches()) {
                System.out.printf("%s matches%n", email);
            } else {
                System.out.printf("%s does not match%n", email);
            }
        }
    }
}
Output:
[email protected] matches
tom@yahoocom does not match
34234sdfa#2345 does not match
[email protected] matches

14. Java Regex to check Min/Max Length of Input Text

The following regular expression ensures that text is between 1 and 10 characters long, and additionally limits the text to the uppercase letters A–Z. You can modify the regular expression to allow any minimum or maximum text length or allow characters other than A–Z.
package net.javaguides.corejava.regex;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMinMaxLength {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>();

        names.add("RAMESH");
        names.add("JAVAGUIDES");
        names.add("RAMESHJAVAGUIDES"); //Incorrect
        names.add("RAMESH890"); //Incorrect

        String regex = "^[A-Z]{1,10}$";

        Pattern pattern = Pattern.compile(regex);

        for (String name: names) {
            Matcher matcher = pattern.matcher(name);
            System.out.println(matcher.matches());
        }
    }
}
Output:
true
true
false
false
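To allow a different minimum or maximum length, the bounds in the {min,max} quantifier can be filled in when the pattern is built. The following minimal sketch does this for uppercase letters; the class name LengthRangeSketch and the upperCaseBetween method are hypothetical names chosen for this illustration.

```java
import java.util.regex.Pattern;

class LengthRangeSketch {

    // Builds ^[A-Z]{min,max}$ for the given bounds.
    static Pattern upperCaseBetween(int min, int max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException("expected 0 <= min <= max");
        }
        return Pattern.compile("^[A-Z]{" + min + "," + max + "}$");
    }

    public static void main(String[] args) {
        Pattern threeToSix = upperCaseBetween(3, 6);
        System.out.println(threeToSix.matcher("RAMESH").matches());      // true, 6 letters
        System.out.println(threeToSix.matcher("AB").matches());          // false, too short
        System.out.println(threeToSix.matcher("JAVAGUIDES").matches());  // false, too long
    }
}
```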
