[Java] Regular expressions to match emails and domains, including new gTLDs

Recently we got many new generic top-level domains (gTLDs) such as .international and .berlin, so older regular expressions, which often assume a TLD of only a few letters, no longer match every valid domain.
Below is a list of valid domains. A domain may contain . (dot) and - (hyphen), and it may have many sub-domains. A domain must end with a letter; it cannot end with a dot or a digit.

  • site.international
  • site.international.com
  • site.ac.uk
  • site.org
  • site.babia-gora.pl
  • site.pl

An email address may contain delimiters such as - (hyphen), + (plus) and . (dot); it may contain digits and always contains letters. Below is a list of valid email addresses:

  • info9@agilob.net
  • in-fo@site.international
  • in.fo@site.ac.uk
  • in+fo@site.babia-gora.pl
  • 1in-+fo2@site.babia-gora.pl

A regular expression to match gTLD domains can look like this:

^[A-Za-z0-9][A-Za-z0-9.-]*\.*[A-Za-z]$
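
To see what the domain pattern accepts and rejects, here is a minimal sketch that precompiles it with java.util.regex.Pattern, in the form used by the test code further down. The class name DomainCheck and the helper isDomain are made up for this example, and the rejected inputs are extra cases that are not part of the list above.

```java
import java.util.regex.Pattern;

public class DomainCheck {

    // "\\." in Java source is the regex "\." (a literal dot).
    private static final Pattern DOMAIN =
            Pattern.compile("^[A-Za-z0-9][A-Za-z0-9.-]*\\.*[A-Za-z]$");

    // Hypothetical helper: true when the whole string matches the domain pattern.
    public static boolean isDomain(String candidate) {
        return DOMAIN.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {
            "site.international", "site.babia-gora.pl", // expected to match
            "site.", "site.123", "-site.pl"             // expected to be rejected
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + isDomain(candidate));
        }
    }
}
```

Keep in mind that the pattern is quite loose: it does not require a dot at all, so a bare label such as site is accepted too.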

A regular expression to match email addresses in the domains above can look like this:

^[A-Za-z0-9+.-]*@[A-Za-z0-9][A-Za-z0-9.-]*\.*[A-Za-z]$
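
The email check can be sketched the same way. The regex here is equivalent to the one in the test code further down, with the hyphen moved to the end of the character class so it is read as a literal character. The names EmailCheck and isEmail are hypothetical, and the rejected inputs are illustrative additions.

```java
import java.util.regex.Pattern;

public class EmailCheck {

    // Local part: letters, digits, '+', '.' and '-', then '@' and the domain pattern.
    private static final Pattern EMAIL =
            Pattern.compile("^[A-Za-z0-9+.-]*@[A-Za-z0-9][A-Za-z0-9.-]*\\.*[A-Za-z]$");

    // Hypothetical helper: true when the whole string matches the email pattern.
    public static boolean isEmail(String candidate) {
        return EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {
            "info9@agilob.net", "1in-+fo2@site.babia-gora.pl",   // expected to match
            "info9.agilob.net", "in fo@site.pl", "info@site.pl1" // expected to be rejected
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + isEmail(candidate));
        }
    }
}
```

Because the local part uses *, an address with an empty local part such as @site.pl also matches; replacing * with + would require at least one character before the @.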

Code to test both regular expressions:

package regexp;

import java.util.ArrayList;
import java.util.List;

public class RegExpMatcher {

    public static void main(String[] args) {
        List<String> domains = new ArrayList<>();
        domains.add("site.international");
        domains.add("site.international.com");
        domains.add("site.ac.uk");
        domains.add("site.org");
        domains.add("site.babia-gora.pl");
        domains.add("site.pl");

        System.out.println("Matching domains: ");

        // String.matches() tests the whole string; "\\." is a literal dot in Java source.
        for (String domain : domains) {
            boolean matches = domain.matches("^[A-Za-z0-9][A-Za-z0-9.-]*\\.*[A-Za-z]$");
            if (matches) {
                System.out.println(domain + " matches");
            } else {
                System.err.println(domain + " doesn't match");
            }
        }

        System.out.println("\n\nMatching email addresses: ");

        List<String> emails = new ArrayList<>();
        emails.add("info9@agilob.net");
        emails.add("in-fo@site.international");
        emails.add("in.fo@site.ac.uk");
        emails.add("in+fo@site.babia-gora.pl");
        emails.add("1in-+fo2@site.babia-gora.pl");

        // The hyphen is placed last in the character class so it is read as a literal.
        for (String email : emails) {
            boolean matches = email.matches("^[A-Za-z0-9+.-]*@[A-Za-z0-9][A-Za-z0-9.-]*\\.*[A-Za-z]$");
            if (matches) {
                System.out.println(email + " matches");
            } else {
                System.err.println(email + " doesn't match");
            }
        }
    }
}

Output:

Matching domains: 
site.international matches
site.international.com matches
site.ac.uk matches
site.org matches
site.babia-gora.pl matches
site.pl matches


Matching email addresses: 
info9@agilob.net matches
in-fo@site.international matches
in.fo@site.ac.uk matches
in+fo@site.babia-gora.pl matches
1in-+fo2@site.babia-gora.pl matches
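
The domain pattern is permissive: it also accepts a name without any dot (site), consecutive dots (site..pl) and a hyphen right before a dot (site-.pl). If that matters, one possible tightening is to describe the domain label by label. This is only a sketch, not a complete validator. The names StrictDomainCheck and isStrictDomain are hypothetical, and the rule that the last label consists of at least two letters is my assumption; it would, for example, reject internationalized TLDs written in punycode form (xn--...).

```java
import java.util.regex.Pattern;

public class StrictDomainCheck {

    // One or more labels followed by a dot, then a letters-only last label.
    // Each label starts and ends with a letter or digit; hyphens are allowed only inside.
    private static final Pattern STRICT_DOMAIN = Pattern.compile(
            "^(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\\.)+[A-Za-z]{2,}$");

    // Hypothetical helper: true when the whole string matches the stricter pattern.
    public static boolean isStrictDomain(String candidate) {
        return STRICT_DOMAIN.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {
            "site.international", "site.ac.uk", "site.babia-gora.pl", // expected to match
            "site", "site..pl", "site-.pl"                           // expected to be rejected
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + isStrictDomain(candidate));
        }
    }
}
```

All six domains from the list at the top still match this stricter pattern.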