Predefined Character Classes

The Pattern API contains a number of useful predefined character classes, which offer convenient shorthand for commonly used regular expressions:

Construct   Description
.           Any character (may or may not match line terminators)
\d          A digit: [0-9]
\D          A non-digit: [^0-9]
\s          A whitespace character: [ \t\n\x0B\f\r]
\S          A non-whitespace character: [^\s]
\w          A word character: [a-zA-Z_0-9]
\W          A non-word character: [^\w]
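
The note on the . construct can be made concrete. In java.util.regex, the dot skips line terminators by default and matches them when the Pattern.DOTALL flag is set. The sketch below illustrates this; the class name DotDemo and the helper dotMatches are illustrative assumptions, not part of the tutorial's test harness.

```java
import java.util.regex.Pattern;

class DotDemo {

    // Returns true when the single-dot regex matches the entire input
    // under the given Pattern flags (0 means default behavior).
    static boolean dotMatches(String input, int flags) {
        return Pattern.compile(".", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(dotMatches("a", 0));               // true
        System.out.println(dotMatches("\n", 0));              // false: default dot skips line terminators
        System.out.println(dotMatches("\n", Pattern.DOTALL)); // true: DOTALL lets dot match them
    }
}
```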

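In the table above, each construct in the left-hand column is shorthand for the character class in the right-hand column. For example, \d means a range of digits (0-9), and \w means a word character (any lowercase letter, any uppercase letter, the underscore character, or any digit). Use the predefined classes whenever possible. They make your code easier to read and eliminate errors introduced by malformed character classes.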
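
As a minimal sketch of this shorthand, the program below counts matches for a predefined class and for its spelled-out equivalent over the same input; the two counts should agree. The class name PredefinedClassDemo, the helper countMatches, and the sample input are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PredefinedClassDemo {

    // Counts how many times the regex matches anywhere in the input.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "id_7 = 42\t!"; // hypothetical sample input

        // Each pair holds a predefined class and its explicit equivalent.
        String[][] pairs = {
            {"\\d", "[0-9]"},
            {"\\w", "[a-zA-Z_0-9]"},
            {"\\s", "[ \\t\\n\\x0B\\f\\r]"}
        };
        for (String[] pair : pairs) {
            System.out.println(pair[0] + " -> " + countMatches(pair[0], input)
                    + ", " + pair[1] + " -> " + countMatches(pair[1], input));
        }
    }
}
```

For this input, both forms find 3 digits, 6 word characters, and 3 whitespace characters.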

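Constructs beginning with a backslash are called escaped constructs. We previewed escaped constructs in the String Literals section, where we mentioned the use of the backslash and of \Q and \E for quotation. If you use an escaped construct within a string literal, you must precede the backslash with another backslash for the string to compile. For example: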

private final String REGEX = "\\d"; // a single digit

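In this example, \d is the regular expression; the extra backslash is required for the code to compile. The test harness reads expressions directly from the Console, however, so the extra backslash is unnecessary there.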
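
A short sketch of the double-backslash rule: the source text "\\d" produces a two-character string, \d, and that string is what the regex engine receives. The class name EscapeDemo and the helper isSingleDigit are illustrative assumptions.

```java
import java.util.regex.Pattern;

class EscapeDemo {

    // In source code the backslash is doubled; the resulting string is \d.
    static final String REGEX = "\\d";

    // Returns true when the whole input is exactly one digit.
    static boolean isSingleDigit(String input) {
        return Pattern.matches(REGEX, input);
    }

    public static void main(String[] args) {
        System.out.println(REGEX);              // prints \d
        System.out.println(REGEX.length());     // prints 2
        System.out.println(isSingleDigit("1")); // prints true
        System.out.println(isSingleDigit("a")); // prints false
    }
}
```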

The following examples demonstrate the use of predefined character classes.

Enter your regex: .
Enter input string to search: @
I found the text "@" starting at index 0 and ending at index 1.

Enter your regex: .
Enter input string to search: 1
I found the text "1" starting at index 0 and ending at index 1.

Enter your regex: .
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \d
Enter input string to search: 1
I found the text "1" starting at index 0 and ending at index 1.

Enter your regex: \d
Enter input string to search: a
No match found.

Enter your regex: \D
Enter input string to search: 1
No match found.

Enter your regex: \D
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \s
Enter input string to search:  
I found the text " " starting at index 0 and ending at index 1.

Enter your regex: \s
Enter input string to search: a
No match found.

Enter your regex: \S
Enter input string to search:  
No match found.

Enter your regex: \S
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \w
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \w
Enter input string to search: !
No match found.

Enter your regex: \W
Enter input string to search: a
No match found.

Enter your regex: \W
Enter input string to search: !
I found the text "!" starting at index 0 and ending at index 1.

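In the first three examples, the regular expression is simply . (the "dot" metacharacter), which indicates "any character." The match therefore succeeds in all three cases (a randomly selected @ character, a digit, and a letter). The remaining examples each use a single regular expression construct from the predefined character classes table. You can refer to that table to work out the logic behind each match:

    \d matches digits
    \s matches whitespace characters
    \w matches word characters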

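Alternatively, a capital letter means the opposite:

    \D matches non-digits
    \S matches non-whitespace characters
    \W matches non-word characters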
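
The complement relationship between each lowercase and uppercase pair can be checked mechanically. The sketch below assumes single-character inputs and default flags, and tests that exactly one construct of each pair matches. The class name NegatedClassDemo, the helper exactlyOneMatches, and the sample inputs are illustrative assumptions.

```java
import java.util.regex.Pattern;

class NegatedClassDemo {

    // True when exactly one of the two regexes matches the whole input.
    static boolean exactlyOneMatches(String lower, String upper, String input) {
        return Pattern.matches(lower, input) ^ Pattern.matches(upper, input);
    }

    public static void main(String[] args) {
        String[] samples = {"1", "a", " ", "!", "_"}; // hypothetical inputs
        for (String s : samples) {
            System.out.println("\"" + s + "\""
                    + " digit=" + Pattern.matches("\\d", s)
                    + " whitespace=" + Pattern.matches("\\s", s)
                    + " word=" + Pattern.matches("\\w", s)
                    + " complementary="
                    + (exactlyOneMatches("\\d", "\\D", s)
                       && exactlyOneMatches("\\s", "\\S", s)
                       && exactlyOneMatches("\\w", "\\W", s)));
        }
    }
}
```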
