
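Extracting specific characters

Extract specific characters, for example abc from 123abc123. The regular expression needs parentheses (a capturing group) around the part to extract.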
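
A minimal sketch of this idea, assuming the letters to extract are lowercase and have digits on both sides. The class name `CaptureGroupDemo`, the helper `extractLetters`, and the pattern `\\d+([a-z]+)\\d+` are illustrative choices, not taken from the original text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroupDemo {
    // Illustrative pattern: digits, a captured run of lowercase letters, digits.
    private static final Pattern DIGITS_LETTERS_DIGITS = Pattern.compile("\\d+([a-z]+)\\d+");

    // Returns the captured letters, or null when the pattern is not found in the text.
    static String extractLetters(String text) {
        Matcher matcher = DIGITS_LETTERS_DIGITS.matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractLetters("123abc123")); // abc
        System.out.println(extractLetters("123456"));    // null
    }
}
```

Only the text matched inside the parentheses is returned by `group(1)`; the digits are matched but not captured.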

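Extracting the characters between specific symbols

For example, extract Rust from [[Rust]]. The surrounding square brackets serve as markers.

This needs a regular expression such as \\[\\[(.*?)\\]\\] (written here as a Java string literal). The square brackets must be escaped, and don't forget the innermost parentheses.

After calling matcher.find(), matcher.group() returns the matched string including the symbols on both sides, and matcher.group(1) returns the string inside them.

Square bracket example 1

Test code. Pay attention to the number of escape characters.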

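    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class BracketExtractDemo {
        public static void main(String[] args) {
            // regs[0] leaves the inner brackets unescaped, so [(.*?)] is a character class.
            // regs[1] escapes all four brackets and captures the text between them.
            final String[] regs = {"\\[[(.*?)]\\]", "\\[\\[(.*?)\\]\\]"};
            final String[] strs = {"[[Rust]]", "[[[Fisher]]]", "[[abc]", "[abc]]", "[[abc", "[[[[abc]]"};
            for (String s : strs) {
                for (String r : regs) {
                    extractTextFrom(r, s);
                }
                System.out.println();
            }
        }

        private static void extractTextFrom(String reg, String txt) {
            // matches() succeeds only if the whole string matches the pattern.
            if (!txt.matches(reg)) {
                System.out.println(txt + " not matches " + reg + "\n");
                return;
            }
            Pattern pattern = Pattern.compile(reg);
            Matcher matcher = pattern.matcher(txt);
            while (matcher.find()) {
                // group() is the full match; group(1) is the first capturing group.
                System.out.println(txt + " for " + reg + " group()  == " + matcher.group());
                System.out.println(txt + " for " + reg + " group(1) ==   " + matcher.group(1));
            }
        }
    }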

Output (results for the first four inputs)

[[Rust]] not matches \[[(.*?)]\]

[[Rust]] for \[\[(.*?)\]\] group()  == [[Rust]]
[[Rust]] for \[\[(.*?)\]\] group(1) ==   Rust

[[[Fisher]]] not matches \[[(.*?)]\]

[[[Fisher]]] for \[\[(.*?)\]\] group()  == [[[Fisher]]
[[[Fisher]]] for \[\[(.*?)\]\] group(1) ==   [Fisher

[[abc] not matches \[[(.*?)]\]

[[abc] not matches \[\[(.*?)\]\]


[abc]] not matches \[[(.*?)]\]

[abc]] not matches \[\[(.*?)\]\]
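
Two details explain these results. In the first pattern the inner brackets are not escaped, so [(.*?)] is a character class that matches a single one of the characters ( . * ? ) and captures nothing. In the second pattern the dot also matches [, which is why [[[Fisher]]] yields [Fisher. One possible variant, sketched here under the assumption that the wanted text never contains square brackets itself, replaces .*? with a negated character class and uses find() alone, without the whole-string matches() check. The class name `StrictBracketDemo` and the helper `extractAll` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StrictBracketDemo {
    // Assumption: the text between [[ and ]] contains no '[' or ']' itself.
    private static final Pattern DOUBLE_BRACKETS =
            Pattern.compile("\\[\\[([^\\[\\]]*)\\]\\]");

    // Collects the inner text of every [[...]] occurrence found anywhere in the input.
    static List<String> extractAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = DOUBLE_BRACKETS.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(extractAll("[[Rust]] and [[Java]]")); // [Rust, Java]
        System.out.println(extractAll("[[[Fisher]]]"));          // [Fisher]
        System.out.println(extractAll("[[abc]"));                // []
    }
}
```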

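Extracting by rule

Extract a string according to a rule. The rule must be enclosed in parentheses.

Extracting letters surrounded by non-letters

Extract the run of letters from text of the form [non-letters letters non-letters], using the regular expression "[^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,}".

- [a-zA-Z] matches any of the 26 letters, upper or lower case.
- [^a-zA-Z] matches any character other than those letters.
- [^a-zA-Z]{0,} matches non-letters zero or more times; {0,} can be replaced by an asterisk *.
- ([a-zA-Z]{1,}) captures a run of at least one letter; {1,} can be replaced by a plus sign +.
- The expression contains only one pair of parentheses, so matcher.group(1) returns the extracted result.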
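
The * and + shorthand can be checked with a small sketch that runs the long and short forms of the pattern side by side. The class name `QuantifierShorthandDemo` and the helper `extract` are illustrative, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierShorthandDemo {
    // Long form, using {0,} and {1,}.
    static final Pattern LONG_FORM =
            Pattern.compile("[^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,}");
    // Short form of the same rule, using * and +.
    static final Pattern SHORT_FORM =
            Pattern.compile("[^a-zA-Z]*([a-zA-Z]+)[^a-zA-Z]*");

    // Returns group(1) when the whole text matches the pattern, otherwise null.
    static String extract(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        return matcher.matches() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        for (String text : new String[] {"?!Rust", "<<how>>", "!!!"}) {
            System.out.println(text + " -> " + extract(LONG_FORM, text)
                    + " / " + extract(SHORT_FORM, text));
        }
    }
}
```

Both forms should give the same result for every input, since {0,} and * (and likewise {1,} and +) are the same quantifier written two ways.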

Test code

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class LetterExtractDemo {
        public static void main(String[] args) {
            final String[] regs = {
                    "[^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,}"
            };
            final String[] textArr = {"!", "!!!", "?!Rust", "Fisher:'", "?!thanks!?", "<<how>>"};
            for (String r : regs) {
                for (String s : textArr) {
                    extractTextFrom(r, s);
                    System.out.println();
                }
            }
        }

        private static void extractTextFrom(String reg, String txt) {
            // matches() succeeds only if the whole string matches the pattern.
            if (!txt.matches(reg)) {
                System.out.println(txt + " not matches " + reg + "\n");
                return;
            }
            Pattern pattern = Pattern.compile(reg);
            Matcher matcher = pattern.matcher(txt);
            while (matcher.find()) {
                // group() is equivalent to group(0): the full match.
                System.out.println(txt + " for " + reg + " group()  == " + matcher.group());
                // Print every group, from 0 (full match) up to groupCount().
                for (int i = 0; i <= matcher.groupCount(); i++) {
                    System.out.println(txt + " for " + reg + " group(" + i + ")  == " + matcher.group(i));
                }
            }
        }
    }

Output

! not matches [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,}


!!! not matches [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,}


?!Rust for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group()  == ?!Rust
?!Rust for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group(0)  == ?!Rust
?!Rust for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group(1)  == Rust

Fisher:' for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group()  == Fisher:'
Fisher:' for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group(0)  == Fisher:'
Fisher:' for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group(1)  == Fisher

?!thanks!? for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group()  == ?!thanks!?
?!thanks!? for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group(0)  == ?!thanks!?
?!thanks!? for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group(1)  == thanks

<<how>> for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group()  == <<how>>
<<how>> for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group(0)  == <<how>>
<<how>> for [^a-zA-Z]{0,}([a-zA-Z]{1,})[^a-zA-Z]{0,} group(1)  == how

Another example