{"id":23902832,"url":"https://github.com/mtumilowicz/java11-regex","last_synced_at":"2025-08-08T02:47:12.560Z","repository":{"id":110877464,"uuid":"160921352","full_name":"mtumilowicz/java11-regex","owner":"mtumilowicz","description":"Overview of java regex API.","archived":false,"fork":false,"pushed_at":"2018-12-08T23:24:37.000Z","size":85,"stargazers_count":1,"open_issues_count":0,"forks_count":3,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-06-24T08:47:25.688Z","etag":null,"topics":["java","regex","regex-match","regex-pattern"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mtumilowicz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-12-08T08:53:48.000Z","updated_at":"2025-02-10T00:49:34.000Z","dependencies_parsed_at":"2023-03-13T13:47:29.832Z","dependency_job_id":null,"html_url":"https://github.com/mtumilowicz/java11-regex","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mtumilowicz/java11-regex","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtumilowicz%2Fjava11-regex","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtumilowicz%2Fjava11-regex/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtumilowicz%2Fjava11-regex/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtumilowicz%2Fjava11-regex/manifests","owner_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/owners/mtumilowicz","download_url":"https://codeload.github.com/mtumilowicz/java11-regex/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtumilowicz%2Fjava11-regex/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":269356058,"owners_count":24403504,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-08T02:00:09.200Z","response_time":72,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["java","regex","regex-match","regex-pattern"],"created_at":"2025-01-04T22:50:47.529Z","updated_at":"2025-08-08T02:47:12.543Z","avatar_url":"https://github.com/mtumilowicz.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build Status](https://travis-ci.com/mtumilowicz/java11-regex.svg?branch=master)](https://travis-ci.com/mtumilowicz/java11-regex)\n\n# java11-regex\nOverview of java regex API.\n\n_Reference_: https://docs.oracle.com/javase/10/docs/api/java/util/regex/Pattern.html  \n_Reference_: https://stackoverflow.com/questions/5319840/greedy-vs-reluctant-vs-possessive-quantifiers  \n_Reference_: https://javascript.info/regexp-groups#example  \n_Reference_: https://stackoverflow.com/questions/6664151/difference-between-b-and-b-in-regex  \n_Reference_: https://stackoverflow.com/questions/4250062/what-is-the-difference-between-and-a-and-z-in-regex\n\n# preface\nA regular 
expression is a way to describe a pattern in a sequence \nof characters.\n\n## characters\n|Construct   |Matches   |\n|---|---|\n|`x`    |The character x   |\n|`\\\\`   |The backslash character   |\n|`\\t`   |The tab character (`'\\u0009'`)   |\n|`\\n`   |The newline (line feed) character (`'\\u000A'`)   |\n\n## character classes\n|Construct   |Matches   |\n|---|---|\n|`[abc]`   |a, b, or c (simple class)   |\n|`[^abc]`   |Any character except a, b, or c (negation)   |\n|`[a-zA-Z]`   |a through z or A through Z, inclusive (range)   |\n|`[a-d[m-p]]`   |a through d, or m through p: `[a-dm-p]` (union)   |\n|`[a-z\u0026\u0026[def]]`   |d, e, or f (intersection)   |\n|`[a-z\u0026\u0026[^bc]]`   |a through z, except for b and c: `[ad-z]` (subtraction)   |\n|`[a-z\u0026\u0026[^m-p]]`   |a through z, and not m through p: `[a-lq-z]` (subtraction)   |\n\n## predefined character classes\n|Construct   |Matches   |\n|---|---|\n|`.`  |   Any character (may or may not match line terminators)|\n|`\\d` |   A digit: `[0-9]`|\n|`\\D` |   A non-digit: `[^0-9]`|\n|`\\s` |   A whitespace character: `[ \\t\\n\\x0B\\f\\r]`|\n|`\\S` |   A non-whitespace character: `[^\\s]`|\n|`\\w` |   A word character: `[a-zA-Z_0-9]`|\n|`\\W` |   A non-word character: `[^\\w]`|\n\n## boundary matchers\n|Construct   |Matches   |\n|---|---|\n|`^`    |The beginning of a line|\n|`$`    |The end of a line|\n|`\\b`   |A word boundary|\n|`\\B`   |A non-word boundary|\n|`\\A`   |The beginning of the input|\n|`\\z`   |The end of the input|\n\n## linebreak matcher\n`\\R` - Any Unicode linebreak sequence\n\n## greedy, reluctant, possessive\n* A **greedy** quantifier first matches as much as possible and \nthen \"backtracks\" one element at a time towards the beginning.\n\n* A **reluctant** or \"non-greedy\" quantifier first matches \nas little as possible, then extends one element at a time towards \nthe end.\n\n* A **possessive** quantifier is just like the greedy \nquantifier, but it doesn't backtrack.\n\n### greedy 
quantifiers\n\n|Construct   |Matches   |\n|---|---|\n|`X?`       |X, once or not at all|\n|`X*`       |X, zero or more times|\n|`X+`       |X, one or more times|\n|`X{n}`     |X, exactly n times|\n|`X{n,}`    |X, at least n times|\n|`X{n,m}`   |X, at least n but not more than m times|\n\n### reluctant quantifiers\n\n|Construct   |Matches   |\n|---|---|\n|`X??`        |X, once or not at all|\n|`X*?`        |X, zero or more times|\n|`X+?`        |X, one or more times|\n|`X{n}?`      |X, exactly n times|\n|`X{n,}?`     |X, at least n times|\n|`X{n,m}?`    |X, at least n but not more than m times|\n\n### possessive quantifiers\n\n|Construct   |Matches   |\n|---|---|\n|`X?+`        |X, once or not at all|\n|`X*+`        |X, zero or more times|\n|`X++`        |X, one or more times|\n|`X{n}+`      |X, exactly n times|\n|`X{n,}+`     |X, at least n times|\n|`X{n,m}+`    |X, at least n but not more than m times|\n\n## logical operators\n\n|Construct   |Matches   |\n|---|---|\n|`XY`       |X followed by Y|\n|X\u0026#124;Y   |Either X or Y|\n|`(X)`        |X, as a capturing group|\n\n* Capturing group example:\n    ```\n    assertTrue(\"gogogogo regex\".matches(\"(go)+\\\\sregex\"));\n    ```\n    the input is \"go\" one or more times, then a whitespace character, then \n    \"regex\"\n    \n## escaping\nThe backslash character (`'\\'`) serves to introduce \nescaped constructs, as defined in the tables above, \nas well as to quote characters that otherwise would \nbe interpreted as unescaped constructs.\n\n## typical invocation\nA typical invocation sequence is:\n```\nPattern p = Pattern.compile(\"a*b\");\nMatcher m = p.matcher(\"aaaaab\");\nboolean b = m.matches();\n```\nor\n```\nboolean b = Pattern.matches(\"a*b\", \"aaaaab\");\n```\nBut note that this method compiles an expression and matches an \ninput sequence against it in a single invocation, so it is\nequivalent to the three statements above, though for \nrepeated matches it is less efficient since it does not \nallow the compiled 
pattern to be reused.\n\n# project description\n## string regex methods\n* `public boolean matches(String regex)`\n    ```\n    // hour\n    assertTrue(\"1:11\".matches(\"[0-2]?\\\\d:[0-5]\\\\d\"));\n    assertTrue(\"9:11\".matches(\"[0-2]?\\\\d:[0-5]\\\\d\"));\n    assertTrue(\"0:11\".matches(\"[0-2]?\\\\d:[0-5]\\\\d\"));\n    \n    // minute\n    assertTrue(\"1:00\".matches(\"[0-2]?\\\\d:[0-5]\\\\d\"));\n    assertTrue(\"9:01\".matches(\"[0-2]?\\\\d:[0-5]\\\\d\"));\n    assertTrue(\"0:59\".matches(\"[0-2]?\\\\d:[0-5]\\\\d\"));\n    assertFalse(\"0:60\".matches(\"[0-2]?\\\\d:[0-5]\\\\d\"));\n    \n    // hh\n    assertTrue(\"00:00\".matches(\"[0-2]?\\\\d:[0-5]\\\\d\"));\n    assertTrue(\"01:00\".matches(\"[0-2]?\\\\d:[0-5]\\\\d\"));\n    assertTrue(\"21:00\".matches(\"[0-2]?\\\\d:[0-5]\\\\d\"));\n    assertFalse(\"30:00\".matches(\"[0-2]?\\\\d:[0-5]\\\\d\"));\n    ```\n    * **remark**: `matches` is equivalent to the three \n    statements below, though for repeated matches it is \n    less efficient since it does not allow the compiled \n    pattern to be reused.\n        ```\n        Pattern p = Pattern.compile(\"[0-2]?\\\\d:[0-5]\\\\d\");\n        Matcher m = p.matcher(\"11:11\");\n        boolean b = m.matches();\n        ```\n* `public String[] split(String regex)`\n    ```\n    var namesContainer = \"Michal--|--Marcin--|--Wojtek--|--Ania\";\n    String[] names = namesContainer.split(\"--\\\\|--\");\n   \n    assertThat(names, is(new String[]{\"Michal\", \"Marcin\", \"Wojtek\", \"Ania\"}));\n    ```\n* `public String[] split(String regex, int limit)` - \na positive limit is the maximum size of the returned array\n    ```\n    String[] names = \"Michal--|--Marcin--|--Wojtek--|--Ania\".split(\"--\\\\|--\", 3);\n    \n    assertThat(names, is(new String[]{\"Michal\", \"Marcin\", \"Wojtek--|--Ania\"}));\n    ```\n* `public String replaceAll(String regex, String replacement)`\n    ```\n    String transformed = \"Michal--|--Marcin--|--Wojtek--|--Ania\".replaceAll(\"--\\\\|--\", \"|\");\n    \n    assertThat(transformed, 
is(\"Michal|Marcin|Wojtek|Ania\"));\n    ```\n* `public String replaceFirst(String regex, String replacement)`\n    ```\n    String transformed = \"Michal--|--Marcin--|--Wojtek--|--Ania\".replaceFirst(\"--\\\\|--\", \"|\");\n    \n    assertThat(transformed, is(\"Michal|Marcin--|--Wojtek--|--Ania\"));\n    ```\n\n## pattern methods\n* `public static Pattern compile(String regex)`\n    * a very handy method is `public Predicate\u003cString\u003e asMatchPredicate()`\n    (since Java 11), which creates a predicate that \n    tests whether this pattern matches a given input string\n    ```\n    var emailPattern = Pattern.compile(\"[_.\\\\w]+@([\\\\w]+\\\\.)+[\\\\w]{2,20}\");\n    \n    List\u003cString\u003e emails;\n    \n    try (var reader = Files.newBufferedReader(Path.of(\"emails.txt\"))) {\n        emails = reader.lines()\n                .filter(emailPattern.asMatchPredicate())\n                .collect(Collectors.toList());\n    }\n    \n    assertThat(emails, hasSize(5));\n    assertThat(emails, contains(\n            \"michaltumilowicz@tlen.pl\",\n            \"michal_tumilowicz@tlen.pl\",\n            \"MichalTumilowicz@gmail.com\",\n            \"a.b_cD@a.b.c.d.pl\",\n            \"m12@wp.com.pl\"));\n    ```\n    * where `[_.\\\\w]+@([\\\\w]+\\\\.)+[\\\\w]{2,20}` is:\n        * `[_.\\\\w]+` - one or more of `_`, `.`, or a word character (letter, digit, underscore)\n        * then `@`\n        * `([\\\\w]+\\\\.)+` - (one or more word characters followed by a single dot), one or more times\n        * `[\\\\w]{2,20}` - two to twenty word characters\n* `public static Pattern compile(String regex, int flags)`\n    * useful flags: \n        * `Pattern.MULTILINE` - \n        In multiline mode the expressions `^` and `$` match\n        just after or just before, respectively, a line terminator or the end of\n        the input sequence.  
By default these expressions only match at the\n        beginning and the end of the entire input sequence.\n        * `Pattern.CASE_INSENSITIVE`\n* `public static boolean matches(String regex, CharSequence input)`\n    ```\n    public static boolean matches(String regex, CharSequence input) {\n        Pattern p = Pattern.compile(regex);\n        Matcher m = p.matcher(input);\n        return m.matches();\n    }\n    ```\n\n## boundary matchers\n* `\\b` - A word boundary\n    * end\n        ```\n        var txt = \"catmania thiscat thiscatmania\";\n        \n        String replaced = txt.replaceAll(\"cat\\\\b\", \"-\");\n        \n        assertThat(replaced, is(\"catmania this- thiscatmania\"));\n        ```\n    * beginning\n        ```\n        var txt = \"catmania thiscat thiscatmania\";\n        \n        String replaced = txt.replaceAll(\"\\\\bcat\", \"-\");\n        \n        assertThat(replaced, is(\"-mania thiscat thiscatmania\"));\n        ```\n* `\\B` - A non-word boundary\n    * not end\n        ```\n        var txt = \"catmania thiscat thiscatmania\";\n\n        String replaced = txt.replaceAll(\"cat\\\\B\", \"-\");\n\n        assertThat(replaced, is(\"-mania thiscat this-mania\"));\n        ```\n    * not beginning\n        ```        \n        var txt = \"catmania thiscat thiscatmania\";\n        \n        String replaced = txt.replaceAll(\"\\\\Bcat\", \"-\");\n        \n        assertThat(replaced, is(\"catmania this- this-mania\"));\n        ```\n    * neither beginning nor end\n        ```\n        var txt = \"catmania thiscat thiscatmania\";\n        \n        String replaced = txt.replaceAll(\"\\\\Bcat\\\\B\", \"-\");\n        \n        assertThat(replaced, is(\"catmania thiscat this-mania\"));\n        ```\n* `\\A` - The beginning of the input and `\\z` - The end of the input vs\n`^` and `$`\n    * the two pairs behave differently when `Pattern.MULTILINE` is set:\n        ```\n        var pattern1 = Pattern.compile(\"^Michal$\");\n        var pattern2 = 
Pattern.compile(\"\\\\AMichal\\\\z\");\n        var pattern_multiline1 = Pattern.compile(\"^Michal$\", Pattern.MULTILINE);\n        var pattern_multiline2 = Pattern.compile(\"\\\\AMichal\\\\z\", Pattern.MULTILINE);\n        \n        var txt = \"Michal\\nMarcin\\nAnia\";\n        \n        // matches\n        assertFalse(pattern1.matcher(txt).matches());\n        assertFalse(pattern2.matcher(txt).matches());\n        assertFalse(pattern_multiline1.matcher(txt).matches());\n        assertFalse(pattern_multiline2.matcher(txt).matches());\n        \n        // find\n        assertFalse(pattern1.matcher(txt).find());\n        assertFalse(pattern2.matcher(txt).find());\n        assertTrue(pattern_multiline1.matcher(txt).find());\n        assertFalse(pattern_multiline2.matcher(txt).find());\n        ```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmtumilowicz%2Fjava11-regex","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmtumilowicz%2Fjava11-regex","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmtumilowicz%2Fjava11-regex/lists"}