An open API service indexing awesome lists of open source software.

https://github.com/crispengari/re-python

💎 Regular expression in python.
https://github.com/crispengari/re-python

nlp nlp-python pyth python python3 re regexp regular-expression text-classification text-clustering text-processing

Last synced: 3 months ago
JSON representation

💎 Regular expression in python.

Awesome Lists containing this project

README

          

## Regexp Python

Regular expression in python.

> Regular expression are a powerful feature for text processing, they match patterns in a sentence.





cover

### 1. Special characters

> The following are special characters in regular expressions.

```
^ - matches the beginning of the sentence
$ - matches the end of the sentence
. - matches all characters except new line
\ - escape special characters
A|B - matches expression A or B
+ - matches at least one character
* - matches 0 or more characters
? - optional, matches 1 or 0 characters
{m} - matches the expression to it's left exactly m times
{m,n} - matches the expression to the left m to n times but not less
{m, n}? - matches the expression to it's left m times and n ignores
```

### 2. Character classes (Special Sequence)

> The following are special sequences in regular expressions

```
\w - matches alphanumeric characters which is A-Z,a-z,0-9 and _ .
\d - matches digits 0-9
\s - matches white space characters \t, \n, \r.
\S - matches none white space characters
\b - matches the boundary at the start or end of the word that is \w and \W.
\B - matches where \b does not that is not word boundaries.
\A - matches the expression to it's right at the absolute start of string wether in single or multiple line.
\Z - Matches the expression to it's left at the absolute end of a string wether in single or multiple line.

```

### 3. Sets

> The following are sets in regular expression

```
[] - contains a set of characters to match
[akb] - matches a, k or b
[a-z] - matches any character between a and z inclusively
[a\-z] - matches a, -, or z
[a-] - matches a or -
[-a] - matches a
[a-z0-9]- matches characters from a to z and also numbers from 0-9
[(+*)] - special characters becomes literal and this matches (,+, * and ).
[^ab5] - adding ^ in the character set bracket at the beginning will exclude all the characters in the character set, so thi will match all characters except a, b or 5
```

### 4. Groups

> The following are groups in regular expression.

```
() - Matches the expression inside the parenthesis and groups it.
(?) - ? inside the parenthesis acts like extension notation. Its meaning depends on the character immediately to its right.
(?PAB) - Matches the expression AB, and it can be accessed with the group name.
(?aiLmsux) - Here, a, i, L, m, s, u, and x are flags:

a — Matches ASCII only
i - Ignore case
L - Locale dependent
m - Multi-line
s - Matches all
u - Unicode characters
x - Verbose

(?:A) - matches the expression represented by A, but unlike (?PAB), it cannot be retrieved afterwards.

(?#...) - A comment. Contents are for us to read, not for matching.

A(?=B) - Lookahead assertion. This matches the expression A only if it is followed by B

A(?!B) - Negative lookahead assertion. This matches the expression A only if it is not followed by B.

(?<=B)A - Positive lookbehind assertion. This matches the expression A only if B is immediately to its left. This can only matched fixed length expressions.

(? The following are the popular regular expression function in python.

#### 5.1 `re.findall(A, B)`

> Matches all instances of an expression A in a string B and returns them in a list

#### 5.2 `re.search(A, B)`

> Matches the first instance of an expression A in a string B, and returns it as a re match object.

#### 5.3 `re.split(A, B)`

> Split a string B into a list using the delimiter A

#### 5.4 `re.sub(A, B, C)`

> Replace A with B in the string C.

#### 5.5 `re.match(A, B)`

> Returns the first occurrence of A in B

#### 5.7 `re.compile(A, B)`

> Flags should be used first in the expression string.

#### 5.8 re.finditer(pattern, string, flags=0)

> Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in string.

### Code Examples

The code examples are found in the `re.ipynb` file