https://github.com/yoav-lavi/melody

Melody is a language that compiles to ECMAScript regular expressions, while aiming to be more readable and maintainable.


## Examples

Note: these are for the currently supported syntax and may change

### Batman Theme

```rust
16 of "na";

2 of match {
  <space>;
  "batman";
}

// 🦇🦸‍♂️
```

Turns into

```regex
(?:na){16}(?: batman){2}
```

### Twitter Hashtag

```rust
"#";
some of <word>;

// #melody
```

Turns into

```regex
#\w+
```

### Introductory Courses

```rust
some of <alphabetic>;
<space>;
"1";
2 of <digit>;

// classname 1xx
```

Turns into

```regex
[a-zA-Z]+ 1\d{2}
```

### Indented Code (2 spaces)

```rust
some of match {
  2 of <space>;
}

some of <char>;
";";

// let value = 5;
```

Turns into

```regex
(?: {2})+.+;
```

### Semantic Versions

```rust
<start>;

option of "v";

capture major {
  some of <digit>;
}

".";

capture minor {
  some of <digit>;
}

".";

capture patch {
  some of <digit>;
}

<end>;

// v1.0.0
```

Turns into

```regex
^v?(?<major>\d+)\.(?<minor>\d+)\.(?<patch>\d+)$
```
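
As a rough illustration of how the compiled output might be consumed, the sketch below uses the regex from the Semantic Versions example in TypeScript. The `parseVersion` helper and the `Version` type are hypothetical names made up for this example and are not part of Melody.

```typescript
// Hypothetical helper: `parseVersion` and `Version` are illustrative names, not part of Melody.
type Version = { major: number; minor: number; patch: number };

// The regex from the Semantic Versions example, built from a string so the
// named groups do not depend on the compiler's regex-literal target.
const SEMVER = new RegExp("^v?(?<major>\\d+)\\.(?<minor>\\d+)\\.(?<patch>\\d+)$");

function parseVersion(input: string): Version | null {
  const match = SEMVER.exec(input);
  if (match === null) return null;
  // Named groups are also numbered: major = 1, minor = 2, patch = 3.
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
  };
}

console.log(parseVersion("v1.0.0")); // major 1, minor 0, patch 0
console.log(parseVersion("1.0"));    // null: the patch component is missing
```

Because the pattern is anchored at both ends, partial matches inside a longer string are rejected rather than silently accepted.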

## Playground

You can try Melody in your browser using the [playground](https://melody-playground.vercel.app/)

## Book

Read the book [here](https://yoav-lavi.github.io/melody/book/)

## Install

### Cargo

```sh
cargo install melody_cli
```

### From Source

```sh
git clone https://github.com/yoav-lavi/melody.git
cd melody
cargo install --path crates/melody_cli
```

### Binary

- macOS binaries (`aarch64` and `x86_64`) can be downloaded from the [release page](https://github.com/yoav-lavi/melody/releases)

### Community

- [Brew](https://formulae.brew.sh/formula/melody) (macOS and Linux)

```sh
brew install melody
```

- [Arch Linux](https://aur.archlinux.org/packages/melody) (maintained by [@ilai-deutel](https://github.com/ilai-deutel))

1. Installation with an AUR helper, for instance using `paru`:

```bash
paru -Syu melody
```

2. Install manually with `makepkg`:

```bash
git clone https://aur.archlinux.org/melody.git
cd melody
makepkg -si
```

- [NixOS](https://github.com/NixOS/nixpkgs/blob/master/pkgs/by-name/me/melody/package.nix) (maintained by [@jyooru](https://github.com/jyooru))

1. Declarative installation using `/etc/nixos/configuration.nix`:

```nix
{ pkgs, ... }:
{
  environment.systemPackages = with pkgs; [
    melody
  ];
}
```

2. Imperative installation using `nix-env`:

```sh
nix-env -iA nixos.melody
```

## CLI Usage

```
USAGE:
    melody [OPTIONS] [INPUT_FILE_PATH]

ARGS:
    <INPUT_FILE_PATH>
            Read from a file
            Use '-' and or pipe input to read from stdin

OPTIONS:
    -f, --test-file
            Test the compiled regex against the contents of a file

        --generate-completions
            Outputs completions for the selected shell
            To use, write the output to the appropriate location for your shell

    -h, --help
            Print help information

    -n, --no-color
            Print output with no color

    -o, --output
            Write to a file

    -r, --repl
            Start the Melody REPL

    -t, --test
            Test the compiled regex against a string

    -V, --version
            Print version information
```

## Changelog

See the changelog [here](https://github.com/yoav-lavi/melody/blob/main/CHANGELOG.md) or on the [release page](https://github.com/yoav-lavi/melody/releases)

## Syntax

### Quantifiers

- `... of` - used to express a specific amount of a pattern. equivalent to regex `{5}` (assuming `5 of ...`)
- `... to ... of` - used to express an amount within a range of a pattern. equivalent to regex `{5,9}` (assuming `5 to 9 of ...`)
- `over ... of` - used to express more than an amount of a pattern. equivalent to regex `{6,}` (assuming `over 5 of ...`)
- `some of` - used to express 1 or more of a pattern. equivalent to regex `+`
- `any of` - used to express 0 or more of a pattern. equivalent to regex `*`
- `option of` - used to express 0 or 1 of a pattern. equivalent to regex `?`

All quantifiers can be preceded by `lazy` to match as few characters as possible rather than as many as possible (greedy). Equivalent to regex `+?`, `*?`, etc.
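
To make the quantifier equivalents concrete, here is a small TypeScript sketch that exercises the regex forms listed above. The patterns are written by hand for illustration and were not produced by running Melody; the anchors `^` and `$` are added only so each test checks the whole string.

```typescript
// Regex forms from the list above, written by hand (not Melody output).
const exactlyFive = /^a{5}$/;   // 5 of "a"
const fiveToNine = /^a{5,9}$/;  // 5 to 9 of "a"
const overFive = /^a{6,}$/;     // over 5 of "a"

console.log(exactlyFive.test("aaaaa")); // true
console.log(fiveToNine.test("aaaa"));   // false: only four characters
console.log(overFive.test("aaaaa"));    // false: "over 5" needs at least six
console.log(overFive.test("aaaaaa"));   // true

// Greedy versus lazy on the same input.
const sample = "<b>bold</b>";
const greedy = sample.match(/<.+>/);  // greedy "some of": takes as much as it can
const lazy = sample.match(/<.+?>/);   // lazy "some of": stops at the first ">"

console.log(greedy?.[0]); // "<b>bold</b>"
console.log(lazy?.[0]);   // "<b>"
```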

### Symbols

- `<char>` - matches any single character. equivalent to regex `.`
- `<space>` - matches a space character. equivalent to regex ` `
- `<whitespace>` - matches any kind of whitespace character. equivalent to regex `\s` or `[ \t\n\v\f\r]`
- `<newline>` - matches a newline character. equivalent to regex `\n`
- `<tab>` - matches a tab character. equivalent to regex `\t`
- `<return>` - matches a carriage return character. equivalent to regex `\r`
- `<feed>` - matches a form feed character. equivalent to regex `\f`
- `<null>` - matches a null character. equivalent to regex `\0`
- `<digit>` - matches any single digit. equivalent to regex `\d` or `[0-9]`
- `<vertical>` - matches a vertical tab character. equivalent to regex `\v`
- `<word>` - matches a word character (any latin letter, any digit or an underscore). equivalent to regex `\w` or `[a-zA-Z0-9_]`
- `<alphabetic>` - matches any single latin letter. equivalent to regex `[a-zA-Z]`
- `<alphanumeric>` - matches any single latin letter or any single digit. equivalent to regex `[a-zA-Z0-9]`
- `<boundary>` - matches a position between a character matched by `<word>` and a character not matched by `<word>` without consuming the character. equivalent to regex `\b`
- `<backspace>` - matches a backspace control character. equivalent to regex `[\b]`

All symbols can be preceded with `not` to match any character other than the symbol
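
Two of the equivalents above are easy to confuse: `\b` is a word boundary, while `[\b]` is the backspace character. The TypeScript sketch below shows the difference, plus negation under the assumption that it maps to the usual negated classes such as `\W` or `[^ ]`. The patterns are hand-written, not Melody output.

```typescript
// Word boundary: matches a position, not a character.
const wordBoundary = /\bcat\b/;
console.log(wordBoundary.test("a cat sat"));   // true
console.log(wordBoundary.test("concatenate")); // false: "cat" sits inside a word

// Inside a character class, \b means the backspace character U+0008.
const backspace = /[\b]/;
console.log(backspace.test("\u0008")); // true
console.log(backspace.test("cat"));    // false

// Negation, assuming it maps to the usual negated classes.
console.log("a_1-b".replace(/\W/g, ""));   // "a_1b": the hyphen is the only non-word character
console.log("a b".replace(/[^ ]/g, "x"));  // "x x": everything except the space is replaced
```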

### Special Symbols

- `<start>` - matches the start of the string. equivalent to regex `^`
- `<end>` - matches the end of the string. equivalent to regex `$`

### Unicode Categories

Note: these are not supported when testing in the CLI (`-t` or `-f`) as the regex engine used does not support unicode categories. These require using the `u` flag.

- `<category::letter>` - any kind of letter from any language
- `<category::lowercase_letter>` - a lowercase letter that has an uppercase variant
- `<category::uppercase_letter>` - an uppercase letter that has a lowercase variant
- `<category::titlecase_letter>` - a letter that appears at the start of a word when only the first letter of the word is capitalized
- `<category::cased_letter>` - a letter that exists in lowercase and uppercase variants
- `<category::modifier_letter>` - a special character that is used like a letter
- `<category::other_letter>` - a letter or ideograph that does not have lowercase and uppercase variants
- `<category::mark>` - a character intended to be combined with another character (e.g. accents, umlauts, enclosing boxes, etc.)
- `<category::non_spacing_mark>` - a character intended to be combined with another character without taking up extra space (e.g. accents, umlauts, etc.)
- `<category::spacing_combining_mark>` - a character intended to be combined with another character that takes up extra space (vowel signs in many Eastern languages)
- `<category::enclosing_mark>` - a character that encloses the character it is combined with (circle, square, keycap, etc.)
- `<category::separator>` - any kind of whitespace or invisible separator
- `<category::space_separator>` - a whitespace character that is invisible, but does take up space
- `<category::line_separator>` - line separator character U+2028
- `<category::paragraph_separator>` - paragraph separator character U+2029
- `<category::symbol>` - math symbols, currency signs, dingbats, box-drawing characters, etc
- `<category::math_symbol>` - any mathematical symbol
- `<category::currency_symbol>` - any currency sign
- `<category::modifier_symbol>` - a combining character (mark) as a full character on its own
- `<category::other_symbol>` - various symbols that are not math symbols, currency signs, or combining characters
- `<category::number>` - any kind of numeric character in any script
- `<category::decimal_digit_number>` - a digit zero through nine in any script except ideographic scripts
- `<category::letter_number>` - a number that looks like a letter, such as a Roman numeral
- `<category::other_number>` - a superscript or subscript digit, or a number that is not a digit 0–9 (excluding numbers from ideographic scripts)
- `<category::punctuation>` - any kind of punctuation character
- `<category::dash_punctuation>` - any kind of hyphen or dash
- `<category::open_punctuation>` - any kind of opening bracket
- `<category::close_punctuation>` - any kind of closing bracket
- `<category::initial_punctuation>` - any kind of opening quote
- `<category::final_punctuation>` - any kind of closing quote
- `<category::connector_punctuation>` - a punctuation character such as an underscore that connects words
- `<category::other_punctuation>` - any kind of punctuation character that is not a dash, bracket, quote or connector
- `<category::other>` - invisible control characters and unused code points
- `<category::control>` - an ASCII or Latin-1 control character: 0x00–0x1F and 0x7F–0x9F
- `<category::format>` - invisible formatting indicator
- `<category::private_use>` - any code point reserved for private use
- `<category::surrogate>` - one half of a surrogate pair in UTF-16 encoding
- `<category::unassigned>` - any code point to which no character has been assigned

These descriptions are from [regular-expressions.info](https://www.regular-expressions.info/unicode.html)
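
Since these categories need the `u` flag, a short TypeScript sketch may help show what that looks like on the consuming side. It assumes the categories compile to `\p{...}` property escapes and uses the standard Unicode short names (`L`, `Lu`, `Sc`); the patterns are hand-written, not Melody output.

```typescript
// Property escapes only work with the `u` flag. RegExp objects are built from
// strings so the example does not depend on the compiler's regex-literal target.
const anyLetter = new RegExp("^\\p{L}+$", "u"); // general category Letter
const uppercase = new RegExp("\\p{Lu}", "gu");  // Uppercase_Letter
const currency = new RegExp("\\p{Sc}", "u");    // Currency_Symbol
const notLetter = new RegExp("\\P{L}", "u");    // negated form: anything that is not a letter

console.log(anyLetter.test("Привет"));        // true: letters from any script
console.log(anyLetter.test("abc1"));          // false: "1" is not a letter
console.log("Hello World".match(uppercase));  // the two uppercase letters, "H" and "W"
console.log(currency.test("€"));              // true
console.log(notLetter.test("abc"));           // false: every character is a letter
```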

### Character Ranges

- `... to ...` - used with digits or alphabetic characters to express a character range. equivalent to regex `[5-9]` (assuming `5 to 9`) or `[a-z]` (assuming `a to z`)
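
A minimal TypeScript sketch of the two ranges mentioned above, using hand-written regexes rather than Melody output:

```typescript
// Hand-written regexes for the two ranges named above.
const digitFiveToNine = /[5-9]/g; // 5 to 9
const lowerAToZ = /[a-z]/g;       // a to z

console.log("0123456789".match(digitFiveToNine)?.join("")); // "56789"
console.log("Melody 0.19".match(lowerAToZ)?.join(""));      // "elody"
```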

### Literals

- `"..."` or `'...'` - used to mark a literal part of the match. Melody will automatically escape characters as needed. Quotes (of the same kind surrounding the literal) should be escaped
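
To show why automatic escaping matters, the sketch below escapes regex metacharacters in a literal by hand. `escapeLiteral` is a hypothetical helper written for this example; it is not Melody's implementation, and the exact set of characters Melody escapes may differ.

```typescript
// Hypothetical helper, not Melody's implementation: escapes regex metacharacters
// so the literal is matched character for character.
function escapeLiteral(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

console.log(escapeLiteral("a.b")); // a\.b

// Without escaping, "$", "." and the parentheses would be treated as regex syntax.
const price = new RegExp("^" + escapeLiteral("$9.99 (USD)") + "$");
console.log(price.test("$9.99 (USD)")); // true
console.log(price.test("$9x99 (USD)")); // false: the "." is matched literally
```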

### Raw

- \`...\` - added directly to the output without any escaping

### Groups

- `capture` - used to open a `capture` or named `capture` block. capture patterns are later available in the list of matches (either positional or named). equivalent to regex `(...)`
- `match` - used to open a `match` block, matches the contents without capturing. equivalent to regex `(?:...)`
- `either` - used to open an `either` block, matches one of the statements within the block. equivalent to regex `(?:...|...)`
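
The difference between the three group kinds shows up in what the match result contains. The TypeScript sketch below uses hand-written regexes of the shapes listed above; the sample strings are made up for illustration.

```typescript
// capture: each group shows up in the match result by position.
const captured = "2024-05".match(/(\d+)-(\d+)/);
console.log(captured?.[1], captured?.[2]); // "2024" "05"

// Named capture: built from a string so it does not depend on the compiler target.
const named = new RegExp("(?<year>\\d+)-(?<month>\\d+)").exec("2024-05");
console.log(named?.[1]); // "2024": named groups are numbered too

// match: groups the pattern for the quantifier but captures nothing.
const nonCapturing = "nanana".match(/(?:na){2}/);
console.log(nonCapturing?.[0], nonCapturing?.length); // "nana" 1

// either: one of the alternatives must match.
const either = /^(?:cat|dog)$/;
console.log(either.test("dog"), either.test("cow")); // true false
```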

### Assertions

- `ahead` - used to open an `ahead` block. equivalent to regex `(?=...)`. use after an expression
- `behind` - used to open a `behind` block. equivalent to regex `(?<=...)`. use before an expression
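
A short TypeScript sketch of both assertion kinds, using hand-written regexes and made-up sample strings. The asserted text is checked but never becomes part of the match.

```typescript
// ahead: digits that are followed by "px".
const beforePx = /\d+(?=px)/;
// behind: digits that are preceded by "$". Lookbehind needs an ES2018 engine,
// so the RegExp is built from a string.
const afterDollar = new RegExp("(?<=\\$)\\d+");

console.log("width: 20px".match(beforePx)?.[0]);  // "20": "px" is not part of the match
console.log("cost: $15".match(afterDollar)?.[0]); // "15": "$" is not part of the match
console.log(beforePx.test("20em"));               // false: no "px" after the digits
```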

Assertions can be preceded by `not` to create a negative assertion (equivalent to regex `(?!...)`, `(?<!...)`)

| Melody | Regex | Status |
| --- | --- | --- |
| `<...::...>` | `\p{...}` | 🐣 |
| `not <...::...>` | `\P{...}` | 🐣 |
| file watcher | | ❌ |
| multiline groups in REPL | | ❌ |
| `flags: global, multiline, ...` | `/.../gm...` | ❔ |
| (?) | `\#` | ❔ |
| (?) | `\k` | ❔ |
| (?) | `\uYYYY` | ❔ |
| (?) | `\xYY` | ❔ |
| (?) | `\ddd` | ❔ |
| (?) | `\cY` | ❔ |
| (?) | `$1` | ❔ |
| (?) | `` $` `` | ❔ |
| (?) | `$&` | ❔ |
| (?) | `x20` | ❔ |
| (?) | `x{06fa}` | ❔ |
| `any of "a", "b", "c"` \* | `[abc]` | ❓ |
| multiple ranges \* | `[a-zA-Z0-9]` | ❓ |
| regex optimization | | ❓ |
| standard library / patterns | | ❓ |
| reverse compiler | | ❓ |

\* these are expressible in the current syntax using other methods