https://github.com/bicycle1885/fmt.jl
Python-style format strings for Julia
- Host: GitHub
- URL: https://github.com/bicycle1885/fmt.jl
- Owner: bicycle1885
- License: mit
- Created: 2021-04-04T01:03:31.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-12-31T16:37:33.000Z (over 2 years ago)
- Last Synced: 2025-04-06T10:42:29.863Z (about 1 month ago)
- Topics: formatter, julia, string
- Language: Julia
- Size: 182 KB
- Stars: 39
- Watchers: 5
- Forks: 3
- Open Issues: 6
**This package is still under active development. The API may change at any time. There are almost no error checks. Only a handful of basic types are supported.**
# Fmt.jl ― Python-style format strings for Julia
Fmt.jl provides a Python-style format language.
It is an alternative to Printf.jl and the string utility functions in Base.
Formats are constructed by a [non-standard string literal](https://docs.julialang.org/en/v1/manual/strings/#non-standard-string-literals) prefixed by `f`, called f-strings.
In the following example, a part of an f-string surrounded by curly braces `{` `}` is replaced with a formatted floating-point number:
```
julia> using Fmt

julia> pi = float(π)
3.141592653589793

julia> f"π ≈ {$pi:.4f}"
"π ≈ 3.1416"
```

The goals of Fmt.jl are:
- **Full-fledged**: It supports nearly all features of Python's format strings.
- **Performant**: The formatter is much faster than `string` and other functions.
- **Lightweight**: It has no dependencies except the Base library.

## Overview
The `@f_str` macro (or f-string) is the only exported binding from the `Fmt` module.
This macro can interpolate variables into a string with format specification.
Interpolation happens inside replacement fields surrounded by a pair of curly braces `{}`; other parts of an f-string are treated as ordinary strings.
A replacement field usually has an argument `ARG` and a specification `SPEC` separated by a colon: `{ARG:SPEC}`, although both of them can be omitted.

Let's see some examples.
```julia
# load @f_str
using Fmt

# default format
x = 42
f"x is {$x}." == "x is 42."

# binary, octal, decimal, and hexadecimal format
f"{$x:b}" == "101010"
f"{$x:o}" == "52"
f"{$x:d}" == "42"
f"{$x:x}" == "2a"
f"{$x:X}" == "2A"

# format with a minimum width
f"{$x:4}" == "  42"
f"{$x:6}" == "    42"

# left, center, and right alignment
f"{$x:<6}" == "42    "
f"{$x:^6}" == "  42  "
f"{$x:>6}" == "    42"
f"{$x:*<6}" == "42****"
f"{$x:*^6}" == "**42**"
f"{$x:*>6}" == "****42"

# dynamic width
n = 6
f"{$x:<{$n}}" == "42    "
f"{$x:^{$n}}" == "  42  "
f"{$x:>{$n}}" == "    42"

# grouping digits with thousand separator
x = 1234567
f"{$x:,}" == "1,234,567"
```

In addition to f-strings, Fmt.jl provides two formatting functions:
- `Fmt.format(fstr, args...; kwargs...)` creates a formatted string by applying `args` and `kwargs` to `fstr`.
- `Fmt.printf([out,] fstr, args...; kwargs...)` prints a formatted string to `out` (default: `stdout`) by applying `args` and `kwargs` to `fstr`.

When using these functions, you cannot interpolate replacement fields with `$`. All replacement values are given as function arguments:
```julia
using Fmt

# positional arguments with implicit numbering
Fmt.format(f"{} and {}", "Alice", "Bob") == "Alice and Bob"

# positional arguments with explicit numbering
Fmt.format(f"{1} and {2}", "Alice", "Bob") == "Alice and Bob"
Fmt.format(f"{2} and {1}", "Alice", "Bob") == "Bob and Alice"

# keyword arguments
Fmt.format(f"{A} and {B}", A = "Alice", B = "Bob") == "Alice and Bob"
Fmt.format(f"{B} and {A}", A = "Alice", B = "Bob") == "Bob and Alice"

# box drawing example
Fmt.printf(f"""
┌{1:─^{2}}┐             ┌{1:─^{2}}┐
│{A: ^{2}}│ ──────────> │{B: ^{2}}│
└{1:─^{2}}┘             └{1:─^{2}}┘
""", "", 15, A = "Alice", B = "Bob")
# ┌───────────────┐             ┌───────────────┐
# │     Alice     │ ──────────> │      Bob      │
# └───────────────┘             └───────────────┘
```

The syntax of f-strings is borrowed from [Python's Format String Syntax](https://docs.python.org/3/library/string.html#format-string-syntax), which is ported to C++ as [C++20 std::format](https://en.cppreference.com/w/cpp/utility/format) and Rust as [std::fmt](https://doc.rust-lang.org/std/fmt/).
See the next sections for details of the syntax and semantics supported by Fmt.jl.

## Syntax
Each replacement field is surrounded by a pair of curly braces.
To escape curly braces, double curly braces (`{{` and `}}`) are interpreted as single curly braces (`{` and `}`).
Backslash-escaped characters are treated in the same way as in usual strings.
However, dollar signs `$` are no longer a special character for interpolation; that is, no interpolation happens outside replacement fields.

The syntax of a replacement field is formally defined as follows:
```
# replacement field
field      = '{'[argument]['/'conv][':'spec]'}'
argument   = number | ['$']identifier | '$('expression')'
number     = digit+
identifier = any valid identifier
expression = any valid expression
digit      = '0' | '1' | '2' | … | '9'
conv       = 's' | 'r'

# format specification
spec       = [[fill]align][sign][altform][zero][width][grouping]['.'precision][type]
fill       = any valid character (except '{' and '}') | '{'[argument]'}'
align      = '<' | '^' | '>'
sign       = '+' | '-' | ' '
altform    = '#'
zero       = '0'
width      = digit+ | '{'[argument]'}'
grouping   = ',' | '_'
precision  = digit+ | '{'[argument]'}'
type       = 'd' | 'X' | 'x' | 'o' | 'B' | 'b' | 'c' | 'p' | 's' |
             'F' | 'f' | 'E' | 'e' | 'G' | 'g' | 'A' | 'a' | '%'
```

Note that *syntactic* validity does not imply *semantic* validity.
For example, `{:,s}` is syntactically valid but semantically invalid, because the string type `s` does not support the thousands separator `,`.

A sequence of `zero` and `width` may be ambiguous because `width` may start with `0`.
To resolve the ambiguity, if `0` is followed by a digit, the leading zero is interpreted as `zero` and the following digits are interpreted as `width`.
Otherwise, the zero is interpreted as `width`.
For example, `08` means zero padding with a width of 8, while a lone `0` means a width of 0.

## Semantics
The semantics of the format specification are basically the same as in Python.
Fields that have an argument prefixed by `$` are interpolated like ordinary strings.
Currently, mixing interpolated and non-interpolated replacement fields in an f-string is not allowed.
The f-string returns a string if there is a field with interpolation.
Otherwise, it returns an `Fmt.Format` object, which can be passed to `Fmt.format` and `Fmt.printf` as the formatting template.

```julia
f"x is {$x}." isa String #> true
f"x is {x}." isa Fmt.Format #> true
f"x is {}." isa Fmt.Format #> true
f"x is x." isa Fmt.Format #> true
```

### Argument
The `argument` is either positional or keyword.
Positional arguments are numbered from one, and their values are supplied from arguments passed to the `Fmt.format` function.
If numbers are omitted, they are automatically numbered incrementally from left to right, which is independent from other kinds of arguments.
Keyword arguments are named by a variable and may be interpolated.
If a keyword argument is interpolated (indicated by `$`), its value is supplied from the context where the replacement field is placed; otherwise, its value is supplied from a keyword argument with the same name passed to the `Fmt.format` function.
Currently, you cannot mix interpolated keyword arguments with other kinds of arguments in a single format.

Interpolated formats immediately return a string of the `String` type, while other formats are evaluated to an `Fmt.Format` object.
The `Fmt.Format` object can be passed to the `Fmt.format` function as its first argument to create a formatted string.

```julia
# Positional arguments
Fmt.format(f"{1} {2}", "foo", "bar") == "foo bar"

# Positional arguments (implicit numbers)
Fmt.format(f"{} {}", "foo", "bar") == "foo bar"

# Keyword arguments
Fmt.format(f"{x} {y}", x = "foo", y = "bar") == "foo bar"

# Positional and keyword arguments
Fmt.format(f"{1} {x} {2}", "foo", "bar", x = "and") == "foo and bar"

# Keyword arguments with interpolation
x, y = "foo", "bar"
f"{$x} {$y}" == "foo bar"
```

### Conversion
Conversion is indicated by `/` followed by `s` or `r`.
If conversion is specified, the argument is first converted to a string representation using the `string` or `repr` function.
As the conversion characters suggest, `/s` converts the argument using the `string` function and `/r` with the `repr` function.

```julia
# Conversion
Fmt.format(f"{/s}", 'a') == "a"
Fmt.format(f"{/r}", 'a') == "'a'"
```

Python uses `!` to mark the conversion syntax.
Fmt.jl uses `/` instead to avoid syntactic ambiguity, because Julia allows `!` as a valid character for identifiers.

### Fill and alignment
The content of a formatted value can be aligned within the specified `width`.
Note that text alignment does not make sense unless `width` is specified.

The `align` character indicates an alignment type as follows:
- `<` : left alignment
- `^` : center alignment
- `>` : right alignment

The left and right margins are filled with `fill`.
It can be any character except `{` and `}`.
If omitted, a space character (i.e., U+0020) is used.

```julia
# Alignment with the default fill
Fmt.format(f"{:<7}", "foo") == "foo    "
Fmt.format(f"{:^7}", "foo") == "  foo  "
Fmt.format(f"{:>7}", "foo") == "    foo"

# Alignment with a specified fill
Fmt.format(f"{:*<7}", "foo") == "foo****"
Fmt.format(f"{:*^7}", "foo") == "**foo**"
Fmt.format(f"{:*>7}", "foo") == "****foo"
```

### Sign
`sign` controls the character indicating the sign of a number:
- `-` : a sign should be used only for negative values (default)
- `+` : a sign should be used for both non-negative and negative values
- space : a sign should be used only for negative values and a space should be used for non-negative values

Note that `sign` is only meaningful for numbers.
```julia
Fmt.format(f"{:-}", 3) == "3"
Fmt.format(f"{:-}", -3) == "-3"
Fmt.format(f"{:+}", 3) == "+3"
Fmt.format(f"{:+}", -3) == "-3"
Fmt.format(f"{: }", 3) == " 3"
Fmt.format(f"{: }", -3) == "-3"
```

### Alternate form (altform)
`altform` (`#`) indicates that the value should be formatted in a different way, depending on the type of the value and the `type` character.
For integers, it indicates that the prefix (`0b`, `0o`, `0x`, or `0X`) should be added before digits:
```julia
# Standard form of integers
Fmt.format(f"{:o}", 42) == "52"
Fmt.format(f"{:x}", 42) == "2a"

# Alternate form of integers
Fmt.format(f"{:#o}", 42) == "0o52"
Fmt.format(f"{:#x}", 42) == "0x2a"
```

For floating-point numbers, it indicates ... (TBD).
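For the integer case, the prefixes are the same ones Julia uses for its own integer literals. The sketch below uses only Base, not Fmt.jl, to show what the alternate form adds to the digits. `with_prefix` is a hypothetical helper introduced here for illustration, and it only handles non-negative values.

```julia
# Hypothetical helper (not part of Fmt.jl): render a non-negative integer in
# the given base and prepend the prefix that the `#` flag would add.
function with_prefix(n::Integer, base::Integer)
    prefixes = Dict(2 => "0b", 8 => "0o", 16 => "0x")
    return prefixes[base] * string(n, base = base)
end

with_prefix(42, 2)   # "0b101010"
with_prefix(42, 8)   # "0o52"
with_prefix(42, 16)  # "0x2a"

# The prefixes match Julia's literal syntax, so Base can parse the result back.
parse(Int, with_prefix(42, 16)) == 42
```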
### Zero
`zero` (`0`) indicates that sign-aware zero padding should be added to fill the width specified by `width`.
That is, zeros for padding are added after the sign, not before the sign like `fill`.
The following example illustrates the difference between sign-aware padding and sign-ignorant padding:
```julia
# Sign-aware zero padding
Fmt.format(f"{:+08}", 42) == "+0000042"

# Sign-ignorant zero padding
Fmt.format(f"{:0>+8}", 42) == "00000+42"
```

### Width
`width` indicates the minimum width of a formatted string.
```julia
# Format an integer with minimum width 4
Fmt.format(f"{:4}", 123) == " 123"
Fmt.format(f"{:4}", 1234) == "1234"
Fmt.format(f"{:4}", 12345) == "12345"
```

The default alignment depends on the type of a value.
For example, numbers are right-aligned while strings are left-aligned unless `align` is specified.

```julia
Fmt.format(f"{:4}", 1) == "   1"
Fmt.format(f"{:4}", "a") == "a   "
```

### Grouping
`grouping` specifies how digits are grouped.
For integers with the decimal format, `,` and `_` indicate a thousands separator (e.g., `1,234,567`).
For integers with the binary, octal or hexadecimal format, `_` indicates a four-digit separator (e.g., `0x1234_5678`).
For floating-point numbers, integral parts are grouped.
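The grouping rule itself can be modelled in a few lines of plain Julia. This is only a sketch of the rule described above, not Fmt.jl's implementation; `group_digits` is a hypothetical helper and it assumes the input contains ASCII digits only (no sign, prefix, or fractional part).

```julia
# Hypothetical helper (not part of Fmt.jl): insert `sep` between groups of
# `n` digits, counting from the right.
function group_digits(digits::AbstractString, sep::Char, n::Int)
    chunks = String[]
    stop = lastindex(digits)  # ASCII input, so indices are character positions
    while stop >= 1
        start = max(1, stop - n + 1)
        pushfirst!(chunks, digits[start:stop])
        stop = start - 1
    end
    return join(chunks, sep)
end

group_digits(string(1234567), ',', 3)                # "1,234,567"
group_digits(string(0xdeadbeef, base = 16), '_', 4)  # "dead_beef"
```

Decimal digits are grouped in threes and hexadecimal digits in fours, which is the only difference between the two cases.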
```julia
# integers
Fmt.format(f"{:,}", 123456789) == "123,456,789"
Fmt.format(f"{:_}", 123456789) == "123_456_789"
Fmt.format(f"{:#_x}", 0xdeadbeef) == "0xdead_beef"

# floats
Fmt.format(f"{:,f}", 2.99792458e8) == "299,792,458.000000"
Fmt.format(f"{:_f}", 2.99792458e8) == "299_792_458.000000"
```

### Precision
For floating-point numbers, `precision` specifies the precision of a formatted representation string of a number.
TBD
```julia
Fmt.format(f"{:.2f}", Float64(pi)) == "3.14"
Fmt.format(f"{:.3f}", Float64(pi)) == "3.142"
Fmt.format(f"{:.4f}", Float64(pi)) == "3.1416"
```

### Type
#### Integers
| Type | Description |
| :--: | ----------- |
| `d` | decimal |
| `X` | hexadecimal (uppercase) |
| `x` | hexadecimal (lowercase) |
| `o` | octal |
| `B` | binary (uppercase) |
| `b` | binary (lowercase) |
| `c` | character |
| none | decimal |

#### Floating-point numbers
| Type | Description |
| :--: | ----------- |
| `F` | fixed-point notation (uppercase) |
| `f` | fixed-point notation (lowercase) |
| `E` | scientific notation (uppercase) |
| `e` | scientific notation (lowercase) |
| `A` | hexadecimal notation (uppercase) |
| `a` | hexadecimal notation (lowercase) |
| `G` | general notation (uppercase) |
| `g` | general notation (lowercase) |
| `%` | percentage (multiplied by 100) |
| none | general notation |

There are three kinds of notations for floating-point numbers.
Fixed-point notation refers to a notation without an exponent part, such as `3.14` and `0.001`.
Scientific notation refers to a notation with an exponent part, such as `6.02e+23` and `1e-8`.
Hexadecimal notation is similar to scientific notation, but it is prefixed by `0x` and its fractional part is denoted in hexadecimal digits.

General notation may be in fixed-point notation or scientific notation, depending on the exponent part of a number.
It chooses fixed-point notation if the exponent part of the value is within a "reasonable" range.
Otherwise, it chooses scientific notation because denoting the value in fixed-point notation would be too long.

`F` and `f` force fixed-point notation.
The only difference between `F` and `f` is that `F` uses uppercase letters for (positive and negative) infinities and NaNs (i.e., `INF` and `NAN`, respectively) while `f` uses lowercase letters (i.e., `inf` and `nan`, respectively).

`E` and `e` force scientific notation.
The difference between `E` and `e` is analogous to that of `F` and `f`, but the prefix of the exponent part is denoted in an uppercase letter (i.e., `E`) for `E` and in a lowercase letter (i.e., `e`) for `e`.

`A` and `a` force hexadecimal notation.
`A` uses uppercase letters and `a` uses lowercase letters.

`G` uses `F` or `E`, and `g` uses `f` or `e`, depending on the value as already mentioned above.
`%` multiplies a value by 100, denotes the value in fixed-point notation, and appends the `%` mark.
If no type specifier is given, the notation is the same as that of `g` but at least one digit is shown past the decimal point.
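To make the notations concrete, the sketch below uses the standard library's `Printf` as a stand-in, because its `%f`, `%e`, and `%g` conversions follow the same C-derived notations. It is an assumption that Fmt.jl's `f`, `e`, and `g` types produce the same text for these inputs; that is not verified here.

```julia
using Printf  # standard library, used only as a stand-in for the notations

x = 1234.5678

fixed      = @sprintf("%.2f", x)  # "1234.57"  (no exponent part)
scientific = @sprintf("%.2e", x)  # "1.23e+03" (lowercase exponent prefix)
upper_sci  = @sprintf("%.2E", x)  # "1.23E+03" (uppercase exponent prefix)

# General notation: fixed-point for moderate exponents, scientific otherwise.
general_small = @sprintf("%g", 0.0001)  # "0.0001"
general_large = @sprintf("%g", 1.0e10)  # "1e+10"

# Percentage: multiply by 100, use fixed-point notation, then append '%'.
percent = @sprintf("%.1f", 0.256 * 100) * "%"  # "25.6%"
```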
#### Rationals
If `type` is `F` or `f`, it formats the number in fixed-point notation.
If `type` is `%`, it formats the number in the same way as `f` but the number is multiplied by 100, followed by `%`.
If no `type` is specified, it formats the number with its (reduced) numerator and denominator separated by a slash (e.g., `3/10`).

#### Other values
`p` is for pointers and `s` for strings.
These are the default for each type and do not specify any special format.

## Performance
Fmt.jl is carefully optimized and is faster than naive printing in the benchmark below.
The following benchmark script prints pairs of integers to `devnull`.

```julia
using Fmt
using Printf
using Formatting

fmt_print(out, x, y) = print(out, f"({$x}, {$y})\n")
sprintf_print(out, x, y) = print(out, @sprintf("(%d, %d)\n", x, y))
naive_print(out, x, y) = print(out, '(', x, ", ", y, ")\n")
string_print(out, x, y) = print(out, "($x, $y)\n")
const expr = FormatExpr("({1}, {2})\n")
formatting_print(out, x, y) = print(out, format(expr, x, y))

function benchmark(printer, out, x, y)
    @assert length(x) == length(y)
    for i in 1:length(x)
        printer(out, x[i], y[i])
    end
end

using Random
Random.seed!(1234)
x = rand(-999:999, 1_000_000)
y = rand(-999:999, 1_000_000)

using BenchmarkTools
for printer in [fmt_print, sprintf_print, naive_print,
                string_print, formatting_print]
    print(f"{$printer:>20}:")
    @btime benchmark($printer, $devnull, $x, $y)
end
```

The result on my machine is:
```
$ julia benchmark/compare.jl
           fmt_print: 37.928 ms (2000000 allocations: 91.55 MiB)
       sprintf_print: 77.613 ms (2000000 allocations: 106.81 MiB)
         naive_print: 202.531 ms (4975844 allocations: 198.00 MiB)
        string_print: 316.838 ms (7975844 allocations: 365.84 MiB)
    formatting_print: 716.088 ms (23878703 allocations: 959.44 MiB)
```

Benchmark environment:
- CPU: AMD Ryzen 9 3950X
- OS: GNU/Linux 5.9.12
- Julia: v1.6.0
- Formatting.jl: v0.4.2

## Related projects
- [Printf.jl](https://docs.julialang.org/en/v1/stdlib/Printf/) provides C-style formatting macros. In my opinion, it doesn't match the dynamic nature of Julia because it needs type specifiers.
- [Formatting.jl](https://github.com/JuliaIO/Formatting.jl) provides similar functionality with different APIs. Fmt.jl is much simpler and more performant.