Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dk1a/solidity-stringutils
StrSlice & Slice library for Solidity
https://github.com/dk1a/solidity-stringutils
ethereum library slice smart-contracts solidity string
Last synced: 3 months ago
JSON representation
StrSlice & Slice library for Solidity
- Host: GitHub
- URL: https://github.com/dk1a/solidity-stringutils
- Owner: dk1a
- License: mit
- Created: 2022-12-06T21:07:33.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-01-18T20:50:37.000Z (about 2 years ago)
- Last Synced: 2024-10-14T01:25:38.359Z (3 months ago)
- Topics: ethereum, library, slice, smart-contracts, solidity, string
- Language: Solidity
- Homepage:
- Size: 235 KB
- Stars: 21
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
README
# StrSlice & Slice library for Solidity
- Types: [StrSlice](src/StrSlice.sol) for strings, [Slice](src/Slice.sol) for bytes, [StrChar](src/StrChar.sol) for characters
- [Gas efficient](https://github.com/dk1a/solidity-stringutils-gas)
- Versioned releases, available for both foundry and hardhat
- Simple imports, you only need e.g. `StrSlice` and `toSlice`
- `StrSlice` enforces UTF-8 character boundaries; `StrChar` validates character encoding
- Clean, well-documented and thoroughly-tested source code
- Optional [PRBTest](https://github.com/paulrberg/prb-test) extension with assertions like `assertContains` and `assertLt` for both slices and native `bytes`, `string`
- `Slice` and `StrSlice` are value types, not structs
- Low-level functions like [memchr](src/utils/memchr.sol), [memcmp, memmove etc](src/utils/mem.sol)## Install
### Node
```sh
yarn add @dk1a/solidity-stringutils
```### Forge
```sh
forge install --no-commit dk1a/solidity-stringutils
```## StrSlice
```solidity
import { StrSlice, toSlice } from "@dk1a/solidity-stringutils/src/StrSlice.sol";using { toSlice } for string;
/// @dev Returns the content of brackets, or empty string if not found
function extractFromBrackets(string memory stuffInBrackets) pure returns (StrSlice extracted) {
StrSlice s = stuffInBrackets.toSlice();
bool found;(found, , s) = s.splitOnce(toSlice("("));
if (!found) return toSlice("");(found, s, ) = s.rsplitOnce(toSlice(")"));
if (!found) return toSlice("");return s;
}
/*
assertEq(
extractFromBrackets("((1 + 2) + 3) + 4"),
toSlice("(1 + 2) + 3")
);
*/
```See [ExamplesTest](test/Examples.t.sol).
Internally `StrSlice` uses `Slice` and extends it with logic for multibyte UTF-8 where necessary.
| Method | Description |
| ---------------- | ------------------------------------------------ |
| `len` | length in **bytes** |
| `isEmpty` | true if len == 0 |
| `toString` | copy slice contents to a **new** string |
| `keccak` | equal to `keccak256(s.toString())`, but cheaper |
**concatenate**
| `add` | Concatenate 2 slices into a **new** string |
| `join` | Join slice array on `self` as separator |
**compare**
| `cmp` | 0 for eq, < 0 for lt, > 0 for gt |
| `eq`,`ne` | ==, != (more efficient than cmp) |
| `lt`,`lte` | <, <= |
| `gt`,`gte` | >, >= |
**index**
| `isCharBoundary` | true if given index is an allowed boundary |
| `get` | get 1 UTF-8 character at given index |
| `splitAt` | (slice[:index], slice[index:]) |
| `getSubslice` | slice[start:end] |
**search**
| `find` | index of the start of the **first** match |
| `rfind` | index of the start of the **last** match |
| | *return `type(uint256).max` for no matches* |
| `contains` | true if a match is found |
| `startsWith` | true if starts with pattern |
| `endsWith` | true if ends with pattern |
**modify**
| `stripPrefix` | returns subslice without the prefix |
| `stripSuffix` | returns subslice without the suffix |
| `splitOnce` | split into 2 subslices on the **first** match |
| `rsplitOnce` | split into 2 subslices on the **last** match |
| `replacen` | *experimental* replace `n` matches |
| | *replacen requires 0 < pattern.len() <= to.len()*|
**iterate**
| `chars` | character iterator over the slice |
**ascii**
| `isAscii` | true if all chars are ASCII |
**dangerous**
| `asSlice` | get underlying Slice |
| `ptr` | get memory pointer |Indexes are in **bytes**, not characters. Indexing methods revert if `isCharBoundary` is false.
## StrCharsIter
*Returned by `chars` method of `StrSlice`*
```solidity
import { StrSlice, toSlice, StrCharsIter } from "@dk1a/solidity-stringutils/src/StrSlice.sol";using { toSlice } for string;
/// @dev Returns a StrSlice of `str` with the 2 first UTF-8 characters removed
/// reverts on invalid UTF8
function removeFirstTwoChars(string memory str) pure returns (StrSlice) {
StrCharsIter memory chars = str.toSlice().chars();
for (uint256 i; i < 2; i++) {
if (chars.isEmpty()) break;
chars.next();
}
return chars.asStr();
}
/*
assertEq(removeFirstTwoChars(unicode"๐!ใใใซใกใฏ"), unicode"ใใใซใกใฏ");
*/
```| Method | Description |
| ---------------- | ------------------------------------------------ |
| `asStr` | get underlying StrSlice of the remainder |
| `len` | remainder length in **bytes** |
| `isEmpty` | true if len == 0 |
| `next` | advance the iterator, return the next StrChar |
| `nextBack` | advance from the back, return the next StrChar |
| `count` | returns the number of UTF-8 characters |
| `validateUtf8` | returns true if the sequence is valid UTF-8 |
**dangerous**
| `unsafeNext` | advance unsafely, return the next StrChar |
| `unsafeCount` | unsafely count chars, read the source for caveats|
| `ptr` | get memory pointer |`count`, `validateUtf8`, `unsafeCount` consume the iterator in O(n).
Safe methods revert on an invalid UTF-8 byte sequence.
`unsafeNext` does NOT check if the iterator is empty, may underflow! Does not revert on invalid UTF-8. If returned `StrChar` is invalid, it will have length 0. Otherwise length 1-4.
Internally `next`, `unsafeNext`, `count` all use `_nextRaw`. It's very efficient, but very unsafe and complicated. Read the source and import it separately if you need it.
## StrChar
Represents a single UTF-8 encoded character.
Internally it's bytes32 with leading byte at MSB.It's returned by some methods of `StrSlice` and `StrCharsIter`.
| Method | Description |
| ---------------- | ------------------------------------------------ |
| `len` | character length in bytes |
| `toBytes32` | returns the underlying `bytes32` value |
| `toString` | copy the character to a new string |
| `toCodePoint` | returns the unicode code point (`ord` in python) |
| `cmp` | 0 for eq, < 0 for lt, > 0 for gt |
| `eq`,`ne` | ==, != |
| `lt`,`lte` | <, <= |
| `gt`,`gte` | >, >= |
| `isValidUtf8` | usually true |
| `isAscii` | true if the char is ASCII |Import `StrChar__` (static function lib) to use `StrChar__.fromCodePoint` for code point to `StrChar` conversion.
`len` can return `0` *only* for invalid UTF-8 characters. But some invalid chars *may* have non-zero len! (use `isValidUtf8` to check validity). Note that `0x00` is a valid 1-byte UTF-8 character, its len is 1.
`isValidUtf8` can be false if the character was formed with an unsafe method (fromUnchecked, wrap).
## Slice
```solidity
import { Slice, toSlice } from "@dk1a/solidity-stringutils/src/Slice.sol";using { toSlice } for bytes;
function findZeroByte(bytes memory b) pure returns (uint256 index) {
return b.toSlice().find(
bytes(hex"00").toSlice()
);
}
```See `using {...} for Slice global` in the source for a function summary. Many are shared between `Slice` and `StrSlice`, but there are differences.
Internally Slice has very minimal assembly, instead using `memcpy`, `memchr`, `memcmp` and others; if you need the low-level functions, see `src/utils/`.
## Assertions (PRBTest extension)
```solidity
import { PRBTest } from "@prb/test/src/PRBTest.sol";
import { Assertions } from "@dk1a/solidity-stringutils/src/test/Assertions.sol";contract StrSliceTest is PRBTest, Assertions {
function testContains() public {
bytes memory b1 = "12345";
bytes memory b2 = "3";
assertContains(b1, b2);
}function testLt() public {
string memory s1 = "123";
string memory s2 = "124";
assertLt(s1, s2);
}
}
```You can completely ignore slices if all you want is e.g. `assertContains` for native `bytes`/`string`.
## Acknowledgements
- [Arachnid/solidity-stringutils](https://github.com/Arachnid/solidity-stringutils) - I basically wanted to make an updated version of solidity-stringutils
- [rust](https://doc.rust-lang.org/core/index.html) - most similarities are in names and general structure; the implementation can't really be similar (solidity doesn't even have generics)
- [paulrberg/prb-math](https://github.com/paulrberg/prb-math) - good template for solidity data structure libraries with `using {...} for ... global`
- [brockelmore/memmove](https://github.com/brockelmore/memmove) - good assembly memory management examples