https://github.com/st235/uric
A Uniform Resource Identifier (URI) implementation in C++, designed strictly in compliance with RFC 3986.
- Host: GitHub
- URL: https://github.com/st235/uric
- Owner: st235
- License: mit
- Created: 2024-07-15T17:02:09.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-08-03T16:47:18.000Z (5 months ago)
- Last Synced: 2024-08-03T23:55:52.088Z (5 months ago)
- Topics: cpp, parser, pico, raspberry-pi-pico, rfc3986, uri, url
- Language: C++
- Homepage: https://datatracker.ietf.org/doc/html/rfc3986
- Size: 218 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# URIC
A Uniform Resource Identifier (URI) implementation in `C++`, designed strictly in compliance with [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986).
The library also includes an implementation of Uniform Resource Locator (URL) rules, which are a subset of URI rules. This implementation follows [the `MDN` definition of URL](https://developer.mozilla.org/en-US/docs/Learn/Common_questions/Web_mechanics/What_is_a_URL), specifically:
> Parameters
> `?key1=value1&key2=value2` are extra parameters provided to the Web server. Those parameters are a list of key/value pairs separated with the & symbol. The Web server can use those parameters to do extra stuff before returning the resource. Each Web server has its own rules regarding parameters, and the only reliable way to know if a specific Web server is handling parameters is by asking the Web server owner.

The library was tested with [`Raspberry Pi Pico and Pico W`](https://www.raspberrypi.com/documentation/microcontrollers/raspberry-pi-pico.html).
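To make the parameter format quoted above concrete, here is a minimal sketch of splitting a query string into key/value pairs. `split_query` is a hypothetical standalone helper written for illustration; it is not part of URIC's API, and it assumes C++17, that the leading `?` has already been removed, and that no percent-decoding is required.

```cpp
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper, not part of URIC: splits "key1=value1&key2=value2"
// (the query without the leading "?") into key/value pairs.
std::vector<std::pair<std::string, std::string>> split_query(std::string_view query) {
    std::vector<std::pair<std::string, std::string>> pairs;
    while (!query.empty()) {
        // Each item runs up to the next "&" (or to the end of the query).
        const auto amp = query.find('&');
        const std::string_view item = query.substr(0, amp);
        if (!item.empty()) {
            // Everything before the first "=" is the key, the rest is the value.
            const auto eq = item.find('=');
            std::string key(item.substr(0, eq));
            std::string value;
            if (eq != std::string_view::npos) {
                value = std::string(item.substr(eq + 1));
            }
            pairs.emplace_back(std::move(key), std::move(value));
        }
        if (amp == std::string_view::npos) {
            break;
        }
        query.remove_prefix(amp + 1);
    }
    return pairs;
}
```

For example, `split_query("key1=value1&key2=value2")` yields the pairs `(key1, value1)` and `(key2, value2)`; a key without `=` is kept with an empty value.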
>[!NOTE]
>In the event that RFC 3986 is unavailable, please refer to the [local copy](./rfc3986.pdf).

## Bringing Dependencies
The library is built with CMake. To add it to your project, follow these steps:
first, include the `FetchContent` module in your CMake configuration. Then, declare the library by specifying the `git` repository and branch. Finally, make the content available and link the library to your target. Here’s a sample CMake snippet for reference:
```cmake
include(FetchContent)

# Declare where the library comes from.
FetchContent_Declare(
    URIC
    GIT_REPOSITORY git@github.com:st235/URIC.git
    GIT_TAG "main"
    GIT_SHALLOW TRUE
    GIT_PROGRESS ON
)
FetchContent_MakeAvailable(URIC)

# Replace `uri-validator` with the name of your own target.
target_link_libraries(uri-validator uric)
```

## API
Using the API is straightforward. Here’s a quick example to get you started:
First, parse the URI using `uri::Uri::parse(uri)`. If the URI is invalid, you'll get `std::nullopt`. If it’s valid, you can access the individual components of the URI, such as the scheme and the authority.
Here’s a code snippet (from [samples](./samples/)) to illustrate:
```cpp
// Parsing yields an empty optional when the input is not a valid URI.
const auto& uri_opt = uri::Uri::parse(uri_to_verify);
if (!uri_opt) {
    std::cout << "Not a valid URI." << std::endl;
    return 0;
}

const auto& uri = uri_opt.value();
if (uri.getScheme()) {
    std::cout << "Scheme: " << uri.getScheme().value() << std::endl;
}

if (uri.getAuthority()) {
    const auto& authority = uri.getAuthority().value();

    std::cout << "Authority:" << std::endl;
    if (authority.getUserInfo()) {
        std::cout << "\tUser Info: " << authority.getUserInfo().value() << std::endl;
    }

    std::cout << "\tHost: " << authority.getHost() << std::endl;
    if (authority.getPort()) {
        std::cout << "\tPort: " << authority.getPort().value() << std::endl;
    }
}
```

### Normalisation
The library provides methods for path normalisation, as described in `RFC 3986`.
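As an illustration of one step of path normalisation, the sketch below follows the Remove Dot Segments algorithm from RFC 3986, section 5.2.4. It is a standalone example that assumes C++17; `remove_dot_segments` and `has_prefix` are hypothetical names, and the code is not meant to describe how `Uri::normalisePath` is implemented.

```cpp
#include <string>
#include <string_view>

// Returns true when `input` begins with `prefix`.
static bool has_prefix(std::string_view input, std::string_view prefix) {
    return input.substr(0, prefix.size()) == prefix;
}

// Illustrative sketch of RFC 3986, section 5.2.4 (not URIC's own code).
std::string remove_dot_segments(std::string_view path) {
    std::string input(path);
    std::string output;

    // Drops the last segment of `output` together with its preceding "/".
    auto pop_last_segment = [&output]() {
        const auto slash = output.rfind('/');
        output.erase(slash == std::string::npos ? 0 : slash);
    };

    while (!input.empty()) {
        if (has_prefix(input, "../")) {              // step 2A
            input.erase(0, 3);
        } else if (has_prefix(input, "./")) {        // step 2A
            input.erase(0, 2);
        } else if (has_prefix(input, "/./")) {       // step 2B: "/./x" -> "/x"
            input.erase(0, 2);
        } else if (input == "/.") {                  // step 2B
            input = "/";
        } else if (has_prefix(input, "/../")) {      // step 2C: "/../x" -> "/x"
            input.erase(0, 3);
            pop_last_segment();
        } else if (input == "/..") {                 // step 2C
            input = "/";
            pop_last_segment();
        } else if (input == "." || input == "..") {  // step 2D
            input.clear();
        } else {                                     // step 2E
            // Move the first segment (with its leading "/", if any) to the output.
            const auto next_slash = input.find('/', 1);
            output.append(input, 0, next_slash);
            input.erase(0, next_slash);
        }
    }
    return output;
}
```

With the example from RFC 3986, `remove_dot_segments("/a/b/c/./../../g")` returns `/a/g`.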
The normalisation rules are:
- The following unreserved characters are percent decoded:
  - Alphabetical characters: `a-z`, `A-Z` (decoded from `%41`-`%5A` and `%61`-`%7A`)
  - Digit characters: `0-9` (decoded from `%30`-`%39`)
  - Hyphen '-' (`%2D`), period '.' (`%2E`), underscore '_' (`%5F`), and tilde '~' (`%7E`)
- These reserved characters are not encoded or decoded: `: / ? # [ ] @ ! $ & ' ( ) * + , ; =`
- Other characters, for example literal byte values, are percent encoded.
- Percent encoded representations are converted to upper case.
- Paths are normalised according to the Remove Dot Segments algorithm (RFC 3986, section 5.2.4).

>[!NOTE]
> Use `Uri::normalisePath` to perform path normalisation.

## Grammar
>[!NOTE]
> URI grammar for future reference; see `RFC 3986` for more details.

```text
URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

hier-part     = "//" authority path-abempty
              / path-absolute
              / path-rootless
              / path-empty

URI-reference = URI / relative-ref
absolute-URI  = scheme ":" hier-part [ "?" query ]
relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
relative-part = "//" authority path-abempty
              / path-absolute
              / path-noscheme
              / path-empty

scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
authority     = [ userinfo "@" ] host [ ":" port ]
userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
host          = IP-literal / IPv4address / reg-name
port          = *DIGIT

IP-literal    = "[" ( IPv6address / IPvFuture ) "]"
IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
IPv6address   =                            6( h16 ":" ) ls32
              /                       "::" 5( h16 ":" ) ls32
              / [               h16 ] "::" 4( h16 ":" ) ls32
              / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
              / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
              / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
              / [ *4( h16 ":" ) h16 ] "::"              ls32
              / [ *5( h16 ":" ) h16 ] "::"              h16
              / [ *6( h16 ":" ) h16 ] "::"

h16           = 1*4HEXDIG
ls32          = ( h16 ":" h16 ) / IPv4address
IPv4address   = dec-octet "." dec-octet "." dec-octet "." dec-octet

dec-octet     = DIGIT                 ; 0-9
              / %x31-39 DIGIT         ; 10-99
              / "1" 2DIGIT            ; 100-199
              / "2" %x30-34 DIGIT     ; 200-249
              / "25" %x30-35          ; 250-255

reg-name      = *( unreserved / pct-encoded / sub-delims )
path          = path-abempty    ; begins with "/" or is empty
              / path-absolute   ; begins with "/" but not "//"
              / path-noscheme   ; begins with a non-colon segment
              / path-rootless   ; begins with a segment
              / path-empty      ; zero characters

path-abempty  = *( "/" segment )
path-absolute = "/" [ segment-nz *( "/" segment ) ]
path-noscheme = segment-nz-nc *( "/" segment )
path-rootless = segment-nz *( "/" segment )
path-empty    = 0<pchar>

segment       = *pchar
segment-nz    = 1*pchar
segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
              ; non-zero-length segment without any colon ":"

pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
query         = *( pchar / "/" / "?" )
fragment      = *( pchar / "/" / "?" )
pct-encoded   = "%" HEXDIG HEXDIG
unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
reserved      = gen-delims / sub-delims
gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="
```

## Building the project
Use these commands to build the project:
```bash
mkdir build
cd build
cmake .. -DCOMPILE_TESTS=ON
make
```

### Running tests
The library relies heavily on _unit_ tests to verify its logic.
To run them, you need to _successfully build the project_ and then run the command below from the build directory:

```bash
ctest --output-on-failure [-R filter regex]
```