https://github.com/edjcase/motoko_url_kit
A library with Url helper utilities for parsing and manipulation
- Host: GitHub
- URL: https://github.com/edjcase/motoko_url_kit
- Owner: edjCase
- License: mit
- Created: 2025-06-23T16:38:28.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-23T22:26:39.000Z (about 1 year ago)
- Last Synced: 2025-06-23T23:24:01.059Z (about 1 year ago)
- Language: Motoko
- Size: 21.5 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# URL Kit
[MOPS](https://mops.one/url-kit)
[License](https://github.com/edjcase/motoko_url_kit/blob/main/LICENSE)
A comprehensive URL parsing and manipulation library for Motoko on the Internet Computer.
## Overview
URL Kit is a robust library designed to handle all aspects of URL processing in Motoko applications. It provides complete RFC-compliant URL parsing, validation, manipulation, and encoding/decoding capabilities with support for various host types, optional domain parsing, and comprehensive error handling.
Key features:
- 🔍 **Complete URL Parsing**: Parse any URL into structured components (scheme, authority, path, query, fragment)
- 🌐 **Multi-Host Support**: Handle host names, IPv4, and IPv6 addresses with proper validation
- 🏷️ **Flexible Domain Parsing**: Optional domain parsing with comprehensive or custom suffix lists
- 🔗 **URL Manipulation**: Add, remove, and modify query parameters with ease
- 🔄 **Encoding/Decoding**: Proper URL encoding and decoding with UTF-8 support
- ⚖️ **Normalization**: Normalize URLs for accurate comparison and deduplication
- 🛡️ **Validation**: Comprehensive validation with detailed error messages
- 🎯 **Type Safety**: Strongly typed URL components for compile-time safety
- 📊 **IPv6 Support**: Full IPv6 address parsing with compression and various formats
- 🔧 **Path Handling**: Flexible path parsing with custom separators and normalization
## Package
### MOPS
```bash
mops add url-kit
```
To set up the MOPS package manager, follow the instructions on the [MOPS site](https://mops.one).
## Quick Start
Here's a simple example to get started with URL parsing:
```motoko
import UrlKit "mo:url-kit";
import Debug "mo:base/Debug";

// Parse a URL - no domain parser needed for basic parsing
let urlResult = UrlKit.fromText("https://api.example.com:8080/users?page=1&limit=10#results");
let url = switch (urlResult) {
  case (#ok(url)) url;
  case (#err(errorMsg)) {
    // Handle the parsing error; trapping here so that both branches yield a Url
    Debug.trap("Failed to parse URL: " # debug_show (errorMsg))
  };
};

// Access URL components
// url = {
//   scheme = ?"https";
//   authority = ?{
//     user = null;
//     host = #name("api.example.com"); // Host names are parsed as simple strings
//     port = ?8080;
//   };
//   path = { segments = ["users"]; trailingSlash = false };
//   queryParams = [("page", "1"), ("limit", "10")];
//   fragment = ?"results"
// }

// Get specific query parameter
let page = UrlKit.getQueryParam(url, "page"); // ?"1"

// Add query parameters
let urlWithAuth = UrlKit.addQueryParam(url, ("token", "abc123"));

// Convert back to text
let urlText = UrlKit.toText(urlWithAuth);
```
## Comprehensive Example
Here's a more detailed example showing various URL manipulation capabilities:
```motoko
import UrlKit "mo:url-kit";
import Host "mo:url-kit/Host";

// Parse different types of URLs
let examples = [
  "https://user:pass@sub.example.com:8080/api/v1/users?page=1&sort=name#section1",
  "http://192.168.1.1:3000/dashboard",
  "https://[2001:db8::1]:8443/secure",
  "file:///path/to/file.txt",
  "//cdn.example.com/assets/style.css"
];

for (urlText in examples.vals()) {
  switch (UrlKit.fromText(urlText)) {
    case (#ok(url)) {
      // Analyze the URL structure
      switch (url.authority) {
        case (?authority) {
          // Check host type
          switch (authority.host) {
            case (#name(hostName)) {
              // Host name (domain name, hostname, etc.)
              // For domain parsing, use the separate domain parsers
            };
            case (#ipV4(ip)) {
              // IPv4 address: (192, 168, 1, 1)
              let hostText = Host.toText(authority.host, authority.port);
            };
            case (#ipV6(ip)) {
              // IPv6 address with proper formatting
              let hostText = Host.toText(authority.host, authority.port);
            };
          };
          // Check for user authentication
          switch (authority.user) {
            case (?userInfo) {
              let username = userInfo.username;
              let password = userInfo.password;
            };
            case (null) {};
          };
        };
        case (null) {
          // No authority (e.g., mailto:, file: schemes)
        };
      };

      // Manipulate query parameters
      let withParams = url
        |> UrlKit.addQueryParam(_, ("timestamp", "123456789"))
        |> UrlKit.addQueryParamMulti(_, [("version", "v2"), ("format", "json")])
        |> UrlKit.removeQueryParam(_, "page");

      // Normalize for comparison
      let normalizeOpts = {
        usernameIsCaseSensitive = false;
        pathIsCaseSensitive = false;
        queryKeysAreCaseSensitive = false;
        removeEmptyPathSegments = true;
        resolvePathDotSegments = true;
        preserveTrailingSlash = false;
      };
      let normalized = UrlKit.normalize(withParams, normalizeOpts);

      // Convert back to string
      let finalUrl = UrlKit.toText(normalized);
    };
    case (#err(error)) {
      // Handle parsing errors with detailed messages
    };
  };
};
```
## Core API
### URL Type
The core `Url` type represents a parsed URL with all its components:
```motoko
public type Url = {
  scheme : ?Text;               // "https", "http", "mailto", etc.
  authority : ?Authority;       // Host, port, and user info
  path : Path.Path;             // Path segments with trailing slash flag
  queryParams : [(Text, Text)]; // Query parameters as key-value pairs
  fragment : ?Text;             // Fragment identifier
};

public type Authority = {
  user : ?UserInfo; // Username and password
  host : Host.Host; // Host name or IP address
  port : ?Nat16;    // Port number
};
```
### Parsing and Conversion
```motoko
// Parse URL from text
UrlKit.fromText(url : Text) : Result.Result<Url, Text>
// Convert URL back to text
UrlKit.toText(url : Url) : Text
// Normalize URL for comparison
UrlKit.normalize(url : Url, options : NormalizationOptions) : Url
// Compare URLs for equality
UrlKit.equal(url1 : Url, url2 : Url, options : NormalizationOptions) : Bool
```
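As a rough illustration of how `equal` and the normalization options fit together, the sketch below compares two spellings of what is arguably the same address. The option fields are copied from the comprehensive example; whether these two inputs actually compare as equal depends on the library's normalization rules, so no result is asserted. It assumes the `mo:base` package is available for `Debug`.

```motoko
import UrlKit "mo:url-kit";
import Debug "mo:base/Debug";

// Normalization options; field names are taken from the comprehensive example
let opts = {
  usernameIsCaseSensitive = false;
  pathIsCaseSensitive = false;
  queryKeysAreCaseSensitive = false;
  removeEmptyPathSegments = true;
  resolvePathDotSegments = true;
  preserveTrailingSlash = false;
};

// Parse both inputs and compare only when both succeed
switch (
  UrlKit.fromText("https://example.com/docs/./guide/"),
  UrlKit.fromText("https://example.com/docs/guide")
) {
  case (#ok(a), #ok(b)) {
    let same = UrlKit.equal(a, b, opts);
    Debug.print("equivalent under these options: " # debug_show (same));
  };
  case (_) Debug.print("at least one URL failed to parse");
};
```

Matching on the pair of results keeps the error handling in one place instead of nesting two `switch` expressions.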
### Query Parameter Manipulation
```motoko
// Get query parameter value
UrlKit.getQueryParam(url : Url, key : Text) : ?Text
// Add single query parameter
UrlKit.addQueryParam(url : Url, param : (Text, Text)) : Url
// Add multiple query parameters
UrlKit.addQueryParamMulti(url : Url, params : [(Text, Text)]) : Url
// Remove query parameter by key
UrlKit.removeQueryParam(url : Url, key : Text) : Url
// Remove multiple query parameters
UrlKit.removeQueryParamMulti(url : Url, keys : [Text]) : Url
```
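The functions above can be chained with Motoko's pipe operator. The following is a minimal sketch that strips two parameters and appends a new one; it uses only the signatures listed above and assumes `mo:base` for `Debug`. The URL and parameter names are made up for illustration.

```motoko
import UrlKit "mo:url-kit";
import Debug "mo:base/Debug";

let url = switch (UrlKit.fromText("https://example.com/search?q=motoko&page=2&debug=true")) {
  case (#ok(u)) u;
  case (#err(_)) Debug.trap("invalid URL");
};

// Drop pagination and debug flags, then add a language hint
let cleaned = url
  |> UrlKit.removeQueryParamMulti(_, ["page", "debug"])
  |> UrlKit.addQueryParam(_, ("lang", "en"));

// The removed key is gone and the added key can be read back
assert UrlKit.getQueryParam(cleaned, "page") == null;
assert UrlKit.getQueryParam(cleaned, "lang") == ?"en";
```

Each call returns a new `Url` value, so the original `url` stays unchanged and can still be used afterwards.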
### Encoding and Decoding
```motoko
// URL encode text (percent encoding)
UrlKit.encodeText(value : Text, hexIsUpperCase : Bool) : Text
// URL decode text
UrlKit.decodeText(value : Text) : Result.Result<Text, Text>
```
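A quick way to see the two functions working together is a round trip: encode a value, decode it again, and check that the original text comes back. This is a sketch under the assumption that passing `true` for `hexIsUpperCase` selects upper-case hex digits, as the parameter name suggests; the exact encoded form is not shown because it depends on the library's reserved-character set.

```motoko
import UrlKit "mo:url-kit";
import Debug "mo:base/Debug";

// A value with a space, reserved characters, and a non-ASCII letter
let raw = "name=Jane Doe&city=Zürich";

// Percent-encode; `true` is assumed to request upper-case hex digits
let encoded = UrlKit.encodeText(raw, true);

// Decoding the encoded text should yield the original value
switch (UrlKit.decodeText(encoded)) {
  case (#ok(decoded)) assert decoded == raw;
  case (#err(_)) Debug.trap("decoding failed");
};
```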
## Host Types
URL Kit supports various host types with proper validation:
### Host Names
```motoko
import Host "mo:url-kit/Host";
// Parse host with optional port
let hostResult = Host.fromText("example.com:8080");
// Result: (#name("example.com"), ?8080)
// Convert host to text (the second argument is the optional port)
let hostText = Host.toText(host, port);
// Normalize host (lowercase)
let normalized = Host.normalize(host);
```
### IPv4 Addresses
```motoko
import IpV4 "mo:url-kit/IpV4";
// Parse IPv4 address
let ipResult = IpV4.fromText("192.168.1.1");
// Result: (192, 168, 1, 1)
// Convert back to text
let ipText = IpV4.toText(ip); // "192.168.1.1"
```
### IPv6 Addresses
```motoko
import IpV6 "mo:url-kit/IpV6";
// Parse IPv6 address (supports compression and various formats)
let ipResult = IpV6.fromText("2001:db8::1");
// Convert to text with default options (compressed format, lowercase)
let ipText = IpV6.toText(ip); // "2001:db8::1"
// Convert to text with custom formatting options
let full = IpV6.toTextAdvanced(ip, { format = #full; isUpperCase = false }); // "2001:0db8:0000:0000:0000:0000:0000:0001"
let standard = IpV6.toTextAdvanced(ip, { format = #standard; isUpperCase = false }); // "2001:db8:0:0:0:0:0:1"
let compressed = IpV6.toTextAdvanced(ip, { format = #compressed; isUpperCase = true }); // "2001:DB8::1"
```
### Host Parsing and Formatting
```motoko
import Host "mo:url-kit/Host";
// Parse host with port
let hostResult = Host.fromText("example.com:8080");
// Result: (#name("example.com"), ?8080)
// Convert host to text (with optional port)
let hostText = Host.toText(host, port);
// Normalize host (lowercase)
let normalized = Host.normalize(host);
```
## Path Handling
```motoko
import Path "mo:url-kit/Path";
// Parse path from text
let path = Path.fromText("/api/v1/users");
// Result: { segments = ["api", "v1", "users"]; trailingSlash = false }
// Convert path back to text
let pathText = Path.toText(path); // "/api/v1/users"
// Join path segments
let newPath = Path.join(path, "123");
let newNewPath = Path.joinMulti(newPath, ["456", "profile"]);
// Result: { segments = ["api", "v1", "users", "123", "456", "profile"]; trailingSlash = false }
// Normalize path with options
let options = { isCaseSensitive = false; removeEmptySegments = true; resolveDotSegments = true; preserveTrailingSlash = false };
let normalized = Path.normalize(path, options);
```
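To make the `resolveDotSegments` option more concrete, here is a minimal sketch of dot-segment resolution over a segment array, in the spirit of RFC 3986. It is an illustration only, not the library's implementation: `resolveDots` is a hypothetical helper, and the code assumes the `mo:base` package.

```motoko
import Buffer "mo:base/Buffer";

// Hypothetical helper: resolve "." and ".." in a list of path segments
func resolveDots(segments : [Text]) : [Text] {
  let out = Buffer.Buffer<Text>(segments.size());
  for (segment in segments.vals()) {
    if (segment == "..") {
      // Step up one level; removeLast returns null when already at the root
      ignore out.removeLast();
    } else if (segment != ".") {
      // "." refers to the current level and is skipped; everything else is kept
      out.add(segment);
    };
  };
  Buffer.toArray(out)
};

// ["api", ".", "v1", "..", "users"] resolves to ["api", "users"]
assert resolveDots(["api", ".", "v1", "..", "users"]) == ["api", "users"];
```

A stack-like buffer keeps the pass linear in the number of segments, and a leading `..` is silently dropped rather than escaping above the root.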
## URL Examples
### Basic HTTP/HTTPS URLs
```motoko
// Simple HTTPS URL
"https://example.com"
// URL with port and path
"https://api.example.com:8080/v1/users"
// URL with query parameters
"https://example.com/search?q=motoko&type=repo"
// URL with fragment
"https://docs.example.com/guide#installation"
// Complete URL with all components
"https://user:pass@api.example.com:8080/v1/users?page=1&limit=10#results"
```
### IP Address URLs
```motoko
// IPv4 address
"http://192.168.1.1:3000/dashboard"
// IPv6 address (note the brackets)
"https://[2001:db8::1]:8443/api"
// IPv6 with embedded IPv4
"http://[::ffff:192.168.1.1]/mixed"
```
### Special Schemes
```motoko
// File URLs
"file:///path/to/file.txt"
"file://server/share/document.pdf"
// Custom schemes
"custom-protocol://data.example.com/resource"
```
### Relative URLs
```motoko
// Protocol-relative URL
"//cdn.example.com/assets/style.css"
// Absolute path
"/api/users/123"
// Query only
"?search=term"
// Fragment only
"#section1"
```
## Domain Parsing
As of v3.0, domain parsing has been separated from basic host parsing. The Host type now simply stores names as strings (`#name`), and domain parsing is handled by dedicated domain parsers when needed.
### Domain Parsers
URL Kit provides two types of domain parsers:
#### Comprehensive Domain Parser (Recommended)
Uses the complete Public Suffix List for accurate domain parsing:
```motoko
import ComprehensiveDomainParser "mo:url-kit/ComprehensiveDomainParser";
import Domain "mo:url-kit/Domain";
// Create a comprehensive domain parser
let domainParser = ComprehensiveDomainParser.ComprehensiveDomainParser();
// Parse a domain name
let domainResult = domainParser.parse("blog.github.io");
// Result: #ok({ name = "github"; suffix = "io"; subdomains = ["blog"] })
switch (domainResult) {
  case (#ok(domain)) {
    let name = domain.name;             // "github"
    let suffix = domain.suffix;         // "io"
    let subdomains = domain.subdomains; // ["blog"]
  };
  case (#err(msg)) {
    // Handle parsing error
  };
};
```
#### Custom Domain Parsing
For custom domain suffix lists, use the Domain module directly:
```motoko
import Domain "mo:url-kit/Domain";
// Parse with custom suffixes
let customSuffixes = ["com", "org", "test", "local"];
let domainResult = Domain.fromText("api.example.com", customSuffixes);
// Result: #ok({ name = "example"; suffix = "com"; subdomains = ["api"] })
```
### Domain Operations
```motoko
import Domain "mo:url-kit/Domain";
// Parse domain with specified suffixes
let domainResult = Domain.fromText("blog.github.io", ["github.io", "io"]);
// Validate domain structure
let validation = Domain.validate(domain);
// Convert domain to text
let domainText = Domain.toText(domain);
// Normalize domain (lowercase)
let normalized = Domain.normalize(domain);
```
### Integration with URLs
When you need domain parsing for URLs, extract the host name and parse it separately:
```motoko
import UrlKit "mo:url-kit";
import ComprehensiveDomainParser "mo:url-kit/ComprehensiveDomainParser";
let url = switch (UrlKit.fromText("https://blog.example.com/path")) {
  case (#ok(u)) u;
  case (#err(_)) return; // Handle error
};

// Extract host name for domain parsing
switch (url.authority) {
  case (?authority) {
    switch (authority.host) {
      case (#name(hostName)) {
        // Parse the host name as a domain
        let domainParser = ComprehensiveDomainParser.ComprehensiveDomainParser();
        let domainResult = domainParser.parse(hostName);
        switch (domainResult) {
          case (#ok(domain)) {
            // Work with parsed domain components
            let rootDomain = domain.name # "." # domain.suffix; // "example.com"
          };
          case (#err(_)) {
            // Host name is not a valid domain (e.g., IP address, localhost)
          };
        };
      };
      case (#ipV4(_) or #ipV6(_)) {
        // IP addresses don't need domain parsing
      };
    };
  };
  case (null) {};
};
```
## Domain Suffix List
URL Kit includes an automatically generated domain suffix list based on the [Public Suffix List](https://publicsuffix.org/) for accurate domain parsing. The suffix list helps distinguish between domain names and subdomains.
The [`ComprehensiveDomainParser`](src/ComprehensiveDomainParser.mo) uses this comprehensive list for accurate domain parsing, while you can also create custom parsers with [`Domain.fromText`](src/Domain.mo) for specific use cases.
### Updating the Suffix List
The domain suffix list should be updated periodically to include new top-level domains and suffixes. Run the provided script to regenerate the list:
```bash
# Requires Python 3 and internet connection
./scripts/rebuild_suffix_list.sh
```
This script:
1. Downloads the latest Public Suffix List from https://publicsuffix.org/
2. Processes and filters the data
3. Generates a new `src/data/DomainSuffixData.mo` file with the current suffixes
4. Structures the data as an efficient tree for fast lookups
The generated file contains a compressed tree structure that allows for efficient suffix matching during domain parsing. You should run this script periodically (e.g., monthly) to keep the suffix list current.
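The role of the suffix list, telling the registrable name apart from its subdomains, can be sketched as a longest-match search. The example below is a simplified illustration over a flat array, not the tree lookup that the generated data file provides; `splitDomain` is a hypothetical helper, and only the `mo:base` package is assumed.

```motoko
import Array "mo:base/Array";
import Iter "mo:base/Iter";
import Option "mo:base/Option";
import Text "mo:base/Text";

// Hypothetical helper: split a host name using the longest matching suffix.
// Returns null when no suffix matches or no label is left over for the name.
func splitDomain(host : Text, suffixes : [Text]) : ?{ subdomains : [Text]; name : Text; suffix : Text } {
  let labels = Iter.toArray(Text.split(host, #char '.'));
  let n = labels.size();
  var start = 1; // index of the first label of the candidate suffix
  while (start < n) {
    // Candidate suffix: labels[start..n) joined with dots (longest candidates come first)
    let candidate = Text.join(
      ".",
      Array.tabulate<Text>(n - start, func(i : Nat) : Text { labels[start + i] }).vals()
    );
    if (Option.isSome(Array.find<Text>(suffixes, func(s : Text) : Bool { s == candidate }))) {
      return ?{
        subdomains = Array.tabulate<Text>(start - 1, func(i : Nat) : Text { labels[i] });
        name = labels[start - 1];
        suffix = candidate;
      };
    };
    start += 1;
  };
  null
};

// "co.uk" wins over "uk" because longer candidates are tried first
assert splitDomain("a.b.example.co.uk", ["uk", "co.uk"])
  == ?{ subdomains = ["a", "b"]; name = "example"; suffix = "co.uk" };
```

The linear scan over `suffixes` is only acceptable for short custom lists; a tree keyed by labels, as described above, avoids rescanning the full list for every candidate.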
## Performance
URL Kit is designed for performance with:
- **Efficient Domain Matching**: Tree-based suffix lookup for O(log n) domain validation
- **Minimal Allocations**: Careful memory management during parsing
- **Lazy Evaluation**: Components are parsed only when needed
- **Optimized String Operations**: Efficient text processing for encoding/decoding
- **Compressed Suffix Data**: Compact representation of the public suffix list
## Testing
Run the comprehensive test suite:
```bash
mops test
```
The test suite covers:
- URL parsing success and failure cases
- All host type variations (domains, IPv4, IPv6, hostnames)
- Query parameter manipulation
- URL encoding/decoding edge cases
- Normalization and equality comparison
- IPv6 address formatting variations
- Domain parser functionality
- Error handling scenarios
## Migration Guide
### Breaking Changes in v3.0
1. **Host Type Simplified**: The Host type has been simplified by consolidating `#domain` and `#hostname` variants into a single `#name` variant:
```motoko
// Old (v2.x)
switch (host) {
  case (#domain(domain)) { /* domain components */ };
  case (#hostname(name)) { /* hostname string */ };
  case (#ipV4(ip)) { /* IPv4 address */ };
  case (#ipV6(ip)) { /* IPv6 address */ };
};

// New (v3.0)
switch (host) {
  case (#name(hostName)) { /* any host name string */ };
  case (#ipV4(ip)) { /* IPv4 address */ };
  case (#ipV6(ip)) { /* IPv6 address */ };
};
```
2. **Domain Parser Removed from URL Parsing**: URL parsing no longer requires a domain parser parameter:
```motoko
// Old (v2.x)
import ComprehensiveDomainParser "mo:url-kit/ComprehensiveDomainParser";
let domainParser = ComprehensiveDomainParser.ComprehensiveDomainParser();
UrlKit.fromText("https://example.com", domainParser)
// New (v3.0)
UrlKit.fromText("https://example.com")
```
3. **Separate Domain Parsing**: Domain parsing is now a separate step when needed:
```motoko
// Note: for brevity, both snippets skip unwrapping the Result returned by
// fromText and the optional authority; see "Integration with URLs" for the full form.

// Old (v2.x) - automatic domain parsing
let url = UrlKit.fromText("https://blog.example.com", domainParser);
switch (url.authority.host) {
  case (#domain(domain)) {
    let name = domain.name;             // "example"
    let suffix = domain.suffix;         // "com"
    let subdomains = domain.subdomains; // ["blog"]
  };
  case (_) { /* hostname or IP address */ };
};

// New (v3.0) - explicit domain parsing when needed
let url = UrlKit.fromText("https://blog.example.com");
switch (url.authority.host) {
  case (#name(hostName)) {
    let domainParser = ComprehensiveDomainParser.ComprehensiveDomainParser();
    switch (domainParser.parse(hostName)) {
      case (#ok(domain)) {
        let name = domain.name;             // "example"
        let suffix = domain.suffix;         // "com"
        let subdomains = domain.subdomains; // ["blog"]
      };
      case (#err(_)) { /* not a valid domain */ };
    };
  };
  case (_) { /* IP address */ };
};
```
### Breaking Changes in v2.0
1. **Domain Parser Required**: All URL parsing functions now require a `domainParser` parameter:
```motoko
// Old (v1.x)
UrlKit.fromText("https://example.com")
// New (v2.0)
import ComprehensiveDomainParser "mo:url-kit/ComprehensiveDomainParser";
let domainParser = ComprehensiveDomainParser.ComprehensiveDomainParser();
UrlKit.fromText("https://example.com", domainParser)
```
2. **Domain Parsing**: Domain parsing is now more flexible with custom suffix support:
```motoko
// Using comprehensive parser (recommended)
let domainParser = ComprehensiveDomainParser.ComprehensiveDomainParser();
let result = domainParser.parse("example.com");
// Using custom suffixes
let result = Domain.fromText("example.test", ["test", "local"]);
```
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request. When contributing:
1. Add tests for new functionality
2. Update documentation for API changes
3. Follow existing code style and patterns
4. Ensure all tests pass
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.