https://github.com/edadma/dynamic_string.h
A dynamic string single-header library in C
https://github.com/edadma/dynamic_string.h
c dynamic-string embedded header-only-library portable reference-counting
Last synced: 3 months ago
JSON representation
A dynamic string single-header library in C
- Host: GitHub
- URL: https://github.com/edadma/dynamic_string.h
- Owner: edadma
- License: other
- Created: 2025-08-23T01:06:11.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2025-08-29T11:32:05.000Z (5 months ago)
- Last Synced: 2025-08-29T14:59:25.201Z (5 months ago)
- Topics: c, dynamic-string, embedded, header-only-library, portable, reference-counting
- Language: C
- Homepage: https://edadma.github.io/dynamic_string.h/
- Size: 296 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
README
# dynamic_string.h
A modern, efficient, single-file string library for C featuring:
- **Direct C compatibility** - `ds_string` works with ALL C functions (no conversion needed!)
- **Reference counting** with automatic memory management
- **Immutable strings** for safety and sharing
- **Efficient StringBuilder** for string construction
- **Unicode support** with UTF-8 storage and codepoint iteration
- **Zero dependencies** - just drop in the `.h` file
- **Memory safe** - eliminates common C string bugs
## Quick Example
```c
#define DS_IMPLEMENTATION
#include "dynamic_string.h"
int main() {
// Create strings with the short, clean API
ds_string greeting = ds_new("Hello");
ds_string full = ds_append(greeting, " World! π");
// Works directly with ALL C functions - no conversion needed!
printf("%s\n", full); // Direct usage!
int result = strcmp(full, "Hello"); // Works with strcmp!
FILE* f = fopen(full, "r"); // Works with fopen!
// Unicode-aware iteration
ds_codepoint_iter iter = ds_codepoints(full);
uint32_t codepoint;
while ((codepoint = ds_iter_next(&iter)) != 0) {
printf("U+%04X ", codepoint);
}
// Automatic cleanup via reference counting
ds_release(&greeting);
ds_release(&full);
return 0;
}
```
## Major Features
### π Direct C Compatibility
The biggest advantage: `ds_string` IS a `char*` that works with every C function:
```c
ds_string path = ds_new("/tmp/file.txt");
FILE* f = fopen(path, "r"); // Direct usage!
if (strstr(path, "tmp")) { ... } // Works with strstr!
printf("Path: %s\n", path); // No conversion needed!
```
### π Reference Counting
- **Automatic memory management** - no manual `free()` needed
- **Efficient sharing** - multiple variables can point to same string data
- **Copy-on-write** - strings are only copied when actually modified
```c
ds_string original = ds_new("Shared data");
ds_string copy = ds_retain(original); // Shares memory, no copying!
printf("Ref count: %zu\n", ds_refcount(original)); // 2
ds_release(&original);
ds_release(©); // Automatic cleanup
```
### π‘οΈ Immutable Strings
- **Thread-safe reading** - immutable strings can be safely shared between threads
- **Functional style** - operations return new strings, originals unchanged
- **No accidental modification** - prevents common C string bugs
```c
ds_string base = ds_new("Hello");
ds_string modified = ds_append(base, " World");
// base is still "Hello", modified is "Hello World"
```
### β‘ Efficient StringBuilder
- **In-place construction** - builds strings efficiently without intermediate allocations
- **Reference-counted** - safe sharing between functions
- **Rich API** - formatting, numeric operations, and content manipulation
- **Automatic growth** - handles capacity management
```c
ds_builder sb = ds_builder_create();
ds_builder_append_format(sb, "Processing %d items:\n", 1000);
for (int i = 0; i < 1000; i++) {
ds_builder_append_format(sb, "Item %d: ", i);
ds_builder_append_double(sb, i * 3.14, 2);
ds_builder_append(sb, "\n");
}
ds_string result = ds_builder_to_string(sb); // sb becomes consumed
ds_builder_release(&sb); // Always safe to call
```
### π Unicode Support
- **UTF-8 storage** - compact and C-compatible
- **Codepoint iteration** - proper Unicode handling
- **Mixed text support** - seamlessly handle ASCII, emoji, and international text
```c
ds_string unicode = ds_new("Hello π δΈη");
printf("Bytes: %zu, Codepoints: %zu\n",
ds_length(unicode), ds_codepoint_length(unicode));
// Iterate through codepoints
ds_codepoint_iter iter = ds_codepoints(unicode);
uint32_t cp;
while ((cp = ds_iter_next(&iter)) != 0) {
printf("U+%04X ", cp);
}
```
## β οΈ Error Handling
**Important: NULL Parameter Validation**
As of v0.3.0, all functions perform strict NULL parameter validation using assertions:
```c
// β
Valid usage
ds_string str = ds_new("Hello");
size_t len = ds_length(str); // Works fine
ds_release(&str);
// β Will trigger assertion failure
size_t len = ds_length(NULL); // Assertion: ds_length: str cannot be NULL
ds_string result = ds_append(NULL, "text"); // Assertion: ds_append: str cannot be NULL
```
**Benefits:**
- **Immediate error detection** - bugs are caught at the source
- **Clear error messages** - assertions specify exactly which parameter is NULL
- **Consistent behavior** - all functions handle NULL parameters the same way
- **Development safety** - prevents silent failures and undefined behavior
**Migration from older versions:**
- Code that passes valid (non-NULL) parameters continues to work unchanged
- Code that relied on graceful NULL handling will now trigger assertions
- Use a debugger or assertion handler to catch and fix NULL parameter issues
## Usage Patterns
### Basic Usage
```c
// In exactly ONE .c file:
#define DS_IMPLEMENTATION
#include "dynamic_string.h"
// In all other files:
#include "dynamic_string.h"
```
### Correct Memory Management Patterns
**Simple operations:**
```c
ds_string str = ds_new("Hello");
// Use str directly with any C function...
ds_release(&str); // Always clean up
```
**Transform and replace:**
```c
ds_string str = ds_new("Hello");
ds_string new_str = ds_append(str, " World");
ds_release(&str); // Release original
str = new_str; // Replace reference
// Use str...
ds_release(&str);
```
**Multiple operations - use StringBuilder:**
```c
ds_builder sb = ds_builder_create();
ds_builder_append(&sb, "Building ");
ds_builder_append(&sb, "efficiently");
ds_string result = ds_builder_to_string(&sb); // sb consumed here
ds_builder_release(&sb); // Always safe to call
// Use result...
ds_release(&result);
```
**β Don't do this (memory leaks):**
```c
ds_string str = ds_new("Hello");
str = ds_append(str, " World"); // LEAK! Lost reference to "Hello"
str = ds_append(str, "!"); // LEAK! Lost reference to "Hello World"
```
### When to Use StringBuilder vs Simple Operations
**Use simple operations for:**
- One or two concatenations
- Transforming existing strings
- Functional-style processing
**Use StringBuilder for:**
- Building strings in loops
- Multiple concatenations (3+)
- Performance-critical string construction
- When you don't know final size
## API Reference
### Core Functions
```c
// Creation and memory management
ds_string ds_new(const char* text); // Create string
ds_string ds_create_length(const char* text, size_t length);
ds_string ds_retain(ds_string str); // Share reference
void ds_release(ds_string* str); // Release reference
// String operations (immutable - return new strings)
ds_string ds_append(ds_string str, const char* text);
ds_string ds_append_char(ds_string str, uint32_t codepoint);
ds_string ds_prepend(ds_string str, const char* text);
ds_string ds_insert(ds_string str, size_t index, const char* text);
ds_string ds_substring(ds_string str, size_t start, size_t len);
ds_string ds_concat(ds_string a, ds_string b);
ds_string ds_join(ds_string* strings, size_t count, const char* separator);
// Utility functions
size_t ds_length(ds_string str);
size_t ds_refcount(ds_string str);
int ds_compare(ds_string a, ds_string b);
int ds_find(ds_string str, const char* needle);
int ds_starts_with(ds_string str, const char* prefix);
int ds_ends_with(ds_string str, const char* suffix);
int ds_is_shared(ds_string str);
int ds_is_empty(ds_string str);
```
### StringBuilder Functions
```c
// StringBuilder creation and management
ds_builder ds_builder_create(void);
ds_builder ds_builder_create_with_capacity(size_t capacity);
ds_builder ds_builder_retain(ds_builder sb);
void ds_builder_release(ds_builder* sb);
// Basic building operations (modify in-place)
int ds_builder_append(ds_builder sb, const char* text);
int ds_builder_append_char(ds_builder sb, uint32_t codepoint);
int ds_builder_append_string(ds_builder sb, ds_string str);
int ds_builder_insert(ds_builder sb, size_t index, const char* text);
void ds_builder_clear(ds_builder sb);
// Formatting functions (NEW in v0.3.1)
int ds_builder_append_format(ds_builder sb, const char* fmt, ...);
int ds_builder_append_format_v(ds_builder sb, const char* fmt, va_list args);
// Numeric append functions (NEW in v0.3.1)
int ds_builder_append_int(ds_builder sb, int value);
int ds_builder_append_uint(ds_builder sb, unsigned int value);
int ds_builder_append_long(ds_builder sb, long value);
int ds_builder_append_double(ds_builder sb, double value, int precision);
// Buffer operations (NEW in v0.3.1)
int ds_builder_append_length(ds_builder sb, const char* text, size_t length);
int ds_builder_prepend(ds_builder sb, const char* text);
int ds_builder_replace_range(ds_builder sb, size_t start, size_t end, const char* replacement);
// Content manipulation (NEW in v0.3.1)
int ds_builder_remove_range(ds_builder sb, size_t start, size_t length);
// Conversion (StringBuilder becomes consumed)
ds_string ds_builder_to_string(ds_builder sb);
// Inspection
size_t ds_builder_length(ds_builder sb);
size_t ds_builder_capacity(ds_builder sb);
const char* ds_builder_cstr(ds_builder sb);
```
### Unicode Functions
```c
// Unicode iteration and analysis
ds_codepoint_iter ds_codepoints(ds_string str);
uint32_t ds_iter_next(ds_codepoint_iter* iter);
int ds_iter_has_next(const ds_codepoint_iter* iter);
size_t ds_codepoint_length(ds_string str);
uint32_t ds_codepoint_at(ds_string str, size_t index);
```
### Convenience Macros
```c
#define ds_empty() ds_new("")
#define ds_from_literal(lit) ds_new(lit)
```
## Configuration
Override behavior with macros before including:
```c
#define DS_MALLOC my_malloc
#define DS_FREE my_free
#define DS_REALLOC my_realloc
#define DS_STATIC // Make all functions static
#define DS_IMPLEMENTATION
#include "dynamic_string.h"
```
## Building Examples
```bash
git clone https://github.com/yourusername/dynamic_string.h
cd dynamic_string.h
mkdir build && cd build
cmake ..
make
./example
```
## Version History
### v0.3.1 (Current)
- **Added**: `ds_builder_append_format()` and `ds_builder_append_format_v()` - Printf-style formatting directly into StringBuilder
- **Added**: Numeric append functions: `ds_builder_append_int()`, `ds_builder_append_uint()`, `ds_builder_append_long()`, `ds_builder_append_double()`
- **Added**: Buffer operations: `ds_builder_append_length()`, `ds_builder_prepend()`, `ds_builder_replace_range()`
- **Added**: Content manipulation: `ds_builder_remove_range()`
- **Documentation**: Complete Doxygen documentation for all new StringBuilder functions
- **Documentation**: Added comprehensive unit tests for all new functions (83 total tests)
- **Enhancement**: StringBuilder now supports advanced string building operations for more efficient construction
### v0.3.0
- **BREAKING CHANGE**: Renamed `ds_stringbuilder` to `ds_builder` for consistency
- **BREAKING CHANGE**: `ds_builder` is now heap-allocated and reference-counted like `ds_string`
- **BREAKING CHANGE**: Replaced `ds_builder_destroy()` with `ds_builder_release()` that sets pointer to NULL
- **Added**: `ds_builder_retain()` for safe sharing of builders between different parts of code
- **Changed**: Builder functions now take `ds_builder` (pointer) instead of `ds_builder*`
- **Improved**: Consistent retain/release semantics across both `ds_string` and `ds_builder`
- **Fixed**: Memory management is now fully reference-counted for builders
### v0.2.2
- **BREAKING CHANGE**: All functions now assert on NULL `ds_string` parameters instead of handling them gracefully
- **API Safety**: Consistent assertion-based parameter validation across entire API
- **Error Detection**: NULL parameters now trigger immediate assertion failures with descriptive error messages
- **Testing**: Implemented comprehensive assertion testing framework with signal handling
- **Documentation**: Updated all function documentation to specify parameters "must not be NULL"
- **Compatibility**: Maintains 100% backward compatibility for valid (non-NULL) usage patterns
- **Testing**: All 71 tests pass with new assertion-based error handling
### v0.2.1
- **Critical Fix**: Buffer overflow vulnerability in `ds_create_length()` - now correctly handles requested length vs source string length
- **Critical Fix**: Segmentation fault in test suite caused by dangerous NULL parameter testing
- **Security**: Added comprehensive NULL parameter validation with assertions for `ds_create_length()`, `ds_prepend()`, `ds_insert()`, `ds_substring()`, `ds_replace()`, and `ds_replace_all()`
- **Fixed**: `ds_insert()` beyond-bounds behavior now inserts at string end instead of returning unchanged
- **Testing**: Corrected 6 failing tests that had incorrect expectations, all 73 tests now pass
- **API Safety**: Functions now fail fast with clear assertion messages instead of silent undefined behavior
### v0.2.0
- **Added**: Atomic reference counting support (`DS_ATOMIC_REFCOUNT`) for safe concurrent reference sharing
- **Added**: `ds_contains()` - clean boolean check for substring presence
- **Added**: `ds_find_last()` - find last occurrence of substring
- **Added**: `ds_hash()` - FNV-1a hash function for use in hash tables
- **Added**: `ds_compare_ignore_case()` - case-insensitive string comparison
- **Added**: `ds_truncate()` - smart truncation with optional ellipsis
- **Added**: `ds_format_v()` - va_list version of ds_format for wrapper functions
- **Added**: `ds_escape_json()` / `ds_unescape_json()` - JSON string escaping/unescaping
- **Removed**: Version macros (not needed for single-header libraries)
- **Improved**: All complex string building operations now use StringBuilder internally for efficiency
- **Documentation**: Added CLAUDE.md for AI assistance and comprehensive usage examples
### v0.1.0
- **Added**: Version macros (`DS_VERSION_MAJOR`, `DS_VERSION_MINOR`, `DS_VERSION_PATCH`, `DS_VERSION_STRING`)
- **Added**: Programmatic version checking with `DS_VERSION` macro
- **Documentation**: Comprehensive doxygen comments with code examples and cross-references
- **Build**: GitHub Actions workflow for automatic documentation generation and deployment
### v0.0.2
- **Breaking**: `ds()` renamed to `ds_new()` for clarity
- **Breaking**: `ds_sb_*` functions renamed to `ds_builder_*` for readability
- **Breaking**: Removed `ds_cstr()` - direct C compatibility
- **Breaking**: `ds_ref_count()` renamed to `ds_refcount()`
- **Breaking**: StringBuilder becomes single-use after `ds_builder_to_string()`
- **Fixed**: Memory corruption bugs in StringBuilder
- **Improved**: Consistent internal API usage
- **Added**: String transformation functions (`ds_trim`, `ds_split`, `ds_format`)
### v0.0.1
- Initial release with basic functionality
## Memory Layout
Each string uses a **single allocation** with metadata and data stored together:
```
Memory: [refcount|length|string_data|\0]
^
ds_string points here
```
This provides:
- **Better cache locality** - metadata and data in same allocation
- **Fewer allocations** - no separate buffer allocation
- **Direct C compatibility** - pointer points to actual string data
- **Automatic null termination** - works with C string functions
## License
Dual licensed under your choice of:
- [MIT License](LICENSE-MIT)
- [The Unlicense](LICENSE-UNLICENSE) (public domain)
Choose whichever works better for your project!
## Design Philosophy
This library brings modern string handling to C by combining:
- **Rust's approach** - UTF-8 storage with codepoint iteration
- **Swift's model** - reference counting with automatic memory management
- **STB-style delivery** - single header file, easy integration
- **C compatibility** - works seamlessly with existing C code
The result is a string library that's both **safe and fast**, making C string handling as pleasant as higher-level
languages while maintaining C's performance characteristics and perfect compatibility with existing C APIs.