Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/josephg/librope
UTF-8 rope library for C
- Host: GitHub
- URL: https://github.com/josephg/librope
- Owner: josephg
- License: other
- Created: 2012-08-21T05:09:48.000Z (about 12 years ago)
- Default Branch: master
- Last Pushed: 2021-10-13T07:35:09.000Z (about 3 years ago)
- Last Synced: 2024-10-16T04:51:10.107Z (27 days ago)
- Language: C
- Size: 96.7 KB
- Stars: 273
- Watchers: 12
- Forks: 26
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
librope
=======

This is a little C library for heavyweight utf-8 strings (rope). Unlike regular C strings, ropes can do substring insertion and deletion in O(log n) time.
librope is implemented using skip lists, which have the same big-O time complexity as trees but don't require rebalancing.
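To illustrate why a skip list gives O(log n) positional access, here is a minimal sketch of a position lookup. It is not librope's actual data structure: the names `sketch_node`, `sketch_link`, `sketch_char_at` and the constant `MAX_HEIGHT` are hypothetical, and the sketch counts bytes rather than UTF-8 characters. Each link records how many characters it jumps over, so a lookup can skip long runs at the upper levels and only walk node by node at the bottom.

```c
#include <stddef.h>

#define MAX_HEIGHT 4 /* illustrative value, an assumption of this sketch */

typedef struct sketch_node sketch_node;

typedef struct {
    sketch_node *next; /* next node at this level, or NULL at the end */
    size_t skip;       /* characters jumped over by following `next` */
} sketch_link;

struct sketch_node {
    const char *text;  /* characters stored in this node */
    size_t len;        /* number of characters in `text` */
    int height;        /* number of levels this node takes part in */
    sketch_link links[MAX_HEIGHT];
};

/* Return the character at `pos`, or -1 if `pos` is past the end.
 * `head` is an empty sentinel node that is as tall as the tallest node. */
int sketch_char_at(const sketch_node *head, size_t pos) {
    const sketch_node *n = head;
    int level = head->height - 1;
    while (level >= 0) {
        const sketch_link *l = &n->links[level];
        if (l->next != NULL && pos >= l->skip) {
            pos -= l->skip; /* target lies beyond this link: go right */
            n = l->next;
        } else {
            level--;        /* target lies within this link: go down */
        }
    }
    return pos < n->len ? (unsigned char)n->text[pos] : -1;
}
```

Because node heights in a skip list are chosen randomly, the expected number of steps is logarithmic without any rebalancing pass.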
librope is _fast_. It will happily perform [~15 million edit operations per second](https://home.seph.codes/public/rope_bench/realworld/C-JumpRope/automerge-paper/report/index.html) on a modern CPU. Inserts and deletes in librope outperform straight C strings for any document longer than a few hundred bytes.
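For comparison, this is roughly what an insert into a straight C string involves. The helper `cstr_insert` is a hypothetical name used only for this sketch; it is not part of librope. Every insert has to shift the whole tail of the buffer, which is why the cost grows linearly with document length.

```c
#include <stdlib.h>
#include <string.h>

/* Insert `ins` into the heap-allocated C string `*buf` at byte offset `pos`.
 * Returns 0 on success, -1 on a bad position or allocation failure. */
int cstr_insert(char **buf, size_t pos, const char *ins) {
    size_t len = strlen(*buf);
    size_t ins_len = strlen(ins);
    if (pos > len) return -1;

    char *grown = realloc(*buf, len + ins_len + 1);
    if (grown == NULL) return -1;

    /* Shift the tail (including the terminating NUL) right: O(len - pos). */
    memmove(grown + pos + ins_len, grown + pos, len - pos + 1);
    memcpy(grown + pos, ins, ins_len);
    *buf = grown;
    return 0;
}
```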
## Support
This library works (C code never dies). But I'm moving to Rust for my newer projects. This library has been rewritten in Rust as [Jumprope](https://crates.io/crates/jumprope). Jumprope is another 2-3x faster than this library on real-world editing traces. It's obnoxiously fast.
Usage
-----

Just add `rope.c` and `rope.h` to your project.
Be sure to add `rope.c` to your compile line as well.

```c
// Import rope library into project
#include "rope.h"

// Make a new empty rope
rope *r = rope_new();

// Put some content in it (at position 0)
rope_insert(r, 0, "Hi there!");

// Delete 6 characters at position 2
rope_del(r, 2, 6);

// Get the whole string back out of the rope
uint8_t *str = rope_create_cstr(r);

// str now contains "Hi!"! Test it out!:
_rope_print(r);

// Done with the rope?
rope_free(r);
```

Wide Character String Compatibility
-----------------------------------

String insertion / deletion positions in Javascript, Objective-C (NSString), Java, C# and others are **wrong sometimes**!!!
These languages store strings as `wchar` arrays (arrays of two-byte characters). Some characters in the Unicode character set require more than two bytes. These languages encode such characters using multiple wchars as per UTF-16. This works most of the time. However, insertion and deletion positions in these strings still refer to offsets in the underlying array. So Unicode characters which take up 4 bytes in UTF-16 count as two characters for the purpose of deletion ranges, insertion positions and string length.
Even though these characters are exceptionally rare, I don't want my editor to go all funky if people start getting creative. About a quarter of librope's code is dedicated to fixing this mismatch. However, bookkeeping isn't free - librope performance drops by 35% when wchar conversion support is enabled.
For more information, read my [blog post about it](https://josephg.com/blog/string-length-lies).
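The mismatch is easy to see by counting the same string both ways. The two helpers below are hypothetical names for this sketch and are not part of librope; they assume valid, NUL-terminated UTF-8 input.

```c
#include <stddef.h>
#include <stdint.h>

/* Count Unicode code points in a NUL-terminated, valid UTF-8 string. */
size_t utf8_codepoints(const uint8_t *s) {
    size_t n = 0;
    for (; *s; s++) {
        if ((*s & 0xC0) != 0x80) n++; /* skip continuation bytes 10xxxxxx */
    }
    return n;
}

/* Count the UTF-16 code units the same string would occupy. */
size_t utf8_utf16_units(const uint8_t *s) {
    size_t n = 0;
    for (; *s; s++) {
        if ((*s & 0xC0) == 0x80) continue; /* continuation byte */
        n += (*s >= 0xF0) ? 2 : 1; /* 4-byte sequences need a surrogate pair */
    }
    return n;
}
```

For a string such as `a`, U+1F600, `b`, the first function reports 3 characters while the second reports 4 UTF-16 units, so the position of `b` is 2 in one scheme and 3 in the other.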
Long story short, if you need to interoperate with strings from any of these dodgy languages, here's what you do:
- Compile with `-DROPE_WCHAR=1`. This macro enables the expensive wchar bookkeeping.
- Use the alternate insert & delete functions `rope_insert_at_wchar(...)` and `rope_del_at_wchar(...)` when your index / size is specified in UTF-16 offsets.

Take a look at the header file for documentation.
#### Beware:
- When using `rope_insert_at_wchar` you still need to convert the string you're inserting into UTF-8 before you pass it into librope.
- The API lets you try to delete or insert halfway through a large character. You probably don't want to do that.
- librope is 100% faithful when it comes to the characters you're inserting. If your string has byte order marks, you might want to remove them before passing the string into librope.
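The first and last points above can be handled in one conversion step before the string reaches librope. The function `utf16_to_utf8` below is a hypothetical helper for this sketch, not a librope API. Dropping a leading byte order mark (U+FEFF) and rejecting unpaired surrogates are assumptions about what a caller would typically want.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode `n` UTF-16 code units from `in` as NUL-terminated UTF-8 in `out`.
 * `out` must have room for 3 * n + 1 bytes. A leading BOM is skipped.
 * Returns the bytes written (excluding the NUL), or (size_t)-1 if the
 * input contains an unpaired surrogate. */
size_t utf16_to_utf8(const uint16_t *in, size_t n, uint8_t *out) {
    size_t o = 0;
    size_t i = (n > 0 && in[0] == 0xFEFF) ? 1 : 0; /* drop leading BOM */
    for (; i < n; i++) {
        uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) { /* high surrogate */
            if (i + 1 >= n || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++; /* consume the low surrogate as well */
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) { /* stray low surrogate */
            return (size_t)-1;
        }
        if (cp < 0x80) {
            out[o++] = (uint8_t)cp;
        } else if (cp < 0x800) {
            out[o++] = (uint8_t)(0xC0 | (cp >> 6));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out[o++] = (uint8_t)(0xE0 | (cp >> 12));
            out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (uint8_t)(0xF0 | (cp >> 18));
            out[o++] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        }
    }
    out[o] = 0;
    return o;
}
```

The resulting buffer is the UTF-8 string you would then hand to `rope_insert_at_wchar`, while the insert position itself stays in UTF-16 units.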