Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/kddnewton/yarp-identifiers
https://github.com/kddnewton/yarp-identifiers
Last synced: 23 days ago
JSON representation
- Host: GitHub
- URL: https://github.com/kddnewton/yarp-identifiers
- Owner: kddnewton
- Created: 2023-08-28T21:33:36.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-08-28T21:45:48.000Z (over 1 year ago)
- Last Synced: 2024-11-13T08:36:41.188Z (3 months ago)
- Language: C
- Size: 2.93 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# yarp-identifiers
In the [https://github.com/ruby/yarp](yarp) project, we need to parse identifiers in Ruby source. Frequently our parsing of identifiers shows up in our performance profiles (because identifiers are very common in Ruby code). In order to parse them, you start at some starting point (a letter or an underscores) and then read as many letters, digits, and underscores as you can. In Ruby this is actually encoding dependent, as you can encode your source in dozens of encodings. We want to only bother attempting to optimize the default path here (`UTF-8`).
In YARP, we use a lookup table to determine if a character is a letter, digit, or underscore. This is a 256 byte table, and we index into it with the character's byte value.
This repository is exploring other approaches, like using SIMD instructions to do the lookup or using SIMD instructions to check against ranges of characters. At the moment it is only attempting to do this with ASCII.
To compile and run the code, run `make`.