{"id":15056437,"url":"https://github.com/sirwumpus/erlang-bs","last_synced_at":"2025-04-10T04:10:49.142Z","repository":{"id":57492489,"uuid":"84501487","full_name":"SirWumpus/erlang-bs","owner":"SirWumpus","description":"C-like string and ctype functions for Erlang binary strings.","archived":false,"fork":false,"pushed_at":"2023-01-03T17:17:01.000Z","size":89,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-07T19:49:02.591Z","etag":null,"topics":["binary-strings","ctype-functions","erlang","fnv-1a","string-manipulation","strings","sunday"],"latest_commit_sha":null,"homepage":null,"language":"Erlang","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SirWumpus.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-03-10T00:23:29.000Z","updated_at":"2023-01-17T18:18:38.000Z","dependencies_parsed_at":"2023-02-01T08:16:07.236Z","dependency_job_id":null,"html_url":"https://github.com/SirWumpus/erlang-bs","commit_stats":null,"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SirWumpus%2Ferlang-bs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SirWumpus%2Ferlang-bs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SirWumpus%2Ferlang-bs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SirWumpus%2Ferlang-bs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SirWumpus","download_url":"https://codeload.github.com/SirWumpus/erlang-bs/
tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248154987,"owners_count":21056543,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["binary-strings","ctype-functions","erlang","fnv-1a","string-manipulation","strings","sunday"],"created_at":"2024-09-24T21:51:28.494Z","updated_at":"2025-04-10T04:10:49.124Z","avatar_url":"https://github.com/SirWumpus.png","language":"Erlang","funding_links":[],"categories":[],"sub_categories":[],"readme":"Binary String Support\n=====================\n\nC-like string and ctype functions for Erlang binary strings.\n\n\nData Types\n----------\n\n* Bs = binary() ; string\n* Ch = integer()\n* Base = 0 | 2..36\n* Delims = binary() ; string\n* Fmt = binary() ; string\n* Index = non_neg_integer()\n* Length = non_neg_integer()\n* Date = { Year, Month, Day }\n* Time = { Hour, Minute, Second }\n* Tz = integer() ; time zone offset in seconds, eg. 
-18000 = -05:00 = EST, -12600 = -03:30 = Newfoundland NST\n* Dtz = { Date, Time, Tz }\n* Utc = { Date, Time, 0 }\n* Pattern = binary() ; sub-string to find\n* EpochSeconds = non_neg_integer() ; seconds since 1970-01-01T00:00:00Z.\n\n\nExports\n-------\n\n### ctype:isbase(Ch, Base) -\u003e boolean()\nTrue if the character is in the given number base (2..36).\n\n- - -\n\n### ctype:isblank(Ch) -\u003e boolean()\nTrue if the character is a space or tab.\n\n- - -\n### ctype:isspace(Ch) -\u003e boolean()\nTrue if the character is a space, tab, carriage return, line feed, form feed, or vertical tab.\n\n- - -\n### ctype:isprint(Ch) -\u003e boolean()\nTrue if the character is a printable character, including space.\n\n- - -\n### ctype:isdigit(Ch) -\u003e boolean()\nTrue if the character is a decimal digit.\n\n- - -\n### ctype:isxdigit(Ch) -\u003e boolean()\nTrue if the character is a hexadecimal digit.\n\n- - -\n### ctype:iscntrl(Ch) -\u003e boolean()\nTrue if the character is a control character.\n\n- - -\n### ctype:isupper(Ch) -\u003e boolean()\nTrue if the character is an upper case character.\n\n- - -\n### ctype:islower(Ch) -\u003e boolean()\nTrue if the character is a lower case character.\n\n- - -\n### ctype:isalpha(Ch) -\u003e boolean()\nTrue if the character is an alphabetic character.\n\n- - -\n### ctype:isalnum(Ch) -\u003e boolean()\nTrue if the character is an alphabetic or numeric character.\n\n- - -\n### ctype:ispunct(Ch) -\u003e boolean()\nTrue if the character is a punctuation character.\n\n- - -\n### ctype:tolower(Ch) -\u003e Ch\nIf the character is alphabetic, return its corresponding lower case letter; otherwise return the character as-is.\n\n- - -\n### ctype:toupper(Ch) -\u003e Ch\nIf the character is alphabetic, return its corresponding upper case letter; otherwise return the character as-is.\n\n- - -\n### fnv:hash32(Bs) -\u003e integer()\nReturn a Fowler-Noll-Vo 1a 32-bit hash for the given binary string. 
\n\n- - -\n### fnv:hash56(Bs) -\u003e integer()\nReturn a Fowler-Noll-Vo 1a 56-bit hash for the given binary string. \n\n- - -\n### str:at(Bs, Index) -\u003e byte() | badarg\nReturn the character/byte at the given index in the binary string.  Alias for `binary:at/2`.\n\n- - -\n### str:cat(Bs1, Bs2) -\u003e Bs\nConcatenate two binary strings.\n\n- - -\n### str:chr(Bs, Ch) -\u003e Index | -1\nReturn the index of the first occurrence of the character in the binary string; otherwise `-1` if not found.\n\n- - -\n### str:casecmp(Bs1, Bs2) -\u003e integer()\nReturn an integer greater than, equal to, or less than 0 according to whether case-less binary string Bs1 is greater than, equal to, or less than case-less binary string Bs2.\n\n- - -\n### str:casestr(Bs, Pattern) -\u003e Index | -1\nReturn the index of the first occurrence of the case-less Pattern in Bs; otherwise `-1` if not found.\n\n- - -\n### str:cmp(Bs1, Bs2) -\u003e integer() \nReturn an integer greater than, equal to, or less than 0 according to whether binary string Bs1 is greater than, equal to, or less than binary string Bs2.\n\n- - -\n### str:cpy(Bs) -\u003e Bs\nReturn a copy of Bs.  Alias for `binary:copy/1`.\n\n- - -\n### str:cspn(Bs, Delims) -\u003e Length\nReturn the number of leading characters in the binary string before any of the delimiters are found.\n\n- - -\n### str:error(Reason) -\u003e Bs\nReturn binary string error message for Reason.\n\n- - -\n### str:ftime(Fmt, {Date, Time}) -\u003e Bs\n### str:ftime(Fmt, {Date, Time, Tz}) -\u003e Bs\n### str:ftime(Fmt, EpochSeconds) -\u003e Bs\nUse binary string Fmt to format Date and Time into Bs.  All ordinary characters are copied as-is, while the following format characters are replaced (similar to strftime(3)).  Without Tz, assumes local time of user $TZ or system when $TZ is unset.  
To ensure UTC, use ftime/2 with Tz = 0.\n\n**%A** is replaced by the ~~locale's~~ English full weekday name.\n\n**%a** is replaced by the ~~locale's~~ English abbreviated weekday name.\n\n**%B** is replaced by the ~~locale's~~ English full month name.\n\n**%b** or **%h** is replaced by the ~~locale's~~ English abbreviated month name.\n\n**%C** is replaced by the century (a year divided by 100 and truncated to an integer) as a decimal number [00,99].\n\n**%c** is replaced by the ~~locale's~~ RFC appropriate date and time representation, `%e %b %Y %H:%M:%S`.\n\n**%D** is replaced by the date in the format `%m/%d/%y`.\n\n**%d** is replaced by the day of the month as a decimal number [01,31].\n\n**%e** is replaced by the day of month as a decimal number [1,31]; single digits are preceded by a blank.\n\n**%F** is replaced by the date in the format `%Y-%m-%d` (the ISO 8601 date format).\n\n**%G** is replaced by the ISO 8601 year with century as a decimal number.\n\n**%g** is replaced by the ISO 8601 year without century as a decimal number [00,99].  This is the year that includes the greater part of the week, where Monday is the first day of the week.  
See also `%V`.\n\n**%H** is replaced by the hour (24-hour clock) as a decimal number [00,23].\n\n**%I** is replaced by the hour (12-hour clock) as a decimal number [01,12].\n\n**%j** is replaced by the day of the year as a decimal number [001,366].\n\n**%k** is replaced by the hour (24-hour clock) as a decimal number [0,23]; single digits are preceded by a blank.\n\n**%l** is replaced by the hour (12-hour clock) as a decimal number [1,12]; single digits are preceded by a blank.\n\n**%M** is replaced by the minute as a decimal number [00,59].\n\n**%m** is replaced by the month as a decimal number [01,12].\n\n**%n** is replaced by a newline.\n\n**%p** is replaced by ~~the locale's equivalent of~~ either \"am\" or \"pm\".\n\n**%R** is replaced by the time in the format `%H:%M`.\n\n**%r** is replaced by the ~~locale's~~ representation of 12-hour clock time using AM/PM notation.\n\n**%s** is replaced by the number of UTC seconds since the Epoch.\n\n**%T** is replaced by the time in the format `%H:%M:%S`.\n\n**%t** is replaced by a tab.\n\n**%u** is replaced by the weekday (Monday as the first day of the week) as a decimal number [1,7].\n\n**%V** is replaced by the week number of the year (Monday as the first day of the week) as a decimal number [01,53]. According to ISO 8601 the week containing January 1 is week 1 if it has four or more days in the new year, otherwise it is week 53 of the previous year, and the next week is week 1.\n\n**%v** is replaced by the date in the format `%e-%b-%Y`.\n\n**%Y** is replaced by the year with century as a decimal number.\n\n**%y** is replaced by the year without century as a decimal number [00,99].\n\n**%z** is replaced by the offset from the Prime Meridian in the format +HHMM or -HHMM (ISO 8601) as appropriate, with positive values representing locations east of Greenwich, or by the empty string if this is not determinable.  
`[+-]hhmm`.\n\n**%%** is replaced by `%`.\n\n\n- - -\n### str:isprintable(Bs) -\u003e boolean()\nTrue if the binary string consists only of printable characters, ie. every character satisfies `ctype:isprint/1` or `ctype:isspace/1`, or is one of the ASCII control characters `BEL`, `BS`, or `ESC`.\n\n- - -\n### str:join(Ch, BsList) -\u003e Bs\n### str:join(Delim, BsList) -\u003e Bs\n\nUsing a character or binary string as a delimiter, join the list of binary strings into one string.\n\n- - -\n### str:len(Bs) -\u003e Length\nLength of binary string.  Alias for `byte_size/1`.\n\n- - -\n### str:lower(Bs) -\u003e Bs\nReturn a binary string converted to lower case.\n\n- - -\n### str:lpad(Bs, Pad, Width) -\u003e Bs\nReturn Bs padded with Pad characters to the left until at least Width.\n\n- - -\n### str:ltrim(Bs) -\u003e Bs\nRemove leading whitespace from a binary string.\n\n- - -\n### str:ncasecmp(Bs1, Bs2, Length) -\u003e integer() \nReturn an integer greater than, equal to, or less than 0 according to whether case-less binary string Bs1 is greater than, equal to, or less than case-less binary string Bs2, comparing at most Length octets.\n\n- - -\n### str:ncat(Bs1, Bs2, Length) -\u003e Bs\nAppend Length characters from Bs2 to Bs1.\n\n- - -\n### str:ncmp(Bs1, Bs2, Length) -\u003e integer()\nReturn an integer greater than, equal to, or less than 0 according to whether binary string Bs1 is greater than, equal to, or less than binary string Bs2, comparing at most Length octets.\n\n- - -\n### str:pad_int(Int, Pad, Width) -\u003e Bs\nReturn a binary string with the decimal integer right justified to the minimum field width; numbers shorter than the field width are left padded.  If the integer is negative and Pad is the zero (0) character, then a minus sign appears ahead of the zero padding.  
Positive numbers have no sign.\n\n- - -\n### str:pad_sign_int(Int, Pad, Width) -\u003e Bs\nReturn a binary string with the signed decimal integer right justified to the minimum field width; numbers shorter than the field width are left padded.  If Pad is the zero (0) character, then the plus or minus sign appears ahead of the zero padding.\n\n- - -\n### str:ptime(Bs, Fmt) -\u003e { {Date, Time, Tz}, \u003c\u003c Rest \u003e\u003e } | {badarg, \u003c\u003c Here \u003e\u003e}\nParse the leading date-time of Bs according to Fmt and return a date-time-tz tuple and the remainder of the string not consumed.  If a time zone conversion is not specified in the Fmt, then the local time zone of the user's `$TZ` or system is assumed.  If there is a parse error, `badarg` and the remainder of the binary string where the parse failed are returned.\n\nThe format string consists of zero or more conversion specifications, whitespace characters as defined by `ctype:isspace()`, and ordinary characters.  Whitespace matches zero or more whitespace characters and ordinary characters match themselves.  The following `%` format conversions are supported:\n\n**%a** the day of week, using ~~the locale's~~ English weekday names; either the abbreviated or full name may be specified.  Case is ignored.\n\n**%A** the same as %a.\n\n**%b** the month, using ~~the locale's~~ English month names; either the abbreviated or full name may be specified.  
Case is ignored.\n\n**%B** the same as %b.\n\n**%c** the date and time, using ~~the locale's date and time format~~ `%e %b %Y %H:%M:%S`.\n\n**%d** the day of month [1,31]; leading zeros are permitted but not required.\n\n**%D** the date as %m/%d/%y.\n\n**%e** the same as %d.\n\n**%F** the date as %Y-%m-%d (the ISO 8601 date format).\n\n**%h** the same as %b.\n\n**%H** the hour (24-hour clock) [0,23]; leading zeros are permitted but not required.\n\n**%I** the hour (12-hour clock) [1,12]; leading zeros are permitted but not required.\n\n**%j** the day number of the year [1,366]; leading zeros are permitted but not required.\n\n**%k** the same as %H.\n\n**%l** the same as %I.\n\n**%m** the month number [1,12]; leading zeros are permitted but not required.\n\n**%M** the minute [0,59]; leading zeros are permitted but not required.\n\n**%n** any white-space, including none.\n\n**%p** ~~the locale's equivalent of~~ AM or PM.  Case is ignored.\n\n**%r** the time (12-hour clock) with %p, ~~using the locale's time format~~ `%l:%M %p`.\n\n**%R** the time as %H:%M.\n\n**%S** the seconds [0,61]; leading zeros are permitted but not required.\n\n**%s** the number of UTC seconds since the Epoch.\n\n**%t** any white-space, including none.\n\n**%T** the time as %H:%M:%S.\n\n**%y** the year within the 20th century [69,99] (1969..1999) or the 21st century [0,68] (2000..2068); leading zeros are permitted but not required.\n\n**%Y** the year, including the century (e.g., 1996).\n\n**%z** an ISO 8601 or RFC 2822 timezone specification: the offset from Coordinated Universal Time (UTC), specified as \"[+-]hh[:]mm\".\n\n**%%** matches a literal `%`.  
No argument is converted.\n\n- - -\n### str:rchr(Bs, Ch) -\u003e Index | -1\nReturn the index of the last occurrence of the character in the binary string; otherwise `-1` if not found.\n\n- - -\n### str:rpad(Bs, Pad, Width) -\u003e Bs\nReturn Bs padded with Pad characters to the right until at least Width.\n\n- - -\n### str:ncpy(Bs, Length) -\u003e Bs\nReturn a copy of the first Length octets of Bs. \n\n- - -\n### str:rev(Bs) -\u003e Bs\nReverse the binary string.\n\n- - -\n### str:rtrim(Bs) -\u003e Bs\nRemove trailing whitespace from a binary string.\n\n- - -\n### str:split(Bs) -\u003e List\n### str:split(Bs, Delims) -\u003e List\nReturn a list of binary strings from `Bs` split by `Delims` using `str:token`.  Default `Delims` are whitespace characters.\n\n- - -\n### str:spn(Bs, Delims) -\u003e Length\nReturn the number of leading delimiters in the binary string.\n\n- - -\n### str:str(Bs, Pattern) -\u003e Index | -1\nReturn the index of the first occurrence of Pattern in Bs; otherwise `-1` if not found.\n\n- - -\n### str:sub(Bs, Start) -\u003e Bs  \nReturn the binary substring from the starting index until the end of the string.  The index counts from zero (0).\n\n- - -\n### str:sub(Bs, Start, Stop) -\u003e Bs\nReturn the binary substring between the start and stop indices, excluding stop.  The indices count from zero (0).  Wrapper for `binary_part/3`.\n\n- - -\n### str:to_date_time(Bs) -\u003e {{Date, Time, Tz}, \u003c\u003c Rest \u003e\u003e} | badarg\nAttempt to parse the leading portion of Bs as an ISO 8601, RFC 2822, or ctime() date-time string.  If time zone information is missing, then the local time zone is assumed.  `badarg` is returned if no input is consumed.\n\n- - -\n### str:to_int(Bs, Base) -\u003e { integer(), \u003c\u003c Rest \u003e\u003e } | {badarg, \u003c\u003c Here \u003e\u003e}\nReturn a tuple of the leading parsed integer and remaining binary string.  The parsed integer string can be padded with leading zeros.  
If base is zero or 16, the string may then include a '0x' prefix, and the number will be read in base 16; otherwise, a zero base is taken as 10 (decimal) unless the next character is '0', in which case it is taken as 8 (octal).  `badarg` is returned if no input is consumed.\n\n- - -\n### str:tok(Bs, Delims) -\u003e {\u003c\u003c Token \u003e\u003e, \u003c\u003c Rest \u003e\u003e}\nReturn a tuple of the first token separated by one or more delimiters and the remaining binary string.\n\n- - -\n### str:token(Bs) -\u003e {\u003c\u003c Token \u003e\u003e, \u003c\u003c Rest \u003e\u003e}\n### str:token(Bs, Delims) -\u003e {\u003c\u003c Token \u003e\u003e, \u003c\u003c Rest \u003e\u003e}\nToken parser that understands single and double quoted string segments, and backslash escaped characters.  Within a quoted string segment, backslash is itself, eg. `\"ab\\cd\"` is `ab\\cd`, and paired quotes represent a literal quote, eg. `\"ab\"\"cd\"` yields `ab\"cd` and `'12''34'` yields `12'34`.  Return a tuple of the first token separated by an unquoted delimiter from the set of `Delims` followed by any whitespace, and the remaining binary string.  Default `Delims` are ASCII whitespace characters.\n\n- - -\n### str:tr(Bs, FromSet) -\u003e Bs\n### str:tr(Bs, FromSet, ToSet) -\u003e Bs\nEach character in Bs found at position N of FromSet (a binary string) is replaced by the character at position N of ToSet (a binary string); if ToSet is shorter than FromSet, then the last character of ToSet is used.  If ToSet is missing or empty, then characters in FromSet are deleted from Bs.\n\n- - -\n### str:trim(Bs) -\u003e Bs\nRemove leading and trailing whitespace from a binary string.\n\n- - -\n### str:upper(Bs) -\u003e Bs\nReturn a binary string converted to upper case.\n\n- - -\n### sunday:init(Pattern, MaxErr) -\u003e {Pattern, MaxErr, DeltaMap} | badarg\nGenerate the delta shift table used by Boyer-Moore-Sunday approximate string matching with up to MaxErr mismatches.  
The returned tuple can be passed to `sunday:search/2`.\n\n- - -\n### sunday:search(Bs, Pattern) -\u003e Index | -1 | badarg\nEquivalent to `sunday:search(Bs, sunday:init(Pattern, 0))`.\n\n- - -\n### sunday:search(Bs, Pattern, MaxErr) -\u003e Index | -1 | badarg\nEquivalent to `sunday:search(Bs, sunday:init(Pattern, MaxErr))`.\n\n- - -\n### sunday:search(Bs, {Pattern, MaxErr, DeltaMap}) -\u003e Index | -1\nGeneralised Boyer-Moore-Sunday approximate string matching with up to MaxErr mismatches.  For MaxErr=0, the search is an exact string search.  Return the index of the first occurrence of Pattern in Bs; otherwise `-1` if not found.\n\n\nReferences\n----------\n\nFowler, Noll, Vo; 1994  \n\u003chttp://www.isthe.com/chongo/tech/comp/fnv/index.html\u003e\n\nFowler, Noll, Vo on Wikipedia  \n\u003chttps://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function\u003e\n\nWhich hashing algorithm is best for uniqueness and speed?  \n\u003chttp://programmers.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed\u003e\n\n\"A very fast substring search algorithm\";  \nDaniel M. Sunday; Communications of the ACM; August 1990;  \n\u003chttps://csclub.uwaterloo.ca/~pbarfuss/p132-sunday.pdf\u003e\n\n\"Approximate Boyer-Moore String Matching\";  \nJorma Tarhio and Esko Ukkonen; 1990;  \n\u003chttps://www.cs.hut.fi/u/tarhio/papers/abm.pdf\u003e\n\n\"Approximate Boyer-Moore String Matching\" Explained;  \nPresentation by Kuei-hao Chen;  \n\u003chttp://t2.ecp168.net/webs@73/cyberhood/Approximate_String_Matching/BHM_approximate_string_Algorithm.ppt\u003e\n\n\nCopyright\n---------\n\nCopyright 2017, 2021 by Anthony Howe.  
All rights reserved.\n\n\nMIT License\n-----------\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsirwumpus%2Ferlang-bs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsirwumpus%2Ferlang-bs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsirwumpus%2Ferlang-bs/lists"}