Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/voku/portable-utf8
🉑 Portable UTF-8 library - performance optimized (unicode) string functions for PHP.
- Host: GitHub
- URL: https://github.com/voku/portable-utf8
- Owner: voku
- License: apache-2.0
- Created: 2014-05-24T11:37:29.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2024-10-22T14:37:53.000Z (about 2 months ago)
- Last Synced: 2024-10-29T15:47:22.424Z (about 1 month ago)
- Topics: ascii, hacktoberfest, multibyte, multibyte-strings, php, php7, string, string-encoding, string-manipulation, unicode, utf-8, utf8
- Language: PHP
- Homepage:
- Size: 8.84 MB
- Stars: 507
- Watchers: 19
- Forks: 73
- Open Issues: 18
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: .github/CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE-APACHE
- Security: SECURITY.md
Awesome Lists containing this project
- awesome - voku/portable-utf8 - 🉑 Portable UTF-8 library - performance optimized (unicode) string functions for PHP. (PHP)
- awesome-php - Portable UTF-8 - A string manipulation library with UTF-8 safe replacement methods. (Table of Contents / Strings)
- awesome-php-cn - Portable UTF-8 - A string manipulation library with UTF-8 safe replacement methods. (Table of Contents / Strings)
- awesome-projects - Portable UTF-8 - A string manipulation library with UTF-8 safe replacement methods. (PHP / Strings)
README
[//]: # (AUTO-GENERATED BY "PHP README Helper": base file -> docs/base.md)
[![SWUbanner](https://raw.githubusercontent.com/vshymanskyy/StandWithUkraine/main/banner2-direct.svg)](https://github.com/vshymanskyy/StandWithUkraine/blob/main/docs/README.md)

[![Build Status](https://github.com/voku/portable-utf8/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/voku/portable-utf8/actions)
[![Build status](https://ci.appveyor.com/api/projects/status/gnejjnk7qplr7f5t/branch/master?svg=true)](https://ci.appveyor.com/project/voku/portable-utf8/branch/master)
[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fvoku%2Fportable-utf8.svg?type=shield)](https://app.fossa.io/projects/git%2Bgithub.com%2Fvoku%2Fportable-utf8?ref=badge_shield)
[![codecov.io](https://codecov.io/github/voku/portable-utf8/coverage.svg?branch=master)](https://codecov.io/github/voku/portable-utf8?branch=master)
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/997c9bb10d1c4791967bdf2e42013e8e)](https://www.codacy.com/app/voku/portable-utf8)
[![Latest Stable Version](https://poser.pugx.org/voku/portable-utf8/v/stable)](https://packagist.org/packages/voku/portable-utf8)
[![Total Downloads](https://poser.pugx.org/voku/portable-utf8/downloads)](https://packagist.org/packages/voku/portable-utf8)
[![License](https://poser.pugx.org/voku/portable-utf8/license)](https://packagist.org/packages/voku/portable-utf8)
[![Donate to this project using PayPal](https://img.shields.io/badge/paypal-donate-yellow.svg)](https://www.paypal.me/moelleken)
[![Donate to this project using Patreon](https://img.shields.io/badge/patreon-donate-yellow.svg)](https://www.patreon.com/voku)

# 🉑 Portable UTF-8
## Description
It is written in PHP (PHP 7+) and can work without "mbstring", "iconv" or any other extra encoding php-extension on your server.
The benefit of Portable UTF-8 is that it is easy to use and easy to bundle. This library will also
auto-detect your server environment and will use the installed php-extensions if they are available,
so you will have the best possible performance.

As a fallback we will use Symfony Polyfills, if needed. (https://github.com/symfony/polyfill)
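
The idea of using an installed extension when it is present and falling back otherwise can be pictured with a small sketch. The helper name `utf8_length_sketch` is hypothetical and not part of Portable UTF-8. It only illustrates the general pattern with core PHP functions and does not show how the library is implemented.

```php
<?php
// Hypothetical helper (not part of Portable UTF-8): count the characters of a
// UTF-8 string with the fastest mechanism available on this server.
function utf8_length_sketch(string $str): int
{
    // Prefer the C implementation when the "mbstring" extension is loaded.
    if (extension_loaded('mbstring')) {
        return mb_strlen($str, 'UTF-8');
    }

    // Fallback: with the "u" and "s" modifiers, "." matches exactly one
    // code point, including newlines.
    $count = preg_match_all('/./us', $str);

    // preg_match_all() returns false on invalid UTF-8; count bytes in that case.
    return $count === false ? strlen($str) : $count;
}

echo utf8_length_sketch('fòôbàř'), "\n"; // 6
```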
The project is based on ...
+ Hamid Sarfraz's work - [portable-utf8](http://pageconfig.com/attachments/portable-utf8.php)
+ Nicolas Grekas's work - [tchwork/utf8](https://github.com/tchwork/utf8)
+ Behat's work - [Behat/Transliterator](https://github.com/Behat/Transliterator)
+ Sebastián Grignoli's work - [neitanod/forceutf8](https://github.com/neitanod/forceutf8)
+ Ivan Enderlin's work - [hoaproject/Ustring](https://github.com/hoaproject/Ustring)
+ and many cherry-picks from "GitHub"-gists and "Stack Overflow"-snippets ...

## Demo
Here you can test some basic functions from this library and compare the results with those of the native PHP functions.
+ [encoder.suckup.de](https://encoder.suckup.de/)
## Index
* [Alternative](#alternative)
* [Install](#install-portable-utf-8-via-composer-require)
* [Why Portable UTF-8?](#why-portable-utf-8)
* [Requirements and Recommendations](#requirements-and-recommendations)
* [Warning](#warning)
* [Usage](#usage)
* [Class methods](#class-methods)
* [Unit Test](#unit-test)
* [License and Copyright](#license-and-copyright)

## Alternative
If you prefer a more object-oriented way to edit strings, take a look at [voku/Stringy](https://github.com/voku/Stringy). It is a fork of "danielstjules/Stringy" that uses the "Portable UTF-8" class and adds some extra methods.
```php
// Standard library
strtoupper('fòôbàř');       // 'FòôBàř'
strlen('fòôbàř');           // 10

// mbstring
// WARNING: if you don't use a polyfill like "Portable UTF-8", you need to install the php-extension "mbstring" on your server
mb_strtoupper('fòôbàř');    // 'FÒÔBÀŘ'
mb_strlen('fòôbàř');        // 6

// Portable UTF-8
use voku\helper\UTF8;
UTF8::strtoupper('fòôbàř');    // 'FÒÔBÀŘ'
UTF8::strlen('fòôbàř');        // 6

// voku/Stringy
use Stringy\Stringy as S;
$stringy = S::create('fòôbàř');
$stringy->toUpperCase();    // 'FÒÔBÀŘ'
$stringy->length();         // 6
```

## Install "Portable UTF-8" via "composer require"
```shell
composer require voku/portable-utf8
```

If your project does not need some of the Symfony polyfills, use the `replace` section of your `composer.json`.
This removes the overhead of these polyfills, as they are no longer part of your project, e.g.:
```json
{
  "replace": {
    "symfony/polyfill-php72": "1.99",
    "symfony/polyfill-iconv": "1.99",
    "symfony/polyfill-intl-grapheme": "1.99",
    "symfony/polyfill-intl-normalizer": "1.99",
    "symfony/polyfill-mbstring": "1.99"
  }
}
```

## Why Portable UTF-8?
PHP 5 and earlier versions have no native Unicode support. To bridge the gap, there exist several extensions like "mbstring", "iconv" and "intl".

The problem with "mbstring" and others is that most of the time you cannot ensure the presence of a specific one on a server. If you rely on one of these, your application is no longer portable. This problem gets even more severe for open source applications that have to run on different servers with different configurations. Considering these, I decided to write this library.
## Requirements and Recommendations
* No extensions are required to run this library. Portable UTF-8 only needs the PCRE library, which is available by default since PHP 4.2.0 and cannot be disabled since PHP 5.3.0. "/u" modifier support in PCRE for UTF-8 handling is not a must.
* PHP 5.3 is the minimum requirement, and all later versions are fine with Portable UTF-8.
* PHP 7.0 is the minimum requirement since version 4.0 of Portable UTF-8; otherwise composer will install an older version.
* PHP 8.0 support is also available and will adapt the behaviours of the native functions.
* To speed up string handling, it is recommended that you have "mbstring" or "iconv" available on your server, as well as the latest version of the PCRE library.
* Although Portable UTF-8 is easy to use, moving from the native API to Portable UTF-8 may not be straightforward for everyone. It is highly recommended that you do not update your scripts to include Portable UTF-8, or replace or change anything, before you know the reason and consequences. Most of the time, some native function may be all you need.
* There is also a shim for "mbstring", "iconv" and "intl", so you can use it also on shared webspace.

## Usage
Example 1: UTF8::cleanup()
```php
echo UTF8::cleanup('�Düsseldorf�');

// will output:
// Düsseldorf
```

Example 2: UTF8::strlen()
```php
$string = 'string <strong>with utf-8 chars åèä</strong> - doo-bee doo-bee dooh';

echo strlen($string) . "\n<br />";
echo UTF8::strlen($string) . "\n<br />";

// will output:
// 70
// 67

$string_test1 = strip_tags($string);
$string_test2 = UTF8::strip_tags($string);

echo strlen($string_test1) . "\n<br />";
echo UTF8::strlen($string_test2) . "\n<br />";

// will output:
// 53
// 50
```

Example 3: UTF8::fix_utf8()
```php
echo UTF8::fix_utf8('DÃ¼sseldorf');
echo UTF8::fix_utf8('Ã¤');

// will output:
// Düsseldorf
// ä
```

# Portable UTF-8 | API
The API from the "UTF8"-Class is written as small static methods that will match the default PHP-API.
## Class methods
access
add_bom_to_string
array_change_key_case
between
binary_to_str
bom
callback
char_at
chars
checkForSupport
chr
chr_map
chr_size_list
chr_to_decimal
chr_to_hex
chunk_split
clean
cleanup
codepoints
collapse_whitespace
count_chars
css_identifier
css_stripe_media_queries
ctype_loaded
decimal_to_chr
decode_mimeheader
emoji_decode
emoji_encode
emoji_from_country_code
encode
encode_mimeheader
extract_text
file_get_contents
file_has_bom
filter
filter_input
filter_input_array
filter_var
filter_var_array
finfo_loaded
first_char
fits_inside
fix_simple_utf8
fix_utf8
getCharDirection
getSupportInfo
getUrlParamFromArray
get_file_type
get_random_string
get_unique_string
has_lowercase
has_uppercase
has_whitespace
hex_to_chr
hex_to_int
html_encode
html_entity_decode
html_escape
html_stripe_empty_tags
htmlentities
htmlspecialchars
iconv_loaded
int_to_hex
intlChar_loaded
intl_loaded
is_alpha
is_alphanumeric
is_ascii
is_base64
is_binary
is_binary_file
is_blank
is_bom
is_empty
is_hexadecimal
is_html
is_json
is_lowercase
is_printable
is_punctuation
is_serialized
is_uppercase
is_url
is_utf8
is_utf16
is_utf32
json_decode
json_encode
json_loaded
lcfirst
lcwords
levenshtein
ltrim
max
max_chr_width
mbstring_loaded
min
normalize_encoding
normalize_line_ending
normalize_msword
normalize_whitespace
ord
parse_str
pcre_utf8_support
range
rawurldecode
regex_replace
remove_bom
remove_duplicates
remove_html
remove_html_breaks
remove_ileft
remove_invisible_characters
remove_iright
remove_left
remove_right
replace
replace_all
replace_diamond_question_mark
rtrim
showSupport
single_chr_html_encode
spaces_to_tabs
str_camelize
str_capitalize_name
str_contains
str_contains_all
str_contains_any
str_dasherize
str_delimit
str_detect_encoding
str_ends_with
str_ends_with_any
str_ensure_left
str_ensure_right
str_humanize
str_iends_with
str_iends_with_any
str_insert
str_ireplace
str_ireplace_beginning
str_ireplace_ending
str_istarts_with
str_istarts_with_any
str_isubstr_after_first_separator
str_isubstr_after_last_separator
str_isubstr_before_first_separator
str_isubstr_before_last_separator
str_isubstr_first
str_isubstr_last
str_last_char
str_limit
str_limit_after_word
str_longest_common_prefix
str_longest_common_substring
str_longest_common_suffix
str_matches_pattern
str_obfuscate
str_offset_exists
str_offset_get
str_pad
str_pad_both
str_pad_left
str_pad_right
str_repeat
str_replace_beginning
str_replace_ending
str_replace_first
str_replace_last
str_shuffle
str_slice
str_snakeize
str_sort
str_split
str_split_array
str_split_pattern
str_starts_with
str_starts_with_any
str_substr_after_first_separator
str_substr_after_last_separator
str_substr_before_first_separator
str_substr_before_last_separator
str_substr_first
str_substr_last
str_surround
str_titleize
str_titleize_for_humans
str_to_binary
str_to_lines
str_to_words
str_truncate
str_truncate_safe
str_underscored
str_upper_camelize
str_word_count
strcasecmp
strcmp
strcspn
string
string_has_bom
strip_tags
strip_whitespace
stripos
stripos_in_byte
stristr
strlen
strlen_in_byte
strnatcasecmp
strnatcmp
strncasecmp
strncmp
strpbrk
strpos
strpos_in_byte
strrchr
strrev
strrichr
strripos
strripos_in_byte
strrpos
strrpos_in_byte
strspn
strstr
strstr_in_byte
strtocasefold
strtolower
strtoupper
strtr
strwidth
substr
substr_compare
substr_count
substr_count_in_byte
substr_count_simple
substr_ileft
substr_in_byte
substr_iright
substr_left
substr_replace
substr_right
swapCase
symfony_polyfill_used
tabs_to_spaces
titlecase
to_ascii
to_boolean
to_filename
to_int
to_iso8859
to_string
to_utf8
to_utf8_string
trim
ucfirst
ucwords
urldecode
utf8_decode
utf8_encode
whitespace_table
words_limit
wordwrap
wordwrap_per_line
ws

## access(string $str, int $pos, string $encoding): string
Return the character at the specified position: $str[1] like functionality.

EXAMPLE: `UTF8::access('fòô', 1); // 'ò'`

**Parameters:**
- `string $str`: A UTF-8 string.
- `int $pos`: The position of character to return.
- `string $encoding` [optional]: Set the charset for e.g. "mb_" function

**Return:**
- `string`: Single multi-byte character.

--------
## add_bom_to_string(string $str): non-empty-string
Prepends UTF-8 BOM character to the string and returns the whole string.

INFO: If BOM already existed there, the input string is returned.

EXAMPLE: `UTF8::add_bom_to_string('fòô'); // "\xEF\xBB\xBF" . 'fòô'`
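
The documented behaviour (prepend the BOM unless it is already there) can be sketched with core PHP functions. The function name `add_utf8_bom_sketch` is hypothetical; this is an illustration under that assumption, not the library's implementation.

```php
<?php
// Hypothetical helper illustrating the documented behaviour.
function add_utf8_bom_sketch(string $str): string
{
    $bom = "\xEF\xBB\xBF"; // UTF-8 encoding of U+FEFF

    // Leave the input untouched when it already starts with the BOM.
    if (strncmp($str, $bom, 3) === 0) {
        return $str;
    }

    return $bom . $str;
}

$once  = add_utf8_bom_sketch('fòô');
$twice = add_utf8_bom_sketch($once);

var_dump($once === "\xEF\xBB\xBF" . 'fòô'); // bool(true)
var_dump($once === $twice);                  // bool(true)
```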
**Parameters:**
- `string $str`: The input string.

**Return:**
- `non-empty-string`: The output string that contains BOM.

--------
## array_change_key_case(array $array, int $case, string $encoding): string[]
Changes the case of all keys in an array.

**Parameters:**
- `array $array`: The array to work on
- `int $case` [optional]: Either CASE_UPPER or CASE_LOWER (default)
- `string $encoding` [optional]: Set the charset for e.g. "mb_" function

**Return:**
- `string[]`: An array with its keys lower- or uppercased.

--------
## between(string $str, string $start, string $end, int $offset, string $encoding): string
Returns the substring between $start and $end, if found, or an empty
string. An optional offset may be supplied from which to begin the
search for the start string.

**Parameters:**
- `string $str`
- `string $start`: Delimiter marking the start of the substring.
- `string $end`: Delimiter marking the end of the substring.
- `int $offset` [optional]: Index from which to begin the search. Default: 0
- `string $encoding` [optional]: Set the charset for e.g. "mb_" function

**Return:**
- `string`

--------
## binary_to_str(string $bin): string
Convert binary into a string.

INFO: opposite to UTF8::str_to_binary()

EXAMPLE: `UTF8::binary_to_str('11110000100111111001100010000011'); // '😃'`
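
To see why that bit string yields an emoji, it helps to decode it by hand: every group of eight digits is one byte, and the four resulting bytes form one UTF-8 character. The helper `bits_to_bytes_sketch` below is a hypothetical name used only for this illustration, and its handling of malformed input is an assumption, not the library's behaviour.

```php
<?php
// Hypothetical helper: turn a string of "0"/"1" digits into raw bytes.
function bits_to_bytes_sketch(string $bits): string
{
    // Reject input that is empty, not a whole number of bytes, or not binary.
    if ($bits === '' || strlen($bits) % 8 !== 0 || strspn($bits, '01') !== strlen($bits)) {
        return '';
    }

    $bytes = '';
    foreach (str_split($bits, 8) as $octet) {
        // bindec('11110000') is 240, chr(240) is the byte 0xF0, and so on.
        $bytes .= chr((int) bindec($octet));
    }

    return $bytes;
}

// 0xF0 0x9F 0x98 0x83 is the UTF-8 encoding of U+1F603.
var_dump(bits_to_bytes_sketch('11110000100111111001100010000011') === "\xF0\x9F\x98\x83"); // bool(true)
```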
**Parameters:**
- `string $bin`: 1|0

**Return:**
- `string`

--------
## bom(): non-empty-string
Returns the UTF-8 Byte Order Mark Character.

INFO: take a look at UTF8::$bom for e.g. UTF-16 and UTF-32 BOM values

EXAMPLE: `UTF8::bom(); // "\xEF\xBB\xBF"`

**Parameters:**
__nothing__

**Return:**
- `non-empty-string`: UTF-8 Byte Order Mark.

--------
## callback(callable(string): string $callback, string $str): string[]
**Parameters:**
- `callable(string): string $callback`
- `string $str`

**Return:**
- `string[]`

--------
## char_at(string $str, int $index, string $encoding): string
Returns the character at $index, with indexes starting at 0.

**Parameters:**
- `string $str`: The input string.
- `int<1, max> $index`: Position of the character.
- `string $encoding` [optional]: Default is UTF-8

**Return:**
- `string`: The character at $index.

--------
## chars(string $str): string[]
Returns an array consisting of the characters in the string.

**Parameters:**
- `T $str`: The input string.

**Return:**
- `string[]`: An array of chars.

--------
## checkForSupport(): true|null
This method will auto-detect your server environment for UTF-8 support.

**Parameters:**
__nothing__

**Return:**
- `true|null`

--------
## chr(int $code_point, string $encoding): string|null
Generates a UTF-8 encoded character from the given code point.

INFO: opposite to UTF8::ord()

EXAMPLE: `UTF8::chr(0x2603); // '☃'`

**Parameters:**
- `int $code_point`: The code point for which to generate a character.
- `string $encoding` [optional]: Default is UTF-8

**Return:**
- `string|null`: Multi-byte character, returns null on failure or empty input.

--------
## chr_map(callable(string): string $callback, string $str): string[]
Applies callback to all characters of a string.

EXAMPLE: `UTF8::chr_map([UTF8::class, 'strtolower'], 'Κόσμε'); // ['κ','ό', 'σ', 'μ', 'ε']`

**Parameters:**
- `callable(string): string $callback`
- `string $str`: UTF-8 string to run callback on.

**Return:**
- `string[]`: The outcome of the callback, as array.

--------
## chr_size_list(string $str): int[]
Generates an array of byte length of each character of a Unicode string.

- 1 byte => U+0000 - U+007F
- 2 byte => U+0080 - U+07FF
- 3 byte => U+0800 - U+FFFF
- 4 byte => U+10000 - U+10FFFF

EXAMPLE: `UTF8::chr_size_list('中文空白-test'); // [3, 3, 3, 3, 1, 1, 1, 1, 1]`
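
The byte widths listed above can be reproduced with core PHP alone: split the string into code points with PCRE, then measure each piece in bytes. `char_byte_sizes_sketch` is a hypothetical name for this illustration and is not claimed to match the library's internals.

```php
<?php
// Hypothetical helper: byte length of each character in a UTF-8 string.
function char_byte_sizes_sketch(string $str): array
{
    // The "u" modifier makes PCRE treat the input as UTF-8, so splitting on
    // the empty pattern yields one array entry per code point.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return []; // invalid UTF-8
    }

    // strlen() counts bytes, which is the encoded width of each character.
    return array_map('strlen', $chars);
}

print_r(char_byte_sizes_sketch('中文空白-test')); // 3, 3, 3, 3, 1, 1, 1, 1, 1
```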
**Parameters:**
- `T $str`: The original unicode string.

**Return:**
- `int[]`: An array of byte lengths of each character.

--------
## chr_to_decimal(string $char): int
Get a decimal code representation of a specific character.

INFO: opposite to UTF8::decimal_to_chr()

EXAMPLE: `UTF8::chr_to_decimal('§'); // 0xa7`
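
How a multi-byte character maps to a single number can be shown by decoding UTF-8 by hand. The sketch below assumes well-formed input, and the name `utf8_code_point_sketch` and its `-1` error value are hypothetical choices for this illustration, not the library's API.

```php
<?php
// Hypothetical helper: decode the first UTF-8 character of $char to its code
// point. Returns -1 for empty input, a bad lead byte, or truncated input.
function utf8_code_point_sketch(string $char): int
{
    if ($char === '') {
        return -1;
    }

    $lead = ord($char[0]);

    if ($lead < 0x80) {
        return $lead;            // 1 byte:  0xxxxxxx
    } elseif (($lead & 0xE0) === 0xC0) {
        $extra = 1;              // 2 bytes: 110xxxxx
        $code  = $lead & 0x1F;
    } elseif (($lead & 0xF0) === 0xE0) {
        $extra = 2;              // 3 bytes: 1110xxxx
        $code  = $lead & 0x0F;
    } elseif (($lead & 0xF8) === 0xF0) {
        $extra = 3;              // 4 bytes: 11110xxx
        $code  = $lead & 0x07;
    } else {
        return -1;               // continuation byte or invalid lead byte
    }

    if (strlen($char) < $extra + 1) {
        return -1;
    }

    // Each continuation byte (10xxxxxx) contributes six payload bits.
    for ($i = 1; $i <= $extra; $i++) {
        $code = ($code << 6) | (ord($char[$i]) & 0x3F);
    }

    return $code;
}

printf("0x%x\n", utf8_code_point_sketch('§'));   // 0xa7
printf("U+%04X\n", utf8_code_point_sketch('☃')); // U+2603
```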
**Parameters:**
- `string $char`: The input character.

**Return:**
- `int`

--------
## chr_to_hex(int|string $char, string $prefix): string
Get hexadecimal code point (U+xxxx) of a UTF-8 encoded character.

EXAMPLE: `UTF8::chr_to_hex('§'); // U+00a7`

**Parameters:**
- `int|string $char`: The input character
- `string $prefix` [optional]

**Return:**
- `string`: The code point encoded as U+xxxx.

--------
## chunk_split(string $str, int $chunk_length, string $end): string
Splits a string into smaller chunks and multiple lines, using the specified line ending character.

EXAMPLE: `UTF8::chunk_split('ABC-ÖÄÜ-中文空白-κόσμε', 3); // "ABC\r\n-ÖÄ\r\nÜ-中\r\n文空白\r\n-κό\r\nσμε"`

**Parameters:**
- `T $str`: The original string to be split.
- `int<1, max> $chunk_length` [optional]: The maximum character length of a chunk.
- `string $end` [optional]: The character(s) to be inserted at the end of each chunk.

**Return:**
- `string`: The chunked string.

--------
## clean(string $str, bool $remove_bom, bool $normalize_whitespace, bool $normalize_msword, bool $keep_non_breaking_space, bool $replace_diamond_question_mark, bool $remove_invisible_characters, bool $remove_invisible_characters_url_encoded): string
Accepts a string and removes all non-UTF-8 characters from it + extras if needed.

EXAMPLE: `UTF8::clean("\xEF\xBB\xBF„Abcdef\xc2\xa0\x20…” — 😃 - Düsseldorf", true, true); // '„Abcdef …” — 😃 - Düsseldorf'`

**Parameters:**
- `string $str`: The string to be sanitized.
- `bool $remove_bom` [optional]: Set to true, if you need to remove UTF-BOM.
- `bool $normalize_whitespace` [optional]: Set to true, if you need to normalize the whitespace.
- `bool $normalize_msword` [optional]: Set to true, if you need to normalize MS Word chars e.g.: "…" => "..."
- `bool $keep_non_breaking_space` [optional]: Set to true, to keep non-breaking-spaces, in combination with $normalize_whitespace
- `bool $replace_diamond_question_mark` [optional]: Set to true, if you need to remove diamond question mark e.g.: "�"
- `bool $remove_invisible_characters` [optional]: Set to false, if you do not want to remove invisible characters e.g.: "\0"
- `bool $remove_invisible_characters_url_encoded` [optional]: Set to true, if you do not want to remove invisible url encoded characters e.g.: "%0B". WARNING: maybe contains false-positives e.g. aa%0Baa -> aaaa.

**Return:**
- `string`: A clean UTF-8 encoded string.

--------
## cleanup(string $str): string
Clean-up a string and show only printable UTF-8 chars at the end + fix UTF-8 encoding.

EXAMPLE: `UTF8::cleanup("\xEF\xBB\xBF„Abcdef\xc2\xa0\x20…” — 😃 - Düsseldorf", true, true); // '„Abcdef …” — 😃 - Düsseldorf'`

**Parameters:**
- `string $str`: The input string.

**Return:**
- `string`

--------
## codepoints(string|string[] $arg, bool $use_u_style): int[]|string[]
Accepts a string or an array of chars and returns an array of Unicode code points.

INFO: opposite to UTF8::string()

EXAMPLE:

    UTF8::codepoints('κöñ'); // array(954, 246, 241)
    // ... OR ...
    UTF8::codepoints('κöñ', true); // array('U+03ba', 'U+00f6', 'U+00f1')

**Parameters:**
- `T $arg`: A UTF-8 encoded string or an array of such chars.
- `bool $use_u_style`: If true, will return code points in U+xxxx format; by default, code points will be returned as integers.

**Return:**
- `int[]|string[]`: The array of code points: int[] for $use_u_style === false, string[] for $use_u_style === true

--------
## collapse_whitespace(string $str): string
Trims the string and replaces consecutive whitespace characters with a
single space. This includes tabs and newline characters, as well as
multibyte whitespace such as the thin space and ideographic space.

**Parameters:**
- `string $str`: The input string.

**Return:**
- `string`: A string with trimmed $str and condensed whitespace.

--------
## count_chars(string $str, bool $clean_utf8, bool $try_to_use_mb_functions): int[]
Returns count of characters used in a string.

EXAMPLE: `UTF8::count_chars('κaκbκc'); // array('κ' => 3, 'a' => 1, 'b' => 1, 'c' => 1)`
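
Counting characters rather than bytes comes down to splitting on code points before tallying. A minimal sketch with core PHP functions follows; `count_utf8_chars_sketch` is a hypothetical name, and the sketch ignores the optional cleaning flags listed below.

```php
<?php
// Hypothetical helper: count how often each UTF-8 character occurs.
function count_utf8_chars_sketch(string $str): array
{
    // Split into code points; the "u" modifier enables UTF-8 mode in PCRE.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    // array_count_values() uses the characters as keys, in order of first
    // appearance, and the number of occurrences as values.
    return $chars === false ? [] : array_count_values($chars);
}

print_r(count_utf8_chars_sketch('κaκbκc')); // κ => 3, a => 1, b => 1, c => 1
```

Note that PHP converts numeric string keys such as `'1'` to integers, so digits in the input end up as integer keys in this sketch.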
**Parameters:**
- `T $str`: The input string.
- `bool $clean_utf8` [optional]: Remove non UTF-8 chars from the string.
- `bool $try_to_use_mb_functions` [optional]: Set to false, if you don't want to use

**Return:**
- `int[]`: An associative array of characters as keys and their count as values.

--------
## css_identifier(string $str, string[] $filter, bool $strip_tags, bool $strtolower): string
Create a valid CSS identifier for e.g. "class"- or "id"-attributes.

EXAMPLE: `UTF8::css_identifier('123foo/bar!!!'); // _23foo-bar`

Copied from https://github.com/drupal/core/blob/8.8.x/lib/Drupal/Component/Utility/Html.php#L95

**Parameters:**
- `string $str`: INFO: if no identifier is given e.g. " " or "", we will create a unique string automatically
- `array $filter`
- `bool $strip_tags`
- `bool $strtolower`

**Return:**
- `string`

--------
## css_stripe_media_queries(string $str): string
Remove css media-queries.

**Parameters:**
- `string $str`

**Return:**
- `string`

--------
## ctype_loaded(): bool
Checks whether ctype is available on the server.

**Parameters:**
__nothing__

**Return:**
- `bool`: true if available, false otherwise

--------
## decimal_to_chr(int|string $int): string
Converts an int value into a UTF-8 character.

INFO: opposite to UTF8::string()

EXAMPLE: `UTF8::decimal_to_chr(931); // 'Σ'`

**Parameters:**
- `int|string $int`

**Return:**
- `string`

--------
## decode_mimeheader(string $str, string $encoding): false|string
Decodes a MIME header field.

**Parameters:**
- `string $str`
- `string $encoding` [optional]: Set the charset for e.g. "mb_" function

**Return:**
- `false|string`: A decoded MIME field on success, or false if an error occurs during the decoding.

--------
## emoji_decode(string $str, bool $use_reversible_string_mappings): string
Decodes a string which was encoded by "UTF8::emoji_encode()".

INFO: opposite to UTF8::emoji_encode()

EXAMPLE:

    UTF8::emoji_decode('foo CHARACTER_OGRE', false); // 'foo 👹'
    //
    UTF8::emoji_decode('foo _-_PORTABLE_UTF8_-_308095726_-_627590803_-_8FTU_ELBATROP_-_', true); // 'foo 👹'

**Parameters:**
- `string $str`: The input string.
- `bool $use_reversible_string_mappings` [optional]: When TRUE, we use a reversible string mapping between "emoji_encode" and "emoji_decode".

**Return:**
- `string`

--------
## emoji_encode(string $str, bool $use_reversible_string_mappings): string
Encode a string with emoji chars into a non-emoji string.

INFO: opposite to UTF8::emoji_decode()

EXAMPLE:

    UTF8::emoji_encode('foo 👹', false); // 'foo CHARACTER_OGRE'
    //
    UTF8::emoji_encode('foo 👹', true); // 'foo _-_PORTABLE_UTF8_-_308095726_-_627590803_-_8FTU_ELBATROP_-_'

**Parameters:**
- `string $str`: The input string
- `bool $use_reversible_string_mappings` [optional]: When TRUE, we use a reversible string mapping between "emoji_encode" and "emoji_decode".

**Return:**
- `string`

--------
## emoji_from_country_code(string $country_code_iso_3166_1): string
Convert any two-letter country code (ISO 3166-1) to the corresponding Emoji.

**Parameters:**
- `string $country_code_iso_3166_1`: e.g. DE

**Return:**
- `string`: Emoji or empty string on error.

--------
## encode(string $to_encoding, string $str, bool $auto_detect_the_from_encoding, string $from_encoding): string
Encode a string with a new charset-encoding.

INFO: This function will also try to fix broken / double encoding,
so you can call this function also on a UTF-8 string and you don't mess up the string.

EXAMPLE:

    UTF8::encode('ISO-8859-1', '-ABC-中文空白-'); // '-ABC-????-'
    //
    UTF8::encode('UTF-8', '-ABC-中文空白-'); // '-ABC-中文空白-'
    //
    UTF8::encode('HTML', '-ABC-中文空白-'); // '-ABC-&#20013;&#25991;&#31354;&#30333;-'
    //
    UTF8::encode('BASE64', '-ABC-中文空白-'); // 'LUFCQy3kuK3mlofnqbrnmb0t'

**Parameters:**
- `string $to_encoding`: e.g. 'UTF-16', 'UTF-8', 'ISO-8859-1', etc.
- `string $str`: The input string
- `bool $auto_detect_the_from_encoding` [optional]: Force the new encoding (we try to fix broken / double encoding for UTF-8), otherwise we auto-detect the current string-encoding
- `string $from_encoding` [optional]: e.g. 'UTF-16', 'UTF-8', 'ISO-8859-1', etc. An empty string will trigger the autodetect anyway.

**Return:**
- `string`

--------
## encode_mimeheader(string $str, string $from_charset, string $to_charset, string $transfer_encoding, string $linefeed, int $indent): false|string
**Parameters:**
- `string $str`
- `string $from_charset` [optional]: Set the input charset.
- `string $to_charset` [optional]: Set the output charset.
- `string $transfer_encoding` [optional]: Set the transfer encoding.
- `string $linefeed` [optional]: Set the used linefeed.
- `int<1, max> $indent` [optional]: Set the max length indent.

**Return:**
- `false|string`: An encoded MIME field on success, or false if an error occurs during the encoding.

--------
## extract_text(string $str, string $search, int|null $length, string $replacer_for_skipped_text, string $encoding): string
Create an extract from a sentence, so if the search-string was found, it tries to center in the output.

**Parameters:**
- `string $str`: The input string.
- `string $search`: The searched string.
- `int|null $length` [optional]: Default: null === text->length / 2
- `string $replacer_for_skipped_text` [optional]: Default: …
- `string $encoding` [optional]: Set the charset for e.g. "mb_" function

**Return:**
- `string`

--------
## file_get_contents(string $filename, bool $use_include_path, resource|null $context, int|null $offset, int|null $max_length, int $timeout, bool $convert_to_utf8, string $from_encoding): false|string
Reads entire file into a string.

EXAMPLE: `UTF8::file_get_contents('utf16le.txt'); // ...`

WARNING: Do not use UTF-8 Option ($convert_to_utf8) for binary files (e.g.: images) !!!

**Parameters:**
- `string $filename`: Name of the file to read.
- `bool $use_include_path` [optional]: Prior to PHP 5, this parameter is called use_include_path and is a bool. As of PHP 5 the FILE_USE_INCLUDE_PATH can be used to trigger include path search.
- `resource|null $context` [optional]: A valid context resource created with stream_context_create. If you don't need to use a custom context, you can skip this parameter by passing null.
- `int|null $offset` [optional]: The offset where the reading starts.
- `int<0, max>|null $max_length` [optional]: Maximum length of data read. The default is to read until end of file is reached.
- `int $timeout`: The time in seconds for the timeout.
- `bool $convert_to_utf8`: WARNING!!! Maybe you can't use this option for some files, because they used non default utf-8 chars. Binary files like images or pdf will not be converted.
- `string $from_encoding` [optional]: e.g. 'UTF-16', 'UTF-8', 'ISO-8859-1', etc. An empty string will trigger the autodetect anyway.

**Return:**
- `false|string`: The function returns the read data as string or false on failure.

--------
## file_has_bom(string $file_path): bool
Checks if a file starts with BOM (Byte Order Mark) character.

EXAMPLE: `UTF8::file_has_bom('utf8_with_bom.txt'); // true`
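
A BOM check only needs the first bytes of a file, so it can be sketched without reading the whole file. The helper name `file_starts_with_utf8_bom_sketch` is hypothetical, and the sketch assumes only the UTF-8 BOM is of interest. Whether the library also recognises UTF-16 or UTF-32 BOMs here is not stated in this section.

```php
<?php
// Hypothetical helper: true if the file begins with the UTF-8 BOM (EF BB BF).
function file_starts_with_utf8_bom_sketch(string $path): bool
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable or missing file
    }

    // Read at most three bytes; shorter files simply cannot match.
    $head = fread($handle, 3);
    fclose($handle);

    return $head === "\xEF\xBB\xBF";
}

// Usage with a throwaway file:
$tmp = tempnam(sys_get_temp_dir(), 'bom');
file_put_contents($tmp, "\xEF\xBB\xBF" . 'fòô');
var_dump(file_starts_with_utf8_bom_sketch($tmp)); // bool(true)
unlink($tmp);
```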
**Parameters:**
- `string $file_path`: Path to a valid file.

**Return:**
- `bool`: true if the file has BOM at the start, false otherwise

--------
## filter(array|object|string $var, int $normalization_form, string $leading_combining): mixed
Normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed.

EXAMPLE: `UTF8::filter(array("\xE9", 'à', 'a')); // array('é', 'à', 'a')`

**Parameters:**
- `TFilter $var`
- `int $normalization_form`
- `string $leading_combining`

**Return:**
- `mixed`

--------
## filter_input(int $type, string $variable_name, int $filter, int|int[]|null $options): mixed
"filter_input()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed.

Gets a specific external variable by name and optionally filters it.

EXAMPLE:

    // $_GET['foo'] = 'bar';
    UTF8::filter_input(INPUT_GET, 'foo', FILTER_UNSAFE_RAW); // 'bar'

**Parameters:**
- `int $type`: One of INPUT_GET, INPUT_POST, INPUT_COOKIE, INPUT_SERVER, or INPUT_ENV.
- `string $variable_name`: Name of a variable to get.
- `int $filter` [optional]: The ID of the filter to apply. The manual page lists the available filters.
- `int|int[]|null $options` [optional]: Associative array of options or bitwise disjunction of flags. If filter accepts options, flags can be provided in "flags" field of array.

**Return:**
- `mixed`: Value of the requested variable on success, FALSE if the filter fails, or NULL if the variable_name variable is not set. If the flag FILTER_NULL_ON_FAILURE is used, it returns FALSE if the variable is not set and NULL if the filter fails.

--------
## filter_input_array(int $type, array|null $definition, bool $add_empty): array|false|null
"filter_input_array()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed.

Gets external variables and optionally filters them.

EXAMPLE:

    // $_GET['foo'] = 'bar';
    UTF8::filter_input_array(INPUT_GET, array('foo' => FILTER_UNSAFE_RAW)); // array('foo' => 'bar')

**Parameters:**
- `int $type`: One of INPUT_GET, INPUT_POST, INPUT_COOKIE, INPUT_SERVER, or INPUT_ENV.
- `array|null $definition` [optional]: An array defining the arguments. A valid key is a string containing a variable name and a valid value is either a filter type, or an array optionally specifying the filter, flags and options. If the value is an array, valid keys are filter which specifies the filter type, flags which specifies any flags that apply to the filter, and options which specifies any options that apply to the filter. See the example below for a better understanding. This parameter can also be an integer holding a filter constant. Then all values in the input array are filtered by this filter.
- `bool $add_empty` [optional]: Add missing keys as NULL to the return value.

**Return:**
- `array|false|null`: An array containing the values of the requested variables on success, or FALSE on failure. An array value will be FALSE if the filter fails, or NULL if the variable is not set. Or if the flag FILTER_NULL_ON_FAILURE is used, it returns FALSE if the variable is not set and NULL if the filter fails.

--------
## filter_var(float|int|string|null $variable, int $filter, int|int[] $options): mixed
↑
"filter_var()"-wrapper with normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed.Filters a variable with a specified filter.
EXAMPLE:
UTF8::filter_var('-ABC-中文空白-', FILTER_VALIDATE_URL); // false
**Parameters:**
- `float|int|string|null $variable`
Value to filter.
- `int $filter [optional]`
The ID of the filter to apply. The
manual page lists the available filters.
- `int|int[] $options [optional]
Associative array of options or bitwise disjunction of flags. If filter
accepts options, flags can be provided in "flags" field of array. For
the "callback" filter, callable type should be passed. The
callback must accept one argument, the value to be filtered, and return
the value after filtering/sanitizing it.`
// for filters that accept options, use this format
$options = array(
'options' => array(
'default' => 3, // value to return if the filter fails
// other options here
'min_range' => 0
),
'flags' => FILTER_FLAG_ALLOW_OCTAL,
);
$var = filter_var('0755', FILTER_VALIDATE_INT, $options);
// for filters that only accept flags, you can pass them directly
$var = filter_var('oops', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);
// for filters that only accept flags, you can also pass them as an array
$var = filter_var('oops', FILTER_VALIDATE_BOOLEAN,
array('flags' => FILTER_NULL_ON_FAILURE));
// callback validate filter
function foo($value)
{
// Expected format: Surname, GivenNames
if (strpos($value, ", ") === false) return false;
list($surname, $givennames) = explode(", ", $value, 2);
$empty = (empty($surname) || empty($givennames));
$notstrings = (!is_string($surname) || !is_string($givennames));
if ($empty || $notstrings) {
return false;
} else {
return $value;
}
}
$var = filter_var('Doe, Jane Sue', FILTER_CALLBACK, array('options' => 'foo'));

**Return:**
- `mixed` The filtered data, or FALSE if the filter fails.

--------
## filter_var_array(array $data, array|int $definition, bool $add_empty): array|false|null
↑
"filter_var_array()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed.
Gets multiple variables and optionally filters them.
EXAMPLE:
$filters = [
'name' => ['filter' => FILTER_CALLBACK, 'options' => [UTF8::class, 'ucwords']],
'age' => ['filter' => FILTER_VALIDATE_INT, 'options' => ['min_range' => 1, 'max_range' => 120]],
'email' => FILTER_VALIDATE_EMAIL,
];
$data = [
'name' => 'κόσμε',
'age' => '18',
'email' => '[email protected]'
];
UTF8::filter_var_array($data, $filters, true); // ['name' => 'Κόσμε', 'age' => 18, 'email' => '[email protected]']
**Parameters:**
- `array $data`
An array with string keys containing the data to filter.
- `array|int $definition [optional]
An array defining the arguments. A valid key is a string
containing a variable name and a valid value is either a
filter type, or an
array optionally specifying the filter, flags and options.
If the value is an array, valid keys are filter
which specifies the filter type,
flags which specifies any flags that apply to the
filter, and options which specifies any options that
apply to the filter. See the example below for a better understanding.`
This parameter can also be an integer holding a filter constant. In that case, all values
in the input array are filtered by this filter.
- `bool $add_empty [optional]`
Add missing keys as NULL to the return value.**Return:**
- `array|false|null`
An array containing the values of the requested variables on success, or FALSE on failure.
An array value will be FALSE if the filter fails, or NULL if the variable is not
set.--------
## finfo_loaded(): bool
↑
Checks whether finfo is available on the server.**Parameters:**
__nothing__**Return:**
- `booltrue if available, false otherwise
`--------
## first_char(string $str, int $n, string $encoding): string
↑
Returns the first $n characters of the string.**Parameters:**
- `T $strThe input string.
`
- `int<1, max> $nNumber of characters to retrieve from the start.
`
- `string $encoding [optional]Set the charset for e.g. "mb_" function
`**Return:**
- `string`--------
## fits_inside(string $str, int $box_size): bool
↑
Check if the number of Unicode characters isn't greater than the specified integer.
EXAMPLE:
UTF8::fits_inside('κόσμε', 6); // true
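The check comes down to counting code points rather than bytes. A minimal sketch of that idea using only PHP's PCRE functions; `fits_inside_sketch` is a hypothetical helper for illustration and is not the library's implementation:

```php
<?php
// Hypothetical helper: counts code points with the /u modifier,
// so a two-byte character such as 'κ' counts as one.
function fits_inside_sketch(string $str, int $boxSize): bool
{
    // With /u, "." matches one code point; /s lets it match newlines too.
    $count = preg_match_all('/./su', $str);

    // preg_match_all() returns false on invalid UTF-8 input.
    return $count !== false && $count <= $boxSize;
}

var_dump(fits_inside_sketch('κόσμε', 4)); // five code points do not fit into four
```

Note that `strlen('κόσμε')` would report 10 here, because `strlen()` counts bytes.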
**Parameters:**
- `string $str the original string to be checked`
- `int $box_size the size in number of chars to be checked against string`**Return:**
- `boolTRUE if string is less than or equal to $box_size, FALSE otherwise.
`--------
## fix_simple_utf8(string $str): string
↑
Try to fix simple broken UTF-8 strings.
INFO: Take a look at "UTF8::fix_utf8()" if you need a more advanced fix for broken UTF-8 strings.
EXAMPLE:
UTF8::fix_simple_utf8('DÃ¼sseldorf'); // 'Düsseldorf'
If you received a UTF-8 string that was converted from Windows-1252 as if it were ISO-8859-1
(ignoring Windows-1252 chars from 80 to 9F), use this function to fix it.
See: http://en.wikipedia.org/wiki/Windows-1252**Parameters:**
- `string $strThe input string
`**Return:**
- `string`--------
## fix_utf8(string|string[] $str): string|string[]
↑
Fix a double (or multiple) encoded UTF8 string.
EXAMPLE:
UTF8::fix_utf8("F\xC3\x83\xC2\xA9d\xC3\x83\xC2\xA9ration"); // 'Fédération'
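To see what "double encoded" means at the byte level, the sketch below simulates the mistake (treating UTF-8 bytes as ISO-8859-1 and encoding them again) and then reverses one round of it. Both helper functions are hypothetical and only cover ISO-8859-1; the library also deals with Windows-1252 and repeated rounds, which this sketch does not attempt.

```php
<?php
// Simulate the bug: encode every byte as if it were an ISO-8859-1 character.
function latin1_bytes_to_utf8(string $bytes): string
{
    $out = '';
    $length = strlen($bytes);
    for ($i = 0; $i < $length; $i++) {
        $byte = ord($bytes[$i]);
        $out .= $byte < 0x80
            ? $bytes[$i]
            : chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F));
    }

    return $out;
}

// Undo one round: collapse each two-byte sequence for U+0080..U+00FF into one byte.
function utf8_to_latin1_bytes(string $utf8): string
{
    return (string) preg_replace_callback(
        '/[\xC2\xC3][\x80-\xBF]/',
        static function (array $m): string {
            return chr(((ord($m[0][0]) & 0x03) << 6) | (ord($m[0][1]) & 0x3F));
        },
        $utf8
    );
}

$good   = "F\xC3\xA9d\xC3\xA9ration";   // 'Fédération' as valid UTF-8
$broken = latin1_bytes_to_utf8($good);  // each 'é' is now C3 83 C2 A9
$fixed  = utf8_to_latin1_bytes($broken);

var_dump($fixed === $good);
```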
**Parameters:**
- `TFixUtf8 $str you can use a string or an array of strings`**Return:**
- `string|string[]` Will return the fixed input-"array" or the fixed input-"string".

--------
## getCharDirection(string $char): string
↑
Get the text direction of a specific character.
EXAMPLE:
UTF8::getCharDirection('ا'); // 'RTL'
**Parameters:**
- `string $char`**Return:**
- `string'RTL' or 'LTR'.
`--------
## getSupportInfo(string|null $key): mixed
↑
Check for php-support.**Parameters:**
- `string|null $key`**Return:**
- `mixed` The full support-"array" if $key === null; a bool value if $key is used and available; otherwise null.

--------
## getUrlParamFromArray(string $param, array $data): mixed
↑
Get data from an array via an array-like string.
EXAMPLE:
$array['foo'][123] = 'lall'; UTF8::getUrlParamFromArray('foo[123]', $array); // 'lall'
**Parameters:**
- `string $param`
- `array $data`**Return:**
- `mixed`--------
## get_file_type(string $str, array $fallback): array
↑
Warning: this method only works for some file-types (png, jpg)
if you need more supported types, please use e.g. "finfo"**Parameters:**
- `string $str`
- `array{ext: (null|string), mime: (null|string), type: (null|string)} $fallback`**Return:**
- `array{ext: (null|string), mime: (null|string), type: (null|string)}`--------
## get_random_string(int $length, string $possible_chars, string $encoding): string
↑**Parameters:**
- `int<1, max> $lengthLength of the random string.
`
- `T $possible_chars [optional]Characters string for the random selection.
`
- `string $encoding [optional]Set the charset for e.g. "mb_" function
`**Return:**
- `string`--------
## get_unique_string(int|string $extra_entropy, bool $use_md5): non-empty-string
↑**Parameters:**
- `int|string $extra_entropy [optional]Extra entropy via a string or int value.
`
- `bool $use_md5 [optional]Return the unique identifier as md5-hash? Default: true
`**Return:**
- `non-empty-string`--------
## has_lowercase(string $str): bool
↑
Returns true if the string contains a lower case char, false otherwise.**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not the string contains a lower case character.
`--------
## has_uppercase(string $str): bool
↑
Returns true if the string contains an upper case char, false otherwise.**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not the string contains an upper case character.
`--------
## has_whitespace(string $str): bool
↑
Returns true if the string contains whitespace, false otherwise.**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not the string contains whitespace.
`--------
## hex_to_chr(string $hexdec): string
↑
Converts a hexadecimal value into a UTF-8 character.INFO: opposite to UTF8::chr_to_hex()
EXAMPLE:
UTF8::hex_to_chr('U+00a7'); // '§'
**Parameters:**
- `string $hexdecThe hexadecimal value.
`**Return:**
- `stringOne single UTF-8 character.
`--------
## hex_to_int(string $hexdec): false|int
↑
Converts hexadecimal U+xxxx code point representation to integer.INFO: opposite to UTF8::int_to_hex()
EXAMPLE:
UTF8::hex_to_int('U+00f1'); // 241
**Parameters:**
- `string $hexdecThe hexadecimal code point representation.
`**Return:**
- `false|intThe code point, or false on failure.
`--------
## html_encode(string $str, bool $keep_ascii_chars, string $encoding): string
↑
Converts a UTF-8 string to a series of HTML numbered entities.
INFO: opposite to UTF8::html_decode()
EXAMPLE:
`UTF8::html_encode('中文空白'); // '&#20013;&#25991;&#31354;&#30333;'`
**Parameters:**
- `T $strThe Unicode string to be encoded as numbered entities.
`
- `bool $keep_ascii_chars [optional]Keep ASCII chars.
`
- `string $encoding [optional]Set the charset for e.g. "mb_" function
`**Return:**
- `stringHTML numbered entities.
`--------
## html_entity_decode(string $str, int|null $flags, string $encoding): string
↑
UTF-8 version of html_entity_decode().
The reason we are not using html_entity_decode() by itself is that,
while it is not technically correct to leave out the semicolon
at the end of an entity, most browsers will still interpret the entity
correctly. html_entity_decode() does not convert entities without
semicolons, so we are left with our own little solution here. Bummer.
Convert all HTML entities to their applicable characters.
INFO: opposite to UTF8::html_encode()
EXAMPLE:
`UTF8::html_entity_decode('&#20013;&#25991;&#31354;&#30333;'); // '中文空白'`
**Parameters:**
- `T $str`
The input string.
- `int|null $flags [optional]`
A bitmask of one or more of the following flags, which specify how to handle quotes
and which document type to use. The default is ENT_COMPAT | ENT_HTML401.
Available flag constants:

| Constant Name | Description |
| --- | --- |
| ENT_COMPAT | Will convert double-quotes and leave single-quotes alone. |
| ENT_QUOTES | Will convert both double and single quotes. |
| ENT_NOQUOTES | Will leave both double and single quotes unconverted. |
| ENT_HTML401 | Handle code as HTML 4.01. |
| ENT_XML1 | Handle code as XML 1. |
| ENT_XHTML | Handle code as XHTML. |
| ENT_HTML5 | Handle code as HTML 5. |

- `string $encoding [optional]Set the charset for e.g. "mb_" function
`**Return:**
- `stringThe decoded string.
`--------
## html_escape(string $str, string $encoding): string
↑
Create an escaped HTML version of the string via "UTF8::htmlspecialchars()".**Parameters:**
- `string $str`
- `string $encoding [optional]Set the charset for e.g. "mb_" function
`**Return:**
- `string`--------
## html_stripe_empty_tags(string $str): string
↑
Remove empty HTML tags.
**Parameters:**
- `string $str`**Return:**
- `string`--------
## htmlentities(string $str, int $flags, string $encoding, bool $double_encode): string
↑
Convert all applicable characters to HTML entities: UTF-8 version of htmlentities().
EXAMPLE:
`UTF8::htmlentities('<白-öäü>'); // '&lt;&#30333;-&ouml;&auml;&uuml;&gt;'`
**Parameters:**
- `string $str`
The input string.
- `int $flags [optional]`
A bitmask of one or more of the following flags, which specify how to handle
quotes, invalid code unit sequences and the used document type. The default is
ENT_COMPAT | ENT_HTML401.
Available flag constants:

| Constant Name | Description |
| --- | --- |
| ENT_COMPAT | Will convert double-quotes and leave single-quotes alone. |
| ENT_QUOTES | Will convert both double and single quotes. |
| ENT_NOQUOTES | Will leave both double and single quotes unconverted. |
| ENT_IGNORE | Silently discard invalid code unit sequences instead of returning an empty string. Using this flag is discouraged as it may have security implications. |
| ENT_SUBSTITUTE | Replace invalid code unit sequences with a Unicode Replacement Character U+FFFD (UTF-8) or `&#xFFFD;` (otherwise) instead of returning an empty string. |
| ENT_DISALLOWED | Replace invalid code points for the given document type with a Unicode Replacement Character U+FFFD (UTF-8) or `&#xFFFD;` (otherwise) instead of leaving them as is. This may be useful, for instance, to ensure the well-formedness of XML documents with embedded external content. |
| ENT_HTML401 | Handle code as HTML 4.01. |
| ENT_XML1 | Handle code as XML 1. |
| ENT_XHTML | Handle code as XHTML. |
| ENT_HTML5 | Handle code as HTML 5. |

- `string $encoding [optional]`
Like htmlspecialchars,
htmlentities takes an optional third argument
encoding which defines encoding used in
conversion.
Although this argument is technically optional, you are highly
encouraged to specify the correct value for your code.
- `bool $double_encode [optional]`
When double_encode is turned off PHP will not
encode existing html entities. The default is to convert everything.**Return:**
- `string`
The encoded string.
If the input string contains an invalid code unit
sequence within the given encoding an empty string
will be returned, unless either the ENT_IGNORE or
ENT_SUBSTITUTE flags are set.--------
## htmlspecialchars(string $str, int $flags, string $encoding, bool $double_encode): string
↑
Convert only special characters to HTML entities: UTF-8 version of htmlspecialchars().
INFO: Take a look at "UTF8::htmlentities()"
EXAMPLE:
`UTF8::htmlspecialchars('<白-öäü>'); // '&lt;白-öäü&gt;'`
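The split between "only special characters" and "all applicable characters" mirrors PHP's native functions. The sketch below calls the native `htmlspecialchars()` and `htmlentities()` directly, not the UTF8 class, so the library's wrappers may add further handling on top of what is shown here.

```php
<?php
$input = '<白-öäü>';

// Only &, <, >, and quotes are converted; everything else is left untouched.
$special = htmlspecialchars($input, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Every character with a named HTML 4.01 entity is converted as well.
$entities = htmlentities($input, ENT_QUOTES | ENT_HTML401, 'UTF-8');

echo $special, "\n";  // &lt;白-öäü&gt;
echo $entities, "\n"; // &lt;白-&ouml;&auml;&uuml;&gt;
```

Passing the encoding explicitly keeps the result independent of the `default_charset` ini setting.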
**Parameters:**
- `T $str`
The string being converted.
- `int $flags [optional]`
A bitmask of one or more of the following flags, which specify how to handle
quotes, invalid code unit sequences and the used document type. The default is
ENT_COMPAT | ENT_HTML401.
Available flag constants:

| Constant Name | Description |
| --- | --- |
| ENT_COMPAT | Will convert double-quotes and leave single-quotes alone. |
| ENT_QUOTES | Will convert both double and single quotes. |
| ENT_NOQUOTES | Will leave both double and single quotes unconverted. |
| ENT_IGNORE | Silently discard invalid code unit sequences instead of returning an empty string. Using this flag is discouraged as it may have security implications. |
| ENT_SUBSTITUTE | Replace invalid code unit sequences with a Unicode Replacement Character U+FFFD (UTF-8) or `&#xFFFD;` (otherwise) instead of returning an empty string. |
| ENT_DISALLOWED | Replace invalid code points for the given document type with a Unicode Replacement Character U+FFFD (UTF-8) or `&#xFFFD;` (otherwise) instead of leaving them as is. This may be useful, for instance, to ensure the well-formedness of XML documents with embedded external content. |
| ENT_HTML401 | Handle code as HTML 4.01. |
| ENT_XML1 | Handle code as XML 1. |
| ENT_XHTML | Handle code as XHTML. |
| ENT_HTML5 | Handle code as HTML 5. |

- `string $encoding [optional]
Defines encoding used in conversion.`
For the purposes of this function, the encodings
ISO-8859-1, ISO-8859-15,
UTF-8, cp866,
cp1251, cp1252, and
KOI8-R are effectively equivalent, provided the
string itself is valid for the encoding, as
the characters affected by htmlspecialchars occupy
the same positions in all of these encodings.
- `bool $double_encode [optional]`
When double_encode is turned off PHP will not
encode existing html entities, the default is to convert everything.**Return:**
- `stringThe converted string.
`
If the input string contains an invalid code unit
sequence within the given encoding an empty string
will be returned, unless either the ENT_IGNORE or
ENT_SUBSTITUTE flags are set.--------
## iconv_loaded(): bool
↑
Checks whether iconv is available on the server.**Parameters:**
__nothing__**Return:**
- `booltrue if available, false otherwise
`--------
## int_to_hex(int $int, string $prefix): string
↑
Converts Integer to hexadecimal U+xxxx code point representation.INFO: opposite to UTF8::hex_to_int()
EXAMPLE:
UTF8::int_to_hex(241); // 'U+00f1'
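The `U+xxxx` notation is a zero-padded hexadecimal rendering of the code point. A minimal sketch of the round trip with native functions; both helper names are hypothetical, and the accepted input format is an assumption that is narrower than what the library may accept:

```php
<?php
// Hypothetical helper: format a code point as a prefixed, zero-padded hex string.
function int_to_hex_sketch(int $codePoint, string $prefix = 'U+'): string
{
    return $prefix . sprintf('%04x', $codePoint);
}

/**
 * Hypothetical helper: parse "U+00f1" (or a bare "00f1") back into an integer.
 *
 * @return int|false
 */
function hex_to_int_sketch(string $hex)
{
    if (preg_match('/^(?:U\+)?([0-9A-Fa-f]{1,6})$/', $hex, $match) !== 1) {
        return false;
    }

    return (int) hexdec($match[1]);
}

echo int_to_hex_sketch(241), "\n";        // U+00f1
var_dump(hex_to_int_sketch('U+00f1'));    // int(241)
```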
**Parameters:**
- `int $intThe integer to be converted to hexadecimal code point.
`
- `string $prefix [optional]`**Return:**
- `string the code point, or empty string on failure`--------
## intlChar_loaded(): bool
↑
Checks whether intl-char is available on the server.**Parameters:**
__nothing__**Return:**
- `booltrue if available, false otherwise
`--------
## intl_loaded(): bool
↑
Checks whether intl is available on the server.**Parameters:**
__nothing__**Return:**
- `booltrue if available, false otherwise
`--------
## is_alpha(string $str): bool
↑
Returns true if the string contains only alphabetic chars, false otherwise.**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not $str contains only alphabetic chars.
`--------
## is_alphanumeric(string $str): bool
↑
Returns true if the string contains only alphabetic and numeric chars, false otherwise.**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not $str contains only alphanumeric chars.
`--------
## is_ascii(string $str): bool
↑
Checks if a string is 7 bit ASCII.EXAMPLE:
UTF8::is_ascii('白'); // false
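"7 bit ASCII" means that no byte has its high bit set. A one-line sketch of that test with a byte-level regular expression; `is_ascii_sketch` is a hypothetical name, not the library's code:

```php
<?php
// Hypothetical helper: true when no byte falls outside 0x00..0x7F.
function is_ascii_sketch(string $str): bool
{
    return preg_match('/[^\x00-\x7F]/', $str) === 0;
}

var_dump(is_ascii_sketch('abc')); // bool(true)
var_dump(is_ascii_sketch('白'));  // bool(false)
```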
**Parameters:**
- `string $strThe string to check.
`**Return:**
- `bool`
true if it is ASCII
false otherwise--------
## is_base64(string|null $str, bool $empty_string_is_valid): bool
↑
Returns true if the string is base64 encoded, false otherwise.EXAMPLE:
UTF8::is_base64('4KSu4KWL4KSo4KS/4KSa'); // true
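One common way to test for base64 is a strict decode followed by a re-encode comparison. The sketch below shows that approach with native functions; it is an illustration under that assumption, and the library's own check may differ in detail (for example in how it treats the empty string).

```php
<?php
// Hypothetical helper: strict decode, then confirm the round trip is lossless.
function is_base64_sketch(string $str): bool
{
    if ($str === '') {
        return false;
    }

    // Strict mode rejects characters outside the base64 alphabet.
    $decoded = base64_decode($str, true);

    return $decoded !== false && base64_encode($decoded) === $str;
}

var_dump(is_base64_sketch('4KSu4KWL4KSo4KS/4KSa')); // bool(true)
var_dump(is_base64_sketch('not base64!'));          // bool(false)
```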
**Parameters:**
- `string|null $strThe input string.
`
- `bool $empty_string_is_valid [optional]Is an empty string valid base64 or not?
`**Return:**
- `boolWhether or not $str is base64 encoded.
`--------
## is_binary(int|string $input, bool $strict): bool
↑
Check if the input is binary (this looks like a hack).
EXAMPLE:
UTF8::is_binary(01); // true
**Parameters:**
- `int|string $input`
- `bool $strict`**Return:**
- `bool`--------
## is_binary_file(string $file): bool
↑
Check if the file is binary.
EXAMPLE:
UTF8::is_binary_file('./utf32.txt'); // true
**Parameters:**
- `string $file`**Return:**
- `bool`--------
## is_blank(string $str): bool
↑
Returns true if the string contains only whitespace chars, false otherwise.**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not $str contains only whitespace characters.
`--------
## is_bom(string $str): bool
↑
Checks if the given string is equal to any "Byte Order Mark".
WARNING: Use "UTF8::string_has_bom()" if you want to check whether a string contains a BOM.
EXAMPLE:
UTF8::is_bom("\xef\xbb\xbf"); // true
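The comparison is against the whole string, not a prefix. A minimal sketch with the standard UTF-8, UTF-16 and UTF-32 byte order marks; the constant and function names are hypothetical, and the library may recognise additional representations:

```php
<?php
// Standard byte order marks as raw byte strings.
const BOM_SKETCH_LIST = [
    "\xEF\xBB\xBF",     // UTF-8
    "\xFF\xFE",         // UTF-16 LE
    "\xFE\xFF",         // UTF-16 BE
    "\xFF\xFE\x00\x00", // UTF-32 LE
    "\x00\x00\xFE\xFF", // UTF-32 BE
];

// Hypothetical helper: true only when the input is exactly one of the BOMs.
function is_bom_sketch(string $str): bool
{
    return in_array($str, BOM_SKETCH_LIST, true);
}

var_dump(is_bom_sketch("\xEF\xBB\xBF"));      // bool(true)
var_dump(is_bom_sketch("\xEF\xBB\xBFhello")); // bool(false)
```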
**Parameters:**
- `string $strThe input string.
`**Return:**
- `bool` true if $str is a Byte Order Mark, false otherwise.

--------
## is_empty(array|float|int|string $str): bool
↑
Determine whether the string is considered to be empty.A variable is considered empty if it does not exist or if its value equals FALSE.
empty() does not generate a warning if the variable does not exist.**Parameters:**
- `array|float|int|string $str`**Return:**
- `boolWhether or not $str is empty().
`--------
## is_hexadecimal(string $str): bool
↑
Returns true if the string contains only hexadecimal chars, false otherwise.**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not $str contains only hexadecimal chars.
`--------
## is_html(string $str): bool
↑
Check if the string contains any HTML tags.
EXAMPLE:
`UTF8::is_html('<b>lall</b>'); // true`
**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not $str contains html elements.
`--------
## is_json(string $str, bool $only_array_or_object_results_are_valid): bool
↑
Try to check if "$str" is a JSON-string.EXAMPLE:
UTF8::is_json('{"array":[1,"¥","ä"]}'); // true
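A typical way to implement such a check is to decode and inspect `json_last_error()`. The sketch below follows that approach with native functions; `is_json_sketch` and its parameter name are hypothetical, and the library's own heuristics may be stricter.

```php
<?php
// Hypothetical helper: decode, check the error state, optionally require array/object.
function is_json_sketch(string $str, bool $onlyArrayOrObject = true): bool
{
    if ($str === '') {
        return false;
    }

    $decoded = json_decode($str);
    if (json_last_error() !== JSON_ERROR_NONE) {
        return false;
    }

    // Scalars such as "123" are valid JSON documents but often not what callers expect.
    return !$onlyArrayOrObject || is_array($decoded) || is_object($decoded);
}

var_dump(is_json_sketch('{"array":[1,"¥","ä"]}')); // bool(true)
var_dump(is_json_sketch('123'));                   // bool(false)
var_dump(is_json_sketch('123', false));            // bool(true)
```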
**Parameters:**
- `string $strThe input string.
`
- `bool $only_array_or_object_results_are_valid [optional]` Only arrays and objects are valid JSON results.

**Return:**
- `boolWhether or not the $str is in JSON format.
`--------
## is_lowercase(string $str): bool
↑
Returns true if the string contains only lower case chars, false otherwise.**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not $str contains only lowercase chars.
`--------
## is_printable(string $str, bool $ignore_control_characters): bool
↑
Returns true if the string contains only printable (non-invisible) chars, false otherwise.**Parameters:**
- `string $strThe input string.
`
- `bool $ignore_control_characters [optional]Ignore control characters like [LRM] or [LSEP].
`**Return:**
- `boolWhether or not $str contains only printable (non-invisible) chars.
`--------
## is_punctuation(string $str): bool
↑
Returns true if the string contains only punctuation chars, false otherwise.**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not $str contains only punctuation chars.
`--------
## is_serialized(string $str): bool
↑
Returns true if the string is serialized, false otherwise.**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not $str is serialized.
`--------
## is_uppercase(string $str): bool
↑
Returns true if the string contains only upper case chars, false
otherwise.**Parameters:**
- `string $strThe input string.
`**Return:**
- `boolWhether or not $str contains only upper case characters.
`--------
## is_url(string $url, bool $disallow_localhost): bool
↑
Check if $url is a valid URL.**Parameters:**
- `string $url`
- `bool $disallow_localhost`**Return:**
- `bool`--------
## is_utf8(int|string|string[]|null $str, bool $strict): bool
↑
Checks whether the passed input contains only byte sequences that appear valid UTF-8.EXAMPLE:
UTF8::is_utf8(['Iñtërnâtiônàlizætiøn', 'foo']); // true
//
UTF8::is_utf8(["Iñtërnâtiônàlizætiøn\xA0\xA1", 'bar']); // false**Parameters:**
- `int|string|string[]|null $strThe input to be checked.
`
- `bool $strictCheck also if the string is not UTF-16 or UTF-32.
`**Return:**
- `bool`--------
## is_utf16(string $str, bool $check_if_string_is_binary): false|int
↑
Check if the string is UTF-16.EXAMPLE:
UTF8::is_utf16(file_get_contents('utf-16-le.txt')); // 1
//
UTF8::is_utf16(file_get_contents('utf-16-be.txt')); // 2
//
UTF8::is_utf16(file_get_contents('utf-8.txt')); // false**Parameters:**
- `string $strThe input string.
`
- `bool $check_if_string_is_binary`**Return:**
- `false|int false if it is not UTF-16,
1 for UTF-16LE,
2 for UTF-16BE`--------
## is_utf32(string $str, bool $check_if_string_is_binary): false|int
↑
Check if the string is UTF-32.EXAMPLE:
UTF8::is_utf32(file_get_contents('utf-32-le.txt')); // 1
//
UTF8::is_utf32(file_get_contents('utf-32-be.txt')); // 2
//
UTF8::is_utf32(file_get_contents('utf-8.txt')); // false**Parameters:**
- `string $strThe input string.
`
- `bool $check_if_string_is_binary`**Return:**
- `false|int false if it is not UTF-32,
1 for UTF-32LE,
2 for UTF-32BE`--------
## json_decode(string $json, bool $assoc, int $depth, int $options): mixed
↑
(PHP 5 >= 5.2.0, PECL json >= 1.2.0)
Decodes a JSON string.
EXAMPLE:
UTF8::json_decode('[1,"\u00a5","\u00e4"]'); // array(1, '¥', 'ä')
**Parameters:**
- `string $json`
The JSON string being decoded.
This function only works with UTF-8 encoded strings.
PHP implements a superset of JSON - it will also encode and decode scalar types and NULL. The JSON standard
only supports these values when they are nested inside an array or an object.
- `bool $assoc [optional]`
When TRUE, returned objects will be converted into
associative arrays.
- `int $depth [optional]`
User specified recursion depth.
- `int $options [optional]`
Bitmask of JSON decode options. Currently only
JSON_BIGINT_AS_STRING
is supported (default is to cast large integers as floats)**Return:**
- `mixed` The value encoded in JSON in the appropriate PHP type. Values true, false and
null (case-insensitive) are returned as TRUE, FALSE and NULL respectively.
NULL is returned if the JSON cannot be decoded or if the encoded data
is deeper than the recursion limit.

--------
## json_encode(mixed $value, int $options, int $depth): false|string
↑
(PHP 5 >= 5.2.0, PECL json >= 1.2.0)
Returns the JSON representation of a value.EXAMPLE:
UTF8::json_encode(array(1, '¥', 'ä')); // '[1,"\u00a5","\u00e4"]'
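The `\u00a5` escapes in the example come from PHP's default JSON encoding, which escapes every non-ASCII character. The sketch below uses the native `json_encode()` and `json_decode()` (not the UTF8 wrappers) to show the default output, the `JSON_UNESCAPED_UNICODE` alternative, and the round trip.

```php
<?php
$value = [1, '¥', 'ä'];

// Default: non-ASCII characters are emitted as \uXXXX escapes.
$escaped = json_encode($value);

// With JSON_UNESCAPED_UNICODE the UTF-8 characters are kept as-is.
$unescaped = json_encode($value, JSON_UNESCAPED_UNICODE);

// Both forms decode back to the same PHP array.
$roundTrip = json_decode($escaped, true);

echo $escaped, "\n";   // [1,"\u00a5","\u00e4"]
echo $unescaped, "\n"; // [1,"¥","ä"]
```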
**Parameters:**
- `mixed $value`
The value being encoded. Can be any type except
a resource.
All string data must be UTF-8 encoded.
PHP implements a superset of JSON - it will also encode and decode scalar types and NULL. The JSON standard
only supports these values when they are nested inside an array or an object.
- `int $options [optional]`
Bitmask consisting of JSON_HEX_QUOT,
JSON_HEX_TAG,
JSON_HEX_AMP,
JSON_HEX_APOS,
JSON_NUMERIC_CHECK,
JSON_PRETTY_PRINT,
JSON_UNESCAPED_SLASHES,
JSON_FORCE_OBJECT,
JSON_UNESCAPED_UNICODE. The behaviour of these
constants is described on
the JSON constants page.
- `int $depth [optional]`
Set the maximum depth. Must be greater than zero.**Return:**
- `false|string` A JSON encoded string on success or FALSE on failure.

--------
## json_loaded(): bool
↑
Checks whether JSON is available on the server.**Parameters:**
__nothing__**Return:**
- `booltrue if available, false otherwise
`--------
## lcfirst(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string
↑
Makes string's first char lowercase.EXAMPLE:
UTF8::lcfirst('ÑTËRNÂTIÔNÀLIZÆTIØN'); // ñTËRNÂTIÔNÀLIZÆTIØN
**Parameters:**
- `string $strThe input string
`
- `string $encoding [optional]Set the charset for e.g. "mb_" function
`
- `bool $clean_utf8 [optional]Remove non UTF-8 chars from the string.
`
- `string|null $lang [optional]` Set the language for special cases: az, el, lt, tr
- `bool $try_to_keep_the_string_length [optional]` true === try to keep the string length: e.g. ẞ -> ß

**Return:**
- `stringThe resulting string.
`--------
## lcwords(string $str, string[] $exceptions, string $char_list, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string
↑
Lowercase for all words in the string.**Parameters:**
- `string $strThe input string.
`
- `string[] $exceptions [optional]Exclusion for some words.
`
- `string $char_list [optional]` Additional chars that belong to words and do not start a new word.
- `string $encoding [optional]Set the charset.
`
- `bool $clean_utf8 [optional]Remove non UTF-8 chars from the string.
`
- `string|null $lang [optional]` Set the language for special cases: az, el, lt, tr
- `bool $try_to_keep_the_string_length [optional]` true === try to keep the string length: e.g. ẞ -> ß

**Return:**
- `string`--------
## levenshtein(string $str1, string $str2, int $insertionCost, int $replacementCost, int $deletionCost): int
↑
Calculate Levenshtein distance between two strings.For better performance, in a real application with a single input string
matched against many strings from a database, you will probably want to pre-
encode the input only once and use \levenshtein().Source: https://github.com/KEINOS/mb_levenshtein
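PHP's native `levenshtein()` compares bytes, so one multi-byte character can count as several edits. The sketch below is a plain dynamic-programming version over code points with unit costs. It is a hypothetical helper for illustration, assumes valid UTF-8 input, and is not the implementation referenced above.

```php
<?php
// Hypothetical helper: Levenshtein distance over code points, all costs equal to 1.
function mb_levenshtein_sketch(string $a, string $b): int
{
    // Split into code points; the /u modifier requires valid UTF-8.
    $s = preg_split('//u', $a, -1, PREG_SPLIT_NO_EMPTY);
    $t = preg_split('//u', $b, -1, PREG_SPLIT_NO_EMPTY);

    // $prev[$j] holds the distance between the processed prefix of $s and the first $j chars of $t.
    $prev = range(0, count($t));
    foreach ($s as $i => $charA) {
        $curr = [$i + 1];
        foreach ($t as $j => $charB) {
            $cost = $charA === $charB ? 0 : 1;
            $curr[] = min(
                $prev[$j + 1] + 1, // deletion
                $curr[$j] + 1,     // insertion
                $prev[$j] + $cost  // replacement
            );
        }
        $prev = $curr;
    }

    return $prev[count($t)];
}

var_dump(mb_levenshtein_sketch('ä', 'a')); // int(1)
var_dump(levenshtein('ä', 'a'));           // int(2): the native function counts bytes
```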
**Parameters:**
- `string $str1One of the strings being evaluated for Levenshtein distance.
`
- `string $str2One of the strings being evaluated for Levenshtein distance.
`
- `int $insertionCost [optional]Defines the cost of insertion.
`
- `int $replacementCost [optional]Defines the cost of replacement.
`
- `int $deletionCost [optional]Defines the cost of deletion.
`**Return:**
- `int`--------
## ltrim(string $str, string|null $chars): string
↑
Strip whitespace or other characters from the beginning of a UTF-8 string.EXAMPLE:
UTF8::ltrim(' 中文空白 '); // '中文空白 '
**Parameters:**
- `string $strThe string to be trimmed
`
- `string|null $charsOptional characters to be stripped
`**Return:**
- `string the string with unwanted characters stripped from the left`--------
## max(string|string[] $arg): string|null
↑
Returns the UTF-8 character with the maximum code point in the given data.
EXAMPLE:
UTF8::max('abc-äöü-中文空白'); // '空'
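Finding the character with the highest or lowest code point does not require decoding: for valid UTF-8, byte-wise comparison orders characters by code point. A sketch built on that property; both helper names are hypothetical and this is not the library's implementation.

```php
<?php
/**
 * Hypothetical helper: split into code points and sort them byte-wise.
 *
 * @return string[]
 */
function sorted_chars_sketch(string $str): array
{
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return []; // invalid UTF-8
    }

    // For valid UTF-8, strcmp() order equals code point order.
    usort($chars, 'strcmp');

    return $chars;
}

// Hypothetical helper: character with the highest code point, or null for empty input.
function max_char_sketch(string $str): ?string
{
    $chars = sorted_chars_sketch($str);

    return $chars === [] ? null : $chars[count($chars) - 1];
}

// Hypothetical helper: character with the lowest code point, or null for empty input.
function min_char_sketch(string $str): ?string
{
    $chars = sorted_chars_sketch($str);

    return $chars === [] ? null : $chars[0];
}

echo max_char_sketch('abc-äöü-中文空白'), "\n"; // 空 (U+7A7A)
echo min_char_sketch('abc-äöü-中文空白'), "\n"; // - (U+002D)
```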
**Parameters:**
- `string|string[] $argA UTF-8 encoded string or an array of such strings.
`**Return:**
- `string|null` The character with the highest code point, or null on failure or empty input.

--------
## max_chr_width(string $str): int
↑
Calculates and returns the maximum number of bytes taken by any
UTF-8 encoded character in the given string.EXAMPLE:
UTF8::max_chr_width('Intërnâtiônàlizætiøn'); // 2
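Since `strlen()` counts bytes, the width of each character is simply its `strlen()` once the string has been split into code points. A minimal sketch of that calculation; `max_chr_width_sketch` is a hypothetical name:

```php
<?php
// Hypothetical helper: largest byte length among the string's code points.
function max_chr_width_sketch(string $str): int
{
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false || $chars === []) {
        return 0; // invalid UTF-8 or empty input
    }

    return max(array_map('strlen', $chars));
}

var_dump(max_chr_width_sketch('Intërnâtiônàlizætiøn')); // int(2)
var_dump(max_chr_width_sketch('白'));                   // int(3)
```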
**Parameters:**
- `string $strThe original Unicode string.
`**Return:**
- `intMax byte lengths of the given chars.
`--------
## mbstring_loaded(): bool
↑
Checks whether mbstring is available on the server.**Parameters:**
__nothing__**Return:**
- `booltrue if available, false otherwise
`--------
## min(string|string[] $arg): string|null
↑
Returns the UTF-8 character with the minimum code point in the given data.EXAMPLE:
UTF8::min('abc-äöü-中文空白'); // '-'
**Parameters:**
- `string|string[] $arg A UTF-8 encoded string or an array of such strings.`**Return:**
- `string|null` The character with the lowest code point, or null on failure or empty input.

--------
## normalize_encoding(mixed $encoding, mixed $fallback): mixed|string
↑
Normalize the encoding-"name" input.EXAMPLE:
UTF8::normalize_encoding('UTF8'); // 'UTF-8'
**Parameters:**
- `mixed $encodinge.g.: ISO, UTF8, WINDOWS-1251 etc.
`
- `string|TNormalizeEncodingFallback $fallbacke.g.: UTF-8
`**Return:**
- `mixed|string` e.g.: ISO-8859-1, UTF-8, WINDOWS-1251 etc.
Will return an empty string as fallback (by default).

--------
## normalize_line_ending(string $str, string|string[] $replacer): string
↑
Standardize line ending to unix-like.**Parameters:**
- `string $strThe input string.
`
- `string|string[] $replacer` The replacer char e.g. "\n" (Linux) or "\r\n" (Windows). You can also use \PHP_EOL here.

**Return:**
- `stringA string with normalized line ending.
`--------
## normalize_msword(string $str): string
↑
Normalize some MS Word special characters.EXAMPLE:
UTF8::normalize_msword('„Abcdef…”'); // '"Abcdef..."'
**Parameters:**
- `string $strThe string to be normalized.
`**Return:**
- `stringA string with normalized characters for commonly used chars in Word documents.
`--------
## normalize_whitespace(string $str, bool $keep_non_breaking_space, bool $keep_bidi_unicode_controls, bool $normalize_control_characters): string
↑
Normalize the whitespace.EXAMPLE:
UTF8::normalize_whitespace("abc-\xc2\xa0-öäü-\xe2\x80\xaf-\xE2\x80\xAC", true); // "abc-\xc2\xa0-öäü- -"
**Parameters:**
- `string $strThe string to be normalized.
`
- `bool $keep_non_breaking_space [optional]Set to true, to keep non-breaking-spaces.
`
- `bool $keep_bidi_unicode_controls [optional]` Set to true, to keep non-printable (for the web) bidirectional text chars.
- `bool $normalize_control_characters [optional]Set to true, to convert e.g. LINE-, PARAGRAPH-SEPARATOR with "\n" and LINE TABULATION with "\t".
`**Return:**
- `stringA string with normalized whitespace.
`--------
## ord(string $chr, string $encoding): int
↑
Calculates Unicode code point of the given UTF-8 encoded character.INFO: opposite to UTF8::chr()
EXAMPLE:
UTF8::ord('☃'); // 0x2603
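The code point is recovered by stripping the UTF-8 marker bits from each byte and joining the remaining payload bits. The sketch below decodes one character by hand; `utf8_ord_sketch` is a hypothetical helper that does not validate continuation bytes or reject overlong forms, which a production implementation would need to do.

```php
<?php
// Hypothetical helper: decode the first UTF-8 character of $chr into its code point.
function utf8_ord_sketch(string $chr): int
{
    $len = strlen($chr);
    if ($len === 0) {
        return 0;
    }

    $b0 = ord($chr[0]);
    if ($b0 < 0x80) {
        return $b0; // 1 byte: 0xxxxxxx
    }
    if (($b0 & 0xE0) === 0xC0 && $len >= 2) {
        // 2 bytes: 110xxxxx 10xxxxxx
        return (($b0 & 0x1F) << 6) | (ord($chr[1]) & 0x3F);
    }
    if (($b0 & 0xF0) === 0xE0 && $len >= 3) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return (($b0 & 0x0F) << 12) | ((ord($chr[1]) & 0x3F) << 6) | (ord($chr[2]) & 0x3F);
    }
    if (($b0 & 0xF8) === 0xF0 && $len >= 4) {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return (($b0 & 0x07) << 18) | ((ord($chr[1]) & 0x3F) << 12)
            | ((ord($chr[2]) & 0x3F) << 6) | (ord($chr[3]) & 0x3F);
    }

    return 0; // not a valid lead byte, or the sequence is truncated
}

printf("0x%04X\n", utf8_ord_sketch('☃')); // 0x2603
```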
**Parameters:**
- `string $chrThe character of which to calculate code point.
`
- `string $encoding [optional]Set the charset for e.g. "mb_" function
`**Return:**
- `int` Unicode code point of the given character, 0 on invalid UTF-8 byte sequence.

--------
## parse_str(string $str, array $result, bool $clean_utf8): bool
↑
Parses the string into an array (into the second parameter).
WARNING: Unlike "parse_str()", this method does not (re-)place variables in the current scope,
if the second parameter is not set!
EXAMPLE:
UTF8::parse_str('Iñtërnâtiônéàlizætiøn=測試&arr[]=foo+測試&arr[]=ການທົດສອບ', $array);
echo $array['Iñtërnâtiônéàlizætiøn']; // '測試'**Parameters:**
- `string $strThe input string.
`
- `array $resultThe result will be returned into this reference parameter.
`
- `bool $clean_utf8 [optional]Remove non UTF-8 chars from the string.
`**Return:**
- `boolWill return false if php can't parse the string and we haven't any $result.
`--------
## pcre_utf8_support(): bool
↑
Checks if the /u modifier, which enables Unicode support in PCRE, is available.**Parameters:**
__nothing__**Return:**
- `bool`
true if support is available,
false otherwise--------
## range(int|string $var1, int|string $var2, bool $use_ctype, string $encoding, float|int $step): list
↑
Create an array containing a range of UTF-8 characters.EXAMPLE:
UTF8::range('κ', 'ζ'); // array('κ', 'ι', 'θ', 'η', 'ζ',)
**Parameters:**
- `int|string $var1Numeric or hexadecimal code points, or a UTF-8 character to start from.
`
- `int|string $var2Numeric or hexadecimal code points, or a UTF-8 character to end at.
`
- `bool $use_ctype` Use ctype to detect numeric and hexadecimal, otherwise we will use a simple "is_numeric".
- `string $encoding [optional]Set the charset for e.g. "mb_" function
`
- `float|int $step [optional]`
If a step value is given, it will be used as the
increment between elements in the sequence. step
should be given as a positive number. If not specified,
step will default to 1.**Return:**
- `list`--------
## rawurldecode(string $str, bool $multi_decode): string
↑
Multi decode HTML entity + fix urlencoded-win1252-chars.
EXAMPLE:
`UTF8::rawurldecode('tes%20öäü%20\u00edtest+test'); // 'tes öäü ítest+test'`
e.g.:
- `'test+test' => 'test+test'`
- `'D&#252;sseldorf' => 'Düsseldorf'`
- `'D%FCsseldorf' => 'Düsseldorf'`
- `'D&#xFC;sseldorf' => 'Düsseldorf'`
- `'D%26%23xFC%3Bsseldorf' => 'Düsseldorf'`
- `'DÃ¼sseldorf' => 'Düsseldorf'`
- `'D%C3%BCsseldorf' => 'Düsseldorf'`
- `'D%C3%83%C2%BCsseldorf' => 'Düsseldorf'`
- `'D%25C3%2583%25C2%25BCsseldorf' => 'Düsseldorf'`

**Parameters:**
- `T $strThe input string.
`
- `bool $multi_decodeDecode as often as possible.
`**Return:**
- `stringThe decoded URL, as a string.
`--------
## regex_replace(string $str, string $pattern, string $replacement, string $options, string $delimiter): string
↑
Replaces all occurrences of $pattern in $str by $replacement.

**Parameters:**
- `string $str` The input string.
- `string $pattern` The regular expression pattern.
- `string $replacement` The string to replace with.
- `string $options [optional]` Matching conditions to be used.
- `string $delimiter [optional]` Delimiter for the regex. Default: '/'

**Return:**
- `string`

--------
## remove_bom(string $str): string
↑
Remove the BOM from UTF-8 / UTF-16 / UTF-32 strings.

EXAMPLE: `UTF8::remove_bom("\xEF\xBB\xBFΜπορώ να"); // 'Μπορώ να'`

**Parameters:**
- `string $str` The input string.

**Return:**
- `string` A string without UTF-BOM.

--------
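For orientation, a simplified sketch that strips only the UTF-8 BOM (the three bytes `EF BB BF`). The name `remove_utf8_bom_sketch` is hypothetical, and unlike the library function it does not handle UTF-16 or UTF-32 BOMs.

```php
<?php
// Hypothetical helper: removes a leading UTF-8 byte order mark, if present.
function remove_utf8_bom_sketch(string $str): string
{
    $bom = "\xEF\xBB\xBF";

    if (strncmp($str, $bom, 3) === 0) {
        // The cast guards against substr() returning false on older PHP versions.
        return (string) substr($str, 3);
    }

    return $str;
}

echo remove_utf8_bom_sketch("\xEF\xBB\xBFΜπορώ να"), "\n";
```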
## remove_duplicates(string $str, string|string[] $what): string
↑
Removes duplicate occurrences of a string in another string.

EXAMPLE: `UTF8::remove_duplicates('öäü-κόσμεκόσμε-äöü', 'κόσμε'); // 'öäü-κόσμε-äöü'`

**Parameters:**
- `string $str` The base string.
- `string|string[] $what` String to search for in the base string.

**Return:**
- `string` A string with removed duplicates.

--------
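One way to think about this operation is "collapse consecutive repeats of `$what` into a single occurrence". The sketch below does that with a regular expression; `remove_duplicates_sketch` is a hypothetical name, it accepts a single string only, and it may differ from the library's behaviour in edge cases.

```php
<?php
// Hypothetical helper: collapses back-to-back repeats of $what into one.
function remove_duplicates_sketch(string $str, string $what): string
{
    if ($what === '') {
        return $str;
    }

    $pattern = '/(?:' . preg_quote($what, '/') . ')+/u';

    // A callback avoids "$" or "\" in $what being read as backreferences.
    return (string) preg_replace_callback(
        $pattern,
        static function () use ($what): string {
            return $what;
        },
        $str
    );
}

echo remove_duplicates_sketch('öäü-κόσμεκόσμε-äöü', 'κόσμε'), "\n";
```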
## remove_html(string $str, string $allowable_tags): string
↑
Remove HTML via "strip_tags()" from the string.

**Parameters:**
- `string $str` The input string.
- `string $allowable_tags [optional]` You can use the optional second parameter to specify tags which should not be stripped. Default: null

**Return:**
- `string` A string without HTML tags.

--------
## remove_html_breaks(string $str, string $replacement): string
↑
Remove all breaks [`<br>` | \r\n | \r | \n | ...] from the string.

**Parameters:**
- `string $str` The input string.
- `string $replacement [optional]` Default is an empty string.

**Return:**
- `string` A string without breaks.

--------
## remove_ileft(string $str, string $substring, string $encoding): string
↑
Returns a new string with the prefix $substring removed, if present and case-insensitive.

**Parameters:**
- `string $str` The input string.
- `string $substring` The prefix to remove.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string` A string without the prefix $substring.

--------
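As an illustration of case-insensitive prefix removal, the following sketch compares the lowercased head of the string with the lowercased prefix. The name `remove_ileft_sketch` is an assumption, and simple lowercasing via `mb_strtolower()` is used here instead of full Unicode case folding.

```php
<?php
// Hypothetical helper: strips $prefix from the start of $str, ignoring case.
function remove_ileft_sketch(string $str, string $prefix): string
{
    $prefixLength = mb_strlen($prefix, 'UTF-8');
    if ($prefixLength === 0) {
        return $str;
    }

    $head = mb_substr($str, 0, $prefixLength, 'UTF-8');
    if (mb_strtolower($head, 'UTF-8') === mb_strtolower($prefix, 'UTF-8')) {
        return mb_substr($str, $prefixLength, null, 'UTF-8');
    }

    return $str;
}

echo remove_ileft_sketch('ΚόσμεMiddleEnd', 'κόσμε'), "\n";
```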
## remove_invisible_characters(string $str, bool $url_encoded, string $replacement, bool $keep_basic_control_characters): string
↑
Remove invisible characters from a string.

e.g.: This prevents sandwiching null characters between ASCII characters, like Java\0script.

EXAMPLE: `UTF8::remove_invisible_characters("κόσ\0με"); // 'κόσμε'`

copy&paste from https://github.com/bcit-ci/CodeIgniter/blob/develop/system/core/Common.php

**Parameters:**
- `string $str` The input string.
- `bool $url_encoded [optional]` Try to remove URL-encoded control characters. WARNING: may produce false positives, e.g. aa%0Baa -> aaaa. Default: false
- `string $replacement [optional]` The replacement character.
- `bool $keep_basic_control_characters [optional]` Keep control characters like [LRM] or [LSEP].

**Return:**
- `string` A string without invisible chars.

--------
## remove_iright(string $str, string $substring, string $encoding): string
↑
Returns a new string with the suffix $substring removed, if present and case-insensitive.

**Parameters:**
- `string $str`
- `string $substring` The suffix to remove.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string` The string $str without the suffix $substring.

--------
## remove_left(string $str, string $substring, string $encoding): string
↑
Returns a new string with the prefix $substring removed, if present.

**Parameters:**
- `string $str` The input string.
- `string $substring` The prefix to remove.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string` A string without the prefix $substring.

--------
## remove_right(string $str, string $substring, string $encoding): string
↑
Returns a new string with the suffix $substring removed, if present.

**Parameters:**
- `string $str`
- `string $substring` The suffix to remove.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string` The string $str without the suffix $substring.

--------
## replace(string $str, string $search, string $replacement, bool $case_sensitive): string
↑
Replaces all occurrences of $search in $str by $replacement.

**Parameters:**
- `string $str` The input string.
- `string $search` The needle to search for.
- `string $replacement` The string to replace with.
- `bool $case_sensitive [optional]` Whether or not to enforce case-sensitivity. Default: true

**Return:**
- `string` A string with replaced parts.

--------
## replace_all(string $str, string[] $search, string|string[] $replacement, bool $case_sensitive): string
↑
Replaces all occurrences of $search in $str by $replacement.

**Parameters:**
- `string $str` The input string.
- `string[] $search` The elements to search for.
- `string|string[] $replacement` The string to replace with.
- `bool $case_sensitive [optional]` Whether or not to enforce case-sensitivity. Default: true

**Return:**
- `string` A string with replaced parts.

--------
## replace_diamond_question_mark(string $str, string $replacement_char, bool $process_invalid_utf8_chars): string
↑
Replace the diamond question mark (�) and invalid-UTF8 chars with the replacement.

EXAMPLE: `UTF8::replace_diamond_question_mark('中文空白�', ''); // '中文空白'`

**Parameters:**
- `string $str` The input string
- `string $replacement_char` The replacement character.
- `bool $process_invalid_utf8_chars` Convert invalid UTF-8 chars

**Return:**
- `string` A string without diamond question marks (�).

--------
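The diamond question mark is the Unicode replacement character U+FFFD. A reduced sketch that replaces only that character could look like the following; `replace_replacement_char_sketch` is a hypothetical name, and the sketch does not attempt to detect or convert invalid UTF-8 byte sequences.

```php
<?php
// Hypothetical helper: swaps U+FFFD (the "diamond question mark") for $replacement.
function replace_replacement_char_sketch(string $str, string $replacement = ''): string
{
    // "\u{FFFD}" is encoded as the three bytes EF BF BD in UTF-8.
    return str_replace("\u{FFFD}", $replacement, $str);
}

echo replace_replacement_char_sketch("中文空白\u{FFFD}"), "\n";
```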
## rtrim(string $str, string|null $chars): string
↑
Strip whitespace or other characters from the end of a UTF-8 string.

EXAMPLE: `UTF8::rtrim('-ABC-中文空白- '); // '-ABC-中文空白-'`

**Parameters:**
- `string $str` The string to be trimmed.
- `string|null $chars` Optional characters to be stripped.

**Return:**
- `string` A string with unwanted characters stripped from the right.

--------
## showSupport(bool $useEcho): string|void
↑
WARNING: Print native UTF-8 support (libs) by default, e.g. for debugging.

**Parameters:**
- `bool $useEcho`

**Return:**
- `string|void`

--------
## single_chr_html_encode(string $char, bool $keep_ascii_chars, string $encoding): string
↑
Converts a UTF-8 character to HTML Numbered Entity like `&#123;`.

EXAMPLE: `UTF8::single_chr_html_encode('κ'); // '&#954;'`

**Parameters:**
- `T $char` The Unicode character to be encoded as numbered entity.
- `bool $keep_ascii_chars` Set to true to keep ASCII chars.
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string` The HTML numbered entity for the given character.

--------
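A numbered entity is simply `&#`, the decimal code point, and `;`. The sketch below shows that mapping with `mb_ord()`; the name `single_chr_html_encode_sketch` is hypothetical and the `$keep_ascii_chars` option is left out.

```php
<?php
// Hypothetical helper: encodes one character as a decimal HTML entity.
function single_chr_html_encode_sketch(string $char): string
{
    // mb_ord() returns the Unicode code point of the first character.
    return '&#' . mb_ord($char, 'UTF-8') . ';';
}

echo single_chr_html_encode_sketch('κ'), "\n"; // κ is U+03BA, i.e. 954
```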
## spaces_to_tabs(string $str, int $tab_length): string
↑

**Parameters:**
- `T $str`
- `int<1, max> $tab_length`

**Return:**
- `string`

--------
## str_camelize(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string
↑
Returns a camelCase version of the string. Trims surrounding spaces,
capitalizes letters following digits, spaces, dashes and underscores,
and removes spaces, dashes, as well as underscores.

**Parameters:**
- `string $str` The input string.
- `string $encoding [optional]` Default: 'UTF-8'
- `bool $clean_utf8 [optional]` Remove non UTF-8 chars from the string.
- `string|null $lang [optional]` Set the language for special cases: az, el, lt, tr
- `bool $try_to_keep_the_string_length [optional]` true === try to keep the string length: e.g. ẞ -> ß

**Return:**
- `string`

--------
## str_capitalize_name(string $str): string
↑
Returns the string with the first letter of each word capitalized,
except for when the word is a name which shouldn't be capitalized.

**Parameters:**
- `string $str`

**Return:**
- `string` A string with $str capitalized.

--------
## str_contains(string $haystack, string $needle, bool $case_sensitive): bool
↑
Returns true if the string contains $needle, false otherwise. By default
the comparison is case-sensitive, but can be made insensitive by setting
$case_sensitive to false.

**Parameters:**
- `string $haystack` The input string.
- `string $needle` Substring to look for.
- `bool $case_sensitive [optional]` Whether or not to enforce case-sensitivity. Default: true

**Return:**
- `bool` Whether or not $haystack contains $needle.

--------
## str_contains_all(string $haystack, scalar[] $needles, bool $case_sensitive): bool
↑
Returns true if the string contains all $needles, false otherwise. By
default, the comparison is case-sensitive, but can be made insensitive by
setting $case_sensitive to false.

**Parameters:**
- `string $haystack` The input string.
- `scalar[] $needles` Substrings to look for.
- `bool $case_sensitive [optional]` Whether or not to enforce case-sensitivity. Default: true

**Return:**
- `bool` Whether or not $haystack contains all $needles.

--------
## str_contains_any(string $haystack, scalar[] $needles, bool $case_sensitive): bool
↑
Returns true if the string contains any $needles, false otherwise. By
default the comparison is case-sensitive, but can be made insensitive by
setting $case_sensitive to false.

**Parameters:**
- `string $haystack` The input string.
- `scalar[] $needles` Substrings to look for.
- `bool $case_sensitive [optional]` Whether or not to enforce case-sensitivity. Default: true

**Return:**
- `bool` Whether or not $haystack contains any of the $needles.

--------
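The "all" and "any" variants differ only in how the per-needle results are combined. A small sketch built on `mb_strpos()` / `mb_stripos()` is shown below; the function names are hypothetical, and treating an empty `$needles` array as `false` is an assumption of this sketch.

```php
<?php
// Hypothetical helper: UTF-8 aware "contains" check for a single needle.
function contains_sketch(string $haystack, string $needle, bool $caseSensitive = true): bool
{
    $pos = $caseSensitive
        ? mb_strpos($haystack, $needle, 0, 'UTF-8')
        : mb_stripos($haystack, $needle, 0, 'UTF-8');

    return $pos !== false;
}

// True only if every needle is found.
function contains_all_sketch(string $haystack, array $needles, bool $caseSensitive = true): bool
{
    if ($needles === []) {
        return false; // assumption: nothing to look for counts as "not found"
    }
    foreach ($needles as $needle) {
        if (!contains_sketch($haystack, (string) $needle, $caseSensitive)) {
            return false;
        }
    }
    return true;
}

// True as soon as one needle is found.
function contains_any_sketch(string $haystack, array $needles, bool $caseSensitive = true): bool
{
    foreach ($needles as $needle) {
        if (contains_sketch($haystack, (string) $needle, $caseSensitive)) {
            return true;
        }
    }
    return false;
}

var_dump(contains_any_sketch('fòô bàř', ['xyz', 'BÀŘ'], false));
```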
## str_dasherize(string $str, string $encoding): string
↑
Returns a lowercase and trimmed string separated by dashes. Dashes are
inserted before uppercase characters (with the exception of the first
character of the string), and in place of spaces as well as underscores.

**Parameters:**
- `string $str` The input string.
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string`

--------
## str_delimit(string $str, string $delimiter, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string
↑
Returns a lowercase and trimmed string separated by the given delimiter.

Delimiters are inserted before uppercase characters (with the exception
of the first character of the string), and in place of spaces, dashes,
and underscores. Alpha delimiters are not converted to lowercase.

EXAMPLE:
`UTF8::str_delimit('test case', '#'); // 'test#case'`
`UTF8::str_delimit('test -case', '**'); // 'test**case'`

**Parameters:**
- `T $str` The input string.
- `string $delimiter` Sequence used to separate parts of the string.
- `string $encoding [optional]` Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]` Remove non UTF-8 chars from the string.
- `string|null $lang [optional]` Set the language for special cases: az, el, lt, tr
- `bool $try_to_keep_the_string_length [optional]` true === try to keep the string length: e.g. ẞ -> ß

**Return:**
- `string`

--------
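The description above boils down to three steps: mark word boundaries before uppercase letters, lowercase everything, then join the parts with the delimiter. The following sketch follows those steps with plain PCRE and mbstring calls; `str_delimit_sketch` is a hypothetical name and the language-specific options are not covered.

```php
<?php
// Hypothetical helper: lowercases $str and joins its words with $delimiter.
function str_delimit_sketch(string $str, string $delimiter): string
{
    $str = trim($str);

    // Put a dash before every uppercase letter except one at the very start.
    $str = (string) preg_replace('/(?<!^)\p{Lu}/u', '-$0', $str);

    // Lowercase before joining, so the delimiter itself keeps its case.
    $str = mb_strtolower($str, 'UTF-8');

    // Split on runs of dashes, underscores and whitespace, then re-join.
    $parts = preg_split('/[-_\s]+/u', $str, -1, PREG_SPLIT_NO_EMPTY);

    return implode($delimiter, $parts ?: []);
}

echo str_delimit_sketch('test -case', '**'), "\n";
```

Splitting and re-joining (rather than a regex replacement) keeps delimiters such as `$1` from being interpreted as backreferences.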
## str_detect_encoding(string $str): false|string
↑
Optimized "mb_detect_encoding()"-function -> with support for UTF-16 and UTF-32.

EXAMPLE:
`UTF8::str_detect_encoding('中文空白'); // 'UTF-8'`
`UTF8::str_detect_encoding('Abc'); // 'ASCII'`

**Parameters:**
- `string $str` The input string.

**Return:**
- `false|string` The detected string-encoding e.g. UTF-8 or UTF-16BE, otherwise it will return false e.g. for BINARY or not detected encoding.

--------
## str_ends_with(string $haystack, string $needle): bool
↑
Check if the string ends with the given substring.

EXAMPLE:
`UTF8::str_ends_with('BeginMiddleΚόσμε', 'Κόσμε'); // true`
`UTF8::str_ends_with('BeginMiddleΚόσμε', 'κόσμε'); // false`

**Parameters:**
- `string $haystack` The string to search in.
- `string $needle` The substring to search for.

**Return:**
- `bool`

--------
## str_ends_with_any(string $str, string[] $substrings): bool
↑
Returns true if the string ends with any of $substrings, false otherwise.

- case-sensitive

**Parameters:**
- `string $str` The input string.
- `string[] $substrings` Substrings to look for.

**Return:**
- `bool` Whether or not $str ends with any of the $substrings.

--------
## str_ensure_left(string $str, string $substring):
↑
Ensures that the string begins with $substring. If it doesn't, it's
prepended.

**Parameters:**
- `T $str` The input string.
- `TSub $substring` The substring to add if not present.

**Return:**
- `(TSub is non-empty-string ? non-empty-string : (T is non-empty-string ? non-empty-string : string))`

--------
## str_ensure_right(string $str, string $substring): string
↑
Ensures that the string ends with $substring. If it doesn't, it's appended.

**Parameters:**
- `T $str` The input string.
- `TSub $substring` The substring to add if not present.

**Return:**
- `string`

--------
## str_humanize(string $str): string
↑
Capitalizes the first word of the string, replaces underscores with
spaces, and strips '_id'.

**Parameters:**
- `string $str`

**Return:**
- `string`

--------
## str_iends_with(string $haystack, string $needle): bool
↑
Check if the string ends with the given substring, case-insensitive.

EXAMPLE:
`UTF8::str_iends_with('BeginMiddleΚόσμε', 'Κόσμε'); // true`
`UTF8::str_iends_with('BeginMiddleΚόσμε', 'κόσμε'); // true`

**Parameters:**
- `string $haystack` The string to search in.
- `string $needle` The substring to search for.

**Return:**
- `bool`

--------
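A case-insensitive suffix check can be approximated by lowercasing both strings and comparing the tail bytes. The sketch below takes that approach; `str_iends_with_sketch` is a hypothetical name, and treating an empty needle as a match is an assumption borrowed from PHP 8's native `str_ends_with()`.

```php
<?php
// Hypothetical helper: does $haystack end with $needle, ignoring case?
function str_iends_with_sketch(string $haystack, string $needle): bool
{
    if ($needle === '') {
        return true; // assumption: same convention as PHP 8's str_ends_with()
    }

    $haystack = mb_strtolower($haystack, 'UTF-8');
    $needle = mb_strtolower($needle, 'UTF-8');

    // Comparing bytes is safe here because both strings are valid UTF-8.
    return substr($haystack, -strlen($needle)) === $needle;
}

var_dump(str_iends_with_sketch('BeginMiddleΚόσμε', 'κόσμε'));
```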
## str_iends_with_any(string $str, string[] $substrings): bool
↑
Returns true if the string ends with any of $substrings, false otherwise.

- case-insensitive

**Parameters:**
- `string $str` The input string.
- `string[] $substrings` Substrings to look for.

**Return:**
- `bool` Whether or not $str ends with any of the $substrings.

--------
## str_insert(string $str, string $substring, int $index, string $encoding): string
↑
Inserts $substring into the string at the $index provided.

**Parameters:**
- `string $str` The input string.
- `string $substring` String to be inserted.
- `int $index` The index at which to insert the substring.
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string`

--------
## str_ireplace(string|string[] $search, string|string[] $replacement, string|string[] $subject, int $count): string|string[]
↑
Case-insensitive and UTF-8 safe version of str_replace.

EXAMPLE: `UTF8::str_ireplace('lIzÆ', 'lise', 'Iñtërnâtiônàlizætiøn'); // 'Iñtërnâtiônàlisetiøn'`

**Parameters:**
- `string|string[] $search` Every replacement with search array is performed on the result of previous replacement.
- `string|string[] $replacement` The replacement.
- `TStrIReplaceSubject $subject` If subject is an array, then the search and replace is performed with every entry of subject, and the return value is an array as well.
- `int $count [optional]` The number of matched and replaced needles will be returned in count which is passed by reference.

**Return:**
- `string|string[]` A string or an array of replacements.

--------
## str_ireplace_beginning(string $str, string $search, string $replacement): string
↑
Replaces $search from the beginning of string with $replacement.

**Parameters:**
- `string $str` The input string.
- `string $search` The string to search for.
- `string $replacement` The replacement.

**Return:**
- `string` The string after the replacement.

--------
## str_ireplace_ending(string $str, string $search, string $replacement): string
↑
Replaces $search from the ending of string with $replacement.

**Parameters:**
- `string $str` The input string.
- `string $search` The string to search for.
- `string $replacement` The replacement.

**Return:**
- `string` The string after the replacement.

--------
## str_istarts_with(string $haystack, string $needle): bool
↑
Check if the string starts with the given substring, case-insensitive.

EXAMPLE:
`UTF8::str_istarts_with('ΚόσμεMiddleEnd', 'Κόσμε'); // true`
`UTF8::str_istarts_with('ΚόσμεMiddleEnd', 'κόσμε'); // true`

**Parameters:**
- `string $haystack` The string to search in.
- `string $needle` The substring to search for.

**Return:**
- `bool`

--------
## str_istarts_with_any(string $str, scalar[] $substrings): bool
↑
Returns true if the string begins with any of $substrings, false otherwise.

- case-insensitive

**Parameters:**
- `string $str` The input string.
- `scalar[] $substrings` Substrings to look for.

**Return:**
- `bool` Whether or not $str starts with any of the $substrings.

--------
## str_isubstr_after_first_separator(string $str, string $separator, string $encoding): string
↑
Gets the substring after the first occurrence of a separator.

**Parameters:**
- `string $str` The input string.
- `string $separator` The string separator.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string`

--------
## str_isubstr_after_last_separator(string $str, string $separator, string $encoding): string
↑
Gets the substring after the last occurrence of a separator.

**Parameters:**
- `string $str` The input string.
- `string $separator` The string separator.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string`

--------
## str_isubstr_before_first_separator(string $str, string $separator, string $encoding): string
↑
Gets the substring before the first occurrence of a separator.

**Parameters:**
- `string $str` The input string.
- `string $separator` The string separator.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string`

--------
## str_isubstr_before_last_separator(string $str, string $separator, string $encoding): string
↑
Gets the substring before the last occurrence of a separator.

**Parameters:**
- `string $str` The input string.
- `string $separator` The string separator.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string`

--------
## str_isubstr_first(string $str, string $needle, bool $before_needle, string $encoding): string
↑
Gets the substring after (or before via "$before_needle") the first occurrence of the "$needle".

**Parameters:**
- `string $str` The input string.
- `string $needle` The string to look for.
- `bool $before_needle [optional]` Default: false
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string`

--------
## str_isubstr_last(string $str, string $needle, bool $before_needle, string $encoding): string
↑
Gets the substring after (or before via "$before_needle") the last occurrence of the "$needle".

**Parameters:**
- `string $str` The input string.
- `string $needle` The string to look for.
- `bool $before_needle [optional]` Default: false
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string`

--------
## str_last_char(string $str, int $n, string $encoding): string
↑
Returns the last $n characters of the string.

**Parameters:**
- `string $str` The input string.
- `int $n` Number of characters to retrieve from the end.
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string`

--------
## str_limit(string $str, int $length, string $str_add_on, string $encoding): string
↑
Limit the number of characters in a string.

**Parameters:**
- `T $str` The input string.
- `int<1, max> $length [optional]` Default: 100
- `string $str_add_on [optional]` Default: …
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string`

--------
## str_limit_after_word(string $str, int $length, string $str_add_on, string $encoding): string
↑
Limit the number of characters in a string, but also after the next word.

EXAMPLE: `UTF8::str_limit_after_word('fòô bàř fòô', 8, ''); // 'fòô bàř'`

**Parameters:**
- `T $str` The input string.
- `int<1, max> $length [optional]` Default: 100
- `string $str_add_on [optional]` Default: …
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string`

--------
## str_longest_common_prefix(string $str1, string $str2, string $encoding): string
↑
Returns the longest common prefix between the $str1 and $str2.

**Parameters:**
- `string $str1` The input string.
- `string $str2` Second string for comparison.
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string`

--------
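As a sketch of the underlying idea, the prefix can be found by walking both strings character by character until they differ. The helper name `longest_common_prefix_sketch` is hypothetical, and `mb_str_split()` requires PHP 7.4 or later.

```php
<?php
// Hypothetical helper: longest shared prefix, compared per UTF-8 character.
function longest_common_prefix_sketch(string $a, string $b): string
{
    $charsA = mb_str_split($a, 1, 'UTF-8');
    $charsB = mb_str_split($b, 1, 'UTF-8');

    $prefix = '';
    $max = min(count($charsA), count($charsB));

    for ($i = 0; $i < $max; $i++) {
        if ($charsA[$i] !== $charsB[$i]) {
            break; // first mismatch ends the common prefix
        }
        $prefix .= $charsA[$i];
    }

    return $prefix;
}

echo longest_common_prefix_sketch('fòôbar', 'fòô bar'), "\n";
```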
## str_longest_common_substring(string $str1, string $str2, string $encoding): string
↑
Returns the longest common substring between the $str1 and $str2.

In the case of ties, it returns that which occurs first.

**Parameters:**
- `string $str1`
- `string $str2` Second string for comparison.
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string` The longest common substring.

--------
## str_longest_common_suffix(string $str1, string $str2, string $encoding): string
↑
Returns the longest common suffix between the $str1 and $str2.

**Parameters:**
- `string $str1`
- `string $str2` Second string for comparison.
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string`

--------
## str_matches_pattern(string $str, string $pattern): bool
↑
Returns true if $str matches the supplied pattern, false otherwise.

**Parameters:**
- `string $str` The input string.
- `string $pattern` Regex pattern to match against.

**Return:**
- `bool` Whether or not $str matches the pattern.

--------
## str_obfuscate(string $str, float $percent, string $obfuscateChar, string[] $keepChars): string
↑
Convert a string into an obfuscated string.

EXAMPLE: `UTF8::str_obfuscate('[email protected]', 0.5, '*', ['@', '.']); // e.g. "l***@m**lleke*.*r*"`

**Parameters:**
- `string $str`
- `float $percent`
- `string $obfuscateChar`
- `string[] $keepChars`

**Return:**
- `string` The obfuscated string.

--------
## str_offset_exists(string $str, int $offset, string $encoding): bool
↑
Returns whether or not a character exists at an index. Offsets may be
negative to count from the last character in the string. Implements
part of the ArrayAccess interface.

**Parameters:**
- `string $str` The input string.
- `int $offset` The index to check.
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `bool` Whether or not the index exists.

--------
## str_offset_get(string $str, int $index, string $encoding): string
↑
Returns the character at the given index. Offsets may be negative to
count from the last character in the string. Implements part of the
ArrayAccess interface, and throws an OutOfBoundsException if the index
does not exist.

**Parameters:**
- `string $str` The input string.
- `int<1, max> $index` The index from which to retrieve the char.
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string` The character at the specified index.

--------
## str_pad(string $str, int $pad_length, string $pad_string, int|string $pad_type, string $encoding): string
↑
Pad a UTF-8 string to a given length with another string.

EXAMPLE: `UTF8::str_pad('中文空白', 10, '_', STR_PAD_BOTH); // '___中文空白___'`

**Parameters:**
- `string $str` The input string.
- `int $pad_length` The length of return string.
- `string $pad_string [optional]` String to use for padding the input string.
- `int|string $pad_type [optional]` Can be STR_PAD_RIGHT (default) [or string "right"], STR_PAD_LEFT [or string "left"] or STR_PAD_BOTH [or string "both"]
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string` Returns the padded string.

--------
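Native `str_pad()` counts bytes, which is why a character-aware variant is needed. The sketch below computes the missing character count with `mb_strlen()` and fills both sides accordingly; `mb_str_pad_sketch` is a hypothetical name and only the integer `STR_PAD_*` constants are supported.

```php
<?php
// Hypothetical helper: pads $str to $padLength characters (not bytes).
function mb_str_pad_sketch(
    string $str,
    int $padLength,
    string $padString = ' ',
    int $padType = STR_PAD_RIGHT
): string {
    $missing = $padLength - mb_strlen($str, 'UTF-8');
    if ($missing <= 0 || $padString === '') {
        return $str;
    }

    // Decide how many padding characters go on the left side.
    if ($padType === STR_PAD_LEFT) {
        $left = $missing;
    } elseif ($padType === STR_PAD_BOTH) {
        $left = intdiv($missing, 2); // any odd remainder goes to the right
    } else {
        $left = 0;
    }
    $right = $missing - $left;

    // Repeats the pad string and cuts it to exactly $count characters.
    $fill = static function (int $count) use ($padString): string {
        $repeats = (int) ceil($count / mb_strlen($padString, 'UTF-8'));

        return mb_substr(str_repeat($padString, $repeats), 0, $count, 'UTF-8');
    };

    return $fill($left) . $str . $fill($right);
}

echo mb_str_pad_sketch('中文空白', 10, '_', STR_PAD_BOTH), "\n";
```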
## str_pad_both(string $str, int $length, string $pad_str, string $encoding): string
↑
Returns a new string of a given length such that both sides of the
string are padded. Alias for "UTF8::str_pad()" with a $pad_type of 'both'.

**Parameters:**
- `string $str`
- `int $length` Desired string length after padding.
- `string $pad_str [optional]` String used to pad, defaults to space. Default: ' '
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string` The string with padding applied.

--------
## str_pad_left(string $str, int $length, string $pad_str, string $encoding): string
↑
Returns a new string of a given length such that the beginning of the
string is padded. Alias for "UTF8::str_pad()" with a $pad_type of 'left'.

**Parameters:**
- `string $str`
- `int $length` Desired string length after padding.
- `string $pad_str [optional]` String used to pad, defaults to space. Default: ' '
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string` The string with left padding.

--------
## str_pad_right(string $str, int $length, string $pad_str, string $encoding): string
↑
Returns a new string of a given length such that the end of the string
is padded. Alias for "UTF8::str_pad()" with a $pad_type of 'right'.

**Parameters:**
- `string $str`
- `int $length` Desired string length after padding.
- `string $pad_str [optional]` String used to pad, defaults to space. Default: ' '
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string` The string with right padding.

--------
## str_repeat(string $str, int $multiplier): string
↑
Repeat a string.

EXAMPLE: `UTF8::str_repeat("°~\xf0\x90\x28\xbc", 2); // '°~ð(¼°~ð(¼'`

**Parameters:**
- `T $str` The string to be repeated.
- `int<1, max> $multiplier` Number of times the input string should be repeated. The multiplier has to be greater than or equal to 0. If the multiplier is set to 0, the function will return an empty string.

**Return:**
- `string` The repeated string.

--------
## str_replace_beginning(string $str, string $search, string $replacement): string
↑
Replaces $search from the beginning of string with $replacement.

**Parameters:**
- `string $str` The input string.
- `string $search` The string to search for.
- `string $replacement` The replacement.

**Return:**
- `string` A string after the replacements.

--------
## str_replace_ending(string $str, string $search, string $replacement): string
↑
Replaces $search from the ending of string with $replacement.

**Parameters:**
- `string $str` The input string.
- `string $search` The string to search for.
- `string $replacement` The replacement.

**Return:**
- `string` A string after the replacements.

--------
## str_replace_first(string $search, string $replace, string $subject): string
↑
Replace the first "$search"-term with the "$replace"-term.

**Parameters:**
- `string $search`
- `string $replace`
- `string $subject`

**Return:**
- `string`

--------
## str_replace_last(string $search, string $replace, string $subject): string
↑
Replace the last "$search"-term with the "$replace"-term.

**Parameters:**
- `string $search`
- `string $replace`
- `string $subject`

**Return:**
- `string`

--------
## str_shuffle(string $str, string $encoding): string
↑
Shuffles all the characters in the string.

INFO: uses a random algorithm that is too weak for cryptographic purposes

EXAMPLE: `UTF8::str_shuffle('fòô bàř fòô'); // 'àòôřb ffòô '`

**Parameters:**
- `T $str` The input string
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string` The shuffled string.

--------
## str_slice(string $str, int $start, int|null $end, string $encoding): false|string
↑
Returns the substring beginning at $start, and up to, but not including
the index specified by $end. If $end is omitted, the function extracts
the remaining string. If $end is negative, it is computed from the end
of the string.

**Parameters:**
- `string $str`
- `int $start` Initial index from which to begin extraction.
- `int|null $end [optional]` Index at which to end extraction. Default: null
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `false|string` The extracted substring. If $str is shorter than $start characters, FALSE will be returned.

--------
## str_snakeize(string $str, string $encoding): string
↑
Convert a string to e.g.: "snake_case"

**Parameters:**
- `string $str`
- `string $encoding [optional]` Set the charset for e.g. "mb_" function

**Return:**
- `string` A string in snake_case.

--------
## str_sort(string $str, bool $unique, bool $desc): string
↑
Sort all characters according to code points.

EXAMPLE: `UTF8::str_sort(' -ABC-中文空白- '); // ' ---ABC中文白空'`

**Parameters:**
- `string $str` A UTF-8 string.
- `bool $unique` Sort unique. If true, repeated characters are ignored.
- `bool $desc` If true, will sort characters in reverse code point order.

**Return:**
- `string` A string of sorted characters.

--------
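A compact sketch of code point sorting: split into characters, sort, and join again. It relies on the fact that byte-wise comparison of valid UTF-8 strings yields the same order as comparing code points. `str_sort_sketch` is a hypothetical name and `mb_str_split()` requires PHP 7.4 or later.

```php
<?php
// Hypothetical helper: sorts the characters of $str by code point.
function str_sort_sketch(string $str, bool $unique = false, bool $desc = false): string
{
    $chars = mb_str_split($str, 1, 'UTF-8');

    if ($unique) {
        $chars = array_unique($chars);
    }

    // SORT_STRING compares byte-wise, which matches code point order in UTF-8.
    if ($desc) {
        rsort($chars, SORT_STRING);
    } else {
        sort($chars, SORT_STRING);
    }

    return implode('', $chars);
}

echo str_sort_sketch('中文空白'), "\n";
```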
## str_split(int|string $str, int $length, bool $clean_utf8, bool $try_to_use_mb_functions): list
↑
Convert a string to an array of Unicode characters.

EXAMPLE: `UTF8::str_split('中文空白'); // array('中', '文', '空', '白')`

**Parameters:**
- `int|string $str` The string or int to split into array.
- `int<1, max> $length [optional]` Max character length of each array element.
- `bool $clean_utf8 [optional]` Remove non UTF-8 chars from the string.
- `bool $try_to_use_mb_functions [optional]` Set to false, if you don't want to use "mb_substr"

**Return:**
- `list` An array containing chunks of chars from the input.

--------
## str_split_array(int[]|string[] $input, int $length, bool $clean_utf8, bool $try_to_use_mb_functions): list<list<string>>
↑
Convert a string to an array of Unicode characters.

EXAMPLE: `UTF8::str_split_array(['中文空白', 'test'], 2); // [['中文', '空白'], ['te', 'st']]`

**Parameters:**
- `int[]|string[] $input` The string[] or int[] to split into array.
- `int<1, max> $length [optional]` Max character length of each array element.
- `bool $clean_utf8 [optional]` Remove non UTF-8 chars from the string.
- `bool $try_to_use_mb_functions [optional]` Set to false, if you don't want to use "mb_substr"

**Return:**
- `list<list<string>>` An array containing chunks of the input.

--------
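The array variant can be pictured as applying a character-aware split to every element. The following sketch uses PHP's native `mb_str_split()` (PHP 7.4+) for that; `str_split_array_sketch` is a hypothetical name and the `$clean_utf8` option is not covered.

```php
<?php
// Hypothetical helper: splits every input element into chunks of $length characters.
function str_split_array_sketch(array $input, int $length = 1): array
{
    return array_map(
        static function ($item) use ($length): array {
            // Casting allows int elements, as the parameter type suggests.
            return mb_str_split((string) $item, $length, 'UTF-8');
        },
        array_values($input)
    );
}

print_r(str_split_array_sketch(['中文空白', 'test'], 2));
```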
## str_split_pattern(string $str, string $pattern, int $limit): string[]
↑
Splits the string with the provided regular expression, returning an
array of strings. An optional integer $limit will truncate the
results.

**Parameters:**
- `string $str`
- `string $pattern` The regex with which to split the string.
- `int $limit [optional]` Maximum number of results to return. Default: -1 === no limit

**Return:**
- `string[]` An array of strings.

--------
## str_starts_with(string $haystack, string $needle): bool
↑
Check if the string starts with the given substring.

EXAMPLE:
`UTF8::str_starts_with('ΚόσμεMiddleEnd', 'Κόσμε'); // true`
`UTF8::str_starts_with('ΚόσμεMiddleEnd', 'κόσμε'); // false`

**Parameters:**
- `string $haystack` The string to search in.
- `string $needle` The substring to search for.

**Return:**
- `bool`

--------
## str_starts_with_any(string $str, scalar[] $substrings): bool
↑
Returns true if the string begins with any of $substrings, false otherwise.

- case-sensitive

**Parameters:**
- `string $str` The input string.
- `scalar[] $substrings` Substrings to look for.

**Return:**
- `bool` Whether or not $str starts with any of the $substrings.

--------
## str_substr_after_first_separator(string $str, string $separator, string $encoding): string
↑
Gets the substring after the first occurrence of a separator.

**Parameters:**
- `string $str` The input string.
- `string $separator` The string separator.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string`

--------
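A sketch of the separator lookup with `mb_strpos()` and `mb_substr()`. The name `substr_after_first_separator_sketch` is hypothetical, and returning an empty string when the separator is missing is an assumption of this sketch rather than documented behaviour.

```php
<?php
// Hypothetical helper: everything after the first occurrence of $separator.
function substr_after_first_separator_sketch(string $str, string $separator): string
{
    if ($str === '' || $separator === '') {
        return '';
    }

    $offset = mb_strpos($str, $separator, 0, 'UTF-8');
    if ($offset === false) {
        return ''; // assumption: no separator means no result
    }

    // Skip the separator itself, measured in characters.
    return mb_substr($str, $offset + mb_strlen($separator, 'UTF-8'), null, 'UTF-8');
}

echo substr_after_first_separator_sketch('foo=bar=baz', '='), "\n";
```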
## str_substr_after_last_separator(string $str, string $separator, string $encoding): string
↑
Gets the substring after the last occurrence of a separator.

**Parameters:**
- `string $str` The input string.
- `string $separator` The string separator.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string`

--------
## str_substr_before_first_separator(string $str, string $separator, string $encoding): string
↑
Gets the substring before the first occurrence of a separator.

**Parameters:**
- `string $str` The input string.
- `string $separator` The string separator.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string`

--------
## str_substr_before_last_separator(string $str, string $separator, string $encoding): string
↑
Gets the substring before the last occurrence of a separator.

**Parameters:**
- `string $str` The input string.
- `string $separator` The string separator.
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string`

--------
## str_substr_first(string $str, string $needle, bool $before_needle, string $encoding): string
↑
Gets the substring after (or before via "$before_needle") the first occurrence of the "$needle".

**Parameters:**
- `string $str` The input string.
- `string $needle` The string to look for.
- `bool $before_needle [optional]` Default: false
- `string $encoding [optional]` Default: 'UTF-8'

**Return:**
- `string`

--------
## str_substr_last(string $str, string $needle, bool $before_needle, string $encoding): string
Gets the substring after (or before via "$before_needle") the last occurrence of the "$needle".

**Parameters:**
- `string $str`: The input string.
- `string $needle`: The string to look for.
- `bool $before_needle [optional]`: Default: false
- `string $encoding [optional]`: Default: 'UTF-8'

**Return:**
- `string`

--------
## str_surround(string $str, string $substring): string
Surrounds $str with the given substring.

**Parameters:**
- `T $str`
- `TSub $substring`: The substring to add to both sides.

**Return:**
- `string`: A string with the substring both prepended and appended.

--------
## str_titleize(string $str, string[]|null $ignore, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length, bool $use_trim_first, string|null $word_define_chars): string
Returns a trimmed string with the first letter of each word capitalized.

Also accepts an array, $ignore, allowing you to list words not to be capitalized.

**Parameters:**
- `string $str`
- `string[]|null $ignore [optional]`: An array of words not to capitalize, or null. Default: null
- `string $encoding [optional]`: Default: 'UTF-8'
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.
- `string|null $lang [optional]`: Set the language for special cases: az, el, lt, tr
- `bool $try_to_keep_the_string_length [optional]`: If true, try to keep the string length, e.g. ẞ -> ß
- `bool $use_trim_first [optional]`: If true, trim the input string first.
- `string|null $word_define_chars [optional]`: A string of chars that are treated as whitespace separators between words.

**Return:**
- `string`: The titleized string.

--------
## str_titleize_for_humans(string $str, string[] $ignore, string $encoding): string
Returns a trimmed string in proper title case.

Also accepts an array, $ignore, allowing you to list words not to be capitalized.

Adapted from John Gruber's script.

**Parameters:**
- `string $str`
- `string[] $ignore`: An array of words not to capitalize.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `string`: The titleized string.

--------
## str_to_binary(string $str): false|string
Get a binary representation of a specific string.

EXAMPLE: `UTF8::str_to_binary('😃'); // '11110000100111111001100010000011'`

**Parameters:**
- `string $str`: The input string.

**Return:**
- `false|string`: false on error

--------
## str_to_lines(string $str, bool $remove_empty_values, int|null $remove_short_values): string[]
**Parameters:**
- `string $str`
- `bool $remove_empty_values`: Remove empty values.
- `int|null $remove_short_values`: The min. string length or null to disable

**Return:**
- `string[]`

--------
## str_to_words(string $str, string $char_list, bool $remove_empty_values, int|null $remove_short_values): list
Convert a string into an array of words.

EXAMPLE: `UTF8::str_to_words('中文空白 oöäü#s', '#'); // array('', '中文空白', ' ', 'oöäü#s', '')`

**Parameters:**
- `string $str`
- `string $char_list`: Additional chars for the definition of "words".
- `bool $remove_empty_values`: Remove empty values.
- `int|null $remove_short_values`: The min. string length or null to disable

**Return:**
- `list`

--------
## str_truncate(string $str, int $length, string $substring, string $encoding): string
Truncates the string to a given length. If $substring is provided, and truncating occurs, the string is further truncated so that the substring may be appended without exceeding the desired length.

**Parameters:**
- `string $str`
- `int $length`: Desired length of the truncated string.
- `string $substring [optional]`: The substring to append if it can fit. Default: ''
- `string $encoding [optional]`: Default: 'UTF-8'

**Return:**
- `string`: A string after truncating.

--------
## str_truncate_safe(string $str, int $length, string $substring, string $encoding, bool $ignore_do_not_split_words_for_one_word): string
Truncates the string to a given length, while ensuring that it does not split words. If $substring is provided, and truncating occurs, the string is further truncated so that the substring may be appended without exceeding the desired length.

**Parameters:**
- `string $str`
- `int $length`: Desired length of the truncated string.
- `string $substring [optional]`: The substring to append if it can fit. Default: ''
- `string $encoding [optional]`: Default: 'UTF-8'
- `bool $ignore_do_not_split_words_for_one_word [optional]`: Default: false

**Return:**
- `string`: A string after truncating.

--------
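The difference between `str_truncate` and `str_truncate_safe` described above can be sketched with mbstring alone. The helper `sketch_truncate` and its `$safe` flag are hypothetical, the `$ignore_do_not_split_words_for_one_word` parameter is not modeled, and dropping a suffix that cannot fit is an assumption of this sketch, not the library's documented behavior.

```php
<?php
declare(strict_types=1);

mb_internal_encoding('UTF-8');

/**
 * Hypothetical helper, not the library's code. With $safe = true the cut
 * backs off to the previous space so that no word is split.
 */
function sketch_truncate(string $str, int $length, string $substring = '', bool $safe = false): string
{
    $length = max(0, $length);
    if ($length >= mb_strlen($str)) {
        return $str; // nothing to truncate
    }

    // Reserve room for the suffix so the result never exceeds $length.
    $keep = $length - mb_strlen($substring);
    if ($keep < 0) {
        // Assumption of this sketch: a suffix that cannot fit is dropped.
        return mb_substr($str, 0, $length);
    }

    $truncated = mb_substr($str, 0, $keep);

    // The cut splits a word when the first dropped character is not a space.
    if ($safe && mb_substr($str, $keep, 1) !== ' ') {
        $lastSpace = mb_strrpos($truncated, ' ');
        if ($lastSpace !== false) {
            $truncated = mb_substr($truncated, 0, $lastSpace);
        }
    }

    return $truncated . $substring;
}

echo sketch_truncate('Test foo bar', 11, '…'), PHP_EOL;       // Test foo b…
echo sketch_truncate('Test foo bar', 11, '…', true), PHP_EOL; // Test foo…
```

In both calls the result, including the appended `…`, stays within the requested 11 characters.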
## str_underscored(string $str): string
Returns a lowercase and trimmed string separated by underscores.

Underscores are inserted before uppercase characters (with the exception of the first character of the string), and in place of spaces as well as dashes.

**Parameters:**
- `string $str`

**Return:**
- `string`: The underscored string.

--------
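The rules listed for `str_underscored` above (trim, underscore before inner uppercase letters, underscores in place of spaces and dashes, lowercase) can be approximated with two regular expressions. This is a sketch under those stated rules only; `sketch_underscored` is a hypothetical name and the library may handle edge cases differently.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not the library's code.
 */
function sketch_underscored(string $str): string
{
    $str = trim($str);
    // Mark every uppercase letter that directly follows another word character.
    $str = (string) preg_replace('/\B(\p{Lu})/u', '-$1', $str);
    $str = mb_strtolower($str, 'UTF-8');

    // Collapse runs of dashes, underscores and whitespace into one underscore.
    return (string) preg_replace('/[-_\s]+/u', '_', $str);
}

echo sketch_underscored('TestUCase'), PHP_EOL;        // test_u_case
echo sketch_underscored(' fooBar baz-qux '), PHP_EOL; // foo_bar_baz_qux
```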
## str_upper_camelize(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string
Returns an UpperCamelCase version of the supplied string. It trims surrounding spaces, capitalizes letters following digits, spaces, dashes and underscores, and removes spaces, dashes, underscores.

**Parameters:**
- `string $str`: The input string.
- `string $encoding [optional]`: Default: 'UTF-8'
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.
- `string|null $lang [optional]`: Set the language for special cases: az, el, lt, tr
- `bool $try_to_keep_the_string_length [optional]`: If true, try to keep the string length, e.g. ẞ -> ß

**Return:**
- `string`: A string in UpperCamelCase.

--------
## str_word_count(string $str, int $format, string $char_list): int|string[]
Get the number of words in a specific string.

EXAMPLES:

Format 0 returns only the word count (int):
- `UTF8::str_word_count('中文空白 öäü abc#c'); // 4`
- `UTF8::str_word_count('中文空白 öäü abc#c', 0, '#'); // 3`

Format 1 returns the words (array):
- `UTF8::str_word_count('中文空白 öäü abc#c', 1); // array('中文空白', 'öäü', 'abc', 'c')`
- `UTF8::str_word_count('中文空白 öäü abc#c', 1, '#'); // array('中文空白', 'öäü', 'abc#c')`

Format 2 returns the words keyed by their offset (array):
- `UTF8::str_word_count('中文空白 öäü abc#c', 2); // array(0 => '中文空白', 5 => 'öäü', 9 => 'abc', 13 => 'c')`
- `UTF8::str_word_count('中文空白 öäü abc#c', 2, '#'); // array(0 => '中文空白', 5 => 'öäü', 9 => 'abc#c')`

**Parameters:**
- `string $str`: The input string.
- `0|1|2 $format [optional]`: 0 => return the number of words (default); 1 => return an array of words; 2 => return an array of words with the word offset as key
- `string $char_list [optional]`: Additional chars that belong to words and do not start a new word.

**Return:**
- `int|string[]`: The number of words in the string, or an array of words, depending on $format.

--------
## strcasecmp(string $str1, string $str2, string $encoding): int
Case-insensitive string comparison.

INFO: Case-insensitive version of UTF8::strcmp()

EXAMPLE: `UTF8::strcasecmp("iñtërnâtiôn\nàlizætiøn", "Iñtërnâtiôn\nàlizætiøn"); // 0`

**Parameters:**
- `string $str1`: The first string.
- `string $str2`: The second string.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `int`: < 0 if str1 is less than str2; > 0 if str1 is greater than str2; 0 if they are equal.

--------
## strcmp(string $str1, string $str2): int
Case-sensitive string comparison.

EXAMPLE: `UTF8::strcmp("iñtërnâtiôn\nàlizætiøn", "iñtërnâtiôn\nàlizætiøn"); // 0`

**Parameters:**
- `string $str1`: The first string.
- `string $str2`: The second string.

**Return:**
- `int`: < 0 if str1 is less than str2; > 0 if str1 is greater than str2; 0 if they are equal.

--------
## strcspn(string $str, string $char_list, int $offset, int|null $length, string $encoding): int
Find length of initial segment not matching mask.

**Parameters:**
- `string $str`
- `string $char_list`
- `int $offset`
- `int|null $length`
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `int`

--------
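To illustrate what "length of initial segment not matching mask" means for multibyte input, the sketch below compares PHP's byte-based `strcspn()` with a character-based count. `sketch_strcspn` is a hypothetical helper built on a regular expression. It ignores the `$offset` and `$length` parameters and is not the library's implementation.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not the library's code: counts the leading
 * characters of $str that do not occur in $charList.
 */
function sketch_strcspn(string $str, string $charList): int
{
    if ($charList === '') {
        return mb_strlen($str, 'UTF-8'); // nothing can match, so the whole string counts
    }

    // Match the longest prefix made only of characters NOT in $charList.
    $pattern = '/^[^' . preg_quote($charList, '/') . ']*/u';
    preg_match($pattern, $str, $matches);

    return mb_strlen($matches[0] ?? '', 'UTF-8');
}

$str = '中文空白-abc';
var_dump(strcspn($str, '-'));        // int(12): bytes before the dash
var_dump(sketch_strcspn($str, '-')); // int(4): characters before the dash
```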
## string(int|int[]|string|string[] $intOrHex): string
Create a UTF-8 string from code points.

INFO: opposite to UTF8::codepoints()

EXAMPLE: `UTF8::string(array(246, 228, 252)); // 'öäü'`

**Parameters:**
- `int[]|numeric-string[]|int|numeric-string $intOrHex`: Integer or Hexadecimal codepoints.

**Return:**
- `string`: A UTF-8 encoded string.

--------
## string_has_bom(string $str): bool
Checks if string starts with "BOM" (Byte Order Mark Character) character.

EXAMPLE: `UTF8::string_has_bom("\xef\xbb\xbf foobar"); // true`

**Parameters:**
- `string $str`: The input string.

**Return:**
- `bool`: true if the string has BOM at the start, false otherwise

--------
## strip_tags(string $str, string|null $allowable_tags, bool $clean_utf8): string
Strip HTML and PHP tags from a string + clean invalid UTF-8.

EXAMPLE: `UTF8::strip_tags("κόσμε\xa0\xa1"); // 'κόσμε'`

**Parameters:**
- `string $str`: The input string.
- `string|null $allowable_tags [optional]`: You can use the optional second parameter to specify tags which should not be stripped. HTML comments and PHP tags are also stripped. This is hardcoded and can not be changed with allowable_tags.
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `string`: The stripped string.

--------
## strip_whitespace(string $str): string
Strip all whitespace characters. This includes tabs and newline characters, as well as multibyte whitespace such as the thin space and ideographic space.

EXAMPLE: `UTF8::strip_whitespace(' Ο συγγραφέας '); // 'Οσυγγραφέας'`

**Parameters:**
- `string $str`

**Return:**
- `string`

--------
## stripos(string $haystack, string $needle, int $offset, string $encoding, bool $clean_utf8): false|int
Find the position of the first occurrence of a substring in a string, case-insensitive.

INFO: use UTF8::stripos_in_byte() for the position in bytes

EXAMPLE: `UTF8::stripos('aσσb', 'ΣΣ'); // 1` (σσ == ΣΣ)

**Parameters:**
- `string $haystack`: The string from which to get the position of the first occurrence of needle.
- `string $needle`: The string to find in haystack.
- `int $offset [optional]`: The position in haystack to start searching.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `false|int`: Return the (int) numeric position of the first occurrence of needle in the haystack string, or false if needle is not found

--------
## stripos_in_byte(string $haystack, string $needle, int $offset): false|int
Find the position of the first occurrence of a substring in a string, case-insensitive.

**Parameters:**
- `string $haystack`: The string being checked.
- `string $needle`: The string to find in haystack.
- `int $offset [optional]`: The search offset. If it is not specified, 0 is used.

**Return:**
- `false|int`: The numeric position of the first occurrence of needle in the haystack string. If needle is not found, it returns false.

--------
## stristr(string $haystack, string $needle, bool $before_needle, string $encoding, bool $clean_utf8): false|string
Returns all of haystack starting from and including the first occurrence of needle to the end.

EXAMPLE:
- `$str = 'iñtërnâtiônàlizætiøn';`
- `$search = 'NÂT';`
- `UTF8::stristr($str, $search); // 'nâtiônàlizætiøn'`
- `UTF8::stristr($str, $search, true); // 'iñtër'`

**Parameters:**
- `string $haystack`: The input string. Must be valid UTF-8.
- `string $needle`: The string to look for. Must be valid UTF-8.
- `bool $before_needle [optional]`: If TRUE, it returns the part of the haystack before the first occurrence of the needle (excluding the needle).
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `false|string`: A sub-string, or false if needle is not found.

--------
## strlen(string $str, string $encoding, bool $clean_utf8): false|int
Get the string length, not the byte-length!

INFO: use UTF8::strwidth() for the display width

EXAMPLE: `UTF8::strlen("Iñtërnâtiôn\xE9àlizætiøn"); // 20`

**Parameters:**
- `string $str`: The string being checked for length.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `false|int`: The number (int) of characters in the string $str having character encoding $encoding. (One multi-byte character counted as +1). Can return false, if e.g. mbstring is not installed and we process invalid chars.

--------
## strlen_in_byte(string $str): int
Get string length in byte.

**Parameters:**
- `string $str`

**Return:**
- `int`

--------
## strnatcasecmp(string $str1, string $str2, string $encoding): int
Case-insensitive string comparisons using a "natural order" algorithm.

INFO: natural order version of UTF8::strcasecmp()

EXAMPLES:
- `UTF8::strnatcasecmp('2', '10Hello WORLD 中文空白!'); // -1`
- `UTF8::strcasecmp('2Hello world 中文空白!', '10Hello WORLD 中文空白!'); // 1`
- `UTF8::strnatcasecmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // 1`
- `UTF8::strcasecmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // -1`

**Parameters:**
- `string $str1`: The first string.
- `string $str2`: The second string.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `int`: < 0 if str1 is less than str2; > 0 if str1 is greater than str2; 0 if they are equal.

--------
## strnatcmp(string $str1, string $str2): int
String comparisons using a "natural order" algorithm.

INFO: natural order version of UTF8::strcmp()

EXAMPLES:
- `UTF8::strnatcmp('2Hello world 中文空白!', '10Hello WORLD 中文空白!'); // -1`
- `UTF8::strcmp('2Hello world 中文空白!', '10Hello WORLD 中文空白!'); // 1`
- `UTF8::strnatcmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // 1`
- `UTF8::strcmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // -1`

**Parameters:**
- `string $str1`: The first string.
- `string $str2`: The second string.

**Return:**
- `int`: < 0 if str1 is less than str2; > 0 if str1 is greater than str2; 0 if they are equal.

--------
## strncasecmp(string $str1, string $str2, int $len, string $encoding): int
Case-insensitive string comparison of the first n characters.

EXAMPLE: `UTF8::strncasecmp("iñtërnâtiôn\nàlizætiøn321", "iñtërnâtiôn\nàlizætiøn123", 5); // 0`

**Parameters:**
- `string $str1`: The first string.
- `string $str2`: The second string.
- `int $len`: The length of strings to be used in the comparison.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `int`: < 0 if str1 is less than str2; > 0 if str1 is greater than str2; 0 if they are equal.

--------
## strncmp(string $str1, string $str2, int $len, string $encoding): int
String comparison of the first n characters.

EXAMPLE: `UTF8::strncmp("Iñtërnâtiôn\nàlizætiøn321", "Iñtërnâtiôn\nàlizætiøn123", 5); // 0`

**Parameters:**
- `string $str1`: The first string.
- `string $str2`: The second string.
- `int $len`: Number of characters to use in the comparison.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `int`: < 0 if str1 is less than str2; > 0 if str1 is greater than str2; 0 if they are equal.

--------
## strpbrk(string $haystack, string $char_list): false|string
Search a string for any of a set of characters.

EXAMPLE: `UTF8::strpbrk('-中文空白-', '白'); // '白-'`

**Parameters:**
- `string $haystack`: The string where char_list is looked for.
- `string $char_list`: This parameter is case-sensitive.

**Return:**
- `false|string`: The string starting from the character found, or false if it is not found.

--------
## strpos(string $haystack, int|string $needle, int $offset, string $encoding, bool $clean_utf8): false|int
Find the position of the first occurrence of a substring in a string.

INFO: use UTF8::strpos_in_byte() for the position in bytes

EXAMPLE: `UTF8::strpos('ABC-ÖÄÜ-中文空白-中文空白', '中'); // 8`

**Parameters:**
- `string $haystack`: The string from which to get the position of the first occurrence of needle.
- `int|string $needle`: The string to find in haystack, or a code point as int.
- `int $offset [optional]`: The search offset. If it is not specified, 0 is used.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `false|int`: The (int) numeric position of the first occurrence of needle in the haystack string. If needle is not found it returns false.

--------
## strpos_in_byte(string $haystack, string $needle, int $offset): false|int
Find the position of the first occurrence of a substring in a string.

**Parameters:**
- `string $haystack`: The string being checked.
- `string $needle`: The string to find in haystack.
- `int $offset [optional]`: The search offset. If it is not specified, 0 is used.

**Return:**
- `false|int`: The numeric position of the first occurrence of needle in the haystack string. If needle is not found, it returns false.

--------
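Assuming, as the name suggests, that `strpos_in_byte` reports offsets in bytes while `strpos` reports offsets in characters, the difference can be shown with PHP's own `strpos()` and `mb_strpos()`. The snippet uses only built-in functions and does not call the library.

```php
<?php
declare(strict_types=1);

$haystack = '中文空白-中文空白'; // every CJK character here takes 3 bytes in UTF-8
$needle   = '空';

$bytePos = strpos($haystack, $needle);                // counts bytes
$charPos = mb_strpos($haystack, $needle, 0, 'UTF-8'); // counts characters

var_dump($bytePos); // int(6)
var_dump($charPos); // int(2)

// A byte offset can be converted to a character offset by measuring the
// character length of everything that precedes it.
var_dump(mb_strlen(substr($haystack, 0, $bytePos), 'UTF-8')); // int(2)
```

Mixing the two kinds of offsets is a common source of off-by-n bugs, so it helps to keep byte positions and character positions in separately named variables.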
## strrchr(string $haystack, string $needle, bool $before_needle, string $encoding, bool $clean_utf8): false|string
Find the last occurrence of a character in a string within another.

EXAMPLE: `UTF8::strrchr('κόσμεκόσμε-äöü', 'κόσμε'); // 'κόσμε-äöü'`

**Parameters:**
- `string $haystack`: The string from which to get the last occurrence of needle.
- `string $needle`: The string to find in haystack.
- `bool $before_needle [optional]`: Determines which portion of haystack this function returns. If set to true, it returns all of haystack from the beginning to the last occurrence of needle. If set to false, it returns all of haystack from the last occurrence of needle to the end.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `false|string`: The portion of haystack or false if needle is not found.

--------
## strrev(string $str, string $encoding): string
Reverses characters order in the string.

EXAMPLE: `UTF8::strrev('κ-öäü'); // 'üäö-κ'`

**Parameters:**
- `string $str`: The input string.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `string`: The string with characters in the reverse sequence.

--------
## strrichr(string $haystack, string $needle, bool $before_needle, string $encoding, bool $clean_utf8): false|string
Find the last occurrence of a character in a string within another, case-insensitive.

EXAMPLE: `UTF8::strrichr('Aκόσμεκόσμε-äöü', 'aκόσμε'); // 'Aκόσμεκόσμε-äöü'`

**Parameters:**
- `string $haystack`: The string from which to get the last occurrence of needle.
- `string $needle`: The string to find in haystack.
- `bool $before_needle [optional]`: Determines which portion of haystack this function returns. If set to true, it returns all of haystack from the beginning to the last occurrence of needle. If set to false, it returns all of haystack from the last occurrence of needle to the end.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `false|string`: The portion of haystack or false if needle is not found.

--------
## strripos(string $haystack, int|string $needle, int $offset, string $encoding, bool $clean_utf8): false|int
Find the position of the last occurrence of a substring in a string, case-insensitive.

EXAMPLE: `UTF8::strripos('ABC-ÖÄÜ-中文空白-中文空白', '中'); // 13`

**Parameters:**
- `string $haystack`: The string to look in.
- `int|string $needle`: The string to look for.
- `int $offset [optional]`: Number of characters to ignore in the beginning or end.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `false|int`: The (int) numeric position of the last occurrence of needle in the haystack string. If needle is not found, it returns false.

--------
## strripos_in_byte(string $haystack, string $needle, int $offset): false|int
Finds position of last occurrence of a string within another, case-insensitive.

**Parameters:**
- `string $haystack`: The string from which to get the position of the last occurrence of needle.
- `string $needle`: The string to find in haystack.
- `int $offset [optional]`: The position in haystack to start searching.

**Return:**
- `false|int`: Return the numeric position of the last occurrence of needle in the haystack string, or false if needle is not found.

--------
## strrpos(string $haystack, int|string $needle, int $offset, string $encoding, bool $clean_utf8): false|int
Find the position of the last occurrence of a substring in a string.

EXAMPLE: `UTF8::strrpos('ABC-ÖÄÜ-中文空白-中文空白', '中'); // 13`

**Parameters:**
- `string $haystack`: The string being checked, for the last occurrence of needle.
- `int|string $needle`: The string to find in haystack, or a code point as int.
- `int $offset [optional]`: May be specified to begin searching an arbitrary number of characters into the string. Negative values will stop searching at an arbitrary point prior to the end of the string.
- `string $encoding [optional]`: Set the charset.
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `false|int`: The (int) numeric position of the last occurrence of needle in the haystack string. If needle is not found, it returns false.

--------
## strrpos_in_byte(string $haystack, string $needle, int $offset): false|int
Find the position of the last occurrence of a substring in a string.

**Parameters:**
- `string $haystack`: The string being checked, for the last occurrence of needle.
- `string $needle`: The string to find in haystack.
- `int $offset [optional]`: May be specified to begin searching an arbitrary number of characters into the string. Negative values will stop searching at an arbitrary point prior to the end of the string.

**Return:**
- `false|int`: The numeric position of the last occurrence of needle in the haystack string. If needle is not found, it returns false.

--------
## strspn(string $str, string $mask, int $offset, int|null $length, string $encoding): false|int
Finds the length of the initial segment of a string consisting entirely of characters contained within a given mask.

EXAMPLE: `UTF8::strspn('iñtërnâtiônàlizætiøn', 'itñ'); // 3`

**Parameters:**
- `string $str`: The input string.
- `string $mask`: The mask of chars.
- `int $offset [optional]`
- `int|null $length [optional]`
- `string $encoding [optional]`: Set the charset.

**Return:**
- `false|int`

--------
## strstr(string $haystack, string $needle, bool $before_needle, string $encoding, bool $clean_utf8): false|string
Returns part of haystack string from the first occurrence of needle to the end of haystack.

EXAMPLE:
- `$str = 'iñtërnâtiônàlizætiøn';`
- `$search = 'nât';`
- `UTF8::strstr($str, $search); // 'nâtiônàlizætiøn'`
- `UTF8::strstr($str, $search, true); // 'iñtër'`

**Parameters:**
- `string $haystack`: The input string. Must be valid UTF-8.
- `string $needle`: The string to look for. Must be valid UTF-8.
- `bool $before_needle [optional]`: If TRUE, strstr() returns the part of the haystack before the first occurrence of the needle (excluding the needle).
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `false|string`: A sub-string, or false if needle is not found.

--------
## strstr_in_byte(string $haystack, string $needle, bool $before_needle): false|string
Finds first occurrence of a string within another.

**Parameters:**
- `string $haystack`: The string from which to get the first occurrence of needle.
- `string $needle`: The string to find in haystack.
- `bool $before_needle [optional]`: Determines which portion of haystack this function returns. If set to true, it returns all of haystack from the beginning to the first occurrence of needle. If set to false, it returns all of haystack from the first occurrence of needle to the end.

**Return:**
- `false|string`: The portion of haystack, or false if needle is not found.

--------
## strtocasefold(string $str, bool $full, bool $clean_utf8, string $encoding, string|null $lang, bool $lower): string
Unicode transformation for case-less matching.

EXAMPLE: `UTF8::strtocasefold('ǰ◌̱'); // 'ǰ◌̱'`

**Parameters:**
- `string $str`: The input string.
- `bool $full [optional]`: true => replace full case folding chars (default); false => use only the limited static array [UTF8::$COMMON_CASE_FOLD]
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.
- `string $encoding [optional]`: Set the charset.
- `string|null $lang [optional]`: Set the language for special cases: az, el, lt, tr
- `bool $lower [optional]`: Use a lowercase string, otherwise use an uppercase string. Note: uppercase works better for some languages.

**Return:**
- `string`

--------
## strtolower(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string
Make a string lowercase.

EXAMPLE: `UTF8::strtolower('DÉJÀ Σσς Iıİi'); // 'déjà σσς iıii'`

**Parameters:**
- `string $str`: The string being lowercased.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.
- `string|null $lang [optional]`: Set the language for special cases: az, el, lt, tr
- `bool $try_to_keep_the_string_length [optional]`: If true, try to keep the string length, e.g. ẞ -> ß

**Return:**
- `string`: String with all alphabetic characters converted to lowercase.

--------
## strtoupper(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string
Make a string uppercase.

EXAMPLE: `UTF8::strtoupper('Déjà Σσς Iıİi'); // 'DÉJÀ ΣΣΣ IIİI'`

**Parameters:**
- `string $str`: The string being uppercased.
- `string $encoding [optional]`: Set the charset.
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.
- `string|null $lang [optional]`: Set the language for special cases: az, el, lt, tr
- `bool $try_to_keep_the_string_length [optional]`: If true, try to keep the string length, e.g. ẞ -> ß

**Return:**
- `string`: String with all alphabetic characters converted to uppercase.

--------
## strtr(string $str, string|string[] $from, string|string[] $to): string
Translate characters or replace sub-strings.

EXAMPLE:
- `$array = ['Hello' => '○●◎', '中文空白' => 'earth'];`
- `UTF8::strtr('Hello 中文空白', $array); // '○●◎ earth'`

**Parameters:**
- `string $str`: The string being translated.
- `string|string[] $from`: The string(s) to replace.
- `string|string[] $to [optional]`: The string(s) to translate to.

**Return:**
- `string`: This function returns a copy of str, translating all occurrences of each character in "from" to the corresponding character in "to".

--------
## strwidth(string $str, string $encoding, bool $clean_utf8): int
Return the width of a string.

INFO: use UTF8::strlen() for the character length and UTF8::strlen_in_byte() for the byte length

EXAMPLE: `UTF8::strwidth("Iñtërnâtiôn\xE9àlizætiøn"); // 21`

**Parameters:**
- `string $str`: The input string.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `int`

--------
## substr(string $str, int $offset, int|null $length, string $encoding, bool $clean_utf8): false|string
Get part of a string.

EXAMPLE: `UTF8::substr('中文空白', 1, 2); // '文空'`

**Parameters:**
- `string $str`: The string being checked.
- `int $offset`: The first position used in str.
- `int|null $length [optional]`: The maximum length of the returned string.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `false|string`: The portion of str specified by the offset and length parameters. If str is shorter than offset characters long, FALSE will be returned.

--------
## substr_compare(string $str1, string $str2, int $offset, int|null $length, bool $case_insensitivity, string $encoding): int
Binary-safe comparison of two strings from an offset, up to a length of characters.

EXAMPLE:
- `UTF8::substr_compare("○●◎\r", '●◎', 0, 2); // -1`
- `UTF8::substr_compare("○●◎\r", '◎●', 1, 2); // 1`
- `UTF8::substr_compare("○●◎\r", '●◎', 1, 2); // 0`

**Parameters:**
- `string $str1`: The main string being compared.
- `string $str2`: The secondary string being compared.
- `int $offset [optional]`: The start position for the comparison. If negative, it starts counting from the end of the string.
- `int|null $length [optional]`: The length of the comparison. The default value is the largest of the length of the str compared to the length of main_str less the offset.
- `bool $case_insensitivity [optional]`: If case_insensitivity is TRUE, comparison is case insensitive.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `int`: < 0 if str1 is less than str2; > 0 if str1 is greater than str2; 0 if they are equal.

--------
## substr_count(string $haystack, string $needle, int $offset, int|null $length, string $encoding, bool $clean_utf8): false|int
Count the number of substring occurrences.

EXAMPLE: `UTF8::substr_count('中文空白', '文空', 1, 2); // 1`

**Parameters:**
- `string $haystack`: The string to search in.
- `string $needle`: The substring to search for.
- `int $offset [optional]`: The offset where to start counting.
- `int|null $length [optional]`: The maximum length after the specified offset to search for the substring. It outputs a warning if the offset plus the length is greater than the haystack length.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.

**Return:**
- `false|int`: This function returns an integer, or false if there isn't a string.

--------
## substr_count_in_byte(string $haystack, string $needle, int $offset, int|null $length): false|int
Count the number of substring occurrences.

**Parameters:**
- `string $haystack`: The string being checked.
- `string $needle`: The string being found.
- `int $offset [optional]`: The offset where to start counting.
- `int|null $length [optional]`: The maximum length after the specified offset to search for the substring. It outputs a warning if the offset plus the length is greater than the haystack length.

**Return:**
- `false|int`: The number of times the needle substring occurs in the haystack string.

--------
## substr_count_simple(string $str, string $substring, bool $case_sensitive, string $encoding): int
Returns the number of occurrences of $substring in the given string.

By default, the comparison is case-sensitive, but can be made insensitive by setting $case_sensitive to false.

**Parameters:**
- `string $str`: The input string.
- `string $substring`: The substring to search for.
- `bool $case_sensitive [optional]`: Whether or not to enforce case-sensitivity. Default: true
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `int`

--------
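The case-sensitivity switch described for `substr_count_simple` above can be sketched by normalizing both strings to one case before counting. `sketch_substr_count` is a hypothetical helper built on mbstring. Uppercasing both sides and returning 0 for empty input are assumptions of this sketch, not the library's implementation.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not the library's code.
 */
function sketch_substr_count(string $str, string $substring, bool $caseSensitive = true): int
{
    if ($str === '' || $substring === '') {
        return 0; // assumption: empty input counts as no occurrence
    }

    if (!$caseSensitive) {
        // Bring both sides to the same case so that only the letters matter.
        $str = mb_strtoupper($str, 'UTF-8');
        $substring = mb_strtoupper($substring, 'UTF-8');
    }

    return mb_substr_count($str, $substring, 'UTF-8');
}

var_dump(sketch_substr_count('Öl öl ÖL', 'öl'));        // int(1)
var_dump(sketch_substr_count('Öl öl ÖL', 'öl', false)); // int(3)
```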
## substr_ileft(string $haystack, string $needle): string
Removes a prefix ($needle) from the beginning of the string ($haystack), case-insensitive.

EXAMPLE:
- `UTF8::substr_ileft('ΚόσμεMiddleEnd', 'Κόσμε'); // 'MiddleEnd'`
- `UTF8::substr_ileft('ΚόσμεMiddleEnd', 'κόσμε'); // 'MiddleEnd'`

**Parameters:**
- `string $haystack`: The string to search in.
- `string $needle`: The substring to search for.

**Return:**
- `string`: Return the sub-string.

--------
## substr_in_byte(string $str, int $offset, int|null $length): false|string
Get part of a string, processed in bytes.

**Parameters:**
- `string $str`: The string being checked.
- `int $offset`: The first position used in str.
- `int|null $length [optional]`: The maximum length of the returned string.

**Return:**
- `false|string`: The portion of str specified by the offset and length parameters. If str is shorter than offset characters long, FALSE will be returned.

--------
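Working in bytes, as `substr_in_byte` does according to its description, means a cut can land in the middle of a multibyte character. The snippet below demonstrates this with PHP's built-in `substr()`, `mb_substr()` and `mb_strcut()`. It illustrates the general byte-versus-character issue and does not call the library.

```php
<?php
declare(strict_types=1);

$str = '中文空白'; // 4 characters, 3 bytes each in UTF-8

$bytes = substr($str, 0, 4);             // 4 bytes: '中' plus the first byte of '文'
$chars = mb_substr($str, 0, 4, 'UTF-8'); // 4 characters: the whole string
$cut   = mb_strcut($str, 0, 4, 'UTF-8'); // at most 4 bytes, cut at a character boundary

var_dump(mb_check_encoding($bytes, 'UTF-8')); // bool(false): the slice is no longer valid UTF-8
var_dump($chars === $str);                    // bool(true)
var_dump($cut);                               // string(3) "中"
```

When a byte budget matters (for example a fixed-width database column), a boundary-aware cut such as `mb_strcut()` avoids producing invalid UTF-8.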
## substr_iright(string $haystack, string $needle): string
Removes a suffix ($needle) from the end of the string ($haystack), case-insensitive.

EXAMPLE:
- `UTF8::substr_iright('BeginMiddleΚόσμε', 'Κόσμε'); // 'BeginMiddle'`
- `UTF8::substr_iright('BeginMiddleΚόσμε', 'κόσμε'); // 'BeginMiddle'`

**Parameters:**
- `string $haystack`: The string to search in.
- `string $needle`: The substring to search for.

**Return:**
- `string`: Return the sub-string.

--------
## substr_left(string $haystack, string $needle): string
Removes a prefix ($needle) from the beginning of the string ($haystack).
EXAMPLE:
`UTF8::substr_left('ΚόσμεMiddleEnd', 'Κόσμε'); // 'MiddleEnd'`
`UTF8::substr_left('ΚόσμεMiddleEnd', 'κόσμε'); // 'ΚόσμεMiddleEnd'`
**Parameters:**
- `string $haystack` The string to search in.
- `string $needle` The substring to search for.
**Return:**
- `string` Return the sub-string.
--------
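The difference between the case-sensitive `substr_left` and the case-insensitive `substr_ileft` can be sketched in plain PHP. The helper `remove_prefix_sketch` and its `$ignoreCase` flag are hypothetical names chosen for this illustration; the library's own implementation may differ. The mbstring extension is assumed.

```php
<?php
// Hypothetical helper for illustration only; not part of portable-utf8.
function remove_prefix_sketch(string $haystack, string $needle, bool $ignoreCase = false): string
{
    if ($needle === '') {
        return $haystack;
    }

    // Compare the first N characters (not bytes) of the haystack with the needle.
    $length = mb_strlen($needle, 'UTF-8');
    $head = mb_substr($haystack, 0, $length, 'UTF-8');

    $matches = $ignoreCase
        ? mb_strtolower($head, 'UTF-8') === mb_strtolower($needle, 'UTF-8')
        : $head === $needle;

    // Drop the prefix only when it matched; otherwise return the input unchanged.
    return $matches ? mb_substr($haystack, $length, null, 'UTF-8') : $haystack;
}

echo remove_prefix_sketch('ΚόσμεMiddleEnd', 'Κόσμε'), "\n";       // MiddleEnd
echo remove_prefix_sketch('ΚόσμεMiddleEnd', 'κόσμε'), "\n";       // ΚόσμεMiddleEnd
echo remove_prefix_sketch('ΚόσμεMiddleEnd', 'κόσμε', true), "\n"; // MiddleEnd
```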
## substr_replace(string|string[] $str, string|string[] $replacement, int|int[] $offset, int|int[]|null $length, string $encoding): string|string[]
Replace text within a portion of a string.
EXAMPLE:
`UTF8::substr_replace(array('Iñtërnâtiônàlizætiøn', 'foo'), 'æ', 1); // array('Iæñtërnâtiônàlizætiøn', 'fæoo')`
source: https://gist.github.com/stemar/8287074
**Parameters:**
- `TSubReplace $str` The input string or an array of strings.
- `string|string[] $replacement` The replacement string or an array of strings.
- `int|int[] $offset` If offset is positive, the replacing will begin at that offset into the string. If offset is negative, the replacing will begin at that many characters from the end of the string.
- `int|int[]|null $length [optional]` If given and positive, it represents the length of the portion of the string which is to be replaced. If it is negative, it represents the number of characters from the end of the string at which to stop replacing. If it is not given, it defaults to the length of the string, i.e. the replacing ends at the end of the string. If length is zero, the function inserts the replacement into the string at the given offset.
- `string $encoding [optional]` Set the charset, e.g. for the "mb_" functions.
**Return:**
- `string|string[]` The result string is returned. If string is an array then array is returned.
--------
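The `$offset` and `$length` rules of `substr_replace` can be illustrated for a single string by splitting it into characters and splicing the array. This is a sketch under assumptions, not the library's code: `substr_replace_sketch` is a hypothetical helper, it handles only plain strings (no arrays), it takes `$length` as a required argument, and it relies on `mb_str_split()` (PHP 7.4+ with mbstring).

```php
<?php
// Hypothetical helper for illustration only; not part of portable-utf8.
function substr_replace_sketch(string $str, string $replacement, int $offset, int $length): string
{
    // Work on characters rather than bytes.
    $chars = mb_str_split($str, 1, 'UTF-8');

    // array_splice() treats negative offsets and lengths like substr_replace():
    // a negative offset counts from the end, and a negative length stops that
    // many elements before the end.
    array_splice($chars, $offset, $length, [$replacement]);

    return implode('', $chars);
}

echo substr_replace_sketch('Iñtërnâtiônàlizætiøn', 'æ', 1, 0), "\n"; // Iæñtërnâtiônàlizætiøn
echo substr_replace_sketch('foo', 'æ', 1, 1), "\n";                  // fæo
echo substr_replace_sketch('foo', 'æ', -1, 1), "\n";                 // foæ
```

With a length of zero nothing is removed, so the replacement is simply inserted at the offset.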
## substr_right(string $haystack, string $needle, string $encoding): string
Removes a suffix ($needle) from the end of the string ($haystack).
EXAMPLE:
`UTF8::substr_right('BeginMiddleΚόσμε', 'Κόσμε'); // 'BeginMiddle'`
`UTF8::substr_right('BeginMiddleΚόσμε', 'κόσμε'); // 'BeginMiddleΚόσμε'`
**Parameters:**
- `string $haystack` The string to search in.
- `string $needle` The substring to search for.
- `string $encoding [optional]` Set the charset, e.g. for the "mb_" functions.
**Return:**
- `string` Return the sub-string.
--------
## swapCase(string $str, string $encoding, bool $clean_utf8): string
Returns a case swapped version of the string.
EXAMPLE:
`UTF8::swapCase('déJÀ σσς iıII'); // 'DÉjà ΣΣΣ IIii'`
**Parameters:**
- `string $str` The input string.
- `string $encoding [optional]` Set the charset, e.g. for the "mb_" functions.
- `bool $clean_utf8 [optional]` Remove non UTF-8 chars from the string.
**Return:**
- `string` Each character's case swapped.
--------
## symfony_polyfill_used(): bool
Checks whether symfony-polyfills are used.
**Parameters:**
__nothing__
**Return:**
- `bool` true if in use, false otherwise
--------
## tabs_to_spaces(string $str, int $tab_length): string
**Parameters:**
- `string $str`
- `int $tab_length`
**Return:**
- `string`
--------
## titlecase(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string
Converts the first character of each word in the string to uppercase
and all other chars to lowercase.
**Parameters:**
- `string $str` The input string.
- `string $encoding [optional]` Set the charset, e.g. for the "mb_" functions.
- `bool $clean_utf8 [optional]` Remove non UTF-8 chars from the string.
- `string|null $lang [optional]` Set the language for special cases: az, el, lt, tr
- `bool $try_to_keep_the_string_length [optional]` true === try to keep the string length: e.g. ẞ -> ß
**Return:**
- `string` A string with all characters of $str being title-cased.
--------
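The basic behaviour described for `titlecase` (first character of each word uppercase, the rest lowercase) corresponds to what PHP's own `mb_convert_case()` does with `MB_CASE_TITLE`. The sketch below shows only that core idea; the wrapper name `titlecase_sketch` is hypothetical, and the language-specific handling (`$lang`) and length-preserving option documented above are deliberately left out.

```php
<?php
// Hypothetical helper for illustration only; not part of portable-utf8.
function titlecase_sketch(string $str, string $encoding = 'UTF-8'): string
{
    // MB_CASE_TITLE uppercases the first letter of each word and lowercases the rest.
    return mb_convert_case($str, MB_CASE_TITLE, $encoding);
}

echo titlecase_sketch('fòÔ bàř'), "\n"; // Fòô Bàř
```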
## to_ascii(string $str, string $unknown, bool $strict): string
Convert a string into ASCII.
EXAMPLE:
`UTF8::to_ascii('déjà σσς iıii'); // 'deja sss iiii'`
**Parameters:**
- `string $str` The input string.
- `string $unknown [optional]` Character to use if a character is unknown (default is ?).
- `bool $strict [optional]` Use "transliterator_transliterate()" from PHP-Intl | WARNING: bad performance
**Return:**
- `string`
--------
## to_boolean(bool|float|int|string $str): bool
**Parameters:**
- `bool|float|int|string $str`
**Return:**
- `bool`
--------
## to_filename(string $str, bool $use_transliterate, string $fallback_char): string
Convert the given string to a safe filename (and keep string case).
**Parameters:**
- `string $str`
- `bool $use_transliterate` No transliteration, conversion etc. is done by default - unsafe characters are simply replaced with a hyphen.
- `string $fallback_char`
**Return:**
- `string`
--------
## to_int(string $str): int|null
Returns the given string as an integer, or null if the string isn't numeric.
**Parameters:**
- `string $str`
**Return:**
- `int|null` null if the string isn't numeric
--------
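The contract of `to_int` (an integer for numeric input, `null` otherwise) can be sketched as follows. This assumes that "numeric" means whatever PHP's `is_numeric()` accepts, which the text above does not spell out; `to_int_sketch` is a hypothetical helper and not the library function.

```php
<?php
// Hypothetical helper for illustration only; not part of portable-utf8.
// Assumption: "numeric" is interpreted with PHP's is_numeric().
function to_int_sketch(string $str): ?int
{
    return is_numeric($str) ? (int) $str : null;
}

var_dump(to_int_sketch('42'));  // int(42)
var_dump(to_int_sketch('foo')); // NULL
```

Returning `null` rather than `0` lets callers tell "not a number" apart from a genuine zero.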
## to_iso8859(string|string[] $str): string|string[]
Convert a string into "ISO-8859"-encoding (Latin-1).
EXAMPLE:
`UTF8::to_utf8(UTF8::to_iso8859(' -ABC-中文空白- ')); // ' -ABC-????- '`
**Parameters:**
- `TToIso8859 $str`
**Return:**
- `string|string[]`
--------
## to_string(float|int|object|string|null $input): string|null
Returns the given input as string, or null if the input isn't int|float|string
and does not implement the "__toString()" method.
**Parameters:**
- `float|int|object|string|null $input`
**Return:**
- `string|null` null if the input isn't int|float|string and has no "__toString()" method
--------
## to_utf8(string|string[] $str, bool $decode_html_entity_to_utf8): string|string[]
This function leaves UTF-8 characters alone, while converting almost all non-UTF8 to UTF8.
- It decodes UTF-8 codepoints and Unicode escape sequences.
- It assumes that the encoding of the original string is either WINDOWS-1252 or ISO-8859.
- WARNING: It does not remove invalid UTF-8 characters, so you may need to use "UTF8::clean()" for this case.
EXAMPLE: `UTF8::to_utf8(["\u0063\u0061\u0074"]); // array('cat')`
**Parameters:**
- `TToUtf8 $str` Any string or array of strings.
- `bool $decode_html_entity_to_utf8` Set to true, if you need to decode html-entities.
**Return:**
- `string|string[]` The UTF-8 encoded string
--------
## to_utf8_string(string $str, bool $decode_html_entity_to_utf8): string
This function leaves UTF-8 characters alone, while converting almost all non-UTF8 to UTF8.
- It decodes UTF-8 codepoints and Unicode escape sequences.
- It assumes that the encoding of the original string is either WINDOWS-1252 or ISO-8859.
- WARNING: It does not remove invalid UTF-8 characters, so you may need to use "UTF8::clean()" for this case.
EXAMPLE: `UTF8::to_utf8_string("\u0063\u0061\u0074"); // 'cat'`
**Parameters:**
- `T $str` Any string.
- `bool $decode_html_entity_to_utf8` Set to true, if you need to decode html-entities.
**Return:**
- `string` The UTF-8 encoded string
--------
## trim(string $str, string|null $chars): string
Strip whitespace or other characters from the beginning and end of a UTF-8 string.
INFO: This is slower than "trim()".
The native function could only be used if the string and chars were pure 7-bit ASCII,
but checking for ASCII (7-bit) costs more time than it would save here.
EXAMPLE: `UTF8::trim(' -ABC-中文空白- '); // '-ABC-中文空白-'`
**Parameters:**
- `string $str` The string to be trimmed
- `string|null $chars [optional]` Optional characters to be stripped
**Return:**
- `string` The trimmed string.
--------
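Why a UTF-8 aware `trim` is needed at all can be seen by trimming a multibyte character with PHP's native `trim()`, which works on single bytes. The sketch below uses a regular expression with the `u` modifier as one possible UTF-8 safe approach; `trim_sketch` is a hypothetical helper, and this is not necessarily how the library implements it.

```php
<?php
// Hypothetical helper for illustration only; not part of portable-utf8.
function trim_sketch(string $str, ?string $chars = null): string
{
    if ($str === '' || $chars === '') {
        return $str;
    }

    // Build a character class: Unicode whitespace by default, else the given chars.
    $class = $chars === null ? '\s' : preg_quote($chars, '/');
    $pattern = '/\A[' . $class . ']+|[' . $class . ']+\z/u';

    // preg_replace() returns null on invalid UTF-8; the cast turns that into ''.
    return (string) preg_replace($pattern, '', $str);
}

echo trim_sketch(' -ABC-中文空白- '), "\n"; // -ABC-中文空白-
echo trim_sketch('ñfooø', 'ø'), "\n";       // ñfoo

// Native trim() strips the bytes of "ø" individually and also eats the
// first byte of "ñ", leaving an invalid UTF-8 sequence behind.
echo bin2hex(trim('ñfooø', 'ø')), "\n";     // b1666f6f
```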
## ucfirst(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string
Makes string's first char uppercase.
EXAMPLE: `UTF8::ucfirst('ñtërnâtiônàlizætiøn foo'); // 'Ñtërnâtiônàlizætiøn foo'`
**Parameters:**
- `string $str` The input string.
- `string $encoding [optional]` Set the charset, e.g. for the "mb_" functions.
- `bool $clean_utf8 [optional]` Remove non UTF-8 chars from the string.
- `string|null $lang [optional]` Set the language for special cases: az, el, lt, tr
- `bool $try_to_keep_the_string_length [optional]` true === try to keep the string length: e.g. ẞ -> ß
**Return:**
- `string` The resulting string with the first char in uppercase.
--------
## ucwords(string $str, string[] $exceptions, string $char_list, string $encoding, bool $clean_utf8): string
Uppercase for all words in the string.
EXAMPLE: `UTF8::ucwords('iñt ërn âTi ônà liz æti øn'); // 'Iñt Ërn ÂTi Ônà Liz Æti Øn'`
**Parameters:**
- `string $str` The input string.
- `string[] $exceptions [optional]` Exclusion for some words.
- `string $char_list [optional]` Additional chars that belong to words and do not start a new word.
- `string $encoding [optional]` Set the charset.
- `bool $clean_utf8 [optional]` Remove non UTF-8 chars from the string.
**Return:**
- `string`
--------
## urldecode(string $str, bool $multi_decode): string
URL-decodes a string, with multi-decoding of HTML entities and a fix for URL-encoded Windows-1252 characters.
EXAMPLE: `UTF8::urldecode('tes%20öäü%20\u00edtest+test'); // 'tes öäü ítest test'`
e.g.:
- `'test+test'` => `'test test'`
- `'D&#252;sseldorf'` => `'Düsseldorf'`
- `'D%FCsseldorf'` => `'Düsseldorf'`
- `'D&#xFC;sseldorf'` => `'Düsseldorf'`
- `'D%26%23xFC%3Bsseldorf'` => `'Düsseldorf'`
- `'DÃ¼sseldorf'` => `'Düsseldorf'`
- `'D%C3%BCsseldorf'` => `'Düsseldorf'`
- `'D%C3%83%C2%BCsseldorf'` => `'Düsseldorf'`
- `'D%25C3%2583%25C2%25BCsseldorf'` => `'Düsseldorf'`
**Parameters:**
- `T $str` The input string.
- `bool $multi_decode` Decode as often as possible.
**Return:**
- `string`
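
The "decode as often as possible" idea behind `$multi_decode` can be sketched with PHP's native `urldecode()` in a loop that stops once the string no longer changes. This covers only the repeated percent-decoding; the HTML-entity decoding and the Windows-1252 repair from the list above are not reproduced. `multi_urldecode_sketch` and its `$maxRounds` limit are hypothetical.

```php
<?php
// Hypothetical helper for illustration only; not part of portable-utf8.
function multi_urldecode_sketch(string $str, int $maxRounds = 10): string
{
    for ($round = 0; $round < $maxRounds; $round++) {
        $decoded = urldecode($str);

        // Stop as soon as another pass changes nothing.
        if ($decoded === $str) {
            break;
        }
        $str = $decoded;
    }

    return $str;
}

echo multi_urldecode_sketch('test+test'), "\n";           // test test
echo multi_urldecode_sketch('D%C3%BCsseldorf'), "\n";     // Düsseldorf
echo multi_urldecode_sketch('D%25C3%25BCsseldorf'), "\n"; // Düsseldorf
```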
--------
## utf8_decode(string $str, bool $keep_utf8_chars): string
Decodes a UTF-8 string to ISO-8859-1.
EXAMPLE: `UTF8::encode('UTF-8', UTF8::utf8_decode('-ABC-中文空白-')); // '-ABC-????-'`
**Parameters:**
- `string $str` The input string.
- `bool $keep_utf8_chars`
**Return:**
- `string`
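
The `????` in the example comes from the fact that ISO-8859-1 has no code points for the Chinese characters, so the conversion is lossy. The same effect can be reproduced with PHP's `mb_convert_encoding()`; this sketch assumes the mbstring extension with its default substitution character (`?`) and does not call the library.

```php
<?php
// Characters outside Latin-1 cannot be represented and are substituted.
$latin1 = mb_convert_encoding('-ABC-中文空白-', 'ISO-8859-1', 'UTF-8');
echo $latin1, "\n"; // -ABC-????-

// Characters that exist in Latin-1 survive a round trip unchanged.
$asLatin1 = mb_convert_encoding('Düsseldorf', 'ISO-8859-1', 'UTF-8');
$roundTrip = mb_convert_encoding($asLatin1, 'UTF-8', 'ISO-8859-1');
var_dump($roundTrip === 'Düsseldorf'); // bool(true)
var_dump(strlen($asLatin1));           // int(10), one byte per character
```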
--------
## utf8_encode(string $str): string
Encodes an ISO-8859-1 string to UTF-8.
EXAMPLE: `UTF8::utf8_decode(UTF8::utf8_encode('-ABC-中文空白-')); // '-ABC-中文空白-'`
**Parameters:**
- `string $str` The input string.
**Return:**
- `string`
--------
## whitespace_table(): string[]
Returns an array with all utf8 whitespace characters.
**Parameters:**
__nothing__
**Return:**
- `string[]` An array with all known whitespace characters as values and the type of whitespace as keys as defined in above URL
--------
## words_limit(string $str, int $limit, string $str_add_on): string
Limit the number of words in a string.
EXAMPLE: `UTF8::words_limit('fòô bàř fòô', 2, ''); // 'fòô bàř'`
**Parameters:**
- `string $str` The input string.
- `int<1, max> $limit` The limit of words as integer.
- `string $str_add_on` Replacement for the stripped string.
**Return:**
- `string`
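
One way to picture the word limit is a regular expression that matches at most `$limit` runs of non-whitespace from the start of the string. The following is a sketch under that assumption; `words_limit_sketch` is a hypothetical helper, and the library's actual implementation and defaults may differ.

```php
<?php
// Hypothetical helper for illustration only; not part of portable-utf8.
function words_limit_sketch(string $str, int $limit, string $addOn): string
{
    if ($str === '' || $limit < 1) {
        return '';
    }

    // Match up to $limit "words" (runs of non-whitespace) plus the whitespace after each.
    $pattern = '/^\s*+(?:\S++\s*+){1,' . $limit . '}/u';

    // Only cut when the match is shorter than the whole string.
    if (preg_match($pattern, $str, $matches) === 1 && strlen($matches[0]) < strlen($str)) {
        return rtrim($matches[0]) . $addOn;
    }

    return $str;
}

echo words_limit_sketch('fòô bàř fòô', 2, ''), "\n";    // fòô bàř
echo words_limit_sketch('fòô bàř fòô', 2, '...'), "\n"; // fòô bàř...
echo words_limit_sketch('fòô bàř fòô', 5, '...'), "\n"; // fòô bàř fòô
```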
--------
## wordwrap(string $str, int $width, string $break, bool $cut): string
Wraps a string to a given number of characters.
EXAMPLE: `UTF8::wordwrap('Iñtërnâtiônàlizætiøn', 2, '<br>', true); // 'Iñ<br>të<br>rn<br>ât<br>iô<br>nà<br>li<br>zæ<br>ti<br>øn'`
**Parameters:**
- `string $str` The input string.
- `int<1, max> $width [optional]` The column width.
- `string $break [optional]` The line is broken using the optional break parameter.
- `bool $cut [optional]` If the cut is set to true, the string is always wrapped at or before the specified width. So if you have a word that is larger than the given width, it is broken apart.
**Return:**
- `string` The given string wrapped at the specified column.
--------
## wordwrap_per_line(string $str, int $width, string $break, bool $cut, bool $add_final_break, string|null $delimiter): string
Line-wraps the string after $width, but first splits the string by $delimiter,
so that each line is wrapped separately.
**Parameters:**
- `string $str` The input string.
- `int<1, max> $width [optional]` The column width.
- `string $break [optional]` The line is broken using the optional break parameter.
- `bool $cut [optional]` If the cut is set to true, the string is always wrapped at or before the specified width. So if you have a word that is larger than the given width, it is broken apart.
- `bool $add_final_break [optional]` If this flag is true, then the method will add a $break at the end of the result string.
- `non-empty-string|null $delimiter [optional]` You can change the default behavior, where we split the string by newline.
**Return:**
- `string`
--------
## ws(): string[]
Returns an array of Unicode White Space characters.
**Parameters:**
__nothing__
**Return:**
- `string[]` An array with numeric code point as key and White Space Character as value.
--------
## Unit Test
1) [Composer](https://getcomposer.org) is a prerequisite for running the tests.
```bash
composer install
```
2) The tests can be executed by running this command from the root directory:
```bash
./vendor/bin/phpunit
```
### Support
For support and donations please visit [GitHub](https://github.com/voku/portable-utf8/) | [Issues](https://github.com/voku/portable-utf8/issues) | [PayPal](https://paypal.me/moelleken) | [Patreon](https://www.patreon.com/voku).
For status updates and release announcements please visit [Releases](https://github.com/voku/portable-utf8/releases) | [Twitter](https://twitter.com/suckup_de) | [Patreon](https://www.patreon.com/voku/posts).
For professional support please contact [me](https://about.me/voku).
### Thanks
- Thanks to [GitHub](https://github.com) (Microsoft) for hosting the code and a good infrastructure including Issues-Management, etc.
- Thanks to [IntelliJ](https://www.jetbrains.com) as they make the best IDEs for PHP and they gave me an open source license for PhpStorm!
- Thanks to [Travis CI](https://travis-ci.com/) for being the most awesome, easiest continuous integration tool out there!
- Thanks to [StyleCI](https://styleci.io/) for the simple but powerful code style check.
- Thanks to [PHPStan](https://github.com/phpstan/phpstan) and [Psalm](https://github.com/vimeo/psalm) for really great static analysis tools and for discovering bugs in the code!
### License and Copyright
"Portable UTF8" is free software; you can redistribute it and/or modify it under
the terms of either of the following licenses, at your option:
- [Apache License v2.0](http://apache.org/licenses/LICENSE-2.0.txt), or
- [GNU General Public License v2.0](http://gnu.org/licenses/gpl-2.0.txt).
Unicode handling requires tedious work to be implemented and maintained in the
long run. As such, contributions such as unit tests, bug reports, comments or
patches licensed under both licenses are very welcome.
[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fvoku%2Fportable-utf8.svg?type=large)](https://app.fossa.io/projects/git%2Bgithub.com%2Fvoku%2Fportable-utf8?ref=badge_large)