https://github.com/tdebatty/php-language-processing
A PHP library for language processing. Includes string distance functions (Levenshtein, Jaro-Winkler, ...), stemming, etc.
Last synced: 7 months ago
- Host: GitHub
- URL: https://github.com/tdebatty/php-language-processing
- Owner: tdebatty
- License: MIT
- Created: 2013-08-13T06:41:18.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2017-09-06T07:04:54.000Z (over 8 years ago)
- Last Synced: 2025-05-13T09:46:35.828Z (8 months ago)
- Language: PHP
- Size: 18.6 KB
- Stars: 27
- Watchers: 5
- Forks: 7
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# php-language-processing
[webd/language on Packagist](https://packagist.org/packages/webd/language)
A PHP library for language processing. Includes string distance functions
(Levenshtein, Jaro-Winkler, LCS distance, ...), stemming, hashing, etc.
Installation using Composer
---------------------------
In `composer.json`:
```json
{
    "require": {
        "webd/language": "dev-master"
    }
}
```
Then run:
```
composer install
```
Usage
-----
```php
<?php
// Load Composer's autoloader (path assumes this script sits next to vendor/).
require __DIR__ . '/vendor/autoload.php';

use webd\language\LCS;
use webd\language\PorterStemmer;
use webd\language\SpamSum;
use webd\language\StringDistance;

$string1 = 'You won 10000$';
$string2 = 'You won 15500$';

// String distances
echo "Edit distance : " . StringDistance::EditDistance($string1, $string2) . PHP_EOL;
echo "Levenshtein : " . StringDistance::Levenshtein($string1, $string2) . PHP_EOL;
echo "Jaro-Winkler : " . StringDistance::JaroWinkler($string1, $string2) . PHP_EOL;
echo "Jaro-Winkler (prefix scale = 0.2) : " . StringDistance::JaroWinkler($string1, $string2, 0.2) . PHP_EOL;

// Porter stemming
echo "analyzing => " . PorterStemmer::Stem("analyzing") . PHP_EOL;
echo "abandoned => " . PorterStemmer::Stem("abandoned") . PHP_EOL;
echo "inclination => " . PorterStemmer::Stem("inclination") . PHP_EOL;

// Longest common subsequence (LCS)
$lcs = new LCS($string1, $string2);
echo $lcs->value() . PHP_EOL;
echo $lcs->length() . PHP_EOL;
echo $lcs->distance() . PHP_EOL;

// SpamSum, aka ssdeep, aka Context-Triggered Piecewise Hashing (CTPH)
$f = __FILE__; // placeholder: replace with the path of the file to hash
$s = new SpamSum();
echo $s->HashString(file_get_contents($f)) . PHP_EOL;
```
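To give a feel for what two of the metrics above measure, here is a minimal sketch in plain PHP that does not use this library at all. `lcsLength` and `lcsDistance` are hypothetical helper names introduced only for illustration. The LCS distance formula used here (sum of both lengths minus twice the LCS length) is one common definition and is an assumption: it may differ from what `LCS::distance()` returns. The Levenshtein value comes from PHP's built-in `levenshtein()` function.

```php
<?php
// Illustrative only: plain PHP, independent of webd\language.

/**
 * Length of the longest common subsequence of $a and $b (compared byte by byte),
 * using the classic dynamic-programming recurrence and keeping two rows in memory.
 */
function lcsLength(string $a, string $b): int
{
    $n = strlen($a);
    $m = strlen($b);
    $prev = array_fill(0, $m + 1, 0);
    for ($i = 1; $i <= $n; $i++) {
        $curr = [0];
        for ($j = 1; $j <= $m; $j++) {
            $curr[$j] = ($a[$i - 1] === $b[$j - 1])
                ? $prev[$j - 1] + 1
                : max($prev[$j], $curr[$j - 1]);
        }
        $prev = $curr;
    }
    return $prev[$m];
}

/**
 * Assumed definition of LCS distance: characters left over in both strings
 * once the common subsequence is removed.
 */
function lcsDistance(string $a, string $b): int
{
    return strlen($a) + strlen($b) - 2 * lcsLength($a, $b);
}

$string1 = 'You won 10000$';
$string2 = 'You won 15500$';

echo "Built-in levenshtein : " . levenshtein($string1, $string2) . PHP_EOL;  // 2
echo "LCS length           : " . lcsLength($string1, $string2) . PHP_EOL;   // 12
echo "LCS distance         : " . lcsDistance($string1, $string2) . PHP_EOL; // 4
```

The two strings differ only in two digit positions, so two substitutions are enough (Levenshtein 2), while the longest common subsequence `You won 100$` keeps 12 of the 14 characters.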