Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/seb35/wikidiff2

PHP extension wikidiff2 from Debian packaging
https://github.com/seb35/wikidiff2

Last synced: 6 days ago
JSON representation

PHP extension wikidiff2 from Debian packaging

Awesome Lists containing this project

README

        

Wikidiff2 is partly based on the original wikidiff, partly on DifferenceEngine.php, and partly my own work. It performs word-level (space-delimited) diffs on general text, and character-level diffs on text composed of characters from the Japanese and Thai alphabets and the unified han. Japanese, Chinese and Thai do not use spaces to separate words. The input is assumed to be UTF-8 encoded. Invalid UTF-8 may cause undesirable operation, such as truncation of the output, so the input should be validated by the application. The input text should have unix-style line endings.

The output is an HTML fragment -- a number of HTML table rows with the rest of the document structure omitted. The characters "<", ">" and "&" will be HTML-escaped in the output.

A number of test files are provided:

* test-a1 and test-a2, expected output test-a.diff
* test-b1 and test-b2, expected output test-b.diff

These pairs together give 85.9% coverage of DiffEngine.h. The remaining untested part is mostly in _DiffEngine::_shift_boundaries().

* chinese-reverse-1.txt and chinese-reverse-2.txt (in chinese-reverse.zip)

These files are 2.3MB each, and give a worst-case performance test. Performance in the worst case is sensitive to the performance of the associative array class used to cross-reference the strings. I tried using an STL map and a Judy array. The Judy array gave an 11% improvement in execution time over the map, which could probably be increased to 15% with further optimisation work. I don't consider that to be a sufficient improvement to warrant adding a library dependency, but the code has been left in for the benefit of Judy fans and performance perfectionists. It can be enabled by compiling with -DUSE_JUDY. The C++ wrapper for JudyHS might be of use to someone.

Wikidiff2 is a PHP extension.

It requires the following library:

* libthai, a Thai language support library
http://linux.thai.net/plone/TLWG/libthai/

To compile and install it:

$ phpize
$ ./configure
$ make
$ sudo make install

== Changelog ==

2013-08-07
* (bug 52616) Use and for insertions and deletions

2012-01-03
* (bug 26354) Add class to dummy cells containing  
* (bug 25697) Add   to empty context lines

2010-09-30
* Automatically use CXXFLAGS=-Wno-write-strings

2010-06-14
* Converted the extension from swig to PHP native.
* Added Thai word break support
* Added PHP allocator support

vim: wrap