Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/ap/udcode

Does a set of code words form a uniquely decodable code?
https://github.com/ap/udcode

perl

Last synced: about 1 month ago
JSON representation

Does a set of code words form a uniquely decodable code?

Awesome Lists containing this project

README

        

use strict; use warnings;

package UDCode;

our $VERSION = '1.04';

BEGIN { require Exporter; *import = \&Exporter::import }
our @EXPORT = qw(is_udcode ud_pair);

=head1 NAME

UDCode - Does a set of code words form a uniquely decodable code?

=head1 SYNOPSIS

use UDCode;

if (is_udcode(@words)) { ... }

my ($x1, $x2) = ud_pair(@words);

=head1 DESCRIPTION

A I is a set of strings, called the I. A code is
I if any string I that is a concatenation of
code words is so in I.

For example, the code C<('ab', 'abba', 'b')> is I uniquely
decodable, because C<'abba' . 'b' eq 'ab' . 'b' . 'ab'>. But the code
C<('a', 'ab', 'abb')> I uniquely decodable, because there is no such
pair of sequences of code words.

This module provides a pair of functions to tell whether a set of
code words is a uniquely decodable code, and to find an example of
sequences of code words whose concatenations are the same, if there is
such a pair.

=head1 INTERFACE

=head2 C

C returns true if and only if the specified code is
uniquely decodable.

=cut

sub is_udcode {
my $N = my ($a, $b) = ud_pair(@_);
return $N == 0;
}

=head2 C

If C<@words> is not a uniquely decodable code, then C
returns a proof of that fact, in the form of two distinct sequences of
code words whose concatenations are equal.

If C<@words> is not uniquely decodable, then C returns
references to two arrays of code words, C<$a>, and C<$b>, such that:

join("", @$a) eq join("", @$b)

For example, given C<@words = qw(ab abba b)>, C might return
the two arrays C<["ab", "b", "ab"]> and C<["abba", "b"]>.

If C<@words> is uniquely decodable, C returns false.

=cut

sub ud_pair {
# Code words
my @c = @_;

# $h{$x} = [$y, $z] means that $x$y eq $z
my %h;

# Queue
my @q;

for my $c1 (@c) {
for my $c2 (@c) {
next if $c1 eq $c2;
if (is_prefix_of($c1, $c2)) {
my $x = subtract($c1, $c2);
$h{$x} = [[$c1], [$c2]];
push @q, $x;
}
}
}

while (@q) {
my $x = shift @q;
return unless defined $x;

my ($a, $b) = @{$h{$x}};
for my $c (@c) {
die unless defined $b; # Can't happen
# $a$x eq $b

my $y;
if (is_prefix_of($c, $x)) {
$y = subtract($c, $x);
next if exists $h{$y}; # already tried this
$h{$y} = [[@$a, $c], $b];
push @q, $y;
} elsif (is_prefix_of($x, $c)) {
$y = subtract($x, $c);
next if exists $h{$y}; # already tried this
$h{$y} = [$b, [@$a, $c]];
push @q, $y;
}

return @{$h{""}} if defined($y) && $y eq "";
}
}
return; # failure
}

sub is_prefix_of {
index($_[1], $_[0]) == 0;
}

sub subtract {
substr($_[1], length($_[0]));
}

=head1 AUTHOR

Mark Jason Dominus

=head1 COPYRIGHT AND LICENSE

This software is hereby released into the public domain. You may use,
modify, or distribute it for any purpose whatsoever without restriction.

=cut

1;