https://github.com/nunorc/lingua-idsplitter

split identifiers into words

Host: GitHub
URL: https://github.com/nunorc/lingua-idsplitter
Owner: nunorc
Created: 2014-10-05T20:33:49.000Z (over 10 years ago)
Default Branch: master
Last Pushed: 2016-07-01T10:52:54.000Z (almost 9 years ago)
Last Synced: 2025-04-09T21:54:43.493Z (about 1 month ago)
Language: Perl
Size: 12.7 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: Changes

README

# NAME

Lingua::IdSplitter - split identifiers into words

# VERSION

version 0.03

# SYNOPSIS

    use Lingua::IdSplitter;

    my $splitter = Lingua::IdSplitter->new;
    $splitter->split($identifier);

# DESCRIPTION

This module implements an algorithm to identify multi-word identifiers
and split them into their individual words. For example, "UserFind" is
split into "user" and "find", and "timesort" into "time" and "sort".

For more details on the algorithm, see the following
[article](http://www.sciencedirect.com/science/article/pii/S0164121214002179)
(also available [here](http://hdl.handle.net/10198/11577)).

# FUNCTIONS

## new

Create a new splitter object. A list of specific dictionaries can
optionally be given; see the `bin/id-splitter` command for an example of
how to use more dictionaries.

## soft\_split

Perform a soft split of the identifier, i.e., split words without relying
on explicit markers (e.g., the underscore character or CamelCase notation).
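To illustrate the idea of splitting without markers, here is a minimal sketch of dictionary-based segmentation. It is not the module's implementation: the helper name `soft_split_sketch` and the toy word list are hypothetical, and picking the segmentation with the fewest dictionary words is a simplification that stands in for the ranking described in the article.

```perl
use strict;
use warnings;

# Toy dictionary for illustration only (an assumption, not the module's data).
my %DICT = map { $_ => 1 } qw(time sort user find file name);

# Hypothetical helper: return the segmentation of $id into dictionary words
# that uses the fewest words, or ($id) unchanged if no full segmentation exists.
sub soft_split_sketch {
    my ($id) = @_;
    my $s = lc $id;
    my $n = length $s;

    # $best[$i] holds an arrayref of words covering the first $i characters.
    my @best = ([]);
    for my $end (1 .. $n) {
        for my $start (0 .. $end - 1) {
            next unless defined $best[$start];
            my $word = substr($s, $start, $end - $start);
            next unless $DICT{$word};
            my @cand = (@{ $best[$start] }, $word);
            $best[$end] = \@cand
                if !defined($best[$end])
                || scalar(@cand) < scalar(@{ $best[$end] });
        }
    }
    return defined $best[$n] ? @{ $best[$n] } : ($id);
}

print join(' ', soft_split_sketch('timesort')), "\n";
```

An identifier that cannot be fully covered by dictionary words is returned as a single element in this sketch.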

## hard\_split

Perform a hard split of the identifier, i.e., split words at
explicit markers (e.g., the underscore character or CamelCase notation).
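As a rough illustration of splitting at explicit markers, the sketch below breaks an identifier at underscores and CamelCase boundaries using plain regular expressions. The helper `hard_split_sketch` is hypothetical and is not the module's code; lowercasing the results and the handling of acronyms are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lingua::IdSplitter.
sub hard_split_sketch {
    my ($id) = @_;
    my @words;

    # Split at underscores first, dropping empty pieces (e.g. a leading "_").
    for my $chunk (grep { length } split /_+/, $id) {
        # Then split at CamelCase boundaries:
        #   lower case or digit followed by upper case ("userFind"),
        #   or an acronym followed by a capitalized word ("XMLParser").
        push @words,
            split /(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])/, $chunk;
    }
    return map { lc $_ } @words;
}

print join(' ', hard_split_sketch('UserFind')), "\n";
print join(' ', hard_split_sketch('get_user_name')), "\n";
```

An identifier with no markers, such as "timesort", comes back as a single word here, which is the case a soft split is meant to handle.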

## split

Perform a split by first applying a hard split, and then applying a soft
split to each part produced by the hard split.

## explain

Show the computed ranking (including scores) for a split.

# AUTHOR

Nuno Carvalho

# COPYRIGHT AND LICENSE

This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.
