Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/iLanguage/iLanguage
A semi-unsupervised language independent morphological analyzer useful for stemming unknown language text, or getting a rough estimate of possible parses for morphemes in a word. Input: a corpus. Uses compression, maximum entropy and fieldlinguistics.
https://github.com/iLanguage/iLanguage
Last synced: about 2 months ago
JSON representation
A semi-unsupervised language independent morphological analyzer useful for stemming unknown language text, or getting a rough estimate of possible parses for morphemes in a word. Input: a corpus. Uses compression, maximum entropy and fieldlinguistics.
- Host: GitHub
- URL: https://github.com/iLanguage/iLanguage
- Owner: iLanguage
- License: apache-2.0
- Created: 2012-05-12T13:59:43.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2017-11-28T05:10:12.000Z (about 7 years ago)
- Last Synced: 2024-04-15T14:22:55.717Z (9 months ago)
- Language: JavaScript
- Homepage:
- Size: 238 MB
- Stars: 21
- Watchers: 24
- Forks: 8
- Open Issues: 5
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
- low-resource-languages - iLanguage - A semi-unsupervised language independent morphological analyzer useful for stemming unknown language text, or getting a rough estimate of possible parses for morphemes in a word. Input: a corpus. Uses compression, maximum entropy and fieldlinguistics. (Software / Utilities)
README
iLanguage
============[![NPM version][npm-image]][npm-url] [![Build Status][travis-image]][travis-url] [![Dependency Status][daviddm-url]][daviddm-image]
A semi-unsupervised language independent morphological analyzer useful for stemming unknown language text, or getting a rough estimate of possible parses for morphemes in a word. Uses compression, maximum entropy and fieldlinguistics.
## Install
```bash
$ npm install --save ilanguage
```## Usage
[More examples](https://github.com/iLanguage/iLanguage/tree/master/test)
```javascript
var ILanguage = require('ilanguage').ILanguage;
var lang = new ILanguage();
var textToTest = {
orthography: "this will not have any stop words or morphemes, vulgar words or unrepresentative words like banana",
nonContentWordsArray: "not any",
userSpecifiedNonContentWords: true,
userRemovedWordsForThisDocumentArray: ['banana'],
userRemovedWordsForAllDocumentsArray: ['vulgar'],
// morphemes: /(^un|^pre|s$|ed$|ing$)/,
morphemesArray: ["un-", " pre-", " -s", " -ed", " -ing"]
};
NonContentWords.processNonContentWords(textToTest);
expect(NonContentWords.filterText(textToTest).filteredText)
.toEqual('thi will have stop word or morpheme word or representative word like ');
```# Lab Members
* [Gina Chiodo](http://gina.ilanguage.ca/) (U Delaware, Concordia)
* [Theresa Deering](http://trisapeace.angelfire.com/) (McGill, Aquafadas)
* [Josh Horner](http://jdhorner.com/) (Amilia)
* [Mathieu Legault](https://plus.google.com/116488045482047329710/about) (HEC, UQAM, Pivot88)
* [Hisako Noguchi](http://linguistics.concordia.ca/gazette.html) (Concordia)
* [Tobin Skinner](http://tobinskinner.com) (McGill, Amilia)# Post Docs
* [M.E. Cathcart](http://udel.edu/~mdotedot/) 2012 (U Delaware)
* [Alexandra Marquis](http://www.uqam.ca/entrevues/entrevue.php?id=968?hebdo) 2013-2014 (UQAM, U de Montréal)
* [Joel Dunham](http://www.jrwdunham.com) 2014-2015 (UBC)# Interns
* [Siddartha Kattoju](https://plus.google.com/109959990932959598572/posts) Summer 2011 (Concordia, Electrical Engineering)
* [Curtis Mesher](http://dragonsandgulls.wordpress.com/) Fall 2011, Spring 2012 (Concordia, Theoretical Linguistics)
* [Diana Olepeka](http://dragonsandgulls.wordpress.com/) Fall 2011 (Concordia, Theoretical Linguistics)
* [Elise McClay](http://migmaq.org/wp-content/uploads/2013/02/mcclayundergradthesis.pdf) Fall 2012, Spring 2013 (McGill, Field Linguistics)
* [Bahar Sateli](https://twitter.com/BaharSateli) Spring 2012 (Concordia, Software Engineering)
* [Yuliya Manyakina](http://ymanyakina.github.io) Summer 2012 (McGill, Field Linguistics)
* [Xianli Sun](http://myaamiacenter.org/) Spring 2013 (Miami University, Software Engineering)
* [Louisa Bielig](https://github.com/louisa-bielig) Summer 2013, Fall 2013, Summer 2015 (McGill, Theoretical Linguistics)
* [Dominique Bédard](http://www.eoa.umontreal.ca/) Fall 2013, Spring 2014 (U de Montréal, Speech Language Pathology)
* [Alexandre Herbay](https://twitter.com/Hafsloo) Spring 2014 (U de Montréal, Psycho-linguistics & Toulouse III, Computer Science)
* [Veronica Cook-Vilbrin](http://github.com/vronvali) Summer 2015 (Norwich University, Psychology)## Release History
* v1.0 April 16 2009 - Initial implementation in bash and perl
* v2.0 Jul 3 2010 - Implementation in C++
* v3.0 April 30 2011 - Implementation in Groovy
* v4.0 July 20 2012 - Implementation in JavaScript Map Reduce
* v4.1 Nov 29 2013 - Added more high level functions for gloss lookup
* v5.0 Jan 9 2014 - Implementation in CommonJS# License
This project is released under the [Apache 2.0](http://www.apache.org/licenses/LICENSE-2.0.html) license, which is an very non-restrictive open source license which basically says you can adapt the code to any use you see fit.
# How to edit the code
### Code style
[Sublime](http://www.sublimetext.com/3) will manage this for you if you format (CMD+SHIFT+P, format) your code when you save. You can refer to `.editorconfig` and `.jshintrc` for specific options.### Breakpointing while you work
You can open the `test/SpecRunner.html` in an _actual_ browser to run the unit test file(s) or breakpoint the code.## Modifying the code
In general, you should always ensure that you have the latest [Node.js](http://nodejs.org/) and [npm](http://npmjs.org/) installed. On Mac you can do this by
```
brew update
brew upgrade node
```Test that Gulp is installed by running `gulp --version`. If the command isn't found, run `npm install -g gulp`. For more information about installing the tools, see the [getting started with Gulp guide](http://gulpjs.com).
1. Fork the repo.
1. Clone the repo to your computer.
1. Run `npm install` to install all build dependencies.
1. Run `gulp` to build this project.Assuming that it looks something like this, you're ready to go.
![screen shot 2015-06-21 at 5 58 21 pm](https://cloud.githubusercontent.com/assets/196199/8273513/343dc03a-183f-11e5-8b2c-89586f6d48a3.png)## Contributing changes
Easy way
1. [Signup for a GitHub account](https://github.com/signup/free) (GitHub is free for OpenSource)
1. Click on the "Fork" button to create your own copy.
1. Leave us a note in our [issue tracker](https://github.com/iLanguage/iLanguage/issues) to tell us a bit about the bug/feature you want to work on.
1. You can [follow the 4 GitHub Help Tutorials](http://help.github.com/) to install and use Git on your computer.
1. Feel free to ask us questions in our [issue tracker](https://github.com/iLanguage/iLanguage/issues), we're friendly and welcome Open Source newbies.
1. Edit the code on your computer, commit it referencing the issue #xx you created ($ git commit -m "fixes #xx i changed blah blah...") and push to your origin ($ git push origin master).
1. Click on the "Pull Request" button, and leave us a note about what you changed. We will look at your changes and help you bring them into the project!Advanced way
1. Create a new branch for new fixes or features, this is easier to build a fix/feature specific pull request than if you work in your `master` branch directly.
1. Run `gulp watch` which will run the tests as you make changes.
1. Add failing tests for the change you want to make. Run `gulp test` to see the tests fail.
1. Fix stuff.
1. Look at the terminal output (assuming you ran `gulp watch`) to see if the tests pass. Repeat steps 2-4 until done.
1. Open `test/SpecRunner.html` unit test file(s) in actual browser(s) (Chrome Canary, Firefox, Safari) to ensure tests pass everywhere.
1. Update the documentation to reflect any changes.
1. Push to your fork and submit a pull request.[npm-url]: https://npmjs.org/package/ilanguage
[npm-image]: https://badge.fury.io/js/ilanguage.svg
[travis-url]: https://travis-ci.org/iLanguage/iLanguage
[travis-image]: https://travis-ci.org/iLanguage/iLanguage.svg?branch=master
[daviddm-url]: https://david-dm.org/iLanguage/iLanguage.svg?theme=shields.io
[daviddm-image]: https://david-dm.org/iLanguage/iLanguage