Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lazywei/mockingbird
Programming Language Classifier
https://github.com/lazywei/mockingbird
Last synced: 22 days ago
JSON representation
Programming Language Classifier
- Host: GitHub
- URL: https://github.com/lazywei/mockingbird
- Owner: lazywei
- License: mit
- Created: 2015-05-16T03:50:39.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2015-08-03T11:22:21.000Z (over 9 years ago)
- Last Synced: 2024-06-19T02:04:27.886Z (5 months ago)
- Language: Go
- Homepage:
- Size: 239 KB
- Stars: 22
- Watchers: 8
- Forks: 3
- Open Issues: 4
-
Metadata Files:
- Readme: README.md
- License: License.md
Awesome Lists containing this project
README
Mockingbird [![Build Status](https://travis-ci.org/lazywei/mockingbird.svg?branch=master)](https://travis-ci.org/lazywei/mockingbird)
===========# Introduction
[Linguist](https://github.com/github/linguist)'s Classifier in Go.
Linguist can be used as a Go package by
```golang
import "github.com/lazywei/linguist"
```and it also has a CLI (command line interface) in `cli/`
```
$ cd cli/
$ ./build.sh
$ ./mockingbird --help
```# Command Line Interface Usage
## Preparing LIBSVM format dataset
### Collect Rosetta Code
1. Clone the [RosettaCodeData](https://github.com/acmeism/RosettaCodeData)
```
git clone [email protected]:acmeism/RosettaCodeData.git
```2. Build this `cli` executable
```
cd cli/
./build.sh
```3. Run the `collectRosetta` according to the cloned RosettaCodeData, and collect
files to `../samples````
./mockingbird collectRosetta path/to/clones/RosettaCodeData ../samples
```### Build Bag-of-Words and Convert Samples to Libsvm
Build from scratch
```
./mockingbird convertLibsvm ../samples ../
```
This will save `libsvm.samples` and `bow.gob` to `../`. The `bow.gob` is the
parameters for constructing bag-of-words. This can be used afterward:```
./mockingbird convertLibsvm ../samples ../ --bowPath ../bow.gob
```## Train and Predict
### Train
For example, train a logisitic regression classifier:
```
./mockingbird train --sample=./test_fixture/test_samples.libsvm --solver 1
```This will save a model file in `$PWD/model/lr.model`, which can be used in
later prediction.Full usage:
```
usage: mockingbird train []Train Classifier
Flags:
--help Show help (also see --help-long and --help-man).
--sample="samples.libsvm"
Path for samples (in libsvm format)
--output="model" Path for saving trained model
--solver=0 0 = NaiveBayes, 1 = LogisticRegression
```### Predict
For example, make prediction via previously trained logisitic regression
classifier:```
./mockingbird predict --model=./model/lr.model --data=./test_fixture/test_samples.libsvm --solver=1
```Full usage:
```
usage: mockingbird predict --data=DATA []Predict via trained Classifier
Flags:
--help Show help (also see --help-long and --help-man).
--model="./model/naive_bayes.gob"
Path for loading saved model
--data=DATA Path for testing data (in libsvm format)
--solver=0 0 = NaiveBayes, 1 = LogisticRegression
```