https://github.com/djsutherland/skl-groups

scikit-learn addon to operate on set/"group"-based features

|Travis|_

.. |Travis| image:: https://api.travis-ci.org/dougalsutherland/skl-groups.png?branch=master
.. _Travis: https://travis-ci.org/dougalsutherland/skl-groups

skl-groups
==========

skl-groups is a package to perform machine learning on sets (or "groups") of
features in Python. It extends the scikit-learn library in two ways: by
transforming sets into feature vectors that standard scikit-learn constructs
can operate on, or by computing pairwise matrices of similarities (or related
quantities) between sets that can be turned into kernels for use in
scikit-learn.
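
To make the two routes concrete, here is a minimal sketch written with plain NumPy rather than the skl-groups API. The helper names ``mean_features`` and ``mean_map_kernel`` are hypothetical and chosen for illustration only, and averaging an RBF kernel over all point pairs is just one simple way to compare two sets, not necessarily what the package does.

```python
import numpy as np


def mean_features(groups):
    """Summarize each group (an array of shape (n_i, d)) by its mean vector."""
    return np.vstack([g.mean(axis=0) for g in groups])


def rbf(a, b, gamma):
    """RBF kernel between the rows of a and the rows of b."""
    sq_dists = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :]
    sq_dists -= 2.0 * a @ b.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def mean_map_kernel(groups_a, groups_b, gamma=1.0):
    """Kernel between groups: the average RBF value over all point pairs."""
    K = np.empty((len(groups_a), len(groups_b)))
    for i, a in enumerate(groups_a):
        for j, b in enumerate(groups_b):
            K[i, j] = rbf(a, b, gamma).mean()
    return K


rng = np.random.RandomState(0)
# Toy data: 20 groups of varying size in 3 dimensions; class 1 is shifted by +2.
labels = np.array([0, 1] * 10)
groups = [rng.randn(rng.randint(5, 15), 3) + 2.0 * y for y in labels]

# Route 1: one fixed-length feature vector per group.
X = mean_features(groups)

# Route 2: a group-by-group kernel matrix.
K = mean_map_kernel(groups, groups, gamma=0.5)

print(X.shape, K.shape)
```

The matrix ``X`` can be passed to any ordinary scikit-learn estimator, while ``K`` is the kind of input accepted by estimators that support precomputed kernels, such as ``sklearn.svm.SVC(kernel="precomputed")``.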

For an introduction to the package, why you might want to use it, and how to
do so, check out the documentation.

skl-groups is still in fairly early development.
The precursor package, py-sdm,
is still somewhat easier to use for some tasks (though it has less functionality
and less documentation); skl-groups will hopefully match it in the next few weeks.
Feel free to get in touch if you're interested.

Installation
------------

Full instructions are in the documentation,
but the short version is to do::

    $ conda install -c dougal -c r skl-groups

if you use conda, or::

    $ pip install skl-groups

if not. If you install with pip and want to use the kNN divergence estimator,
you'll also need to install either cyflann
or the regular pyflann bindings to FLANN,
and you'll want a version of FLANN built with OpenMP support.
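
After a pip install, a quick way to see which FLANN binding (if any) is importable is a check like the sketch below. It only uses the standard library; the function name ``find_flann_binding`` is hypothetical, and the module names ``cyflann`` and ``pyflann`` are assumed to match the package names mentioned above.

```python
import importlib.util


def find_flann_binding(candidates=("cyflann", "pyflann")):
    """Return the first importable module name from candidates, or None."""
    for name in candidates:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


binding = find_flann_binding()
if binding is None:
    print("No FLANN binding found; install cyflann or pyflann first.")
else:
    print("Using FLANN binding:", binding)
```

This only confirms that a binding can be imported; it does not tell you whether the underlying FLANN library was built with OpenMP support.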

A much faster version of the kNN estimator is enabled by the
skl-groups-accel package, which you can get via::

    $ pip install skl-groups-accel

It requires cyflann and a working C compiler with OpenMP support
(i.e. gcc, not clang).
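
If you are unsure whether your compiler meets this requirement, one rough check is to try building a tiny OpenMP program. The sketch below assumes the compiler accepts the ``-fopenmp`` flag (as gcc does); the function name ``compiler_supports_openmp`` is hypothetical, and this is not how skl-groups-accel itself detects OpenMP.

```python
import os
import subprocess
import tempfile

OPENMP_TEST_SOURCE = """
#include <omp.h>
int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }
"""


def compiler_supports_openmp(compiler="gcc"):
    """Return True if `compiler -fopenmp` can build a tiny OpenMP program."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "omp_check.c")
        exe = os.path.join(tmp, "omp_check")
        with open(src, "w") as f:
            f.write(OPENMP_TEST_SOURCE)
        try:
            result = subprocess.run(
                [compiler, "-fopenmp", src, "-o", exe],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # The compiler executable is missing or cannot be run.
            return False
        return result.returncode == 0


print("gcc supports OpenMP:", compiler_supports_openmp("gcc"))
```

A ``False`` result means either that the compiler was not found or that the test program failed to compile and link with OpenMP enabled.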