https://github.com/civisanalytics/civisml-extensions

scikit-learn-compatible estimators from Civis Analytics

Host: GitHub
URL: https://github.com/civisanalytics/civisml-extensions
Owner: civisanalytics
License: bsd-3-clause
Created: 2017-09-11T17:20:15.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2021-11-04T02:19:22.000Z (about 4 years ago)
Last Synced: 2025-04-15T14:04:35.977Z (7 months ago)
Language: Python
Size: 118 KB
Stars: 59
Watchers: 74
Forks: 19
Open Issues: 3
Metadata Files:
- Readme: README.rst
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
- Code of conduct: CODE_OF_CONDUCT.md


civisml-extensions
==================

.. image:: https://www.travis-ci.org/civisanalytics/civisml-extensions.svg?branch=master
    :target: https://www.travis-ci.org/civisanalytics/civisml-extensions

scikit-learn-compatible estimators from Civis Analytics

Installation
------------

Installation with ``pip`` is recommended::

    $ pip install civisml-extensions

For development, a few additional dependencies are needed::

    $ pip install -r dev-requirements.txt

Contents and Usage
------------------

This package contains `scikit-learn`_-compatible estimators for stacking
(``StackedClassifier``, ``StackedRegressor``), non-negative linear regression
(``NonNegativeLinearRegression``), preprocessing pandas_ ``DataFrames``
(``DataFrameETL``), and using Hyperband_ for cross-validating hyperparameters
(``HyperbandSearchCV``).
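
To make "non-negative linear regression" concrete, here is a minimal, standard-library-only sketch of least squares with every coefficient constrained to be at least zero. It uses projected gradient descent, which is a swapped-in technique chosen for brevity and is not claimed to be how ``NonNegativeLinearRegression`` is implemented. The function name, the toy data, and the hand-picked step size are all assumptions made for illustration.

```python
def nonnegative_least_squares(X, y, step=0.3, n_iter=500):
    """Minimize ||Xw - y||^2 subject to w >= 0 by projected gradient descent.

    Illustrative helper only (hypothetical name); `step` is hand-picked
    for the toy data below and is not a general-purpose default.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(n_iter):
        # Residuals of the current fit: Xw - y.
        resid = [sum(xj * wj for xj, wj in zip(row, w)) - yi
                 for row, yi in zip(X, y)]
        # Gradient of 0.5 * ||Xw - y||^2 with respect to each coefficient.
        grad = [sum(row[j] * r for row, r in zip(X, resid))
                for j in range(n_features)]
        # Take a gradient step, then project back onto w >= 0.
        w = [max(0.0, wj - step * gj) for wj, gj in zip(w, grad)]
    return w


# Toy data whose unconstrained least-squares solution is [1, -1].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, -1.0, 0.0]
w = nonnegative_least_squares(X, y)
# The constraint pins the second coefficient at zero; the first settles near 0.5.
```

The point of the constraint is visible in the toy data: the coefficient that would otherwise be negative is held at zero, and the remaining coefficient is refit around it.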

Usage of these estimators follows the standard sklearn conventions. Here is an
example of using the ``StackedClassifier``:

    .. code-block:: python

        >>> from sklearn.datasets import make_classification
        >>> from sklearn.ensemble import RandomForestClassifier
        >>> from sklearn.linear_model import LogisticRegression
        >>> from sklearn.model_selection import train_test_split
        >>> from civismlext.stacking import StackedClassifier
        >>>
        >>> # Define some train and test data. Synthetic data is used here
        >>> # purely for illustration; substitute your own features and labels.
        >>> X, y = make_classification(n_samples=200, random_state=0)
        >>> Xtrain, Xtest, ytrain, _ = train_test_split(X, y, random_state=0)
        >>>
        >>> # Note that the final estimator 'metalr' is the meta-estimator
        >>> estlist = [('rf', RandomForestClassifier()),
        ...            ('lr', LogisticRegression()),
        ...            ('metalr', LogisticRegression())]
        >>>
        >>> mysm = StackedClassifier(estlist)
        >>> # Set some parameters, if you didn't set them at instantiation
        >>> mysm.set_params(rf__random_state=7, lr__random_state=8,
        ...                 metalr__random_state=9, metalr__C=10**7)
        >>>
        >>> # Fit
        >>> mysm.fit(Xtrain, ytrain)
        >>>
        >>> # Predict!
        >>> ypred = mysm.predict_proba(Xtest)

You can learn more about stacking and see an example use of the
``StackedRegressor`` and ``NonNegativeLinearRegression`` estimators in
`a talk presented at PyData NYC`_ in November 2017.

See the docstrings of the various estimators for more information.
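
For readers unfamiliar with Hyperband, the sketch below computes the bracket schedule described in the Hyperband paper linked from this README: each bracket starts many configurations on a small resource budget, then repeatedly keeps roughly the best ``1/eta`` of them while multiplying the per-configuration budget by ``eta``. This is a standard-library-only illustration of the paper's formulas, assuming an integer ``eta`` of at least 2. The function name is hypothetical, and ``HyperbandSearchCV`` may compute or expose its schedule differently.

```python
def hyperband_schedule(max_resource, eta=3):
    """Return one list of (n_configs, resource_per_config) rounds per bracket.

    Hypothetical helper for illustration; assumes integer eta >= 2 and
    max_resource >= 1.
    """
    # s_max = floor(log_eta(max_resource)), computed with integers to
    # avoid floating-point rounding in the logarithm.
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):
        # Initial number of configurations: ceil((s_max + 1) * eta**s / (s + 1)).
        n = -(-(s_max + 1) * eta ** s // (s + 1))
        rounds = []
        for i in range(s + 1):
            n_i = n // eta ** i                    # configurations still alive
            r_i = max_resource / eta ** (s - i)    # resource given to each one
            rounds.append((n_i, r_i))
        brackets.append(rounds)
    return brackets


schedule = hyperband_schedule(81, eta=3)
for rounds in schedule:
    print(rounds)
```

The first bracket is the most aggressive (many configurations, tiny budgets), and the last bracket is plain random search at the full budget; running all brackets hedges between the two.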

Contributing
------------

Please see ``CONTRIBUTING.md`` for information about contributing to this project.

License
-------

BSD-3

See ``LICENSE.md`` for details.

.. _scikit-learn: http://scikit-learn.org/

.. _pandas: http://pandas.pydata.org/

.. _Hyperband: https://arxiv.org/abs/1603.06560

.. _a talk presented at PyData NYC: https://www.youtube.com/watch?v=3gpf1lGwecA
