Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/EducationalTestingService/skll
SciKit-Learn Laboratory (SKLL) makes it easy to run machine learning experiments.
- Host: GitHub
- URL: https://github.com/EducationalTestingService/skll
- Owner: EducationalTestingService
- License: other
- Created: 2013-08-02T14:31:46.000Z (over 11 years ago)
- Default Branch: main
- Last Pushed: 2024-10-21T13:56:47.000Z (3 months ago)
- Last Synced: 2024-10-21T22:44:07.184Z (3 months ago)
- Topics: hacktoberfest, machine-learning, python, scikit-learn
- Language: Python
- Homepage: http://skll.readthedocs.org
- Size: 34.9 MB
- Stars: 551
- Watchers: 46
- Forks: 67
- Open Issues: 20
Metadata Files:
- Readme: README.rst
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
Awesome Lists containing this project
- awesome-python-machine-learning - SKLL - This Python package provides command-line utilities to make it easier to run machine learning experiments with scikit-learn. (Uncategorized / Uncategorized)
- awesome-python-machine-learning-resources - GitHub (7% open · ⏱️ 21.12.2021) (Workflow and experiment tracking)
README
SciKit-Learn Laboratory
-----------------------

.. image:: https://gitlab.com/EducationalTestingService/skll/badges/main/pipeline.svg
   :target: https://gitlab.com/EducationalTestingService/skll/-/pipelines
   :alt: Gitlab CI status

.. image:: https://dev.azure.com/EducationalTestingService/SKLL/_apis/build/status/EducationalTestingService.skll
   :target: https://dev.azure.com/EducationalTestingService/SKLL/_build?view=runs
   :alt: Azure Pipelines status

.. image:: https://codecov.io/gh/EducationalTestingService/skll/branch/main/graph/badge.svg
   :target: https://codecov.io/gh/EducationalTestingService/skll

.. image:: https://img.shields.io/pypi/v/skll.svg
   :target: https://pypi.org/project/skll/
   :alt: Latest version on PyPI

.. image:: https://img.shields.io/pypi/l/skll.svg
   :alt: License

.. image:: https://img.shields.io/conda/v/ets/skll.svg
   :target: https://anaconda.org/ets/skll
   :alt: Conda package for SKLL

.. image:: https://img.shields.io/pypi/pyversions/skll.svg
   :target: https://pypi.org/project/skll/
   :alt: Supported python versions for SKLL

.. image:: https://img.shields.io/badge/DOI-10.5281%2Fzenodo.12825-blue.svg
   :target: http://dx.doi.org/10.5281/zenodo.12825
   :alt: DOI for citing SKLL 1.0.0

.. image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gh/EducationalTestingService/skll/main?filepath=examples%2FTutorial.ipynb

This Python package provides command-line utilities to make it easier to run
machine learning experiments with scikit-learn. One of the primary goals of
our project is to make it so that you can run scikit-learn experiments without
actually needing to write any code other than what you used to generate/extract
the features.

Installation
~~~~~~~~~~~~

You can install using either ``pip`` or ``conda``. See details `here `__.

Requirements
~~~~~~~~~~~~

- Python 3.10, 3.11, or 3.12.
- `beautifulsoup4 `__
- `gridmap `__ (only required if you plan
  to run things in parallel on a DRMAA-compatible cluster)
- `joblib `__
- `pandas `__
- `ruamel.yaml `__
- `scikit-learn `__
- `seaborn `__
- `tabulate `__

Command-line Interface
~~~~~~~~~~~~~~~~~~~~~~

The main utility we provide is called ``run_experiment`` and it can be used to
easily run a series of learners on datasets specified in a configuration file
like:

.. code:: ini

    [General]
    experiment_name = Titanic_Evaluate_Tuned
    # valid tasks: cross_validate, evaluate, predict, train
    task = evaluate

    [Input]
    # these directories could also be absolute paths
    # (and must be if you're not running things in local mode)
    train_directory = train
    test_directory = dev
    # Can specify multiple sets of feature files that are merged together automatically
    featuresets = [["family.csv", "misc.csv", "socioeconomic.csv", "vitals.csv"]]
    # List of scikit-learn learners to use
    learners = ["RandomForestClassifier", "DecisionTreeClassifier", "SVC", "MultinomialNB"]
    # Column in CSV containing labels to predict
    label_col = Survived
    # Column in CSV containing instance IDs (if any)
    id_col = PassengerId

    [Tuning]
    # Should we tune parameters of all learners by searching provided parameter grids?
    grid_search = true
    # Function to maximize when performing grid search
    objectives = ['accuracy']

    [Output]
    # Also compute the area under the ROC curve as an additional metric
    metrics = ['roc_auc']
    # The following can also be absolute paths
    logs = output
    results = output
    predictions = output
    probability = true
    models = output

For more information about getting started with ``run_experiment``, please check
out `our tutorial `__, or
`our config file specs `__.

You can also follow this `interactive Jupyter tutorial `__.
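To get a feel for the structure of such a configuration file, it can be inspected with nothing but the Python standard library. The sketch below is only an illustration: it is not how ``run_experiment`` reads its configuration, the ``summarize`` helper is a hypothetical name, and it assumes that the list-valued fields are written as valid Python literals, as they are in the example above.

.. code:: python

    import ast
    import configparser

    # Abridged copy of the experiment configuration shown above.
    CONFIG_TEXT = """
    [General]
    experiment_name = Titanic_Evaluate_Tuned
    task = evaluate

    [Input]
    train_directory = train
    test_directory = dev
    featuresets = [["family.csv", "misc.csv", "socioeconomic.csv", "vitals.csv"]]
    learners = ["RandomForestClassifier", "DecisionTreeClassifier", "SVC", "MultinomialNB"]
    label_col = Survived
    id_col = PassengerId

    [Tuning]
    grid_search = true
    objectives = ['accuracy']

    [Output]
    metrics = ['roc_auc']
    """


    def summarize(config_text):
        """Return a small summary dict for an experiment configuration (hypothetical helper)."""
        parser = configparser.ConfigParser()
        parser.read_string(config_text)
        return {
            "name": parser["General"]["experiment_name"],
            "task": parser["General"]["task"],
            # Assumption: list-valued fields are valid Python literals.
            "learners": ast.literal_eval(parser["Input"]["learners"]),
            "featuresets": ast.literal_eval(parser["Input"]["featuresets"]),
            "grid_search": parser["Tuning"].getboolean("grid_search"),
            "objectives": ast.literal_eval(parser["Tuning"]["objectives"]),
        }


    summary = summarize(CONFIG_TEXT)
    print(summary["name"], summary["task"], len(summary["learners"]))

A quick check like this can catch a misspelled section or field name before a long experiment is launched.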
We also provide utilities for:

- `converting between machine learning toolkit formats `__
  (e.g., ARFF, CSV)
- `filtering feature files `__
- `joining feature files `__
- `other common tasks `__

Python API
~~~~~~~~~~

If you just want to avoid writing a lot of boilerplate learning code, you can
use our simple Python API, which also supports pandas DataFrames.
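As a rough sketch of what using the API can look like, the helper below reads two feature files with ``Reader``, trains a single ``Learner`` and evaluates it. The function name, the default column names (borrowed from the Titanic configuration above) and the choice to skip grid search are assumptions made for illustration; check the API documentation for the exact signatures in your SKLL version.

.. code:: python

    def train_and_evaluate(train_path, test_path,
                           label_col="Survived", id_col="PassengerId"):
        """Train one learner on ``train_path`` and evaluate it on ``test_path``.

        Hypothetical helper, not part of SKLL itself.
        """
        # Imported inside the function so that defining it does not require SKLL.
        from skll.data import Reader
        from skll.learner import Learner

        # Reader.for_path picks a reader based on the file extension (e.g. .csv).
        train_fs = Reader.for_path(train_path, label_col=label_col, id_col=id_col).read()
        test_fs = Reader.for_path(test_path, label_col=label_col, id_col=id_col).read()

        learner = Learner("RandomForestClassifier")
        # Skip hyperparameter tuning to keep the sketch short.
        learner.train(train_fs, grid_search=False)
        return learner.evaluate(test_fs)

With SKLL installed, this could be called as ``train_and_evaluate("train/vitals.csv", "dev/vitals.csv")``, where both paths are placeholders for your own feature files.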
The main way you'll want to use the API is through
the ``Learner`` and ``Reader`` classes. For more details on our API, see
`the documentation `__.

While our API can be broadly useful, it should be noted that the command-line
utilities are intended as the primary way of using SKLL. The API is just a nice
side-effect of our developing the utilities.

A Note on Pronunciation
~~~~~~~~~~~~~~~~~~~~~~~

.. image:: doc/skll.png
   :alt: SKLL logo
   :align: right

.. container:: clear

   .. image:: doc/spacer.png

SciKit-Learn Laboratory (SKLL) is pronounced "skull": that's where the learning
happens.

Talks
~~~~~

- *Simpler Machine Learning with SKLL 1.0*, Dan Blanchard, PyData NYC 2014 (`video `__ | `slides `__)
- *Simpler Machine Learning with SKLL*, Dan Blanchard, PyData NYC 2013 (`video `__ | `slides `__)

Citing
~~~~~~

If you are using SKLL in your work, you can cite it as follows: "We used scikit-learn (Pedregosa et al., 2011) via the SKLL toolkit (https://github.com/EducationalTestingService/skll)."

Books
~~~~~

SKLL is featured in `Data Science at the Command Line `__
by `Jeroen Janssens `__.

Changelog
~~~~~~~~~

See `GitHub releases `__.

Contribute
~~~~~~~~~~

Thank you for your interest in contributing to SKLL! See `CONTRIBUTING.md `__ for instructions on how to get started.