{"id":13418351,"url":"https://github.com/EducationalTestingService/skll","last_synced_at":"2025-03-15T03:31:04.441Z","repository":{"id":9850999,"uuid":"11845170","full_name":"EducationalTestingService/skll","owner":"EducationalTestingService","description":"SciKit-Learn Laboratory (SKLL) makes it easy to run machine learning experiments.","archived":false,"fork":false,"pushed_at":"2024-10-28T02:06:34.000Z","size":36615,"stargazers_count":551,"open_issues_count":19,"forks_count":67,"subscribers_count":46,"default_branch":"main","last_synced_at":"2024-10-29T15:17:25.787Z","etag":null,"topics":["hacktoberfest","machine-learning","python","scikit-learn"],"latest_commit_sha":null,"homepage":"http://skll.readthedocs.org","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EducationalTestingService.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2013-08-02T14:31:46.000Z","updated_at":"2024-10-21T13:56:52.000Z","dependencies_parsed_at":"2023-02-11T11:30:53.545Z","dependency_job_id":"9e9490a2-217f-4b57-9360-ba044e234040","html_url":"https://github.com/EducationalTestingService/skll","commit_stats":{"total_commits":3085,"total_committers":42,"mean_commits":73.45238095238095,"dds":0.5679092382495948,"last_synced_commit":"231867dec958c7c418d506a3c777afb5c44eb522"},"previous_names":[],"tags_count":77,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EducationalTestingService%2Fskll","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/EducationalTestingService%2Fskll/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EducationalTestingService%2Fskll/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EducationalTestingService%2Fskll/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/EducationalTestingService","download_url":"https://codeload.github.com/EducationalTestingService/skll/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243681024,"owners_count":20330152,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hacktoberfest","machine-learning","python","scikit-learn"],"created_at":"2024-07-30T22:01:01.320Z","updated_at":"2025-03-15T03:31:03.257Z","avatar_url":"https://github.com/EducationalTestingService.png","language":"Python","readme":"SciKit-Learn Laboratory\n-----------------------\n\n.. image:: https://gitlab.com/EducationalTestingService/skll/badges/main/pipeline.svg\n   :target: https://gitlab.com/EducationalTestingService/skll/-/pipelines\n   :alt: Gitlab CI status\n\n.. image:: https://dev.azure.com/EducationalTestingService/SKLL/_apis/build/status/EducationalTestingService.skll\n   :target: https://dev.azure.com/EducationalTestingService/SKLL/_build?view=runs\n   :alt: Azure Pipelines status\n\n.. image:: https://codecov.io/gh/EducationalTestingService/skll/branch/main/graph/badge.svg\n  :target: https://codecov.io/gh/EducationalTestingService/skll\n\n.. 
image:: https://img.shields.io/pypi/v/skll.svg\n   :target: https://pypi.org/project/skll/\n   :alt: Latest version on PyPI\n\n.. image:: https://img.shields.io/pypi/l/skll.svg\n   :alt: License\n\n.. image:: https://img.shields.io/conda/v/ets/skll.svg\n   :target: https://anaconda.org/ets/skll\n   :alt: Conda package for SKLL\n\n.. image:: https://img.shields.io/pypi/pyversions/skll.svg\n   :target: https://pypi.org/project/skll/\n   :alt: Supported python versions for SKLL\n\n.. image:: https://img.shields.io/badge/DOI-10.5281%2Fzenodo.12825-blue.svg\n   :target: http://dx.doi.org/10.5281/zenodo.12825\n   :alt: DOI for citing SKLL 1.0.0\n\n.. image:: https://mybinder.org/badge_logo.svg\n :target: https://mybinder.org/v2/gh/EducationalTestingService/skll/main?filepath=examples%2FTutorial.ipynb\n\n\nThis Python package provides command-line utilities to make it easier to run\nmachine learning experiments with scikit-learn.  One of the primary goals of\nour project is to make it so that you can run scikit-learn experiments without\nactually needing to write any code other than what you used to generate/extract\nthe features.\n\nInstallation\n~~~~~~~~~~~~\n\nYou can install using either ``pip`` or ``conda``. 
See details `here \u003chttps://skll.readthedocs.io/en/latest/getting_started.html\u003e`__.\n\nRequirements\n~~~~~~~~~~~~\n\n-  Python 3.10, 3.11, or 3.12.\n-  `beautifulsoup4 \u003chttp://www.crummy.com/software/BeautifulSoup/\u003e`__\n-  `gridmap \u003chttps://pypi.org/project/gridmap/\u003e`__ (only required if you plan\n   to run things in parallel on a DRMAA-compatible cluster)\n-  `joblib \u003chttps://pypi.org/project/joblib/\u003e`__\n-  `pandas \u003chttp://pandas.pydata.org\u003e`__\n-  `ruamel.yaml \u003chttp://yaml.readthedocs.io/en/latest/overview.html\u003e`__\n-  `scikit-learn \u003chttp://scikit-learn.org/stable/\u003e`__\n-  `seaborn \u003chttp://seaborn.pydata.org\u003e`__\n-  `tabulate \u003chttps://pypi.org/project/tabulate/\u003e`__\n\nCommand-line Interface\n~~~~~~~~~~~~~~~~~~~~~~\n\nThe main utility we provide is called ``run_experiment`` and it can be used to\neasily run a series of learners on datasets specified in a configuration file\nlike:\n\n.. code:: ini\n\n  [General]\n  experiment_name = Titanic_Evaluate_Tuned\n  # valid tasks: cross_validate, evaluate, predict, train\n  task = evaluate\n\n  [Input]\n  # these directories could also be absolute paths\n  # (and must be if you're not running things in local mode)\n  train_directory = train\n  test_directory = dev\n  # Can specify multiple sets of feature files that are merged together automatically\n  featuresets = [[\"family.csv\", \"misc.csv\", \"socioeconomic.csv\", \"vitals.csv\"]]\n  # List of scikit-learn learners to use\n  learners = [\"RandomForestClassifier\", \"DecisionTreeClassifier\", \"SVC\", \"MultinomialNB\"]\n  # Column in CSV containing labels to predict\n  label_col = Survived\n  # Column in CSV containing instance IDs (if any)\n  id_col = PassengerId\n\n  [Tuning]\n  # Should we tune parameters of all learners by searching provided parameter grids?\n  grid_search = true\n  # Function to maximize when performing grid search\n  objectives = ['accuracy']\n\n  
[Output]\n  # Also compute the area under the ROC curve as an additional metric\n  metrics = ['roc_auc']\n  # The following can also be absolute paths\n  logs = output\n  results = output\n  predictions = output\n  probability = true\n  models = output\n\nFor more information about getting started with ``run_experiment``, please check\nout `our tutorial \u003chttps://skll.readthedocs.org/en/latest/tutorial.html\u003e`__, or\n`our config file specs \u003chttps://skll.readthedocs.org/en/latest/run_experiment.html\u003e`__.\n\nYou can also follow this `interactive Jupyter tutorial \u003chttps://mybinder.org/v2/gh/AVajpayeeJr/skll/feature/448-interactive-binder?filepath=examples\u003e`__.\n\nWe also provide utilities for:\n\n-  `converting between machine learning toolkit formats \u003chttps://skll.readthedocs.org/en/latest/utilities.html#skll-convert\u003e`__\n   (e.g., ARFF, CSV)\n-  `filtering feature files \u003chttps://skll.readthedocs.org/en/latest/utilities.html#filter-features\u003e`__\n-  `joining feature files \u003chttps://skll.readthedocs.org/en/latest/utilities.html#join-features\u003e`__\n-  `other common tasks \u003chttps://skll.readthedocs.org/en/latest/utilities.html\u003e`__\n\n\nPython API\n~~~~~~~~~~\n\nIf you just want to avoid writing a lot of boilerplate learning code, you can\nalso use our simple Python API which also supports pandas DataFrames.\nThe main way you'll want to use the API is through\nthe ``Learner`` and ``Reader`` classes. For more details on our API, see\n`the documentation \u003chttps://skll.readthedocs.org/en/latest/api.html\u003e`__.\n\nWhile our API can be broadly useful, it should be noted that the command-line\nutilities are intended as the primary way of using SKLL.  The API is just a nice\nside-effect of our developing the utilities.\n\n\nA Note on Pronunciation\n~~~~~~~~~~~~~~~~~~~~~~~\n\n.. image:: doc/skll.png\n   :alt: SKLL logo\n   :align: right\n\n.. container:: clear\n\n  .. 
image:: doc/spacer.png\n\nSciKit-Learn Laboratory (SKLL) is pronounced \"skull\": that's where the learning\nhappens.\n\nTalks\n~~~~~\n\n-  *Simpler Machine Learning with SKLL 1.0*, Dan Blanchard, PyData NYC 2014 (`video \u003chttps://www.youtube.com/watch?v=VEo2shBuOrc\u0026feature=youtu.be\u0026t=1s\u003e`__ | `slides \u003chttp://www.slideshare.net/DanielBlanchard2/py-data-nyc-2014\u003e`__)\n-  *Simpler Machine Learning with SKLL*, Dan Blanchard, PyData NYC 2013 (`video \u003chttp://vimeo.com/79511496\u003e`__ | `slides \u003chttp://www.slideshare.net/DanielBlanchard2/simple-machine-learning-with-skll\u003e`__)\n\nCiting\n~~~~~~\nIf you are using SKLL in your work, you can cite it as follows: \"We used scikit-learn (Pedregosa et al., 2011) via the SKLL toolkit (https://github.com/EducationalTestingService/skll).\"\n\nBooks\n~~~~~\n\nSKLL is featured in `Data Science at the Command Line \u003chttp://datascienceatthecommandline.com\u003e`__\nby `Jeroen Janssens \u003chttp://jeroenjanssens.com\u003e`__.\n\nChangelog\n~~~~~~~~~\n\nSee `GitHub releases \u003chttps://github.com/EducationalTestingService/skll/releases\u003e`__.\n\nContribute\n~~~~~~~~~~\n\nThank you for your interest in contributing to SKLL! See `CONTRIBUTING.md \u003chttps://github.com/EducationalTestingService/skll/blob/main/CONTRIBUTING.md\u003e`__ for instructions on how to get started.\n","funding_links":[],"categories":["Multipurpose","Python","工作流程和实验跟踪","AutoML","Uncategorized"],"sub_categories":["General-Purpose Machine Learning","Uncategorized"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FEducationalTestingService%2Fskll","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FEducationalTestingService%2Fskll","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FEducationalTestingService%2Fskll/lists"}