awesome-python-data-science

A curated list of Python libraries used for data science.
https://github.com/thomasjpfan/awesome-python-data-science

  • Machine Learning Frameworks

    • XGBoost - Scalable, portable, and distributed gradient boosting.
  • Scientific

    • Pandas - A library providing high-performance, easy-to-use data structures and data analysis tools.
    • Numba - NumPy-aware dynamic Python compiler using LLVM.
  • Deep Learning Tools

    • lightly - Lightly is a computer vision framework for self-supervised learning.
    • TorchDrift - TorchDrift is a data and concept drift library for PyTorch.
  • Visualization

    • PyGWalker - Turns pandas and polars dataframes into a Tableau-like user interface for visual exploration.
  • Exploration

    • fitter - Simple class to identify the distribution from which a data sample was generated.
    • Dora - Exploratory data analysis.
  • Feature Extraction

    • General Feature Extraction

      • dirty_cat - Encoding methods for dirty categorical variables.
    • Images and Video

    • Text/NLP

      • pronouncing - Simple interface for the CMU Pronouncing Dictionary.
      • BlingFire - A lightning fast Finite State machine and REgular expression manipulation library.
      • Fuzzy - Soundex, NYSIIS, Double Metaphone.
  • Profiling

    • Ranking/Recommender

  • Python Tools

    • devpi - PyPI server and packaging/testing/release tool.
    • sacred - Reproduce computational experiments.
  • Deep Learning Frameworks

    • tensorlayer - A Deep Learning and Reinforcement Learning Library for Researchers and Engineers.