{"id":13747788,"url":"https://github.com/IntelPython/sdc","last_synced_at":"2025-05-09T10:32:08.585Z","repository":{"id":22026189,"uuid":"93080202","full_name":"IntelPython/sdc","owner":"IntelPython","description":"Numba extension for compiling Pandas data frames, Intel® Scalable Dataframe Compiler","archived":true,"fork":false,"pushed_at":"2023-11-09T22:51:13.000Z","size":16601,"stargazers_count":646,"open_issues_count":57,"forks_count":61,"subscribers_count":34,"default_branch":"master","last_synced_at":"2024-11-15T22:33:34.642Z","etag":null,"topics":["big-data","compilers","machine-learning","numpy","pandas","parallel-computing","python"],"latest_commit_sha":null,"homepage":"https://intelpython.github.io/sdc-doc/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/IntelPython.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-06-01T16:59:41.000Z","updated_at":"2024-08-04T16:10:34.000Z","dependencies_parsed_at":"2022-08-07T10:01:27.433Z","dependency_job_id":"a3159ce0-4e2b-4a79-913d-72036ec96ac5","html_url":"https://github.com/IntelPython/sdc","commit_stats":null,"previous_names":[],"tags_count":38,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelPython%2Fsdc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelPython%2Fsdc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelPython%2Fsdc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/IntelPython%2Fsdc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/IntelPython","download_url":"https://codeload.github.com/IntelPython/sdc/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253234178,"owners_count":21875561,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["big-data","compilers","machine-learning","numpy","pandas","parallel-computing","python"],"created_at":"2024-08-03T07:00:21.152Z","updated_at":"2025-05-09T10:32:06.866Z","avatar_url":"https://github.com/IntelPython.png","language":"Python","readme":"*****\nSdc\n*****\n\nIntel® Scalable Dataframe Compiler\n###################################################\n\n.. image:: https://travis-ci.com/IntelPython/sdc.svg?branch=master\n    :target: https://travis-ci.com/IntelPython/sdc\n    :alt: Travis CI\n\n.. image:: https://dev.azure.com/IntelPython/HPAT/_apis/build/status/IntelPython.sdc?branchName=master\n    :target: https://dev.azure.com/IntelPython/HPAT/_build/latest?definitionId=2\u0026branchName=master\n    :alt: Azure Pipelines\n\n.. _Numba*: https://numba.pydata.org/\n.. _Pandas*: https://pandas.pydata.org/\n.. _Sphinx*: https://www.sphinx-doc.org/\n\nNumba* Extension For Pandas* Operations Compilation\n###################################################\n\nIntel® Scalable Dataframe Compiler (Intel® SDC) is an extension of `Numba*`_\nthat enables compilation of `Pandas*`_ operations. 
It automatically vectorizes and parallelizes\nthe code by leveraging modern hardware instructions and by utilizing all available cores.\n\nIntel® SDC documentation can be found `here \u003chttps://intelpython.github.io/sdc-doc/\u003e`__.\n\n.. note::\n    For maximum performance and stability, please use numba from ``intel/label/beta`` channel.\n\nInstalling Binary Packages (conda and wheel)\n--------------------------------------------\n\nIntel® SDC is available on the Anaconda Cloud ``intel/label/beta`` channel.\nDistribution includes Intel® SDC for Python 3.6 and Python 3.7 for Windows and Linux platforms.\n\nIntel® SDC conda package can be installed using the steps below::\n\n    \u003e conda create -n sdc-env python=\u003c3.7 or 3.6\u003e -c anaconda -c conda-forge\n    \u003e conda activate sdc-env\n    \u003e conda install sdc -c intel/label/beta -c intel -c defaults -c conda-forge --override-channels\n\nIntel® SDC wheel package can be installed using the steps below::\n\n    \u003e conda create -n sdc-env python=\u003c3.7 or 3.6\u003e pip -c anaconda -c conda-forge\n    \u003e conda activate sdc-env\n    \u003e pip install --index-url https://pypi.anaconda.org/intel/label/beta/simple --extra-index-url https://pypi.anaconda.org/intel/simple --extra-index-url https://pypi.org/simple sdc\n\n\nBuilding Intel® SDC from Source on Linux\n----------------------------------------\n\nWe use `Anaconda \u003chttps://www.anaconda.com/download/\u003e`_ distribution of\nPython for setting up Intel® SDC build environment.\n\nIf you do not have conda, we recommend using Miniconda3::\n\n    wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh\n    chmod +x miniconda.sh\n    ./miniconda.sh -b\n    export PATH=$HOME/miniconda3/bin:$PATH\n\n.. note::\n    For maximum performance and stability, please use numba from ``intel/label/beta`` channel.\n\nIt is possible to build Intel® SDC via conda-build or setuptools. 
Follow one of the\ncases below to install Intel® SDC and its dependencies on Linux.\n\nBuilding on Linux with conda-build\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n::\n\n    PYVER=\u003c3.6 or 3.7\u003e\n    NUMPYVER=\u003c1.16 or 1.17\u003e\n    conda create -n conda-build-env python=$PYVER conda-build\n    source activate conda-build-env\n    git clone https://github.com/IntelPython/sdc.git\n    cd sdc\n    conda build --python $PYVER --numpy $NUMPYVER --output-folder=\u003coutput_folder\u003e -c intel/label/beta -c defaults -c intel -c conda-forge --override-channels conda-recipe\n\nBuilding on Linux with setuptools\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n::\n\n    export PYVER=\u003c3.6 or 3.7\u003e\n    export NUMPYVER=\u003c1.16 or 1.17\u003e\n    conda create -n sdc-env -q -y -c intel/label/beta -c defaults -c intel -c conda-forge python=$PYVER numpy=$NUMPYVER tbb-devel tbb4py numba=0.54.1 pandas=1.3.4 pyarrow=4.0.1 gcc_linux-64 gxx_linux-64\n    source activate sdc-env\n    git clone https://github.com/IntelPython/sdc.git\n    cd sdc\n    python setup.py install\n\nIn case of issues, reinstalling in a new conda environment is recommended.\n\nBuilding Intel® SDC from Source on Windows\n------------------------------------------\n\nBuilding Intel® SDC on Windows requires Build Tools for Visual Studio 2019 (with component MSVC v140 - VS 2015 C++ build tools (v14.00)):\n\n* Install `Build Tools for Visual Studio 2019 (with component MSVC v140 - VS 2015 C++ build tools (v14.00)) \u003chttps://visualstudio.microsoft.com/downloads/#build-tools-for-visual-studio-2019\u003e`_.\n* Install `Miniconda for Windows \u003chttps://repo.continuum.io/miniconda/Miniconda3-latest-Windows-x86_64.exe\u003e`_.\n* Start 'Anaconda prompt'.\n\nIt is possible to build Intel® SDC via conda-build or setuptools. 
Follow one of the\ncases below to install Intel® SDC and its dependencies on Windows.\n\nBuilding on Windows with conda-build\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n::\n\n    set PYVER=\u003c3.6 or 3.7\u003e\n    set NUMPYVER=\u003c1.16 or 1.17\u003e\n    conda create -n conda-build-env -q -y python=%PYVER% conda-build conda-verify vc vs2015_runtime vs2015_win-64\n    conda activate conda-build-env\n    git clone https://github.com/IntelPython/sdc.git\n    cd sdc\n    conda build --python %PYVER% --numpy %NUMPYVER% --output-folder=\u003coutput_folder\u003e -c intel/label/beta -c defaults -c intel -c conda-forge --override-channels conda-recipe\n\nBuilding on Windows with setuptools\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n::\n\n    set PYVER=\u003c3.6 or 3.7\u003e\n    set NUMPYVER=\u003c1.16 or 1.17\u003e\n    conda create -n sdc-env -c intel/label/beta -c defaults -c intel -c conda-forge python=%PYVER% numpy=%NUMPYVER% tbb-devel tbb4py numba=0.54.1 pandas=1.3.4 pyarrow=4.0.1\n    conda activate sdc-env\n    set INCLUDE=%INCLUDE%;%CONDA_PREFIX%\\Library\\include\n    set LIB=%LIB%;%CONDA_PREFIX%\\Library\\lib\n    git clone https://github.com/IntelPython/sdc.git\n    cd sdc\n    python setup.py install\n\n.. \"C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\vcvarsall.bat\" amd64\n\nTroubleshooting Windows Build\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n* If the ``cl`` compiler throws ``fatal error LNK1158: cannot run 'rc.exe'``,\n  add Windows Kits to your PATH (e.g. 
``C:\\Program Files (x86)\\Windows Kits\\8.0\\bin\\x86``).\n* Some errors can be mitigated by ``set DISTUTILS_USE_SDK=1``.\n* For setting up Visual Studio, one might need to go to the registry at\n  ``HKEY_LOCAL_MACHINE\\SOFTWARE\\WOW6432Node\\Microsoft\\VisualStudio\\SxS\\VS7``,\n  and add a string value named ``14.0`` whose data is ``C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\``.\n* If the conda or Visual Studio version in use is not the latest,\n  building Intel® SDC can fail with a vague error about a keyword used in a file.\n  Make sure you are using the latest versions.\n\nBuilding documentation\n----------------------\n\nBuilding Intel® SDC User's Guide documentation requires a pre-installed Intel® SDC package\nalong with a compatible `Pandas*`_ version, as well as `Sphinx*`_ 2.2.1 or later.\n\nIntel® SDC documentation includes the output of Intel® SDC examples, which is pasted into the function descriptions in the API Reference.\n\nUse ``pip`` to install `Sphinx*`_ and extensions:\n::\n\n    pip install sphinx sphinxcontrib-programoutput\n\nCurrently the build procedure is based on ``make`` located in the ``./sdc/docs/`` folder.\nWhile it is not generally required, we recommend that you clean up any previous documentation build by running:\n::\n\n    make clean\n\nTo build HTML documentation you will need to run:\n::\n\n    make html\n\nThe built documentation will be located in the ``./sdc/docs/build/html`` directory.\nTo preview the documentation, open the ``index.html`` file.\n\n\nMore information about building and adding documentation can be found `here \u003cdocs/README.rst\u003e`__.\n\n\nRunning unit tests\n------------------\n::\n\n    python sdc/tests/gen_test_data.py\n    python -m unittest\n\nReferences\n##########\n\nIntel® SDC follows the ideas and initial code base of the High-Performance Analytics Toolkit (HPAT). 
These academic papers describe ideas and methods behind HPAT:\n\n- `HPAT paper at ICS'17 \u003chttp://dl.acm.org/citation.cfm?id=3079099\u003e`_\n- `HPAT at HotOS'17 \u003chttp://dl.acm.org/citation.cfm?id=3103004\u003e`_\n- `HiFrames on arxiv \u003chttps://arxiv.org/abs/1704.02341\u003e`_\n","funding_links":[],"categories":["Python","Tools"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FIntelPython%2Fsdc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FIntelPython%2Fsdc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FIntelPython%2Fsdc/lists"}