{"id":19720848,"url":"https://github.com/quantopian/warp_prism","last_synced_at":"2025-04-29T21:31:06.651Z","repository":{"id":62588217,"uuid":"77668834","full_name":"quantopian/warp_prism","owner":"quantopian","description":"Quickly move data from postgres to numpy or pandas.","archived":false,"fork":false,"pushed_at":"2023-04-07T02:45:31.000Z","size":60,"stargazers_count":65,"open_issues_count":2,"forks_count":29,"subscribers_count":25,"default_branch":"master","last_synced_at":"2025-04-04T08:43:06.613Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/quantopian.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-12-30T07:18:49.000Z","updated_at":"2024-12-27T19:51:31.000Z","dependencies_parsed_at":"2024-11-11T23:12:44.728Z","dependency_job_id":"e7c119c3-2a57-499f-9b24-4a796fff8b5a","html_url":"https://github.com/quantopian/warp_prism","commit_stats":{"total_commits":39,"total_committers":2,"mean_commits":19.5,"dds":0.02564102564102566,"last_synced_commit":"dbd61bf30de717abca63cc661b4565aff16c3376"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantopian%2Fwarp_prism","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantopian%2Fwarp_prism/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantopian%2Fwarp_prism/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/quantopian%2Fwarp_prism/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/quantopian","download_url":"https://codeload.github.com/quantopian/warp_prism/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251585771,"owners_count":21613277,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-11T23:12:42.542Z","updated_at":"2025-04-29T21:31:01.639Z","avatar_url":"https://github.com/quantopian.png","language":"C","readme":"warp_prism\n==========\n\nQuickly move data from postgres to numpy or pandas.\n\nAPI\n---\n\n``to_arrays(query, *, bind=None)``\n``````````````````````````````````\n\n.. code-block::\n\n   Run the query, returning the results as np.ndarrays.\n\n   Parameters\n   ----------\n   query : sa.sql.Selectable\n       The query to run. This can be a select or a table.\n   bind : sa.Engine, optional\n       The engine used to create the connection. If not provided\n       ``query.bind`` will be used.\n\n   Returns\n   -------\n   arrays : dict[str, (np.ndarray, np.ndarray)]\n       A map from column name to the result arrays. The first array holds the\n       values and the second array is a boolean mask for NULLs. The values\n       where the mask is False are 0 as interpreted by the column's type.\n\n\n``to_dataframe(query, *, bind=None, null_values=None)``\n```````````````````````````````````````````````````````\n\n.. 
code-block::\n\n   Run the query, returning the results as a pd.DataFrame.\n\n   Parameters\n   ----------\n   query : sa.sql.Selectable\n       The query to run. This can be a select or a table.\n   bind : sa.Engine, optional\n       The engine used to create the connection. If not provided\n       ``query.bind`` will be used.\n   null_values : dict[str, any]\n       The null values to use for each column. This falls back to\n       ``warp_prism.null_values`` for columns that are not specified.\n\n   Returns\n   -------\n   df : pd.DataFrame\n       A pandas DataFrame holding the results of the query. The columns\n       of the DataFrame will be named the same and be in the same order as the\n       query.\n\n\n``register_odo_dataframe_edge()``\n`````````````````````````````````\n\n.. code-block::\n\n   Register an odo edge for sqlalchemy selectable objects to dataframe.\n\n   This edge will have a lower cost than the default edge so it will be\n   selected as the fastest path.\n\n   If the selectable is not in a postgres database, it will fall back to the\n   default odo edge.\n\n\nComparisons\n-----------\n\nA quick comparison between ``warp_prism``, ``odo``, and ``pd.read_sql_table``.\n\nIn this example we will read real data for VIX from quandl stored in a local\npostgres database using ``warp_prism``, ``odo``, and ``pd.read_sql_table``.\nAfter that, we will use ``odo`` to create a table with two float columns and\n1000000 rows and query it with the three tools again.\n\n.. 
code-block:: python\n\n   In [1]: import warp_prism\n\n   In [2]: from odo import odo, resource\n\n   In [3]: import pandas as pd\n\n   In [4]: table = resource(\n      ...:     'postgresql://localhost/bz::yahoo_index_vix',\n      ...:     schema='quandl',\n      ...: )\n\n   In [5]: warp_prism.to_dataframe(table).head()\n   Out[5]:\n      asof_date      open_       high        low      close  volume  \\\n   0 2016-01-08  22.959999  27.080000  22.480000  27.010000     0.0\n   1 2015-12-04  17.430000  17.650000  14.690000  14.810000     0.0\n   2 2015-10-29  14.800000  15.460000  14.330000  14.610000     0.0\n   3 2015-12-21  19.639999  20.209999  18.700001  18.700001     0.0\n   4 2015-10-26  14.760000  15.430000  14.680000  15.290000     0.0\n\n      adjusted_close                  timestamp\n   0       27.010000 2016-01-11 23:14:54.682220\n   1       14.810000 2016-01-11 23:14:54.682220\n   2       14.610000 2016-01-11 23:14:54.682220\n   3       18.700001 2016-01-11 23:14:54.682220\n   4       15.290000 2016-01-11 23:14:54.682220\n\n   In [6]: odo(table, pd.DataFrame).head()\n   Out[6]:\n      asof_date      open_       high        low      close  volume  \\\n   0 2016-01-08  22.959999  27.080000  22.480000  27.010000     0.0\n   1 2015-12-04  17.430000  17.650000  14.690000  14.810000     0.0\n   2 2015-10-29  14.800000  15.460000  14.330000  14.610000     0.0\n   3 2015-12-21  19.639999  20.209999  18.700001  18.700001     0.0\n   4 2015-10-26  14.760000  15.430000  14.680000  15.290000     0.0\n\n      adjusted_close                  timestamp\n   0       27.010000 2016-01-11 23:14:54.682220\n   1       14.810000 2016-01-11 23:14:54.682220\n   2       14.610000 2016-01-11 23:14:54.682220\n   3       18.700001 2016-01-11 23:14:54.682220\n   4       15.290000 2016-01-11 23:14:54.682220\n\n   In [7]: pd.read_sql_table(table.name, table.bind, table.schema).head()\n   Out[7]:\n      asof_date      open_       high        low      close  volume  \\\n   0 2016-01-08 
 22.959999  27.080000  22.480000  27.010000     0.0\n   1 2015-12-04  17.430000  17.650000  14.690000  14.810000     0.0\n   2 2015-10-29  14.800000  15.460000  14.330000  14.610000     0.0\n   3 2015-12-21  19.639999  20.209999  18.700001  18.700001     0.0\n   4 2015-10-26  14.760000  15.430000  14.680000  15.290000     0.0\n\n      adjusted_close                  timestamp\n   0       27.010000 2016-01-11 23:14:54.682220\n   1       14.810000 2016-01-11 23:14:54.682220\n   2       14.610000 2016-01-11 23:14:54.682220\n   3       18.700001 2016-01-11 23:14:54.682220\n   4       15.290000 2016-01-11 23:14:54.682220\n\n   In [8]: len(warp_prism.to_dataframe(table))\n   Out[8]: 6565\n\n   In [9]: %timeit warp_prism.to_dataframe(table)\n   100 loops, best of 3: 7.55 ms per loop\n\n   In [10]: %timeit odo(table, pd.DataFrame)\n   10 loops, best of 3: 49.9 ms per loop\n\n   In [11]: %timeit pd.read_sql_table(table.name, table.bind, table.schema)\n   10 loops, best of 3: 61.8 ms per loop\n\n   In [12]: big_table = odo(\n       ...:     pd.DataFrame({\n       ...:         'a': np.random.rand(1000000),\n       ...:         'b': np.random.rand(1000000)},\n       ...:     ),\n       ...:     'postgresql://localhost/test::largefloattest',\n       ...: )\n\n   In [13]: %timeit warp_prism.to_dataframe(big_table)\n   1 loop, best of 3: 248 ms per loop\n\n   In [14]: %timeit odo(big_table, pd.DataFrame)\n   1 loop, best of 3: 1.51 s per loop\n\n   In [15]: %timeit pd.read_sql_table(big_table.name, big_table.bind)\n   1 loop, best of 3: 1.9 s per loop\n\n\nInstallation\n------------\n\nWarp Prism can be pip installed but requires numpy to build its C extensions:\n\n.. 
code-block::\n\n   $ pip install numpy\n   $ pip install warp_prism\n\n\nLicense\n-------\n\nWarp Prism is licensed under the Apache 2.0 license.\n\nWarp Prism is sponsored by `Quantopian \u003chttps://www.quantopian.com\u003e`_ where it\nis used to fetch data for use in `Zipline \u003chttp://www.zipline.io/\u003e`_ through the\n`Pipeline API \u003chttps://www.quantopian.com/tutorials/pipeline\u003e`_ or interactively\nwith `Blaze \u003chttp://blaze.readthedocs.io/en/latest/index.html\u003e`_.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantopian%2Fwarp_prism","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquantopian%2Fwarp_prism","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantopian%2Fwarp_prism/lists"}