{"id":20809553,"url":"https://github.com/segasai/sqlutilpy","last_synced_at":"2025-05-07T08:04:15.806Z","repository":{"id":43780488,"uuid":"130428771","full_name":"segasai/sqlutilpy","owner":"segasai","description":"Python module to efficiently query SQL databases and return numpy arrays","archived":false,"fork":false,"pushed_at":"2025-04-12T14:14:58.000Z","size":270,"stargazers_count":10,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-05-07T08:04:00.406Z","etag":null,"topics":["database","numpy","python","query","sql"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/segasai.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-04-21T01:28:16.000Z","updated_at":"2025-04-12T14:15:01.000Z","dependencies_parsed_at":"2025-04-09T20:32:08.216Z","dependency_job_id":"f9668685-7fc6-4d62-a2f7-7fc21d27beb4","html_url":"https://github.com/segasai/sqlutilpy","commit_stats":{"total_commits":279,"total_committers":6,"mean_commits":46.5,"dds":0.5985663082437276,"last_synced_commit":"2192586ce57c904c59c5da7da2d1688eef76f5aa"},"previous_names":[],"tags_count":22,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/segasai%2Fsqlutilpy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/segasai%2Fsqlutilpy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/segasai%2Fsqlutilpy/releases","manifests_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/segasai%2Fsqlutilpy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/segasai","download_url":"https://codeload.github.com/segasai/sqlutilpy/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252839292,"owners_count":21812089,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","numpy","python","query","sql"],"created_at":"2024-11-17T20:14:22.328Z","updated_at":"2025-05-07T08:04:15.790Z","avatar_url":"https://github.com/segasai.png","language":"Python","readme":"[![Build Status](https://github.com/segasai/sqlutilpy/workflows/Testing/badge.svg)](https://github.com/segasai/sqlutilpy/actions)\n[![Documentation Status](https://readthedocs.org/projects/sqlutilpy/badge/?version=latest)](http://sqlutilpy.readthedocs.io/en/latest/?badge=latest)\n[![Coverage Status](https://coveralls.io/repos/github/segasai/sqlutilpy/badge.svg?branch=master)](https://coveralls.io/github/segasai/sqlutilpy?branch=master)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6867957.svg)](https://doi.org/10.5281/zenodo.6867957)\n\n# sqlutilpy\nPython module to query SQL databases and return numpy arrays, upload\ntables and run join queries involving local arrays and the tables in the DB.\nThis module is optimized to deal efficiently with query results with millions of rows.\nThe module works with PostgreSQL, SQLite and DuckDB databases.\n\nThe full documentation is available [here](http://sqlutilpy.readthedocs.io/en/latest/)\n\nAuthor: Sergey 
Koposov (Uni of Cambridge/CMU/Uni of Edinburgh)\n\n## Installation\nTo install the package, use pip:\n\n```\npip install sqlutilpy\n```\n## Authentication\n\nThroughout this readme, I will assume that if you are using PostgreSQL, then\nthe .pgpass file ( https://www.postgresql.org/docs/11/libpq-pgpass.html )\nhas been created with your login/password details for PostgreSQL. If that is not the case, many of the\ncommands given below will also need the user='....' and password='...' options.\n\n## Connection information\n\nMost `sqlutilpy` commands require a hostname and a database name.\nIf you don't want to type them every time, you can use the standard PostgreSQL environment variables\nPGPORT, PGDATABASE, PGUSER and PGHOST for the port, database name, user name and hostname\nof the connection.\n\n\n## Querying the database and retrieving the results\n\nThis command will run the query and put the columns into the variables ra and dec:\n\n```python\nimport sqlutilpy\nra, dec = sqlutilpy.get('select ra,dec from mytable',\n                        host='HOST_NAME_OF_MY_PG_SERVER',\n                        db='THE_NAME_OF_MY_DB')\n```\n\nBy default `sqlutilpy.get` executes the query and returns a tuple of\narrays, one array for each column in the query result. 
\nYou can return the results as a dictionary using the `asDict` option.\n\n## Uploading your arrays as columns in a table\n\nYou can use `sqlutilpy.upload` to upload your arrays as columns in a table.\n\n```python\nimport numpy as np\nimport sqlutilpy\n\nx = np.arange(10)\ny = x**.5\n# Create the table mytable with columns xcol and ycol\nsqlutilpy.upload('mytable', (x, y), ('xcol', 'ycol'))\n```\nThis will create a table called `mytable` with columns `xcol` and `ycol`.\n\n## Join query involving your local data and the database table\n\nSometimes it is beneficial to run a join query involving your local data and the data in the database.\n\nImagine you have arrays `myid` and `y` and you want to extract all the\ninformation from `somebigtable` for objects with `id=myid`. In principle,\nyou could upload the arrays to the DB and run a query, but the `local_join` function does that for you.\n\n```python\nimport numpy as np\nimport sqlutilpy\n\nmyid = np.arange(10)\ny = np.random.uniform(size=10)\n\n# Join the local arrays, exposed as mytmptable, against somebigtable\nR = sqlutilpy.local_join('''select * from mytmptable as m,\n           somebigtable as s where s.id=m.myid order by m.myid''',\n           'mytmptable', (myid, y), ('myid', 'ycol'))\n```\n\nIt executes a query as if your arrays were in `mytmptable`. What happens behind the scenes\nis that it uploads the data to the database and runs a query against it.\n\n## Keeping the connection open\n\nOften it is beneficial to preserve an open connection to the database. 
You can do that if you first\nobtain the connection using `sqlutilpy.getConnection()` and then provide it directly\nto `sqlutilpy.get()` and similar commands using the `conn=` keyword:\n\n```python\nconn = sqlutilpy.getConnection(db='mydb', user='meuser', password='something', host='hostname')\nR = sqlutilpy.get('select 1', conn=conn)\nR1 = sqlutilpy.get('select 1', conn=conn)\n```\n\n# How to cite the software\n\nIf you use this package, please cite it through Zenodo https://doi.org/10.5281/zenodo.5160118\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsegasai%2Fsqlutilpy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsegasai%2Fsqlutilpy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsegasai%2Fsqlutilpy/lists"}