{"id":18373036,"url":"https://github.com/coleifer/sweepea","last_synced_at":"2025-04-06T19:32:14.062Z","repository":{"id":57472716,"uuid":"82849342","full_name":"coleifer/sweepea","owner":"coleifer","description":"Fast, lightweight Python database toolkit for SQLite, built with Cython.","archived":false,"fork":false,"pushed_at":"2023-11-15T14:59:17.000Z","size":2044,"stargazers_count":42,"open_issues_count":0,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-30T16:44:40.274Z","etag":null,"topics":["dank","python","query-builder","sqlite","sqlite3"],"latest_commit_sha":null,"homepage":"https://sweepea.readthedocs.io/","language":"Cython","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/coleifer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-02-22T20:29:13.000Z","updated_at":"2024-09-15T15:56:35.000Z","dependencies_parsed_at":"2022-09-19T08:41:32.340Z","dependency_job_id":null,"html_url":"https://github.com/coleifer/sweepea","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coleifer%2Fsweepea","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coleifer%2Fsweepea/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coleifer%2Fsweepea/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coleifer%2Fsweepea/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/coleifer","download_url":"https://codeload.github.com/coleifer/sweepea/tar.gz/refs/heads/master","hos
t":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247539256,"owners_count":20955280,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dank","python","query-builder","sqlite","sqlite3"],"created_at":"2024-11-06T00:08:18.970Z","updated_at":"2025-04-06T19:32:12.131Z","avatar_url":"https://github.com/coleifer.png","language":"Cython","funding_links":[],"categories":["Cython"],"sub_categories":[],"readme":"![](http://media.charlesleifer.com/blog/photos/sweepea-fast.png)\n\n## swee'pea\n\nFast, lightweight Python database toolkit for SQLite, built with Cython.\n\nLike its cousin [peewee](http://docs.peewee-orm.com/), ``swee'pea``\nconsists of a database connection abstraction and query-building / execution\nAPIs. 
This project is a pet project of mine, so tailor expectations\naccordingly.\n\nFeatures:\n\n* Implemented in Cython for performance and to expose advanced features of the\n  SQLite database library.\n* Composable and consistent APIs for building queries using Python.\n* Layered APIs allow you to work as close to the database as you want.\n* No magic.\n* No bullshit.\n\nIssue tracker and code are hosted on GitHub: https://github.com/coleifer/sweepea.\n\nDocumentation is hosted on RT**F**D: https://sweepea.readthedocs.io/\n\n### Dependencies\n\nCython.\n\nThis project is designed to work with the standard library `sqlite3` driver, or\nalternatively, the latest version of `pysqlite2`.\n\n### Installation\n\nFirst install Cython and ensure that you have a SQLite library (standard\nlibrary `sqlite3` or [pysqlite](https://github.com/ghaering/pysqlite)).\nThen:\n\n```\n$ pip install sweepea\n```\n\nOr\n\n```\n$ pip install -e git+https://github.com/coleifer/sweepea#egg=sweepea\n```\n\n-----------------------------------------------------------------\n\n## Database Helper\n\n## Dynamic Tables\n\nSQLite makes it easy to define scalar and aggregate functions, but it is more\nchallenging to create functions that return multiple values. Scalar functions\naccept zero or more parameters and return a single value. Aggregate functions\naccept parameters from any number of input rows, and then generate a final\nscalar value.\n\nTo create functions that return multiple values, it is necessary to create a\n[virtual table](http://sqlite.org/vtab.html). 
SQLite has the concept of\n\"eponymous\" virtual tables, which are virtual tables that can be called like a\nfunction and do not require explicit creation using DDL statements.\n\nThe `vtfunc` module abstracts away the complexity of creating an eponymous\nvirtual table, allowing you to write your own multi-value SQLite functions in\nPython.\n\n### Example\n\nSuppose we want to create a function that, given a regular expression and an\ninput string, returns all matching substrings in the input string. For instance,\nif our regex was `'[0-9]+'` and our input string was `'123 xxx 456 yyy\n789 zzz 0'`, the function should return four rows:\n\n* `123`\n* `456`\n* `789`\n* `0`\n\nWith the `vtfunc` module it is very easy to implement this:\n\n```python\nimport re\n\nfrom vtfunc import TableFunction\n\n\nclass RegexSearch(TableFunction):\n    params = ['regex', 'search_string']\n    columns = ['match']\n    name = 'regex_search'\n\n    def initialize(self, regex=None, search_string=None):\n        self._iter = re.finditer(regex, search_string)\n\n    def iterate(self, idx):\n        # We do not need `idx`, so just ignore it. When the matches are\n        # exhausted, next() raises StopIteration, which ends the result set.\n        return (next(self._iter).group(0),)\n```\n\nTo use our function, we need to register the module with a SQLite connection,\nthen call it using a `SELECT` query:\n\n```python\n\nimport sqlite3\n\nconn = sqlite3.connect(':memory:')  # Create an in-memory database.\n\nRegexSearch.register(conn)  # Register our module.\n\nquery_params = ('[0-9]+', '123 xxx 456 yyy 789 zzz 0')\ncursor = conn.execute('SELECT * FROM regex_search(?, ?);', query_params)\nprint(cursor.fetchall())\n```\n\nLet's say we have a table that contains a list of arbitrary messages and we\nwant to capture all the e-mail addresses from that table. This is also easy\nusing our table-valued function. We will query the `messages` table and pass\nthe message body into our table-valued function. 
Then, for each email address\nwe find, we'll return a row containing the message ID and the matching email\naddress:\n\n```python\n\nemail_regex = '[\\w]+@[\\w]+\\.[\\w]{2,3}'  # Stupid simple email regex.\nquery = ('SELECT messages.id, regex_search.match '\n         'FROM messages, regex_search(?, messages.body)')\ncursor = conn.execute(query, (email_regex,))\n```\n\nThe resulting rows will look something like:\n\n```\n\nmessage id |         email\n-----------+-----------------------\n     1     | charlie@example.com\n     1     | huey@kitty.cat\n     1     | zaizee@morekitties.cat\n     3     | mickey@puppies.dog\n     3     | huey@throwaway.cat\n    ...    |         ...\n```\n\n#### Important note\n\nIn the above example you will note that the parameters for our query actually\nchange (because each row in the messages table has a different search string).\nThis means that for this particular query, the `RegexSearch.initialize()`\nfunction will be called once for each row in the `messages` table.\n\n### How it works\n\nBehind-the-scenes, `vtfunc` is creating a [Virtual Table](http://sqlite.org/vtab.html)\nand filling in the various callbacks with wrappers around your user-defined\nfunction. There are two important methods that the wrapped virtual table\nimplements:\n\n* xBestIndex\n* xFilter\n\nWhen SQLite attempts to execute a query, it will call the xBestIndex method of\nthe virtual table (possibly multiple times) trying to come up with the best\nquery plan. The `vtfunc` module optimizes for those query plans which include\nvalues for all the parameters of the user-defined function. 
Since some\nuser-defined functions may have optional parameters, query plans with only a\nsubset of param values will be slightly penalized.\n\nSince we have no visibility into what parameters the user *actually* passed in,\nand we don't know ahead of time which query plan SQLite suggests will be\nbest, `vtfunc` just does its best to optimize for plans with the highest\nnumber of usable parameter values.\n\nIf you encounter a situation where you pass your function multiple parameters,\nbut it doesn't receive all of them, a less-than-optimal plan was most likely\nused.\n\nAfter the plan is chosen by calling xBestIndex, the query will execute by\ncalling xFilter (possibly multiple times). xFilter has access to the actual\nquery parameters, and its responsibility is to initialize the cursor and call\nthe user's initialize() callback with the parameters passed in.\n\n## Query Builder\n\nThe query builder is designed to allow users to construct queries with reusable\nPython objects. Instead of working with string query fragments, you can\nconstruct queries using more Pythonic components which are then compiled into\nSQL.\n\nThe query builder is designed with consistency and composability as the primary\ngoals. 
Consistency enables one to learn once, then apply everywhere, while\ncomposability ensures that large systems can be built one piece at a time, from\nthe bottom-up.\n\nExample of constructing a simple query:\n\n```python\n\ndb = Database('app.db')\n\nEmployee = Table('employees', ('id', 'name', 'start_date', 'manager_id'))\n\n# Get list of employees and their manager's name, sorted by tenure.\nManager = Employee.alias('manager')\nquery = (Employee\n         .select(\n             Employee.name,\n             Employee.start_date,\n             Manager.name.alias('manager_name'))\n         .join(Manager, JOIN.LEFT_OUTER, on=(Employee.manager_id == Manager.id))\n         .order_by(Employee.start_date))\n\nfor row in query.execute(db):\n    print(row['name'], row['start_date'], (row['manager_name'] or 'no mgr'))\n```\n\nUnlike an ORM, the query builder has no opinions on your data-model, nor does\nit encourage inefficiency (e.g. the [n+1 problem](http://docs.peewee-orm.com/en/latest/peewee/querying.html#avoiding-n-1-queries)).\nThe query builder can be integrated into an already-running system with hardly\nany code needing to be written.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoleifer%2Fsweepea","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcoleifer%2Fsweepea","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoleifer%2Fsweepea/lists"}